- Create a new project log-parser with the simple Stack template:
stack new log-parser simple
- Open log-parser.cabal and add the containers library to the build-depends field of the executable section:
executable log-parser
  hs-source-dirs:      src
  main-is:             Main.hs
  default-language:    Haskell2010
  build-depends:       base >= 4.7 && < 5
                     , containers
- Open src/Main.hs and edit it. We will add log parsing and analysis here:
module Main where
- Import the modules for file I/O, the strict map, command-line arguments, and monadic helpers:
import System.IO
import qualified Data.Map.Strict as M
import System.Environment
import Control.Monad
- Read the file line by line:
hLines :: Handle -> IO [String]
hLines h = do
  isEOF <- hIsEOF h
  if isEOF
    then return []
    else (:) <$> hGetLine h <*> hLines h
- We are interested only in the host name or IP address, which is the first word of each line. Return it, or return an empty string if the list of words is empty:
host :: [String] -> String
host (h:_) = h
host _     = ""
- Convert the list of lines into a list of host names using the functions words (to convert a line into words) and host (to take only the first of those words):
hosts :: Handle -> IO [String]
hosts h = fmap (host . words) <$> hLines h
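To see what host and words do to a single line, here is a small sketch that can be loaded in GHCi. The sample line is hypothetical (written in the style of an Apache access log, not copied from the data set), and hostsOf is an illustrative pure helper that is not part of the recipe; it applies the same host . words step to lines that are already in memory.

```haskell
-- Hypothetical sample line in the style of an Apache access log.
sampleLine :: String
sampleLine =
  "10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] \"GET /index.html HTTP/1.1\" 200 512"

-- Same definition as in the recipe: the first word, or "" if there is none.
host :: [String] -> String
host (h:_) = h
host _     = ""

-- Illustrative pure counterpart of hosts (an assumption, not part of the
-- recipe): extracts the first word of every line in a list.
hostsOf :: [String] -> [String]
hostsOf = map (host . words)
```

Evaluating hostsOf [sampleLine, ""] in GHCi gives a list whose first element is the client address and whose second element is the empty string, because words "" yields an empty list.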
- Given a hostname and a map, insert the hostname with an access count of one. If the host is already present in the map, M.insertWith adds the new count to the existing one:
updateAccessCount :: String -> M.Map String Int -> M.Map String Int
updateAccessCount h mp = M.insertWith (+) h 1 mp
- Fold over the list of hosts, starting with an empty map and adding the hostname with the access count. Use the function updateAccessCount to combine the hostname (or IP) with the access count map:
foldHosts :: [String] -> M.Map String Int
foldHosts = foldr updateAccessCount M.empty
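The fold can be tried on a small in-memory list. The sketch below repeats the two definitions from the recipe so that it is self-contained, and adds foldHostsStrict, a hypothetical variant (not part of the recipe) that uses foldl' from Data.List. A strict left fold is a common choice when the list of hosts is long, although for a sample log either version should work. The host names in sampleHosts are made up.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Same definitions as in the recipe.
updateAccessCount :: String -> M.Map String Int -> M.Map String Int
updateAccessCount h mp = M.insertWith (+) h 1 mp

foldHosts :: [String] -> M.Map String Int
foldHosts = foldr updateAccessCount M.empty

-- Hypothetical variant: a strict left fold that produces the same counts.
foldHostsStrict :: [String] -> M.Map String Int
foldHostsStrict = foldl' (flip updateAccessCount) M.empty

-- Made-up host names for experimentation.
sampleHosts :: [String]
sampleHosts = ["a.example", "b.example", "a.example"]
```

In GHCi, M.toAscList (foldHosts sampleHosts) shows a.example with a count of 2 and b.example with a count of 1, and foldHostsStrict sampleHosts produces the same map.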
- Download the sample data from http://www.monitorware.com/en/logsamples/apache.php; it is free to use. The program takes the relative path of the log file as its first command-line argument and builds the access-count map from that file. It then converts the map to a list in ascending key order and prints each host name with its access count:
main :: IO ()
main = do
  (log:_) <- getArgs
  accessMap <- withFile log ReadMode (fmap foldHosts . hosts)
  let accesses = M.toAscList accessMap
  forM_ accesses $ \(host, count) -> do
    putStrLn $ host ++ "\t" ++ show count
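The conversion and printing step in main can also be written as pure functions, which makes it easy to check in GHCi without a log file. In this sketch, formatRow and report are hypothetical helper names, and byCountDesc is an optional variant (an assumption, not what the recipe does) that orders hosts by descending access count instead of by name.

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))
import qualified Data.Map.Strict as M

-- Hypothetical helper: one output line, host and count separated by a tab.
formatRow :: (String, Int) -> String
formatRow (name, count) = name ++ "\t" ++ show count

-- Hypothetical helper: all output lines in ascending host order,
-- matching the order produced by M.toAscList in main.
report :: M.Map String Int -> [String]
report = map formatRow . M.toAscList

-- Optional variant (not in the recipe): busiest hosts first.
byCountDesc :: M.Map String Int -> [(String, Int)]
byCountDesc = sortOn (Down . snd) . M.toList
```

With these helpers, mapM_ putStrLn (report accessMap) prints the same lines as the forM_ loop in main.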
- Build and run the project:
stack build
stack exec -- log-parser access_log/access_log
- The output lists each hostname or IP address, followed by a tab and its access count.
