How to do it...

  1. Create a new project log-parser with the simple Stack template:
        stack new log-parser simple
  1. Open log-parser.cabal and add the containers library as a dependency in the build-depends subsection of the executable section:
        executable log-parser
          hs-source-dirs:      src
          main-is:             Main.hs
          default-language:    Haskell2010
          build-depends:       base >= 4.7 && < 5
                             , containers
  1. Open src/Main.hs and edit it. We will add log parsing and analysis here:
        module Main where
  1. Import the modules for file I/O, the strict map, command-line arguments, and monadic helpers:
        import System.IO
        import qualified Data.Map.Strict as M
        import System.Environment
        import Control.Monad
  1. Read the file line by line:
        hLines :: Handle -> IO [String]
        hLines h = do
          -- Stop at the end of the file; otherwise read one line and recurse.
          atEnd <- hIsEOF h
          if atEnd
            then return []
            else (:) <$> hGetLine h <*> hLines h
  1. We are only interested in the host name or IP address, which is the first word of each log line. Take the first element of the list of words, and return an empty string if the list is empty:
        host :: [String] -> String
        host (h:_) = h
        host _     = ""
  1. Convert the list of lines into a list of host names using the functions words (to convert a line into words) and host (to take only the first of those words):
        hosts :: Handle -> IO [String]
        hosts h = fmap (host . words) <$> hLines h
  1. Given a hostname and a map, add the hostname to the map with access count one. If the host is already present in the map, then add the counts:
        updateAccessCount :: String -> M.Map String Int -> M.Map String Int
        updateAccessCount h mp = M.insertWith (+) h 1 mp
  1. Fold over the list of hosts, starting with an empty map and adding the hostname with the access count. Use the function updateAccessCount to combine the hostname (or IP) with the access count map:
        foldHosts :: [String] -> M.Map String Int
        foldHosts = foldr updateAccessCount M.empty
  1. Get the sample data from http://www.monitorware.com/en/logsamples/apache.php. The data is free to use. The program takes the relative path of the log file as its first argument, builds the access map from that file, converts the map to a list, and prints each host name with its access count:

        main :: IO ()
        main = do
          (log:_) <- getArgs
          accessMap <- withFile log ReadMode (fmap foldHosts . hosts)
          let accesses = M.toAscList accessMap
          forM_ accesses $ \(host, count) -> do
            putStrLn $ host ++ "\t" ++ show count
  1. Build and run the project:
        stack build
        stack exec -- log-parser access_log/access_log
  1. The output lists each hostname or IP address together with its access count.
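
The counting logic from the steps above can also be checked without a log file. The following is a minimal sketch, not part of the recipe: the names hostOf, countWithFold, countWithFromList, and sampleLines are hypothetical, and the sample lines are made up rather than taken from the real data set. It restates the recipe's fold over plain strings and compares it with an alternative built on M.fromListWith, which should produce the same map.

```haskell
import qualified Data.Map.Strict as M

-- Same idea as the recipe's host function: the first word of a line,
-- or "" if the line has no words.
hostOf :: String -> String
hostOf line = case words line of
  (h:_) -> h
  []    -> ""

-- The recipe's fold, restated over plain lines instead of a file handle.
countWithFold :: [String] -> M.Map String Int
countWithFold = foldr (\h m -> M.insertWith (+) h 1 m) M.empty . map hostOf

-- Alternative: pair every host with 1 and let fromListWith sum duplicates.
countWithFromList :: [String] -> M.Map String Int
countWithFromList ls = M.fromListWith (+) [(hostOf l, 1) | l <- ls]

-- Made-up sample lines (an assumption for illustration, not real log data).
sampleLines :: [String]
sampleLines =
  [ "10.0.0.1 - - \"GET /index.html\" 200"
  , "10.0.0.2 - - \"GET /about.html\" 200"
  , "10.0.0.1 - - \"GET /contact.html\" 404"
  ]
```

Loaded into GHCi, M.toAscList (countWithFold sampleLines) evaluates to [("10.0.0.1",2),("10.0.0.2",1)], and countWithFromList sampleLines gives the same map. Note that, as in the recipe, blank lines are counted under the empty string key.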