Lesson 39. Making HTTP requests in Haskell

After reading lesson 39, you’ll be able to

In this lesson, you’ll learn how to make an HTTP request in Haskell and save the results to a file. The data you’ll fetch is from the National Oceanic and Atmospheric Administration (NOAA) Climate Data API. This API requires you to send a custom HTTP request that uses SSL and has a custom header for authentication. You’ll use the Network.HTTP.Simple library, which will allow you to make simple requests as well as create custom HTTP requests. You’ll start by learning how to use Network.HTTP.Simple to fetch a web page from a URL. Then you’ll create a specific request for the NOAA API. In the end, you’ll have fetched JSON data from this API to be used in the next lesson.

Consider this

How would you go about writing a Haskell program that, when run, fetches the homepage of reddit.com and writes it to a local .html file?

39.1. Getting your project set up

In this lesson, you’ll look at one of the most common tasks in contemporary programming: making an HTTP request. The aim of this project is to create a script that makes a request to the NOAA Climate Data API. The NOAA Climate Data API provides access to a wide range of climate-related data. On the API’s website (https://www.ncdc.noaa.gov/cdo-web/webservices/v2#gettingStarted), you can find a list of endpoints that the API offers. Here are a few of them:

Building a full wrapper for the NOAA API would be a project beyond the scope of a single lesson. You’ll focus on the first step in this process: getting results from the /datasets endpoint. The /datasets endpoint provides essential metadata you need to pass to the /data endpoint to request your data. Here’s an example entry:

"uid":"gov.noaa.ncdc:C00822",
"mindate":"2010-01-01",
"maxdate":"2010-12-01",
"name":"Normals Monthly",
"datacoverage":1,
"id":"NORMAL_MLY"

Even though fetching this data is a small part of the overall API, after you understand the basics of working with HTTP in Haskell, extending the project is straightforward. After you’ve made the request, you’ll write the body of the response to a JSON file. Although this is a fairly straightforward task, you’ll learn much about working with real-world Haskell along the way.

You’ll create a new stack project called http-lesson. As a quick refresher, the following steps will create and build your project:

$ stack update
$ stack new http-lesson
$ cd http-lesson
$ stack setup
$ stack build

For this simple project, you’ll keep everything in the Main module located in app/Main.hs.

Note

This project uses the NOAA Climate Data API to fetch JSON and save it to a file. In the next lesson, you’ll parse that JSON. This API is free to use but does require the user to request an API token. To get your token, go to www.ncdc.noaa.gov/cdo-web/token and fill in the form with your email address. Your token should be sent quickly. You’ll be making a request to see which data sets the API allows access to.

After you have your API token, you can start coding up your project.

39.1.1. Your starter code

You’ll start with adding imports to your Main module. Notice that you’ll import both Data.ByteString and Data.ByteString.Lazy. Importing multiple text or ByteString types is common in real-world Haskell. In this case, you’re doing so because different parts of the library you’ll be using require using either strict or lazy ByteStrings. You’ll import the Char8 module for both of these libraries, as it will make using them much easier, as we discussed in lesson 25. Finally, you’ll add the Network.HTTP.Simple library, which you’ll use for your HTTP requests.

Listing 39.1. The imports for your app/Main.hs file
module Main where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as LC
import Network.HTTP.Simple
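If you’re wondering how the strict and lazy ByteString types relate, the following minimal sketch converts between them by using fromStrict and toStrict from Data.ByteString.Lazy. The names strictHost, lazyHost, and roundTrip are made up for this illustration and aren’t part of the project.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as LC

-- A strict ByteString (illustrative value only).
strictHost :: BC.ByteString
strictHost = BC.pack "www.ncdc.noaa.gov"

-- Convert the strict value to a lazy ByteString.
lazyHost :: L.ByteString
lazyHost = L.fromStrict strictHost

-- Convert back again; the bytes are unchanged.
roundTrip :: BC.ByteString
roundTrip = L.toStrict lazyHost

main :: IO ()
main = do
  BC.putStrLn strictHost
  LC.putStrLn lazyHost
  print (roundTrip == strictHost)  -- prints True
```

Converting is cheap enough for small values like these, so needing both types in one program is usually an inconvenience rather than a real problem.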

Before you continue, you also need to update your http-lesson.cabal file to support these imports. You’ll add bytestring and http-conduit to your build-depends section. Because you’re working with ByteStrings and Char8, it’s also helpful to include the OverloadedStrings extension.

Listing 39.2. Modifying your project’s .cabal file
executable http-lesson-exe
  hs-source-dirs:      app
  main-is:             Main.hs
  ghc-options:         -threaded -rtsopts -with-rtsopts=-N
  build-depends:       base
                     , http-lesson
                     , bytestring
                     , http-conduit
  default-language:    Haskell2010
  default-extensions:  OverloadedStrings

Note that stack will handle downloading all of your dependencies for http-conduit, and you don’t need to explicitly use the stack install command.

Next you can start fleshing out your Main with variables for the data you need. You’re only going to be concerned about a single API request, which will allow you to list all the data sets in the NOAA Climate Data API.

Listing 39.3. Variables that will be helpful in making your HTTP requests
myToken :: BC.ByteString
myToken = "<API TOKEN HERE>"

noaaHost :: BC.ByteString
noaaHost = "www.ncdc.noaa.gov"

apiPath :: BC.ByteString
apiPath = "/cdo-web/api/v2/datasets"

You also need placeholder code in your main IO action to ensure that your code will compile.

Listing 39.4. Placeholder code for your main action
main :: IO ()
main = print "hi"
Quick check 39.1

Q1:

If you didn’t include your OverloadedStrings extension in the .cabal file, how could you modify Main.hs to support OverloadedStrings?

QC 39.1 answer

1:

You could use the LANGUAGE pragma:

{-# LANGUAGE OverloadedStrings #-}
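As a minimal sketch of where the pragma goes, the following file places it on the first line, above the module declaration. The greeting value is a made-up example; it shows a plain string literal being accepted as a ByteString once the extension is on.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as BC

-- With OverloadedStrings enabled, this literal is read as a ByteString,
-- so no explicit call to BC.pack is needed.
greeting :: BC.ByteString
greeting = "hello"

main :: IO ()
main = BC.putStrLn greeting
```

Without the pragma (or the equivalent setting in the .cabal file), the same literal would be a String and the definition of greeting wouldn’t type-check.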

 

39.2. Using the HTTP.Simple module

Now that you have the basics in place, you can start playing around with HTTP requests. You’ll use the module Network.HTTP.Simple, which is part of the http-conduit package. As the name indicates, HTTP.Simple makes it easy for you to make simple HTTP requests. You’ll use the httpLBS (the LBS stands for lazy ByteString) function to submit your request. Normally, you’d have to create a value of the Request type to pass into this function. But because Request is an instance of IsString, OverloadedStrings lets you write the URL as a string literal and have it converted to a Request for you. Here’s a quick sample of fetching the data from the popular tech news site, Hacker News (https://news.ycombinator.com):

GHCi> import Network.HTTP.Simple
GHCi> response = httpLBS "http://news.ycombinator.com"

If you type this into GHCi, you’ll notice that the response variable is set instantly, even though it describes an HTTP request. Typically, an HTTP request results in a noticeable delay because of the network round trip. Your variable is assigned instantly because you’ve only defined the request as an IO action; thanks to lazy evaluation, nothing has been sent yet. If you enter response again, GHCi runs the action, and you’ll notice a slight delay:

GHCi> response
<large output>
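You can see the same behavior without touching the network. In the following sketch, an IORef counter stands in for the HTTP request (an assumption made purely for illustration): defining the action with let does nothing, and the counter changes only when the action is run.

```haskell
module Main where

import Data.IORef (modifyIORef, newIORef, readIORef)

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  -- Defining the action does not run it, just as defining response
  -- with = does not send the HTTP request.
  let action = modifyIORef counter (+ 1)
  before <- readIORef counter
  -- Each time the action is run, its effect happens again.
  action
  action
  after <- readIORef counter
  print (before, after)  -- prints (0,2)
```

The same reasoning suggests that each time you enter response in GHCi, the request is sent again.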

You want to be able to access different pieces of your response. The first thing to check is the status code of the response. This is the HTTP code that tells whether your request was successful.

Common HTTP codes

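In case you’re unfamiliar, here are some common HTTP status codes: 200 (OK, the request succeeded), 301 (Moved Permanently, the resource has a new URL), and 404 (Not Found, the resource doesn’t exist).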

Network.HTTP.Simple contains the function getResponseStatusCode that gives you the status of your response. If you run this in GHCi, you immediately come across a problem:

GHCi> getResponseStatusCode response

<interactive>:6:23: error:
     No instance for (Control.Monad.IO.Class.MonadIO Response)
        arising from a use of 'response'

What happened here? The issue is that getResponseStatusCode is expecting a plain response type, as you can see from its type signature:

getResponseStatusCode :: Response a -> Int

But to make your HTTP request, you had to use IO, which means that your response variable is an IO (Response a) type.

A popular alternative to HTTP.Simple

Although Network.HTTP.Simple is fairly straightforward, it’s relatively bare bones. Many other Haskell packages are available for making HTTP requests. One of the more popular is the wreq package (https://hackage.haskell.org/package/wreq). Although wreq is a nice library, it would require you to learn another abstract Haskell topic: Lens. It’s worth pointing out that it’s common for Haskell packages to use new and interesting abstractions. If you loved unit 5 on monads, you may find this one of the more exciting parts of writing Haskell. But the love of abstraction can also be a frustration for beginners who may not want to learn yet another new idea when they just want to fetch data from an API.

You can solve this problem in two ways. The first way is to use your Functor <$> operator:

GHCi> getResponseStatusCode <$> response
200

Remember that <$> allows you to take a pure function and put it in a context. If you look at the type of your result, you’ll see it’s also in a context:

GHCi> :t getResponseStatusCode <$> response
getResponseStatusCode <$> response
:: Control.Monad.IO.Class.MonadIO f => f Int

An alternative solution is to assign response by using <- rather than =. Just as when you’re using do-notation, this allows you to treat a value in a context as though it were a pure value:

GHCi> response <- httpLBS "http://news.ycombinator.com"
GHCi> getResponseStatusCode response
200
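Both approaches can be compared side by side in a small program that needs no network access. This is only a sketch: FakeResponse, fakeFetch, and statusCode are hypothetical stand-ins for Response, httpLBS, and getResponseStatusCode, respectively.

```haskell
module Main where

-- Hypothetical stand-in for a response: (status code, body).
type FakeResponse = (Int, String)

-- Hypothetical stand-in for httpLBS; it performs no network traffic.
fakeFetch :: IO FakeResponse
fakeFetch = pure (200, "<html></html>")

-- A pure accessor, playing the role of getResponseStatusCode.
statusCode :: FakeResponse -> Int
statusCode = fst

-- Approach 1: apply the pure function inside the IO context with <$>.
viaFmap :: IO Int
viaFmap = statusCode <$> fakeFetch

-- Approach 2: bind the result with <- and apply the function directly.
viaBind :: IO Int
viaBind = do
  response <- fakeFetch
  pure (statusCode response)

main :: IO ()
main = do
  a <- viaFmap
  b <- viaBind
  print (a, b)  -- prints (200,200)
```

The <$> version is handy for one-off lookups, whereas the <- version is usually more convenient when you need several pieces of the same response.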

Now that you understand the basics of an HTTP request, you’ll move on to make a more sophisticated request.

Quick check 39.2

Q1:

There’s also a getResponseHeader function. Use both <$> and <- to get the header of the response.

QC 39.2 answer

1:

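Note that getResponseHeader takes the name of the header as its first argument; Content-Type is used here as an example.

Method 1:

GHCi> import Network.HTTP.Simple
GHCi> response = httpLBS "http://news.ycombinator.com"
GHCi> getResponseHeader "Content-Type" <$> response

Method 2:

GHCi> response <- httpLBS "http://news.ycombinator.com"
GHCi> getResponseHeader "Content-Type" response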

 

39.3. Making an HTTP request

Although your simple use of httpLBS is convenient, you need to change a few things. Your request to the API requires you to use HTTPS rather than plain HTTP, as well as to pass your token in the header. You can’t simply put a URL into your request; you also need to set the request method, specify the host and path, add your token to the header, and make sure the request uses an SSL connection.

You can do this by using a series of functions that set these properties for your request. The code to build your request follows. Even though making this request is straightforward, you’re using an operator that you haven’t used in this book so far: the $ operator. The $ operator automatically wraps parentheses around your code (for more details, see the following sidebar).

Listing 39.5. The code for building an HTTPS request for the API
buildRequest :: BC.ByteString -> BC.ByteString -> BC.ByteString
             -> BC.ByteString -> Request
buildRequest token host method path  = setRequestMethod method
                                  $ setRequestHost host
                                  $ setRequestHeader "token" [token]
                                  $ setRequestPath path
                                  $ setRequestSecure True
                                  $ setRequestPort 443
                                  $ defaultRequest

request :: Request
request = buildRequest myToken noaaHost "GET" apiPath
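One way to convince yourself that buildRequest sets what you expect is to inspect the result before sending anything. The sketch below repeats buildRequest and prints a few fields. It assumes you’ve also added http-client to build-depends so that the field accessors in Network.HTTP.Client are available; demoRequest and its placeholder token are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as BC
import qualified Network.HTTP.Client as Client
import Network.HTTP.Simple

buildRequest :: BC.ByteString -> BC.ByteString -> BC.ByteString
             -> BC.ByteString -> Request
buildRequest token host method path = setRequestMethod method
                                    $ setRequestHost host
                                    $ setRequestHeader "token" [token]
                                    $ setRequestPath path
                                    $ setRequestSecure True
                                    $ setRequestPort 443
                                    $ defaultRequest

-- Hypothetical request using a placeholder token; nothing is sent.
demoRequest :: Request
demoRequest = buildRequest "placeholder-token" "www.ncdc.noaa.gov"
                           "GET" "/cdo-web/api/v2/datasets"

main :: IO ()
main = do
  -- Record accessors from Network.HTTP.Client read fields of the Request.
  print (Client.method demoRequest)
  print (Client.host demoRequest)
  print (Client.path demoRequest)
  print (Client.port demoRequest)
  print (Client.secure demoRequest)
  -- getRequestHeader comes from Network.HTTP.Simple.
  print (getRequestHeader "token" demoRequest)
```

Because building a Request is pure, this check runs instantly and doesn’t require a valid API token.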
The $ operator

The $ operator is most commonly used to automatically create parentheses. You can visualize the opening parenthesis as starting at the $ and the closing parenthesis as falling at the end of the expression (covering multiple lines if necessary). For example, suppose you want to double 2 + 2. You need to add parentheses to make sure the operation works correctly:

GHCi> (*2) 2 + 2
6
GHCi> (*2) (2 + 2)
8

You could alternatively write this:

GHCi> (*2) $ 2 + 2
8

Here’s another example:

GHCi> head (map (++"!") ["dog","cat"])
"dog!"
GHCi> head $ map (++"!") ["dog","cat"]
"dog!"

For beginners, the $ often makes Haskell code more difficult to parse. In practice, the $ operator is used frequently, and you’ll likely find you prefer using it over many parentheses. There’s nothing magical about $; if you look at its type signature, you can see how it works:

($) :: (a -> b) -> a -> b

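The arguments are just a function and a value. The trick is that $ is a binary operator with the lowest possible precedence (it’s declared infixr 0), whereas ordinary function application binds more tightly than any operator. Therefore, everything to the right of $ is evaluated first, as though it were in parentheses, and the result is passed as the argument to the function on the left.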
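The following short sketch gathers these cases into one program so you can experiment with them. The names are invented for this example; the last definition relies on $ being an ordinary function that can be passed around like any other.

```haskell
module Main where

-- Explicit parentheses: double the sum.
withParens :: Int
withParens = (*2) (2 + 2)

-- The same expression written with $.
withDollar :: Int
withDollar = (*2) $ 2 + 2

-- Without grouping, function application binds first: ((*2) 2) + 2.
withoutGrouping :: Int
withoutGrouping = (*2) 2 + 2

-- $ is right-associative, so this reads as negate (abs (3 - 10)).
chained :: Int
chained = negate $ abs $ 3 - 10

-- Because $ is an ordinary function, it can be sectioned and mapped.
applied :: [Int]
applied = map ($ 3) [(+1), (*2)]

main :: IO ()
main = print (withParens, withDollar, withoutGrouping, chained, applied)
-- prints (8,8,6,-7,[4,6])
```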

The interesting thing about this code is the way you’re handling changing the state of your request. You have a bunch of setValue functions, but how are they setting a value? You’ll get a better sense of what’s going on if you explore the types of these set functions:

GHCi> :t setRequestMethod
setRequestMethod :: BC.ByteString -> Request -> Request

GHCi> :t setRequestHeader
setRequestHeader :: HeaderName -> [BC.ByteString] -> Request -> Request

Here you see one functional solution to having state. Each setValue function takes the argument for the parameter it’s going to set and existing request data. You start with an initial request, defaultRequest, which is provided by the Network.HTTP.Simple module. You then create a new copy of the request data with the modified parameter, finally returning the modified request as a result. You saw this type of solution in unit 1, only much more verbose. You could rewrite your function, explicitly controlling your state with a let clause. Notice that these function calls are in reverse order.

Listing 39.6. buildRequest rewritten with the state saved as variables
buildRequest token host method path  =
   let state1 = setRequestPort 443 defaultRequest
   in let state2 = setRequestSecure True state1
      in let state3 = setRequestPath path state2
         in let state4 = setRequestHeader "token" [token] state3
            in let state5 = setRequestHost host state4
               in setRequestMethod method state5

Using the $ operator to make each setValue function serve as the argument to the next function makes the code much more compact. Haskellers strongly prefer terse code whenever possible, though this can sometimes make reading the code more difficult when starting out.
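To see the same pattern without any library involved, here’s a minimal sketch that uses a made-up Settings record. The type and its setters are hypothetical and only mimic the shape of the setRequest functions: each setter takes a new value and an existing record and returns a modified copy, leaving the original untouched.

```haskell
module Main where

-- Hypothetical record standing in for Request.
data Settings = Settings
  { settingsHost   :: String
  , settingsPort   :: Int
  , settingsSecure :: Bool
  } deriving (Show, Eq)

-- Plays the role of defaultRequest.
defaultSettings :: Settings
defaultSettings = Settings "localhost" 80 False

-- Each setter returns a new copy with one field changed.
setHost :: String -> Settings -> Settings
setHost h s = s { settingsHost = h }

setPort :: Int -> Settings -> Settings
setPort p s = s { settingsPort = p }

setSecure :: Bool -> Settings -> Settings
setSecure b s = s { settingsSecure = b }

-- Setters chained with $, applied from the bottom up.
httpsSettings :: Settings
httpsSettings = setHost "www.ncdc.noaa.gov"
              $ setSecure True
              $ setPort 443
              $ defaultSettings

main :: IO ()
main = do
  print defaultSettings  -- the original value is unchanged
  print httpsSettings
```

Because nothing is mutated, defaultSettings can safely be reused as the starting point for any number of other configurations.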

39.4. Putting it all together

Now you have to put together your main IO action. You can pass your request into httpLBS. Then you’ll get the status. After you have the status, you’ll check whether it’s 200. If it’s 200, you’ll write the data to a file by using the getResponseBody function. Otherwise, you’ll alert the user that there was an error in your request. When you write your file, it’s important to notice that you’re using the raw lazy ByteStrings with L.writeFile rather than the Char8 version LC.writeFile. In lesson 25, we mentioned that when you use binary data that may include Unicode, you should never write it using the Char8 interface, as it can corrupt your data.

Listing 39.7. Your final main for writing your request to a JSON file
main :: IO ()
main = do
  response <- httpLBS request
  let status = getResponseStatusCode response
  if status == 200
    then do
         print "saving request to file"
         let jsonBody = getResponseBody response
         L.writeFile "data.json" jsonBody
    else print "request failed with error"

Now you have a basic application that can fetch data from the REST API and write it to a file. This is just a taste of the type of HTTP request you can make using Haskell. The full documentation for this library can be found at https://haskell-lang.org/library/http-client.
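The earlier caution about the Char8 interface can be made concrete with a small experiment. One place where Char8 loses information is pack, which keeps only the low 8 bits of each character. The sketch below packs the Greek letter lambda (code point 955) and compares the result with the two bytes of its UTF-8 encoding; the names are invented for this example.

```haskell
module Main where

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Lambda is code point 955, which does not fit in a single byte,
-- so Char8's pack truncates it to 955 mod 256 = 187.
lambdaViaChar8 :: B.ByteString
lambdaViaChar8 = BC.pack "\955"

-- The UTF-8 encoding of lambda is the two bytes 0xCE 0xBB.
lambdaUtf8 :: B.ByteString
lambdaUtf8 = B.pack [0xCE, 0xBB]

main :: IO ()
main = do
  print (B.unpack lambdaViaChar8)  -- prints [187]
  print (B.unpack lambdaUtf8)      -- prints [206,187]
```

The response body you get from httpLBS is already a sequence of raw bytes, so writing it out unchanged avoids this kind of loss.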

Summary

In this lesson, our objective was to give you a quick overview of how to make an HTTP request in Haskell. In addition to learning how to make an HTTP request, you learned how to go about learning new libraries in Haskell. Let’s see if you got this.

Q39.1

Build a function buildRequestNOSSL that works exactly like buildRequest, only it doesn’t support SSL.

Q39.2

Improve the output of your code when something goes wrong. getResponseStatus will give you a data type including both the statusCode and the statusMessage. Fix main so that if you do get a non-200 statusCode, you print out the appropriate error.