After reading lesson 40, you’ll be able to
In this lesson, you’ll work with JavaScript Object Notation (JSON) data, one of the most popular ways to store and transmit data. The JSON format originates in simple JavaScript objects and is heavily used in transmitting data through HTTP APIs. Because the format is so simple, it has seen widespread adoption outside the web, frequently being used as a method of storing data and for tasks such as creating configuration files. Figure 40.1 shows an example JSON object used with the Google Analytics API.
In the previous lesson, you ended up downloading a JSON file containing information on data sets available through the NOAA Climate Data API. In this lesson, you’re going to build a simple command-line application that opens that JSON file and prints out the data sources in the file. Before you get there, you’ll learn how to work with JSON. You’ll create types that you can turn into JSON as well as create types representing JSON that you’ll read in.
You have a data type representing a user:
data User = User
  { userId   :: Int
  , userName :: T.Text
  , email    :: T.Text
  }
The process of transforming objects into and out of JSON is known as serialization and deserialization, respectively. You may have come across this in other languages. If you have a data type representing a user, how did you serialize and deserialize to and from this type?
The key challenge of working with JSON in Haskell is that JSON supports only a few simple types: objects, strings, numbers (technically just floats), Booleans, and lists. In many programming languages, JSON is supported by using a dictionary-like data structure. You’ll use the Aeson library, which provides a much more appropriate solution for Haskell. Aeson allows you to translate back and forth between Haskell’s powerful data types and JSON.
Aeson relies on two key functions for translating back and forth between Haskell data types and JSON: encode and decode. To use these two functions, you need to make your data an instance of two type classes: ToJSON (encode) and FromJSON (decode). We’ll demonstrate two ways to do this. The first is automatically deriving the type classes with the help of a language extension. The other is to implement these classes yourself.
After you’ve learned how to use Aeson, you can create a data type representing the JSON data you downloaded from NOAA. The JSON response from the NOAA Climate Data API involves nested objects, so you’ll implement a nontrivial data type to interact with this data. Finally, you’ll put everything together in a program that lists the contents of your file.
You’ll use a stack project called json-lesson for this lesson. As you did last time, for convenience, you’ll keep all of your code in the Main module. The first thing you need to do is set up your Main.hs file. You’ll start by importing the basics. You’ll use the popular Aeson library for working with JSON (Aeson was the father of the ancient Greek mythical hero Jason). In this lesson, all the textual data you’re working with will be in the form of Data.Text, because this is the preferred method in Haskell for representing text. You also need to import lazy ByteStrings and the Char8 helper for these. Your JSON will be represented as ByteStrings by default until you transform it into more meaningful types. Here’s your starter Main.hs file, which includes all the imports you need for this lesson.
module Main where

import Data.Aeson
import qualified Data.Text as T
import qualified Data.ByteString.Lazy as B
import qualified Data.ByteString.Lazy.Char8 as BC
import GHC.Generics

main :: IO ()
main = putStrLn "hi"
You also have to add these libraries to your json-lesson.cabal file. You want to make sure you’re using the OverloadedStrings extension. You’ll also use a new extension for this lesson.
build-depends:       base
                   , json-lesson
                   , aeson
                   , bytestring
                   , text
default-language:    Haskell2010
extensions:          OverloadedStrings
                   , DeriveGeneric
Now you can start exploring how you’re going to model JSON in Haskell.
To work with JSON, you’ll use the most popular Haskell library for JSON: Aeson. The main challenge you face when working with JSON and Haskell is that JSON has little regard for types, representing most of its data as strings. The great thing about Aeson is that it lets you apply Haskell’s strong type system to JSON data. You get the ease of working with a widely adopted and flexible data format without having to sacrifice any of Haskell’s power and type-related safety.
Aeson relies on two straightforward functions to do the bulk of the work. The decode function takes JSON data and transforms it into a target type. Here’s the type signature for decode:
decode :: FromJSON a => ByteString -> Maybe a
Two things are worth noticing here. First, decode returns a Maybe type. As mentioned in lesson 38, Maybe is a good way to handle errors in Haskell. In this case, the errors you’re concerned with are failures to parse the JSON data. There are many ways your parse could go wrong; for example, the JSON could be malformed or not match the type you expect. If something goes wrong with parsing the JSON data, you’ll get a Nothing value. You also learned that Either is often a better type because it can tell you what went wrong. Aeson also offers an eitherDecode function that gives you more informative error messages by using the Left constructor (remember that Left is the constructor used for errors):
eitherDecode :: FromJSON a => ByteString -> Either String a
The other important thing to notice is that the type parameter of your Maybe (or Either) type is constrained by the type class FromJSON. Making a type an instance of FromJSON enables you to convert raw JSON into a Maybe instance of your type. You’ll explore ways of making data an instance of FromJSON in the next section.
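Built-in types such as Int and lists already have FromJSON instances, so both functions can be tried before you define any types of your own. The following is a minimal sketch; the input strings and the names goodInput, wrongShape, parsedGood, parsedWrong, and explainedWrong are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (decode, eitherDecode)
import qualified Data.ByteString.Lazy.Char8 as BC

-- Well-formed JSON that matches the target type [Int].
goodInput :: BC.ByteString
goodInput = "[1,2,3]"

-- Well-formed JSON, but the elements are strings rather than numbers.
wrongShape :: BC.ByteString
wrongShape = "[\"one\",\"two\"]"

parsedGood :: Maybe [Int]
parsedGood = decode goodInput

-- decode gives no hint about why the parse failed.
parsedWrong :: Maybe [Int]
parsedWrong = decode wrongShape

-- eitherDecode keeps the failure reason in a Left.
explainedWrong :: Either String [Int]
explainedWrong = eitherDecode wrongShape

main :: IO ()
main = do
  print parsedGood      -- Just [1,2,3]
  print parsedWrong     -- Nothing
  print explainedWrong  -- a Left; the message wording depends on the Aeson version
```

The type annotations matter here: decode and eitherDecode pick the FromJSON instance from the result type you ask for.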
The other important function in Aeson is encode, which does the opposite of decode. Here’s the type signature of encode:
encode :: ToJSON a => a -> ByteString
The encode function takes a type that’s an instance of ToJSON and returns a JSON object represented as a ByteString. ToJSON is the counterpart to FromJSON. If a type is an instance of both FromJSON and ToJSON, it can trivially be converted to and from JSON. Next you’ll look at how to take your data and make it an instance of each of these type classes.
Why does encode return a ByteString rather than a Maybe ByteString?
Because there’s no way that your data type could fail to be turned into JSON. The issue arises only when you have JSON that may not be able to be parsed into your original type.
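The asymmetry between the two functions can be seen with a built-in type. This is a small sketch using made-up sample data (scores); it assumes only the encode and decode functions described above.

```haskell
module Main where

import Data.Aeson (decode, encode)
import qualified Data.ByteString.Lazy.Char8 as BC

-- Hypothetical sample data; [Int] already has ToJSON and FromJSON instances.
scores :: [Int]
scores = [90, 85, 77]

-- encode always produces a ByteString, so no Maybe is needed.
scoresJSON :: BC.ByteString
scoresJSON = encode scores

-- Going back can fail in general, so decode wraps its result in Maybe.
scoresAgain :: Maybe [Int]
scoresAgain = decode scoresJSON

main :: IO ()
main = do
  BC.putStrLn scoresJSON  -- prints [90,85,77]
  print scoresAgain       -- prints Just [90,85,77]
```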
The aim of Aeson is to make it trivial to convert back and forth between Haskell data types and raw JSON. This is a particularly interesting challenge because JSON has a limited number of types to work with: objects, numbers (technically just floats), strings, Booleans, and arrays of values. To do this, Aeson uses two type classes: FromJSON and ToJSON. The FromJSON type class allows you to parse JSON and turn it into a Haskell data type, and ToJSON allows you to turn Haskell data types into JSON. Aeson does a remarkably good job of making this easy in many cases.
For many data types in Haskell, implementing both ToJSON and FromJSON is remarkably easy. Let’s start with a Book type, which you’ll make an instance of both ToJSON and FromJSON. Your Book type will be incredibly simple, having only a text value for the title, another text value for the author, and an Int for the year of publication. Later in this lesson, you’ll look at more complicated data. Here’s the definition of your Book type.
data Book = Book
  { title  :: T.Text
  , author :: T.Text
  , year   :: Int
  } deriving Show
There’s an easy way to make the Book type both an instance of FromJSON and ToJSON. To do this, you need to use another language extension called DeriveGeneric. This extension adds support for better generic programming in Haskell. This makes it possible to write generic instances of a type class definition, allowing for new data to easily be an instance of a class with no extra code required. The DeriveGeneric extension makes it possible to easily derive instances of FromJSON and ToJSON. All you have to do is add Generic to your deriving statement.
data Book = Book
  { title  :: T.Text
  , author :: T.Text
  , year   :: Int
  } deriving (Show,Generic)
Finally, you have to declare Book an instance of FromJSON and ToJSON. You need to do nothing more than add these two lines (no additional where clause or definition is necessary).
instance FromJSON Book
instance ToJSON Book
To demonstrate the power of these type classes, let’s take an example of your type and encode it.
myBook :: Book
myBook = Book { author = "Will Kurt"
              , title = "Learn Haskell"
              , year = 2017 }

myBookJSON :: BC.ByteString
myBookJSON = encode myBook
In GHCi, you can see how this looks:
GHCi> myBook
Book {title = "Learn Haskell", author = "Will Kurt", year = 2017}
GHCi> myBookJSON
"{\"author\":\"Will Kurt\",\"title\":\"Learn Haskell\",\"year\":2017}"
You can also do the reverse just as easily. Here’s a raw JSON ByteString that you’ll parse into your data type.
rawJSON :: BC.ByteString
rawJSON = "{\"author\":\"Emil Cioran\",\"title\":\"A Short History of Decay\",\"year\":1949}"

bookFromJSON :: Maybe Book
bookFromJSON = decode rawJSON
In GHCi, you can see that you’ve successfully created a Book from this JSON:
GHCi> bookFromJSON
Just (Book { title = "A Short History of Decay"
           , author = "Emil Cioran"
           , year = 1949})
This is a powerful feature of Aeson. From a string of JSON, which usually has little type information, you were able to successfully create a Haskell type. In many languages, parsing JSON means getting a hash table or a dictionary of keys and values. Because of Aeson, you can get something much more powerful from your JSON.
Notice that your result is wrapped in the Just data constructor. That’s because a parsing error could have easily made it impossible to make an instance of your type. If you have malformed JSON that doesn’t work, you get nothing.
wrongJSON :: BC.ByteString
wrongJSON = "{\"writer\":\"Emil Cioran\",\"title\":\"A Short History of Decay\",\"year\":1949}"

bookFromWrongJSON :: Maybe Book
bookFromWrongJSON = decode wrongJSON
As expected, when you load this into GHCi, you see that your result is Nothing:
GHCi> bookFromWrongJSON
Nothing
This is also a great example of the limitations of Maybe. You know what went wrong when parsing this JSON because you purposefully wrote this code with an error. But in a real project, this would be an amazingly frustrating error, especially if you didn’t have easy access to the raw JSON data to inspect. As an alternative, you can use eitherDecode, which gives you much more information:
GHCi> eitherDecode wrongJSON :: Either String Book
Left "Error in $: The key \"author\" was not found"
Now you know exactly why your parse failed. (The exact wording of the message varies between Aeson versions.)
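In a program, rather than in GHCi, you would typically pattern match on the Either to decide what to do with the error. Here is one possible sketch; describeParse, goodJSON, and badJSON are hypothetical names introduced for this example and are not part of Aeson.

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Aeson (FromJSON, eitherDecode)
import qualified Data.ByteString.Lazy.Char8 as BC
import qualified Data.Text as T
import GHC.Generics (Generic)

data Book = Book
  { title  :: T.Text
  , author :: T.Text
  , year   :: Int
  } deriving (Show, Generic)

instance FromJSON Book

-- Hypothetical helper: turns either outcome of the parse into a message.
describeParse :: BC.ByteString -> String
describeParse raw =
  case eitherDecode raw of
    Left err   -> "could not parse book: " ++ err
    Right book ->
      "parsed: " ++ T.unpack (title book) ++ " (" ++ show (year book) ++ ")"

goodJSON :: BC.ByteString
goodJSON = "{\"author\":\"Emil Cioran\",\"title\":\"A Short History of Decay\",\"year\":1949}"

-- Uses "writer" where the Book type expects "author".
badJSON :: BC.ByteString
badJSON = "{\"writer\":\"Emil Cioran\",\"title\":\"A Short History of Decay\",\"year\":1949}"

main :: IO ()
main = do
  putStrLn (describeParse goodJSON)
  putStrLn (describeParse badJSON)
```

Because the Right branch uses title and year, the compiler infers that the parse target is Book without an explicit annotation.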
Although using DeriveGeneric makes using Aeson incredibly easy, you won’t always be able to take advantage of this. Occasionally, you’ll have to help Aeson figure out how exactly to parse your data.
Use Generic to implement ToJSON and FromJSON for this type:
data Name = Name
  { firstName :: T.Text
  , lastName  :: T.Text
  } deriving (Show)
data Name = Name
  { firstName :: T.Text
  , lastName  :: T.Text
  } deriving (Show,Generic)

instance FromJSON Name
instance ToJSON Name
In the preceding example, you started with a type you defined and made it work with JSON. In practice, you’re just as often working with someone else’s JSON data. Here’s an example of an error message you might get as a response to a JSON request because of an error on the other person’s server.
sampleError :: BC.ByteString
sampleError = "{\"message\":\"oops!\",\"error\": 123}"
To use Aeson, you need to model this request with your own data type. When you do this, you’ll immediately see there’s a problem. Here’s the first attempt to model this error message.
data ErrorMessage = ErrorMessage
  { message :: T.Text
  , error   :: Int
  } deriving Show
The problem here is that you have a property named error, but you can’t have this, because error is already defined in Haskell. You could rewrite your type to avoid this collision.
data ErrorMessage = ErrorMessage
  { message   :: T.Text
  , errorCode :: Int
  } deriving Show
Unfortunately, if you try to automatically derive ToJSON and FromJSON, your programs will expect an errorCode field instead of error. If you were in control of this JSON, you could rename the field, but you’re not. You need another solution to this problem.
To make your ErrorMessage type an instance of FromJSON, you need to define one function: parseJSON. You can do this in the following way.
instance FromJSON ErrorMessage where
  parseJSON (Object v) =
    ErrorMessage <$> v .: "message"
                 <*> v .: "error"
This code is confusing, so breaking it down is worthwhile. The first part shows the method you need to define and the argument it takes:
parseJSON (Object v)
The (Object v) is the JSON object being parsed. By pattern matching on just the v inside, you get access to the contents of that JSON object. Next you have a bunch of infix operators you need to make sense of. You’ve seen this pattern before, in unit 5, when you learned about common uses of applicatives:
ErrorMessage <$> value <*> value
As a refresher, suppose the values for your ErrorMessage were in a Maybe context.
exampleMessage :: Maybe T.Text
exampleMessage = Just "Opps"

exampleError :: Maybe Int
exampleError = Just 123
If you want to make an ErrorMessage, you can combine <$> and <*> to safely make this ErrorMessage in the context of a Maybe:
GHCi> ErrorMessage <$> exampleMessage <*> exampleError
Just (ErrorMessage {message = "Opps", errorCode = 123})
This pattern works with any instance of Applicative. In this case, you’re not working with values in a Maybe context but in a Parser context. This brings you to the final mystery: what’s the (.:) operator? You can figure this out by looking at its type:
(.:) :: FromJSON a => Object -> Text -> Parser a
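This operator takes an Object (your JSON object) and some text and returns a value parsed into a context. (In Aeson 2.0 and later, the second argument has the type Key instead of Text; with OverloadedStrings, a string literal works in either case.) For example, this line of code is trying to parse the message field from your JSON object: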
v .: "message"
The result is a value in a Parser context. The reason you need a context for your parse is that it can fail if there’s trouble parsing.
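One caveat with matching only on (Object v) is that the method has no case for other JSON values, such as arrays or strings. Aeson provides a withObject helper that covers this: it runs your parser when the value is an object and fails inside the Parser context otherwise. The following is a sketch of the same instance written that way; it is an alternative to the version shown in this lesson, not a replacement the lesson relies on.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (FromJSON (..), decode, withObject, (.:))
import qualified Data.Text as T

data ErrorMessage = ErrorMessage
  { message   :: T.Text
  , errorCode :: Int
  } deriving (Show, Eq)

-- The first argument to withObject is a label used in error messages.
instance FromJSON ErrorMessage where
  parseJSON = withObject "ErrorMessage" $ \v ->
    ErrorMessage <$> v .: "message"
                 <*> v .: "error"

main :: IO ()
main = do
  -- A JSON object with the expected fields parses normally.
  print (decode "{\"message\":\"oops!\",\"error\": 123}" :: Maybe ErrorMessage)
  -- An array is valid JSON but not an object, so the result is Nothing.
  print (decode "[1,2,3]" :: Maybe ErrorMessage)
```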
Make the Name type into an instance of FromJSON without Generic:
data Name = Name
  { firstName :: T.Text
  , lastName  :: T.Text
  } deriving (Show)
instance FromJSON Name where
  parseJSON (Object v) =
    Name <$> v .: "firstName"
         <*> v .: "lastName"
Now that your ErrorMessage type is an instance of FromJSON, you can finally parse the incoming JSON ErrorMessages.
sampleErrorMessage :: Maybe ErrorMessage
sampleErrorMessage = decode sampleError
In GHCi, you find this works as expected:
GHCi> sampleErrorMessage
Just (ErrorMessage {message = "oops!", errorCode = 123})
And of course you want to go back again. The syntax for creating your message is different:
instance ToJSON ErrorMessage where
  toJSON (ErrorMessage message errorCode) =
    object [ "message" .= message
           , "error" .= errorCode
           ]
Once again you have a confusing bit of code. This time you’re defining the toJSON method. You can see that the method takes your data constructor and pattern matches on its two arguments, message and errorCode:
toJSON (ErrorMessage message errorCode)
You then use the object function to create your JSON object, passing the values of your data type into the correct fields for the JSON object:
object [ "message" .= message
       , "error" .= errorCode
       ]
You have another new operator here, (.=). This operator is used to create a key/value pair matching the value of your data with the field name for the JSON object.
Finally, make Name an instance of ToJSON without Generic:
data Name = Name
  { firstName :: T.Text
  , lastName  :: T.Text
  } deriving (Show)
instance ToJSON Name where
  toJSON (Name firstName lastName) =
    object [ "firstName" .= firstName
           , "lastName" .= lastName
           ]
Now you can create your own raw JSON, just like the one you received.
anErrorMessage :: ErrorMessage
anErrorMessage = ErrorMessage "Everything is Okay" 0
Again, you can see that this works exactly as you expect:
GHCi> encode anErrorMessage
"{\"error\":0,\"message\":\"Everything is Okay\"}"
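When you write both instances by hand, it’s easy for the field names in the two to drift apart. One way to guard against that is a round-trip check: encode a value, decode the result, and compare. The sketch below assumes an Eq instance on ErrorMessage and adds a fallback clause to parseJSON; roundTrips is a hypothetical helper, not an Aeson function.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson
  ( FromJSON (..), ToJSON (..), Value (Object)
  , decode, encode, object, (.:), (.=) )
import qualified Data.Text as T

data ErrorMessage = ErrorMessage
  { message   :: T.Text
  , errorCode :: Int
  } deriving (Show, Eq)

instance FromJSON ErrorMessage where
  parseJSON (Object v) =
    ErrorMessage <$> v .: "message"
                 <*> v .: "error"
  -- Fallback so that non-object input fails inside the Parser.
  parseJSON _ = fail "expected a JSON object"

instance ToJSON ErrorMessage where
  toJSON (ErrorMessage msg code) =
    object [ "message" .= msg
           , "error" .= code
           ]

-- Hypothetical helper: True when decoding the encoded value gives it back.
roundTrips :: ErrorMessage -> Bool
roundTrips msg = decode (encode msg) == Just msg

main :: IO ()
main = print (roundTrips (ErrorMessage "Everything is Okay" 0))
```

If the key in toJSON were changed to "errorCode" while parseJSON still looked for "error", roundTrips would return False.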
Now that you have the basics of working with JSON data in Haskell down, let’s take a look at a more complex problem.
In the preceding lesson, you learned how to use Network.HTTP.Simple to save a JSON file to disk. You saved a list of NOAA data sets to a file named data.json. If you didn’t run the code from lesson 39, you can get the data here: https://gist.github.com/willkurt/9dc14babbffea1a30c2a1e121a81bc0a. Now you’re going to read in that file and print the names of the data sets. The interesting thing about this file is that the JSON isn’t a simple type. Your JSON data has nested results and looks like this.
{
  "metadata":{
    "resultset":{
      "offset":1,
      "count":11,
      "limit":25
    }
  },
  "results":[
    {
      "uid":"gov.noaa.ncdc:C00861",
      "mindate":"1763-01-01",
      "maxdate":"2017-02-01",
      "name":"Daily Summaries",
      "datacoverage":1,
      "id":"GHCND"
    },
    .....
You’re going to model the entire response with a NOAAResponse data type. NOAAResponse is made up of two types: Metadata and Results. Metadata itself contains another type, Resultset. Then you have NOAAResults, which contains values.
You’ll start with your basic result, because that’s ultimately what you’re interested in, and it doesn’t contain any more sophisticated types. Because each result contains an id field, and id is already the name of a function in Haskell’s Prelude, you need to define a custom implementation of your instances. Here’s the data type for a result. You’ll name this type NOAAResult to distinguish it from the Result type in Aeson.
data NOAAResult = NOAAResult
  { uid          :: T.Text
  , mindate      :: T.Text
  , maxdate      :: T.Text
  , name         :: T.Text
  , datacoverage :: Int
  , resultId     :: T.Text
  } deriving Show
Because the data uses id instead of resultId, you need to make your own instance of FromJSON. You’re not concerned about ToJSON, because you’ll be reading only from the data.
instance FromJSON NOAAResult where
  parseJSON (Object v) =
    NOAAResult <$> v .: "uid"
               <*> v .: "mindate"
               <*> v .: "maxdate"
               <*> v .: "name"
               <*> v .: "datacoverage"
               <*> v .: "id"
Next you need to tackle the Metadata type. The first part of your Metadata is Resultset. Thankfully, you don’t need a custom implementation of FromJSON. You just need to define your type, add deriving (Generic), and make it an instance of your type class.
data Resultset = Resultset
  { offset :: Int
  , count  :: Int
  , limit  :: Int
  } deriving (Show,Generic)

instance FromJSON Resultset
The Metadata data type itself has only the Resultset value, and it’s simple to write.
data Metadata = Metadata
  { resultset :: Resultset
  } deriving (Show,Generic)

instance FromJSON Metadata
Finally, you put together these other types into your NOAAResponse. As with Resultset and Metadata, there’s no issue with the naming of your fields, so you can derive the necessary class.
data NOAAResponse = NOAAResponse
  { metadata :: Metadata
  , results  :: [NOAAResult]
  } deriving (Show,Generic)

instance FromJSON NOAAResponse
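Before reading the real file, it can help to check the nested types against a small inline sample. The sketch below repeats the types from this section and decodes a made-up response containing a single result (sampleResponse is hypothetical, modeled on the excerpt above with its count set to 1). It uses eitherDecode so that a mismatch between the types and the JSON is reported, and it adds a fallback clause to the hand-written parseJSON.

```haskell
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Aeson (FromJSON (..), Value (Object), eitherDecode, (.:))
import qualified Data.ByteString.Lazy.Char8 as BC
import qualified Data.Text as T
import GHC.Generics (Generic)

data NOAAResult = NOAAResult
  { uid          :: T.Text
  , mindate      :: T.Text
  , maxdate      :: T.Text
  , name         :: T.Text
  , datacoverage :: Int
  , resultId     :: T.Text
  } deriving Show

-- Hand-written because the JSON key is "id" but the field is resultId.
instance FromJSON NOAAResult where
  parseJSON (Object v) =
    NOAAResult <$> v .: "uid"
               <*> v .: "mindate"
               <*> v .: "maxdate"
               <*> v .: "name"
               <*> v .: "datacoverage"
               <*> v .: "id"
  parseJSON _ = fail "expected a JSON object for NOAAResult"

data Resultset = Resultset
  { offset :: Int
  , count  :: Int
  , limit  :: Int
  } deriving (Show, Generic)

instance FromJSON Resultset

data Metadata = Metadata
  { resultset :: Resultset
  } deriving (Show, Generic)

instance FromJSON Metadata

data NOAAResponse = NOAAResponse
  { metadata :: Metadata
  , results  :: [NOAAResult]
  } deriving (Show, Generic)

instance FromJSON NOAAResponse

-- Hypothetical one-result response, not the real contents of data.json.
sampleResponse :: BC.ByteString
sampleResponse = mconcat
  [ "{\"metadata\":{\"resultset\":{\"offset\":1,\"count\":1,\"limit\":25}},"
  , "\"results\":[{\"uid\":\"gov.noaa.ncdc:C00861\","
  , "\"mindate\":\"1763-01-01\",\"maxdate\":\"2017-02-01\","
  , "\"name\":\"Daily Summaries\",\"datacoverage\":1,\"id\":\"GHCND\"}]}"
  ]

main :: IO ()
main =
  case eitherDecode sampleResponse of
    Left err -> putStrLn ("parse failed: " ++ err)
    Right response -> do
      -- Reach through the nested records to the count field.
      print (count (resultset (metadata response)))
      mapM_ (print . name) (results response)
```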
Your goal is to print the names of all the data sets in the file. To do this, you’ll create a printResults IO action. Because your data will be wrapped in a Maybe, you need to handle the case of the parse failing. For this, you’ll print a message that an error occurred. Otherwise, you’ll use forM_ from the Control.Monad module (remember to import Control.Monad) to loop through your results and print them. The forM_ function works just like the mapM_ function, except that it takes the data first and the function second.
printResults :: Maybe [NOAAResult] -> IO ()
printResults Nothing = putStrLn "error loading data"
printResults (Just results) =
  forM_ results (print . name)
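The difference between mapM_ and forM_ is only the argument order, as this small sketch with made-up sample names (datasetNames) shows.

```haskell
module Main where

import Control.Monad (forM_)

-- Hypothetical sample names standing in for the parsed results.
datasetNames :: [String]
datasetNames = ["Daily Summaries", "Normals Daily"]

main :: IO ()
main = do
  -- mapM_ takes the action first and the list second.
  mapM_ putStrLn datasetNames
  -- forM_ takes the list first, which reads well when the action is a lambda.
  forM_ datasetNames $ \n ->
    putStrLn ("data set: " ++ n)
```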
Now you can write your main, which will read in the file, parse the JSON, and iterate through your results.
main :: IO ()
main = do
  jsonData <- B.readFile "data.json"
  let noaaResponse = decode jsonData :: Maybe NOAAResponse
  let noaaResults = results <$> noaaResponse
  printResults noaaResults
Now you can load your project into GHCi (or build it with stack build and run the resulting executable, if you’d prefer) and see how it works:
GHCi> main
"Daily Summaries"
"Global Summary of the Month"
"Global Summary of the Year"
"Weather Radar (Level II)"
"Weather Radar (Level III)"
"Normals Annual/Seasonal"
"Normals Daily"
"Normals Hourly"
"Normals Monthly"
"Precipitation 15 Minute"
"Precipitation Hourly"
And there you have it; you’ve successfully used Haskell to parse a nontrivial JSON file.
In this lesson, our objective was to teach you how to parse and create JSON files by using Haskell. You used the popular Aeson library, which makes it possible to convert back and forth between Haskell data types and JSON. The conversion between data types and JSON is achieved with two type classes: FromJSON and ToJSON. In the best case, you can use the DeriveGeneric language extension to derive these classes automatically. Even in the worst case, where you have to help Aeson translate your data types, doing this is still relatively easy. Let’s see if you got this.
Make your NOAAResponse type an instance of ToJSON. This requires making all the types used by this type instances of ToJSON as well.
Make a sum type called IntList and use DeriveGeneric to make it an instance of ToJSON. Don’t use the existing List type, but rather write it from scratch. Here’s an example of an IntList:
intListExample :: IntList
intListExample = Cons 1 $ Cons 2 EmptyList