Chapter 7. Building Recommendation Systems for Businesses

"By leveraging Azure Machine Learning and the Recommendations API, we have launched a new Personalized Commerce Experience for retailers that grows shopper conversion and engagement on any channel."
– Frank Kouretas, Chief Product Officer at Orckestra

In the previous chapter, we covered the remaining language APIs. In this chapter, we will look at the Recommendations Solution template, a template for Microsoft Azure that contains the resources required to run the Recommendations Solution. The solution is well suited for e-commerce applications, where you can recommend items based on different criteria. Recommending items in an online store can be very time-consuming if it is done by maintaining a rule set by hand. The Recommendations Solution lets us use machine learning to produce good recommendations, potentially increasing the number of sales.

This chapter will cover the following topics:

Deploying the Recommendations Solution template
Training the recommendation model
Consuming recommendations

Providing personalized recommendations

If you run an e-commerce site, recommendations are a feature that your customers will appreciate. Using the Recommendations Solution, you can easily add them. Utilizing Microsoft Azure Machine Learning, the API can be trained to recognize items that should be recommended.

There are two common scenarios for recommendations, as follows:

Item-to-Item Recommendations (I2I): I2I is the scenario where certain items are often viewed after other items. Typically, this will be in the form of people who visited this item also visited this other item.
User-to-Item Recommendations (U2I): U2I is the scenario where you utilize a customer's previous actions to recommend items. If you sell movies, for example, then you can recommend other movies based on a customer's previous movie choices.

The general steps to use the Recommendations Solution are as follows:

Deploy the template in Azure
Import the catalog data (the items in your e-commerce site)
Import usage data
Train a recommendation model
Consume recommendations

If you have not already done so, you should sign up for an API key at https://portal.azure.com.

Deploying the Recommendations Solution template in Azure

To deploy the Recommendations Solution, you must have an active Microsoft Azure subscription.

Head over to https://github.com/Microsoft/Product-Recommendations/tree/master/deploy to start the deployment. Click on Deploy to Azure, as shown in the following screenshot:

Deploying the Recommendation Solution template in Azure

This will take you to the following page in Microsoft Azure:

Enter the required information, accept the terms and conditions, and click on Purchase. This will start the process of deploying the required resources for the Recommendations Solution.

After a few minutes, the deployment is done. You are now ready to upload data to train a model.

Importing catalog data

With the solution deployed, we can add catalog data. This is where you would typically add items from your database. Items need to be uploaded as files. The files need to be in CSV format.

The following table describes the data that is required for each item in your catalog:

Name	Description
Item ID	A unique identifier for a given item
Item name	The name of the item
Item category	The category for the item, such as hardware, software, book genre, and so on

In addition, there are a few data fields that are optional. These are described in the following table:

Name	Description
Description	A description of the item
Feature list	A comma-separated feature list that can enhance recommendations

A file that has all the data included may have items that look like the following:

    C9F00168, Kiruna Flip Cover, Accessories, Description of item, compatibility = lumia, hardware type = mobile

It is typically better to add features as this improves the recommendations. Any new item that has little usage is unlikely to be recommended if no features exist.

Features should be categorical. This means that a feature can be a price range. A price alone would not serve as a good feature.
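To illustrate the point about categorical features, the following is a minimal sketch of how a numeric price could be mapped to a price-range feature before it is written to the catalog file. The PriceRangeFeature class, the priceRange feature name, and the bucket boundaries are assumptions made for this example; they are not part of the Recommendations Solution.

```csharp
using System;

public static class PriceRangeFeature
{
    // Maps a raw price to a categorical "name = value" feature string.
    // The boundaries are hypothetical; choose ranges that suit your own catalog.
    public static string FromPrice(decimal price)
    {
        if (price < 0m) throw new ArgumentOutOfRangeException(nameof(price));
        if (price < 10m) return "priceRange = under 10";
        if (price < 50m) return "priceRange = 10 to 50";
        if (price < 200m) return "priceRange = 50 to 200";
        return "priceRange = 200 and above";
    }
}
```

Calling PriceRangeFeature.FromPrice(24.99m) would give a value that can be appended to the feature list of an item, in the same style as the other features in a catalog line.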

You can add up to 20 features per item. When a catalog containing features for items is uploaded, you need to perform a rank build. This will rank each feature, where features of a higher ranking will typically be better to use.

The code example for this chapter contains a sample catalog. We will use this for the following example. Alternatively, you can download sample data from Microsoft at http://aka.ms/RecoSampleData. We want to use the data from MsStoreData.zip.

With the files downloaded, we can upload the catalog to our storage. This can be done by heading to your newly created storage account and creating a new blob container for the catalog, as shown in the following screenshot:

Click on Upload, browse to the sample files you downloaded, and choose the catalog.csv file. This will upload the catalog.

Note

Note that the catalog file is not required, but it is recommended that you upload it in order to supply it to the model.

The maximum number of items in a catalog is 100,000. Any given catalog file cannot be larger than 200 MB. If your file is larger, and you still have more items, you can upload several files.

Importing usage data

The next step is to upload usage data. This is a file describing all past transactions from your customers. Each row in the file is one transaction, written as a comma-separated line.

The required data fields are as follows:

Name	Description
User ID	A unique identifier for each customer
Item ID	A unique identifier for items that correlate to the catalog
Time	The time of the transaction

In addition, it is possible to have a field called Event. This describes the type of transaction. The allowed values for this field are Click, RecommendationClick, AddShopCart, RemoveShopCart, and Purchase.

Given the preceding example from the catalog, a line in the usage file may look as follows:

    00030000D16C4237, C9F00168, 2015/08/04T11:02:37, Purchase
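If you export transactions from your own database, a small helper can produce lines in this shape. The following is a sketch only: the UsageLine class is hypothetical, the timestamp pattern (yyyy/MM/ddTHH:mm:ss) is an assumption inferred from the sample line, and the fields are joined with a bare comma. The allowed event names are the ones listed above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class UsageLine
{
    // Event types accepted in the optional Event field.
    private static readonly HashSet<string> AllowedEvents = new HashSet<string>
    {
        "Click", "RecommendationClick", "AddShopCart", "RemoveShopCart", "Purchase"
    };

    // Formats one transaction as: user ID, item ID, time, event.
    public static string Format(string userId, string itemId, DateTime time, string eventType)
    {
        if (!AllowedEvents.Contains(eventType))
            throw new ArgumentException("Unknown event type: " + eventType, nameof(eventType));

        // Quoted separators keep the pattern independent of the current culture.
        string timestamp = time.ToString(
            "yyyy'/'MM'/'dd'T'HH':'mm':'ss", CultureInfo.InvariantCulture);

        return string.Join(",", userId, itemId, timestamp, eventType);
    }
}
```

Rejecting unknown event names early is a design choice for this sketch; it keeps malformed rows out of the file instead of relying on the service to report them.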

The maximum file size for a usage file is 200 MB.

The quality of recommendations relies on the amount of usage data. Typically, you should have about 20 transactions registered per item. This means that if you have 100 items in the catalog, you should aim for 2,000 transactions in the usage file.

Note that the current maximum number of transactions that the API accepts is 5 million. If new transactions are added above this maximum, the oldest data will be deleted.

Again, you can find an example usage file at http://aka.ms/RecoSampleData. Create another blob container called usage and click on Upload. Upload all the usage files from the sample folder.

Training a model

With the catalog and usage data in place, it is time to train a model.

Starting to train

To start a training process, we need to make an API call to an endpoint on the newly created app service. This can be done using a tool, such as Postman, or through your own application. We will use Postman for the purposes of this book.

Note

To download Postman, please visit https://www.getpostman.com/.

The training process can be started by sending a POST request to the following URL:

https://<service_name>.azurewebsites.net/api/models

The request must include a header, x-api-key, with your API key. It must also include another header, Content-Type, which should be set to application/json.

In addition, the request must contain a body containing the following:

Property	Mandatory	Description
`description`	No	Textual description.
`blobContainerName`	Yes	Name of the blob container where the catalog and usage data are stored.
`usageRelativePath`	Yes	Relative path to either a virtual directory that contains the usage file(s) or a specific usage file to be used for training.
`catalogFileRelativePath`	No	Relative path to the catalog file.
`evaluationUsageRelativePath`	No	Relative path to either a virtual directory that contains the usage file(s) or to a specific usage file to be used for evaluation.
`supportThreshold`	No	How conservative the model is, measured in the number of cooccurrences of items to be considered for modeling.
`cooccurrenceUnit`	No	Indicates how to group usage events before counting cooccurrence.
`similarityFunction`	No	Defines the similarity function to be used. Can be Jaccard, Cooccurrence, or Lift.
`enableColdItemPlacement`	No	This will be either `true` or `false`. Indicates whether recommendations should push cold items via feature similarity.
`enableColdToColdRecommendations`	No	This will be either `true` or `false`. Indicates whether or not the similarity between pairs of cold items should be calculated.
`enableUserAffinity`	No	This will be either `true` or `false`. Defines whether the event type and time of event should be considered as inputs to the result.
`enableBackfilling`	No	This will be either `true` or `false`. This will backfill with popular items if not enough relevant items are returned.
`allowSeedItemsInRecommendations`	No	This will be either `true` or `false`. Determines whether input items can be returned as results.
`decayPeriodInDays`	No	The decay period in days. The longer the time since an event has occurred, the less weight the event will have.
`enableUserToItemRecommendations`	No	This will be either `true` or `false`. If `true`, the user ID will be taken into account when personalized recommendations are requested.
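If you would rather start training from your own application than from Postman, the request can be assembled as in the following sketch. It builds the POST request but does not send it. The TrainingRequest class and the values for the description, container name, and relative paths are placeholders chosen for this example; replace them with the names used in your own storage account. The sketch uses System.Text.Json, which is an assumption about your target framework.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

public static class TrainingRequest
{
    // Builds (but does not send) the POST request that starts a training run.
    public static HttpRequestMessage Build(string serviceName, string apiKey)
    {
        // Only blobContainerName and usageRelativePath are mandatory.
        // All values below are placeholders for this sketch.
        var body = new
        {
            description = "Sample model",
            blobContainerName = "input-files",
            usageRelativePath = "usage",
            catalogFileRelativePath = "catalog.csv",
            similarityFunction = "Jaccard",
            enableUserToItemRecommendations = true
        };

        var request = new HttpRequestMessage(
            HttpMethod.Post, $"https://{serviceName}.azurewebsites.net/api/models");
        request.Headers.Add("x-api-key", apiKey);

        // StringContent sets the Content-Type header to application/json.
        request.Content = new StringContent(
            JsonSerializer.Serialize(body), Encoding.UTF8, "application/json");
        return request;
    }
}
```

The returned request can be passed to HttpClient.SendAsync. Properties that are left out of the body fall back to whatever defaults the service applies.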

A successful call may yield the following result:

The id field returned can be used to check the training status.

Verifying the completion of training

Using the ID returned in the previous request, we can now run a GET request to the following endpoint:

https://<service_name>.azurewebsites.net/api/models/<model_id>

This request requires a header, x-api-key, containing your API key. A successful request may give the following response:

Response of GET request

As you can see, a modelStatus field is presented. Once this is Completed, the model is trained and ready to be used. You will also be presented with statistics, such as the duration of training, among other details.
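Because training takes a while, an application will typically repeat this GET request until the status changes. The following sketch shows one way to do that. The TrainingPoller class is hypothetical, and the getStatus delegate stands in for your own code that sends the GET request and returns the modelStatus value; the only status value assumed here is Completed.

```csharp
using System;
using System.Threading.Tasks;

public static class TrainingPoller
{
    // Calls getStatus until it reports "Completed" or the attempts run out.
    // Returns true if training completed, false if we gave up waiting.
    public static async Task<bool> WaitForCompletionAsync(
        Func<Task<string>> getStatus, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string status = await getStatus();
            if (string.Equals(status, "Completed", StringComparison.OrdinalIgnoreCase))
                return true;

            // Wait before the next attempt, but not after the last one.
            if (attempt < maxAttempts)
                await Task.Delay(delay);
        }
        return false;
    }
}
```

Passing the status lookup in as a delegate keeps the polling logic independent of how the web request is made, which also makes it easy to exercise with a fake status source.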

Tip

If you prefer to use a user interface for the model training, you can visit https://<your_service>.azurewebsites.net/ui.

Consuming recommendations

To use the recommendation models we just created, we will create a new example application. Create this using the MVVM template we created previously.

At the time of writing, there is no client package for the recommendations API. This means that we need to rely on web requests, as we saw in Chapter 6, Understanding Text. To speed up the development time, copy the WebRequest.cs file from the example code in Chapter 6, Understanding Text. Paste this file into the Model folder, and make sure that you update the namespace.

Note

Remember to add references to System.Web and System.Runtime.Serialization.

As there is no need for much UI, we are going to add everything in the MainView.xaml file. We are going to need two ComboBox elements. These will list our recommendation models and catalog items. We also need a Button element to get the recommendations and a TextBox element to show the resultant recommendations.

The corresponding ViewModel, MainViewModel.cs, will need properties to correspond to the UI elements. Add an ObservableCollection of a RecommendationModel type to hold our models. We will look at the type in a bit. We need a property of a RecommendationModel type to hold the selected model. Add an ObservableCollection property of a Product type to hold the available products, with a corresponding Product property for the selected product. We will also need a string property for the results and an ICommand property for our button.

Add a private member of a WebRequest type so that we can call the API.

To use the items from our catalog, we will load the catalog file into the application, creating a Product for each item. Add a new file called Product.cs in the Model folder, and ensure that the Product class looks as follows:

    public class Product { 
        public string Id { get; set; } 
        public string Name { get; set; } 
        public string Category { get; set; }
        public Product(string id, string name, string category) { 
            Id = id; 
            Name = name; 
            Category = category; 
        } 
    }

We need the Id of an item, as well as the Name and the Category.

The MainViewModel constructor should create a WebRequest object, as shown in the following code:

    public MainViewModel() 
    { 
        _webRequest = new WebRequest ("https://<YOUR_WEB_SERVICE>.azurewebsites.net/api/models/", "API_KEY_HERE"); 
        RecommendCommand = new DelegateCommand(RecommendProduct, CanRecommendProduct); 
 
        Initialize(); 
    }

When we create the WebRequest object, we specify the recommendation endpoint and our API key. RecommendCommand is the ICommand property, implemented as a DelegateCommand. We need to specify the action to be executed and the conditions under which we are allowed to execute the command. We should be allowed to execute the command if we have selected a recommendation model and a product.

The Initialize method will make sure that we fetch our recommendation models and products, as shown in the following code:

    private async void Initialize() { 
        await GetModels(); 
        GetProducts(); 
    }

The GetModels method will make a call to the API, as shown in the following code:

    private async Task GetModels() 
    { 
        List<RecommendationModel> models = await _webRequest.GetModels(HttpMethod.Get);

This call is a GET request, so we pass HttpMethod.Get to GetModels. A successful call should result in a JSON response that we then deserialize into a list of RecommendationModel objects. RecommendationModel is a data contract, so add a file called Models.cs in a folder called Contracts.

A successful result will give the following output:

[
   { 
      "id": "string",
      "description": "string",
      "creationTime": "string",
      "modelStatus": "string"
   },
   {...},
   {...}
]

We have an array of models. As the preceding JSON shows, each item in this array has an id, description, creationTime, and modelStatus. Make sure that the RecommendationModel class contains this data.
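A data contract matching the JSON response shown above could look like the following sketch. The property names mirror the JSON fields so that no name mapping is needed. The ModelParser helper is hypothetical and only demonstrates deserialization with DataContractJsonSerializer from System.Runtime.Serialization; the WebRequest class used in this chapter may handle this step differently.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

[DataContract]
public class RecommendationModel
{
    [DataMember] public string id { get; set; }
    [DataMember] public string description { get; set; }
    [DataMember] public string creationTime { get; set; }
    [DataMember] public string modelStatus { get; set; }
}

public static class ModelParser
{
    // Deserializes a JSON array of models into a list of data contracts.
    public static List<RecommendationModel> Parse(string json)
    {
        var serializer = new DataContractJsonSerializer(typeof(List<RecommendationModel>));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            return (List<RecommendationModel>)serializer.ReadObject(stream);
        }
    }
}
```

All fields are kept as strings here because the sample response describes them as strings; a stricter contract could parse creationTime into a date type once the exact format is known.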

If the call succeeds, we add the models to the ObservableCollection of available models, as shown in the following code:

        foreach (RecommendationModel model in models) { 
            AvailableModels.Add(model); 
        }          
        SelectedModel = AvailableModels.FirstOrDefault(); 
    }

When all items are added, we set the SelectedModel to the first available option.

To add the items from our catalog, we need to read from the catalog file. In the example code provided with the book, this file is added to the project and copied to the output directory. The GetProducts method will look as follows:

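    private void GetProducts() { 
        try { 
            // The using block closes the file even if reading fails.
            using (var reader = new StreamReader(File.OpenRead("catalog.csv"))) { 
                while(!reader.EndOfStream) { 
                    string line = reader.ReadLine(); 
                    var productInfo = line.Split(','); 
 
                    // Skip blank or incomplete lines; ID, name, and category are required.
                    if(productInfo.Length < 3) continue; 
 
                    AvailableProducts.Add(new Product(productInfo[0].Trim(), productInfo[1].Trim(), productInfo[2].Trim())); 
                } 
            } 
 
            SelectedProduct = AvailableProducts.FirstOrDefault(); 
        } 
        catch(Exception ex) { 
            Debug.WriteLine(ex.Message); 
        } 
    }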

This is a basic file operation, reading in each line from the catalog. For each line, we extract the required information and create a Product, which is then added to the AvailableProducts collection. Finally, SelectedProduct is set to the first available product.

Now that we have our recommendation models and our products, we can execute the recommendation request, as shown in the following code:

    private async void RecommendProduct(object obj) 
    {  
        List<RecommendedItem> recommendations = await _webRequest.RecommendItem(HttpMethod.Get, $"{SelectedModel.id}/recommend?item={SelectedProduct.Id}");

We call the RecommendItem method on the _webRequest object. This is a GET request, so we specify the ID of the SelectedModel in the path, followed by the recommend segment so that we reach the correct endpoint. The ID of the selected product is passed in the query string. A successful response will result in JSON output, which will look as follows:

[
   { 
      "recommendedItemId": "string",
      "score": "float"
   },
   {...},
   {...}
]

The result consists of an array of objects. Each item will have a recommendedItemId and a score. The score gives an indication of how likely a customer is to want the given item.

This result should be deserialized into a list of data contracts of a RecommendedItem type, so make sure you add this in the Contracts folder.

When we have made a successful call, we want to display this in the UI, as follows:

        if(recommendations.Count == 0) { 
            Recommendations = "No recommendations found"; 
            return; 
        } 
        StringBuilder sb = new StringBuilder(); 
        sb.Append("Recommended items:\n\n");

First, we check to see whether we have any recommendations. If we do not have any, we show a message saying so and return. If we do have items, we create a StringBuilder to format our output, as follows:

        foreach(RecommendedItem recommendedItem in recommendations)  { 
            sb.AppendFormat("Score: {0}\n", recommendedItem.score); 
            sb.AppendFormat("Item ID: {0}\n", recommendedItem.recommendedItemId); 
 
            sb.Append("\n"); 
        } 
        Recommendations = sb.ToString(); 
    }

We loop through all the recommended items. For each one, we output the score and the recommendedItemId. This will be printed in the UI.

A successful test run may give the following result:

There are a few special cases to note:

If the item list contains a single item that does not exist in the catalog, then an empty result is returned
If the item list contains some items that are not in the catalog, then these are removed from the query
If the item list contains only cold items (items that have no usage data connected to them), then the most popular recommendation is returned
If the item list contains some cold items, then recommendations are returned for the other items

Recommending items based on prior activities

To make recommendations based on user activity, we need a list of users. As such a list would be too cumbersome to create just for an example, we will only look at the steps and parameters that are required to make this kind of recommendation.

The endpoint for this usage is a bit different, although it is still a GET call. In code, it would look as follows:

    $"{SelectedModel.id}/recommend/user?{queryString.ToString()}"

The parameters in the query string are as follows:

Parameter	Description
`userId` (required)	A unique identifier of a given user.
`numberOfResults` (required)	The number of recommendations returned.
`itemsIds` (optional)	A list or single ID of the selected item(s).
`includeMetadata` (optional)	If true, then the item's metadata will be included.
`buildId` (optional)	A number identifying the build we want to use. If none is specified, then the active build is used.
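The query string for such a call could be assembled as in the following sketch, which fills in only the two required parameters. The UserRecommendationQuery class is hypothetical. It relies on HttpUtility.ParseQueryString from System.Web, whose returned collection URL-encodes its contents when converted to a string.

```csharp
using System.Collections.Specialized;
using System.Globalization;
using System.Web;

public static class UserRecommendationQuery
{
    // Builds the relative URL for a user-based recommendation request.
    public static string Build(string modelId, string userId, int numberOfResults)
    {
        // An empty input yields an empty, writable collection.
        NameValueCollection queryString = HttpUtility.ParseQueryString(string.Empty);
        queryString["userId"] = userId;
        queryString["numberOfResults"] =
            numberOfResults.ToString(CultureInfo.InvariantCulture);

        // ToString() on this collection produces "name=value&name=value".
        return $"{modelId}/recommend/user?{queryString.ToString()}";
    }
}
```

Optional parameters from the table can be added to the same collection before the URL is built.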

A successful call will result in the same JSON output as the item-based recommendation request. The recommended items will be based on the user's past activities.

Note

Note that, to be able to use this, U2I must be enabled (enableUserToItemRecommendations set to true) when training the model.