Error modeling is another form of meta-level modeling, but in this case we model the cases where our predictions were wrong. By learning where a model makes mistakes, we can increase the accuracy of its predictions. We will walk through error modeling using an example.
Consider the following scenario:
Here, we have a dataset named LoyalTrain. This is just the training dataset; the testing and validation datasets are kept elsewhere, and we will build the model on the training data only. There is also a Type node and a Neural Net model, where we are predicting the variable LOYAL. Run the Analysis node to see the results, as shown in the following screenshot:
You can see that the outcome variable has two categories: people are predicted either to stay or to leave. You can also see that predictions were correct in 79% of cases and wrong in 21% of cases. In total, there were 236 errors.
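Since the Analysis node reports rounded percentages, its figures can be sanity-checked with a little arithmetic. This is a plain-Python sketch outside Modeler; the implied record count is approximate because the 21% error rate is rounded:

```python
# Rough check of the Analysis node's figures. The 21% error rate is
# rounded, so the implied number of scored records is approximate.
errors = 236
error_rate = 0.21

approx_total = round(errors / error_rate)
print(approx_total)  # roughly how many records were scored
```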
From this example, you can also see that the Neural Net model has been copied and placed in another part of the stream, and that a new variable, CORRECT, has been created using a Derive node. Let's take a look at what has happened here, as shown in the following screenshot:
Here, we have created a new field named CORRECT, with the values True and False. We are telling Modeler that if the actual value of LOYAL equals the predicted value of LOYAL, then CORRECT is True; otherwise, it is False.
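Outside Modeler, the logic of this Derive node can be sketched in a few lines of Python. The field names LOYAL and PRED_LOYAL (standing in for Modeler's generated prediction field) are assumptions for illustration:

```python
# A minimal sketch of the Derive-node logic, outside Modeler.
# Field names (LOYAL, PRED_LOYAL) are assumptions for illustration.
records = [
    {"LOYAL": "stay",  "PRED_LOYAL": "stay"},
    {"LOYAL": "leave", "PRED_LOYAL": "stay"},
    {"LOYAL": "stay",  "PRED_LOYAL": "stay"},
    {"LOYAL": "leave", "PRED_LOYAL": "leave"},
]

# CORRECT is True when the actual value matches the prediction.
for r in records:
    r["CORRECT"] = r["LOYAL"] == r["PRED_LOYAL"]

print([r["CORRECT"] for r in records])  # [True, False, True, True]
```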
If you run the Distribution node placed above it, you will see the following results:
Next, we will use the Type node to instantiate the data, after which we can use a C5.0 decision tree model, which looks at the data in a very different way. Here, we have built a C5.0 model that tries to predict whether the Neural Net prediction is correct or incorrect. Click on the generated C5.0 model to see its results, as shown in the following screenshot:
In this example, we can see 14 rules in total: 4 rules for a False prediction, that is, for when the Neural Net predicts incorrectly, and 10 rules for True values, for when it predicts correctly.
You can expand the rules and click on the % sign above them to get the following results:
In this example, the first rule states that if you're male, you use less than 1 minute of international calls and less than 1 minute of long-distance calls, and your status is single, then we predict a value of False. If you want to see the numbers, click on the % sign, where you will see the following results:
As shown in the preceding screenshot, the first rule covered 22 people, and the accuracy of the predictions relating to them was around 82%.
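As a sketch, the first rule can be written as a simple predicate. The field names and thresholds here are assumptions based on the rule as described, not Modeler's actual field names:

```python
def rule_1(record):
    """First C5.0 rule (illustrative; field names are assumptions):
    male, under 1 minute of international calls, under 1 minute of
    long-distance calls, and single => predict False, i.e. the
    Neural Net's prediction is expected to be wrong."""
    return (record["SEX"] == "M"
            and record["INTERNATIONAL"] < 1
            and record["LONG_DISTANCE"] < 1
            and record["STATUS"] == "single")

print(rule_1({"SEX": "M", "INTERNATIONAL": 0.5,
              "LONG_DISTANCE": 0.2, "STATUS": "single"}))  # True
```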
From this, we can see that certain kinds of mistakes crop up while we are making predictions. We might therefore need to use another kind of model, instead of the Neural Net model, for those cases. To do this, click on the Generate option and select Rule Trace Node, as shown in the following screenshot:
This step created the FALSE_TRUE node, which you can see in the example scenario as the Start icon; it contains all of our rules. If you wish to take a look inside it, click on the Start + icon on the Tools tab, where you should see the following result:
Let's now take a look at the first rule. Click on the expression builder in that rule, as shown in the following screenshot:
Here, the rule states that if you're male, you use less than 1 minute of international calls and less than 1 minute of long-distance calls, and your status is single, then we predict a value of False. You can also see the accuracy of that prediction.
Go back using the Start icon. Here, we have the Reclassify node, Split. Let's see what we have done so far, as follows:
We took the variable RULE and clicked on Get, which retrieved all of its original values. We renamed the False values to Incorrect and the True values to Correct, and then kept just the Correct predictions:
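As a rough Python equivalent of this step, the rule outcomes are renamed and the data is then restricted to the Correct predictions (the values here are made up for illustration):

```python
# Sketch of the reclassify-and-filter step. The rule outcomes below
# are illustrative values, not the actual dataset.
rules = ["False", "True", "True", "False", "True"]

# Rename False -> Incorrect and True -> Correct.
renamed = ["Incorrect" if r == "False" else "Correct" for r in rules]

# Keep only the records the rules flag as Correct.
correct_only = [r for r in renamed if r == "Correct"]

print(renamed)            # ['Incorrect', 'Correct', 'Correct', 'Incorrect', 'Correct']
print(len(correct_only))  # 3
```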
We have now built the Neural Net model on this subset. If you run the Analysis node attached to the Neural Net model generated from the Correct predictions, you should see the following results:
Remember that the overall accuracy of the earlier model was around 79%, which has now improved to around 84%.
We have also done the same thing for the incorrect predictions, in a separate branch from the Type node. Let's have a look at that, as follows:
We built a C5.0 model for the incorrect predictions, so let's take a look at its analysis, as follows:
The C5.0 model has done a great job on the cases where the Neural Net model did not work well; on this branch, we now have an overall accuracy of 89%.
Let's sum up what we did here. We split the dataset into records that were predicted correctly and records that were predicted incorrectly, and modeled each group separately, which gives us fewer errors than if we had used one model.
Now we need to combine the predictions from the two models. For this, go to the Error 2 Stream from the Streams tab on the right, as shown in the following screenshot:
Here, we have combined the predictions of both models and used a Derive node named Prediction, as shown in the following screenshot:
Here, we have specified that if a record falls into the correct-prediction group, the Neural Net model's prediction should be used; if it falls into the incorrect-prediction group, the C5.0 model's prediction should be used instead.
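The logic of this combining Derive node can be sketched as follows; the function and argument names are assumptions for illustration:

```python
# Sketch of the combining Derive node: use the Neural Net prediction
# for records the rules flag as Correct, the C5.0 prediction otherwise.
def combined_prediction(rule_outcome, nn_pred, c5_pred):
    return nn_pred if rule_outcome == "Correct" else c5_pred

print(combined_prediction("Correct", "stay", "leave"))    # stay
print(combined_prediction("Incorrect", "stay", "leave"))  # leave
```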
Then, having added the Matrix node, run it to see the following:
What we can see in the preceding screenshot is that we have correctly predicted that 437 people will leave, with 118 errors, and that 498 people will stay, with just 55 errors. This gives a total of 173 errors.
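These Matrix totals can be verified with quick arithmetic:

```python
# Check of the Matrix node's totals.
correct = 437 + 498   # correctly predicted "leave" plus "stay"
errors = 118 + 55     # misclassifications in each row

total = correct + errors
print(errors)                     # 173 errors in total
print(round(correct / total, 3))  # overall accuracy of the combined models
```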
Our original model made 236 errors, so we have brought the number of errors down considerably. Just by using two different models for different groups of people and combining their predictions, we have produced an output with 63 fewer errors.
This is error modeling. In error modeling, you build one model, see what the results look like, and then decide whether to build two or three models for different types of people, because it can't be assumed that one size fits all. We can build different kinds of models, feed different subsets of the data to those models, and then ultimately combine the results of each model to produce a final prediction with fewer errors.