The evaluation is heavy, both computationally and in terms of memory, so make sure that you initialize the JVM with extra heap space (for instance, java -Xmx16g). The computation can take anywhere from a couple of hours to several days, depending on the number of algorithms that you include in the model library. This example took 4 hours and 22 minutes on a 12-core Intel Xeon E5-2420 CPU with 32 GB of memory, utilizing 10% of the CPU and 6 GB of memory on average.
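If you are unsure whether the extra heap space was actually applied, you can query the JVM at runtime before starting the evaluation. The following is a minimal sketch; the 10 GB threshold and the warning message are illustrative assumptions rather than values prescribed by this chapter:

// Optional sanity check: confirm the JVM heap limit before starting the evaluation
long maxHeapGb = Runtime.getRuntime().maxMemory() / (1024L * 1024 * 1024);
System.out.println("Maximum heap available: " + maxHeapGb + " GB");
if (maxHeapGb < 10) { // illustrative threshold, not a hard requirement
    System.out.println("Consider restarting the JVM with java -Xmx16g");
}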
We call our evaluation method and print the results, as follows:
double resES[] = evaluate(ensambleSel);
System.out.println(
    "Ensemble Selection\n"
    + "\tchurn: " + resES[0] + "\n"
    + "\tappetency: " + resES[1] + "\n"
    + "\tup-sell: " + resES[2] + "\n"
    + "\toverall: " + resES[3] + "\n");
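If you do not already have an evaluate() helper at hand, the sketch below shows one way it could be implemented. It assumes that churnData, appetencyData, and upsellData are Instances objects loaded earlier with their class attributes set, that the positive class is labeled "1", and that the score is the cross-validated area under the ROC curve; the class name, the number of folds, and the random seed are assumptions for illustration, not fixed by this chapter:

import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.core.Instances;

public class ScoreHelper {
    // Assumption: these datasets are loaded elsewhere with the class attribute set
    static Instances churnData, appetencyData, upsellData;

    public static double[] evaluate(Classifier model) throws Exception {
        Instances[] tasks = {churnData, appetencyData, upsellData};
        double[] results = new double[4]; // churn, appetency, up-sell, overall
        for (int i = 0; i < tasks.length; i++) {
            Evaluation eval = new Evaluation(tasks[i]);
            // 5-fold cross-validation with a fixed seed (both are assumptions)
            eval.crossValidateModel(model, tasks[i], 5, new Random(1));
            // AUC for the positive class, assumed to be labeled "1"
            int positiveIndex = tasks[i].classAttribute().indexOfValue("1");
            results[i] = eval.areaUnderROC(positiveIndex);
            results[3] += results[i] / tasks.length; // overall score = average AUC
        }
        return results;
    }
}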
The specific set of classifiers in the model library achieved the following result:
Ensemble Selection
	churn: 0.7109874158176481
	appetency: 0.786325687118347
	up-sell: 0.8521363243575182
	overall: 0.7831498090978378
Overall, this approach improved on the initial baseline that we designed at the beginning of this chapter by more than 15 percentage points. While it is hard to give a definitive answer, the improvement is mainly due to three factors: data preprocessing and attribute selection, the exploration of a large variety of learning methods, and the use of an ensemble-building technique that is able to take advantage of the variety of base classifiers without overfitting. However, the improvement comes at the cost of a significant increase in processing time and working memory.