Creating histograms

We will create one histogram for each time window of WIN_SIZE consecutive values.

Each histogram holds HIST_BINS value buckets. The histograms, each an array of doubles, are stored in an array list:

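import java.util.ArrayList;
import java.util.List;

int WIN_SIZE = 500;  // number of values per window
int HIST_BINS = 20;  // number of buckets per histogram

List<double[]> dataHist = new ArrayList<>();
for (List<Double> sample : rawData) {
  double[] histogram = new double[HIST_BINS];
  int current = 0;  // values counted in the current window; restarts with every sample
  for (double value : sample) {
    // Scale the value to [0, 1] and map it to a bucket index
    int bin = toBin(normalize(value, min, max), HIST_BINS);
    histogram[bin]++;
    current++;
    if (current == WIN_SIZE) {
      // The window is full: store it and start a new one
      dataHist.add(histogram);
      histogram = new double[HIST_BINS];
      current = 0;
    }
  }
  // Keep the trailing partial window, but skip it when it is empty
  if (current > 0) {
    dataHist.add(histogram);
  }
}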
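
The loop above relies on two helpers, normalize and toBin, that are not defined in this section. The following is a minimal sketch of how they could look, assuming that normalize rescales a value from [min, max] to [0, 1] and that toBin maps such a value to a bucket index between 0 and HIST_BINS - 1. The class name HistogramSketch, the clamping of out-of-range values, and the handling of an empty range are illustrative assumptions, not the original implementation:

```java
import java.util.Arrays;

final class HistogramSketch {

    // Assumed behavior: rescale value from [min, max] to [0, 1].
    static double normalize(double value, double min, double max) {
        if (max <= min) {
            return 0.0; // degenerate range: map everything to 0
        }
        double scaled = (value - min) / (max - min);
        // Clamp in case the value lies outside [min, max]
        return Math.max(0.0, Math.min(1.0, scaled));
    }

    // Assumed behavior: map a value in [0, 1] to a bucket index in [0, bins - 1].
    static int toBin(double normalized, int bins) {
        int bin = (int) (normalized * bins);
        // normalized == 1.0 would give index 'bins', so fold it into the last bucket
        return Math.min(bin, bins - 1);
    }

    public static void main(String[] args) {
        double[] values = {0.0, 2.5, 5.0, 7.5, 10.0};
        double[] histogram = new double[4];
        for (double value : values) {
            histogram[toBin(normalize(value, 0.0, 10.0), 4)]++;
        }
        // The maximum value 10.0 shares the last bucket with 7.5
        System.out.println(Arrays.toString(histogram)); // prints [1.0, 1.0, 1.0, 2.0]
    }
}
```

Folding the maximum value into the last bucket matters here: without it, the largest observed value would produce the index HIST_BINS and the histogram[bin]++ statement would throw an ArrayIndexOutOfBoundsException.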

The histograms are now complete. The last step is to transform them into Weka's Instance objects. Each histogram bucket corresponds to one numeric Weka attribute, as follows:

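import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;

// One numeric attribute per histogram bucket
ArrayList<Attribute> attributes = new ArrayList<>();
for (int i = 0; i < HIST_BINS; i++) {
  attributes.add(new Attribute("Hist-" + i));
}

Instances dataset = new Instances("My dataset", attributes, dataHist.size());
for (double[] histogram : dataHist) {
  // Instance is an interface in Weka 3.7 and later, so we create a DenseInstance
  dataset.add(new DenseInstance(1.0, histogram));
}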

The dataset is now loaded and ready to be plugged into an anomaly detection algorithm.