Apriori algorithm

Apriori is a classical algorithm for mining frequent itemsets, from which association rules are then derived. In retail, such rules can inform how a store is set up, for example which products to place together, which in turn can aid revenue generation.

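The anti-monotonicity of the support measure is one of the prime concepts around which Apriori revolves: every subset of a frequent itemset must itself be frequent, and every superset of an infrequent itemset must be infrequent too.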
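
Stated formally, the property can be written as follows. The symbols are notation assumed here for illustration, not taken from the text: T is the set of transactions, and X and Y are itemsets.

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y).
\]
```

Every transaction that contains Y also contains X, so adding items to an itemset can never raise its support.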

Let's look at an example and explain it:

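| Transaction ID | Milk | Butter | Cereal | Bread | Book |
|---|---|---|---|---|---|
| t1 | 1 | 1 | 1 | 0 | 0 |
| t2 | 0 | 1 | 1 | 1 | 0 |
| t3 | 0 | 0 | 0 | 1 | 1 |
| t4 | 1 | 1 | 0 | 1 | 0 |
| t5 | 1 | 1 | 1 | 0 | 1 |
| t6 | 1 | 1 | 1 | 1 | 1 |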

The table lists each transaction ID along with the items milk, butter, cereal, bread, and book. A 1 denotes that the item is part of the transaction, and a 0 means that it is not.

| Items | Number of transactions | Support |
|---|---|---|
| Milk | 4 | 67% |
| Butter | 5 | 83% |
| Cereal | 4 | 67% |
| Bread | 4 | 67% |
| Book | 3 | 50% |
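
The per-item counts and support values can be reproduced with a few lines of Python. This is a minimal sketch: the `transactions` dictionary simply re-encodes the example table, and the helper names `support_count` and `support` are chosen here for illustration.

```python
# Transactions from the example table; each set lists the items bought together.
transactions = {
    "t1": {"Milk", "Butter", "Cereal"},
    "t2": {"Butter", "Cereal", "Bread"},
    "t3": {"Bread", "Book"},
    "t4": {"Milk", "Butter", "Bread"},
    "t5": {"Milk", "Butter", "Cereal", "Book"},
    "t6": {"Milk", "Butter", "Cereal", "Bread", "Book"},
}


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for basket in transactions.values() if set(itemset) <= basket)


def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)


items = ["Milk", "Butter", "Cereal", "Bread", "Book"]
item_counts = {item: support_count({item}, transactions) for item in items}

# Print one row per item: name, transaction count, support as a percentage.
for item in items:
    print(f"{item:<7} {item_counts[item]}  {support({item}, transactions):.0%}")
```

Support is simply the count divided by the total number of transactions (six here), so Butter's count of 5 becomes roughly 83%.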

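Applying a minimum support threshold of 60% removes Book (50%), leaving the following items:

| Items | Number of transactions |
|---|---|
| Milk | 4 |
| Butter | 5 |
| Cereal | 4 |
| Bread | 4 |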

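Next, the remaining items are combined two at a time, and the number of transactions containing each pair is counted:

| Items | Number of transactions |
|---|---|
| Milk, Butter | 4 |
| Milk, Cereal | 3 |
| Milk, Bread | 2 |
| Butter, Bread | 3 |
| Butter, Cereal | 4 |
| Cereal, Bread | 2 |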

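Now, again, we have to compute the support for the preceding pairs and filter them by the same threshold, a minimum support of 60%. With six transactions, a pair must appear in at least four of them, so only the pairs (Milk, Butter) and (Butter, Cereal), at 4 of 6 or about 67% each, remain.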
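
The pair step can be sketched in Python as follows. The `transactions` list re-encodes the example table, and the names `MIN_SUPPORT`, `pair_support`, and `frequent_pairs` are illustrative choices rather than part of any library.

```python
from itertools import combinations

# Transactions t1..t6 from the example table, one set of items per transaction.
transactions = [
    {"Milk", "Butter", "Cereal"},
    {"Butter", "Cereal", "Bread"},
    {"Bread", "Book"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Butter", "Cereal", "Book"},
    {"Milk", "Butter", "Cereal", "Bread", "Book"},
]

MIN_SUPPORT = 0.6  # the 60% threshold used in the text

# Items that survived the single-item filter (Book was dropped at 50%).
frequent_items = ["Milk", "Butter", "Cereal", "Bread"]

# Support of every pair of surviving items.
pair_support = {}
for pair in combinations(frequent_items, 2):
    count = sum(1 for basket in transactions if set(pair) <= basket)
    pair_support[pair] = count / len(transactions)

# Keep only the pairs that meet the threshold.
frequent_pairs = {pair: s for pair, s in pair_support.items() if s >= MIN_SUPPORT}
print(frequent_pairs)
```

Pairs are built only from items that already passed the threshold, because a pair containing an infrequent item cannot itself be frequent.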

Similarly, combinations of three items at a time (for example, Milk, Butter, and Bread) are formed, their support is calculated, and those that fall below the threshold are filtered out. The same process is then repeated with four items at a time. The steps carried out so far are called frequent itemset generation.
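
The whole level-by-level procedure can be sketched as one loop. This is a minimal, unoptimized illustration under stated assumptions, not a reference implementation: the function name `frequent_itemsets` is hypothetical, and candidates are built by uniting frequent itemsets from the previous level and then discarding any candidate that has an infrequent subset.

```python
from itertools import combinations

# Transactions t1..t6 from the example table, one set of items per transaction.
transactions = [
    {"Milk", "Butter", "Cereal"},
    {"Butter", "Cereal", "Bread"},
    {"Bread", "Book"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Butter", "Cereal", "Book"},
    {"Milk", "Butter", "Cereal", "Bread", "Book"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting `min_support`."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for basket in transactions if itemset <= basket) / n

    # Level 1: frequent single items.
    all_items = {item for basket in transactions for item in basket}
    current = {frozenset([i]) for i in all_items
               if support(frozenset([i])) >= min_support}

    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop candidates with any infrequent (k-1)-subset (anti-monotonicity).
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        # Filter the surviving candidates by the support threshold.
        current = {c for c in candidates if support(c) >= min_support}
    return result


result = frequent_itemsets(transactions, 0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), f"{s:.0%}")
```

On this data with a 60% threshold, the loop stops after the pairs: the only three-item candidate, Milk, Butter, and Cereal, is pruned because the pair Milk, Cereal is infrequent. A lower threshold would let the loop continue to three and four items at a time.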