Database of transactions

In association rule mining, the dataset is structured somewhat differently from the approach presented in the first chapter. First, there is no class value, as none is required for learning association rules. Next, the dataset is presented as a transactional table, where each supermarket item corresponds to a binary attribute. Because there is one attribute per item, the feature vector can be extremely large.

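Consider the following example. Suppose we have four receipts, each corresponding to a purchasing transaction:

Receipt 1: potatoes, burger
Receipt 2: onions, potatoes, burger, beer
Receipt 3: beer, dippers
Receipt 4: onions, burger, beer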

To write these receipts in the form of a transactional database, we first identify all the possible items that appear in the receipts. These items are onions, potatoes, burger, beer, and dippers. Each purchase, that is, each transaction, is presented as a row, with 1 if an item was purchased within the transaction and 0 otherwise, as shown in the following table:

Transaction ID  Onions  Potatoes  Burger  Beer  Dippers
1               0       1         1       0     0
2               1       1         1       1     0
3               0       0         0       1     1
4               1       0         1       1     0
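
The encoding described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's implementation: the receipt contents are read back from the table, and the names receipts, items, and to_transaction_rows are hypothetical, chosen for this example.

```python
# Each receipt is the set of items bought in one transaction.
# The contents are read off the table above (assumption: order within a receipt does not matter).
receipts = [
    {"potatoes", "burger"},
    {"onions", "potatoes", "burger", "beer"},
    {"beer", "dippers"},
    {"onions", "burger", "beer"},
]

# Fixed column order, matching the table header.
items = ["onions", "potatoes", "burger", "beer", "dippers"]


def to_transaction_rows(receipts, items):
    """Return one 0/1 row per receipt, with one column per item."""
    return [[1 if item in receipt else 0 for item in items] for receipt in receipts]


rows = to_transaction_rows(receipts, items)

# Print each row prefixed with its transaction ID, as in the table.
for transaction_id, row in enumerate(rows, start=1):
    print(transaction_id, *row)
```

Every row has one column per distinct item, so the width of the table grows with the size of the store's catalog rather than with the size of any single receipt.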

 

This example is very small. In practical applications, the dataset often contains thousands or millions of transactions, which allows the learning algorithm to discover statistically significant patterns.
