Populating the FP-tree

Let's consider the transaction data shown in the following table. Let's first represent it as a sparse matrix:

ID	Bat	Wickets	Pads	Helmet	Ball
1	0	1	1	0	0
2	1	1	1	1	0
3	0	0	0	1	1
4	1	0	1	1	0

Let's calculate the frequency of each item and sort them in descending order by frequency:

Item	Frequency
pads	3
helmet	3
bat	2
wicket	2
ball	1

Now let's rearrange the transaction-based data based on the frequency:

ID	Original Items	Reordered Items
t1	Wickets, pads	Pads, wickets
t2	Bat, wickets, pads, helmet	Helmet, pads, wickets, bat
t3	Helmet, ball	Helmet, ball
t4	Bat, pads, helmet	Helmet, pads, bat

To build the FP-tree, let's start with the first branch of the FP-tree. The FP-tree starts with a Null as the root. To build the tree, we can represent each item with a node, as shown in the following diagram (the tree representation of t₁ is shown here). Note that the label of each node is the name of the item and its frequency is appended after the colon. Also, note that the pads item has a frequency of 1:

Using the same pattern, let's draw all four transactions, resulting in the full FP-tree. The FP-tree has four leaf nodes, each representing the itemset associated with the four transactions. Note that we need to count the frequencies of each item and need to increase it when used multiple times—for example, when adding t₂to the FP-tree, the frequency of helmet was increased to two. Similarly, while adding t₄, it was increased again to three. The resulting tree is shown in the following diagram:

Note that the FP-tree generated in the preceding diagram is an ordered tree.

Table of Contents for Populating the FP-tree

Create new playlist

Sign In

Sign Up

Table of Contents for
Populating the FP-tree