Mining sequence patterns in transactional databases
Mining sequence patterns can be thought of as association discovery over temporal or sequence data, where the order in which items occur matters. Accordingly, the classic pattern-mining algorithms must be extended or modified to suit sequence datasets.
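To make the role of ordering concrete, here is a minimal sketch in R that counts the support of a sequential pattern in a toy dataset. It assumes each sequence is a plain character vector of single items; the names `sequences`, `is_subsequence`, and `support` are illustrative and do not come from the book's code bundle.

```r
# Toy sequence dataset (hypothetical): each element is one ordered sequence.
sequences <- list(
  c("a", "b", "c"),
  c("a", "c"),
  c("b", "a", "c"),
  c("c", "b")
)

# TRUE if the items of `pattern` occur in `s` in the same order (gaps allowed).
is_subsequence <- function(pattern, s) {
  pos <- 0
  for (item in pattern) {
    hits <- which(s == item)
    hits <- hits[hits > pos]          # only matches after the previous item
    if (length(hits) == 0) return(FALSE)
    pos <- hits[1]
  }
  TRUE
}

# Support = fraction of sequences that contain the pattern.
support <- function(pattern, db) {
  mean(vapply(db, function(s) is_subsequence(pattern, s), logical(1)))
}

support(c("a", "c"), sequences)  # "a" followed later by "c"
support(c("c", "a"), sequences)  # same items, reversed order
```

Unlike an itemset, the pattern made of "a" then "c" and the pattern made of "c" then "a" are different patterns and can have very different support, which is why itemset-oriented algorithms need adapting.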
The PrefixSpan algorithm
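PrefixSpan (Prefix-projected Sequential pattern mining) is a frequent sequence-mining algorithm that grows patterns by recursively projecting the database onto the suffixes that follow each frequent prefix. The summarized pseudocode for the PrefixSpan algorithm is as follows: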
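As a rough illustration of the prefix-projection idea, the following R sketch handles only the simplified case where every sequence is an ordered vector of single items (no itemset elements). The function name `prefix_span`, its arguments, and the toy data are assumptions made for this example; they are not taken from ch_08_prefix_span.R.

```r
# Recursively grow frequent sequential patterns by prefix projection.
# db:          list of character vectors, each an ordered sequence
# min_support: minimum number of sequences that must contain a pattern
prefix_span <- function(db, min_support, prefix = character(0)) {
  patterns <- list()
  # Count each item at most once per sequence of the (projected) database.
  tab <- table(unlist(lapply(db, unique)))
  counts <- setNames(as.integer(tab), names(tab))
  frequent <- names(counts)[counts >= min_support]
  for (item in frequent) {
    new_prefix <- c(prefix, item)
    patterns[[length(patterns) + 1]] <-
      list(pattern = new_prefix, support = counts[[item]])
    # Project: keep the suffix that follows the first occurrence of `item`.
    projected <- list()
    for (s in db) {
      i <- match(item, s)
      if (!is.na(i) && i < length(s)) {
        projected[[length(projected) + 1]] <- s[(i + 1):length(s)]
      }
    }
    # Recurse only if enough suffixes remain to reach the threshold.
    if (length(projected) >= min_support) {
      patterns <- c(patterns, prefix_span(projected, min_support, new_prefix))
    }
  }
  patterns
}

# Hypothetical toy database of four sequences.
db <- list(c("a", "b", "c"), c("a", "c"), c("b", "a", "c"), c("c", "b"))
result <- prefix_span(db, min_support = 2)
for (p in result) {
  cat(paste(p$pattern, collapse = " -> "), ":", p$support, "\n")
}
```

With this toy data and `min_support = 2`, the frequent patterns found are a, a -> c, b, b -> c, and c. Because each recursive call works on a smaller projected database, no candidate sequences are generated and then tested against the full dataset.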
The R implementation
Please take a look at the R code file ch_08_prefix_span.R from the bundle of R code for the preceding algorithm. The code can be tested with the following command: