Sequential Pattern Mining of Interval-Based Data

Research topic: Data Analysis and Knowledge Extraction

Almost all activities observed in today's applications occur in a temporal order, and users are mainly looking for interesting sequences in such data. Sequential pattern mining algorithms aim to find frequent sequences. Usually, the mined activities have durations, that is, each one spans a time interval between its start and end points. The majority of sequential pattern mining approaches treat such activities as single point events and thus lose much valuable information in the collected patterns. Recently, some approaches have explicitly considered the interval-based nature of the events, but they have major limitations. They concentrate only on the order of the events without taking the durations of the gaps between them into account. Additionally, they consider two patterns equal only if they are neatly synchronized and perfectly aligned in time, and thus they are inapplicable to multimodal, noisy event streams.
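The difference between order-only and duration-aware comparison can be illustrated with a minimal Python sketch. It is not one of the algorithms of this project; the names IntervalEvent, order_signature, timed_signature and similar, as well as the tolerance parameter, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalEvent:
    """Hypothetical interval event: a label with start and end times."""
    label: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def _ordered(events):
    # Sort by start time, breaking ties by end time.
    return sorted(events, key=lambda e: (e.start, e.end))


def order_signature(events):
    """Order-only view: keeps the sequence of labels, drops all timing."""
    return tuple(e.label for e in _ordered(events))


def timed_signature(events):
    """Duration-aware view: event durations interleaved with the gaps
    between consecutive events (a gap is negative when events overlap)."""
    ordered = _ordered(events)
    sig = []
    for i, e in enumerate(ordered):
        sig.append((e.label, e.duration))
        if i + 1 < len(ordered):
            sig.append(("gap", ordered[i + 1].start - e.end))
    return tuple(sig)


def similar(sig_a, sig_b, tolerance):
    """Treat two timed signatures as matching when labels agree and every
    duration or gap differs by at most `tolerance` (an assumed parameter)."""
    if len(sig_a) != len(sig_b):
        return False
    return all(
        la == lb and abs(va - vb) <= tolerance
        for (la, va), (lb, vb) in zip(sig_a, sig_b)
    )


seq_clean = [IntervalEvent("A", 0.0, 5.0), IntervalEvent("B", 6.0, 9.0)]
seq_noisy = [IntervalEvent("A", 0.2, 5.1), IntervalEvent("B", 6.3, 9.1)]
seq_far = [IntervalEvent("A", 0.0, 5.0), IntervalEvent("B", 20.0, 23.0)]

# All three sequences look identical when only the order is considered.
same_order = (
    order_signature(seq_clean)
    == order_signature(seq_noisy)
    == order_signature(seq_far)
)

# With durations and gaps, the noisy copy still matches within the
# tolerance, while the sequence with a much longer gap does not.
noisy_matches = similar(timed_signature(seq_clean), timed_signature(seq_noisy), 0.5)
far_matches = similar(timed_signature(seq_clean), timed_signature(seq_far), 0.5)
```

In this sketch an order-only comparison cannot distinguish seq_far from seq_clean, whereas an exact comparison of timings would reject seq_noisy; a tolerance on durations and gaps is one simple way to sit between the two extremes.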


The aim of this project is to resolve these problems by designing algorithms that work flexibly on data given as any number of interval sequences that are not necessarily aligned. In particular, the algorithms should be able to efficiently process input that arrives as a single noisy stream of interval events, without the need to create samples or batches, while still delivering meaningful patterns.