Process Mining

Research topic: Data Analysis and Knowledge Extraction

Process mining is an emerging research area that applies well-established data mining techniques to challenging problems in business process modeling. These problems are presented to the data mining algorithms in the form of event logs, which typically contain detailed information about the activities carried out within a business process. The main aim of process mining is to discover, monitor and improve real processes by extracting knowledge from event logs. The improvement achieved in the extracted process models can be measured by the degree of conformance between them and the original event logs.
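To make the notions of event log, discovery and conformance concrete, the following is a minimal sketch in Python. It assumes an event log given as (case id, activity, timestamp) tuples and uses the directly-follows relation as a very simple stand-in for a process model. The log contents, the function names and the step-based conformance score are illustrative assumptions, not a method taken from the text above.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, timestamp) tuples.
EVENT_LOG = [
    ("c1", "Create Fine", 1),
    ("c2", "Create Fine", 2),
    ("c1", "Send Fine", 3),
    ("c2", "Payment", 4),
    ("c1", "Payment", 5),
]


def directly_follows(log):
    """Count how often one activity is directly followed by another within a case."""
    traces = defaultdict(list)
    # Order events by timestamp, then group them into one trace per case.
    for case_id, activity, _timestamp in sorted(log, key=lambda e: e[2]):
        traces[case_id].append(activity)
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


def step_fitness(log, allowed_pairs):
    """Share of observed directly-follows steps that the model permits.

    This is a simplified, assumed conformance measure used only for illustration.
    """
    counts = directly_follows(log)
    total = sum(counts.values())
    permitted = sum(n for pair, n in counts.items() if pair in allowed_pairs)
    return permitted / total if total else 1.0


dfg = directly_follows(EVENT_LOG)

# A hypothetical model that only allows paying after the fine has been sent.
MODEL = {("Create Fine", "Send Fine"), ("Send Fine", "Payment")}
fitness = step_fitness(EVENT_LOG, MODEL)
```

Here case c2 pays without the fine being sent, so one of the three observed steps is not permitted by the hypothetical model and the score drops below 1.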

Although process mining is a relatively young research field, several approaches already exist in the literature. These works present interesting methods that examine different features of the event logs. All of them have, however, assumed that the complete event log exists and can be accessed as often as needed in order to generate, in most cases, a single final process model. This is infeasible given the huge growth in the size of the event logs generated by modern information systems that support business processes. The proposed approaches will face serious efficiency issues as both the size and the dimensionality of the collected events increase. Moreover, an important recent research question in process mining is concept drift in the underlying business process. With merely a single final model, decision makers lose important insights into such a drifting process. A further emerging requirement in this context is instant knowledge about the process model, in real time, as the events are observed.

With these new requirements, researchers have started to speak of event streams, streams of process models and streaming process discovery. Consider the event stream flowing from an example Traffic Fine Management System in Fig. 1. Observed events are received over time by the streaming process discovery component in the middle column, which continuously extracts and updates a stream of process models shown on the right side of the figure.
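The idea of turning an event stream into a stream of models can be sketched as follows. This is only one possible illustration: it assumes a sliding window over the most recent events and uses directly-follows counts as the "model". The class name StreamingDFG, the window size and the activity names are hypothetical and are not taken from the system in the figure.

```python
from collections import Counter, deque


class StreamingDFG:
    """Directly-follows counts over the most recent `window` events (a sketch)."""

    def __init__(self, window):
        self.window = window
        self.last_activity = {}  # case_id -> last activity seen for that case
        self.recent = deque()    # one entry per event in the window: a pair or None
        self.counts = Counter()  # current model: directly-follows pair -> count

    def observe(self, case_id, activity):
        """Consume one event and return the updated model as a plain dict."""
        prev = self.last_activity.get(case_id)
        self.last_activity[case_id] = activity
        pair = (prev, activity) if prev is not None else None
        self.recent.append(pair)
        if pair is not None:
            self.counts[pair] += 1
        # Forget the contribution of the oldest event once the window is full.
        if len(self.recent) > self.window:
            old = self.recent.popleft()
            if old is not None:
                self.counts[old] -= 1
                if self.counts[old] == 0:
                    del self.counts[old]
        return dict(self.counts)


# Hypothetical stream of (case_id, activity) events for a single case.
stream = [
    ("c1", "Create Fine"),
    ("c1", "Send Fine"),
    ("c1", "Add Penalty"),
    ("c1", "Payment"),
    ("c1", "Close"),
]

miner = StreamingDFG(window=3)
models = [miner.observe(case_id, activity) for case_id, activity in stream]
```

Each call to observe yields a new model, so models plays the role of the stream of process models. Because old events fall out of the window, later models reflect only recent behaviour, which is one simple way to stay responsive to concept drift. In this sketch the per-case dictionary last_activity is never pruned, so a long-running deployment would also need a policy for forgetting finished or stale cases.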