Adaptive Multiple-Resolution Stream Clustering

Stream data applications have become increasingly prominent, and the requirements for stream clustering algorithms have increased drastically. Due to the continuously evolving nature of the stream, it is crucial that the algorithm autonomously detects clusters of arbitrary shape and different densities, and that it copes with a varying number of clusters. Although available density-based stream clustering algorithms are able to detect clusters of arbitrary shape and in varying numbers, they fail to adapt their thresholds to detect clusters with different densities.
In this paper, we propose HASTREAM, a stream clustering algorithm based on a hierarchical density-based clustering model that automatically detects clusters of different densities. The density thresholds are adapted to the existing data independently, without any user intervention. To reduce the high computational cost of the presented approach, techniques from graph theory are used to devise an incremental update of the underlying model.
To show the effectiveness of HASTREAM and of hierarchical density-based approaches in general, the algorithm is evaluated on several synthetic and real-world data sets using various quality measures. The results show that the hierarchical property of the model improves the quality of density-based stream clusterings and enables HASTREAM to detect streaming clusters of different densities.

Authors: Hassani M., Spaus P., Seidl T.
Published in: Proceedings of the 10th International Conference on Machine Learning and Data Mining (MLDM 2014), July 21-24, St. Petersburg, Russia.
Publisher: Springer
Language: EN
Year: 2014
Pages: 134-148
ISBN: 978-3-319-08979-9
ISSN: 0302-9743
Conference: MLDM
Url: MLDM 2014
Type: Conference papers (peer reviewed)
Research topic: Fast Access to Complex Data