Subspace MOA: Subspace Stream Clustering Evaluation Using the MOA Framework

 

Most available static data are becoming increasingly high-dimensional. As a result, subspace clustering, which aims to find clusters not only in the full-dimensional space but also within subsets of the dimensions, has gained significant importance. Recently, the OpenSubspace framework was proposed to evaluate and explore subspace clustering algorithms in WEKA, with a rich collection of state-of-the-art subspace clustering algorithms and measures. In parallel, the MOA (Massive Online Analysis) framework was developed, also on top of WEKA, to provide algorithms and evaluation methods for mining tasks on evolving data streams, but over the full space only.

As with static data, most streaming data sources are becoming high-dimensional, and tracking their evolving clusters is becoming both important and challenging. In this demonstrator, we present what is, to the best of our knowledge, the first subspace clustering evaluation framework over data streams, called Subspace MOA. Our demonstrator follows the online-offline model used in most data stream clustering algorithms. In the online phase, users can select one of three well-known summarization techniques to form the microclusters. In the offline phase, one of five subspace clustering algorithms can be selected. The framework is complemented by a subspace stream generator, a visualization interface that presents the evolving clusters over different subspaces, and various subspace clustering evaluation measures.
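
The online-offline model mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the Subspace MOA API, nor any of the summarization techniques or subspace clustering algorithms it ships with. All names here (`MicroCluster`, `online_update`, `offline_group`, `max_dist`, `eps`) are hypothetical, and the offline step is a simple single-link grouping of microcluster centers projected onto a chosen set of dimensions, used only as a stand-in for a real subspace clustering algorithm.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class MicroCluster:
    """Cluster-feature summary: count, per-dimension linear and squared sums."""
    n: int
    ls: list
    ss: list

    @classmethod
    def from_point(cls, p):
        return cls(1, list(p), [x * x for x in p])

    def absorb(self, p):
        # Incremental update: no raw points are stored.
        self.n += 1
        for i, x in enumerate(p):
            self.ls[i] += x
            self.ss[i] += x * x

    def center(self):
        return [s / self.n for s in self.ls]


def dist(a, b, dims=None):
    """Euclidean distance, optionally restricted to a subset of dimensions."""
    dims = range(len(a)) if dims is None else dims
    return sqrt(sum((a[d] - b[d]) ** 2 for d in dims))


def online_update(mcs, p, max_dist):
    """Online phase (assumed rule): absorb p into the nearest microcluster
    if it is within max_dist, otherwise open a new microcluster."""
    if mcs:
        nearest = min(mcs, key=lambda m: dist(m.center(), p))
        if dist(nearest.center(), p) <= max_dist:
            nearest.absorb(p)
            return
    mcs.append(MicroCluster.from_point(p))


def offline_group(mcs, dims, eps):
    """Offline phase stand-in: single-link grouping of microcluster centers
    projected onto the subspace given by dims."""
    centers = [m.center() for m in mcs]
    labels = [-1] * len(mcs)
    cluster = 0
    for i in range(len(mcs)):
        if labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            for k in range(len(mcs)):
                if labels[k] == -1 and dist(centers[j], centers[k], dims) <= eps:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels


# Toy two-dimensional stream, processed one point at a time.
stream = [(0.0, 0.0), (0.1, 5.0), (5.0, 0.1), (5.1, 5.1), (0.05, 0.1)]
mcs = []
for p in stream:
    online_update(mcs, p, max_dist=1.0)

labels_dim0 = offline_group(mcs, dims=[0], eps=0.5)     # subspace {dimension 0}
labels_full = offline_group(mcs, dims=[0, 1], eps=0.5)  # full space
```

In this toy stream the microclusters form two groups when only dimension 0 is considered, while in the full space every microcluster stays separate. That is the kind of structure a full-space stream clusterer would miss and a subspace approach is meant to expose.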

Authors: Hassani M., Kim Y., Seidl T.
Published in: The 18th International Conference on Database Systems for Advanced Applications (DASFAA 2013), Wuhan, China (Best Demo Award Runner-Up)
Publisher: Springer
Language: EN
Year: 2013
Additional: (Demo)
Pages: 446-449
ISBN: 978-3-642-37450-0
ISSN: 0302-9743
Conference: DASFAA
DOI: 10.1007/978-3-642-37450-0_33
URL: DASFAA 2013
Type: Conference proceedings contribution
Research area: Data Analysis and Knowledge Extraction