Flexible Fault Tolerant Subspace Clustering for Data with Missing Values

Supplementary Material

Authors

Stephan Günnemann, Emmanuel Müller, Sebastian Raubach and Thomas Seidl

On this page we provide the datasets and algorithms used for the experiments in our paper "Flexible Fault Tolerant Subspace Clustering for Data with Missing Values", so that the data mining community can repeat our experiments and compare against our results. Please note the citation information.

 

To simplify evaluation, we integrated all algorithms and the evaluation measures we used into the popular WEKA framework. In previous work, we extended WEKA to subspace and projected clustering; a short description of this extension and how to use it can be found on our OpenSubspace project website. For this paper, we extended the framework further to handle data with missing values. We integrated our algorithm "FTSC" as well as the competing approaches "CLIQUE del/fill" and "SCHISM del/fill", which perform a preprocessing step on the data.
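
As an illustration of the kind of preprocessing the "del/fill" competitors rely on, here is a minimal Python sketch. It assumes that "del" means deleting every object with at least one missing value and that "fill" means replacing a missing value with the mean of the observed values of that attribute; the exact strategies used in the experiments are defined in the paper and in the provided executables, not here. The names MISSING, delete_incomplete and fill_with_mean are hypothetical and are not part of WEKA or OpenSubspace.

```python
# Hypothetical in-memory marker for a missing value (an assumption of this sketch).
MISSING = None


def delete_incomplete(rows):
    """Assumed 'del' variant: drop every object that has a missing value."""
    return [row for row in rows if all(v is not MISSING for v in row)]


def fill_with_mean(rows):
    """Assumed 'fill' variant: impute each missing value with the attribute mean."""
    n_attrs = len(rows[0]) if rows else 0
    means = []
    for j in range(n_attrs):
        observed = [row[j] for row in rows if row[j] is not MISSING]
        # Fall back to 0.0 if an attribute has no observed value at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if v is MISSING else v for j, v in enumerate(row)]
        for row in rows
    ]


if __name__ == "__main__":
    data = [[1.0, MISSING], [3.0, 4.0], [MISSING, 6.0]]
    print(delete_incomplete(data))
    print(fill_with_mean(data))
```

Deletion shrinks the dataset and imputation invents values, so either choice changes what a subsequent clustering algorithm sees.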

 

For each experiment we provide the corresponding datasets (.arff files), the information about the hidden clusters (.true files), and the parameter settings used. To determine the clustering quality, select the quality measures and load the hidden clusters (cf. screenshot).
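
To inspect a dataset outside of WEKA, the following Python sketch reads the data section of a dense ARFF file and reports the fraction of missing entries. It relies only on general ARFF conventions: "%" starts a comment, "@data" starts the data section, and "?" marks a missing value. The functions read_arff_rows and missing_fraction and the sample content are hypothetical. The sketch does not handle quoted values containing commas or the sparse ARFF format, and it does not parse the .true files.

```python
def read_arff_rows(text):
    """Return the rows of the @data section; missing values ('?') become None."""
    rows = []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if not in_data:
            # Header lines (@relation, @attribute) are ignored in this sketch.
            in_data = line.lower().startswith("@data")
            continue
        values = [v.strip() for v in line.split(",")]
        rows.append([None if v == "?" else v for v in values])
    return rows


def missing_fraction(rows):
    """Fraction of all entries that are missing."""
    total = sum(len(row) for row in rows)
    missing = sum(v is None for row in rows for v in row)
    return missing / total if total else 0.0


# Made-up sample content, not taken from the provided datasets.
SAMPLE = """\
% demo file
@relation demo
@attribute a numeric
@attribute b numeric
@data
1.0,?
?,2.5
3.0,4.0
"""

if __name__ == "__main__":
    rows = read_arff_rows(SAMPLE)
    print(len(rows), missing_fraction(rows))
```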

 

 

 

 

 

Executables: executables
Datasets: data-part1, data-part2, data-part3, data-part4, data-part5, data-part6

Citation Information


If you publish material based on databases, algorithms, parameter settings or evaluation measures obtained from this repository, then, in your acknowledgments, please note the assistance you received by using this repository. This will help others to obtain the same data sets, algorithms, parameter settings and evaluation measures and replicate your experiments. We suggest the following reference format for referring to this project:

 

Günnemann S., Müller E., Raubach S., Seidl T.:
Flexible Fault Tolerant Subspace Clustering for Data with Missing Values
In Proc. IEEE International Conference on Data Mining (ICDM 2011), Vancouver, Canada. (2011)
http://dme.rwth-aachen.de/FTSC/