DB-CSC

Supplementary Material

 

On this page we offer the executables used for the experiments in our paper "DB-CSC: A density-based approach for subspace clustering in graphs with feature vectors". Please note the citation information below.


Algorithms

 

Our executables are available as a jar file (download DBCSC.jar). The usage and the file formats are explained in the README file (README.txt). A small example file can be downloaded here: Supplementary_DBCSC.zip. If you are interested in the source code, please contact the authors via email.

 

The jar file expects a single argument: the name of a config file in the Java Properties format. This config file contains the file name of the graph to be processed (the graph has to be in the GraphML format, cf. http://graphml.graphdrawing.org) and the values of the required parameters.
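Before running the jar, it can help to check that the input graph is well-formed GraphML. The following Python sketch is not part of DB-CSC; it only counts the nodes and edges of a GraphML file using the standard GraphML namespace. The function name `summarize_graphml` and the file name in the usage comment are hypothetical, and the sketch does not inspect how the feature vectors are stored (see the README for that).

```python
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def summarize_graphml(source):
    """Return (node_count, edge_count, node_ids) for a GraphML file.

    `source` may be a file name or a binary file object.
    Hypothetical helper for illustration; not part of DBCSC.jar.
    """
    root = ET.parse(source).getroot()
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return len(nodes), len(edges), [n.get("id") for n in nodes]


# Usage (the file name is a placeholder):
# node_count, edge_count, node_ids = summarize_graphml("example.graphml")
```

A file that is not well-formed XML raises `xml.etree.ElementTree.ParseError`, which is a quick way to catch a broken input before starting a long run.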

 

For example, the command line would be:

 

java -Xmx2g -jar DBCSC.jar example.properties


After processing, the algorithm creates a file containing the result. The contents of such a file are explained below using a tiny example.


In such a result file, every line corresponds to one identified cluster. The last line contains the processing time of the algorithm (in ms).

 

For every cluster, the first numbers indicate which dimensions are relevant for the cluster. In our example, the original dataset has three dimensions, so the first three numbers show the dimensions of each cluster. For instance, the first cluster lies only in the second dimension, indicated by the entry '1'. The first and the third dimension are not relevant for this cluster, indicated by the entry '0'.

 

The next number shows the number of nodes in the cluster, e.g. '5' for the first cluster.

 

The last numbers are the IDs of the nodes that the cluster contains. For the first cluster that would be the nodes with the IDs 3, 6, 7, 8 and 9.
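The line layout described above can be read with a few lines of Python. This is a minimal sketch, not part of DB-CSC, and it rests on several assumptions: the values on a line are separated by whitespace, node IDs are integers, and the number of dimensions is known in advance. The names `Cluster`, `parse_cluster_line` and `parse_result` are hypothetical. The sample line is reconstructed from the description of the first cluster above (dimensions 0 1 0, five nodes, IDs 3, 6, 7, 8 and 9). Because the exact layout of the final timing line is not specified here, it is returned as raw text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cluster:
    relevant_dims: List[int]  # 0-based indices of dimensions flagged with 1
    node_ids: List[int]       # IDs of the nodes contained in the cluster


def parse_cluster_line(line: str, num_dims: int) -> Cluster:
    """Parse one cluster line: <dimension flags> <node count> <node IDs>.

    Assumes whitespace-separated values (an assumption, see the README).
    """
    tokens = line.split()
    flags = [int(t) for t in tokens[:num_dims]]
    size = int(tokens[num_dims])
    node_ids = [int(t) for t in tokens[num_dims + 1:]]
    if len(node_ids) != size:
        raise ValueError(f"expected {size} node IDs, found {len(node_ids)}")
    relevant = [i for i, flag in enumerate(flags) if flag == 1]
    return Cluster(relevant, node_ids)


def parse_result(text: str, num_dims: int) -> Tuple[List[Cluster], str]:
    """Parse a whole result file; the last line (processing time) stays raw."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    *cluster_lines, time_line = lines
    clusters = [parse_cluster_line(ln, num_dims) for ln in cluster_lines]
    return clusters, time_line.strip()


# First cluster of the example, reconstructed from the description above.
cluster = parse_cluster_line("0 1 0 5 3 6 7 8 9", num_dims=3)
print(cluster.relevant_dims)  # [1]  (the second dimension, 0-based)
print(cluster.node_ids)       # [3, 6, 7, 8, 9]
```

The check on the node count is there to catch a wrong `num_dims` early: if the number of dimensions is misjudged, the count and the ID list will usually disagree.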

 

Citation information

 

If you publish material based on algorithms, executables or parameter settings obtained from this site, please acknowledge the use of this repository. This will help others to obtain the same algorithms and parameter settings and allow them to replicate your experiments. We suggest the following reference format for referring to this project:
Günnemann S., Boden B., Seidl T.:
DB-CSC: A density-based approach for subspace clustering in graphs with feature vectors
Proc. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2011), Athens, Greece, pp. 565-580 (2011).