Clustering in Attributed Graphs

Research area: Data Analysis and Knowledge Extraction

Data mining aims to extract novel knowledge from large data sets. Such data can be represented in different ways. Two of the most common representations are vector data, where each object is described by a vector of its attributes, and graph data, where relationships between objects are represented as edges of a graph. Vector data can be analyzed with subspace clustering approaches, while graph data can be analyzed with graph clustering or dense subgraph mining methods.

In many applications, both types of data are available at the same time: the vertices or edges of a graph carry additional information that can be described as attribute vectors. Analyzing both data sources together can improve the quality of the mining results. However, most clustering approaches handle only one of these data types.
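
To make the notion of an attributed graph concrete, the following minimal Python sketch stores both views of the data side by side: an adjacency structure for the graph and one attribute vector per vertex. The class name `AttributedGraph`, its methods, and the example values are hypothetical and chosen only for illustration; they are not taken from our implementations.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence, Set, Tuple


@dataclass
class AttributedGraph:
    """Undirected graph whose vertices carry numeric attribute vectors.

    Hypothetical illustration only, not an actual library class.
    """
    attributes: Dict[str, Tuple[float, ...]] = field(default_factory=dict)
    adjacency: Dict[str, Set[str]] = field(default_factory=dict)

    def add_vertex(self, name: str, attrs: Sequence[float]) -> None:
        # Vector view: one attribute vector per vertex.
        self.attributes[name] = tuple(attrs)
        self.adjacency.setdefault(name, set())

    def add_edge(self, u: str, v: str) -> None:
        # Graph view: undirected, so the edge is stored in both directions.
        self.adjacency[u].add(v)
        self.adjacency[v].add(u)


# Made-up example data: three vertices with two attributes each.
g = AttributedGraph()
g.add_vertex("a", [0.1, 5.0])
g.add_vertex("b", [0.2, 9.0])
g.add_vertex("c", [0.9, 5.1])
g.add_edge("a", "b")
g.add_edge("b", "c")
```

A method that only reads `g.attributes` works on the vector view, and one that only reads `g.adjacency` works on the graph view; a combined approach needs both.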

In our work, we develop combined clustering approaches that use both data types simultaneously and thereby obtain better clustering results. So far, we have developed approaches for graphs with vertex attributes, for graphs with edge attributes, and for heterogeneous networks containing different types of attributed vertices. For all of these data types, our approaches focus on an unbiased combination of graph and attribute data and on avoiding redundancy in the clustering result.
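
As a rough illustration of what combining both data types can mean, the sketch below checks whether a given vertex set is at the same time densely connected in the graph and similar in a subset of the attribute dimensions (a subspace). This is a simplified toy criterion under stated assumptions, not one of our published algorithms: the function names, the density measure (fraction of connected vertex pairs), the width-based subspace test, and all thresholds are hypothetical.

```python
from itertools import combinations


def density(vertices, edges):
    """Fraction of vertex pairs in `vertices` that are connected by an edge."""
    pairs = list(combinations(sorted(vertices), 2))
    if not pairs:
        return 0.0
    present = sum(1 for u, v in pairs if frozenset((u, v)) in edges)
    return present / len(pairs)


def relevant_dimensions(vertices, attributes, max_width):
    """Indices of attribute dimensions in which all `vertices` lie within `max_width`."""
    n_dims = len(next(iter(attributes.values())))
    dims = []
    for d in range(n_dims):
        values = [attributes[v][d] for v in vertices]
        if max(values) - min(values) <= max_width:
            dims.append(d)
    return dims


def is_combined_cluster(vertices, edges, attributes,
                        min_density=0.8, max_width=0.5, min_dims=1):
    """True if the set is dense in the graph AND similar in at least `min_dims` dimensions.

    All thresholds are assumed values for illustration.
    """
    dense = density(vertices, edges) >= min_density
    dims = relevant_dimensions(vertices, attributes, max_width)
    return dense and len(dims) >= min_dims


# Made-up example: four vertices with three attributes each.
attributes = {
    "a": (1.0, 10.0, 3.0),
    "b": (1.2, 50.0, 3.1),
    "c": (0.9, 30.0, 2.9),
    "d": (7.0, 31.0, 3.0),
}
edges = {frozenset(e) for e in [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]}

triangle = {"a", "b", "c"}
print(relevant_dimensions(triangle, attributes, 0.5))
print(is_combined_cluster(triangle, edges, attributes))
print(is_combined_cluster({"a", "b", "c", "d"}, edges, attributes))
```

In this toy data, the vertices a, b, and c form a triangle and agree in the first and third attribute, while differing strongly in the second. A full-space comparison of the attribute vectors would miss this similarity, and a purely graph-based method would ignore it; evaluating both criteria on the same vertex set is the basic idea behind a combined approach.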

Example: Simultaneous mining of dense subgraphs and subspace clusters in a graph with attributed vertices