An Extension of the PMML Standard to Subspace Clustering Models

In today's applications, we face the challenge of analyzing databases with many attributes per object. For such high-dimensional data, traditional clustering algorithms are known to fail to detect meaningful patterns: mining the full space is futile. As a solution, subspace clustering techniques were introduced. They analyze arbitrary subspace projections of the data to detect clustering structures. Recently, publicly available mining software has begun to integrate subspace clustering as a novel mining paradigm, setting the stage for its wide applicability. However, a common standard to describe, exchange, and process subspace clustering results is still missing, which hinders application in practice.

In this work, we propose an extension of the PMML standard to describe mining models resulting from subspace clustering methods. We thereby bridge the gap between the different tools and establish a common baseline the user can rely on. Our extension covers the various aspects that subspace clustering models have to cope with, which go beyond those of traditional clustering. We will integrate this novel PMML extension into the next version of our OpenSubspace toolkit.

Authors: Günnemann S., Kremer H., Seidl T.
Published in: Workshop on Predictive Model Markup Language (PMML) in conjunction with the 17th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD 2011), San Diego, CA, USA
Publisher: ACM - New York, NY, USA
Language: EN
Year: 2011
Pages: 48-53
ISBN: 978-1-4503-0837-3
Conference: KDD
DOI: 10.1145/2023598.2023605
Type: Conference papers (peer reviewed)