Resource-Aware Distributed Clustering of Drifting Sensor Data Streams

Collecting data from sensor nodes is the ultimate goal of Wireless Sensor Networks. This is done by transmitting the sensed measurements to a data collecting station. In sensor nodes, radio communication is the dominant consumer of the usually limited energy resources. Summarizing the sensed data locally on the sensor nodes and sending only the summaries considerably saves energy. Clustering is an established data mining technique for grouping objects based on similarity. For sensor networks, k-center clustering aims at grouping sensor measurements into k groups, each containing similar measurements. In this paper we propose a novel resource-aware k-center clustering algorithm called SenClu. Our algorithm immediately detects new trends in the drifting sensor data stream and follows them. SenClu uses a lightweight decaying technique that gives lower influence to old data. As sensor data are usually noisy, our algorithm is also outlier-aware. In thorough experiments on drifting synthetic and real-world data sets, we show that SenClu outperforms two state-of-the-art algorithms by producing higher clustering quality and following trends in the stream, while consuming nearly the same amount of energy.

Authors: Hassani M., Seidl T.
Published in: The 4th International Conference on Networked Digital Technologies (NDT 2012), Dubai, UAE.
Publisher: Springer
Language: EN
Year: 2012
Pages: 592-607
ISBN: 978-3-642-30507-8
ISSN: 1865-0929
Conference: NDT
Url: NDT 2012
Type: Conference papers (peer reviewed)
Research topic: Data Analysis and Knowledge Extraction