AnyOut: Anytime Outlier Detection on Streaming Data

With the growth of sensor and monitoring applications, data mining on streaming data is receiving increasing research attention. Because data is generated continuously, mining algorithms must be able to analyze it in a single pass. In many applications the rate at which data objects arrive varies greatly. This has led to anytime mining algorithms for classification and clustering, which keep mining until they are interrupted, at a point not known in advance, by the arrival of the next object in the stream.

In this work we investigate anytime outlier detection: the problem of determining, within whatever period of time is available, whether an object in a data stream is anomalous. The more time is available, the more reliable the decision should be. We introduce AnyOut, an algorithm that solves anytime outlier detection, and investigate different approaches to building the underlying data structure. We propose a confidence measure for AnyOut that makes it possible to improve performance on constant data streams. We evaluate our method in thorough experiments and compare its performance to established baseline algorithms for outlier detection.
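
To make the anytime idea concrete, here is a minimal illustrative sketch in Python. It is not the AnyOut algorithm: the abstract does not specify the data structure or the scoring function, so the cluster hierarchy (`Node`), the distance-to-nearest-centroid score, and the `max_steps` and `deadline` interruption parameters are all assumptions made for illustration only. The sketch shows just the anytime property: a score is available after any amount of work, and additional time refines it against finer-grained clusters.

```python
import math
import time
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical cluster-hierarchy node: a centroid plus finer child clusters."""
    centroid: Sequence[float]
    children: List["Node"] = field(default_factory=list)


def anytime_score(
    root: Node,
    x: Sequence[float],
    max_steps: Optional[int] = None,
    deadline: Optional[float] = None,
) -> Tuple[float, int]:
    """Return (score, levels_descended) for object x.

    The score is the Euclidean distance from x to the centroid of the
    finest cluster reached before the interruption (an assumed score,
    not the one used in the paper). The interruption is simulated by a
    step budget and/or a time.monotonic() deadline.
    """
    node = root
    score = math.dist(x, node.centroid)  # coarse answer, available immediately
    steps = 0
    while node.children:
        if max_steps is not None and steps >= max_steps:
            break  # interrupted: step budget used up
        if deadline is not None and time.monotonic() >= deadline:
            break  # interrupted: the next stream object has arrived
        # Descend to the child cluster closest to x and refine the score.
        node = min(node.children, key=lambda c: math.dist(x, c.centroid))
        score = math.dist(x, node.centroid)
        steps += 1
    return score, steps


# Toy hierarchy: one root cluster, two children, two grandchildren.
tree = Node((0.0, 0.0), [
    Node((-5.0, 0.0)),
    Node((5.0, 0.0), [Node((4.0, 0.0)), Node((6.0, 0.0))]),
])

inlier = (6.0, 0.0)
outlier = (5.0, 40.0)
for budget in (0, 1, None):  # None means "run until the leaves are reached"
    print(budget, anytime_score(tree, inlier, budget), anytime_score(tree, outlier, budget))
```

In this toy hierarchy the inlier's score drops as more levels are processed, while the outlier's score stays large at every level, so a decision made after an early interruption is usable and later refinement makes it more reliable. In a stream setting, `deadline` would be set from the arrival time of the next object, which is not known in advance.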

Authors: Assent I., Kranen P., Baldauf C., Seidl T.
Published in: The 17th International Conference on Database Systems for Advanced Applications (DASFAA 2012), Busan, South Korea
Language: EN
Year: 2012
Conference: DASFAA
Type: Conference paper
Research area: Data Analysis and Knowledge Extraction