Content-Based Retrieval in Multimedia Databases
Content-based retrieval in multimedia databases is becoming an increasingly important task in modern application domains including medical imaging, molecular biology, mechanical engineering, e-commerce and e-services. New requirements for database systems arise from the high complexity of media contents. In addition, an integration of effective and efficient retrieval techniques into (object-) relational database systems is indispensible for industrial applications.
Our first step to cope with the characteristics of media objects is to present a variety of similarity models to formalize the content of multimedia objects. Most of the models follow the paradigm of feature-based similarity and represent complex objects by high-dimensional feature vectors. Quadratic form-based distance functions provide a particular adaptability to application requirements and individual user preferences and, thus, take the subjective character of similarity into account.
In addition to similarity models, we propose a bunch of algorithms to efficiently support similarity search on large multimedia databases by new approximation techniques, multidimensional index structures, and reduction of dimensionality. Particular emphasis is given to nearest neighbor queries as a fundamental query type for similarity search, and we develop an optimal filter-and-refine algorithm for k-nearest neighbor search. As our key idea for the desired integration into relational database systems, we investigate a novel approach based on the approximation of nearest neighbor cells by rectilinear boxes. The question remains how to manage one-dimensional range data in relational database systems.
After a thorough investigation of related work from the literature, we propose the Relational Interval Tree (RI-tree) as a novel technique for efficiently managing interval data in relational database systems. A theoretical analysis reveals the optimal complexity of storage space requirement, of I/O operations for insertions and deletions, and of I/Os for intersection query and point query processing. Experimental evaluations demonstrate the good performance of the structure, and competing techniques are outperformed by factors of 42 for disk accesses and 4.9 for query response time.
As a final proposal in this thesis, we apply the Relational Interval Tree to temporal databases and extend the algorithms to Allen's general interval relationships after, before, contains, during, equals, finishes, finishedBy, meets, metBy, overlaps, overlappedBy, starts, and startedBy. On top of theoretical analyses, experiments on 1,000,000 weblog session profiles confirm the good performance of our solutions.
|Published in:||Habilitationsschrift (postdoctoral thesis), Faculty of Mathematics and Computer Science, University of Munich|
|Forschungsgebiet:||Exploration of Multimedia Databases|