|Institution:||University of New South Wales|
|Department:||Computer Science & Engineering|
|Keywords:||Trajectories; Uncertainty; Spatial database|
|Full text PDF:||http://handle.unsw.edu.au/1959.4/54165|
Uncertainty is inherent in many important applications, and many important queries are re-investigated in the context of uncertain data models. Efficient algorithms are strongly demanded to analyze spatial uncertain data. This thesis studies four fundamental problems to analyze spatial uncertain data by proposing efficient query processing algorithms, including (1) find top k influential facilities, (2) identify top k dominating objects, (3) range search on uncertain trajectories, and (4) top k similarity join. Firstly, we study the problem of finding top k most influential facilities over uncertain objects. We propose a new ranking model to identify the top k most influential facilities, which captures influence of facilities on the uncertain objects. Effective and efficient algorithms are proposed following the filtering-verification paradigm by utilizing two uncertain object indexing techniques. To effectively support uncertain objects with a large number of instances, we further develop randomized algorithms with accuracy guarantee. Secondly, we study the problem of top k dominating query on uncertain data, which is an essential method in the multi-criteria decision analysis when an explicit scoring function is not available. We formally introduce the top k dominating model, and propose effective and efficient algorithms to identify the top k dominating objects. Novel pruning techniques are proposed by utilizing the spatial indexing and statistic information to reduce CPU and I/O costs. Thirdly, we investigate the problem of range search on uncertain trajectories by assuming uncertain trajectories are modeled by the Markov Chains. We propose a general framework for range search on uncertain trajectories following the filtering-refinement paradigm where summaries of uncertain trajectories are constructed to facilitate the filtering process. Statistics based and partition based filtering techniques are developed to enhance the filtering capabilities. Finally, we investigate the problem of top k similarity join over multi-valued objects. We apply two types of quantile based distance measures to explore the relative instance distribution among the multiple instances of objects. Following a filtering-refinement framework, efficient and effective techniques to process top k similarity joins over multi-valued objects are developed. Novel distance, statistic and weight based pruning techniques are proposed to speed up the computations.