SVM-based algorithms for multi-class classification and outlier detection and data streams problems

by Bo Liu

Institution: University of Technology, Sydney
Year: 2010
Record ID: 1069143
Full text PDF: http://hdl.handle.net/10453/34051


NO FULL TEXT AVAILABLE. This thesis contains 3rd party copyright material.

SVM-based methods, including one-class SVM, binary SVM and multi-class SVM, have shown great potential compared with many other classification methods. This thesis aims to develop a series of SVM-based algorithms to cope with the challenges in SVM-based multi-class classification, outlier detection and data streams. These challenges are briefly introduced as follows.

In multi-class classification, traditional SVM-based methods mainly map the dataset, with all of its classes, into a single feature space via one kernel function, and then construct an SVM in that space for each decomposed binary classification problem. However, it is not always possible, and is sometimes very costly, to find an appropriate kernel function that renders all the classes distinguishable from one another in a single feature space. This is because each class is derived from an individual uniform distribution, and the selection of the kernel function is related to the distribution of the individual classes. As a result, classification accuracy is often lower than expected. Improving the performance of multi-class classification is therefore a challenge for SVM-based multi-class classification algorithms.

In outlier detection, most existing work has not explicitly dealt with uncertainty in the input data. An underlying assumption is that the training dataset is perfectly labeled for building outlier detection models or classifiers. However, in many real-world applications, the data may be corrupted by noise or may be only partially complete. Another important observation is that negative examples, or outliers, although very few, do exist in many applications.
For example, in the network intrusion domain, in addition to extensive data about normal traffic conditions in the network, a small number of cyber attacks can also be collected to facilitate outlier detection. It is therefore very important to cope with data uncertainty and to incorporate a small number of outliers into the learning phase in order to improve the performance of outlier detection.

In one-class data streams, one of the challenges is to learn one-class classifiers on uncertain data streams. This is because the presence of uncertain data makes one-class learning far more difficult than traditional data stream learning. Coping with data uncertainty is therefore a key challenge in one-class uncertain data stream learning.

To cope with the above challenges, this thesis aims to (1) design a more efficient and accurate SVM-based multi-class classification algorithm; (2) design a novel and robust support vector data description (SVDD) approach for outlier detection with uncertain data; and (3) design a novel approach to one-class-based uncertain data stream learning. Firstly, the thesis designs a more efficient and accurate SVM-based multi-class classification algorithm. In order…