Global ETD Search

381	Ensemble diversity for class imbalance learning Wang, Shuo January 2011 This thesis studies the diversity issue of classification ensembles for class imbalance learning problems. Class imbalance learning refers to learning from imbalanced data sets, in which some classes of examples (minority) are highly under-represented compared to other classes (majority). The very skewed class distribution degrades the learning ability of many traditional machine learning methods, especially in the recognition of examples from the minority classes, which are often deemed to be more important and interesting. Although quite a few ensemble learning approaches have been proposed to handle the problem, no in-depth research exists to explain why and when they can be helpful. Our objectives are to understand how ensemble diversity affects the classification performance for a class imbalance problem according to single-class and overall performance measures, and to make best use of diversity to improve the performance. As the first stage, we study the relationship between ensemble diversity and generalization performance for class imbalance problems. We investigate mathematical links between single-class performance and ensemble diversity. It is found that how the single-class measures change along with diversity falls into six different situations. These findings are then verified in class imbalance scenarios through empirical studies. The impact of diversity on overall performance is also investigated empirically. Strong correlations between diversity and the performance measures are found. Diversity shows a positive impact on the recognition of the minority class and benefits the overall performance of ensembles in class imbalance learning. Our results help to understand if and why ensemble diversity can help to deal with class imbalance problems.
Encouraged by the positive role of diversity in class imbalance learning, we then focus on a specific ensemble learning technique, the negative correlation learning (NCL) algorithm, which considers diversity explicitly when creating ensembles and has achieved great empirical success. We propose a new learning algorithm based on the idea of NCL, named AdaBoost.NC, for classification problems. An "ambiguity" term decomposed from the 0-1 error function is introduced into the training framework of AdaBoost. It demonstrates superiority in both effectiveness and efficiency. Its good generalization performance is explained by theoretical and empirical evidence. It can be viewed as the first NCL algorithm specializing in classification problems. Most existing ensemble methods for class imbalance problems suffer from the problems of overfitting and over-generalization. To improve this situation, we address the class imbalance issue by making use of ensemble diversity. We investigate the generalization ability of NCL algorithms, including AdaBoost.NC, to tackle two-class imbalance problems. We find that NCL methods integrated with random oversampling are effective in recognizing minority class examples without losing the overall performance, especially the AdaBoost.NC tree ensemble. This is achieved by providing smoother and less overfitting classification boundaries for the minority class. The results here show the usefulness of diversity and open up a novel way to deal with class imbalance problems. Since the two-class imbalance is not the only scenario in real-world applications, multi-class imbalance problems deserve equal attention. To understand what problems multi-class can cause and how it affects the classification performance, we study the multi-class difficulty by analyzing the multi-minority and multi-majority cases respectively. Both lead to a significant performance reduction. The multi-majority case appears to be more harmful.
The results reveal possible issues that a class imbalance learning technique could have when dealing with multi-class tasks. Following this part of analysis and the promising results of AdaBoost.NC on two-class imbalance problems, we apply AdaBoost.NC to a set of multi-class imbalance domains with the aim of solving them effectively and directly. Our method shows good generalization in minority classes and balances the performance across different classes well without using any class decomposition schemes. Finally, we conclude this thesis with how the study has contributed to class imbalance learning and ensemble learning, and propose several possible directions for future research that may improve and extend this work. 006.3
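The abstract above mentions combining ensemble methods with random oversampling. As a rough illustration of that preprocessing step only (not of AdaBoost.NC or of the thesis's actual code), the following minimal sketch duplicates randomly chosen minority examples until every class is as large as the largest one. The function name `random_oversample` and the toy data are assumptions made for this example.

```python
import random
from collections import Counter


def random_oversample(examples, labels, seed=0):
    """Duplicate randomly chosen examples of the smaller classes until
    every class has as many examples as the largest class.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy imbalanced set: 8 majority examples, 2 minority examples.
X = list(range(10))
y = ["maj"] * 8 + ["min"] * 2
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Because the added examples are exact copies, this step on its own can encourage overfitting to the minority class, which is one of the problems the abstract says existing methods suffer from.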
382	Automated management cloud-platforms based on energy policies Alansari, Marwah January 2016 Delivering environmentally friendly services has become an important issue in Cloud Computing due to awareness provided by governments and environmental conservation organisations about the impact of electricity usage on carbon footprints. Cloud providers and cloud consumers (organisations/enterprises) have their own defined green policies to control energy consumption at their data centers. At service management level, green policies can be mapped as energy management policies or management policies. Focusing on the cloud consumer's side, management policies are described by business managers and can change regularly. The continuous change is driven by the nature of the technical environment, changes in regulation, and business requirements. Therefore, there is a gap between the level of describing and implementing management policies in the cloud environment. This thesis provides a method to bridge that gap by (a) defining a specification for formulating management policies into executable form for an infrastructure-as-a-service (IaaS) cloud model; (b) designing a framework to execute the described management policies automatically; (c) proposing a modelling and analysis method to identify the potential energy management policy that would save energy-cost. Each aspect covered in the thesis is evaluated with the help of an Energy Management Case Study for a private cloud scenario. 004.67
383	Cluster-based semi-supervised ensemble learning Soares, Rodrigo Gabriel Ferreira January 2014 Semi-supervised classification consists of acquiring knowledge from both labelled and unlabelled data to classify test instances. The cluster assumption represents one of the potential relationships between true classes and data distribution that semi-supervised algorithms assume in order to use unlabelled data. Ensemble algorithms have been widely and successfully employed in both supervised and semi-supervised contexts. In this Thesis, we focus on the cluster assumption to study ensemble learning based on a new cluster regularisation technique for multi-class semi-supervised classification. Firstly, we introduce a multi-class cluster-based classifier, the Cluster-based Regularisation (ClusterReg) algorithm. ClusterReg employs a new regularisation mechanism based on posterior probabilities generated by a clustering algorithm in order to avoid generating decision boundaries that traverse high-density regions. Such a method possesses robustness to overlapping classes and to scarce labelled instances on uncertain and low-density regions, when data follows the cluster assumption. Secondly, we propose a robust multi-class boosting technique, Cluster-based Boosting (CBoost), which implements the proposed cluster regularisation for ensemble learning and uses ClusterReg as base learner. CBoost is able to overcome possible incorrect pseudo-labels and produces better generalisation than existing classifiers. And, finally, since there are often datasets with a large number of unlabelled instances, we propose the Efficient Cluster-based Boosting (ECB) for large multi-class datasets. ECB extends CBoost and has lower time and memory complexities than state-of-the-art algorithms.
Such a method employs a sampling procedure to reduce the training set of base learners, an efficient clustering algorithm, and an approximation technique for nearest neighbours to avoid the computation of pairwise distance matrix. Hence, ECB enables semi-supervised classification for large-scale datasets. 004
384	Semantics, analysis and security of backtracking regular expression matchers Rathnayake, Asiri January 2015 Regular expressions are ubiquitous in computer science. Originally defined by Kleene in 1956, they have become a staple of the computer science undergraduate curriculum. Practical applications of regular expressions are numerous, ranging from compiler construction through smart text editors to network intrusion detection systems. Despite having been vigorously studied and formalized in many ways, recent practical implementations of regular expressions have drawn criticism for their use of a non-standard backtracking algorithm. In this research, we investigate the reasons for this deviation and develop a semantics view of regular expressions that formalizes the backtracking paradigm. In the process we discover a novel static analysis capable of detecting exponential runtime vulnerabilities, an extremely undesirable reality of backtracking regular expression matchers. 004
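To make the exponential-runtime issue mentioned above concrete, here is a minimal sketch of a toy backtracking matcher with a step counter. It is an illustration written for this listing, not the semantics or the static analysis developed in the thesis. The tuple-based regex encoding, the `match` function and the `EVIL` pattern, which encodes `(a|a)*b`, are all assumptions of the example.

```python
def match(regex, text):
    """Full-match `text` against a tiny regex AST by backtracking.

    AST nodes (hypothetical encoding for this sketch):
      ("chr", c), ("cat", r1, r2), ("alt", r1, r2), ("star", r)
    Returns (matched, steps), where steps counts matcher invocations.
    """
    steps = 0

    def m(node, i, k):
        # Match `node` at position i, then hand the new position to
        # the continuation k. Returning False triggers backtracking.
        nonlocal steps
        steps += 1
        kind = node[0]
        if kind == "chr":
            return i < len(text) and text[i] == node[1] and k(i + 1)
        if kind == "cat":
            return m(node[1], i, lambda j: m(node[2], j, k))
        if kind == "alt":
            return m(node[1], i, k) or m(node[2], i, k)
        if kind == "star":
            # Greedy: try another iteration (it must consume input),
            # and fall back to leaving the loop.
            return m(node[1], i, lambda j: j > i and m(node, j, k)) or k(i)
        raise ValueError(kind)

    matched = bool(m(regex, 0, lambda j: j == len(text)))
    return matched, steps


# (a|a)*b : every 'a' can be matched by either branch of the alternation.
EVIL = ("cat", ("star", ("alt", ("chr", "a"), ("chr", "a"))), ("chr", "b"))

for n in (4, 8, 12):
    ok, steps = match(EVIL, "a" * n)  # no trailing 'b', so the match fails
    print(n, ok, steps)
```

On inputs of the form `aaa...a` the matcher has to try both alternation branches for every character before it can report failure, so the step count roughly doubles with each extra character. Matchers based on automata simulation avoid this blow-up for such patterns.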
385	Technology validation for e-trial systems Algharibi, Amani Jaber H. January 2016 This research study presents a Hypothesised Model, developed on the basis of the Unified Theory of Acceptance and Use of Technology (UTAUT). Its aim is to evaluate innovative Health Information Technology (HIT) at the early stages of projects. It is contended that this practice would support system developers at the design and implementation phases, and reduce the risk of underutilisation or rejection. The performance of the model was tested in three studies within the Clinical Trial Management Systems framework. The Hypothesised Model approaches Behavioural Intention from a socio-technical point of view, taking into consideration the complexity and need of HIT to achieve joint optimisation. Moreover, it simplifies and extends UTAUT so that it may fit soundly within the healthcare context. Hence, it excludes the moderators and adds three core constructs: System-Specific Features, Technology Anxiety, and Adaptation Timeline. However, the model is easily adjustable to fit specific situations, especially given that this research study posits the non-existence of a single model that suits all situations. This approach appears to have improved the final outcome and outperformed the use of generic models within the healthcare context. The total explained variance reported from the three studies is 76%, 86%, and 87% respectively. 610.72
386	Model-based high-density functional diffuse optical tomography of human brain Zhan, Yuxuan January 2013 Functional diffuse optical tomography (fDOT) is an emerging functional neuroimaging technology that allows non-invasive imaging of human brain functions. The aims of this thesis are to enhance current understandings and knowledge of fDOT image quality and to improve on its imaging performance using a model-based approach. Specifically we have established a computationally efficient finite element method (FEM)-based routine to conduct MRI-guided fDOT simulation studies. Based on this framework, we have demonstrated that HD-fDOT is capable of imaging focal haemodynamic response up to 18 mm depth below the human scalp surface at 10 mm image resolution and localisation accuracy, allowing distinguishability of gyri. In addition, we also investigated the effects of uncertainty in the background tissue optical property on HD-fDOT image quality. Our multi-model comparative study has concluded that the use of a proposed homogeneous background absorption fitting scheme in HD-fDOT can minimise the chances of obtaining sub-optimal image quality due to uncertainty in background tissue optical properties. Finally we have addressed and resolved a regularisation problem in spectral fDOT that was previously not understood. Our proposed singular-decomposition-based regularisation method has been shown to reduce imaging crosstalk observed in both spectral and non-spectral fDOT. 004
387	Adaptive operator search for the capacitated arc routing problem Consoli, Pietro A. January 2018 Evolutionary Computation approaches for Combinatorial Optimization have been successfully proposed for a plethora of different NP-Hard Problems. This research area has achieved notable results and made remarkable progress, and it has ultimately established itself as one of the most studied in AI. Yet, predicting the approximation ability of Evolutionary Algorithms (EAs) on novel problem instances remains a difficult task. As a consequence, their application in a real-world optimization context is reduced, as EAs are often considered not reliable and mature enough to be adopted in an industrial scenario. This thesis proposes new approaches to endow such meta-heuristics with a mechanism that would allow them to extract information during the search and to adaptively use such information in order to modify their behaviour and ultimately improve their performance. We consider the case study of the Capacitated Arc Routing Problem (CARP), to demonstrate the effectiveness of adaptive search techniques in a complex problem deeply connected with real-world scenarios. In particular, the main contributions of this thesis are: 1. An investigation of the adoption of a Parameter Tuning mechanism to adaptively choose the crossover operator that is used during the search; 2. The study of a novel Adaptive Operator Selection technique based on the use of Fitness Landscape Analysis techniques and on Online Learning; 3. A novel approach based on Knowledge Incorporation focusing on the reuse of information learned from the execution of a meta-heuristic on past instances, which is later used to improve the performance on newly encountered ones. 004
388	Architectural stability of self-adaptive software systems Salama, Maria Mourad Ebeid Meleka January 2018 This thesis studies the notion of stability in software engineering with the aim of understanding its dimensions, facets and aspects, as well as characterising it. The thesis further investigates the aspect of behavioural stability at the architectural level, as a property concerned with the architecture's capability in maintaining the achievement of expected quality of service and accommodating runtime changes, in order to delay the architecture drifting and phasing-out as a consequence of the continuous unsuccessful provision of quality requirements. The research aims to provide a systematic and methodological support for analysing, modelling, designing and evaluating architectural stability. The novelty of this research is the consideration of stability during runtime operation, by focusing on the stable provision of quality of service without violations. As the runtime dimension is associated with adaptations, the research investigates stability in the context of self-adaptive software architectures, where runtime stability is challenged by the quality of adaptation, which in turn affects the quality of service. The research evaluation focuses on the effectiveness, scale and accuracy in handling runtime dynamics, using the self-adaptive cloud architectures. 004
389	Implementation of symbolic model checking for probabilistic systems Parker, David Anthony January 2003 In this thesis, we present efficient implementation techniques for probabilistic model checking, a method which can be used to analyse probabilistic systems such as randomised distributed algorithms, fault-tolerant processes and communication networks. A probabilistic model checker inputs a probabilistic model and a specification, such as "the message will be delivered with probability 1", "the probability of shutdown occurring is at most 0.02" or "the probability of a leader being elected within 5 rounds is at least 0.98", and can automatically verify if the specification is true in the model. Motivated by the success of symbolic approaches to non-probabilistic model checking, which are based on a data structure called binary decision diagrams (BDDs), we present an extension to the probabilistic case, using multi-terminal binary decision diagrams (MTBDDs). We demonstrate that MTBDDs can be used to perform probabilistic analysis of large, structured models with more than 7.5 billion states, way out of the reach of conventional, explicit techniques, based on sparse matrices. We also propose a novel, hybrid approach, combining features of both symbolic and explicit implementations and show, using results from a wide range of case studies, that this technique can almost match the speed of sparse matrix based implementations, but uses significantly less memory. This increases, by approximately an order of magnitude, the size of model which can be handled on a typical workstation. 005
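The kind of question a probabilistic model checker answers can be sketched on a very small example. The code below is a minimal explicit-state illustration, assuming a discrete-time Markov chain stored as a dictionary of successor probabilities; it is neither the MTBDD-based nor the hybrid technique of the thesis. It computes the probability of eventually reaching a target state by value iteration. The function `reach_probability` and the four-state message-delivery model are hypothetical.

```python
def reach_probability(transitions, targets, tol=1e-12, max_iters=10_000):
    """Probability of eventually reaching a state in `targets` from each
    state of a discrete-time Markov chain.

    `transitions` maps state -> {successor: probability}, a sparse
    explicit representation. Hypothetical helper for illustration.
    """
    states = set(transitions)
    for succ in transitions.values():
        states.update(succ)

    # Start from 0 everywhere except the targets, then iterate
    # x(s) = sum_t P(s, t) * x(t) until the values stop changing.
    prob = {s: 1.0 if s in targets else 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if s in targets:
                continue
            new = sum(p * prob[t] for t, p in transitions.get(s, {}).items())
            delta = max(delta, abs(new - prob[s]))
            prob[s] = new
        if delta < tol:
            break
    return prob


# Toy model: a send attempt succeeds with probability 0.9; a lost
# message is retried half of the time and abandoned otherwise.
model = {
    "try": {"done": 0.9, "lost": 0.1},
    "lost": {"try": 0.5, "fail": 0.5},
    "done": {"done": 1.0},
    "fail": {"fail": 1.0},
}
p = reach_probability(model, {"done"})
print(p["try"])         # probability that the message is delivered
print(p["try"] >= 0.9)  # check a specification "at least 0.9"
```

For this model the answer can be checked by hand: with x the probability from `try`, x = 0.9 + 0.1 * 0.5 * x, so x = 0.9 / 0.95, roughly 0.947. An explicit representation like this stores every state individually, which is why it does not scale to the model sizes quoted in the abstract.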
390	Learning by observation using Qualitative Spatial Relations Young, Jay January 2016 We present an approach to the problem of learning by observation in spatially-situated tasks, whereby an agent learns to imitate the behaviour of an observed expert, with no direct interaction and limited observations. The form of knowledge representation used for these observations is crucial, and we apply Qualitative Spatial-Relational representations to compress continuous, metric state-spaces into symbolic states to maximise the generalisability of learned models and minimise knowledge engineering. Our system self-configures these representations of the world to discover configurations of features most relevant to the task, and thus build good predictive models. We then show how these models can be employed by situated agents to control their behaviour, closing the loop from observation to practical implementation. We evaluate our approach in the simulated RoboCup Soccer domain and the Real-Time Strategy game Starcraft, and successfully demonstrate how a system using our approach closely mimics the behaviour of both synthetic (AI controlled) players, and also human-controlled players through observation. We further evaluate our work in Reinforcement Learning tasks in these domains, and show that our approach improves the speed at which such models can be learned. 004.6
