Global ETD Search

371	Automatic behavioural analysis of malware Santoro, Tiziano January 2010 With malware becoming more and more widespread and at the same time more sophisticated in its attack techniques, countermeasures need to be set up so that new kinds of threats can be identified and dismantled in the shortest possible time, before they cause harm to the system under attack. With new behaviour patterns like those shown by polymorphic and metamorphic viruses, static analysis is no longer a reliable way to detect those threats, and behaviour analysis seems a good candidate to fight against the next-generation families of viruses. In this project, we describe a methodology to analyze and categorize binaries solely on the basis of their behaviour, in terms of their interaction with the Operating System, other processes and the network. The approach can strengthen host-based intrusion detection systems by a timely classification of unknown but similar malware code. It has been evaluated on a dataset from the research community and tried on a smaller data set from local companies collected at University of Mondragone. TECHNOLOGY TEKNIKVETENSKAP
372	Probabilistic Graphical Models and Algorithms for Jiao, Feng January 2008 In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. First, in the machine learning work, I focus on a special kind of graphical model---conditional random fields (CRFs). Here, I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labellers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene name entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification. Second, in my computational biology work, I focus mainly on protein problems. In particular, I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems. In so doing, I reveal why tree decomposition is a good method for many protein problems. Then, I propose a computational framework for detection of similar structures of a target protein with sparse NMR data, which can help to predict protein structure using experimental data. Finally, I propose a new machine learning approach---LS_Boost---to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction. After a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-Score method and other machine learning methods. machine learning computational biology Computer Science
373	Pre-processing of tandem mass spectra using machine learning methods Ding, Jiarui 27 May 2009 Protein identification has become increasingly helpful in the diagnosis and treatment of many diseases, such as cancer, heart disease and HIV. Tandem mass spectrometry is a powerful tool for protein identification. In a typical experiment, proteins are broken into small amino acid oligomers called peptides. By determining the amino acid sequence of several peptides of a protein, its whole amino acid sequence can be inferred. Therefore, peptide identification is the first step and a central issue for protein identification. Tandem mass spectrometers can produce a large number of tandem mass spectra which are used for peptide identification. Two issues should be addressed to improve the performance of current peptide identification algorithms. Firstly, nearly all spectra are noise-contaminated. As a result, the accuracy of peptide identification algorithms may suffer from the noise in spectra. Secondly, the majority of spectra are not identifiable because they are of too poor quality. Therefore, much time is wasted attempting to identify these unidentifiable spectra. The goal of this research is to design spectrum pre-processing algorithms to both speed up and improve the reliability of peptide identification from tandem mass spectra. Firstly, as a tandem mass spectrum is a one-dimensional signal consisting of dozens to hundreds of peaks, and the majority of peaks are noisy peaks, a spectrum denoising algorithm is proposed to remove most noisy peaks of spectra. Experimental results show that our denoising algorithm can remove about 69% of peaks, which are potentially noisy peaks, from a spectrum. At the same time, the number of spectra that can be identified by the Mascot algorithm increases by 31% and 14% for two tandem mass spectrum datasets.
Next, a two-stage recursive feature elimination based on support vector machines (SVM-RFE) and a sparse logistic regression method are proposed to select the most relevant features to describe the quality of tandem mass spectra. Our methods can effectively select the most relevant features in terms of the performance of classifiers trained with different numbers of features. Thirdly, both supervised and unsupervised machine learning methods are used for the quality assessment of tandem mass spectra. A supervised classifier (a support vector machine) can be trained to remove more than 90% of poor quality spectra without removing more than 10% of high quality spectra. Clustering methods such as model-based clustering are also used for quality assessment to remove the need for a labeled training dataset, and they show promising results.
374	A study on machine learning algorithms for fall detection and movement classification Ralhan, Amitoz Singh 04 January 2010 Falls among the elderly are an important health issue. Fall detection and movement tracking techniques are therefore instrumental in dealing with this issue. This thesis responds to the challenge of classifying different movement types as a part of a system designed to fulfill the need for a wearable device to collect data for fall and near-fall analysis. Four different fall activities (forward, backward, left and right), three normal activities (standing, walking and lying down) and near-fall situations are identified and detected. Different machine learning algorithms are compared and the best one is used for the real-time classification. The comparison is made using the Waikato Environment for Knowledge Analysis (WEKA). The system also has the ability to adapt to the different gaits of different people. A feature selection algorithm is also introduced to reduce the number of features required for the classification problem. Machine Learning Fall Detection Feature Selection
375	Metareasoning about propagators for constraint satisfaction Thompson, Craig Daniel Stewart 11 July 2011 Given the breadth of constraint satisfaction problems (CSPs) and the wide variety of CSP solvers, it is often very difficult to determine a priori which solving method is best suited to a problem. This work explores the use of machine learning to predict which solving method will be most effective for a given problem. We use four different problem sets to determine the CSP attributes that can be used to determine which solving method should be applied. After choosing an appropriate set of attributes, we determine how well J48 decision trees can predict which solving method to apply. Furthermore, we take a cost-sensitive approach such that problem instances where there is a great difference in runtime between algorithms are emphasized. We also attempt to use information gained on one class of problems to inform decisions about a second class of problems. Finally, we show that the additional costs of deciding which method to apply are outweighed by the time savings compared to applying the same solving method to all problem instances. performance prediction algorithm selection propagation machine learning
376	Fatigue effect on task performance in haptic virtual environment for home-based rehabilitation Yang, Chun 11 July 2011 Stroke rehabilitation aims to train the motor function of a patient's limb. In this process, functional assessment is of importance, and it is primarily based on a patient's task performance. The context of the rehabilitation discussed in this thesis is such that functional assessment is conducted through a computer system and the Internet. In particular, a patient performs the task at home in a haptic virtual environment, and the task performance is transmitted to the therapist over the Internet. One problem with this approach to functional assessment is that little is known to the therapist about a patient's mind state. This immediately leads to one question: whether an elevated mind state will have a significant effect on the patient's task performance. If so, this approach can result in a considerable error. The overall objective of this thesis study was to generate an answer to the aforementioned question. The study focused on a patient's elevated fatigue state. The specific objectives of the study include: (i) developing a haptic virtual environment prototype system for functional assessment, (ii) developing a physiological-based inference system for fatigue state, and (iii) performing an experiment to generate knowledge regarding the fatigue effect on task performance. With limited resources for recruiting patients for the experiment, the study conducted a few experiments on patients but mostly on healthy subjects.
The study has concluded: (1) the proposed haptic virtual environment system is effective for the wrist coordination task and is likely promising for other tasks, (2) the proposed fatigue inference system achieves an accuracy of 89.54% for two levels of fatigue state, which is promising, (3) the elevated fatigue state significantly affects task performance in the context of the wrist coordination task, and (4) the accuracy of the individual-based inference approach is significantly higher than that of the group-based inference approach. The main contributions of the thesis are (1) generation of new knowledge regarding the fatigue effect on task performance in the context of home-based rehabilitation, (2) provision of a new fatigue inference system with the highest accuracy in comparison with the existing approaches in the literature, and (3) generation of new knowledge regarding the difference between the individual-based inference and group-based inference approaches. Stroke Mind state Machine learning Task performance
377	Forecasting exchange rates using machine learning models with time-varying volatility Garg, Ankita January 2012 This thesis is focused on investigating the predictability of exchange rate returns at monthly and daily frequencies using models that have been mostly developed in the machine learning field. The forecasting performance of these models will be compared to the Random Walk, which is the benchmark model for financial returns, and the popular autoregressive process. The machine learning models that will be used are Regression trees, Random Forests, Support Vector Regression (SVR), Least Absolute Shrinkage and Selection Operator (LASSO) and Bayesian Additive Regression trees (BART). A characterizing feature of financial returns data is the presence of volatility clustering, i.e. the tendency of persistent periods of low or high variance in the time series. This conflicts with the machine learning models, which implicitly assume a constant variance. We therefore extend these models with the most widely used model for volatility clustering, the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) process. This allows us to jointly estimate the time-varying variance and the parameters of the machine learning models using an iterative procedure. These GARCH-extended machine learning models are then applied to make one-step-ahead predictions by recursive estimation, so that the parameters estimated by the model are also updated with the new information. In order to predict returns, information related to the economic variables and the lagged variable will be used. This study is repeated on three different exchange rate returns: EUR/SEK, EUR/USD and USD/SEK in order to obtain robust results. Our results show that machine learning models are capable of forecasting exchange returns at both daily and monthly frequencies. The results were mixed, however.
Overall, GARCH-extended SVR shows great potential for improving the predictive performance of exchange rate return forecasts. Forecasting exchange rates machine learning models
378	A magnetic intruder detection system based on cloud computing Sun, Rui-Ting 21 November 2012 Taiwan is surrounded by ocean, so ocean transportation has become a necessary support of Taiwan's economy. For this reason, this research provides a system based on cloud computing and distributed storage, which is applied to process large amounts of data provided by many sensors at sea in order to diagnose the existence of possible magnetized invaders. We use the Hadoop platform from the Apache Foundation to perform distributed K-means clustering on the data collected from many sensor nodes containing DGPS and magnetic sensors. With these data, it is possible to diagnose the existence and the moving direction of a possible invader, and the result can be returned to a remote monitoring terminal. Not only can K-means clustering detect irregularities along any axis of the magnetic field well, but the system also obtains good reliability and performance from the Hadoop platform. machine learning artificial intelligence cloud computing
379	Metrics for sampling-based motion planning Morales Aguirre, Marco Antonio 15 May 2009 A motion planner finds a sequence of potential motions for a robot to transit from an initial to a goal state. To deal with the intractability of this problem, a class of methods known as sampling-based planners build approximate representations of potential motions through random sampling. This selective random exploration of the space has produced many remarkable results, including solving many previously unsolved problems. Sampling-based planners usually represent the motions as a graph (e.g., the Probabilistic Roadmap Methods or PRMs), or as a tree (e.g., the Rapidly exploring Random Tree or RRT). Although many sampling-based planners have been proposed, we do not know how to select among them because their different sampling biases make their performance depend on the features of the planning space. Moreover, since a single problem can contain regions with vastly different features, there may not exist a simple exploration strategy that will perform well in every region. Unfortunately, we lack quantitative tools to analyze problem features and planner performance that would enable us to match planners to problems. We introduce novel metrics for the analysis of problem features and planner performance at multiple levels: node level, global level, and region level. At the node level, we evaluate how new samples improve coverage and connectivity of the evolving model. At the global level, we evaluate how new samples improve the structure of the model. At the region level, we identify groups or regions that share similar features. This is a set of general metrics that can be applied in both graph-based and tree-based planners. We show several applications for these tools to compare planners, to decide whether to stop planning or to switch strategies, and to adjust sampling in different regions of the problem. Motion Planning Metrics Machine Learning Robotics
380	Predicting homologous signaling pathways using machine learning Bostan, Babak. January 2009 Thesis (M. Sc.)--University of Alberta, 2009. / Title from PDF file main screen (viewed on Nov. 27, 2009). "A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science, Department of Computing Science, University of Alberta." Includes bibliographical references.
