Global ETD Search

11	A study on machine learning algorithms for fall detection and movement classification Ralhan, Amitoz Singh 04 January 2010 Falls among the elderly are an important health issue. Fall detection and movement tracking techniques are therefore instrumental in dealing with this issue. This thesis responds to the challenge of classifying different movement types as part of a system designed to fulfill the need for a wearable device to collect data for fall and near-fall analysis. Four different fall activities (forward, backward, left and right), three normal activities (standing, walking and lying down) and near-fall situations are identified and detected. Different machine learning algorithms are compared and the best one is used for real-time classification. The comparison is made using the Waikato Environment for Knowledge Analysis (WEKA). The system can also adapt to the different gaits of different people. A feature selection algorithm is also introduced to reduce the number of features required for the classification problem. Machine Learning Fall Detection Feature Selection
13	Voice and lip based speaker verification Pandit, Medha January 2000 (has links) No description available. 621.3994
14	Feature selection via joint likelihood Pocock, Adam Craig January 2012 (has links) We study the nature of filter methods for feature selection. In particular, we examine information theoretic approaches to this problem, looking at the literature over the past 20 years. We consider this literature from a different perspective, by viewing feature selection as a process which minimises a loss function. We choose to use the model likelihood as the loss function, and thus we seek to maximise the likelihood. The first contribution of this thesis is to show that the problem of information theoretic filter feature selection can be rephrased as maximising the likelihood of a discriminative model. From this novel result we can unify the literature revealing that many of these selection criteria are approximate maximisers of the joint likelihood. Many of these heuristic criteria were hand-designed to optimise various definitions of feature "relevancy" and "redundancy", but with our probabilistic interpretation we naturally include these concepts, plus the "conditional redundancy", which is a measure of positive interactions between features. This perspective allows us to derive the different criteria from the joint likelihood by making different independence assumptions on the underlying probability distributions. We provide an empirical study which reinforces our theoretical conclusions, whilst revealing implementation considerations due to the varying magnitudes of the relevancy and redundancy terms. We then investigate the benefits our probabilistic perspective provides for the application of these feature selection criteria in new areas. The joint likelihood automatically includes a prior distribution over the selected feature sets and so we investigate how including prior knowledge affects the feature selection process. 
We can now incorporate domain knowledge into feature selection, allowing the imposition of sparsity on the selected feature set without using heuristic stopping criteria. We investigate the use of priors mainly in the context of Markov Blanket discovery algorithms, in the process showing that a family of algorithms based upon IAMB are iterative maximisers of our joint likelihood with respect to a particular sparsity prior. We thus extend the IAMB family to include a prior for domain knowledge in addition to the sparsity prior. Next we investigate what the choice of likelihood function implies about the resulting filter criterion. We do this by applying our derivation to a cost-weighted likelihood, showing that this likelihood implies a particular cost-sensitive filter criterion. This criterion is based on a weighted branch of information theory and we prove several novel results justifying its use as a feature selection criterion, namely the positivity of the measure, and the chain rule of mutual information. We show that the feature set produced by this cost-sensitive filter criterion can be used to convert a cost-insensitive classifier into a cost-sensitive one by adjusting the features the classifier sees. This can be seen as an analogous process to that of adjusting the data via over or undersampling to create a cost-sensitive classifier, but with the crucial difference that it does not artificially alter the data distribution. Finally we conclude with a summary of the benefits this loss function view of feature selection has provided. This perspective can be used to analyse other feature selection techniques other than those based upon information theory, and new groups of selection criteria can be derived by considering novel loss functions. 006.3
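To make the "relevancy" and "redundancy" terms in the abstract above concrete, here is a minimal sketch of a greedy information-theoretic filter. It assumes discrete features and uses an mRMR-style score (relevancy minus mean redundancy) as a stand-in for the family of heuristic criteria the abstract mentions; it is not the thesis's joint-likelihood derivation, and the function names and toy data are invented for this example.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def greedy_filter(features, labels, k):
    """Select k feature names by relevancy minus mean redundancy.

    `features` maps a (hypothetical) feature name to its column of values.
    Ties are broken alphabetically so the result is deterministic.
    """
    selected = []
    remaining = set(features)

    def score(name):
        relevancy = mutual_information(features[name], labels)
        if not selected:
            return relevancy
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevancy - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): the label is a + c, and b is an exact copy of a.
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 0, 0, 1, 1, 1, 1],
    "c": [0, 0, 1, 1, 0, 0, 1, 1],
}
labels = [0, 0, 1, 1, 1, 1, 2, 2]
picked = greedy_filter(features, labels, 2)
print(picked)  # ['a', 'c']
```

In this toy case the duplicate feature b is as relevant as a but fully redundant with it, so the second pick is c.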
15	Modern variable selection techniques in the generalised linear model with application in Biostatistics Millard, Salomi 10 1900 (has links) In a Biostatistics environment, the datasets to be analysed are frequently high-dimensional and multicollinearity is expected due to the nature of the features. However, many traditional approaches to statistical analysis and feature selection cease to be useful in the presence of high-dimensionality and multicollinearity. Penalised regression methods have proved to be practical and attractive for dealing with these problems. In this dissertation, we propose a new penalised approach, the modified elastic-net (MEnet), for statistical analysis and feature selection using a combination of the ridge and bridge penalties. This method is designed to deal with high-dimensional problems with highly correlated predictor variables. Furthermore, it has a closed-form solution, unlike the most frequently used penalised techniques, which makes it simple to implement on high-dimensional data. We show how this approach can be used to analyse high-dimensional data with binary responses, e.g., microarray data, and simultaneously select significant features. An extensive simulation study and analysis of a colon cancer dataset demonstrate the properties and practical aspects of the proposed method. / Mini Dissertation (MSc (Advanced Data Analytics))--University of Pretoria, 2020. / DSI-CSIR Interbursary Support (IBS) Programme / Statistics Industry HUB, Department of Statistics, University of Pretoria / Statistics / MSc / Restricted Mathematical statistics Penalised regression Feature selection UCTD
16	Machine Learning Identification of Protein Properties Useful for Specific Applications Khamis, Abdullah M. 31 March 2016 (has links) Proteins play critical roles in cellular processes of living organisms. It is therefore important to identify and characterize their key properties associated with their functions. Correlating protein’s structural, sequence and physicochemical properties of its amino acids (aa) with protein functions could identify some of the critical factors governing the specific functionality. We point out that not all functions of even well studied proteins are known. This, complemented by the huge increase in the number of newly discovered and predicted proteins, makes challenging the experimental characterization of the whole spectrum of possible protein functions for all proteins of interest. Consequently, the use of computational methods has become more attractive. Here we address two questions. The first one is how to use protein aa sequence and physicochemical properties to characterize a family of proteins. The second one focuses on how to use transcription factor (TF) protein’s domains to enhance accuracy of predicting TF DNA binding sites (TFBSs). To address the first question, we developed a novel method using computational representation of proteins based on characteristics of different protein regions (N-terminal, M-region and C-terminal) and combined these with the properties of protein aa sequences. We show that this description provides important biological insight about characterization of the protein functional groups. Using feature selection techniques, we identified key properties of proteins that allow for very accurate characterization of different protein families. We demonstrated efficiency of our method in application to a number of antimicrobial peptide families. 
To address the second question we developed another novel method that uses a combination of aa properties of DNA binding domains of TFs and their TFBS properties to develop machine learning models for predicting TFBSs. Feature selection is used to identify the most relevant characteristics of the aa for such modeling. In addition to reducing the number of required models to only 14 for several hundred TFs, the final prediction accuracy of our models appears dramatically better than with other methods. Overall, we show how to efficiently utilize properties of proteins in deriving more accurate solutions for two important problems of computational biology and bioinformatics. Machine Learning feature selection protein properties Bioinformatics
17	Evaluating and enhancing the security of cyber physical systems using machine learning approaches Sharma, Mridula 08 April 2020 (has links) The main aim of this dissertation is to address the security issues of the physical layer of Cyber Physical Systems. The network security is first assessed using a 5-level Network Security Evaluation Scheme (NSES). The network security is then enhanced using a novel Intrusion Detection System (IDS) that is designed using Supervised Machine Learning. Defined as a complete architecture, this framework includes a complete packet analysis of the radio traffic of the Routing Protocol for Low-Power and Lossy Networks (RPL). A dataset of 300 different simulations of an RPL network is defined for normal traffic, hello flood attack, DIS attack, increased version attack and decreased rank attack. The IDS is a multi-model detection model that provides efficient detection of known as well as new attacks. The model is analysed with the cross-validation method as well as with new data from a similar network. For the known attacks, the model achieved 99% accuracy; for the new attack, it achieved 85% accuracy. / Graduate CPS Supervised Machine Learning RPL Feature Selection
18	Application of Hyper-geometric Hypothesis-based Quantification and Markov Blanket Feature Selection Methods to Generate Signals for Adverse Drug Reaction Detection Zhang, Yi January 2012 (has links) No description available. Mechanical Engineering Pharmacovigilance Data Mining Feature Selection
19	Effective Linear-Time Feature Selection Pradhananga, Nripendra January 2007 (has links) The classification learning task requires selection of a subset of features to represent the patterns to be classified. This is because the performance of the classifier and the cost of classification are sensitive to the choice of the features used to construct the classifier. Exhaustive search is impractical since it searches every possible combination of features. The runtime of heuristic and random searches is better, but the problem persists when dealing with high-dimensional datasets. We investigate a heuristic, forward, wrapper-based approach, called Linear Sequential Selection, which limits the search space at each iteration of the feature selection process. We also introduce randomization in the search space; the resulting algorithm is called Randomized Linear Sequential Selection. Our experiments demonstrate that both methods are faster, find smaller subsets and can even increase the classification accuracy. We also explore the idea of ensemble learning. We propose two ensemble creation methods, Feature Selection Ensemble and Random Feature Ensemble. Both methods apply a feature selection algorithm to create the individual classifiers of the ensemble. Our experiments show that both methods work well with high-dimensional data. filter wrapper feature selection attribute selection ensemble learning machine learning Linear Feature Selection
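The forward, wrapper-based idea in the abstract above can be sketched generically as follows. This is plain sequential forward selection with a leave-one-out 1-nearest-neighbour classifier as the wrapped evaluator; the abstract does not say how Linear Sequential Selection limits the search space at each iteration, so that restriction is not modelled here. All names and the toy data are assumptions for illustration.

```python
def loo_1nn_accuracy(rows, labels, cols):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the feature columns listed in `cols`."""
    correct = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[c] - rows[j][c]) ** 2 for c in cols),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def forward_select(rows, labels, n_features):
    """Greedy forward wrapper: add the column that most improves accuracy,
    stop as soon as no remaining column improves it."""
    selected, best_score = [], 0.0
    candidates = list(range(n_features))
    while candidates:
        scored = [
            (loo_1nn_accuracy(rows, labels, selected + [c]), c)
            for c in candidates
        ]
        # Prefer the higher score; on ties prefer the lower column index.
        score, col = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(col)
        candidates.remove(col)
        best_score = score
    return selected, best_score


# Toy data (assumed): column 0 separates the classes, column 1 is misleading.
rows = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 5.1), (1.1, 1.1), (1.2, 9.1)]
labels = [0, 0, 0, 1, 1, 1]
chosen, accuracy = forward_select(rows, labels, n_features=2)
print(chosen, accuracy)  # [0] 1.0
```

Each iteration retrains and evaluates the classifier once per remaining candidate, which is why wrapper searches become expensive on high-dimensional data.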
20	Developing integrated data fusion algorithms for a portable cargo screening detection system Ayodeji, Akiwowo January 2012 (has links) Towards a one-size-fits-all solution to cocaine detection at borders, this thesis proposes a systematic cocaine detection methodology that can use raw data output from a fibre optic sensor to produce a set of unique features whose decisions can be combined to lead to reliable output. This multidisciplinary research makes use of real data sourced from a cocaine-analyte-detecting fibre optic sensor developed by one of the collaborators, City University, London. This research advocates a two-step approach. In the first step, the raw sensor data are collected and stored. Level-one fusion, i.e. analysis, pre-processing and feature extraction, is performed at this stage. In step two, using experimentally pre-determined thresholds, each feature decides on detection of cocaine or otherwise with a corresponding posterior probability. High-level sensor fusion is then performed on this output locally to combine these decisions and their probabilities at time intervals. Output from every time interval is stored in the database and used as prior data for the next time interval. The final output is a decision on detection of cocaine. The key contributions of this thesis include investigating the use of data fusion techniques as a solution for overcoming challenges in the real-time detection of cocaine using fibre optic sensor technology, together with an innovative user interface design. A generalizable sensor fusion architecture is suggested and implemented using the Bayesian and Dempster-Shafer techniques. The results from the implemented experiments show great promise with this architecture, especially in overcoming sensor limitations. A 5-fold cross-validation system using a 12-13-1 Neural Network was used in validating the feature selection process. 
This validation step yielded a true positive rate of 89.5% and a false alarm rate of 10.5%, with a correlation coefficient of 0.8. Using the Bayesian technique, it is possible to achieve 100% detection, whilst the Dempster-Shafer technique achieves 95% detection using the same features as inputs to the data fusion (DF) system.
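The two fusion techniques named in the abstract above can be illustrated with a small sketch. It assumes a two-hypothesis frame ("cocaine" or "clear"), shows a Bayesian update in which each interval's posterior becomes the next interval's prior, and shows Dempster's rule of combination for two mass functions. The probabilities, masses and function names are invented for illustration and are not taken from the thesis.

```python
from itertools import product


def bayes_update(prior, p_obs_if_present, p_obs_if_absent):
    """Posterior probability of 'cocaine present' after one observation."""
    num = prior * p_obs_if_present
    return num / (num + (1.0 - prior) * p_obs_if_absent)


def dempster_combine(m1, m2):
    """Dempster's rule: combine two mass functions keyed by frozenset."""
    combined, conflict = {}, 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        common = b & c
        if common:
            combined[common] = combined.get(common, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c  # mass assigned to contradictory pairs
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Bayesian fusion over two time intervals (likelihoods are assumed values).
belief = 0.5
for _ in range(2):
    belief = bayes_update(belief, p_obs_if_present=0.9, p_obs_if_absent=0.2)

# Dempster-Shafer fusion of two feature-level decisions (masses are assumed).
COCAINE = frozenset({"cocaine"})
EITHER = frozenset({"cocaine", "clear"})  # mass left uncommitted
fused = dempster_combine(
    {COCAINE: 0.6, EITHER: 0.4},
    {COCAINE: 0.7, EITHER: 0.3},
)
print(round(belief, 4), round(fused[COCAINE], 2))  # 0.9529 0.88
```

Unlike the Bayesian update, Dempster's rule lets a source leave part of its mass on the whole frame, which is one way to represent a feature that is unsure rather than split between the two outcomes.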
