I will investigate applications of machine learning algorithms to medical data, adaptations to differences in data collection, and the use of ensemble techniques.
Focusing on the binary classification problem of Parkinson’s Disease (PD) diagnosis, I will apply machine learning algorithms to a primary dataset consisting of voice recordings from healthy and PD subjects. Specifically, I will use Artificial Neural Networks, Support Vector Machines, and an Ensemble Learning algorithm to reproduce results from [MS12] and [GM09].
Next, I will adapt a secondary regression dataset of PD recordings and combine it with the primary binary classification dataset, testing various techniques for consolidating the data, including treating the regression data as unlabeled data in a semi-supervised learning approach. I will then determine the performance of the above algorithms on this consolidated dataset.
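One way to treat the regression recordings as unlabeled data is self-training, sketched below under stated assumptions. The nearest-centroid classifier is only a stand-in for the algorithms named above. The function names, the `margin` threshold, and the toy feature vectors are hypothetical and do not come from the thesis or its datasets.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def centroid(rows: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def self_train(
    labeled_x: Sequence[Vector],
    labeled_y: Sequence[int],
    unlabeled_x: Sequence[Vector],
    margin: float = 1.0,      # hypothetical confidence threshold
    max_rounds: int = 5,
) -> Tuple[List[List[float]], List[int], List[List[float]]]:
    """Hypothetical self-training loop (0 = healthy, 1 = PD).

    Assumes the labeled set contains at least one example of each class.
    Each round, unlabeled rows that lie clearly closer to one class
    centroid are pseudo-labeled and moved into the training set.
    """
    x = [list(r) for r in labeled_x]
    y = list(labeled_y)
    pool = [list(r) for r in unlabeled_x]
    for _ in range(max_rounds):
        c0 = centroid([r for r, lab in zip(x, y) if lab == 0])
        c1 = centroid([r for r, lab in zip(x, y) if lab == 1])
        remaining, added = [], 0
        for r in pool:
            d0, d1 = distance(r, c0), distance(r, c1)
            if abs(d0 - d1) >= margin:
                # Confident enough: adopt the nearer centroid's label.
                x.append(r)
                y.append(0 if d0 < d1 else 1)
                added += 1
            else:
                # Ambiguous: keep the row unlabeled for a later round.
                remaining.append(r)
        pool = remaining
        if added == 0:
            break
    return x, y, pool


# Toy, made-up feature vectors for illustration only.
toy_x = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]]
toy_y = [0, 0, 1, 1]
toy_unlabeled = [[0.1, 0.3], [3.9, 4.1], [2.1, 2.0]]
new_x, new_y, leftover = self_train(toy_x, toy_y, toy_unlabeled)
```

Rows that stay ambiguous are returned unlabeled and not forced into a class, which limits the label noise that pseudo-labeling can introduce.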
Algorithm performance will be evaluated using 10-fold cross-validation, and results will be summarized in a confusion matrix, from which accuracy, precision, recall, and F-score will be calculated.
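The evaluation described above can be sketched as follows. This is a minimal illustration and not the thesis's implementation. It assumes PD is the positive class (label 1) and takes the F-score to be the balanced F1. The function names and the example labels are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering range(n_samples).

    Indices are dealt out round-robin, so no shuffling or stratification
    is done here; a real study would likely add both.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives, with 1 = PD as positive."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 1 and pred == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def scores(cm: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall, and F1 from confusion counts."""
    tp, fp, fn, tn = cm["tp"], cm["fp"], cm["fn"], cm["tn"]
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}


# Made-up labels for illustration only (1 = PD, 0 = healthy).
example_true = [1, 1, 1, 0, 0, 1, 0, 1]
example_pred = [1, 0, 1, 0, 1, 1, 0, 1]
example_cm = confusion_matrix(example_true, example_pred)
example_scores = scores(example_cm)
example_splits = k_fold_indices(20, k=10)
```

In a full run, the predictions from the ten held-out folds would be pooled before building the confusion matrix, so every recording is scored once by a model that did not see it during training.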
This work expands on past related work, which has used either a regression dataset alone to predict a Unified Parkinson’s Disease Rating Scale score for PD patients, or a classification dataset to determine a healthy or PD diagnosis. In past work, the datasets have not been combined, and the regression set has not been used to contribute to the evaluation of healthy subjects.
Identifier | oai:union.ndltd.org:CLAREMONT/oai:scholarship.claremont.edu:cmc_theses-1784 |
Date | 01 January 2013 |
Creators | Hashmi, Sumaiya F |
Publisher | Scholarship @ Claremont |
Source Sets | Claremont Colleges |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | CMC Senior Theses |
Rights | © 2013 Sumaiya F. Hashmi |