
Improvement of Bacteria Detection Accuracy and Speed Using Raman Scattering and Machine Learning

Bacteria identification plays an essential role in preventing health complications and saving patients' lives. The most widely used method to identify bacteria, the bacterial culture method, suffers from long processing times. Hence, an effective, rapid, and non-invasive method is needed as an alternative. Raman spectroscopy is a potential candidate for bacteria identification due to its effective and rapid results and the fact that, similar to the uniqueness of a human fingerprint, the Raman spectrum is unique for every material.
In my lab at the University of Ottawa, we focus on the use of Raman scattering for biosensing in order to achieve high identification accuracy for different types of bacteria. Based on the unique Raman fingerprint for each bacteria type, different types of bacteria can be identified successfully. However, using the Raman spectrum to identify bacteria poses a few challenges. First, the Raman signal is a weak signal, and so enhancement of the signal intensity is essential, e.g., by using surface-enhanced Raman scattering (SERS). Moreover, the Raman signal can be contaminated by different noise sources. Also, the signal consists of a large number of features, and is non-linear due to the correlation between the Raman features. Using machine learning (ML) along with SERS, we can overcome such challenges in the identification process and achieve high accuracy for the system identifying bacteria.
In this thesis, I present a method to improve the identification of different bacteria types using a support vector machine (SVM) ML algorithm based on SERS. I also present dimension reduction techniques to reduce the complexity and processing time while maintaining high identification accuracy in the classification process. I consider four bacteria types: Escherichia coli (EC), Cutibacterium acnes (CA, formerly known as Propionibacterium acnes), methicillin-resistant Staphylococcus aureus (MRSA), and methicillin-sensitive Staphylococcus aureus (MSSA). MRSA and MSSA are combined into a single class, named MS, for the classification. I focus on these types of bacteria because they are the most common types in joint infections.
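The class setup described above can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic spectra (one Gaussian peak per bacteria type plus noise) as a stand-in for real SERS measurements; the helper names, peak positions, and the linear kernel are hypothetical choices for illustration, not the configuration used in the thesis.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Mapping from the four measured bacteria types to the three classes:
# MRSA and MSSA are merged into the single class MS.
CLASS_MAP = {"EC": "EC", "CA": "CA", "MRSA": "MS", "MSSA": "MS"}


def merge_labels(raw_labels):
    """Replace each raw bacteria label with its classification class."""
    return np.array([CLASS_MAP[label] for label in raw_labels])


def make_synthetic_spectra(n_per_type=30, n_features=200, seed=0):
    """Synthetic stand-in for SERS spectra (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    axis = np.arange(n_features)
    # Hypothetical peak positions, chosen only so the types differ.
    peak_centers = {"EC": 40, "CA": 90, "MRSA": 140, "MSSA": 150}
    spectra, labels = [], []
    for name, center in peak_centers.items():
        peak = np.exp(-0.5 * ((axis - center) / 5.0) ** 2)
        noise = 0.2 * rng.standard_normal((n_per_type, n_features))
        spectra.append(peak + noise)
        labels += [name] * n_per_type
    return np.vstack(spectra), np.array(labels)


X, raw_y = make_synthetic_spectra()
y = merge_labels(raw_y)

# Standardize each spectral channel, then fit a linear-kernel SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X, y)
train_accuracy = model.score(X, y)
```

The accuracy computed here is on the training data only, so it says nothing about generalization; a held-out split or cross-validation would be needed for that.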
Using binary classification, I present the simulation results for three binary models: EC vs CA, EC vs MS, and MS vs CA. Using the full data set, binary classification achieved a classification accuracy of more than 95% for the three models. When the set of samples was reduced based on the samples' signal-to-noise ratio (SNR) to decrease the complexity, a classification accuracy of more than 95% for the three models was achieved using less than 60% of the original data set. The recursive feature elimination (RFE) algorithm was then used to reduce the complexity in the feature dimension. Given that a small number of features were more heavily weighted than the rest of the features, the number of features used in the classification could be significantly reduced while maintaining high classification accuracy.
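The two reduction steps for a binary model can be sketched as follows, assuming scikit-learn and synthetic two-class spectra. The SNR definition (peak height over the noise standard deviation in a peak-free region), the kept fraction, and the number of retained features are assumptions made for illustration; the abstract does not state how these were defined or chosen.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_two_class_spectra(n_per_class=100, n_features=200, seed=0):
    """Synthetic EC-vs-CA spectra with a per-sample noise level (assumption)."""
    rng = np.random.default_rng(seed)
    axis = np.arange(n_features)
    spectra, labels = [], []
    for name, center in (("EC", 40), ("CA", 90)):
        peak = np.exp(-0.5 * ((axis - center) / 5.0) ** 2)
        noise_level = rng.uniform(0.05, 0.5, size=(n_per_class, 1))
        noise = noise_level * rng.standard_normal((n_per_class, n_features))
        spectra.append(peak + noise)
        labels += [name] * n_per_class
    return np.vstack(spectra), np.array(labels)


def estimate_snr(spectra, noise_slice=slice(-30, None)):
    """Assumed SNR: peak height divided by noise std in a peak-free region."""
    noise = spectra[:, noise_slice].std(axis=1)
    signal = spectra.max(axis=1) - np.median(spectra, axis=1)
    return signal / noise


def keep_high_snr(X, y, fraction=0.6):
    """Keep the given fraction of samples with the highest estimated SNR."""
    n_keep = int(round(fraction * len(X)))
    order = np.argsort(estimate_snr(X))[::-1]  # highest SNR first
    idx = np.sort(order[:n_keep])
    return X[idx], y[idx]


X, y = make_two_class_spectra()

# Sample reduction: drop the noisiest 40% of the spectra.
X_kept, y_kept = keep_high_snr(X, y, fraction=0.6)
X_train, X_test, y_train, y_test = train_test_split(
    X_kept, y_kept, test_size=0.25, stratify=y_kept, random_state=0
)

# Feature reduction: RFE repeatedly fits a linear SVM and discards the
# lowest-weighted channels (10% of all channels per round) until 10 remain.
selector = RFE(SVC(kernel="linear"), n_features_to_select=10, step=0.1)
selector.fit(X_train, y_train)

selected_channels = np.flatnonzero(selector.support_)
accuracy = selector.score(X_test, y_test)
```

RFE relies on the weights of a linear model, which is why the sketch uses a linear kernel; `selected_channels` lists the spectral channels that survive elimination.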
I also present the classification accuracy of the multiclass one-versus-all (OVA) method, i.e., EC vs all, MS vs all, and CA vs all. Using the complete data set, the OVA method achieved a classification accuracy of more than 90%. Similar to the binary classification, dimension reduction was applied to the input samples. Using the SNR reduction, the input samples were reduced by more than 60% while maintaining a classification accuracy higher than 80%. Furthermore, when the RFE algorithm was used to reduce the complexity in the feature dimension, and only the top-weighted 5% of the features of the full data set were used, a classification accuracy of more than 90% was achieved. Finally, by combining both reduction dimensions, a classification accuracy above 92% was achieved on a significantly reduced data set.
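The one-versus-all scheme can be sketched by training one binary SVM per class, each separating that class from the other two. The example below assumes scikit-learn and synthetic three-class spectra; the function names, peak positions, and the linear kernel are hypothetical and are not taken from the thesis.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def make_three_class_spectra(n_per_class=60, n_features=200, seed=0):
    """Synthetic EC / MS / CA spectra (assumption, not real SERS data)."""
    rng = np.random.default_rng(seed)
    axis = np.arange(n_features)
    spectra, labels = [], []
    for name, center in (("EC", 40), ("MS", 90), ("CA", 140)):
        peak = np.exp(-0.5 * ((axis - center) / 5.0) ** 2)
        noise = 0.2 * rng.standard_normal((n_per_class, n_features))
        spectra.append(peak + noise)
        labels += [name] * n_per_class
    return np.vstack(spectra), np.array(labels)


def one_vs_all_accuracy(X_train, y_train, X_test, y_test):
    """Fit one 'class vs all' linear SVM per class and report its accuracy."""
    results = {}
    for target in np.unique(y_train):
        clf = SVC(kernel="linear")
        # Binary labels: True for the target class, False for all others.
        clf.fit(X_train, y_train == target)
        results[str(target)] = clf.score(X_test, y_test == target)
    return results


X, y = make_three_class_spectra()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
ova_accuracy = one_vs_all_accuracy(X_train, y_train, X_test, y_test)
```

`ova_accuracy` maps each class name to the held-out accuracy of its "vs all" model, mirroring the EC vs all, MS vs all, and CA vs all breakdown above.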
Both the dimension reduction and the improvement in the classification accuracy between different types of bacteria using the ML algorithm and SERS could have a significant impact in fulfilling the demand for accurate, fast, and non-destructive identification of bacteria samples in the medical field, in turn potentially reducing health complications and saving patient lives.

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/44060
Date: 15 September 2022
Creators: Mandour, Aseel
Contributors: Anis, Hanan
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d'Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf
Rights: Attribution-NoDerivatives 4.0 International, http://creativecommons.org/licenses/by-nd/4.0/
