In many scientific fields, we are faced with extremely large, noisy datasets. Features of interest in these datasets may be difficult to explicitly define, obscured by noise, or simply lost in the magnitude of the dataset. Uncovering these features often necessitates the development of novel mathematical and statistical modeling approaches, and the utilization of powerful analysis tools. In this work, we present three distinct projects, all of which develop specific mathematical and statistical analyses to find features of interest amid large, noisy data. The first project measures cross-frequency coupling (CFC), i.e., the extent to which signals in different frequency bands interact, in large, noisy neural voltage recordings. We use generalized linear models (GLMs) to define an accurate measure with confidence intervals and significance values. We show in simulation how this measure improves upon existing approaches, and apply it to analyze CFC during a human seizure. The second project develops a fully automated detector of spike ripples, a powerful biomarker of epilepsy, which occur sparsely in long-duration neural voltage recordings. The method applies convolutional neural networks (CNNs) to spectrogram data, and performs comparably to gold-standard expert classifications. We apply this detector to a population of patients with childhood epilepsy, and effectively separate them into high and low seizure risk groups. The final project studies the COVID-19 epidemic, modeling infections and deaths over time from large quantities of noisy, incomplete state-level observations. We use a statistical, data-driven analysis to estimate the basic reproduction number (R0), and use this estimate in multiple compartmental models, fitting unknown parameters for death and recovery rates using an ensemble Markov chain Monte Carlo (MCMC) method. We show consistent estimates of dynamics and parameters across multiple compartmental models, in alignment with our current epidemiological understanding of the disease. In all projects, we are able to uncover key features of interest amid the large, noisy data, providing insights backed by mathematical and statistical rigor.
Identifier | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/43244 |
Date | 29 October 2021 |
Creators | Nadalin, Jessica |
Contributors | Kramer, Mark A. |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |