61 |
Inferencing Gene Regulatory Networks for Drosophila Eye Development Using an Ensemble Machine Learning Approach
Abdul Jawad Mohammed (18437874) 29 April 2024
<p dir="ltr">The primary purpose of this thesis is to propose and demonstrate BioGRNsemble, a modular and flexible approach for inferring gene regulatory networks from RNA-Seq data. Integrating the GENIE3 and GRNBoost2 algorithms, this ensemble-of-ensembles method balances the outputs of both models through averaging, then provides a trimmed-down gene regulatory network consisting of transcription factor genes and target genes. Using a Drosophila eye dataset, we successfully tested this novel methodology. Validation against an online database confirmed that over 3,500 gene links were correctly detected, albeit out of almost 530,000 predictions, leaving plenty of room for improvement in the future.</p>
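<p dir="ltr">The averaging-and-trimming step described above can be sketched as follows. This is only a minimal illustration, not the BioGRNsemble implementation: the min-max normalization, the treatment of missing edges as zero, the <code>top_k</code> cutoff, and all gene names and scores are assumptions made for the example.</p>

```python
def min_max_normalize(scores):
    """Rescale edge scores to [0, 1] so the two methods are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {edge: (s - lo) / span if span else 1.0 for edge, s in scores.items()}


def average_networks(scores_a, scores_b, top_k):
    """Average two edge-score dicts and keep the top_k strongest links.

    Keys are (regulator, target) tuples. An edge missing from one method
    is treated as scoring 0 there (an assumption of this sketch).
    """
    norm_a = min_max_normalize(scores_a)
    norm_b = min_max_normalize(scores_b)
    combined = {
        edge: (norm_a.get(edge, 0.0) + norm_b.get(edge, 0.0)) / 2
        for edge in set(norm_a) | set(norm_b)
    }
    # Sort by descending score; break ties by edge name for determinism.
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Placeholder scores standing in for the two algorithms' outputs (made up).
method_a = {("tf1", "g1"): 0.9, ("tf1", "g2"): 0.5, ("tf2", "tf1"): 0.1}
method_b = {("tf1", "g1"): 40.0, ("tf2", "tf1"): 30.0, ("tf3", "g2"): 10.0}

network = average_networks(method_a, method_b, top_k=2)
```

<p dir="ltr">Normalizing before averaging matters in a sketch like this because the two score scales need not be comparable; whether the thesis does the same is not stated in the abstract.</p>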
|
62 |
Devices for On-Field Quantification of <i>Bacteroidales</i> for Risk Assessment in Fresh Produce Operations
Ashley Deniz Kayabasi (19194448) 23 July 2024
<p dir="ltr">The necessity for on-farm, point-of-need (PON) nucleic acid amplification tests (NAATs) arises from the prolonged turnaround times and high costs associated with traditional laboratory equipment. This thesis aims to address these challenges by developing devices and a user-interface application designed for the efficient, accurate, and rapid detection of <i>Bacteroidales</i> as an indicator of fecal contamination on fresh produce farms.</p><p dir="ltr">In pursuit of this, I collaborated with lab members to engineer a Field-Applicable Rapid Microbial Loop-mediated isothermal Amplification Platform, FARM-LAMP. This device is portable (164 x 135 x 193 mm), energy-efficient (operating under 20 W), achieves the target 65°C with ±0.2°C fluctuations, and is compatible with paper-based biosensors for loop-mediated isothermal amplification (LAMP). Subsequently, I led the fabrication of the microfluidic Field-Applicable Sampling Tool, FAST, designed to deliver high-throughput (10 samples per device), equal flow-splitting of fluids to paper-based biosensors, eliminating the need for a laboratory or extensive training. FARM-LAMP achieved 100% concordance with standard lab-based tests when deployed on a commercial lettuce farm, and FAST achieved an average accuracy of 89% in equal flow-splitting and 70% in volume hydration.</p><p dir="ltr">A crucial aspect of device development is ensuring that results are easily interpretable by users. To this end, I developed a Python-based image analysis codebase to quantify both sample positivity for fecal contamination, on a scale from 0% (no contamination) to nearly 100% (definite contamination), and the concentration of field samples. It uses first- and second-derivative analysis and incorporates image analysis techniques, including hue, saturation, and value (HSV) binning to a sigmoid function, along with contrast limited adaptive histogram equalization (CLAHE).
Additionally, I developed a preliminary graphical user interface in Python that defines a prediction model for the concentration of <i>Bacteroidales</i> based on local weather patterns.</p><p dir="ltr">This thesis encompasses hardware development for on-field quantification and the creation of a preliminary user-interface application to assess fecal contamination risk on fresh produce farms. Integrating these devices with a user-interface application allows for rapid interpretation of results on-farm, aiding in the effective development of strategies to ensure safety in fresh produce operations.</p>
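<p dir="ltr">To make the HSV-binning idea above concrete, the following minimal sketch maps the fraction of pixels in a hue band to a 0-100% positivity score through a sigmoid. It is not the thesis codebase: the hue band, saturation cutoff, sigmoid midpoint and steepness, and the toy pixels are all hypothetical, and CLAHE preprocessing and derivative analysis are omitted.</p>

```python
import colorsys
import math


def hue_fraction(pixels, hue_lo, hue_hi, min_saturation=0.2):
    """Fraction of RGB pixels (0-255 ints) whose hue lies in [hue_lo, hue_hi].

    Hue uses colorsys's 0-1 scale. Weakly saturated pixels are skipped
    because their hue is unreliable; returns 0.0 if none remain.
    """
    in_band = counted = 0
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_saturation:
            continue
        counted += 1
        if hue_lo <= h <= hue_hi:
            in_band += 1
    return in_band / counted if counted else 0.0


def positivity(fraction, midpoint=0.5, steepness=10.0):
    """Map a hue-bin fraction to a 0-100% score with a logistic sigmoid."""
    return 100.0 / (1.0 + math.exp(-steepness * (fraction - midpoint)))


# Toy "image": three yellowish pixels and one magenta-like pixel (made up).
pixels = [(250, 220, 40), (240, 210, 30), (245, 215, 50), (220, 40, 160)]
frac = hue_fraction(pixels, hue_lo=0.10, hue_hi=0.20)  # hypothetical band
score = positivity(frac)
```

<p dir="ltr">Because a sigmoid never reaches its asymptotes, a mapping of this kind naturally yields scores that approach but do not hit 100%, which is consistent with the "nearly 100%" upper end described in the abstract.</p>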
|
63 |
REDUCED ORDER MODELING ENABLED PREDICTIONS OF ADDITIVE MANUFACTURING PROCESSES
Charles Reynolds Owen (19320985) 02 August 2024
<p dir="ltr">For additive manufacturing (AM) to be a viable method for building metal parts for industries such as nuclear, the manufactured parts must be of higher quality, and show lower variation in that quality, than can be achieved today. This high variation in quality bars the techniques from being used in high safety tolerance fields, such as nuclear. If this obstacle could be overcome, additive manufacturing would offer lower costs for complex parts, as well as the ability to design and test parts in a very short timeframe, as only the CAD model needs to be created to manufacture the part. In this study, lower variation in quality was pursued in two ways. The first was the development of surrogate models, utilizing machine learning, to predict the end quality of additively manufactured parts. This was done by using experimental data for the mechanical properties of built parts as outputs to be predicted, and in-situ signals captured during the manufacturing process as inputs to the model. To capture the in-situ signals, cameras were used for thermal and optical imaging, leveraging the natural layer-by-layer manufacturing method used in AM techniques. The final models were created using support vector machine and Gaussian process regression machine learning algorithms, giving high correlations between the in-situ signals and the mechanical properties of relative density, elongation to fracture, uniform elongation, and the work hardening exponent. The second approach was the development of a reduced order model (ROM) for a computer simulation of an AM build. For this project, a ROM was built inside the MOOSE framework for an AM model designed by the MOOSE team, using proper orthogonal decomposition (POD) to construct the reduced basis and project the problem onto a lower-dimensional subspace.
The ROM reduced the problem to 1% of its original dimensionality while incurring only 2-5% relative error from the projection.</p>
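<p dir="ltr">For readers unfamiliar with proper orthogonal decomposition, the projection can be written in standard textbook notation as below. The symbols (snapshot matrix X, full dimension N, reduced dimension r, singular values σ_i) are assumptions introduced here, and the abstract does not say which error measure the thesis reports.</p>

```latex
% Snapshot matrix: columns are full-order solution vectors, factored by SVD
X = [\,u(t_1)\ \cdots\ u(t_m)\,] \in \mathbb{R}^{N \times m},
\qquad X = U \Sigma V^{\top}

% Reduced basis: the first r left singular vectors
\Phi_r = [\,\phi_1\ \cdots\ \phi_r\,],
\qquad u(t) \approx \Phi_r\, a(t),
\quad a(t) = \Phi_r^{\top} u(t) \in \mathbb{R}^{r}

% Relative projection error over the snapshots (Frobenius norm)
\frac{\lVert X - \Phi_r \Phi_r^{\top} X \rVert_F}{\lVert X \rVert_F}
  = \sqrt{\frac{\sum_{i>r} \sigma_i^{2}}{\sum_{i} \sigma_i^{2}}}
```

<p dir="ltr">Read this way, a reduction to 1% of the original dimensionality corresponds to r ≈ 0.01 N, and the truncated singular values control how much error the projection introduces.</p>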
|
64 |
IIoT-based Instrumentation and Control System for a Lateral Micro-drilling Robot Using Machine Fault Diagnosis and Failure Prognosis
Jose A. Solorio Cervantes (11191893) 11 October 2023
<p dir="ltr">This project aimed to develop an instrumentation and control system for a micro-drilling robot based on Industrial Internet of Things (IIoT) technologies. The system integrated IIoT tools to create a robust automation system suitable for drilling operations. It incorporated industrial-grade sensors, which directly measured the critical process variables. The indirect variables relevant to the control of the robot were calculated from the measured parameters. The system also included the telemetry architecture necessary to reliably transmit data from the down-the-hole (DTH) robot to a receiver on the surface. Telemetry was based on wireless communication through long-range radio frequency (LoRa). The system included models based on Artificial Intelligence (AI) and Machine Learning (ML) for determining the mode of operation, detecting changes in the process, and detecting changes in drilling variables in the hydraulic components critical to the drilling process. Algorithms based on AI and ML models also allowed the user to make better decisions based on the variables' correlation to optimize the drilling process (e.g., dynamic change of flow, pressure, and RPMs based on automatic rock identification). A user interface (UI) was developed, and digital tools to perform data analysis were implemented. Safety assessment in all robot systems (e.g., electrical, hardware, software) was treated as a critical design component. The result of this research project provides innovative micro-drilling robots with the necessary technological tools to optimize the drilling process. The system made drilling more efficient, reliable, and safe, providing diagnostic and prognostic tools that allowed maintenance to be planned based on the actual health of the devices.
The system was tested on a test bench under controlled laboratory conditions to characterize it and to collect data for the development, training, validation, and testing of the ML models. The prototype of a micro-drilling robot installed on the test bench served as a case study to assess the reliability of the implemented models and the proposed telemetry.</p>
|
65 |
A Machine Learning Model of Perturb-Seq Data for use in Space Flight Gene Expression Profile Analysis
Liam Fitzpatric Johnson (18437556) 27 April 2024
<p dir="ltr">The genetic perturbations caused by spaceflight on biological systems tend to have a system-wide effect which is often difficult to deconvolute into individual signals with specific points of origin. Single-cell multi-omic data can provide a profile of the perturbational effects but does not necessarily indicate the initial point of interference within a network. The objective of this project is to take advantage of large-scale, genome-wide perturbational (Perturb-Seq) datasets by using them to pre-train a generalist machine learning model capable of predicting the effects of unseen perturbations in new data. Perturb-Seq datasets are large libraries of single-cell RNA sequencing data collected from CRISPR knockout screens in cell culture. The advent of generative machine learning algorithms, particularly transformers, makes this an ideal time to re-assess large-scale data libraries in order to grasp cell-wide and even organism-wide genomic expression motifs. By tailoring an algorithm to learn the downstream effects of the genetic perturbations, we present a pre-trained generalist model capable of predicting the effects of multiple perturbations in combination, locating points of origin for perturbation in new datasets, predicting the effects of known perturbations in new datasets, and annotating large-scale network motifs. We demonstrate the utility of this model by identifying key perturbational signatures in RNA sequencing data from spaceflown biological samples from the NASA Open Science Data Repository.</p>
|
66 |
Random parameters in learning: advantages and guarantees
Evzenie Coupkova (18396918) 22 April 2024
<p dir="ltr">The generalization error of a classifier is related to the complexity of the set of functions among which the classifier is chosen. We study a family of low-complexity classifiers consisting of thresholding a random one-dimensional feature. The feature is obtained by projecting the data on a random line after embedding it into a higher-dimensional space parametrized by monomials of order up to k. More specifically, the extended data is projected n times and the best classifier among those n, based on its performance on training data, is chosen. </p><p dir="ltr">We show that this type of classifier is extremely flexible, as it is likely to approximate, to an arbitrary precision, any continuous function on a compact set as well as any Boolean function on a compact set that splits the support into measurable subsets. In particular, given full knowledge of the class conditional densities, the error of these low-complexity classifiers would converge to the optimal (Bayes) error as k and n go to infinity. On the other hand, if only a training dataset is given, we show that the classifiers will perfectly classify all the training points as k and n go to infinity. </p><p dir="ltr">We also bound the generalization error of our random classifiers. In general, our bounds are better than those for any classifier with VC dimension greater than O(ln(n)). In particular, our bounds imply that, unless the number of projections n is extremely large, there is a significant advantageous gap between the generalization error of the random projection approach and that of a linear classifier in the extended space. Asymptotically, as the number of samples approaches infinity, the gap persists for any such n. Thus, there is a potentially large gain in generalization properties from selecting parameters at random rather than by optimization. 
</p><p dir="ltr">Given a classification problem and a family of classifiers, the Rashomon ratio measures the proportion of classifiers that yield less than a given loss. Previous work has explored the advantage of a large Rashomon ratio in the case of a finite family of classifiers. Here we consider the more general case of an infinite family. We show that a large Rashomon ratio guarantees that choosing the classifier with the best empirical accuracy among a random subset of the family, which is likely to improve generalizability, will not increase the empirical loss too much. </p><p dir="ltr">We quantify the Rashomon ratio in two examples involving infinite classifier families in order to illustrate situations in which it is large. In the first example, we estimate the Rashomon ratio of the classification of normally distributed classes using an affine classifier. In the second, we obtain a lower bound for the Rashomon ratio of a classification problem with a modified Gram matrix when the classifier family consists of two-layer ReLU neural networks. In general, we show that the Rashomon ratio can be estimated using a training dataset along with random samples from the classifier family and we provide guarantees that such an estimation is close to the true value of the Rashomon ratio.</p>
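<p dir="ltr">The construction described above (embed with monomials of order up to k, project onto n random lines, threshold, keep the best on training data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's procedure: Gaussian line directions, a threshold tuned per direction on the training set, the toy circle data, and the loss level <code>theta</code> are all choices made here. The final fraction only mimics the idea of estimating a Rashomon ratio from random samples of a classifier family; it is not the estimator analyzed in the thesis.</p>

```python
import itertools
import random


def monomial_features(x, k):
    """Embed point x (tuple of floats) via all monomials of degree 1..k."""
    feats = []
    for degree in range(1, k + 1):
        for combo in itertools.combinations_with_replacement(range(len(x)), degree):
            value = 1.0
            for i in combo:
                value *= x[i]
            feats.append(value)
    return feats


def best_threshold(values, labels):
    """Best threshold and sign on a 1-D feature: (error, threshold, sign)."""
    best = (1.0, 0.0, 1)
    for t in values:
        for sign in (1, -1):
            preds = [1 if sign * (v - t) >= 0 else 0 for v in values]
            err = sum(p != y for p, y in zip(preds, labels)) / len(labels)
            if err < best[0]:
                best = (err, t, sign)
    return best


def random_projection_classifiers(X, y, k, n, rng):
    """Draw n random lines in monomial space and fit a threshold on each."""
    Z = [monomial_features(x, k) for x in X]
    results = []
    for _ in range(n):
        w = [rng.gauss(0.0, 1.0) for _ in Z[0]]  # random direction (assumed Gaussian)
        proj = [sum(wi * zi for wi, zi in zip(w, z)) for z in Z]
        err, t, sign = best_threshold(proj, y)
        results.append((err, w, t, sign))
    return results


# Toy data: label 1 inside the unit circle, 0 outside (not linearly separable).
rng = random.Random(0)
X = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(60)]
y = [1 if a * a + b * b < 1.0 else 0 for a, b in X]

candidates = random_projection_classifiers(X, y, k=2, n=50, rng=rng)
best_error = min(c[0] for c in candidates)  # the classifier that would be kept

theta = 0.2  # hypothetical loss level
below_theta_fraction = sum(c[0] < theta for c in candidates) / len(candidates)
```

<p dir="ltr">Only the random directions are sampled here; nothing is optimized over the full monomial space, which is the contrast with a linear classifier in the extended space that the abstract draws.</p>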
|
67 |
Machine Learning for Speech Forensics and Hypersonic Vehicle Applications
Emily R Bartusiak (6630773) 06 December 2022
<p>Synthesized speech may be used for nefarious purposes, such as fraud, spoofing, and misinformation campaigns. We present several speech forensics methods based on deep learning to protect against such attacks. First, we use a convolutional neural network (CNN) and transformers to detect synthesized speech. Then, we investigate closed set and open set speech synthesizer attribution. We use a transformer to attribute a speech signal to its source (i.e., to identify the speech synthesizer that created it). Additionally, we show that our approach separates different known and unknown speech synthesizers in its latent space, even though it has not seen any of the unknown speech synthesizers during training. Next, we explore machine learning for an objective in the aerospace domain.</p>
<p>Compared to conventional ballistic vehicles and cruise vehicles, hypersonic glide vehicles (HGVs) exhibit unprecedented abilities. They travel faster than Mach 5 and maneuver to evade defense systems and hinder prediction of their final destinations. We investigate machine learning for identifying different HGVs and a conic reentry vehicle (CRV) based on their aerodynamic state estimates. We also propose an HGV flight phase prediction method. Inspired by natural language processing (NLP), we model flight phases as “words” and HGV trajectories as “sentences.” Next, we learn a “grammar” from the HGV trajectories that describes their flight phase transition patterns. Given “words” from the initial part of an HGV trajectory and the “grammar”, we predict future “words” in the “sentence” (i.e., future HGV flight phases in the trajectory). We demonstrate that this approach successfully predicts future flight phases for HGV trajectories, especially in scenarios with limited training data. We also show that it can be used in a transfer learning scenario to predict flight phases of HGV trajectories that exhibit new maneuvers and behaviors never seen during training.</p>
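<p>The “words”, “sentences”, and “grammar” analogy above can be illustrated with a very simple stand-in. The sketch below assumes a first-order transition-count model as the “grammar” and greedy decoding for prediction; the abstract does not say which model the thesis uses, and the phase labels and trajectories are invented for the example.</p>

```python
from collections import Counter, defaultdict


def learn_transitions(trajectories):
    """Count phase-to-phase transitions across training trajectories."""
    counts = defaultdict(Counter)
    for phases in trajectories:
        for current, following in zip(phases, phases[1:]):
            counts[current][following] += 1
    return counts


def predict_future(prefix, counts, steps):
    """Greedily extend a trajectory prefix with the most frequent next phase."""
    sequence = list(prefix)
    for _ in range(steps):
        options = counts.get(sequence[-1])
        if not options:
            break  # this phase was never followed by anything in training
        sequence.append(options.most_common(1)[0][0])
    return sequence[len(prefix):]


# Made-up phase labels; the thesis's phase vocabulary is not given here.
training = [
    ["boost", "glide", "turn", "glide", "dive"],
    ["boost", "glide", "glide", "dive"],
    ["boost", "glide", "turn", "turn", "glide", "dive"],
]
grammar = learn_transitions(training)
future = predict_future(["boost"], grammar, steps=2)
```

<p>A count-based model like this needs very little data to produce predictions, which is one plausible reason a grammar-style approach might suit limited-training-data scenarios; that reasoning is offered here as an interpretation, not a claim from the thesis.</p>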
|