1

Understanding The Effects of Incorporating Scientific Knowledge on Neural Network Outputs and Loss Landscapes

Elhamod, Mohannad 06 June 2023
While machine learning (ML) methods have achieved considerable success on several mainstream problems in vision and language modeling, they are still challenged by their lack of interpretable decision-making that is consistent with scientific knowledge, which limits their applicability to scientific discovery. Recently, a new field of machine learning that infuses domain knowledge into data-driven ML approaches, termed Knowledge-Guided Machine Learning (KGML), has gained traction as a way to address the challenges of traditional ML. Nonetheless, the inner workings of KGML models and algorithms are still not fully understood, and a better comprehension of their advantages and pitfalls over a suite of scientific applications is yet to be realized. In this thesis, I first tackle the task of understanding the role KGML plays in shaping the outputs of a neural network, including its latent space, and how such influence could be harnessed to achieve desirable properties, including robustness, generalizability beyond training data, and the capture of knowledge priors that are important to experts. Second, I use and further develop loss landscape visualization tools to better understand ML model optimization at the network parameter level. Such an understanding has proven to be effective at evaluating and diagnosing different model architectures and loss functions in the field of KGML, with potential applications to a broad class of ML problems.
/ Doctor of Philosophy / My research aims to address some of the major shortcomings of machine learning, namely its opaque decision-making process and the inadequate understanding of its inner workings when applied to scientific problems. In this thesis, I address some of these shortcomings by investigating the effect of supplementing the traditionally data-centric method with human knowledge. This includes developing visualization tools that make it easier to understand this practice and to advance it further. Conducting this research is critical to achieving wider adoption of machine learning in scientific fields, as it builds up the community's confidence not only in the accuracy of the framework's results, but also in its ability to provide a satisfactory rationale.
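Illustrative note: the abstract above mentions loss landscape visualization at the network parameter level. One common way to do this, which may or may not match the thesis's own tools, is to evaluate the loss on a 2D plane through a chosen parameter vector, spanned by two random directions. The sketch below assumes a toy least-squares problem; the names `loss_surface`, `mse`, `d1`, and `d2` are hypothetical and chosen only for this example.

```python
import numpy as np

def loss_surface(loss_fn, theta, d1, d2, alphas, betas):
    """Evaluate loss_fn on the plane theta + a*d1 + b*d2 for all (a, b)."""
    surface = np.empty((len(alphas), len(betas)))
    for i, a in enumerate(alphas):
        for j, b in enumerate(betas):
            surface[i, j] = loss_fn(theta + a * d1 + b * d2)
    return surface

# Toy problem (an assumption for illustration): noiseless linear regression.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

def mse(w):
    """Mean squared error of the linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))

# Two random unit directions in parameter space.
d1 = rng.normal(size=3)
d1 /= np.linalg.norm(d1)
d2 = rng.normal(size=3)
d2 /= np.linalg.norm(d2)

# 21 x 21 grid centered on the optimum w_true.
grid = np.linspace(-1.0, 1.0, 21)
surface = loss_surface(mse, w_true, d1, d2, grid, grid)
```

The resulting `surface` array can be passed to any contour or surface plotting routine. For a real neural network, `theta` would be the flattened trained weights and `loss_fn` would run a forward pass over a fixed batch.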
2

Knowledge-guided Machine Learning for Sensor-based High-Performance Autonomous Material Characterization

Zhang, Junru 02 December 2024
Knowledge-guided machine learning enables sensor-based high-performance material characterization that drives accelerated materials discovery and manufacturing. Traditional materials discovery workflows are driven by low-throughput characterization processes that involve several manual sample preparation steps and require relatively large amounts of material. While automated material dispensing processes now provide the ability to automate the synthesis of materials, the characterization of material composition, structure, and properties remains challenging due to the lack of reliable high-throughput characterization methods. Commercial benchtop characterization instruments are gold standards for characterizing the composition, structure, and properties of materials but lack synergy with state-of-the-art accelerated materials discovery workflows, which are based on miniaturized transducers for material testing (e.g., sensors), automation, and low-volume test formats. Due to the time- and resource-intensive nature of experimentation and the limited budget imposed on autonomous experimentation workflows in practical applications, the data generated from accelerated material discovery workflows are usually sparse and imbalanced, challenging the construction and training of machine learning models. In this dissertation, we create knowledge-guided machine learning models to support sensor-based high-performance autonomous material characterization. Several different types of knowledge-guided machine learning models were established for high-performance sensor-based characterization of material composition and phase. Specifically, three new methodologies are proposed and developed:
1. A new rapid and autonomous high-performance characterization method for accelerated engineering of soft functional materials is proposed to overcome the challenge of low-throughput characterization and manual data analysis. The proposed method is compatible with state-of-the-art material synthesis platforms, combining automated sensing and sensor physics-guided machine learning to reduce the characterization cycle time and improve the material phase classification accuracy. Feature engineering based on domain knowledge of the measurement processes that generate data (e.g., sensor physics) and of the thermodynamics that govern material phase improved model and process performance.
2. To help mitigate the challenge of low measurement confidence associated with material composition measurement using biosensors, a novel knowledge-guided machine learning approach that integrates domain knowledge in sensor chemistry and physics is proposed. The proposed method implements data augmentation techniques to address the sparsity and imbalance of biosensor data and identifies new features in biosensor time-series data that are predictive of target analyte concentration and of the probability of false positive and false negative responses.
3. A novel deep learning model with knowledge-guided cost function supervision is proposed to improve biosensor performance, specifically to improve the classification of false responses and reduce biosensor time delay. This new methodology combines regression- and classification-based data analyses, significantly improving biosensor accuracy and speed. The method fuses theory that governs dynamic sensor response (i.e., data generation) with machine learning models to guide regression and classification tasks, providing improved model interpretability and explainability.
With the advancement of knowledge-guided machine learning and sensing technologies, the performance of experimental tools and processes for accelerated materials discovery and manufacturing applications can continue to be improved, particularly with respect to speed and reliability, which are critical performance attributes for future industrial adoption.
/ Doctor of Philosophy / The process of discovering and engineering new molecules and materials is based on a sequential process of making and testing, referred to as synthesis and characterization, respectively, which is often repeated until a design objective or budget is met. While the area of material synthesis has advanced, given the development of 3D printing processes, the characterization process presents a bottleneck due to limited throughput. A combination of sensing, automation, and machine learning offers the potential to advance the performance of characterization tools and processes. This dissertation aims to improve the performance of experimental tools and processes for accelerated discovery and quality-assurance manufacturing of biomolecules and soft materials. This dissertation combines knowledge-guided machine learning with automated sensing to accelerate the characterization of the mechanical properties and phase of soft materials, as well as of material composition. The proposed methods are validated in several applications.
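Illustrative note: the third methodology above describes a cost function that combines regression and classification terms with domain knowledge. A minimal sketch of that general idea follows. It is not the dissertation's actual loss: the function name `knowledge_guided_loss`, the weight `lam`, and the specific prior used here (predicted analyte concentration should not be negative) are all assumptions made for illustration.

```python
import numpy as np

def knowledge_guided_loss(conc_pred, conc_true, p_false, is_false, lam=1.0):
    """Regression + classification loss with a simple knowledge penalty.

    conc_pred, conc_true: predicted and measured concentrations.
    p_false: predicted probability that a response is a false response.
    is_false: 0/1 labels for false responses.
    lam: weight of the knowledge-based penalty (hypothetical parameter).
    """
    # Regression term: mean squared error on concentration.
    mse = np.mean((conc_pred - conc_true) ** 2)

    # Classification term: binary cross-entropy on false-response labels.
    eps = 1e-12
    p = np.clip(p_false, eps, 1.0 - eps)
    bce = -np.mean(is_false * np.log(p) + (1.0 - is_false) * np.log(1.0 - p))

    # Knowledge term (assumed prior): concentrations cannot be negative,
    # so negative predictions are penalized quadratically.
    penalty = np.mean(np.maximum(0.0, -conc_pred) ** 2)

    return mse + bce + lam * penalty

# Example: a physically implausible negative prediction raises the loss.
conc_true = np.array([0.5, 1.0])
labels = np.array([0.0, 1.0])
bad_pred = np.array([-0.5, 1.0])
loss_with_prior = knowledge_guided_loss(bad_pred, conc_true, labels, labels, lam=1.0)
loss_without_prior = knowledge_guided_loss(bad_pred, conc_true, labels, labels, lam=0.0)
```

In practice the penalty would encode whatever theory governs the sensor response, and `lam` would be tuned so the prior guides training without overwhelming the data terms.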
3

Learning without Expert Labels for Multimodal Data

Maruf, Md Abdullah Al 09 January 2025
While advancements in deep learning have been largely possible due to the availability of large-scale labeled datasets, obtaining labeled datasets at the required granularity is challenging in many real-world applications, especially in scientific domains, due to the costly and labor-intensive nature of generating annotations. Hence, there is a need to develop new paradigms for learning that do not rely on expert-labeled data and can work even with indirect supervision. Approaches for learning with indirect supervision include unsupervised learning, self-supervised learning, weakly supervised learning, few-shot learning, and knowledge distillation. This thesis addresses these opportunities in the context of multi-modal data through three main contributions. First, this thesis proposes a novel Distance-aware Negative Sampling method for self-supervised Graph Representation Learning (GRL) that learns node representations directly from the graph structure by maximizing separation between distant nodes and maximizing cohesion among nearby nodes. Second, this thesis introduces effective modifications to weakly supervised semantic segmentation (WS3) models, such as stochastic aggregation to saliency maps that improve the learning of pseudo-ground truths from class-level coarse-grained labels and address the limitations of class activation maps. Finally, this thesis evaluates whether pre-trained Vision-Language Models (VLMs) contain the necessary scientific knowledge to identify and reason about biological traits from scientific images. The zero-shot performance of 12 large VLMs is evaluated on a novel VLM4Bio dataset, and the effects of prompting and reasoning hallucinations are explored.
/ Doctor of Philosophy / While advancements in machine learning (ML), such as deep learning, have been largely possible due to the availability of large-scale labeled datasets, obtaining high-quality and high-resolution labels is challenging in many real-world applications due to the costly and labor-intensive nature of generating annotations. This thesis explores new ways of training ML models without relying heavily on expert-labeled data using indirect supervision. First, it introduces a novel way of using the structure of graphs for learning representations of graph-based data. Second, it analyzes the effect of weak supervision using coarse labels for image-based data. Third, it evaluates whether current ML models can recognize and reason about scientific images on their own, aiming to make learning more efficient and less dependent on exhaustive labeling.
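Illustrative note: the abstract above describes negative sampling that separates distant nodes while keeping nearby nodes close. One simple way to make a sampler distance-aware, shown here as a sketch and not as the thesis's method, is to draw negatives with probability proportional to their hop distance from the anchor node. The function names, the `min_hops` threshold, and the toy path graph are assumptions for this example.

```python
from collections import deque
import random

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def sample_negatives(adj, anchor, k, min_hops=2, rng=random):
    """Sample k negatives (with replacement) for anchor.

    Only nodes at least min_hops away are eligible, and farther nodes
    are proportionally more likely to be drawn. Unreachable nodes are
    ignored in this sketch.
    """
    dist = bfs_distances(adj, anchor)
    candidates = [v for v, d in dist.items() if d >= min_hops]
    if not candidates:
        return []
    weights = [dist[v] for v in candidates]
    return rng.choices(candidates, weights=weights, k=k)

# Toy path graph 0 - 1 - 2 - 3 - 4 as an adjacency list.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
negatives = sample_negatives(adj, anchor=0, k=5, rng=random.Random(0))
```

Direct neighbors of the anchor are never returned as negatives, so they remain available as positives in a contrastive objective.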
