Global ETD Search

1	VISUAL ANALYTICS OF BIG DATA FROM MOLECULAR DYNAMICS SIMULATION Catherine Jenifer Rajam Rajendran (5931113) 03 February 2023 (has links) <p>Protein malfunction can cause human diseases, which makes proteins targets in drug discovery. In-depth knowledge of how proteins function can contribute widely to understanding the mechanisms of these diseases. Protein functions are determined by protein structures and their dynamic properties. Protein dynamics refers to the constant physical movement of atoms in a protein, which may result in transitions between different conformational states of the protein. These conformational transitions are critically important for proteins to function. Understanding protein dynamics can help us understand and interfere with the conformational states and transitions, and thus with the function of the protein. If we can understand the mechanism of a protein's conformational transition, we can design molecules to regulate this process and thereby regulate protein function for new drug discovery. Protein dynamics can be simulated by Molecular Dynamics (MD) simulations.</p> <p>The MD simulation data generated are spatial-temporal and therefore very high dimensional. To analyze the data, distinguishing the various atomic interactions within a protein by interpreting their 3D coordinate values plays a significant role. Since the data are enormous, the essential step is to find ways to interpret them by developing more efficient algorithms to reduce the dimensionality and user-friendly visualization tools to find patterns and trends that are not usually attainable by traditional methods of data processing. Given the typically allosteric, long-range nature of the interactions that lead to large conformational transitions, pinpointing the underlying forces and pathways responsible for the global conformational transition at the atomic level is very challenging. 
To address these problems, we apply various analytical techniques to the simulation data to better understand the mechanism of protein dynamics at the atomic level. We developed a new program called Probing Long-distance interactions by Tapping into Paired-Distances (PLITIP), which contains a set of new tools based on the analysis of paired distances; this removes the interference of the translation and rotation of the protein itself and can therefore capture the absolute changes within the protein.</p> <p>Firstly, we developed a tool called Decomposition of Paired Distances (DPD). This tool generates a distance matrix of all paired residues from our simulation data. This paired-distance matrix is therefore not subject to the interference of the translation or rotation of the protein and can capture the absolute changes within the protein. This matrix is then decomposed by DPD using Principal Component Analysis (PCA) to reduce dimensionality and to capture the largest structural variation. To showcase how DPD works, we applied it to two protein systems, HIV-1 protease and 14-3-3σ, both of which show tremendous structural changes and conformational transitions in their MD simulation trajectories. The largest structural variation and conformational transition were captured by the first principal component (PC1) in both cases. In addition, structural clustering and ranking of representative frames by their PC1 values revealed the long-distance nature of the conformational transition and identified the key candidate regions that might be responsible for the large conformational transitions.</p> <p>Secondly, to facilitate further identification of the long-distance path, we developed a tool called Pearson Coefficient Spiral (PCP), which generates and visualizes Pearson coefficients to measure the linear correlation between any two residue pairs. 
PCP allows users to fix one residue pair and examine the correlation of its change with other residue pairs.</p> <p>Thirdly, we developed a set of visualization tools that generate paired atomic distances for the shortlisted candidate residues and capture significant interactions among them. The first tool is the Residue Interaction Network Graph for Paired Atomic Distances (NG-PAD), which not only generates paired atomic distances for the shortlisted candidate residues but also displays significant interactions as a network graph for convenient visualization. The second, the Chord Diagram for Interaction Mapping (CD-IP), maps the interactions to protein secondary structural elements to further narrow down important interactions. The third, Distance Plotting for Direct Comparison (DP-DC), plots any two paired distances of the user's choice, at either the residue or the atomic level, to facilitate identification of similar or opposite patterns of distance change over the simulation time. Together, these PLITIP tools enabled us to identify critical residues contributing to the large conformational transitions in both HIV-1 protease and 14-3-3σ.</p> <p>Besides the above major project, a side project developing tools to study protein pseudo-symmetry is also reported. It has been proposed that symmetry provides protein stability, opportunities for allosteric regulation, and even functionality. This tool helps us answer the questions of why proteins deviate from perfect symmetry and how to quantify that deviation.</p> Applications in life sciences Spatial data and applications Semi- and unsupervised learning Visual Analytics Data Visualization Principal Component Analysis Parallel Computing Pearson Coefficient Correlation Protein Structure Analysis Molecular Dynamics Simulation Study Paired-Distance Spatial-Temporal Data Pseudo-Symmetry
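The paired-distance idea in the abstract above can be illustrated with a minimal sketch. It is not the PLITIP or DPD implementation: the synthetic trajectory, the function names (`paired_distances`, `pca`, `random_rigid_motion`), and the use of NumPy's SVD for PCA are all assumptions made for illustration. The sketch shows that pairwise distances are unchanged when each frame is rotated and translated, that PCA on those distances picks up a slow structural change in PC1, and how one fixed residue pair can be correlated against all others with a Pearson coefficient.

```python
import numpy as np

def paired_distances(traj):
    """Return (n_frames, n_pairs) distances for every residue pair i < j."""
    diff = traj[:, :, None, :] - traj[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    i, j = np.triu_indices(traj.shape[1], k=1)
    return dist[:, i, j]

def pca(features, n_components=2):
    """PCA via SVD: returns per-frame scores and explained-variance ratios."""
    centered = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u[:, :n_components] * s[:n_components], explained[:n_components]

def random_rigid_motion(coords, rng):
    """Apply a random proper rotation plus a random translation to one frame."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:  # turn a reflection into a rotation
        q[:, 0] *= -1
    return coords @ q.T + rng.normal(size=3)

# Hypothetical toy trajectory: 50 frames, 6 "residues" (one point each).
rng = np.random.default_rng(0)
n_frames, n_res = 50, 6
base = rng.normal(size=(n_res, 3))
base[0] = [8.0, 0.0, 0.0]
traj = np.repeat(base[None], n_frames, axis=0)
traj[:, 0, 0] += np.linspace(0.0, 10.0, n_frames)  # residue 0 drifts away
traj += rng.normal(scale=0.05, size=traj.shape)    # thermal-like noise

# Tumble every frame independently, as a freely moving protein would.
tumbled = np.stack([random_rigid_motion(frame, rng) for frame in traj])

feats = paired_distances(traj)
feats_tumbled = paired_distances(tumbled)
scores, explained = pca(feats_tumbled)

# Pearson coefficients of the first pair (0, 1) against every pair.
corr_with_first = np.corrcoef(feats_tumbled.T)[0]
pc1_vs_time = abs(np.corrcoef(scores[:, 0], np.arange(n_frames))[0, 1])
```

Because the features are distances rather than raw coordinates, no structural alignment step is needed before the decomposition; that is the property the abstract attributes to the paired-distance matrix.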
2	A Mathematical Theory of Communication with Graphs and Symbols Art Terlep (19194136), T. Arthur Terlep (10082101), T. Arthur Terlep (10082104) 25 July 2024 (has links) <p dir="ltr">This work will introduce a channel conceptualization and possible coding scheme for Graph-and-Symbol (GS) Communication. While Claude Shannon’s mathematical model for communication employed graphs to describe relationships and confusability among traditional time-sequenced signals, little work has been done to describe non-linear communication <i>with</i> graphs where we transmit and receive physical structures of information. The principal contribution of this work is to introduce a mathematical framework for communication with graphs which have symbols assigned to vertices. This looks like a molecule, and so we may think of these messages as coded forms of molecular communication.</p><p dir="ltr">At this time, many problems in this area are (and may remain) computationally intractable, but as the field of graph theory continues to develop, new tools and techniques may emerge to solve standing problems in this new subfield of communication.</p><p dir="ltr">Graphs present two difficulties: first, they contain ambiguities among their vertices and do not have an <i>a priori</i> canonical ordering, and second, the relationships among graphs lack the structural regularities which we see in traditional error control coding lattices. There are no Galois fields to exploit over graph-based codes as we have with cyclic codes, for example. Furthermore, the sheer number of graphs of order n grows so rapidly that it is difficult to account for the neighborhoods around codewords and effectively reduce communication errors which may occur. The more asymmetric a graph is, the more orderings on symbols it can support. 
However, asymmetries complicate the computation of channel transition probabilities, which are the cornerstone of all communication theory.</p><p dir="ltr">In the prologue, the reader will be introduced to a new educational tool for designing traditional binary cyclic codes.</p><p dir="ltr">Chapters 1 through 10 will detail the development of Graph-and-Symbol (GS) Communication to date, followed by two example codes which demonstrate the power of structuring information on graphs.</p><p dir="ltr">Chapter 13 onward will review the preliminary work in another area of research, disjoint from the main body. It is included here for posterity and special interests in applying graphs to solving other problems in signal processing. It begins with an introduction of spacetime raythic graphs. We propose a new chamfering paradigm for connecting neighboring pixels which approximates solutions to the eikonal equation. We show that some raythic graphs possess structures with multiple, differing solutions to eikonal wavefront propagation which are essential to the construction of the Umbral Transform. This umbral transform emulates ray casting effects, such as shadows and diffraction within an image space, from a network-flow algorithm.</p><p dir="ltr">This work may be duplicated in whole or in part for educational purposes only. All other rights of this work are reserved by the author, Timothy Arthur Terlep Jr., of Rose-Hulman Institute of Technology, Terre Haute, IN (effective August 2024), and subject to the rules and regulations of the Graduate School of Purdue University.</p><p dir="ltr">Readers may contact the author with any comments and questions at <b>taterlep@gmail.com</b></p> Data communications Signal processing Engineering education Spatial data and applications graph theory combinatorics applied algebra communication theory information theory Shannon information content applied graphs Symbol rate optimization Automorphism group
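The remark above that a more asymmetric graph supports more orderings on symbols can be made concrete with a small counting sketch. It is not taken from the thesis: it assumes every vertex carries a distinct symbol, in which case the number of distinguishable symbol arrangements on a graph with n vertices is n! divided by the size of its automorphism group. The helper names and the brute-force search over permutations are illustrative choices that only scale to very small graphs.

```python
from itertools import permutations
from math import factorial

def automorphism_count(n, edges):
    """Count vertex permutations that map every edge onto an edge (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for perm in permutations(range(n)):
        if all(frozenset((perm[u], perm[v])) in edge_set for u, v in edges):
            count += 1
    return count

def distinct_arrangements(n, edges):
    """Distinguishable placements of n distinct symbols on the graph's vertices."""
    return factorial(n) // automorphism_count(n, edges)

# Three graphs on four vertices, from least to most symmetric.
path4 = [(0, 1), (1, 2), (2, 3)]
star4 = [(0, 1), (0, 2), (0, 3)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]

path_count = distinct_arrangements(4, path4)    # automorphism group of size 2
star_count = distinct_arrangements(4, star4)    # automorphism group of size 6
cycle_count = distinct_arrangements(4, cycle4)  # automorphism group of size 8
```

The path, with only two automorphisms, supports 12 distinguishable arrangements of four symbols, while the star supports 4 and the cycle 3.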
3	Devices for On-Field Quantification of <i>Bacteroidales </i>for Risk Assessment in Fresh Produce Operations Ashley Deniz Kayabasi (19194448) 23 July 2024 (has links) <p dir="ltr">The necessity for on-farm, point-of-need (PON) nucleic acid amplification tests (NAATs) arises from the prolonged turnaround times and high costs associated with traditional laboratory equipment. This thesis aims to address these challenges by developing devices and a user-interface application designed for the efficient, accurate, and rapid detection of <i>Bacteroidales</i> as an indicator of fecal contamination on fresh produce farms.</p><p dir="ltr">In pursuit of this, I collaborated with lab members to engineer a Field-Applicable Rapid Microbial Loop-mediated isothermal Amplification Platform, FARM-LAMP. This device is portable (164 x 135 x 193 mm), energy-efficient (operating under 20 W), achieves the target 65°C with ± 0.2°C fluctuations, and is compatible with paper-based biosensors for loop-mediated isothermal amplification (LAMP). Subsequently, I led the fabrication of the microfluidic Field-Applicable Sampling Tool, FAST, designed to deliver high-throughput (10 samples per device), equal flow-splitting of fluids to paper-based biosensors, eliminating the need for a laboratory or extensive training. FARM-LAMP achieved 100% concordance with standard lab-based tests when deployed on a commercial lettuce farm and FAST achieved an average accuracy of 89% in equal flow-splitting and 70% in volume hydration.</p><p dir="ltr">A crucial aspect of device development is ensuring that results are easily interpretable by users. To this end, I developed a Python-based image analysis codebase to quantify sample positivity for fecal contamination, ranging from 0% (no contamination) to nearly 100% (definite contamination) and the concentration of field samples. 
It utilizes calculus-based mathematics, such as first and second derivative analysis, and incorporates image analysis techniques, including hue, saturation, and value (HSV) binning to a sigmoid function, along with contrast limited adaptive histogram equalization (CLAHE). Additionally, I developed a preliminary graphical user interface in Python that defines a prediction model for the concentration of <i>Bacteroidales</i> based on local weather patterns.</p><p dir="ltr">This thesis encompasses hardware development for on-field quantification and the creation of a preliminary user-interface application to assess fecal contamination risk on fresh produce farms. Integrating these devices with a user-interface application allows for rapid interpretation of results on-farm, aiding in the effective development of strategies to ensure safety in fresh produce operations.</p> Electronic instrumentation Microfluidics and nanofluidics Additive manufacturing Spatial data and applications Image processing Detection devices Hardware fabrication Microfluidic devices Sample processing Paper-based diagnostics Fecal contamination Fresh produce industry Image processing Point-of-need Machine learning Risk assessment
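As a rough illustration of how hue binning can feed a sigmoid to give a positivity score between 0% and nearly 100%, here is a minimal sketch using only the Python standard library. It is not the thesis codebase: the hue bin (40 to 70 degrees), the saturation and value cut-offs, the sigmoid midpoint and steepness, and the sample colors are all hypothetical values chosen for the example, and CLAHE and the derivative analysis are left out.

```python
import colorsys
import math

def fraction_in_hue_bin(pixels, hue_lo, hue_hi, min_sat=0.2, min_val=0.2):
    """Fraction of RGB pixels whose hue (in degrees) falls inside [hue_lo, hue_hi]."""
    hits = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # Ignore washed-out or very dark pixels, whose hue is unreliable.
        if s >= min_sat and v >= min_val and hue_lo <= h * 360 <= hue_hi:
            hits += 1
    return hits / len(pixels)

def positivity_percent(fraction, midpoint=0.5, steepness=10.0):
    """Map a bin fraction in [0, 1] to a score in (0, 100) with a logistic curve."""
    return 100.0 / (1.0 + math.exp(-steepness * (fraction - midpoint)))

# Hypothetical sample colors: one inside the assumed hue bin, one outside it.
yellow = (230, 200, 40)
pink = (230, 60, 140)
mostly_yellow = [yellow] * 90 + [pink] * 10
mostly_pink = [yellow] * 10 + [pink] * 90

score_pos = positivity_percent(fraction_in_hue_bin(mostly_yellow, 40, 70))
score_neg = positivity_percent(fraction_in_hue_bin(mostly_pink, 40, 70))
```

The logistic curve never reaches exactly 0 or 100, which matches the abstract's description of scores running up to nearly 100%.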
4	<strong>TOWARDS A TRANSDISCIPLINARY CYBER FORENSICS GEO-CONTEXTUALIZATION FRAMEWORK</strong> Mohammad Meraj Mirza (16635918) 04 August 2023 (has links) <p>Technological advances have a profound impact on people and the world in which they live. People use a wide range of smart devices, such as the Internet of Things (IoT), smartphones, and wearable devices, on a regular basis, all of which store and use location data. With this explosion of technology, these devices have been playing an essential role in digital forensics and crime investigations. Digital forensic professionals have become more able to acquire and assess various types of data and locations; therefore, location data has become essential for responders, practitioners, and digital investigators dealing with digital forensic cases that rely heavily on digital devices that collect data about their users. When performing any digital/cyber forensic investigation, it is both beneficial and critical to answer the six W questions (i.e., who, what, when, where, why, and how) by using location data recovered from digital devices, such as where the suspect was at the time of the crime or the deviant act. Such data could therefore help convict a suspect or prove their innocence. However, many digital forensic standards, guidelines, tools, and even the National Institute of Standards and Technology (NIST) Cyber Security Personnel Framework (NICE) lack full coverage of what location data can be, how to use such data effectively, and how to perform spatial analysis. Although current digital forensic frameworks recognize the importance of location data, only a limited number of data sources (e.g., GPS) are considered sources of location in these digital forensic frameworks. 
Moreover, most digital forensic frameworks and tools have yet to introduce geo-contextualization techniques and spatial analysis into the digital forensic process, which may aid digital forensic investigations and provide more information for decision-making. As a result, significant gaps in the digital forensics community are still influenced by a lack of understanding of how to properly curate geodata. Therefore, this research was conducted to develop a transdisciplinary framework to deal with the limitations of previous work and explore opportunities to deal with geodata recovered from digital evidence by improving the way of maintaining geodata and getting the best value from them using an iPhone case study. The findings of this study demonstrated the potential value of geodata in digital disciplinary investigations when using the created transdisciplinary framework. Moreover, the findings discuss the implications for digital spatial analytical techniques and multi-intelligence domains, including location intelligence and open-source intelligence, that aid investigators and generate an exceptional understanding of device users' spatial, temporal, and spatial-temporal patterns.</p> Spatial data and applications Knowledge representation and reasoning Digital forensics Data engineering and data science Knowledge and information management Digital curation and preservation Cyber Crime Cybersecurity Cyber Forensics DFIR Digital Forensics Incident Response Threat Intelligence Networking Mobile Forensics iOS Forensics Intelligence Open-source Intelligence (OSINT) Location Intelligence GIS Spatial Analysis Spatiotemporal Analysis UAV Forensics GeoDatabase Geodata
5	Nonpoint Source Pollutant Modeling in Small Agricultural Watersheds with the Water Erosion Prediction Project Ryan McGehee (14054223) 04 November 2022 (has links) <p>Current watershed-scale, nonpoint source (NPS) pollution models do not represent the processes and impacts of agricultural best management practices (BMP) on water quality with sufficient detail. To begin addressing this gap, a novel process-based, watershed-scale, water quality model (WEPP-WQ) was developed based on the Water Erosion Prediction Project (WEPP) and the Soil and Water Assessment Tool (SWAT) models. The proposed model was validated at both hillslope and watershed scales for runoff, sediment, and both soluble and particulate forms of nitrogen and phosphorus. WEPP-WQ is now one of only two models that simulate BMP impacts on water quality in ‘high’ detail, and it is the only one not based on USLE sediment predictions. Model validations indicated that particulate nutrient predictions were better than soluble nutrient predictions for both nitrogen and phosphorus. Predictions of uniform conditions outperformed those of nonuniform conditions, and calibrated model simulations performed better than uncalibrated model simulations. Applications of these kinds of models in real-world, historical simulations are often limited by a lack of field-scale agricultural management inputs. Therefore, a prototype tool was developed to derive management inputs for hydrologic models from remotely sensed imagery at field-scale resolution. At present, only inference of crop, cover crop, and tillage practice is supported; these predictions were validated at annual and average annual time intervals based on data availability for the various management endpoints. Extraction model training and validation were substantially limited by relatively small field areas in the observed management dataset. 
Both of these efforts contribute to computational modeling research and applications pertaining to agricultural systems and their impacts on the environment.</p> Agricultural hydrology Agricultural management of nutrients Photogrammetry and remote sensing Agricultural engineering Natural resource management Soil physics Applications in physical sciences Spatial data and applications Water Quality Modeling Agricultural Modeling Agricultural Management Remote Sensing Best Management Practices (BMP) Nonpoint Source Pollution (NPS)
6	EXPLORING GRAPH NEURAL NETWORKS FOR CLUSTERING AND CLASSIFICATION Fattah Muhammad Tahabi (14160375) 03 February 2023 (has links) <p><strong>Graph Neural Networks</strong> (GNNs) have become exceedingly popular and prominent deep learning techniques for analyzing structural graph data because of their ability to solve complex real-world problems. Because graphs provide an efficient way to represent abstract concepts, modern research overcomes a limitation of classical graph theory, which requires prior knowledge of the graph structure before traditional algorithms can be employed. GNNs, an impressive framework for representation learning of graphs, have already produced many state-of-the-art techniques to solve node classification, link prediction, and graph classification tasks. GNNs can learn meaningful representations of graphs incorporating topological structure, node attributes, and neighborhood aggregation to solve supervised, semi-supervised, and unsupervised graph-based problems. In this study, the usefulness of GNNs has been analyzed primarily from two aspects: <strong>clustering and classification</strong>. 
We focus on these two techniques, as they are the most popular strategies in data mining to discern collected data and employ predictive analysis.</p> Biomechanical engineering Neural engineering Health promotion Preventative health care Applications in health Spatial data and applications Evolutionary computation Natural language processing Planning and decision making Data engineering and data science Data mining and knowledge discovery Graph, social and multimedia data Information retrieval and web search Knowledge and information management Context learning Deep learning Neural networks Semi- and unsupervised learning Data structures and algorithms Graph neural network Node classification Graph clustering Temporal graphs dynamic graphs NODE2VEC Graph Attention Mechanism Hunting BiLSTM model EHR data colorectal Cancer Cancers Cancer symptoms symptom Symptom cluster studies Coauthorship networks network analysis Word2vec Hierarchical Clustering method Dunn index semantic analysis text mining Natural Language Processing Tool UMLS identifiers umls Clinical Data Management
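The neighborhood aggregation mentioned in the abstract above can be sketched in a few lines. This is a generic illustration, not a model from the thesis: it assumes an undirected graph given as a dense adjacency matrix, adds self-loops, and averages each node's features with those of its neighbors. A real GNN layer would follow this step with a learned weight matrix and a nonlinearity, both omitted here.

```python
import numpy as np

def mean_aggregate(adj, features):
    """One round of mean neighborhood aggregation with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])       # let each node keep its own features
    deg = a_hat.sum(axis=1, keepdims=True)   # neighborhood size including self
    return (a_hat @ features) / deg

# Hypothetical three-node path graph 0 - 1 - 2 with two features per node.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])

hidden = mean_aggregate(adj, features)
```

After one round, node 1 (connected to both others) holds the average of all three feature vectors, while the end nodes mix only with node 1. Stacking such rounds lets information travel further across the graph.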
