Global ETD Search

1	Computational Models of Nuclear Proliferation
Frankenstein, William
01 May 2016
This thesis uses social influence theory and computational tools to examine the disparate impact of positive and negative ties in nuclear weapons proliferation. The thesis has two broad sections: a simulation section, which focuses on government stakeholders, and a large-scale data analysis section, which focuses on the public and domestic actor stakeholders. The simulation section demonstrates that the nonproliferation norm is an emergent behavior of political alliance and hostility networks, and that alliances play a role in present-day nuclear proliferation. The model is robust and captures second-order effects of extended hostility and alliance relations. The large-scale data analysis section demonstrates the role that context plays in sentiment evaluation and highlights how Twitter collection can provide useful input to policy processes. It first presents the results of an on-campus study in which users demonstrated that context plays a role in sentiment assessment. Then, in an analysis of a Twitter dataset of over 7.5 million messages, it assesses the role of ‘noise’ and biases in online data collection. A deep-dive analysis of the Iranian nuclear agreement demonstrates that the Middle East is not facing a nuclear arms race and shows that there is a structural hole in online discussion surrounding nuclear proliferation. Combining both approaches gives policy analysts a complete and generalizable set of computational tools to assess and analyze disparate stakeholder roles in nuclear proliferation.
Nuclear Proliferation; International Security; Computational Modeling; Computational Social Science; Large Scale Data Analysis; Social Media Analysis

2	Prediction of DNA-Binding Proteins and their Binding Sites
Pokhrel, Pujan
01 May 2018
DNA-binding proteins play an important role in essential biological processes such as DNA replication, recombination, repair, gene transcription, and expression. Identifying DNA-binding proteins and the residues involved in the contacts is important for understanding the DNA-binding mechanism in proteins. Moreover, the literature reports that mutations of some DNA-binding residues are associated with disease. Identifying these proteins and their binding mechanisms generally requires experimental techniques, which makes large-scale study extremely difficult. Thus, predicting DNA-binding proteins and their binding sites from sequence alone is one of the most challenging problems in genome annotation. Since the start of the Human Genome Project, many attempts have been made to solve the problem with different approaches, but the accuracy of these methods is still not sufficient for large-scale annotation of proteins. Rather than relying solely on existing machine learning techniques, I combined them using a novel “stacking” technique and used problem-specific architectures to solve the problem with better accuracy than existing methods. This thesis presents a possible solution to the DNA-binding protein prediction problem that performs better than the state-of-the-art approaches.
Computer Sciences

3	Planejamento, gerenciamento e análise de dados de microarranjos de DNA para identificação de biomarcadores de diagnóstico e prognóstico de cânceres humanos / Planning, management and analysis of DNA microarray data aiming at discovery of biomarkers for diagnosis and prognosis of human cancers
Simões, Ana Carolina Quirino
12 May 2009
In this PhD thesis, we present our strategies for developing a mathematical and computational environment for the large-scale analysis of gene expression data obtained with DNA microarray technology. The analyses focused mainly on identifying molecular markers for the diagnosis and prognosis of human cancers. We show the results of several analyses implemented in this environment, which led to the development of a computational tool for automatic annotation of DNA microarray platforms and of a tool for tracking data analyses carried out in the R environment. We also applied eXtreme Programming (XP) as a technique for planning and managing the gene expression analysis projects. All datasets were obtained by our collaborators using two different DNA microarray platforms: the first is enriched in non-coding regions of the human genome, particularly intronic regions, and the second represents exonic regions of human genes. Using the first platform, we evaluated gene expression profiles of human prostate and kidney tumors; applying SAM (Significance Analysis of Microarrays) to the prostate tumor data yielded a set of 49 sequences as potential prognostic biomarkers for prostate tumors. The second platform was used to evaluate the profile of transcripts expressed in sarcomas, epidermoid carcinomas, and head and neck epidermoid carcinomas. The sarcoma analyses identified a set of 12 genes related to local aggressiveness and metastasis, and the analyses of head and neck epidermoid carcinomas identified 7 genes related to lymph node metastasis.
Cancer; Classifiers; DNA microarrays; eXtreme Programming; Large-scale data analysis; Molecular Markers

4	Barcoded DNA Sequencing for Parallel Protein Detection
Dezfouli, Mahya
January 2015
The work presented in this thesis describes methodologies developed for the integration and accurate interpretation of barcoded DNA, to empower large-scale omics analysis. The objectives mainly aim at enabling multiplexed proteomic measurements in high-throughput format through DNA barcoding and massively parallel sequencing. The thesis is based on four scientific papers that focus on three main objectives: (i) to prepare reagents for large-scale affinity proteomics, (ii) to present technical advances in barcoding systems for parallel protein detection, and (iii) to address challenges in complex sequencing data analysis. In the first part, bio-conjugation of antibodies is assessed at significantly downscaled reagent quantities. This allows affinity binders to be selected without requiring that they be available in large amounts or be free of amine-containing buffers or stabilizer materials (Paper I). This is followed by DNA barcoding of antibodies using minimal reagent quantities. The procedure additionally enables efficient purification of barcoded antibodies from remaining free DNA residues, which improves the sensitivity and accuracy of the subsequent measurements (Paper II). Because the procedure uses a solid-phase approach on magnetic beads, the set-up is ready for high-throughput automation. Subsequently, the applicability of the prepared bio-conjugates for parallel protein detection is demonstrated in different types of standard immunoassays (Papers I and II). In the second part, the method immuno-sequencing (I-Seq) is presented for DNA-mediated protein detection using barcoded antibodies. I-Seq achieved the detection of clinically relevant proteins in human blood plasma by parallel DNA readout (Paper II). The methodology is further developed to track antibody-antigen interaction events on suspension bead arrays encapsulated in barcoded emulsion droplets (Paper III). The method, denoted compartmentalized immuno-sequencing (cI-Seq), is capable of specific detection with paired antibodies and can provide detailed information on joint recognition events. Recent technical progress in DNA sequencing has increased interest in large-scale studies that analyze larger numbers of samples in parallel. The third part of this thesis focuses on the challenges of large-scale sequencing analysis. Decoding of a very large DNA-barcoded dataset is presented, aimed at phase-defined sequence investigation of canine MHC loci in over 3000 samples (Paper IV). The analysis revealed new single nucleotide variations and a notable number of novel haplotypes for the 2nd exon of DLA DRB1. Taken together, this thesis demonstrates emerging applications of barcoded sequencing in protein and DNA detection. Improvements in barcoding systems for assay parallelization, de-convolution of antigen-antibody interactions, sequence variant analysis, and large-scale data interpretation would help biomedical studies achieve a deeper understanding of biological processes. The developed methodologies may therefore help advance large-scale omics investigations, particularly in the promising field of DNA-mediated proteomics, for highly multiplexed studies of numerous samples at notably improved molecular resolution.
DNA barcoding; antibody labeling; antibody oligonucleotide bio-conjugation; DNA-assisted proteomics; immuno-sequencing (I-Seq); droplet-based system; large-scale data analysis

6	Machine Learning based Protein Sequence to (un)Structure Mapping and Interaction Prediction
Iqbal, Sumaiya
09 August 2017
Proteins are the fundamental macromolecules within a cell that carry out most biological functions. The computational study of protein structure and function, using machine learning and data analytics, is essential to advancing life-science research, given the fast-growing volume of biological data and the complexity of analyzing it for meaningful insights. We do not limit the mapping of a protein’s primary sequence to its structure; we extend it to the disordered components, known as intrinsically disordered proteins or regions (IDPs/IDRs), and hence to the dynamics involved, which helps explain complex interactions within a cell that are otherwise obscured. The objective of this dissertation is to develop effective machine learning based tools to predict disordered proteins, their properties and dynamics, and their interaction paradigm by systematically mining and analyzing large-scale biological data. We propose a robust framework to predict disordered proteins from sequence information alone, using an optimized SVM with an RBF kernel. Through appropriate reasoning, we highlight the structure-like behavior of IDPs in disease-associated complexes. Further, we develop a fast and effective predictor of the Accessible Surface Area (ASA) of protein residues, a useful structural property that defines a protein’s exposure to partners, using regularized regression with a 3rd-degree polynomial kernel function and a genetic algorithm. As a key outcome of this research, we then introduce a novel method to extract position specific energy (PSEE) of protein residues by modeling the pairwise thermodynamic interactions and the hydrophobic effect. PSEE is found to be an effective feature for identifying the enthalpy gain of the folded state of a protein and, otherwise, the neutral state of unstructured proteins. Moreover, we study the transient peptide-protein interactions that involve the induced folding of short peptides, through disorder-to-order conformational changes, to bind to an appropriate partner. Using the stacked generalization ensemble technique, we develop a suite of predictors that identify, from protein sequence, the residue patterns of Peptide-Recognition Domains that can recognize and bind to peptide motifs and to phospho-peptides carrying post-translational modifications (PTMs) of amino acids, which are responsible for critical human diseases. The biologically relevant case studies involved demonstrate the possibility of discovering new knowledge using the developed tools.
Machine Learning; Large-Scale Data Analysis; Bioinformatics; Intrinsically Disordered Protein; Predictor Framework; Protein-Protein Interaction; Artificial Intelligence and Robotics; Computational Biology; Computer Sciences; Databases and Information Systems

7	Utilizing Data-Driven Approaches to Evaluate and Develop Air Traffic Controller Action Prediction Models
Jeongjoon Boo (9106310)
27 July 2020
Air traffic controllers (ATCos) monitor flight operations and resolve predicted aircraft conflicts to ensure safe flights, making them one of the essential human operators in air traffic control systems. Researchers have studied ATCos with human subjective approaches to understand their tasks and air traffic management processes, and models have been developed to predict ATCo actions as a result. However, there is a gap between our knowledge and the real world. The developed models have never been validated against the real world, which creates uncertainty in our understanding of how ATCos detect and resolve predicted aircraft conflicts. Moreover, we do not know how information from air traffic control systems affects their actions. This Ph.D. dissertation introduces methods to evaluate existing ATCo action prediction models. It also develops a prediction model based on flight contextual information (information describing flight operations) to explain the relationship between ATCo actions and that information. Unlike conventional approaches, this work takes a data-driven approach based on collected large-scale flight tracking data. From the collected real-world data, ATCo actions and the corresponding predicted aircraft conflicts were identified by algorithms developed for this work. Comparison methods were developed to measure both qualitative and quantitative differences between the solutions from existing prediction models and ATCo actions on the same aircraft conflicts. The collected data were further used to develop an ATCo action prediction model. A hierarchical structure found by analyzing the collected ATCo actions was used to structure the model, and the flight contextual information generated from the collected data was used to predict the actions. The results show that the collected ATCo actions exhibit no preference among the methods for resolving aircraft conflicts, and that the evaluated existing prediction model does not reflect the real world. Also, a large portion of the real conflicts was to be solved by the model both physically and operationally. Lastly, the developed prediction model showed a clear relationship between ATCo actions and the applied flight contextual information. These results suggest the following takeaways. First, human actions can be identified from closed-loop data, which could serve as an alternative approach to collecting human subjective data. Second, models need to be evaluated before implications are drawn from them. Third, flight contextual information has the potential to support high-end prediction models.
Aerospace Engineering; Air Traffic Control; air traffic controllers; air traffic systems; human control identification; aircraft conflict identification; aircraft conflict resolution; Large Scale Data Analysis; model evaluation methods; action prediction; machine learning
