Global ETD Search

1061	Interrogation des bases de données XML probabilistes / Querying probabilistic XML Souihli, Asma 21 September 2012 Probabilistic XML is a probabilistic model for uncertain tree-structured data, with applications to data integration, information extraction, or uncertain version control. We explore in this dissertation efficient algorithms for evaluating tree-pattern queries with joins over probabilistic XML or, more specifically, for approximating the probability of each item of a query result. The approach relies on, first, extracting the query lineage over the probabilistic XML document, and, second, looking for an optimal strategy to approximate the probability of the propositional lineage formula. ProApproX is the probabilistic query manager for probabilistic XML presented in this thesis. 
The system allows users to query uncertain tree-structured data in the form of probabilistic XML documents. It integrates a query engine that searches for an optimal strategy to evaluate the probability of the query lineage. ProApproX relies on a query-optimizer-like approach: exploring different evaluation plans for different parts of the formula and predicting the cost of each plan, using a cost model for the various evaluation algorithms. We demonstrate the efficiency of this approach on datasets used in several of the most popular previous works on probabilistic XML querying, as well as on synthetic data. An early version of the system was demonstrated at the ACM SIGMOD 2011 conference. First steps towards the new query solution were discussed in an EDBT/ICDT PhD Workshop paper (2011). A fully redesigned version that implements the techniques and studies shared in the present thesis is published as a demonstration at CIKM 2012. Our contributions are also part of an IEEE ICDE Database management XML query
1062	Methodology for Determining Crash and Injury Reduction from Emerging Crash Prevention Systems in the U.S. Kusano, Kristofer Darwin 30 July 2013 In order to prevent or mitigate the negative consequences of traffic crashes, automakers are developing active safety systems, which aim to prevent or mitigate collisions. These systems are expensive to develop and as a result automakers and regulators are motivated to forecast the potential benefits of a proposed safety system before it is widely deployed in the vehicle fleet. The objective of this dissertation was to develop a methodology for predicting fleet-wide benefits for emerging crash avoidance systems as if all vehicles were equipped with a system. Forward Collision Avoidance Systems (FCAS) were used as an example application of this methodology. The methodology developed for this research includes the following components: 1) identification of the target population, 2) development and validation of a driver model, 3) development of injury risk functions, 4) development of a crash severity reduction model, and 5) computation of fleet-wide benefits. This dissertation presents a general methodology for each of these components that could be used for any active safety system. Then a specific model is constructed for FCAS. FCAS could potentially be applicable to 31% of all collisions, 6% of serious injury crashes, and 7% of fatal crashes. Annually, this accounts for 3.3 million collisions and 18,367 fatal crashes. We developed a model of driver braking in response to a forward collision warning. Next we used logistic regression to develop injury risk functions that predicted the probability of injury given the crash severity (ΔV) and occupant characteristics. Finally, we simulated 2,459 real-world rear-end collisions as if the driver had an FCAS with combinations of warnings, brake assist, and autonomous braking. 
We found that between 3.4% and 7.2% of crashes could be prevented and that many more could be mitigated in severity. These systems reduced the number of injured (MAIS2+) drivers in rear-end collisions by between 32% and 55%. In total, the systems could prevent between $184 and $338 million in economic costs associated with crashes per year. / Ph. D. Vehicle Safety Active Safety Crash Avoidance Crash Database Injury Risk
1063	Complex Proteoform Identification Using Top-Down Mass Spectrometry Kou, Qiang 12 1900 Indiana University-Purdue University Indianapolis (IUPUI) / Proteoforms are distinct protein molecule forms created by variations in genes, gene expression, and other biological processes. Many proteoforms contain multiple primary structural alterations, including amino acid substitutions, terminal truncations, and posttranslational modifications. These primary structural alterations play a crucial role in determining protein functions: proteoforms from the same protein with different alterations may exhibit different functional behaviors. Because top-down mass spectrometry directly analyzes intact proteoforms and provides complete sequence information of proteoforms, it has become the method of choice for the identification of complex proteoforms. Although instruments and experimental protocols for top-down mass spectrometry have been advancing rapidly in the past several years, many computational problems in this area remain unsolved, and the development of software tools for analyzing such data is still at its very early stage. In this dissertation, we propose several novel algorithms for challenging computational problems in proteoform identification by top-down mass spectrometry. First, we present two approximate spectrum-based protein sequence filtering algorithms that quickly find a small number of candidate proteins from a large proteome database for a query mass spectrum. Second, we describe mass graph-based alignment algorithms that efficiently identify proteoforms with variable post-translational modifications and/or terminal truncations. Third, we propose a Markov chain Monte Carlo method for estimating the statistical significance of identified proteoform spectrum matches. 
They are the first efficient algorithms that take into account three types of alterations: variable post-translational modifications, unexpected alterations, and terminal truncations in proteoform identification. As a result, they are more sensitive and powerful than other existing methods that consider only one or two of the three types of alterations. All the proposed algorithms have been incorporated into TopMG, a complete software pipeline for complex proteoform identification. Experimental results showed that TopMG significantly increases the number of identifications compared with other existing methods in proteome-level top-down mass spectrometry studies. TopMG will facilitate the applications of top-down mass spectrometry in many areas, such as the identification and quantification of clinically relevant proteoforms and the discovery of new proteoform biomarkers. / 2019-06-21 Algorithms Alignment Bioinformatics Database searching Proteoform Top-Down Mass Spectrometry
1064	Enabling Static Program Analysis Using A Graph Database Liu, Jialun January 2020 No description available. Computer Science Computer Engineering Graph Database Static Analysis
1065	Celltyper: A Single-Cell Sequencing Marker Gene Tool Suite Paisley, Brianna Meadow 05 1900 Indiana University-Purdue University Indianapolis (IUPUI) / Single-cell RNA-sequencing (scRNA-seq) has enabled researchers to study interindividual cellular heterogeneity, to explore disease impact on cellular composition of tissue, and to identify novel cell subtypes. However, a major challenge in scRNA-seq analysis is to identify the cell type of individual cells. Accurate cell type identification is crucial for any scRNA-seq analysis to be valid, as incorrect cell type assignment will reduce statistical robustness and may lead to incorrect biological conclusions. Therefore, accurate and comprehensive cell type assignment is necessary for reliable biological insights into scRNA-seq datasets. With over 200 distinct cell types in humans alone, the concept of cell identity is broad. Even within the same cell type there exists heterogeneity due to cell cycle phase, cell state, cell subtypes, cell health and the tissue microenvironment. This makes cell type classification a complicated biological problem requiring bioinformatics. One approach to classifying cell type identity is to use marker genes. Marker genes are genes specific for one or a few cell types. When coupled with bioinformatic methods, marker genes show promise of improving cell type classification. However, current scRNA-seq classification methods and databases use marker genes that are non-specific across sources, samples, and/or species, leading to bias and errors. Furthermore, many existing tools require manual intervention by the user to provide training datasets or the expected number and name of cell types, which can introduce selection bias. The selection bias negatively impacts the accuracy of cell type classification methods, as the model cannot extrapolate outside of the user inputs even when it is biologically meaningful to do so. 
In this dissertation I developed CellTypeR, a suite of tools to explore the biology governing cell identity in a “normal” state for humans and mice. The work presented here accomplishes three aims: 1. Develop an ontology-standardized database of published marker gene literature; 2. Develop and apply a marker gene classification algorithm; and 3. Create a user interface and input data structure for scRNA-seq cell type prediction. Bioinformatics Database scRNA-seq Marker gene User interface
1066	Developing an Interactive Web-Based Database for Teaching Plant Materials Weerasinghe, Kanchana S 17 May 2014 In today’s increasingly fast-moving, complex, and competitive world, the need for flexibility and creativity in teaching and learning is crucial. For that reason, innovative educational methods should be introduced. In education, web-based learning and portable devices are emerging as teaching and learning aids which can be efficient and effective tools. Learning the use and identification of ornamental plants are the main objectives of the plant materials courses offered by the Department of Plant and Soil Sciences at Mississippi State University (MSU). The professors, teaching assistants (TA), and students use the MSU gardens to study and identify ornamental plant species. This can be time consuming for both instructors and students. This research developed an automated web-based database system to deliver information on the ornamental plants in the MSU gardens. Apache, MySQL, PHP, JavaScript, Dreamweaver, and Photoshop software were used to develop this application in the Windows environment, and information about each plant was entered into the database. Plant locations were given by longitude and latitude coordinates and linked to Google Maps. Quick Response (QR) codes were created to directly access ornamental plant information in the field. This database may function as a virtual TA for the plant materials courses and as an information source for the public. Users can search the ornamental plant information and determine the location of plants using a computer or mobile device. Plant information can be retrieved from the field by a smart phone with a QR code reader. To evaluate the effectiveness and efficiency of the developed automated system, an experimental study and questionnaire survey were designed. teaching plant identification database portable device QR code
1067	Geometric performance evaluation of concurrency control in database systems Rallis, Nicholas. January 1984 No description available. Database management. Geometry -- Data processing.
1068	Gestion d'information sur les procédés thermiques par base de données Gagnon, Bertrand. January 1986 No description available.
1069	Review of "Renaissance Cultural Crossroads Catalogue" Reid, Joshua S. 01 January 2020 Review of the Renaissance Cultural Crossroads Catalogue (RCCC) database, edited by Brenda Hosington. database review English literature Literature and Language Literature in English, British Isles
1070	Efficient Cryptographic Constructions For Resource-Constrained Blockchain Clients Duc Viet Le (11191410) 28 July 2021 The blockchain offers a decentralized way to provide security guarantees for financial transactions. However, this ability comes with the cost of storing a large (distributed) blockchain state and introducing additional computation and communication overhead to all participants. All these drawbacks raise a challenging scalability problem, especially for resource-constrained blockchain clients. On the other hand, some scaling solutions typically require resource-constrained clients to rely on other nodes with higher computational and storage capabilities. However, such scaling solutions often expose the data of the clients to risks of compromise of the more powerful nodes they rely on (e.g., accidental, malicious through a break-in, insider misbehavior, or malware infestation). This potential for leakage raises a privacy concern for these constrained clients, in addition to other scaling-related concerns. This dissertation proposes several cryptographic constructions and system designs enabling resource-constrained devices to participate in the blockchain network securely and efficiently. Our first proposal concerns the storage facet, for which we propose two add-on privacy designs to address the scaling issue of storing a large blockchain state. The first solution is an oblivious database framework, called T³, that allows resource-constrained clients to obliviously fetch blockchain data from potentially malicious full clients. The second solution focuses on the problem of using and storing additional private-by-design blockchains (e.g., Monero or ZCash) to achieve privacy. 
We propose an add-on tumbler design, called AMR, that offers privacy directly to clients of non-private blockchains such as Ethereum without the cost of storing and using different blockchain states. Our second proposal addresses the communication facet, with a focus on payment channels as a solution to address the communication overhead between the constrained clients and the blockchain network. A payment channel enables transactions between arbitrary pairs of constrained clients with a minimal communication overhead with the blockchain network. However, in popular blockchains like Ethereum and Bitcoin, the payment data of such channels are exposed to the public, which is undesirable for financial applications. Thus, to hide transaction data, one can use blockchains that are private by design like Monero. However, existing cryptographic primitives in Monero prevent the system from supporting any form of payment channels. Therefore, we present Dual Linkable Spontaneous Anonymous Group Signature for Ad Hoc Groups (DLSAG), a linkable ring signature scheme that enables, for the first time, off-chain scalability solutions in Monero. To address the computation facet, we address the computation overhead of the gossip protocol used in all popular blockchain protocols. For this purpose, we propose a signature primitive called Flexible Signature. In a flexible signature scheme, the verification algorithm quantifies the validity of a signature based on the computational effort performed by the verifier. Thus, the resource-constrained devices can partially verify the signatures in the blockchain transactions before relaying transactions to other peers. This primitive allows the resource-constrained devices to prevent spam transactions from flooding the blockchain network with overhead that is consistent with their resource constraints. 
Computer System Security blockchain cryptography Oblivious Database resource-constrained devices
