1 |
Robust and Efficient Algorithms for Protein 3-D Structure Alignment and Genome Sequence Comparison
Zhao, Zhiyu 07 August 2008
Sequence analysis and structure analysis are two of the fundamental areas of bioinformatics research. This dissertation discusses protein-structure problems, including protein structure alignment and query, and genome-sequence problems, including haplotype reconstruction and genome rearrangement. It first presents an algorithm for pairwise protein structure alignment that is tested with structures from the Protein Data Bank (PDB). In many cases it outperforms two other well-known algorithms, DaliLite and CE. The preliminary algorithm is a graph-theory-based approach that uses the concept of "stars" to reduce the complexity of clique-finding algorithms. The algorithm is then improved by introducing "double-center stars" in the graph and applying a self-learning strategy. The updated algorithm is tested with a much larger set of protein structures and shown to be an improvement in accuracy, especially in cases of weak similarity. A protein structure query algorithm is designed to search for similar structures in the PDB, using the improved alignment algorithm. It is compared with SSM and shows better performance, with lower maximum and average Q-score for missing proteins. An interesting problem concerning the calculation of the diameter of a 3-D sequence of points arose, and its connection to sublinear-time computation is discussed. The diameter of a 3-D sequence is approximated by a series of sublinear-time deterministic, zero-error, and bounded-error randomized algorithms, which yields a series of separations on the power of sublinear-time computation. This dissertation also discusses two genome-sequence problems. A probabilistic model is proposed for reconstructing haplotypes from SNP matrices with incomplete and inconsistent errors. Experiments with simulated data show both high accuracy and speed, consistent with the theoretically provable efficiency and accuracy of the algorithm.
Finally, a genome rearrangement problem is studied. The concept of non-breaking similarity is introduced. Approximating the exemplar non-breaking similarity to within a factor of n^(1-ε) is proven to be NP-hard. Interestingly, polynomial-time algorithms are presented for several practical cases.
|
2 |
OPEN—Enabling Non-expert Users to Extract, Integrate, and Analyze Open Data
Braunschweig, Katrin, Eberius, Julian, Thiele, Maik, Lehner, Wolfgang 27 January 2023
Government initiatives for more transparency and participation have led to an increasing amount of structured data on the web in recent years. Many of these datasets have great potential. For example, a situational analysis and meaningful visualization of the data can assist in pointing out social or economic issues and raising people's awareness. Unfortunately, the ad-hoc analysis of this so-called Open Data can prove very complex and time-consuming, partly due to a lack of efficient system support. On the one hand, search functionality is required to identify relevant datasets. Common document retrieval techniques used in web search, however, are not optimized for Open Data and do not address the semantic ambiguity inherent in it. On the other hand, semantic integration is necessary to perform analysis tasks across multiple datasets. Doing so in an ad-hoc fashion, however, requires more flexibility and easier integration than most data integration systems provide. It is apparent that an optimal management system for Open Data must combine aspects of both classic approaches. In this article, we propose OPEN, a novel concept for the management and situational analysis of Open Data within a single system. In our approach, we extend a classic database management system, adding support for the identification and dynamic integration of public datasets. As most web users lack the experience and training required to formulate structured queries in a DBMS, we add support for non-expert users to our system, for example through keyword queries. Furthermore, we address the challenge of indexing Open Data.
|
3 |
Implementation of data flow query language on a handheld device
Evangelista, Mark A. 03 1900
Approved for public release; distribution is unlimited / Handheld devices have evolved significantly from simple organizers into more powerful handheld computers capable of network connectivity, which gives them the ability to send e-mail, browse the World Wide Web, and query remote databases. However, handheld devices, because of their design philosophy, are limited in size, memory, and processing power compared to desktop computers. This thesis investigates the use of Data Flow Query Language (DFQL) in querying local and remote databases from a handheld device. Creating Structured Query Language (SQL) queries can be a complex undertaking, and trying to create one on a handheld device with a small screen only adds to the complexity. By using DFQL, however, the user can submit queries through an easy-to-use graphical user interface. Although handheld devices are currently more powerful than earlier PCs, they still require applications with a small footprint, which is a limiting factor for the software developed. This thesis also investigates the best division of labor between the handheld device and remote servers. / Sergeant, United States Army
|