1 |
Construction of a Database of Secondary Structure Segments and Short Regions of Disorder and Analysis of Their Properties / Zang, Yizhi 08 August 2005
Submitted to the faculty of the School of Informatics in partial fulfillment of the requirements for the degree Master of Science in Bioinformatics in the School of Informatics, Indiana University August, 2004 / Prediction of the secondary structure of a protein from its amino acid sequence remains an important task. Not only has the growth of databases holding only protein sequences outpaced that of solved protein structures, but successful predictions can provide a starting point for direct tertiary structure modeling [1],[2], and they can also significantly improve sequence analysis and sequence-structure threading [3],[4] to aid structure and function determination. Previous work on predicting the secondary structure of proteins has yielded best accuracies ranging from 63% to 71% [5]. These numbers, however, should be taken with caution, since the performance of a method developed on one training set may vary when it is trained on a different one. Improving predictions of secondary structure poses three challenges. The first is establishing an appropriate database. The second is representing the protein sequence appropriately. The third is finding an appropriate method of classification. Two of the three challenges therefore concern an appropriate database and its characteristic features. Here, we report the development of a database of non-identical segments of secondary structure elements and fragments with missing electron densities (disordered fragments) extracted from the Protein Data Bank and categorized into groups of equal length, from 6 to 40. The number of residues in each category is: 219,788 for α-helices, 82,070 for β-sheets, 179,388 for coils, and 74,724 for disorder. The total number of fragments in the database is 49,544, of which 17,794 are α-helices, 10,216 β-sheets, 16,318 coils, and 5,216 disordered regions.
Across the whole range of lengths, α-helices were found to be enriched in L, A, E, I, and R, β-sheets were enriched in V, I, F, Y, and L, coils were enriched in P, G, N, D, and S, while disordered regions were enriched in S, G, P, H, and D. In addition to the amino acid sequence, for each fragment of every structural type, we calculated the distance between the residues immediately flanking its termini. The observed distances ranged from 3 to 30 Å. We found that for the three secondary structure types the average distance between the bookending residues increases linearly with sequence length, while distances were more constant for disorder. For each length between 6 and 40, we compared the amino acid compositions of all four structural types and found a strong compositional dependence on length only for the β-sheet fragments, while the other three types showed virtually no change with length. Using the Kullback-Leibler (KL) distance between amino acid compositions, we quantified the differences between the four categories. We found that the closest pair in terms of the KL-distance was coil and disorder (dKL = 0.06 bits), followed by α-helix and β-sheet (dKL = 0.14 bits), while all other pairs were almost equidistant from one another (dKL ≈ 0.25 bits). With increasing segment length we found a decreasing KL-distance between sheet and coil, sheet and disorder, and disorder and helix. Analyzing hierarchical clustering of lengths from 6 to 18 for sheet, coil, disorder, and helix, we found that the coil group had the closest proximity among lengths from 6 to 18. The next closest were helix and disorder. The sheet group differed the most across its lengths from 6 to 18. In the sheet and coil groups, fragments of length 17 had the longest distance, while in the disorder and helix groups, fragments of length 6 had the longest distance.
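The KL-distance comparison described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say whether a directed or a symmetrized KL measure was used, nor how zero counts were handled, so the symmetrized form and the pseudocount below are assumptions, and the function names and toy fragments are hypothetical rather than taken from the thesis.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(fragments, pseudocount=1.0):
    """Amino acid frequencies over a list of fragment strings.

    The pseudocount (an assumption, not from the thesis) keeps every
    frequency positive so the logarithm below is always defined.
    """
    counts = Counter()
    for frag in fragments:
        counts.update(ch for ch in frag if ch in AMINO_ACIDS)
    total = sum(counts[a] + pseudocount for a in AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}

def kl_bits(p, q):
    """Directed Kullback-Leibler divergence D(p || q), in bits."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in AMINO_ACIDS)

def symmetric_kl_bits(p, q):
    """Symmetrized variant: the mean of the two directed divergences."""
    return 0.5 * (kl_bits(p, q) + kl_bits(q, p))

# Toy fragments for illustration only; these are not database entries.
helix = composition(["LAEELLRRA", "AELIRKLAE"])
coil = composition(["PGNDSGP", "SGPNDG"])
d_helix_coil = symmetric_kl_bits(helix, coil)
```

Using base-2 logarithms gives the result in bits, matching the units quoted in the abstract; the toy value computed here has no relation to the reported distances.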
|
2 |
Data cube system design: an optimization problem / 洪宜偉, Hung, Edward. January 2000
published_or_final_version / Computer Science and Information Systems / Master / Master of Philosophy
|
3 |
Discovering and using database user access patterns / Yao, Qingsong. January 2006
Thesis (Ph.D.)--York University, 2006. Graduate Programme in Computer Science. / Typescript. Includes bibliographical references (leaves 222-232). Also available on the Internet. MODE OF ACCESS via web browser by entering the following URL: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&res_dat=xri:pqdiss&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:NR29538
|
4 |
Data cube system design : an optimization problem / Hung, Edward. January 2000
Thesis (M. Phil.)--University of Hong Kong, 2000. / Includes bibliographical references (leaves 108-117).
|
5 |
Query optimization using frequent itemset mining / Eom, Boyun. January 2005
Thesis (M.S.)--University of Florida, 2005. / Title from title page of source document. Document formatted into pages; contains 99 pages. Includes vita. Includes bibliographical references.
|
6 |
Data modelling techniques to improve student's admission criteria / Hutton, David January 2015
Education is commonly seen as an escape from poverty and a critical path to securing a better standard of living. This is especially relevant in the South African context, where the need is so great that in one instance people were trampled to death at the gates of a higher educational institution whilst attempting to register for this opportunity. The root cause of this great need is limited capacity and a demand that outstrips the supply. This problem is not specific to South Africa. It is, however, exacerbated in the South African context by the country's lack of infrastructure and the opening of facilities to all people. Tertiary educational institutions are faced with ever-increasing applications for a limited number of available positions. This study focuses on a dataset from the Nelson Mandela Metropolitan University's Faculty of Engineering, the Built Environment and Information Technology, with the aim of establishing guidelines for the use of data modelling techniques to improve student admissions criteria. The importance of data preprocessing was highlighted, and generalized linear regression, decision trees and neural networks were proposed and motivated for modelling. Experimentation was carried out, resulting in a number of recommended guidelines focusing on the tremendous value of feature engineering coupled with the use of generalized linear regression as a baseline. Adding multiple models was highly recommended, since it allows greater opportunities for added insight.
|
7 |
Acceleration and execution of relational queries using general purpose graphics processing unit (GPGPU) / Wu, Haicheng 07 January 2016
This thesis first maps relational computation onto Graphics Processing Units (GPUs) by designing a series of tools, and then explores different opportunities for reducing the limitations imposed by the memory hierarchy across the CPU and GPU system. First, a complete end-to-end compiler and runtime infrastructure, Red Fox, is proposed. The evaluation on the full set of industry-standard TPC-H queries on a single-node GPU shows that Red Fox is on average 11.20x faster than a commercial database system on a state-of-the-art CPU machine. Second, a new compiler technique called kernel fusion is designed to fuse the code bodies of several relational operators to reduce data movement. Third, a multi-predicate join algorithm is designed for GPUs, which can provide much better performance and be used with more flexibility than kernel fusion. Fourth, the GPU-optimized multi-predicate join is integrated into a multi-threaded CPU database runtime system that supports out-of-core data sets to solve real-world problems. This thesis presents key insights, lessons learned, measurements from the implementations, and opportunities for further improvements.
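The idea behind kernel fusion, as summarized in this abstract, can be sketched in plain Python. This is only a CPU-side analogy under stated assumptions: it is not Red Fox code, it does not run on a GPU, and the operator names `select`, `project`, and `fused_select_project` are hypothetical. It shows how fusing two relational operators into one loop avoids materializing the intermediate result between them.

```python
def select(rows, predicate):
    """Unfused SELECT: materializes every qualifying row in a new list."""
    return [r for r in rows if predicate(r)]

def project(rows, columns):
    """Unfused PROJECT: a second pass over the intermediate list."""
    return [tuple(r[c] for c in columns) for r in rows]

def fused_select_project(rows, predicate, columns):
    """Fused operator: filter and project in a single pass.

    No intermediate list is written and re-read, which is the kind of
    data movement that fusion is meant to reduce.
    """
    out = []
    for r in rows:
        if predicate(r):
            out.append(tuple(r[c] for c in columns))
    return out

# Toy relation for illustration: (id, label, price)
rows = [(1, "a", 10.0), (2, "b", 25.0), (3, "c", 40.0)]
expensive = lambda r: r[2] > 20.0

unfused = project(select(rows, expensive), [0, 2])
fused = fused_select_project(rows, expensive, [0, 2])
```

Both paths return the same rows; the difference is that the fused version reads the input once and never stores the filtered intermediate. On a GPU the analogous saving would be in memory traffic between separate kernels.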
|
8 |
A review of theoretical methodologies for locking in a concurrent data base environment / Galinat, Alice Ruth January 2010
Photocopy of typescript. / Digitized by Kansas Correctional Industries
|
9 |
A user-oriented transaction definition facility for a relational database system / Roush, C. Steven January 2010
Photocopy of typescript. / Digitized by Kansas Correctional Industries
|
10 |
Database Marketing-A Case of A Cosmetic Company / Huang, Hsiang-Ting 27 June 2008
In the past, cosmetics were considered luxuries because subsistence goods were scarce. The number of consumers was therefore limited and their needs were treated as identical, which made it easier for firms to predict market conditions and profit. In recent years, the cosmetics industry has faced sharper changes than before: more consumers with diversified needs have expanded the market while also making it more fragile and vulnerable. At the same time, the Internet has improved information flows and consumers' product knowledge, giving them stronger bargaining power. Diversified consumers and a higher value placed on time have also strengthened individuals' pursuit of products and services. Marketers thus face demanding customers, and no firm can rely on traditional mass marketing methods when promoting products: firms must customize their products and services for each target customer segment. It is therefore important for a firm to build effective database marketing systems and to design marketing plans and marketing channels based on data gathering and the conclusions of data analysis.
This research takes the planning and analysis activities of database marketing as its theme and applies a literature review to form an integrated database marketing planning procedure. It also takes one firm as a case and conducts in-depth interviews to understand the applicability of the underlying model and the related operational know-how. The conclusion is that the integrated database marketing planning procedure covers practical operations sufficiently. The study also found that different functions of database marketing are valued by different levels in the chain of command, so the objective of database marketing should emphasize both promoting transactions and building relationships. Overall planning, data analysis, and carrying out data-based marketing plans should also be valued in database marketing implementation.
Although the theoretical framework has practical value, it has limitations, such as its representativeness and objectivity. Future researchers are encouraged to study issues based on or within this structure further, to compare a firm's effectiveness before and after implementing database marketing, or to compare whether implementation processes differ across firms in different industries, in order to examine effectiveness and operational variations before and after database marketing implementation.
|