1 |
Entropy Measurements and Ball Cover Construction for Biological Sequences
Robertson, Jeffrey Alan 01 August 2018
As improving technology makes it easier to select or engineer DNA sequences that produce dangerous proteins, it is important to be able to predict whether a novel DNA sequence is potentially dangerous by determining its taxonomic identity and functional characteristics. These tasks can be facilitated by the ever-increasing amounts of available biological data. Unfortunately, these growing databases can be difficult to take full advantage of because computational and storage costs grow with them. Entropy scaling algorithms and data structures present an approach that can expedite this type of analysis by scaling with the amount of entropy contained in the database instead of with the size of the database. Because sets of DNA and protein sequences are biologically meaningful rather than random, they exhibit some amount of structure. As biological databases grow, taking advantage of this structure can be extremely beneficial. The entropy scaling sequence similarity search algorithm introduced here demonstrates this by accelerating the biological sequence search tools BLAST and DIAMOND. Tests of the implementation of this algorithm show that while this approach can lead to improved query times, constructing the required entropy scaling indices is difficult and expensive. To improve performance and remove this bottleneck, I investigate several ideas for accelerating the construction of indices that support entropy scaling searches. The results of these tests identify key tradeoffs and demonstrate that there is potential in using these techniques for sequence similarity searches. / Master of Science /
As biological organisms are created and discovered, it is important to compare their genetic information to that of known organisms in order to detect possible harmful or dangerous properties. However, the collection of published genetic information from known organisms is huge and growing rapidly, making it difficult to search. This thesis shows that it might be possible to use the non-random properties of biological information to increase the speed and efficiency of searches; that is, because genetic sequences are not random but have common structures, growth in the amount of known data does not mean a proportional increase in its complexity, known as entropy. Specifically, when comparing a new sequence to a set of previously known sequences, it is important to choose the correct algorithm for measuring the similarity of two sequences, also known as the distance between them. This thesis explores the performance of an entropy scaling algorithm compared to several conventional tools.
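The general idea of an entropy scaling search over a ball cover can be sketched as follows. This is a minimal illustration and not the thesis's implementation: it assumes equal-length sequences compared by Hamming distance, a simple greedy cover construction, and hypothetical function names (`build_ball_cover`, `search`). BLAST and DIAMOND are not involved here.

```python
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_ball_cover(seqs: List[str], radius: int) -> Dict[str, List[str]]:
    """Greedy cover (an assumption for illustration): each sequence joins the
    first existing center within `radius`; otherwise it becomes a new center."""
    cover: Dict[str, List[str]] = {}
    for s in seqs:
        for center in cover:
            if hamming(s, center) <= radius:
                cover[center].append(s)
                break
        else:
            cover[s] = [s]
    return cover


def search(cover: Dict[str, List[str]], query: str,
           query_radius: int, cluster_radius: int) -> List[str]:
    """Coarse step: keep only balls whose center could contain a hit.
    Fine step: compare the query against members of the surviving balls."""
    hits: List[str] = []
    for center, members in cover.items():
        # Triangle inequality: any member within query_radius of the query
        # has a center within query_radius + cluster_radius of the query.
        if hamming(query, center) <= query_radius + cluster_radius:
            hits.extend(m for m in members if hamming(query, m) <= query_radius)
    return hits


if __name__ == "__main__":
    database = ["AAAA", "AAAT", "CCCC", "CCCG", "GGGG"]
    cover = build_ball_cover(database, radius=1)
    print(search(cover, "AATT", query_radius=1, cluster_radius=1))
```

When the database is highly structured, many sequences fall into few balls, so the coarse step touches far fewer centers than there are sequences. The cost of building the cover is the bottleneck the abstract refers to.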
|
2 |
Group Contribution Method for the Residual Entropy Scaling Model for Viscosities of Branched Alkanes
Mickoleit, Erik, Jäger, Andreas, Grau Turuelo, Constantino, Thol, Monika, Bell, Ian H., Breitkopf, Cornelia 16 January 2025
In this work it is shown how the entropy scaling paradigm introduced by Rosenfeld (Phys Rev A 15:2545–2549, 1977, https://doi.org/10.1103/PhysRevA.15.2545) can be extended to calculate the viscosities of branched alkanes by group contribution methods (GCM), making the technique more predictive. Two equations of state (EoS) requiring only a few adjustable parameters (Lee–Kesler–Plöcker and PC-SAFT) were used to calculate the thermodynamic properties of linear and branched alkanes. These EoS models were combined with first-order and second-order group contribution methods to obtain the fluid-specific scaling factor, allowing the scaled viscosity values to be mapped onto the generalized correlation developed by Yang et al. (J Chem Eng Data 66:1385–1398, 2021, https://doi.org/10.1021/acs.jced.0c01009). The second-order scheme offers a more accurate estimation of the fluid-specific scaling factor, and overall the method yields an AARD of 10 %, compared with 8.8 % when the fluid-specific scaling factor is fitted directly to the experimental data. More accurate results are obtained when using the PC-SAFT EoS, and the GCM generally outperforms other estimation schemes proposed in the literature for the fluid-specific scaling factor.
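For orientation, the quantities named in this abstract are commonly defined as below. These are the standard textbook forms, stated here as assumptions and not quoted from the paper: the symbols ($\rho_N$ for number density, $m$ for molecular mass, $s^{\mathrm{r}}$ for the residual entropy per molecule, $N$ for the number of data points) are introduced for illustration, and the exact scaled viscosity used by the authors and by Yang et al. may include additional factors.

```latex
% Rosenfeld-style macroscopically reduced viscosity and reduced residual entropy
\eta^{*} = \frac{\eta}{\rho_N^{2/3}\,\sqrt{m\,k_{\mathrm{B}}\,T}},
\qquad
s^{+} = -\frac{s^{\mathrm{r}}}{k_{\mathrm{B}}}

% Average absolute relative deviation (AARD) over N data points
\mathrm{AARD} = \frac{100\,\%}{N}\sum_{i=1}^{N}
  \left|\frac{\eta_{\mathrm{calc},i}-\eta_{\mathrm{exp},i}}{\eta_{\mathrm{exp},i}}\right|
```

In entropy scaling, the reduced viscosity is treated as a nearly universal function of the reduced residual entropy, which the EoS supplies. The fluid-specific scaling factor mentioned above adjusts that mapping for each fluid.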
|