81

Probabilistic Interpretation of Quantum Mechanics with Schrödinger Quantization Rule

Dwivedi, Saurav 04 March 2011 (has links) (PDF)
Quantum theory is a probabilistic theory in which certain variables are hidden or inaccessible, leaving the systems under study without a complete representation. However, I deduce a probabilistic representation of the system by introducing a probability of existence w, and quantize it using Schrödinger's quantization rule. The formalism enriches probabilistic quantum theory and enables the representation of a system in probabilistic terms.
82

Evidence Combination in Hidden Markov Models for Gene Prediction

Brejova, Bronislava January 2005 (has links)
This thesis introduces new techniques for finding genes in genomic sequences. Genes are regions of a genome encoding proteins of an organism. Identification of genes in a genome is an important step in the annotation process after a new genome is sequenced. The prediction accuracy of gene finding can be greatly improved by using experimental evidence. This evidence includes homologies between the genome and databases of known proteins, or evolutionary conservation of genomic sequence in different species.

We propose a flexible framework to incorporate several different sources of such evidence into a gene finder based on a hidden Markov model. Various sources of evidence are expressed as partial probabilistic statements about the annotation of positions in the sequence, and these are combined with the hidden Markov model to obtain the final gene prediction. The opportunity to use partial statements allows us to handle missing information transparently and to cope with the heterogeneous character of individual sources of evidence. On the other hand, this feature makes the combination step more difficult. We present a new method for combining partial probabilistic statements and prove that it is an extension of existing methods for combining complete probability statements. We evaluate the performance of our system and its individual components on data from the human and fruit fly genomes.

The use of sequence evolutionary conservation as a source of evidence in gene finding requires efficient and sensitive tools for finding similar regions in very long sequences. We present a method for improving the sensitivity of existing tools for this task by careful modeling of sequence properties. In particular, we build a hidden Markov model representing a typical homology between two protein coding regions and then use this model to optimize a component of a heuristic algorithm called a spaced seed.
The seeds that we discover significantly improve the accuracy and running time of similarity search in protein coding regions, and are directly applicable to our gene finder.
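The spaced-seed idea lends itself to a small illustration (not code from the thesis). The sketch below assumes a hypothetical 3-periodic seed pattern invented for demonstration; the thesis derives its seeds from an HMM of coding-region homology. Positions marked '1' must match, while '0' positions are wildcards that can absorb the frequent synonymous mismatches at third codon positions.

```python
def seed_hit(s1: str, s2: str, seed: str) -> bool:
    """Check whether a spaced seed matches two equal-length sequence
    windows: positions marked '1' must agree, '0' positions are
    'don't care' and may mismatch freely."""
    return all(a == b for a, b, m in zip(s1, s2, seed) if m == "1")

# Hypothetical 3-periodic seed reflecting codon structure: every
# third position (often a synonymous site) is left unconstrained.
seed = "110110110"

# The two windows differ only at positions 5 and 8, both '0' in the
# seed, so the spaced seed reports a hit while a contiguous seed of
# the same length would not.
print(seed_hit("ATGGCGATT", "ATGGCTATC", seed))         # → True
print(seed_hit("ATGGCGATT", "ATGGCTATC", "111111111"))  # → False
```

This is exactly the sensitivity gain the paragraph above describes: by placing the "don't care" positions where mismatches are statistically likely, more true homologies survive the seeding step.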
83

A Bayesian hierarchical nonhomogeneous hidden Markov model for multisite streamflow reconstructions

Bracken, C., Rajagopalan, B., Woodhouse, C. 10 1900 (has links)
In many complex water supply systems, the next generation of water resources planning models will require simultaneous probabilistic streamflow inputs at multiple locations on an interconnected network. To make use of the valuable multicentury records provided by tree-ring data, reconstruction models must be able to produce appropriate multisite inputs. Existing streamflow reconstruction models typically focus on one site at a time, not addressing intersite dependencies and potentially misrepresenting uncertainty. To this end, we develop a model for multisite streamflow reconstruction with the ability to capture intersite correlations. The proposed model is a hierarchical Bayesian nonhomogeneous hidden Markov model (NHMM). An NHMM is fit to contemporary streamflow at each location using lognormal component distributions. Leading principal components of tree rings are used as covariates to model nonstationary transition probabilities and the parameters of the lognormal component distributions. Spatial dependence between sites is captured with a Gaussian elliptical copula. Parameters of the model are estimated in a fully Bayesian framework, so that marginal posterior distributions of all parameters are obtained. The model is applied to reconstruct flows at 20 sites in the Upper Colorado River Basin (UCRB) from 1473 to 1906. Many previous reconstructions are available for this basin, making it ideal for testing the new method. The results show some improvements over regression-based methods in terms of validation statistics. Key advantages of the Bayesian NHMM over traditional approaches are a dynamic representation of uncertainty and the ability to make long multisite simulations that capture at-site statistics and spatial correlations between sites.
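As an illustration of the "nonhomogeneous" ingredient (not code from the paper), one common way to let transition probabilities depend on covariates such as tree-ring principal components is a softmax (multinomial logistic) link. The parameterization below is a hedged sketch under that assumption, not necessarily the exact link used by the authors.

```python
import numpy as np

def transition_matrix(x: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Nonhomogeneous K x K transition matrix at one time step: each
    row is a softmax of a linear function of the covariate vector x_t
    (e.g. leading principal components of the tree-ring data).

    x: covariates, shape (D,)
    W: regression weights, shape (K, K, D)  [hypothetical parameterization]
    b: intercepts, shape (K, K)
    """
    logits = np.einsum("ijd,d->ij", W, x) + b
    # Row-wise softmax with the usual max-subtraction for stability.
    expl = np.exp(logits - logits.max(axis=1, keepdims=True))
    return expl / expl.sum(axis=1, keepdims=True)
```

Because every row is a softmax, each row sums to one for any covariate value, so the matrix stays a valid stochastic matrix as the covariates (and hence the transition probabilities) change over time.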
84

Meta State Generalized Hidden Markov Model for Eukaryotic Gene Structure Identification

Baribault, Carl 20 December 2009 (has links)
Using a generalized-clique hidden Markov model (HMM) as the starting point for a eukaryotic gene finder, the objective here is to strengthen the signal information at the transitions between coding and non-coding (c/nc) regions. This is done by enlarging the primitive hidden states associated with individual base labeling (as exon, intron, or junk) into substrings of primitive hidden states, or footprint states. Moreover, the allowed footprint transitions are restricted to those that include either one c/nc transition or none at all. (This effectively imposes a minimum length on exons and the other regions.) These footprint states allow the c/nc transitions to be seen sooner and have their contributions to the gene-structure identification weighted more heavily, with a natural weighting determined by the HMM itself from the training data rather than by introducing an artificial gain parameter to tune on major transitions. The generalized HMM is interpolated to the highest Markov order on emission probabilities and to the highest Markov order (subsequence length) on the footprint states. The former is accomplished via simple count cutoff rules, the latter via identification of anomalous base statistics near the major transitions using Shannon entropy. Preliminary indications, from applications to the C. elegans genome, are that the sensitivity/specificity (SN/SP) results for both individual-state and full-exon predictions are greatly enhanced with the generalized-clique HMM compared to the standard HMM. Here the standard HMM is represented by the choice of the smallest footprint-state size in the generalized-clique HMM. Even with these improvements, we observe that both extremely long and extremely short exon and intron segments would go undetected without an explicit model of state duration.

The key contributions of this effort are the full derivation and experimental confirmation of a rudimentary yet powerful and competitive gene-finding method based on a higher-order hidden Markov model. With suitable extensions, this method is expected to provide superior gene-finding capability, not only on the preconditioned data sets used in the evaluations cited but also in the wider context of less preconditioned or raw genomic data.
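A minimal sketch (not from the dissertation) of the footprint-state construction described above: enumerate all substrings of primitive labels of a given length and keep only those containing at most one boundary. For simplicity this sketch treats any change of primitive label as the restricted transition, a simplification of the thesis's c/nc criterion.

```python
from itertools import product

LABELS = "EIJ"  # primitive hidden states: exon, intron, junk

def footprint_states(k: int) -> list[str]:
    """Enumerate length-k footprint (meta) states, keeping only those
    with at most one boundary between distinct primitive labels,
    i.e. one transition or none at all. This is what effectively
    imposes a minimum length on exons and the other regions."""
    states = []
    for s in product(LABELS, repeat=k):
        boundaries = sum(a != b for a, b in zip(s, s[1:]))
        if boundaries <= 1:
            states.append("".join(s))
    return states

# With 3 labels and footprint length k there are 3 uniform states
# plus 3 * 2 * (k - 1) single-boundary states.
print(len(footprint_states(4)))  # → 21
```

Enlarging k grows the state space only linearly under this restriction, instead of the exponential 3^k of unrestricted substrings, which is what keeps the higher-order model tractable.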
85

New statistical Methods of Genome-Scale Data Analysis in Life Science - Applications to enterobacterial Diagnostics, Meta-Analysis of Arabidopsis thaliana Gene Expression and functional Sequence Annotation / Neue statistische Methoden für genomweite Datenanalysen in den Biowissenschaften - Anwendungen in der Enterobakteriendiagnostik, Meta-Analyse von Arabidopsis thaliana Genexpression und funktionsbezogenen Sequenzannotation

Friedrich, Torben January 2009 (has links) (PDF)
Recent progress in molecular biology provides a wealth of new but insufficiently characterised data. This stock comprises, among others, genomic DNA, protein sequences, three-dimensional protein structures and gene expression profiles. In the present work, this information is used to develop new methods for the characterisation and classification of organisms and whole groups of organisms, as well as to enhance the automated gain and transfer of information. The first two approaches presented (chapters 4 and 5) focus on the medically and scientifically important enterobacteria. Their importance in medicine and molecular biology stems from their versatile infection mechanisms, their fundamental role as commensal inhabitants of the intestinal tract and their use as easily cultivated model organisms. Despite many studies on single pathogroups with clinically distinguishable pathologies, the genotypic factors that contribute to their diversity are still partially unknown. The comprehensive genome comparison described in chapter 4 was conducted with numerous enterobacterial strains, which cover nearly the whole range of clinically relevant diversity. The genome comparison constitutes the basis of a characterisation of the enterobacterial gene pool, a reconstruction of evolutionary processes and a comprehensive analysis of specific protein families in enterobacterial subgroups. Correspondence analysis, applied for the first time in this context, yields qualitative statements about bacterial subgroups and the protein families exclusive to them. Statistical tests identified specific protein families for the three major subgroups of enterobacteria: the genera Yersinia and Salmonella, and the group comprising Shigella and E. coli.
In conclusion, the genome-comparison-based methods provide new starting points for inferring specific genotypic traits of bacterial groups from the transfer of functional annotation. Owing to the high medical importance of enterobacterial isolates, their classification according to pathogenicity has been the focus of many studies. Microarray technology offers a fast, reproducible and standardisable means of bacterial typing and has proven its value in bacterial diagnostics, risk assessment and surveillance. The design of the diagnostic microarray for enterobacteria described in chapter 5 is based on the availability of numerous enterobacterial genome sequences. A novel probe selection strategy, based on a highly efficient string-search algorithm and considering both coding and non-coding regions of genomic DNA, enhances pathogroup detection. This principle reduces the risk of incorrect typing caused by restriction to virulence-associated capture probes. Additional capture probes extend the spectrum of applications of the microarray to simultaneous diagnostics or surveillance of antimicrobial resistance. Comprehensive test hybridisations largely confirm the reliability of the selected capture probes and their ability to robustly classify enterobacterial strains according to pathogenicity. Moreover, the tests constitute the basis for training a regression model for the classification of pathogroups and the amounts of hybridised DNA. The regression model features a continuous learning capacity, leading to an enhancement of prediction accuracy in the course of its application. A fraction of the capture probes represents intergenic DNA and hence confirms the relevance of the underlying strategy. Interestingly, a large part of the capture probes represents poorly annotated genes, suggesting the existence of yet unconsidered factors important to the formation of the respective virulence phenotypes. Another major field of microarray applications is gene expression analysis.
The size of gene expression databases has increased rapidly in recent years. Although they provide a wealth of expression data, it remains challenging to integrate results from different studies. In chapter 6, the methodology of an unsupervised meta-analysis of genome-wide A. thaliana gene expression data sets is presented, which yields novel insights into the function and regulation of genes. The application of kernel-based principal component analysis in combination with hierarchical clustering identified three major groups of contrasts, each sharing overlapping expression profiles. Genes associated with two of the groups are known to play important roles in indole-3-acetic acid (IAA) mediated plant growth and development, as well as in pathogen defence. Previously uncharacterised serine-threonine kinases could be assigned novel functions in pathogen defence by the meta-analysis. In general, hidden interrelations between genes regulated under different conditions could be unravelled by the described approach. HMMs are applied to the functional characterisation of proteins and the detection of genes in genome sequences. Although HMMs are technically mature and widely applied in computational biology, I demonstrate a methodical optimisation of modelling accuracy on biological data with various distributions of sequence lengths. The subunits of these models, the states, are associated with a certain holding time, which links them to the length distributions of the represented sequences. An adaptation of simple HMM topologies to bell-shaped length distributions, described in chapter 7, was achieved by serial chain-linking of single states while remaining within the class of conventional HMMs. The impact of optimising HMM topologies was underlined by performance evaluations with differently adjusted HMM topologies.
In summary, a general methodology was introduced to improve the modelling behaviour of HMMs by topological optimisation, using maximum likelihood and a fast, easily implementable moment estimator. Chapter 8 describes the application of HMMs to the prediction of interaction sites in protein domains. As previously demonstrated, these sites are not trivial to predict because the conservation of their location and type varies within a domain family. The prediction of interaction sites in protein domains is achieved by a newly defined HMM topology that incorporates both sequence and structure information. Posterior decoding is applied to the prediction of interaction sites, providing additional information on the probability of an interaction at every sequence position. The implementation of interaction profile HMMs (ipHMMs) is based on the well-established profile HMMs and inherits their known efficiency and sensitivity. The large-scale prediction of interaction sites by ipHMMs explained protein dysfunctions caused by mutations associated with heritable diseases such as different types of cancer and muscular dystrophy. As already demonstrated for profile HMMs, ipHMMs are suitable for large-scale applications. Overall, the HMM-based method enhances the prediction quality of interaction sites and improves the understanding of the molecular background of heritable diseases. With respect to current and future requirements, this work provides large-scale solutions for the characterisation of biological data. All described methods are highly portable and can be transferred to related topics or organisms. Special emphasis was put on knowledge transfer, facilitated by a steadily increasing wealth of biological information. The applied and developed statistical methods largely provide learning capacity and hence benefit from this gain of knowledge, resulting in increased prediction accuracy and reliability.
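The serial chain-linking of states mentioned for chapter 7 has a well-known quantitative effect that a short sketch can make concrete (standard HMM theory, not code from the thesis): replacing one state, whose holding time is geometric, with a chain of n identical copies yields a negative-binomial duration distribution, which is bell-shaped for n > 1.

```python
from math import comb

def chain_duration_pmf(d: int, n: int, p: float) -> float:
    """Probability that a serial chain of n identical states, each
    with self-loop probability p, emits exactly d symbols before
    leaving. This is a negative binomial distribution: d - n
    self-loops distributed over n states, each state exited once."""
    if d < n:
        return 0.0
    return comb(d - 1, n - 1) * ((1 - p) ** n) * (p ** (d - n))

# n = 1 recovers the monotonically decreasing geometric distribution;
# n = 3 peaks away from the minimum duration, i.e. it is bell-shaped.
print(chain_duration_pmf(3, 1, 0.5) > chain_duration_pmf(4, 1, 0.5))  # → True
print(chain_duration_pmf(3, 3, 0.5) < chain_duration_pmf(5, 3, 0.5))  # → True
```

This is why chaining states lets a conventional HMM approximate bell-shaped sequence-length distributions without leaving the HMM class: the topology, not the emission model, carries the duration information.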
86

Formalizing life : Towards an improved understanding of the sequence-structure relationship in alpha-helical transmembrane proteins

Viklund, Håkan January 2007 (has links)
Genes coding for alpha-helical transmembrane proteins constitute roughly 25% of the total number of genes in a typical organism. As these proteins are vital parts of many biological processes, an improved understanding of them is important for achieving a better understanding of the mechanisms that constitute life.

All proteins consist of an amino acid sequence that folds into a three-dimensional structure in order to perform a biological function. The work presented in this thesis is directed towards improving the understanding of the relationship between sequence and structure for alpha-helical transmembrane proteins. Specifically, five original methods for predicting the topology of alpha-helical transmembrane proteins have been developed: PRO-TMHMM, PRODIV-TMHMM, OCTOPUS, Toppred III and SCAMPI.

A general conclusion from these studies is that approaches that use multiple sequence information achieve the best prediction accuracy. Further, the properties of reentrant regions have been studied, both with respect to sequence and structure. One result of this study is an improved definition of the topological grammar of transmembrane proteins, which is used in OCTOPUS and shown to further improve topology prediction. Finally, Z-coordinates, an alternative system for representing topological information for transmembrane proteins based on distance to the membrane center, have been introduced, and a method for predicting Z-coordinates from amino acid sequence, Z-PRED, has been developed.
87

Quantum algorithms for searching, resampling, and hidden shift problems

Ozols, Maris January 2012 (has links)
This thesis is on quantum algorithms. It has three main themes: (1) quantum walk based search algorithms, (2) quantum rejection sampling, and (3) the Boolean function hidden shift problem. The first two parts deal with generic techniques for constructing quantum algorithms, and the last part is on quantum algorithms for a specific algebraic problem. In the first part of this thesis we show how certain types of random walk search algorithms can be transformed into quantum algorithms that search quadratically faster. More formally, given a random walk on a graph with an unknown set of marked vertices, we construct a quantum walk that finds a marked vertex in a number of steps that is quadratically smaller than the hitting time of the random walk. The main idea of our approach is to interpolate the random walk from one that does not stop when a marked vertex is found to one that stops. The quantum equivalent of this procedure drives the initial superposition over all vertices to a superposition over marked vertices. We present an adiabatic as well as a circuit version of our algorithm, and apply it to the spatial search problem on the 2D grid. In the second part we study a quantum version of the problem of resampling one probability distribution to another. More formally, given query access to a black box that produces a coherent superposition of unknown quantum states with given amplitudes, the problem is to prepare a coherent superposition of the same states with different specified amplitudes. Our main result is a tight characterization of the number of queries needed for this transformation. By utilizing the symmetries of the problem, we prove a lower bound using a hybrid argument and semidefinite programming. For the matching upper bound we construct a quantum algorithm that generalizes the rejection sampling method first formalized by von Neumann in 1951.
We describe quantum algorithms for the linear equations problem and quantum Metropolis sampling as applications of quantum rejection sampling. In the third part we consider a hidden shift problem for Boolean functions: given oracle access to f(x+s), where f(x) is a known Boolean function, determine the hidden shift s. We construct quantum algorithms for this problem using the "pretty good measurement" and quantum rejection sampling. Both algorithms use the Fourier transform and their complexity can be expressed in terms of the Fourier spectrum of f (in particular, in the second case it relates to "water-filling" of the spectrum). We also construct algorithms for variations of this problem where the task is to verify a given shift or extract only a single bit of information about it.
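For readers unfamiliar with the classical procedure that quantum rejection sampling generalizes, here is a sketch of von Neumann rejection sampling in its discrete classical form (the quantum version operates on coherent superpositions and amplitudes, which this toy code does not attempt to model):

```python
import random

def rejection_sample(target_pmf, proposal_pmf, sample_proposal, c):
    """Classical von Neumann rejection sampling: draw x from the
    proposal distribution and accept it with probability
    target(x) / (c * proposal(x)), where c is chosen so that
    target(x) <= c * proposal(x) for all x. Accepted samples are
    distributed exactly according to the target distribution."""
    while True:
        x = sample_proposal()
        if random.random() < target_pmf(x) / (c * proposal_pmf(x)):
            return x

# Resample a fair coin (proposal) into a 3:1 biased coin (target).
random.seed(0)
target = {0: 0.75, 1: 0.25}
draws = [rejection_sample(lambda x: target[x],
                          lambda x: 0.5,
                          lambda: random.randint(0, 1),
                          c=1.5)
         for _ in range(5000)]
print(round(draws.count(0) / 5000, 2))  # ≈ 0.75
```

The expected number of proposal draws per accepted sample is c, which is the classical analogue of the query complexity the thesis characterizes tightly in the quantum setting.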
89

Continuous Hidden Markov Model for Pedestrian Activity Classification and Gait Analysis

Panahandeh, Ghazaleh, Mohammadiha, Nasser, Leijon, Arne, Händel, Peter January 2013 (has links)
This paper presents a method for pedestrian activity classification and gait analysis based on a microelectromechanical-systems inertial measurement unit (IMU). The work targets two groups of applications: 1) human activity classification and 2) joint human activity and gait-phase classification. In the latter case, the gait phase is defined as a substate of a specific gait cycle, i.e., the states of the body between the stance and swing phases. We model the pedestrian motion with a continuous hidden Markov model (HMM) in which the output density functions are assumed to be Gaussian mixture models. For the joint activity and gait-phase classification, motivated by the cyclical nature of the IMU measurements, each individual activity is modeled by a "circular HMM." For both of the proposed classification methods, proper feature vectors are extracted from the IMU measurements. In this paper, we report the results of experiments in which the IMU was mounted on the subjects' chests, which permits future application of the study to camera-aided inertial navigation for positioning and personal assistance. Five classes of activity are considered in the experiments: walking, running, going upstairs, going downstairs, and standing. The performance of the proposed methods is illustrated in various ways; as an objective measure, the confusion matrix is computed and reported. The relative figures of merit achieved on the collected data validate the reliability of the proposed methods for the desired applications.
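The "circular HMM" structure can be illustrated with a toy transition matrix (an assumption-laden sketch, not the paper's implementation): each gait-phase state either self-loops or advances to the next phase, with the last phase wrapping around to the first, matching the cyclic nature of a gait cycle.

```python
import numpy as np

def circular_transitions(n_states: int, p_stay: float) -> np.ndarray:
    """Left-to-right transition matrix with wrap-around: state i
    stays with probability p_stay or advances to state (i+1) mod n.
    Phases can only be visited in their cyclic order, which encodes
    the periodicity of the gait cycle in the topology itself."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states):
        A[i, i] = p_stay
        A[i, (i + 1) % n_states] = 1 - p_stay
    return A

# A 4-phase cycle: only the diagonal, the superdiagonal, and the
# wrap-around corner entry A[3, 0] are nonzero.
print(circular_transitions(4, 0.5))
```

In the paper's joint model each activity would get its own such cycle over its gait phases, with the Gaussian-mixture output densities attached per state; here only the transition topology is sketched.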
90

Detection of covert channel communications based on intentionally corrupted frame check sequences

Najafizadeh, Ali 01 July 2011 (has links)
This thesis presents the establishment of a covert channel in wireless networks in the form of frames with intentionally corrupted Frame Check Sequences (FCSs). Previous work alluded to the possibility of using this kind of covert channel as an attack vector. We modify a simulation tool called Sinalgo, which is used as a test bed for generating hypothetical scenarios for establishing a covert channel. Single- and multi-agent systems are proposed as behaviour-based intrusion detection mechanisms that utilize statistical information about network traffic to detect covert-channel communications. This work highlights the potential impact of this attack being perpetrated in communications equipment with a low chance of detection, if properly crafted.
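To make the mechanism concrete (illustrative only; the thesis works at the wireless frame level inside Sinalgo, while this sketch just uses the standard CRC-32 from Python's zlib): a sender can append a deliberately wrong FCS so that ordinary receivers drop the frame as a transmission error, while a colluding receiver ignores the check and still reads the payload.

```python
import zlib

def frame_with_fcs(payload: bytes, corrupt: bool = False) -> bytes:
    """Append a CRC-32 frame check sequence to the payload. With
    corrupt=True the FCS is deliberately inverted, producing a frame
    that a standards-compliant receiver silently discards as a bit
    error -- the property the covert channel exploits."""
    fcs = zlib.crc32(payload) & 0xFFFFFFFF
    if corrupt:
        fcs ^= 0xFFFFFFFF  # intentionally invalid checksum
    return payload + fcs.to_bytes(4, "little")

def fcs_ok(frame: bytes) -> bool:
    """What a normal receiver does: recompute the CRC and compare."""
    payload, fcs = frame[:-4], int.from_bytes(frame[-4:], "little")
    return (zlib.crc32(payload) & 0xFFFFFFFF) == fcs

covert = frame_with_fcs(b"hidden message", corrupt=True)
print(fcs_ok(covert))   # → False: dropped by ordinary receivers
print(covert[:-4])      # → b'hidden message': readable by a colluder
```

This asymmetry, valid-looking traffic volume with an elevated rate of "corrupted" frames, is exactly the statistical signature the proposed agent-based detectors monitor.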
