Global ETD Search

291	Hidden Markov Models Predict Epigenetic Chromatin Domains Larson, Jessica 20 December 2012 (has links) Epigenetics is an important layer of transcriptional control necessary for cell-type specific gene regulation. We developed computational methods to analyze the combinatorial effect and large-scale organizations of genome-wide distributions of epigenetic marks. Throughout this dissertation, we show that regions containing multiple genes with similar epigenetic patterns are found throughout the genome, suggesting the presence of several chromatin domains. In Chapter 1, we develop a hidden Markov model (HMM) for detecting the types and locations of epigenetic domains from multiple histone modifications. We use this method to analyze a published ChIP-seq dataset of five histone modification marks in mouse embryonic stem cells. We successfully detect domains of consistent epigenetic patterns from ChIP-seq data, providing new insights into the role of epigenetics in long-range gene regulation. In Chapter 2, we expand our model to investigate the genome-wide patterns of histone modifications in multiple human cell lines. We find that chromatin states can be used to accurately classify cell differentiation stage, and that three cancer cell lines can be classified as differentiated cells. We also find that genes whose chromatin states change dynamically in accordance with differentiation stage are not randomly distributed across the genome, but tend to be embedded in multi-gene chromatin domains. Moreover, many specialized gene clusters are associated with stably occupied domains. In the last chapter, we develop a more sophisticated, tiered HMM to include a domain structure in our chromatin annotation. We find that a model with three domains and five sub-states per domain best fits our data. Each state has a unique epigenetic pattern, while still staying true to its domain’s specific functional aspects and expression profiles. 
The majority of the genome (including most introns and intergenic regions) has low epigenetic signals and is assigned to the same domain. Our model outperforms current chromatin state models due to its increased domain coherency and interpretation. biostatistics bioinformatics chromatin chromatin domains epigenetics hidden Markov models histone modifications
292	Statistical Learning of Some Complex Systems: From Dynamic Systems to Market Microstructure Tong, Xiao Thomas 27 September 2013 (has links) A complex system is one with many parts whose behaviors are strongly dependent on each other. There are two interesting questions about complex systems. One is how to recover the true structure of a complex system from noisy data. The other is how the system interacts with its environment. In this thesis, we address these two questions by studying two distinct complex systems: dynamic systems and market microstructure. To address the first question, we focus on some nonlinear dynamic systems. We develop a novel Bayesian statistical method, Gaussian Emulator, to estimate the parameters of dynamic systems from noisy data, when the data are either fully or partially observed. Compared with numerical solvers, our method substantially improves estimation accuracy and computes faster. To address the second question, we focus on the market microstructure of hidden liquidity. We propose some statistical models to explain the hidden liquidity under different market conditions. Our statistical results suggest that hidden liquidity can be reliably predicted given the visible state of the market. / Statistics Statistics Bayesian inference dynamic systems Gaussian process hidden liquidity market microstructure nonparametric regression
293	Bayesian Inference Approaches for Particle Trajectory Analysis in Cell Biology Monnier, Nilah 28 August 2013 (has links) Despite the importance of single particle motion in biological systems, systematic inference approaches to analyze particle trajectories and evaluate competing motion models are lacking. An automated approach for robust evaluation of motion models that does not require manual intervention is highly desirable to enable analysis of datasets from high-throughput imaging technologies that contain hundreds or thousands of trajectories of biological particles, such as membrane receptors, vesicles, chromosomes or kinetochores, mRNA particles, or whole cells in developing embryos. Bayesian inference is a general theoretical framework for performing such model comparisons that has proven successful in handling noise and experimental limitations in other biological applications. The inherent Bayesian penalty on model complexity, which avoids overfitting, is particularly important for particle trajectory analysis given the highly stochastic nature of particle diffusion. This thesis presents two complementary approaches for analyzing particle motion using Bayesian inference. The first method, MSD-Bayes, discriminates a wide range of motion models--including diffusion, directed motion, anomalous and confined diffusion--based on mean-square displacement analysis of a set of particle trajectories, while the second method, HMM-Bayes, identifies dynamic switching between diffusive and directed motion along individual trajectories using hidden Markov models. These approaches are validated on biological particle trajectory datasets from a wide range of experimental systems, demonstrating their broad applicability to research in cell biology. Biophysics Bayesian inference cell biology hidden Markov models mean-square displacement particle trajectories
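As a rough illustration of the mean-square displacement (MSD) quantity that the abstract above builds on, the following sketch computes a time-averaged MSD for a single 2D trajectory. It is not the MSD-Bayes method itself, which the abstract does not specify in detail. The function name `mean_square_displacement`, the list-of-tuples input format, and the toy trajectory are all assumptions made for this example.

```python
def mean_square_displacement(track, max_lag):
    """Time-averaged MSD of one trajectory for lags 1..max_lag.

    `track` is a list of (x, y) positions sampled at a fixed interval
    (hypothetical input format; requires max_lag < len(track)).
    """
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement between every pair of points `lag` frames apart.
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


# Toy example: purely directed motion, one unit step in x per frame,
# so the MSD grows quadratically with the lag.
directed = [(float(t), 0.0) for t in range(10)]
msd_directed = mean_square_displacement(directed, 3)
print(msd_directed)  # [1.0, 4.0, 9.0]
```

For directed motion the curve grows quadratically with lag, whereas for pure diffusion it would grow linearly; differences in curve shape of this kind are what MSD-based model comparison relies on.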
294	Weakly supervised part-of-speech tagging for Chinese using label propagation Ding, Weiwei, 1985- 02 February 2012 (has links) Part-of-speech (POS) tagging is one of the most fundamental and crucial tasks in Natural Language Processing. Chinese POS tagging is challenging because it also involves word segmentation. This report focuses on how to improve unsupervised POS tagging using Hidden Markov Models and the Expectation Maximization parameter estimation approach (EM-HMM). The traditional EM-HMM system uses a dictionary to constrain possible tag sequences and initialize the model parameters. This is a very crude initialization: the emission parameters are set uniformly in accordance with the tag dictionary. To improve this, word alignments can be used. Word alignments are the word-level translation correspondent pairs generated from parallel text between two languages. In this report, Chinese-English word alignment is used. The performance is expected to be better, as these two tasks are complementary to each other. The dictionary provides information on word types, while word alignment provides information on word tokens. However, it is found to be of limited benefit. In this report, another method is proposed. To improve the dictionary coverage and get a better POS distribution, Modified Adsorption, a label propagation algorithm, is used. We construct a graph connecting word tokens to feature types (such as word unigrams and bigrams) and connecting those tokens to information from knowledge sources, such as a small tag dictionary, Wiktionary, and word alignments. The core idea is to use a small amount of supervision, in the form of a tag dictionary, to acquire POS distributions for each word (both known and unknown) and provide these as an improved initialization for EM learning for HMM. We find this strategy to work very well, especially when we have a small tag dictionary. 
Label propagation provides a better initialization for the EM-HMM method because it greatly increases the coverage of the dictionary. In addition, label propagation is flexible enough to incorporate many kinds of knowledge. However, results also show that some resources, such as the word alignments, are not easily exploited with label propagation. / text Chinese part-of-speech tagging Hidden Markov model Expectation maximization Label propagation
295	Diagnostics and Generalizations for Parametric State Estimation Nearing, Grey Stephen January 2013 (has links) This dissertation is comprised of a collection of five distinct research projects which apply, evaluate and extend common methods for land surface data assimilation. The introduction of novel diagnostics and extensions of existing algorithms is motivated by an example, related to estimating agricultural productivity, of failed application of current methods. We subsequently develop methods, based on Shannon's theory of communication, to quantify the contributions from all possible factors to the residual uncertainty in state estimates after data assimilation, and to measure the amount of information contained in observations which is lost due to erroneous assumptions in the assimilation algorithm. Additionally, we discuss an appropriate interpretation of Shannon information which allows us to measure the amount of information contained in a model, and use this interpretation to measure the amount of information introduced during data assimilation-based system identification. Finally, we propose a generalization of the ensemble Kalman filter designed to alleviate one of the primary assumptions - that the observation function is linear. Data Assimilation Hidden Markov Models Information Theory Remote Sensing Soil Moisture Hydrology Bayesian Analysis
296	Πρόγραμμα αυτόματης εναρμόνισης μελωδίας [Automatic melody harmonization program] Σφυράκης, Χαράλαμπος 22 January 2009 (has links) In this diploma thesis, a Java program is developed that automatically harmonizes a monophonic or polyphonic melody supplied to the system as a MIDI file. The core technique is the hidden Markov model. We introduce several refinements that incorporate music-theory knowledge into the hidden Markov models. Experimental results show higher performance in chord recognition than the initial approach. Harmonization Melody Artificial intelligence Markov 780.285 Chord recognition Hidden Markov model Midi Genre
297	Σχεδίαση και υλοποίηση συστήματος αξιολόγησης της δομής και του περιεχομένου ιστότοπων για κινητές συσκευές [Design and implementation of a system for evaluating the structure and content of web sites for mobile devices] Στεφανής, Βασίλειος 12 February 2008 (has links) In recent years, access to the Web has extended beyond desktop PCs to mobile phones, PDAs, and mobile devices of every kind. In developing countries, the number of users who browse the Web from mobile devices is in fact larger than the number who browse from desktop PCs. Creating web content has also become easier, thanks to many tools that promise fast and easy content production without requiring special knowledge from their users. The question is which characteristics web sites and their content should have in order to offer the best browsing experience to users of mobile devices. The World Wide Web Consortium (W3C) has compiled the practices that should be followed to present Web content properly on mobile devices (Mobile Web Best Practices). Conforming to these practices is necessary mainly because of the limitations of mobile devices: small screen size, the text input method, available memory, low computational power, data transfer speed, and battery life. The W3C itself has mapped these practices to a set of tests on the structure and content of a web page. Web pages that pass the tests provide an acceptable browsing experience for users of mobile devices. Some of the practices define tests that are machine-verifiable, while others define tests that also require human judgment. 
In this thesis, the W3C Mobile Web Best Practices are first presented and analyzed. Then a system for evaluating the structure and content of mobile web sites is designed and implemented. The system analyzes a web site, crawls its web pages, and checks every page against the W3C tests. Its final goal is to produce a report and a rating for the web site as a whole. Particular emphasis is given to crawling and evaluating pages and content that belong to the "hidden web". Finally, users of the system may assign importance weights to each W3C test. Evaluation Hidden web 025.04 Mobile OK Mobile web
298	Bringing Childhood Health into Focus: Incorporating Survivors into Standard Methods of Investigation Holland, Emily 09 January 2014 (has links) The osteological paradox addresses how well interpretations of past population health generated from human skeletal remains reflect the health of the living population from which they were drawn. Selective mortality and hidden heterogeneity in frailty are particularly relevant when assessing childhood health in the past, as subadults are the most vulnerable group in a population and are therefore less likely to fully represent the health of those who survived. The ability of subadults to represent the health of those who survived is tested here by directly comparing interpretations of childhood stress based on non-survivors (subadults aged 6-20, 14 females and 9 males) to those based on retrospective analyses of survivors (adults aged 21-46, 26 females and 27 males). Non-survivors and survivors were directly matched by birth year, using the Coimbra Identified Skeletal Collection; therefore interpretations of childhood stress reflect a shared childhood. Long bone and vertebral canal growth, linear enamel hypoplasia, cribra orbitalia, porotic hyperostosis, scurvy indicators and periosteal bone reactions were assessed for both groups. Overall, long bone growth generates the same interpretation of health for both non-survivors and survivors, and both groups exhibit the same range of stress (mild to severe), but the pattern of stress experienced in childhood differs between the two groups. Female survivors reveal different timing of stress episodes and a higher degree of stress than female non-survivors. Male survivors exhibit less stress than male non-survivors. 
These different patterns suggest that interpretations based solely on non-survivors would under-represent the stress experienced by female survivors and over-represent the stress experienced by male survivors, further demonstrating the importance of addressing issues of selective mortality. In addition, these different patterns suggest that hidden heterogeneity of frailty may be sex specific where males are more vulnerable to stress and females more able to develop resistance to stress and survive. osteological paradox sex differentials child health skeletal stress indicators selective mortality hidden heterogeneity of frailty 0327
300	A Framework for Discovery and Diagnosis of Behavioral Transitions in Event-streams Akhlaghi, Arash 18 December 2013 (has links) Data stream mining techniques can be used to track user behaviors as users attempt to achieve their goals. Quality metrics over stream-mined models identify potential changes in user goal attainment. When the quality of some data-mined models, as defined by the quality metrics, varies significantly from that of nearby models, the user’s behavior is automatically flagged as a potentially significant behavioral change. Decision tree, sequence pattern, and hidden Markov modeling are used in this study. These three types of modeling can expose different aspects of a user’s behavior. With decision tree modeling, the specific changes in user behavior can be characterized automatically by differencing the data-mined decision-tree models. Sequence pattern modeling can shed light on how the user changes his sequence of actions, and hidden Markov modeling can identify the learning transition points. This research describes how model-quality monitoring and these three types of modeling, as a generic framework, can aid recognition and diagnosis of behavioral changes in a case study of cognitive rehabilitation via emailing. The data stream mining techniques mentioned are used to monitor patient goals as part of a clinical plan to aid cognitive rehabilitation. In this context, real-time data mining aids clinicians in tracking user behaviors as they attempt to achieve their goals. This generic framework can be widely applied to other real-time, data-intensive analysis problems. To illustrate this, similar hidden Markov modeling is used to analyze transactional behavior at a telecommunication company for fraud detection. Fraud can similarly be considered a potentially significant change in transaction behavior. Real-time data mining Sequence pattern Max motif Hidden Markov model Learning Fraud detection
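The flagging idea in the abstract above (a model whose quality differs markedly from that of nearby models marks a possible behavioral change) can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's framework: the function name `flag_transitions`, the choice of comparing each score with the mean of the preceding `window` scores, the threshold of 0.25, and the sample scores are all hypothetical.

```python
def flag_transitions(quality, window=3, threshold=0.25):
    """Return indices of models whose quality score differs from the mean
    of the preceding `window` scores by more than `threshold`.

    `quality` holds one score per stream-mined model, in time order
    (hypothetical representation chosen for this sketch).
    """
    flagged = []
    for i in range(window, len(quality)):
        baseline = sum(quality[i - window:i]) / window
        if abs(quality[i] - baseline) > threshold:
            flagged.append(i)
    return flagged


# Made-up scores: quality is stable near 0.90, then drops to about 0.55.
scores = [0.90, 0.91, 0.89, 0.90, 0.55, 0.56, 0.54, 0.55]
transitions = flag_transitions(scores)
print(transitions)  # [4]
```

Only the first model after the drop is flagged here, because later models are compared against a baseline that already includes the lower scores.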
