21

Análise de Sobrevivência na Modelagem do Tempo de Vida de Redes de Sensores sem Fio / Modeling Wireless Sensor Network Lifetime Using Survival Analysis

Rodrigo Teles Hermeto 27 August 2014 (has links)
Coordenação de Aperfeiçoamento de Nível Superior / Wireless Sensor Networks (WSNs) are examples of Resource-Constrained Networks (RCNs), in which processing, storage, and energy resources are limited. From the moment a typical WSN goes into operation, its sensor nodes perform sensing, processing, and communication tasks, drawing energy from their sources (e.g., batteries) until the charge falls below the minimum needed to keep them running; this interval is known as the network lifetime, and its end is characterized as the death of the devices and, consequently, of the network. Estimating a priori the probabilistic/stochastic structure of a WSN's lifetime before deployment provides the means to devise maintenance strategies that maximize that lifetime and to ensure that the network survives long enough to accomplish its goal. This dissertation therefore applies the Exponential, Weibull, and Log-Normal models, commonly used in Survival Analysis, to estimate the survival time of a real network from the lifetimes of its nodes observed in simulation. Our underlying hypothesis is that Survival Analysis can improve the accuracy of WSN lifetime estimates and, consequently, the network's operational planning. We propose answers to three questions that remain open in the literature: (i) how many sensor nodes will fail during the lifetime of a WSN; (ii) in which time interval most of the nodes will fail; and (iii) for how long the network will remain operational.
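As a rough illustration of the survival-analysis step described in this abstract (a sketch, not the author's implementation): the snippet below fits the three parametric models to synthetic node lifetimes with scipy, picks the best fit by log-likelihood, and uses the survival function to estimate how many deployed nodes are still alive at a given time. The lifetimes, node count, and time horizon are invented for the example.

```python
# Minimal sketch: fit parametric survival models to simulated node lifetimes
# and estimate how many of N deployed nodes are expected to fail by time t.
# The lifetimes below are synthetic placeholders, not the dissertation's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
lifetimes = rng.weibull(2.0, size=200) * 1000.0   # hours, synthetic

candidates = {
    "exponential": stats.expon,
    "weibull": stats.weibull_min,
    "log-normal": stats.lognorm,
}

fits = {}
for name, dist in candidates.items():
    params = dist.fit(lifetimes, floc=0)          # fix the location at 0
    loglik = np.sum(dist.logpdf(lifetimes, *params))
    fits[name] = (params, loglik)
    print(f"{name:12s} log-likelihood = {loglik:.1f}")

# Pick the best-fitting model by log-likelihood (AIC would also work here).
best = max(fits, key=lambda k: fits[k][1])
params, _ = fits[best]
dist = candidates[best]

N, t = 100, 500.0                                  # deployed nodes, time in hours
survival = dist.sf(t, *params)                     # S(t) = P(node alive at t)
print(f"best model: {best}")
print(f"expected nodes still alive at t={t:.0f}h: {N * survival:.1f}")
print(f"expected failures by t={t:.0f}h: {N * (1 - survival):.1f}")
```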
22

Evaluation of GUI testing techniques for system crashing: from real to model-based controlled experiments

BERTOLINI, Cristiano 31 January 2010 (has links)
Conselho Nacional de Desenvolvimento Científico e Tecnológico / Mobile applications are becoming increasingly complex, and so is testing them. Graphical user interface (GUI) testing is a current trend and is generally performed by simulating user interactions. Several techniques have been proposed, for which efficiency (execution cost) and effectiveness (the ability to find bugs) are the aspects most valued by industry. However, more systematic evaluations are needed to identify which techniques improve efficiency and effectiveness. This thesis presents an experimental evaluation of two GUI testing techniques, called DH and BxT, used to test mobile applications with a history of real faults. These techniques run for a long period of time (a 40-hour timeout, for example), trying to identify the critical situations that drive the system into an unexpected state from which it may not be able to continue normal execution; this situation is called a crash state. The DH technique already existed and is used by the software industry; we propose another, called BxT. In a preliminary evaluation we compared the effectiveness and efficiency of DH and BxT through a descriptive analysis and showed that the systematic exploration performed by BxT is a more promising approach for detecting failures in mobile applications. Based on these preliminary results, we planned and executed a controlled experiment to obtain statistical evidence about their efficiency and effectiveness. Since both techniques are bounded by the 40-hour timeout, the controlled experiment yields only partial results, so we carried out a deeper investigation using survival analysis, which estimates the crash probability of an application under both DH and BxT. Because controlled experiments are costly, we propose a strategy based on computational experiments using the PRISM language and its model checker to compare GUI testing techniques in general, and DH and BxT in particular. However, the results for DH and BxT have a limitation: the accuracy of the model is not statistically validated. We therefore propose a strategy that uses the earlier survival analysis results to calibrate our models. Finally, we use this strategy, with the calibrated models, to evaluate a new GUI testing technique called Hybrid-BxT (or simply H-BxT), which is a combination of DH and BxT.
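Because runs that hit the 40-hour timeout end without a crash, the observed crash times are right-censored, which is exactly where survival analysis enters. The sketch below is a generic, hand-rolled Kaplan-Meier estimate of the crash probability on synthetic crash times; it is not the thesis's experimental data, its DH/BxT tooling, or its PRISM models.

```python
# Minimal sketch of the censoring issue described above: a test run either
# crashes at some time or is cut off by the 40-hour timeout (right-censored).
# A Kaplan-Meier estimator then gives the crash probability over time.
# The times below are synthetic placeholders, not the thesis's measurements.
import numpy as np

TIMEOUT_H = 40.0
rng = np.random.default_rng(0)
raw = rng.exponential(scale=30.0, size=50)         # hypothetical crash times (hours)
observed = np.minimum(raw, TIMEOUT_H)              # what we actually see
crashed = raw < TIMEOUT_H                          # False -> censored by the timeout

def kaplan_meier(times, events):
    """Return (event_times, S(t)) for right-censored data."""
    order = np.argsort(times)
    times, events = times[order], events[order]
    n_at_risk = len(times)
    surv, out_t, out_s = 1.0, [], []
    for t, e in zip(times, events):
        if e:                                      # a crash happened at time t
            surv *= 1.0 - 1.0 / n_at_risk
            out_t.append(t)
            out_s.append(surv)
        n_at_risk -= 1                             # crash or censoring leaves the risk set
    return np.array(out_t), np.array(out_s)

t, s = kaplan_meier(observed, crashed)

def crash_prob_by(hours):
    idx = np.searchsorted(t, hours, side="right") - 1
    return 1.0 - (s[idx] if idx >= 0 else 1.0)

print(f"estimated P(crash within 20h) = {crash_prob_by(20.0):.2f}")
print(f"estimated P(crash within 40h) = {crash_prob_by(40.0):.2f}")
```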
23

Combining Probabilistic and Discrete Methods for Sequence Modelling

Gudjonsen, Ludvik January 1999 (has links)
Sequence modelling is used to analyse newly sequenced proteins, giving an indication of their 3-D structure and functionality. Current approaches to modelling protein families are based on either discrete or probabilistic methods. Here we present a way of combining the two in a hybrid model, where discrete patterns model conserved regions and probabilistic models are used for variable regions. When hidden Markov models are used to model the variable regions, the hybrid method gives increased classification accuracy compared to purely discrete or purely probabilistic models.
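A toy sketch of the hybrid idea follows: a discrete pattern (here a regular expression) anchors the conserved region, and a small hidden Markov model scores the variable region via the forward algorithm. The motif, the two-state HMM, its probabilities, and the acceptance threshold are all invented for illustration and are not the models built in the thesis.

```python
# Minimal sketch of the hybrid idea: a discrete pattern (regex) checks the
# conserved region, and a small HMM scores a reduced-alphabet view of the
# variable region. All models and thresholds are illustrative only.
import re
import numpy as np

MOTIF = re.compile(r"C..C")                 # hypothetical conserved pattern

# Toy 2-state HMM over a reduced alphabet {h: hydrophobic, p: polar}
states = ["H", "P"]
start = np.log([0.5, 0.5])
trans = np.log([[0.8, 0.2],
                [0.3, 0.7]])
emit = {"H": np.log([0.9, 0.1]),            # P(h|H), P(p|H)
        "P": np.log([0.2, 0.8])}            # P(h|P), P(p|P)
sym_index = {"h": 0, "p": 1}

def forward_loglik(seq):
    """Log-likelihood of a reduced-alphabet sequence under the toy HMM."""
    alpha = start + np.array([emit[s][sym_index[seq[0]]] for s in states])
    for c in seq[1:]:
        e = np.array([emit[s][sym_index[c]] for s in states])
        alpha = e + np.logaddexp(alpha[0] + trans[0], alpha[1] + trans[1])
    return np.logaddexp(alpha[0], alpha[1])

def classify(protein, reduced):
    """Accept only if the conserved motif matches AND the variable region
    scores above a (hypothetical) log-likelihood threshold."""
    if not MOTIF.search(protein):
        return False
    return forward_loglik(reduced) > -15.0   # threshold chosen for illustration

print(classify("MKVCAHCLLK", "hhphpphhph"))
```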
24

Normal Factor Graphs

Al-Bashabsheh, Ali January 2014 (has links)
This thesis introduces normal factor graphs under a new semantics, namely, the exterior function semantics. Initially, this work was motivated by two distinct lines of research. One line is ``holographic algorithms,'' a powerful approach introduced by Valiant for solving various counting problems in computer science; the other is ``normal graphs,'' an elegant framework proposed by Forney for representing codes defined on graphs. The nonrestrictive normality constraint enables the notion of holographic transformations for normal factor graphs. We establish a theorem, called the generalized Holant theorem, which relates a normal factor graph to its holographic transformation. We show that the generalized Holant theorem on one hand underlies the principle of holographic algorithms, and on the other reduces to a general duality theorem for normal factor graphs, a special case of which was first proved by Forney. As an application beyond Forney's duality, we show that normal factor graph duality facilitates the approximation of the partition function for the two-dimensional nearest-neighbor Potts model. In the course of our development, we formalize a new semantics for normal factor graphs which highlights various linear algebraic properties that enable the use of normal factor graphs as a linear algebraic tool. Indeed, we demonstrate the ability of normal factor graphs to encode several concepts from linear algebra and present normal factor graphs as a generalization of ``trace diagrams.'' We illustrate, with examples, the workings of this framework and how several identities from linear algebra may be obtained using a simple graphical manipulation procedure called ``vertex merging/splitting.'' We also discuss translation association schemes with the aid of normal factor graphs, which we believe provides a simple approach to understanding the subject. Further, under the new semantics, normal factor graphs provide a probabilistic model that unifies several graphical models such as factor graphs, convolutional factor graphs, and cumulative distribution networks.
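As a rough numerical illustration of the exterior-function view (a sketch, not the thesis's formalism in full): in a normal factor graph every internal variable lies on an edge between exactly two factors, so evaluating the exterior function is a tensor contraction over the internal edges, and a holographic transformation inserts a matrix and its inverse on an edge without changing the result. The factor values below are invented.

```python
# Minimal sketch: three factor nodes joined in a triangle by internal
# (degree-2) edge variables a, b, c, so normality holds: f(a,b), g(b,c), h(c,a).
import numpy as np

f = np.array([[2.0, 1.0], [1.0, 3.0]])
g = np.array([[1.0, 0.5], [0.5, 1.0]])
h = np.array([[1.0, 2.0], [2.0, 1.0]])

# With no external edges the exterior function is a single number, the
# partition function Z = sum_{a,b,c} f(a,b) g(b,c) h(c,a):
Z = np.einsum("ab,bc,ca->", f, g, h)
Z_check = sum(f[a, b] * g[b, c] * h[c, a]
              for a in range(2) for b in range(2) for c in range(2))
print(Z, Z_check)

# A holographic transformation inserts T on one side of an internal edge and
# T^{-1} on the other; the exterior function is unchanged.
T = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
f_t = f @ T                        # transform f's b-index
g_t = np.linalg.inv(T) @ g         # compensate on g's b-index
print(np.einsum("ab,bc,ca->", f_t, g_t, h))   # equals Z up to float error
```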
25

Probabilistic Models for Spatially Aggregated Data / 空間集約データのための確率モデル

Tanaka, Yusuke 23 March 2020 (has links)
京都大学 / 0048 / 新制・課程博士 / 博士(情報学) / 甲第22586号 / 情博第723号 / 新制||情||124(附属図書館) / 京都大学大学院情報学研究科システム科学専攻 / (主査)教授 田中 利幸, 教授 石井 信, 教授 下平 英寿 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM
26

Computer-Aided Synthesis of Probabilistic Models

Andriushchenko, Roman January 2020 (has links)
This thesis addresses the problem of automated synthesis of probabilistic systems: given a family of Markov chains, how can we efficiently identify the member that satisfies a given specification? Such families frequently arise in various areas of engineering when modeling systems under uncertainty, and deciding even the simplest synthesis questions is an NP-hard problem. In this work we investigate existing techniques based on counterexample-guided inductive synthesis (CEGIS) and counterexample-guided abstraction refinement (CEGAR), and we propose a novel integrated method for probabilistic synthesis. Experiments on relevant models demonstrate that the proposed technique is not only comparable to state-of-the-art methods but in most cases significantly outperforms existing approaches, sometimes by several orders of magnitude.
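To make the synthesis question concrete, here is a brute-force baseline sketch, not the CEGIS/CEGAR or integrated method of the thesis and not its tool chain: a small family of Markov chains indexed by a single probability p, each member checked against a reachability specification by solving the standard linear system for absorption probabilities.

```python
# Minimal sketch of the synthesis question: which members of a finite family
# of Markov chains satisfy "probability of reaching the target state >= 0.7"?
# Plain enumeration with one linear solve per member; all numbers are invented.
import numpy as np

def reach_prob(P, target, start=0):
    """P(eventually reach absorbing state `target` from `start`)."""
    n = P.shape[0]
    absorbing = [s for s in range(n) if np.isclose(P[s, s], 1.0)]
    transient = [s for s in range(n) if s not in absorbing]
    Q = P[np.ix_(transient, transient)]           # transient -> transient block
    r = P[np.ix_(transient, [target])].ravel()    # one-step jumps into target
    x = np.linalg.solve(np.eye(len(transient)) - Q, r)
    return x[transient.index(start)]

# Family indexed by p: from state 0, reach the target (state 1) with
# probability p, fall into a trap (state 2) with probability 0.1, retry otherwise.
SPEC_THRESHOLD = 0.7                               # "P(reach target) >= 0.7"
for p in [0.1, 0.2, 0.3, 0.4, 0.5]:
    P = np.array([[1.0 - p - 0.1, p,   0.1],
                  [0.0,           1.0, 0.0],
                  [0.0,           0.0, 1.0]])
    prob = reach_prob(P, target=1)
    verdict = "satisfies" if prob >= SPEC_THRESHOLD else "violates"
    print(f"p={p:.1f}: P(reach target)={prob:.3f} -> {verdict} the specification")
```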
27

Can Knowledge Rich Sentences Help Language Models To Solve Common Sense Reasoning Problems?

January 2019 (has links)
abstract: The significance of real-world knowledge for Natural Language Understanding (NLU) has been recognized for decades. With advances in technology, challenging tasks such as question answering, text summarization, and machine translation have become possible through continuous efforts in the field of Natural Language Processing (NLP). Yet integrating knowledge to answer common-sense questions remains a daunting task. Logical reasoning has been a resort for many problems in NLP and has achieved considerable results, but it is difficult to resolve the ambiguities of natural language. Coreference resolution is one problem where ambiguity arises from the semantics of a sentence; another is cause-and-result statements, which require causal commonsense reasoning to resolve the ambiguity. Modeling these types of problems is not a simple task with rules or logic. State-of-the-art systems addressing these problems use a trained neural network model, which is claimed to capture general knowledge from a huge training corpus, and they answer questions using the knowledge embedded in that trained language model. Although language models embed knowledge from the data, they rely on word occurrences and the frequency of co-occurring words to resolve the prevailing ambiguity. This limits their performance on common-sense reasoning tasks, because they generalize the concept rather than answering the problem specific to its context. For example, "The painting in Mark's living room shows an oak tree. It is to the right of a house" is a coreference resolution problem that requires knowledge. Language models can resolve whether "it" refers to "painting" or "tree": since "house" and "tree" commonly co-occur, the models resolve "tree" as the referent. On the other hand, "The large ball crashed right through the table. Because it was made of Styrofoam." is difficult for a language model, since resolving "it" to either "table" or "ball" requires more information about the problem. In this work, I have built an end-to-end framework that uses knowledge extracted automatically for each problem. This knowledge is combined with the language models through an explicit reasoning module to resolve the ambiguity. The system is built to improve the accuracy of language-model-based approaches to commonsense reasoning, and it achieves state-of-the-art accuracy on the Winograd Schema Challenge. / Dissertation/Thesis / Masters Thesis Computer Science 2019
28

DROUGHT CHARACTERIZATION USING PROBABILISTIC MODELS

Ganeshchandra Mallya (5930027) 23 June 2020 (has links)
Droughts are complex natural disasters caused by a deficit in water availability over a region. Water availability is strongly linked to precipitation in many parts of the world that rely on monsoonal rains. Recent studies indicate that the choice of precipitation dataset and drought index can influence drought analysis. Therefore, drought characteristics for the Indian monsoon region were reassessed for the period 1901-2004 using two different datasets and the standardized precipitation index (SPI), the standardized precipitation-evapotranspiration index (SPEI), a Gaussian mixture model-based drought index (GMM-DI), and a hidden Markov model-based drought index (HMM-DI). Drought trends and variability were analyzed for three epochs: 1901-1935, 1936-1970 and 1971-2004. Irrespective of the dataset and methodology used, the results indicate an increasing trend in drought severity and frequency during the recent decades (1971-2004). Droughts are becoming more regional and show a general shift toward the agriculturally important coastal south India, central Maharashtra, and Indo-Gangetic plains, indicating food-security challenges and socioeconomic vulnerability in the region.

Drought severities are commonly reported using drought classes obtained by applying pre-defined thresholds to drought indices. Current drought classification methods ignore modeling uncertainties and provide discrete classifications. However, users of drought classifications are often interested in knowing the inherent uncertainties so that they can make informed decisions. A probabilistic Gamma mixture model (Gamma-MM)-based drought index is proposed as an alternative to deterministic classification by SPI. The Bayesian framework of the proposed model avoids over-specification and overfitting by choosing the optimum number of mixture components required to model the data, a problem often encountered in other probabilistic drought indices (e.g., HMM-DI). When a sufficient number of components is used, the Gamma-MM can approximate any continuous distribution on (0, infinity), thus addressing the problem of choosing an appropriate distribution for SPI analysis. The Gamma-MM propagates model uncertainties to drought classification. The method is tested on rainfall data over India, and a comparison of the results with standard SPI shows significant differences, particularly when the SPI assumptions on the data distribution are violated.

Finding regions with similar drought characteristics is useful for policy-makers and water resources planners in the optimal allocation of resources, the development of drought management plans, and timely action to mitigate negative impacts during droughts. Drought characteristics such as intensity, frequency, and duration, along with land-use and geographic information, were used as input features for clustering algorithms. Three methods, namely (i) a Bayesian graph-cuts algorithm that combines the Gaussian mixture model (GMM) and Markov random fields (MRF), (ii) k-means, and (iii) hierarchical agglomerative clustering, were used to find homogeneous drought regions that are spatially contiguous and possess similar drought characteristics. The number of homogeneous clusters and their shapes were found to be sensitive to the choice of drought index, the time window of drought, the period of analysis, the dimensionality of the input datasets, the clustering method, and the model parameters of the clustering algorithms. Regionalization for different epochs provided useful insight into the space-time evolution of homogeneous drought regions over the study area, and strategies to combine the results from multiple clustering methods are presented.
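As a rough illustration of the SPI computation that the indices above build on (a generic sketch with synthetic rainfall, not the dissertation's data, nor its GMM-DI, HMM-DI, or Gamma-MM formulations): fit a gamma distribution to precipitation totals, push each value through the fitted CDF, and convert to standard-normal quantiles; strongly negative values flag drought.

```python
# Minimal SPI sketch: fit a gamma distribution to accumulated precipitation,
# apply the probability-integral transform, and map to standard-normal
# quantiles. Negative SPI indicates drier-than-usual periods.
# Rainfall values are synthetic placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
precip = rng.gamma(shape=2.0, scale=40.0, size=360)   # monthly totals, mm (synthetic)

# Fit gamma with the location fixed at zero, as is typical for SPI.
a, loc, scale = stats.gamma.fit(precip, floc=0)

# Probability-integral transform, then standard-normal quantiles.
cdf = stats.gamma.cdf(precip, a, loc=loc, scale=scale)
cdf = np.clip(cdf, 1e-6, 1 - 1e-6)                    # keep ppf finite
spi = stats.norm.ppf(cdf)

# A common classification threshold: SPI <= -1.0 flags moderate-or-worse drought.
drought_months = np.sum(spi <= -1.0)
print(f"months flagged as drought: {drought_months} of {spi.size}")
print(f"worst month SPI: {spi.min():.2f}")
```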
29

Combining machine learning and evolution for the annotation of metagenomics data / La combinaison de l'apprentissage statistique et de l'évolution pour l'annotation des données métagénomiques

Ugarte, Ari 16 December 2016 (has links)
Metagenomics studies microbial communities by analyzing DNA extracted directly from environmental samples, and it makes it possible to build a very large catalog of the genes present in those communities. This catalog must be compared against genes already referenced in databases in order to find similar sequences and thus determine the function of the sequences it contains. In this thesis we developed MetaCLADE, a new methodology that improves the detection of already-referenced protein domains in metagenomic and metatranscriptomic sequences. To develop MetaCLADE, we modified a protein-domain annotation system developed within the Laboratory of Computational and Quantitative Biology, called CLADE (CLoser sequences for Annotations Directed by Evolution) [17]. In general, methods for protein-domain annotation characterize known domains with probabilistic models. These probabilistic models, called Sequence Consensus Models (SCMs), are built from an alignment of homologous sequences belonging to different phylogenetic clades and represent the consensus at each position of the alignment. However, when the sequences forming the set of homologs are very divergent, the signals of the SCMs become too weak to be identified and the annotation fails. To address this problem of annotating highly divergent domains, we used an approach based on the observation that many functional and structural constraints of a protein are not conserved globally across all species but may be conserved locally within clades. The approach therefore enlarges the catalog of probabilistic models by creating new models that emphasize the characteristics specific to each clade. MetaCLADE, a tool designed to annotate sequences from metagenomic and metatranscriptomic experiments with high precision, uses this library to find matches between the models and a database of metagenomic or metatranscriptomic sequences. It then applies a pre-computed filtering step that determines the probability that a prediction is a true hit; this step is a learning process that takes the fragmentation of metagenomic sequences into account in order to classify them. We have shown that this multi-source approach, combined with a fragmentation-aware meta-learning strategy, achieves very high performance and outperforms current methods.
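The sketch below only illustrates the "library of clade-specific models, keep the best match" idea: toy position-specific log-odds matrices stand in for the much richer Sequence Consensus Models, and the score threshold is hypothetical; neither MetaCLADE's models nor its fragmentation-aware filtering step are reproduced.

```python
# Minimal sketch of a clade-specific model library: the same domain is
# represented by several toy log-odds matrices (stand-ins for SCMs), a query
# fragment is scored against all of them, and the best-scoring model decides
# the annotation. Matrices, sequences, and the threshold are illustrative only.
import numpy as np

ALPHABET = "ACDE"                       # tiny toy alphabet
IDX = {aa: i for i, aa in enumerate(ALPHABET)}

def log_odds(counts, background=0.25, pseudo=1.0):
    counts = np.asarray(counts, dtype=float) + pseudo
    probs = counts / counts.sum(axis=1, keepdims=True)
    return np.log2(probs / background)

# One model per clade for the same 4-column "domain" (hypothetical counts).
clade_models = {
    "clade_A": log_odds([[9, 1, 0, 0], [0, 8, 1, 1], [1, 0, 9, 0], [0, 1, 0, 9]]),
    "clade_B": log_odds([[2, 6, 1, 1], [6, 2, 1, 1], [1, 1, 2, 6], [1, 1, 6, 2]]),
}

def best_hit(window):
    """Score a length-4 window against every clade model; return the best."""
    scores = {name: sum(m[i, IDX[aa]] for i, aa in enumerate(window))
              for name, m in clade_models.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

THRESHOLD = 4.0                          # hypothetical acceptance cutoff
for fragment in ["ACDE", "CADC", "EEEE"]:
    clade, score = best_hit(fragment)
    verdict = "annotated" if score >= THRESHOLD else "rejected"
    print(f"{fragment}: best={clade} score={score:.2f} -> {verdict}")
```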
30

A PROBABILISTIC APPROACH TO DATA INTEGRATION IN BIOMEDICAL RESEARCH: THE IsBIG EXPERIMENTS

Anand, Vibha 16 March 2011 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Biomedical research has produced vast amounts of new information in the last decade but has been slow to find its use in clinical applications. Data from disparate sources, such as genetic studies and summary data from published literature, have been amassed, but there is a significant gap, primarily due to a lack of normative methods, in combining such information for inference and knowledge discovery. In this research, a probabilistic framework based on Bayesian Networks (BN) is built to address this gap. BN are a relatively new method of representing uncertain relationships among variables using probabilities and graph theory. Despite the computational complexity of inference in them, BN represent domain knowledge concisely. In this work, strategies using BN have been developed to incorporate a range of available information, from both raw data sources and statistical and summary measures, in a coherent framework. As an example of this framework, a prototype model (In-silico Bayesian Integration of GWAS, or IsBIG) has been developed. IsBIG integrates summary and statistical measures from the NIH catalog of genome-wide association studies (GWAS) and the database of human genome variations from the international HapMap project, and it produces a map of disease-to-disease associations as inferred from genetic linkages in the population. Quantitative evaluation of the IsBIG model shows correlation with empirical results from our Electronic Medical Record (EMR) system, the Regenstrief Medical Record System (RMRS). Only a small fraction of disease-to-disease associations in the population can be explained by linking genetic variations to disease associations as studied in GWAS. Nonetheless, the model appears to have found novel associations among some diseases that are not described in the literature but are confirmed in our EMR. In conclusion, our results demonstrate the potential of a probabilistic modeling approach for combining data from disparate sources for inference and knowledge discovery in biomedical research.
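As a rough illustration of how a Bayesian network can surface a disease-to-disease association through a shared genetic variant (a toy sketch with invented probabilities, not IsBIG's learned model): two diseases depend on the same variant, and inference by enumeration shows that observing one disease raises the probability of the other.

```python
# Minimal sketch: two diseases linked only through a shared genetic variant in
# a tiny Bayesian network, with inference by full enumeration.
# Structure: Variant -> Disease1, Variant -> Disease2. All probabilities are
# invented for illustration.
from itertools import product

P_VARIANT = 0.2                                   # P(carrier)
P_D1 = {True: 0.30, False: 0.05}                  # P(disease1 | variant status)
P_D2 = {True: 0.25, False: 0.04}                  # P(disease2 | variant status)

def joint(v, d1, d2):
    pv = P_VARIANT if v else 1.0 - P_VARIANT
    pd1 = P_D1[v] if d1 else 1.0 - P_D1[v]
    pd2 = P_D2[v] if d2 else 1.0 - P_D2[v]
    return pv * pd1 * pd2

def prob(d2_value, given_d1=None):
    """P(Disease2 = d2_value [| Disease1 = given_d1]) by enumeration."""
    num = den = 0.0
    for v, d1, d2 in product([True, False], repeat=3):
        p = joint(v, d1, d2)
        if given_d1 is not None and d1 != given_d1:
            continue
        den += p
        if d2 == d2_value:
            num += p
    return num / den

print(f"P(D2)      = {prob(True):.4f}")
print(f"P(D2 | D1) = {prob(True, given_d1=True):.4f}")   # raised via the shared variant
```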
