  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Bayesian Generative Modeling of Complex Dynamical Systems

Guan, Jinyan January 2016
This dissertation presents a Bayesian generative modeling approach to complex dynamical systems, applied to emotion-interaction patterns in multivariate data collected in social psychology studies. While social psychologists have used dynamical models to study complex psychological and behavioral patterns in recent years, most of these studies have been limited by their use of regression methods to fit the model parameters from noisy observations. These regression methods mostly rely on estimates of derivatives computed from the noisy observations, so they easily overfit and fail to predict future outcomes. A Bayesian generative model addresses the problem by integrating prior knowledge of where the data come from with the observed data through posterior distributions. It allows theoretical ideas and mathematical models to be developed independently of inference concerns. In addition, Bayesian generative statistical modeling allows a model to be evaluated on its predictive power rather than on the reduction in residual error used in regression methods, which helps prevent overfitting in social psychology data analysis. In the proposed Bayesian generative modeling approach, this dissertation uses the State Space Model (SSM) to model the dynamics of emotion interactions. Specifically, it tests the approach on a class of psychological models aimed at explaining the emotional dynamics of interacting couples in committed relationships. The latent states of the SSM are continuous real numbers that represent the level of the true emotional states of both partners. One can obtain the latent states at all subsequent time points by evolving a differential equation (typically a coupled linear oscillator (CLO)) forward in time from a known initial state at the starting time. The multivariate observed states include self-reported emotional experiences and physiological measurements of both partners during the interactions.
To test whether well-being factors, such as body weight, can help to predict emotion-interaction patterns, we construct functions that determine the prior distributions of the CLO parameters of individual couples based on existing emotion theories. Besides, we allow a single latent state to generate multivariate observations and learn the group-shared coefficients that specify the relationship between the latent states and the multivariate observations. Furthermore, we model the nonlinearity of the emotional interaction by allowing smooth changes (drift) in the model parameters. By restricting the stochasticity to the parameter level, the proposed approach models the dynamics in longer periods of social interactions assuming that the interaction dynamics slowly and smoothly vary over time. The proposed approach achieves this by applying Gaussian Process (GP) priors with smooth covariance functions to the CLO parameters. Also, we propose to model the emotion regulation patterns as clusters of the dynamical parameters. To infer the parameters of the proposed Bayesian generative model from noisy experimental data, we develop a Gibbs sampler to learn the parameters of the patterns using a set of training couples. To evaluate the fitted model, we develop a multi-level cross-validation procedure for learning the group-shared parameters and distributions from training data and testing the learned models on held-out testing data. During testing, we use the learned shared model parameters to fit the individual CLO parameters to the first 80% of the time points of the testing data by Monte Carlo sampling and then predict the states of the last 20% of the time points. By evaluating models with cross-validation, one can estimate whether complex models are overfitted to noisy observations and fail to generalize to unseen data. 
We test our approach on both synthetic data generated by the generative model and real data collected in multiple social psychology experiments. The proposed approach has the potential to model other complex behavior, since the generative model is not restricted to a particular form of the underlying dynamics.
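As a rough illustration of the forward model described in this abstract, the sketch below evolves a coupled linear oscillator for two partners from a known initial state, emits noisy observations, and splits the trajectory into the first 80% (for fitting) and the last 20% (for prediction). The particular CLO parameterization (frequency, damping, and coupling terms), the parameter values, and the names `clo_derivatives`, `rk4_step`, and `simulate_clo` are assumptions made for this example. The dissertation's actual equations, priors, and Gibbs sampler are not reproduced here.

```python
import random

def clo_derivatives(state, params):
    """Time derivative of a two-partner coupled linear oscillator.

    state  = (x1, v1, x2, v2): latent emotion level and velocity per partner.
    params = dict with frequency (f), damping (d), and coupling (c) terms.
    This parameterization is an assumption made for illustration.
    """
    x1, v1, x2, v2 = state
    a1 = params["f1"] * x1 + params["d1"] * v1 + params["c1"] * (x2 - x1)
    a2 = params["f2"] * x2 + params["d2"] * v2 + params["c2"] * (x1 - x2)
    return (v1, a1, v2, a2)

def rk4_step(state, params, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = clo_derivatives(state, params)
    k2 = clo_derivatives(shift(state, k1, dt / 2), params)
    k3 = clo_derivatives(shift(state, k2, dt / 2), params)
    k4 = clo_derivatives(shift(state, k3, dt), params)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate_clo(initial_state, params, n_steps, dt=0.1, noise_sd=0.0, seed=0):
    """Evolve the latent states forward and emit noisy observations of x1, x2."""
    rng = random.Random(seed)
    state = initial_state
    latent, observed = [], []
    for _ in range(n_steps):
        latent.append(state)
        observed.append((state[0] + rng.gauss(0, noise_sd),
                         state[2] + rng.gauss(0, noise_sd)))
        state = rk4_step(state, params, dt)
    return latent, observed

# Hypothetical parameter values: restoring force, light damping, weak coupling.
params = {"f1": -1.0, "d1": -0.05, "c1": 0.2,
          "f2": -1.5, "d2": -0.05, "c2": 0.1}
latent, observed = simulate_clo((1.0, 0.0, -0.5, 0.0), params,
                                n_steps=200, noise_sd=0.1)

# First 80% of time points for fitting, last 20% held out for prediction.
split = int(0.8 * len(observed))
fit_window, holdout_window = observed[:split], observed[split:]
```

In a full treatment the CLO parameters would be inferred from `fit_window` by sampling, and the held-out window would be compared against predictions. Here the split only shows how the evaluation data would be partitioned.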
32

Métodos Bayesianos aplicados em taxonomia molecular / Bayesian methods applied in molecular taxonomy

Villanueva Talavera, Edwin Rafael 31 August 2007
This work presents two data clustering methods intended for applications in molecular taxonomy. The methods are based on probabilistic models, which overcome some problems of existing non-probabilistic clustering methods, such as the difficulty of choosing a distance metric and the lack of treatment and use of available prior knowledge. The methods use Bayes' theorem to combine the information extracted from the data with the available prior knowledge, which is why they are called Bayesian methods. The first method, hierarchical Bayesian clustering, is based on the HBC (Hierarchical Bayesian Clustering) algorithm. It is an agglomerative hierarchical method that constructs a hierarchy of partitions (a dendrogram) guided by the criterion of maximum posterior probability of each partition. The second method is based on a type of probabilistic graphical model known as a conditional Gaussian network, which was adapted for data clustering. Both methods were evaluated on three datasets with known class labels. The methods were also applied to a real problem: the taxonomy of a Brazilian collection of bacterial strains of the genus Bradyrhizobium, known for their capacity to transform atmospheric nitrogen (N2) into nitrogen compounds useful to the host plants. This dataset consists of genotypic data resulting from the analysis of ribosomal RNA. The results showed that the hierarchical Bayesian clustering method produces dendrograms of good quality, in some cases better than the best of the other hierarchical algorithms analyzed. The method based on conditional Gaussian networks also gave acceptable results, showing adequate use of the prior knowledge about the classes both to determine the optimal number of clusters and to improve the quality of the clusters.
34

Risk based life management of offshore structures and equipment

Bharadwaj, Ujjwal R. January 2010
Risk-based approaches are gaining currency as industry looks for rational, efficient, and flexible approaches to managing its structures and equipment. When applied to the inspection and maintenance of industrial assets, risk-based approaches differ from other approaches mainly in their assessment of failure in its wider context and ramifications. These advanced techniques provide more insight into the causes and avoidance of structural failure and competing risks, as well as the resources needed to manage them. Measuring risk is a challenge that is being met with state-of-the-art technology, skills, knowledge, and experience. The thesis presents risk-based approaches to solving two specific types of problem in the management of offshore structures and equipment. The first type is finding the optimum timing of an asset life management action such that financial benefit is maximised, considering the cost of the action and the risk (quantified in monetary terms) of not undertaking that action. The approach presented here is applied to managing remedial action in offshore wind farms, specifically to corroded wind turbine tower structures. The second type of problem is how to optimise resources using risk-based criteria for managing competing demands. The approach presented here is applied to stocking spares in the shipping sector, where the cost of holding spares is balanced against the risk of failing to meet demands for spares. Risk is the leitmotiv running through this thesis. The approaches discussed here will find application in a variety of situations where competing risks are being managed within constraints.
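The spares-stocking trade-off mentioned in this abstract can be sketched as a small expected-cost calculation. This is a minimal example under assumptions that do not come from the thesis: demand over the planning period is Poisson, holding cost is linear in the number of spares, and each unmet demand carries a fixed monetary penalty. All function names and numbers are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k demands under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def expected_shortage(stock, mean_demand, max_demand=100):
    """Expected number of unmet demands when `stock` spares are held.

    The sum is truncated at max_demand, which is ample for small means.
    """
    return sum((d - stock) * poisson_pmf(d, mean_demand)
               for d in range(stock + 1, max_demand + 1))

def total_expected_cost(stock, mean_demand, holding_cost, shortage_cost):
    """Holding cost plus the monetized risk of failing to meet demand."""
    return (holding_cost * stock
            + shortage_cost * expected_shortage(stock, mean_demand))

def best_stock_level(mean_demand, holding_cost, shortage_cost, max_stock=50):
    """Stock level with the lowest total expected cost (brute-force search)."""
    return min(range(max_stock + 1),
               key=lambda s: total_expected_cost(
                   s, mean_demand, holding_cost, shortage_cost))

# Hypothetical inputs: 2 demands expected, holding 1 unit costs 1,
# each unmet demand costs 20.
best = best_stock_level(mean_demand=2.0, holding_cost=1.0, shortage_cost=20.0)
```

Raising `shortage_cost` relative to `holding_cost` pushes the chosen stock level up, which is the balance between holding cost and risk that the abstract describes.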
35

Mapeamento semântico com aprendizado estatístico relacional para representação de conhecimento em robótica móvel. / Semantic mapping with statistical relational learning for knowledge representation in mobile robotics.

Corrêa, Fabiano Rogério 30 March 2009
Most maps used in navigation tasks by mobile robots represent only spatial information about the environment. Other kinds of information, which could be obtained from the robot's sensors and incorporated into the representation, are neglected. Nowadays it is common for a mobile robot to have distance sensors and a vision system, which could in principle be used to accomplish complex and general tasks autonomously, given an adequate representation and a way to extract the necessary knowledge directly from the sensors. A possible representation in this context consists of adding semantic information to metric maps, for example by segmenting the environment and then labeling each of its parts. This work proposes a way to structure the spatial information by creating a semantic map of the environment that represents, in addition to obstacles, a link between them and the corresponding segmented images obtained by an omnidirectional vision system. The representation is implemented by a relational description of the domain that, when instantiated, produces a conditional random field, on which the inferences are performed. Models that combine probability and first-order logic are more expressive and better suited to structuring spatial information into semantic information.
36

Evaluation fiabiliste de l'impact des facteurs climatiques sur la corrosion des poutres en béton armé : application au cas libanais / Reliable assessment of the impact of climatic factors on the corrosion of reinforced concrete beams : application to the Lebanese case

El Hassan, Jinane 05 November 2010
Reinforced concrete (RC) structures exposed to aggressive environments undergo degradation that affects their integrity. Corrosion of the reinforcement is one of the most widespread degradation mechanisms and one of the most costly in terms of maintenance and repair. The process is caused by the penetration of aggressive agents into the concrete, notably chloride ions and carbon dioxide. Chlorides induce localized (pitting) corrosion, whereas carbon dioxide leads to general (uniform) corrosion. The initiation and propagation of corrosion depend on several factors related to the materials, loading, geometry, and environment. These factors carry large uncertainties that must be taken into account through a probabilistic approach. In this work, we propose two models for the corrosion mechanisms induced separately by chlorides and by carbon dioxide. The models take into account the effect of climatic conditions, described mainly by temperature and relative humidity, and the Lebanese climatic context is treated in detail as a case study. We propose a physical model of steel corrosion in reinforced concrete beams that occurs in two phases: first, an initiation phase during which the aggressive agents (chlorides and carbon dioxide) penetrate the concrete and reach critical concentrations that cause depassivation of the steel; second, a propagation phase during which active corrosion of the steel reduces the strength of the beam until failure. The factors subject to uncertainty are treated as random variables. To model them, we reviewed the many probabilistic models proposed in the literature for the different random variables, checked their compatibility with our problem and the availability of the data they require (in particular the consistency between their assumptions), and retained the models best suited to our case.
Applying reliability principles allows us to assess the reliability of corroding beams with respect to both limit states, the ultimate limit state (ULS) and the serviceability limit state (SLS). The loss of steel cross-section due to corrosion causes a decrease in the load-bearing capacity of the beam and an increase in the stress in the tensile concrete, which widens the crack openings. Thus, for the SLS, the safety margin goes to zero when the crack opening width exceeds the limit value recommended by Eurocode 2. For the ULS, the limit state function is the bending strength: failure occurs when the applied moment equals or exceeds the resisting moment. The reliability calculations are carried out using Monte Carlo simulations. Finally, we present several applications of the proposed corrosion models. The first is a sensitivity analysis of the corrosion models with respect to their parameters, examining the effect of the means and the variability of the random variables on the model response. Particular attention is paid to the impact of climatic factors: the chloride-induced corrosion model is applied with real temperature and relative humidity data for three coastal cities with different climatic characteristics. A comparative study then examines the effect of bar diameter and cover thickness on reliability at the ULS and the SLS. The results highlight the aggressiveness of climatic factors: a hot and humid climate is very aggressive with respect to chloride-induced corrosion, whereas a climate with variable relative humidity favors corrosion by carbonation. (...)
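The ULS check described in this abstract (failure when the applied moment equals or exceeds the resisting moment, estimated by Monte Carlo simulation) can be sketched as follows. This is a minimal example under loud assumptions: both moments are drawn from independent normal distributions with made-up means and standard deviations, and the degradation of the resisting moment over time through steel section loss is not modeled. The names `failure_probability`, `resisting_moment`, and `applied_moment` are hypothetical.

```python
import random
from statistics import NormalDist

def failure_probability(sample_resistance, sample_load,
                        n_samples=100_000, seed=1):
    """Crude Monte Carlo estimate of P(M_R - M_S <= 0)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        # Limit-state function g = resisting moment - applied moment.
        margin = sample_resistance(rng) - sample_load(rng)
        if margin <= 0:
            failures += 1
    return failures / n_samples

# Hypothetical moments in kN.m; neither distribution comes from the thesis.
def resisting_moment(rng):
    return rng.gauss(300.0, 30.0)

def applied_moment(rng):
    return rng.gauss(200.0, 40.0)

pf = failure_probability(resisting_moment, applied_moment)

# Reliability index corresponding to the estimated failure probability.
beta = -NormalDist().inv_cdf(pf)
```

With these two normal distributions the safety margin is itself normal with mean 100 and standard deviation 50, so the estimate can be checked against the closed-form reliability index of 2. A corrosion study would replace `resisting_moment` with a sampler whose output depends on exposure time and climate.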
38

Exploiting phonological constraints for handshape recognition in sign language video

Thangali, Ashwin 22 January 2016
The ability to recognize handshapes in signing video is essential in algorithms for sign recognition and retrieval. Handshape recognition from isolated images is, however, an insufficiently constrained problem. Many handshapes share similar 3D configurations and are indistinguishable for some hand orientations in 2D image projections. Additionally, significant differences in handshape appearance are induced by the articulated structure of the hand and by variants produced by different signers. Linguistic rules involved in the production of signs impose strong constraints on the articulations of the hands, yet little attention has been paid to exploiting these constraints in previous work on sign recognition. Among the different classes of signs in any signed language, lexical signs constitute the prevalent class. Morphemes (or meaningful units) for signs in this class involve a combination of particular handshapes, palm orientations, locations for articulation, and movement type. These are thus analyzed by many sign linguists as analogues of phonemes in spoken languages. Phonological constraints govern the ways in which phonemes combine in American Sign Language (ASL), as in other signed and spoken languages; utilizing these constraints for handshape recognition in ASL is the focus of the proposed thesis. Handshapes in monomorphemic lexical signs are specified at the start and end of the sign. Handshape transitions within a sign are constrained to involve either closing or opening of the hand (i.e., constrained to exclusively use either folding or unfolding of the palm and one or more fingers). Furthermore, akin to allophonic variations in spoken languages, both inter- and intra-signer variations in the production of specific handshapes are observed. We propose a Bayesian network formulation to exploit handshape co-occurrence constraints, also utilizing information about allophonic variations to aid in handshape recognition.
We propose a fast non-rigid image alignment method to gain improved robustness to handshape appearance variations during computation of observation likelihoods in the Bayesian network. We evaluate our handshape recognition approach on a large dataset of monomorphemic lexical signs. We demonstrate that leveraging linguistic constraints on handshapes results in improved handshape recognition accuracy. As part of the overall project, we are collecting and preparing for dissemination a large corpus (three thousand signs from three native signers) of ASL video annotated with linguistic information such as glosses, morphological properties and variations, and start/end handshapes associated with each ASL sign.
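A simplified sketch of how start/end co-occurrence constraints can sharpen recognition is given below. It combines a prior over (start, end) handshape pairs with per-image observation likelihoods and normalizes the product. This is not the thesis's Bayesian network: allophonic variation and the non-rigid alignment used to compute likelihoods are left out, and the function name and all probabilities are hypothetical.

```python
def handshape_posterior(pair_prior, start_likelihood, end_likelihood):
    """Posterior over (start, end) handshape pairs.

    pair_prior[(s, e)]  : prior probability that a sign starts with s, ends with e
    start_likelihood[s] : likelihood of the observed start image given handshape s
    end_likelihood[e]   : likelihood of the observed end image given handshape e
    """
    scores = {
        (s, e): p * start_likelihood.get(s, 0.0) * end_likelihood.get(e, 0.0)
        for (s, e), p in pair_prior.items()
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("observations have zero likelihood under every pair")
    return {pair: score / total for pair, score in scores.items()}

# Hypothetical numbers, for illustration only.
pair_prior = {("5", "S"): 0.5, ("B", "B"): 0.4, ("5", "B"): 0.1}
start_likelihood = {"5": 0.45, "B": 0.55}
end_likelihood = {"S": 0.7, "B": 0.3}

posterior = handshape_posterior(pair_prior, start_likelihood, end_likelihood)
best_pair = max(posterior, key=posterior.get)
```

In this toy case the start image alone slightly favors "B", but the end image and the pair prior together make ("5", "S") the most probable pair, which is the kind of correction that co-occurrence constraints are meant to provide.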
39

Approche probabiliste pour l’analyse de l’impact des changements dans les programmes orientés objet

Zoghlami, Aymen 06 1900
We propose a probabilistic approach to determining the impact of changes in object-oriented programs. For a given change to a class of the system, the approach predicts the set of other classes potentially affected by that change. The prediction is given as a probability that depends, on the one hand, on the interactions between classes, expressed as numbers of invocations, and, on the other hand, on relationships extracted from the source code. These relationships are extracted automatically by reverse engineering. We implement the approach with Bayesian networks: for each change type, we produce a Bayesian network that determines the probability that a class is impacted given that another class is changed. Each network takes as input a set of possible relationships between classes and, after a training phase, predicts the set of classes affected by a change. The approach is evaluated in two distinct scenarios, with various types of changes, on five systems. In the first scenario, for systems that have historical data, the networks are trained on the changes made in previous versions of the same system. In the second scenario, for systems without enough data on changes in their earlier versions, the training data are borrowed from other systems. Our evaluation showed that we obtain very good predictions in both cases, although they are better in the first scenario.
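The training step described in this abstract can be illustrated with a single conditional probability table estimated from historical changes. The sketch below is a deliberate simplification and not the thesis's network structure: it conditions on one relationship type at a time, uses Laplace smoothing, and ignores invocation counts. The function name, the relationship labels, and the history records are hypothetical.

```python
from collections import defaultdict

def learn_impact_table(history, smoothing=1.0):
    """Estimate P(impacted | relationship) from past changes.

    history: iterable of (relationship, was_impacted) pairs, one per
    (changed class, related class) pair observed in earlier versions.
    Laplace smoothing keeps estimates away from exactly 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])  # relationship -> [not impacted, impacted]
    for relationship, was_impacted in history:
        counts[relationship][int(was_impacted)] += 1
    return {
        rel: (hit + smoothing) / (miss + hit + 2 * smoothing)
        for rel, (miss, hit) in counts.items()
    }

# Hypothetical change history for one change type.
history = [
    ("inherits", True), ("inherits", True), ("inherits", False),
    ("invokes", True), ("invokes", False), ("invokes", False), ("invokes", False),
]
table = learn_impact_table(history)
```

For the second scenario in the abstract, the same function would simply be fed a history collected from other systems instead of earlier versions of the changed one.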
40

MYOP/ToPS/SGEval: Um ambiente computacional para estudo sistemático de predição de genes / MYOP/ToPS/SGEval: A computational framework for gene prediction

Kashiwabara, André Yoshiaki 10 February 2012
Correctly identifying eukaryotic protein-coding genes in genomic sequences is an open problem. In this work, we implemented a platform with the aim of improving the way gene predictors are implemented and evaluated. Three new tools were implemented. ToPS (Toolkit of Probabilistic Models of Sequences) was the first object-oriented framework that provides tools for the implementation, manipulation, and combination of probabilistic models that represent sequences of symbols. MYOP (Make Your Own Predictor) is a system that aims to facilitate the construction of gene predictors. SGEval (Splicing Graph Evaluation) uses splicing graphs to compare different annotations with alternative splicing events. We used our platform to develop gene predictors for eleven distinct genomes: A. thaliana, C. elegans, Z. mays, P. falciparum, D. melanogaster, D. rerio, M. musculus, R. norvegicus, O. sativa, G. max, and H. sapiens. With this development, we established a protocol for implementing new gene predictors. In addition, using our platform, we developed a gene prediction workflow for the sugarcane genome project, which has already been applied to 109 BAC sequences generated by BIOEN (FAPESP Bioenergy Program).
