Global ETD Search

91	PATO: um ambiente integrado com interface gráfica para a curadoria de dados de sequências biológicas / PATO: an integrated environment with a GUI for data curation of biological sequences Oliveira, Liliane Santana 22 November 2013 (has links) Advances in DNA sequencing technologies have made it possible to determine the genomic sequence of an ever-increasing number of organisms. However, obtaining the nucleotide sequence of a genome is only the first step in the study of an organism. The annotation process consists of identifying the different regions of interest in the genome and their functions. Several computational tools have been developed to support the annotation process, but none of them allows the user to select sequences, process them to find evidence about genomic regions (such as gene and protein domain predictions), analyze them graphically, and add information about their regions within a single environment. The aim of this project was therefore to develop a graphical platform for genome annotation that allows the user to carry out the tasks required by the annotation process in a single tool integrated with a database. The idea is to give users the freedom to work with their own datasets, enabling them to select sequences for analysis, build pipelines to process them, and analyze the results in a viewer that allows them to add information to regions and curate the sequences. The resulting tool is easily extensible, allowing new annotation functionality to be plugged in modularly, and its structure lets the user work both with expressed-sequence projects and with genome annotation. ambiente gráfico annotation anotação graphic platform processamento processing
92	Using functional annotation to characterize genome-wide association results Fisher, Virginia Applegate 11 December 2018 (has links) Genome-wide association studies (GWAS) have successfully identified thousands of variants robustly associated with hundreds of complex traits, but the biological mechanisms driving these results remain elusive. Functional annotation, describing the roles of known genes and regulatory elements, provides additional information about associated variants. This dissertation explores the potential of these annotations to explain the biology behind observed GWAS results. The first project develops a random-effects approach to genetic fine mapping of trait-associated loci. Functional annotation and estimates of the enrichment of genetic effects in each annotation category are integrated with linkage disequilibrium (LD) within each locus and GWAS summary statistics to prioritize variants with plausible functionality. Applications of this method to simulated and real data show good performance in a wider range of scenarios relative to previous approaches. The second project focuses on the estimation of enrichment by annotation categories. I derive the distribution of GWAS summary statistics as a function of annotations and LD structure and perform maximum likelihood estimation of enrichment coefficients in two simulated scenarios. The resulting estimates are less variable than previous methods, but the asymptotic theory of standard errors is often not applicable due to non-convexity of the likelihood function. In the third project, I investigate the problem of selecting an optimal set of tissue-specific annotations with greatest relevance to a trait of interest. I consider three selection criteria defined in terms of the mutual information between functional annotations and GWAS summary statistics. 
These algorithms correctly identify enriched categories in simulated data, but in the application to a GWAS of BMI the penalty for redundant features outweighs the modest relationships with the outcome yielding null selected feature sets, due to the weaker overall association and high similarity between tissue-specific regulatory features. All three projects require little in the way of prior hypotheses regarding the mechanism of genetic effects. These data-driven approaches have the potential to illuminate unanticipated biological relationships, but are also limited by the high dimensionality of the data relative to the moderate strength of the signals under investigation. These approaches advance the set of tools available to researchers to draw biological insights from GWAS results. Biostatistics GWAS Feature selection Fine mapping Functional annotation Random effects
93	The Annotation Cost of Context Switching: How Topic Models and Active Learning [May Not] Work Together Okuda, Nozomu 01 August 2017 (has links) The labeling of language resources is a time-consuming task, whether aided by machine learning or not. Much of the prior work in this area has focused on accelerating human annotation in the context of machine learning, yielding a variety of active learning approaches. Most of these attempt to lead an annotator to label the items that are most likely to improve the quality of an automated, machine learning-based model. These active learning approaches seek to understand the effect of item selection on the machine learning model, but give significantly less emphasis to the effect of item selection on the human annotator. In this work, we consider a sentiment labeling task where existing, traditional active learning seems to have little or no value. We focus instead on the human annotator by ordering the items for better annotator efficiency. active learning topic modeling annotation human cost Computer Sciences
94	Using Multiview Annotation to Annotate Multiple Images Simultaneously Price, Timothy C. 01 June 2017 (has links) In order for a system to learn a model for object recognition, it must have many positive images to learn from. For this reason, datasets of similar objects are built to train the model. Such datasets work best when they are large, diverse, and annotated. But obtaining the images and creating the annotations often takes a long time and is costly. We use a method that very quickly obtains many images of the same object from different angles and then reconstructs a 3D model from those images. We then use this 3D reconstruction to link information across the different images of the same object, and use that information to annotate all of the images quickly and cheaply. These annotated images are then used to train the model. Multiview segmentation 3D Annotation 3D Modeling Computer Sciences
95	Design and application of methods for curating genetic variation databases Ephraim, Sean Stephen 01 July 2014 (has links) Cordova (Curated Online Reference Database Of Variation Annotations) is an out-of-the-box solution for building and maintaining an online database of genetic variations integrated with population study information and pathogenicity prediction results from popular algorithms. Our primary motivation for developing this system is to aid researchers and clinician-scientists in determining the clinical significance of genetic variations. To achieve this goal, Cordova provides an interface to review and manually or computationally curate genetic variation data as well as share it for clinical diagnostics and the advancement of research. Annotation Databases Prediction SNPs Web services
96	Semantics of Video Shots for Content-based Retrieval Volkmer, Timo, timovolkmer@gmx.net January 2007 (has links) Content-based video retrieval research combines expertise from many different areas, such as signal processing, machine learning, pattern recognition, and computer vision. As video extends into both the spatial and the temporal domain, we require techniques for the temporal decomposition of footage so that specific content can be accessed. This content may then be semantically classified - ideally in an automated process - to enable filtering, browsing, and searching. An important aspect that must be considered is that pictorial representation of information may be interpreted differently by individual users because it is less specific than its textual representation. In this thesis, we address several fundamental issues of content-based video retrieval for effective handling of digital footage. Temporal segmentation, the common first step in handling digital video, is the decomposition of video streams into smaller, semantically coherent entities. This is usually performed by detecting the transitions that separate single camera takes. While abrupt transitions - cuts - can be detected relatively well with existing techniques, effective detection of gradual transitions remains difficult. We present our approach to temporal video segmentation, proposing a novel algorithm that evaluates sets of frames using a relatively simple histogram feature. Our technique has been shown to range among the best existing shot segmentation algorithms in large-scale evaluations. The next step is semantic classification of each video segment to generate an index for content-based retrieval in video databases. Machine learning techniques can be applied effectively to classify video content. However, these techniques require manually classified examples for training before automatic classification of unseen content can be carried out. 
Manually classifying training examples is not trivial because of the implied ambiguity of visual content. We propose an unsupervised learning approach based on latent class modelling in which we obtain multiple judgements per video shot and model the users' response behaviour over a large collection of shots. This technique yields a more generic classification of the visual content. Moreover, it enables the quality assessment of the classification, and maximises the number of training examples by resolving disagreement. We apply this approach to data from a large-scale, collaborative annotation effort and present ways to improve the effectiveness for manual annotation of visual content by better design and specification of the process. Automatic speech recognition techniques along with semantic classification of video content can be used to implement video search using textual queries. This requires the application of text search techniques to video and the combination of different information sources. We explore several text-based query expansion techniques for speech-based video retrieval, and propose a fusion method to improve overall effectiveness. To combine both text and visual search approaches, we explore a fusion technique that combines spoken information and visual information using semantic keywords automatically assigned to the footage based on the visual content. The techniques that we propose help to facilitate effective content-based video retrieval and highlight the importance of considering different user interpretations of visual content. This allows better understanding of video content and a more holistic approach to multimedia retrieval in the future. Video retrieval shot segmentation video annotation visual content modelling
97	MODÉLISATION GÉNÉRIQUE DE DOCUMENTS MULTIMÉDIA PAR DES MÉTADONNÉES : MÉCANISMES D'ANNOTATION ET D'INTERROGATION / Generic modelling of multimedia documents with metadata: annotation and querying mechanisms Jedidi, Anis 06 July 2005 (has links) (PDF) In the context of handling and describing document content, my thesis work studies the generic modelling of multimedia documents by means of metadata. We propose an approach that homogenizes the structures used to represent such documents, which makes them easier to process downstream without resorting to the multimedia content itself. We structure this metadata in XML documents called "meta-documents". These meta-documents provide a structure in addition to any logical or physical structures written by the documents' authors. We extended the meta-documents by integrating semantic descriptors defined according to the user's needs, together with spatial and temporal relations. For querying multimedia documents, we propose a tool that assists the graphical formulation of XQuery queries, using the metadata and integrating the spatio-temporal relations between metadata items. [INFO] Computer Science multimedia document generic modelling annotation metadata querying
98	Représentation de comportements emotionnels multimodaux spontanés : perception, annotation et synthèse / Representation of spontaneous multimodal emotional behaviours: perception, annotation and synthesis Abrilian, Sarkis 07 September 2007 (has links) (PDF) The aim of this thesis is to represent spontaneous emotions and their associated multimodal signs in order to contribute to the design of future interactive affective systems. Current prototypes are generally limited to detecting and generating a few simple emotions, and rely on audio or video data acted by performers and collected in the laboratory. Modelling the complex relations between spontaneous emotions and their expression in different modalities requires an exploratory approach. The exploratory approach chosen in this thesis for studying these spontaneous emotions is to collect and annotate a video corpus of television interviews. This type of corpus contains emotions more complex than the six basic emotions (anger, fear, joy, sadness, surprise, disgust). Spontaneous emotional behaviour indeed exhibits superpositions, masking, and conflicts between positive and negative emotions. We report several experiments that led to the definition of several levels of representation of emotions and of multimodal behavioural parameters that provide relevant information for the perception of these complex spontaneous emotions. Looking ahead, the tools developed during this thesis (annotation schemes, measurement programs, annotation protocols) can later be used to design models for interactive affective systems capable of detecting and synthesizing multimodal expressions of spontaneous emotions. [INFO] Computer Science emotions multimodality embodied conversational agents corpus annotation
99	L'annotation des éléments transposables par la compréhension de leur diversification / Annotating transposable elements by understanding their diversification Flutre, Timothée 28 October 2010 (has links) (PDF) Every living organism is the product of complex interactions between its genome and its environment, interactions characterized by exchanges of matter and energy that are essential to the organism's survival and the transmission of its genome. Since the discovery in the 1910s that the chromosome is the carrier of genetic information, biologists have studied genomes in order to decipher the mechanisms and processes at work in the development of organisms and the evolution of populations. Thanks to the technological improvements of recent decades, several genomes have been fully sequenced and their number is growing rapidly, but they are far from being deciphered. Indeed, some of their components, the transposable elements, are still poorly understood, even though they have been detected in almost all species studied and can account for up to 90% of the total content of their genomes. Transposable elements are fragments of the genome that have the distinctive property of being mobile. They therefore have a major impact on genome structure, and also on the expression of neighbouring genes, notably through epigenetic mechanisms. Their evolution is also unusual, since their vertical transmission is non-Mendelian and many cases of horizontal transfer have been demonstrated. However, except for certain model organisms for which reference sequences are available, the annotation of transposable elements is often a bottleneck in the analysis of genomic sequences. In addition, comparative genomics studies show that genomes are far more dynamic than previously thought, especially those of plants, which further complicates the precise annotation of transposable elements. During my thesis work, I began by comparing the existing computer programs used in de novo approaches to transposable element annotation. To do so, I developed a test protocol on the genomes of Drosophila melanogaster and Arabidopsis thaliana. This allowed me to propose a de novo approach that combines several tools and is thus able to automatically reconstruct a large number of reference sequences. Furthermore, I showed that our approach reveals structural variation within well-known families, notably by distinguishing structural variants belonging to the same family of transposable elements, thereby reflecting the diversification of these families over the course of their evolution. This approach was implemented in a suite of tools (REPET) that makes it possible to analyze the transposable elements of many genomes of plants, insects, fungi and others. This work resulted in a roadmap describing in practical terms how to annotate the transposable element content of any newly sequenced genome. Consequently, many questions about the impact of these elements on the evolution of genome structure can now be addressed across genomes that are more or less closely related. I also propose several avenues of research, notably the simulation of the data needed to improve detection algorithms, an approach complementary to modelling the dynamics of transposable elements. [SDV] Life Sciences genomics bioinformatics transposable element annotation structural variation
100	Ad-hoc Collaborative Document Annotation on a Tablet PC Huang, Albert 01 1900 (has links) The use of technology as an effective educational tool has been an elusive goal in the past. Specifically, previous attempts at using small personal computers in the classroom to aid students as collaborative and note-taking tools have been met with lukewarm responses. Many of these past attempts were hampered by inferior hardware and the lack of an efficient and user-friendly interface. With the recent introduction of Tablet PC products on the market, however, the limitations imposed on software developers for mobile computing systems have been dramatically lowered. We present a collaborative annotation system that allows students equipped with tablet computers to work cooperatively in either an ad-hoc or a structured wireless classroom setting. / Singapore-MIT Alliance (SMA) collaborative document annotation educational technology Tablet PC wireless classroom
