  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Automating the gathering of relevant information from biomedical text

Canevet, Catherine. January 2009
More and more, database curators rely on literature-mining techniques to help them gather and make use of the knowledge encoded in text documents. This thesis investigates how an assisted annotation process can help, and explores the hypothesis that it is only with respect to full-text publications that a system can tell relevant and irrelevant facts apart by studying their frequency. A semi-automatic annotation process was developed for a particular database, the Nuclear Protein Database (NPD), based on a set of full-text articles newly annotated with regard to subnuclear protein localisation, along with eight lexicons. The annotation process is carried out online, retrieving relevant documents (abstracts and full-text papers) and highlighting sentences of interest in them. The process also offers a summary table of the facts found, clustered by type of information. Each method involved in each step of the tool is evaluated using cross-validation results on the training data as well as test-set results. The performance of the final tool, called the "NPD Curator System Interface", is estimated empirically in an experiment in which the NPD curator updates the database with pieces of information found relevant in 31 publications using the interface. A final experiment complements our main methodology by showing its extensibility to retrieving information on protein function rather than localisation. I argue that the general methods, the results they produced and the discussions they engendered are useful for any subsequent attempt to build semi-automatic database annotation processes. The annotated corpora, gazetteers, methods and tool are fully available on request from the author (catherine.canevet@bbsrc.ac.uk).
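The thesis's central hypothesis, that a fact's frequency across a full-text paper separates relevant from irrelevant facts, can be sketched as a simple frequency filter. The code below is an illustrative toy, not the NPD Curator System's actual method; the fact representation and threshold are assumptions.

```python
from collections import Counter

def relevant_facts(extracted_facts, min_mentions=2):
    """Toy frequency filter: a fact mentioned repeatedly across a
    full-text paper is kept; one-off mentions are discarded.
    `extracted_facts` is a list of (protein, localisation) tuples,
    one per sentence in which the fact was detected."""
    counts = Counter(extracted_facts)
    return {fact for fact, n in counts.items() if n >= min_mentions}

# hypothetical facts extracted from one full-text paper
facts = [
    ("BRCA1", "nucleolus"), ("BRCA1", "nucleolus"),
    ("BRCA1", "nucleolus"), ("BRCA1", "cytoplasm"),
]
print(relevant_facts(facts))  # only the repeatedly mentioned localisation survives
```

On an abstract alone, most facts appear once, which is why the hypothesis ties this kind of filtering to full-text publications.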
82

Visual Cueing: Investigating the Effects of Text Annotation on Student Retention Rates

Brown, Ron. 05 1900
This study examines the grades of students using study-skill methods and those who do not. The experiment consisted of giving the treatment group the opportunity to use well-known study techniques, while the control group could only read the material. Both groups were given ten minutes to read a pre-selected text, an 1,807-word lesson on the "Technical Training Management System." Each group was then given five minutes to take a twenty-item quiz. The fifty-five students in the control group were limited to reading the material; the fifty-six students in the treatment group could choose between highlighting, note-taking, and underlining. Test scores were compared using a t-test for dependent samples. One week later, the same students in each group were re-tested with the same quiz. Students had five minutes to review the study material; for the treatment group this included the material they had annotated earlier. The results from each group were then compared. Efforts were made to avoid potential flaws in previous studies, thereby producing more viable results. The results of this study indicate there is no significant difference between the grades of students who use the aforementioned forms of text annotation and those who do not.
83

Étude des métaphores conceptuelles utilisées dans la description des structures anatomiques / A study of the conceptual metaphors used in describing anatomical structures

Lubin, Leslie. January 2005
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
84

The cortical response to RhoA is regulated during mitosis. Annotation of cytoskeletal and motility proteins in the sea urchin genome assembly

Hoffman, Matthew P. January 2008
Thesis advisor: David Burgess / This doctoral thesis addresses two central topics divided into separate chapters. In Chapter 1: The cortical response to RhoA is regulated during mitosis, experimental findings using sea urchin embryos are presented that demonstrate that the small GTPase RhoA participates in positive signaling for cell division and that this activity is negatively regulated prior to anaphase. In a second series of experiments, myosin phosphatase is shown to be a central negative regulator of myosin activity during the cell cycle through metaphase of mitosis and experimental findings support the conclusion that myosin phosphatase opposes RhoA signaling until anaphase onset. These experiments also reveal that myosin activation alone is insufficient to stimulate cortical contractions during S phase and during metaphase arrest following activation of the spindle checkpoint. In Chapter 2: Annotation of cytoskeletal and motility proteins in the sea urchin genome assembly, as part of a collaborative project, homologs of cytoskeletal genes and gene families were derived and annotated from the sea urchin genome assembly. In addition, phylogenetic analysis of multiple gene families is presented based on these findings. / Thesis (PhD) — Boston College, 2008. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology.
85

Annotation et recherche contextuelle des documents multimédias socio-personnels / Context-aware annotation and retrieval of socio-personal multimedia documents

Lajmi, Sonia. 11 March 2011
L’objectif de cette thèse est d’instrumentaliser des moyens, centrés utilisateur, de représentation, d’acquisition, d’enrichissement et d’exploitation des métadonnées décrivant des documents multimédias socio-personnels. Afin d’atteindre cet objectif, nous avons proposé un modèle d’annotation, appelé SeMAT, avec une nouvelle vision du contexte de prise de vue. Nous avons proposé d’utiliser des ressources sémantiques externes telles que GeoNames et Wikipédia pour enrichir automatiquement les annotations partant des éléments de contexte capturés. Afin d’accentuer l’aspect sémantique des annotations, nous avons modélisé la notion de profil social avec des outils du web sémantique en focalisant plus particulièrement sur la notion de liens sociaux et un mécanisme de raisonnement permettant d’inférer de nouveaux liens sociaux non explicités. Le modèle proposé, appelé SocialSphere, construit un moyen de personnalisation des annotations suivant la personne qui consulte les documents (le consultateur). Des exemples d’annotations personnalisées peuvent être des objets utilisateurs (e.g. maison, travail) ou des dimensions sociales (e.g. ma mère, le cousin de mon mari). Dans ce cadre, nous avons proposé un algorithme, appelé SQO, permettant de suggérer au consultateur des dimensions sociales selon son profil pour décrire les acteurs d’un document multimédia. Dans la perspective de suggérer à l’utilisateur des évènements décrivant les documents multimédias, nous avons réutilisé son expérience et l’expérience de son réseau de connaissances en produisant des règles d’association. Dans une dernière partie, nous avons abordé le problème de correspondance (ou appariement) entre requête et graphe social. Nous avons proposé de ramener le problème de recherche de correspondance à un problème d’isomorphisme de sous-graphe partiel.
Nous avons proposé un algorithme, appelé h-Pruning, permettant de faire une correspondance rapprochée entre les nœuds des deux graphes : motif (représentant la requête) et social. Pour la mise en œuvre, nous avons réalisé un prototype à deux composantes : web et mobile. La composante mobile a pour objectif de capturer les éléments de contexte lors de la création des documents multimédias socio-personnels. Quant à la composante web, elle est dédiée à l’assistance de l’utilisateur lors de son annotation ou consultation des documents multimédias socio-personnels. L’évaluation a été effectuée en se servant d’une collection de test construite à partir du service de médias sociaux Flickr. Les tests ont prouvé : (i) l’efficacité de notre approche de recherche dans le graphe social en termes de temps d’exécution ; (ii) l’efficacité de notre approche de suggestion des événements (en effet, nous avons prouvé notre hypothèse en démontrant l’existence d’une cooccurrence entre le contexte spatio-temporel et les événements) ; (iii) l’efficacité de notre approche de suggestion des dimensions sociales en termes de temps d’exécution. / The overall objective of this thesis is to provide user-centric means of representing, acquiring, enriching and exploiting the metadata describing socio-personal multimedia documents. To achieve this goal, we proposed an annotation model, called SeMAT, with a new vision of the snapshot context. We proposed using external semantic resources (e.g. GeoNames, Wikipedia) to enrich the annotations automatically from the captured contextual elements. To accentuate the semantic aspect of the annotations, we modeled the concept of ‘social profile’ with Semantic Web tools, focusing in particular on social relationships and on a reasoning mechanism to infer social relationships that are not explicit. The proposed model, called SocialSphere, provides a way to personalize annotations according to the person viewing the documents. Examples of personalized annotations can be user objects (e.g. home, work) or social dimensions (e.g. my mother, my husband's cousin). In this context, we proposed an algorithm, called SQO, to suggest social dimensions describing the actors in a multimedia document according to the viewer’s social profile. To suggest event annotations, we reused the user’s experience and that of his social network by producing association rules. In the last part, we addressed the problem of matching a query against the social graph. We proposed to reduce this matching problem to a partial sub-graph isomorphism problem, and we proposed an algorithm, called h-Pruning, for partial sub-graph isomorphism that ensures a close matching between the nodes of the two graphs: the motif (representing the query) and the social graph. For the implementation, we realized a prototype with two components: mobile and web. The mobile component captures the contextual elements when socio-personal multimedia documents are created; the web component assists the user in annotating or consulting socio-personal multimedia documents. The evaluation, carried out on a test collection built from the Flickr social media service, has shown: (i) the efficiency of our social-graph search approach in terms of execution time; (ii) the effectiveness of our event suggestion approach (we proved our hypothesis by demonstrating the existence of a co-occurrence between the spatio-temporal context and events); and (iii) the efficiency of our social dimension suggestion approach in terms of execution time.
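The matching step described above, embedding a motif graph representing the query into the social graph, can be sketched with a naive backtracking search. h-Pruning itself is not specified in enough detail here to reproduce, so the sketch below shows only the generic sub-graph matching problem it accelerates; the graph contents are invented.

```python
def match(pattern, graph):
    """Naive backtracking search for an embedding of the pattern (motif)
    graph into the social graph. Graphs are adjacency dicts
    {node: set(neighbours)}. Returns one assignment pattern->graph node,
    or None if no embedding exists."""
    p_nodes = list(pattern)

    def extend(assign):
        if len(assign) == len(p_nodes):
            return dict(assign)
        u = p_nodes[len(assign)]
        for v in graph:
            if v in assign.values():
                continue
            # every already-mapped pattern edge must exist in the graph
            if all((w not in pattern[u]) or (assign[w] in graph[v])
                   for w in assign):
                assign[u] = v
                result = extend(assign)
                if result:
                    return result
                del assign[u]
        return None

    return extend({})

pattern = {"me": {"x"}, "x": {"me"}}  # query: someone linked to "me"
social = {"ann": {"bob"}, "bob": {"ann", "cat"}, "cat": {"bob"}}
m = match(pattern, social)
```

The cost of this brute-force search grows quickly with graph size, which is what motivates pruning heuristics such as the one proposed in the thesis.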
86

Sifter-T: Um framework escalável para anotação filogenômica probabilística funcional de domínios protéicos / Sifter-T: A scalable framework for phylogenomic probabilistic protein domain functional annotation

Silva, Danillo Cunha de Almeida e. 25 October 2013
É conhecido que muitos softwares deixam de ser utilizados por sua complexa usabilidade. Mesmo ferramentas conhecidas por sua qualidade na execução de uma tarefa são abandonadas em favor de ferramentas mais simples de usar, de instalar ou mais rápidas. Na área da anotação funcional a ferramenta Sifter (v2.0) é considerada uma das com melhor qualidade de anotação. Recentemente ela foi considerada uma das melhores ferramentas de anotação funcional segundo o Critical Assessment of protein Function Annotation (CAFA) experiment. Apesar disso, ela ainda não é amplamente utilizada, provavelmente por questões de usabilidade e adequação do framework à larga escala. O workflow SIFTER original consiste em duas etapas principais: A recuperação das anotações para uma lista de genes e a geração de uma árvore de genes reconciliada para a mesma lista. Em seguida, a partir da árvore de genes o Sifter constrói uma rede bayesiana de mesma estrutura nas quais as folhas representam os genes. As anotações funcionais dos genes conhecidos são associadas a estas folhas e em seguida as anotações são propagadas probabilisticamente ao longo da rede bayesiana até as folhas sem informação a priori. Ao fim do processo é gerada para cada gene de função desconhecida uma lista de funções putativas do tipo Gene Ontology e suas probabilidades de ocorrência. O principal objetivo deste trabalho é aperfeiçoar o código-fonte original para melhor desempenho, potencialmente permitindo que seja usado em escala genômica. Durante o estudo do workflow de pré-processamento dos dados encontramos oportunidades para aperfeiçoamento e visualizamos estratégias para abordá-las. 
Dentre as estratégias implementadas temos: O uso de threads paralelas; balanceamento de carga de processamento; algoritmos revisados para melhor aproveitamento de disco, memória e tempo de execução; adequação do código fonte ao uso de bancos de dados biológicos em formato utilizado atualmente; aumento da acessibilidade do usuário; expansão dos tipos de entrada aceitos; automatização do processo de reconciliação entre árvores de genes e espécies; processos de filtragem de seqüências para redução da dimensão da análise; e outras implementações menores. Com isto conquistamos aumento de performance de até 87 vezes para a recuperação de anotações e 73,3% para a reconstrução da árvore de genes em máquinas quad-core, e redução significante de consumo de memória na fase de realinhamento. O resultado desta implementação é apresentado como Sifter-T (Sifter otimizado para Throughput), uma ferramenta open source de melhor usabilidade, velocidade e qualidade de anotação em relação à implementação original do workflow de Sifter. Sifter-T foi escrito de forma modular em linguagem de programação Python; foi elaborado para simplificar a tarefa de anotação de genomas e proteomas completos; e os resultados são apresentados de forma a facilitar o trabalho do pesquisador. / It is known that many software tools are no longer used due to their complex usability. Even tools known for the quality of their task execution are abandoned in favour of tools that are faster or simpler to use or install. In the functional annotation field, Sifter (v2.0) is regarded as one of the best when it comes to annotation quality. Recently it was considered one of the best tools for functional annotation according to the Critical Assessment of protein Function Annotation (CAFA) experiment. Nevertheless, it is still not widely used, probably due to issues with usability and the suitability of the framework to a high-throughput scale.
The original SIFTER workflow consists of two main steps: the recovery of annotations for a list of genes, and the generation of a reconciled gene tree for the same list. Next, based on the gene tree, Sifter builds a Bayesian network with the same structure, in which the leaves represent genes. The known functional annotations are associated with those leaves, and the annotations are then probabilistically propagated along the Bayesian network to the leaves without a priori information. At the end of the process, a list of Gene Ontology functions and their occurrence probabilities is generated for each gene of unknown function. The main goal of this work is to optimize the original source code for better performance, potentially allowing it to be used on a genome-wide scale. Studying the pre-processing workflow, we found opportunities for improvement and envisioned strategies to address them. Among the implemented strategies are: the use of parallel threads; CPU load balancing; revised algorithms for better use of disk access, memory and runtime; adaptation of the source code to currently used biological databases; improved user accessibility; an expanded set of accepted input types; an automatic gene-tree/species-tree reconciliation process; sequence filtering to reduce the dimension of the analysis; and other minor implementations. With these implementations we achieved substantial speed-ups: for example, an 87-fold performance increase in the annotation retrieval module and a 72.3% speed increase in the gene tree generation module on quad-core machines. Additionally, a significant decrease in memory usage during the realignment phase was obtained. This implementation is presented as Sifter-T (Sifter Throughput-optimized), an open-source tool with better usability, performance and annotation quality than Sifter's original workflow implementation. Sifter-T was written in a modular fashion in the Python programming language; it is designed to simplify the annotation of complete genomes and proteomes, and its outputs are presented so as to make the researcher's work easier.
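Sifter's core idea as summarised above, propagating functional annotations along a tree so that closely related genes receive higher probabilities, can be illustrated with a toy decay model. This is not Sifter's actual Bayesian network (which models duplication, speciation and the Gene Ontology structure); the `keep` parameter and the tree below are assumptions for illustration only.

```python
def transfer_probability(tree, annotated_leaf, query_leaf, keep=0.9):
    """Toy stand-in for Sifter's probabilistic propagation: the chance
    that a function annotated at one leaf also holds at another decays
    with the number of tree edges separating them (probability `keep`
    of retaining the function per edge). `tree` maps each node to its
    parent; the root maps to None."""
    def path_to_root(node):
        path = []
        while node is not None:
            path.append(node)
            node = tree[node]
        return path

    a = path_to_root(annotated_leaf)
    q = path_to_root(query_leaf)
    common = set(a) & set(q)
    # distance = edges from each leaf up to their lowest common ancestor
    dist = a.index(next(n for n in a if n in common)) + \
           q.index(next(n for n in q if n in common))
    return keep ** dist

tree = {"root": None, "dup": "root", "geneA": "dup", "geneB": "dup",
        "geneC": "root"}
p_close = transfer_probability(tree, "geneA", "geneB")  # siblings: 2 edges
p_far = transfer_probability(tree, "geneA", "geneC")    # 3 edges apart
```

The closer relative receives the higher transfer probability, which is the qualitative behaviour the real model formalises.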
87

Ontology-driven semantic annotations for multiple engineering viewpoints in computer aided design

Li, Chun. January 2012
Engineering design involves a series of activities to handle data, including capturing, storing, retrieving and manipulating data, and this applies throughout the entire product lifecycle (PLC). Unfortunately, a closed-loop knowledge and information management system has not been implemented for the PLC. As part of product lifecycle management (PLM) approaches, computer-aided design (CAD) systems are extensively used from the embodiment and detail design stages in mechanical engineering. However, current CAD systems lack the ability to handle semantically rich information, and thus to represent, manage and use knowledge among multidisciplinary engineers and to integrate various tools/services with distributed data and knowledge. To address these challenges, a general-purpose semantic annotation approach based on CAD systems in the mechanical engineering domain is proposed, which contributes to knowledge management and reuse, data interoperability and tool integration. In present-day PLM systems, annotation approaches are embedded in software applications and use diverse data and anchor representations, making them static, inflexible and difficult to incorporate with external systems. This research argues that it is possible to take a generalised approach to annotation, with formal annotation content structures and anchoring mechanisms described using general-purpose ontologies. In this way viewpoint-oriented annotation may readily be captured, represented and incorporated into PLM systems together with existing annotations in a common framework, and the knowledge collected or generated from multiple engineering viewpoints may be reasoned with to derive additional knowledge to enable downstream processes. Knowledge can therefore be propagated and evolved through the PLC. Within this framework, a knowledge modelling methodology has also been proposed for developing knowledge models in various situations.
In addition, a prototype system has been designed and developed in order to evaluate the core contributions of this proposed concept. According to an evaluation plan, cost estimation and finite element analysis as case studies have been used to validate the usefulness, feasibility and generality of the proposed framework. Discussion has been carried out based on this evaluation. As a conclusion, the presented research work has met the original aim and objectives, and can be improved further. At the end, some research directions have been suggested.
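The generalised annotation the thesis argues for, formal content typed against a general-purpose ontology plus an explicit anchor into the CAD model, can be sketched as a small data structure. The class and field names below are illustrative assumptions, not the thesis's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    """Minimal sketch of a viewpoint-oriented annotation: content typed
    by an ontology term, plus an anchor naming the CAD entity it attaches
    to. Sharing the anchor representation across viewpoints is what lets
    annotations from different disciplines be combined."""
    anchor: str     # e.g. a persistent face/feature id in the CAD model
    viewpoint: str  # the engineering viewpoint that produced it
    concept: str    # ontology term the content instantiates
    value: object

cost_note = Annotation(anchor="face_017", viewpoint="cost_estimation",
                       concept="manufacturing:MachiningCost", value=42.5)
fea_note = Annotation(anchor="face_017", viewpoint="finite_element_analysis",
                      concept="simulation:BoundaryCondition", value="fixed")

# because both viewpoints share the anchoring mechanism, downstream tools
# can collect every viewpoint's knowledge about one entity
notes_on_face = [a for a in (cost_note, fea_note) if a.anchor == "face_017"]
```

The cost-estimation and finite-element-analysis case studies mentioned above correspond to two such viewpoints attached to common anchors.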
88

Machine learning architectures for video annotation and retrieval

Markatopoulou, Foteini. January 2018
In this thesis we design machine learning methodologies for solving the problem of video annotation and retrieval using either pre-defined semantic concepts or ad-hoc queries. Concept-based video annotation refers to the annotation of video fragments with one or more semantic concepts (e.g. hand, sky, running) chosen from a predefined concept list. Ad-hoc queries refer to textual descriptions that may contain objects, activities, locations etc., and combinations thereof. Our contributions are: i) a thorough analysis of extending and using different local descriptors towards improved concept-based video annotation, and a stacking architecture that uses, in its first layer, concept classifiers trained on local descriptors and improves their prediction accuracy in the last layer of the stack by implicitly capturing concept relations; ii) a cascade architecture that orders and combines many classifiers, trained on different visual descriptors, for the same concept; iii) a deep learning architecture that exploits concept relations at two different levels: at the first level, we build on ideas from multi-task learning and propose an approach to learn concept-specific representations that are sparse, linear combinations of representations of latent concepts; at the second level, we build on ideas from structured output learning and propose the introduction, at training time, of a new cost term that explicitly models the correlations between the concepts, thereby explicitly modelling the structure in the output space (i.e., the concept labels); iv) a fully-automatic ad-hoc video search architecture that combines concept-based video annotation and textual query analysis, and transforms concept-based keyframe and query representations into a common semantic embedding space.
Our architectures have been extensively evaluated on the TRECVID SIN 2013, TRECVID AVS 2016 and other large-scale datasets, demonstrating their effectiveness compared to similar approaches.
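The stacking idea in contribution (i), a second layer that refines one concept's score using all first-layer concept scores so that correlated concepts inform each other, can be sketched as follows. The weights here are hand-set for illustration, not learned as in the thesis.

```python
def second_layer(first_layer_scores, weights, bias=0.0):
    """Toy second stacking layer: refine one concept's score as a linear
    combination of *all* first-layer concept scores, so a correlated
    concept (e.g. a confident 'sky' detection) can raise the score of a
    related one (e.g. 'outdoor')."""
    s = bias + sum(weights[c] * v for c, v in first_layer_scores.items())
    return max(0.0, min(1.0, s))  # clamp to [0, 1]

# first layer: independent per-concept detectors scoring one keyframe
scores = {"sky": 0.9, "outdoor": 0.4, "running": 0.1}
# refine 'outdoor' using its own score plus the correlated 'sky' concept
w_outdoor = {"sky": 0.5, "outdoor": 0.6, "running": 0.0}
refined = second_layer(scores, w_outdoor)
```

In the thesis the second-layer model is trained on the first layer's outputs, which is how the concept relations are captured implicitly rather than hand-coded.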
89

PATO: um ambiente integrado com interface gráfica para a curadoria de dados de sequências biológicas / PATO: an integrated environment with a graphical interface for the curation of biological sequence data

Oliveira, Liliane Santana. 22 November 2013
A evolução das tecnologias de sequenciamento de DNA tem permitido a elucidação da sequência genômica de um número cada vez maior de organismos. Contudo, a obtenção da sequência nucleotídica do genoma é apenas a primeira etapa no estudo dos organismos. O processo de anotação consiste na identificação das diferentes regiões de interesse no genoma e suas funcionalidades. Várias ferramentas computacionais foram desenvolvidas para auxiliar o processo de anotação, porém nenhuma delas permite ao usuário selecionar sequências, processá-las de forma a encontrar evidências a respeito das regiões genômicas, como predição gênica e de domínios protéicos, analisá-las graficamente e adicionar informações a respeito de suas regiões em um mesmo ambiente. Assim, o objetivo desse projeto foi o desenvolvimento de uma plataforma gráfica para a anotação genômica que permite ao usuário realizar as tarefas necessárias para o processo de anotação em uma única ferramenta integrada a um banco de dados. A idéia é proporcionar ao usuário liberdade para trabalhar com o seu conjunto de dados, possibilitando a seleção de sequências para análise, construção dos pipelines de processamento das mesmas e análise dos resultados encontrados a partir de um visualizador que permite ao usuário adicionar informações às regiões e fazer a curadoria das sequências. A ferramenta resultante é facilmente extensível, permitindo o acoplamento modular de novas funcionalidades de anotação e sua estrutura permite ao usuário trabalhar tanto com projetos de sequências expressas como anotação de genomas. / The evolution of DNA sequencing technologies has allowed the genomic sequence of an ever-growing number of organisms to be elucidated. However, obtaining the nucleotide sequence of the genome is only the first step in the study of organisms. The annotation process consists in identifying the different regions of interest in the genome and their functions. Several computational tools have been developed to support the annotation process, but none allows the user to select sequences, process them to find evidence about the genomic regions (such as gene and protein-domain prediction), analyze them graphically, and add information about their regions within a single environment. Thus, the aim of this project was to develop a graphical platform for genome annotation that allows the user to perform the tasks required by the annotation process in a single tool integrated with a database. The idea is to give the user the freedom to work with his own dataset, enabling the selection of sequences for analysis, the construction of processing pipelines for them, and the analysis of the results through a viewer that lets the user add information to the regions and curate the sequences. The resulting tool is easily extensible, allowing new annotation functionalities to be plugged in modularly, and its structure allows the user to work both with expressed-sequence projects and with genome annotation.
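The pluggable pipeline design described above can be sketched as a list of functions applied in sequence, so that new annotation stages slot in without touching the core. The step functions below are invented stand-ins, not PATO's actual modules.

```python
def run_pipeline(sequences, steps):
    """Sketch of a modular annotation pipeline: each step is a plain
    function from a list of sequence records to a list of sequence
    records, so stages can be added, removed or reordered freely."""
    for step in steps:
        sequences = step(sequences)
    return sequences

def drop_short(seqs, min_len=10):
    # filtering stage: discard sequences too short to annotate usefully
    return [s for s in seqs if len(s["seq"]) >= min_len]

def flag_orf(seqs):
    # toy evidence stage: mark sequences that begin with a start codon
    for s in seqs:
        s["has_start_codon"] = s["seq"].startswith("ATG")
    return seqs

data = [{"seq": "ATGAAATTTGGGCCC"}, {"seq": "ATG"}]
result = run_pipeline(data, [drop_short, flag_orf])
```

A viewer, as described in the abstract, would then display `result` and let the curator attach information to each surviving record.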
90

Using functional annotation to characterize genome-wide association results

Fisher, Virginia Applegate. 11 December 2018
Genome-wide association studies (GWAS) have successfully identified thousands of variants robustly associated with hundreds of complex traits, but the biological mechanisms driving these results remain elusive. Functional annotation, describing the roles of known genes and regulatory elements, provides additional information about associated variants. This dissertation explores the potential of these annotations to explain the biology behind observed GWAS results. The first project develops a random-effects approach to genetic fine mapping of trait-associated loci. Functional annotation and estimates of the enrichment of genetic effects in each annotation category are integrated with linkage disequilibrium (LD) within each locus and GWAS summary statistics to prioritize variants with plausible functionality. Applications of this method to simulated and real data show good performance in a wider range of scenarios relative to previous approaches. The second project focuses on the estimation of enrichment by annotation categories. I derive the distribution of GWAS summary statistics as a function of annotations and LD structure and perform maximum likelihood estimation of enrichment coefficients in two simulated scenarios. The resulting estimates are less variable than previous methods, but the asymptotic theory of standard errors is often not applicable due to non-convexity of the likelihood function. In the third project, I investigate the problem of selecting an optimal set of tissue-specific annotations with greatest relevance to a trait of interest. I consider three selection criteria defined in terms of the mutual information between functional annotations and GWAS summary statistics. 
These algorithms correctly identify enriched categories in simulated data, but in an application to a GWAS of BMI the penalty for redundant features outweighs the modest relationships with the outcome, yielding empty selected feature sets, owing to the weaker overall association and the high similarity between tissue-specific regulatory features. All three projects require little in the way of prior hypotheses regarding the mechanism of genetic effects. These data-driven approaches have the potential to illuminate unanticipated biological relationships, but they are also limited by the high dimensionality of the data relative to the moderate strength of the signals under investigation. These approaches advance the set of tools available to researchers for drawing biological insights from GWAS results.
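The selection criterion described above, mutual information with the GWAS signal penalised by redundancy with already-selected annotations, can be sketched as a greedy loop. This is a generic mRMR-style toy, not the dissertation's exact criteria; the data and penalty are invented. It also reproduces the reported failure mode: a feature fully redundant with a selected one scores zero and is never added.

```python
import math
from itertools import product

def mutual_information(x, y):
    """MI (in nats) between two equal-length binary sequences,
    estimated from empirical joint frequencies."""
    n = len(x)
    mi = 0.0
    for a, b in product((0, 1), repeat=2):
        pxy = sum(1 for xi, yi in zip(x, y) if (xi, yi) == (a, b)) / n
        px = sum(1 for xi in x if xi == a) / n
        py = sum(1 for yi in y if yi == b) / n
        if pxy > 0:
            mi += pxy * math.log(pxy / (px * py))
    return mi

def greedy_select(features, target, penalty=1.0, rounds=2):
    """Greedily add the feature maximising MI with the target minus
    `penalty` times its summed MI with already-selected features;
    stop when no candidate scores above zero."""
    selected = []
    for _ in range(rounds):
        best, best_score = None, 0.0
        for name, f in features.items():
            if name in selected:
                continue
            score = mutual_information(f, target) - penalty * sum(
                mutual_information(f, features[s]) for s in selected)
            if score > best_score:
                best, best_score = name, score
        if best is None:
            break
        selected.append(best)
    return selected

# toy binarised GWAS signal and three annotation tracks:
# A is informative, B duplicates A (redundant), C is weakly related
target = [1, 1, 1, 0, 0, 0, 1, 0]
features = {"A": list(target), "B": list(target),
            "C": [0, 1, 0, 1, 0, 1, 0, 1]}
chosen = greedy_select(features, target)
```

After "A" is selected, "B" gains nothing over it and is excluded, mirroring how a strong redundancy penalty can shrink the selected set.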
