1 |
Emergency Medical Service EMR-Driven Concept Extraction From Narrative Text. George, Susanna Serene. 08 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / In the midst of a pandemic, where patients with minor symptoms can deteriorate quickly and others present with emergencies such as a STEMI heart attack or a fatal accident injury, medical research aimed at improving the speed and efficiency of patient care has grown in importance. As researchers in the computing domain work to bring automation to the tasks of first responders, reducing the cognitive load on the field crew, shortening the time taken to document each patient case, and improving the accuracy of report details have become priorities.
This paper presents an information extraction algorithm that custom-engineers existing extraction techniques: natural language processing tools such as MetaMap, a syntactic dependency parser such as spaCy for analyzing sentence structure, and regular expressions for recurring patterns, to retrieve patient-specific information from medical narratives. The resulting concept-value pairs automatically populate the fields of an EMR form, which can be reviewed and modified manually if needed. The report can then be reused for various medical and billing purposes related to the patient.
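The regular-expression layer this abstract describes can be sketched as follows. This is only an illustration of the pattern-matching step: the field names and patterns below are invented for the example, and the MetaMap and spaCy components of the actual system are not reproduced here.

```python
import re

# Illustrative patterns for vitals in a free-text EMS narrative.
# Field names and regexes are assumptions, not the thesis's actual schema.
VITAL_PATTERNS = {
    "blood_pressure": re.compile(r"\bBP\s*(?:of|:)?\s*(\d{2,3}/\d{2,3})\b", re.I),
    "heart_rate":     re.compile(r"\b(?:HR|pulse)\s*(?:of|:)?\s*(\d{2,3})\b", re.I),
    "spo2":           re.compile(r"\b(?:SpO2|O2 sat)\s*(?:of|:)?\s*(\d{2,3})\s*%", re.I),
}

def extract_concept_values(narrative: str) -> dict:
    """Return concept-value pairs found in a free-text narrative."""
    fields = {}
    for concept, pattern in VITAL_PATTERNS.items():
        match = pattern.search(narrative)
        if match:
            fields[concept] = match.group(1)
    return fields

narrative = "Pt found alert, BP 142/90, HR of 96, SpO2: 94 % on room air."
print(extract_concept_values(narrative))
```

In the full pipeline such pairs would pre-populate EMR form fields for manual review; regexes alone only cover the highly regular parts of a narrative, which is why the thesis pairs them with MetaMap and a dependency parser.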
|
2 |
Graph-Based Visualization of Ontology-Based Competence Profiles for Research Collaboration. Afzal, Mansoor. January 2012 (has links)
Information visualization is valuable in a wide range of applications. It deals with abstract, non-spatial data and with representing data elements in a meaningful form regardless of the size of the data; by focusing the representation on key aspects of the data, it eases goal-oriented interpretation. Information visualization aims to provide an immediate and deeper understanding of the data. Research collaboration enhances knowledge sharing and develops individual talent; new ideas are generated when knowledge is shared and transferred. According to He et al. (2009), research collaboration is a phenomenon of growing importance for researchers; it should be encouraged and is widely considered a "good thing". The main purpose of this thesis is to prepare a model for visualizing competence profiles. To that end, different visualization techniques from the field of information visualization are studied and discussed, and this discussion motivates the selection of appropriate techniques for visualizing ontology-based competence profiles for research collaboration. A proof of concept is developed that shows how these techniques can be applied to visualize several components of a competence profile.
|
4 |
Syntax-based Concept Extraction For Question Answering. Glinos, Demetrios. 01 January 2006 (has links)
Question answering (QA) stands squarely along the path from document retrieval to text understanding. As an area of research interest, it serves as a proving ground where strategies for document processing, knowledge representation, question analysis, and answer extraction may be evaluated in real-world information extraction contexts. The task is to go beyond representing text documents as "bags of words" or data blobs that can be scanned for keyword combinations and word collocations in the manner of Internet search engines. Instead, the goal is to recognize and extract the semantic content of the text and to organize it in a manner that supports reasoning about the concepts represented. The central issue is how to obtain and query such a structure without either a predefined set of concepts or a predefined set of relationships among concepts. This research investigates a means of acquiring from text documents both the underlying concepts and their interrelationships. Specifically, a syntax-based formalism for representing atomic propositions extracted from text documents is presented, together with a method for constructing a network of concept nodes that indexes such logical forms by the discourse entities they contain. It is shown that meaningful questions can be decomposed into Boolean combinations of question patterns using the same formalism, with free variables representing the desired answers. It is further shown that this formalism supports robust question answering using the concept network together with WordNet synonym, hypernym, hyponym, and antonym relationships. The formalism was implemented in the Semantic Extractor (SEMEX) research tool and tested against the factoid questions of the 2005 Text Retrieval Conference (TREC), which drew upon the AQUAINT corpus of newswire documents.
After adjusting for the limitations of the tool and the document set, correct answers were found for approximately fifty percent of the questions analyzed, which compares favorably with other question answering systems.
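The indexing-and-matching idea in this abstract can be sketched in miniature: propositions stored as triples, a concept index keyed by the discourse entities they mention, and a question expressed as a pattern with a free variable. SEMEX's syntax-based logical forms and its WordNet expansion are far richer than this; the triples below are invented examples.

```python
from collections import defaultdict

# Toy atomic propositions as (subject, relation, object) triples.
# These are illustrative, not drawn from the AQUAINT corpus.
propositions = [
    ("insulin", "treats", "diabetes"),
    ("aspirin", "treats", "headache"),
    ("insulin", "produced_by", "pancreas"),
]

# Concept network: each discourse entity points to the propositions mentioning it.
concept_index = defaultdict(list)
for prop in propositions:
    for entity in (prop[0], prop[2]):
        concept_index[entity].append(prop)

FREE = None  # free variable standing for the desired answer

def answer(pattern):
    """Bind the free variable by matching the pattern against indexed propositions."""
    subject, relation, obj = pattern
    anchor = subject if subject is not FREE else obj  # a known concept in the question
    results = []
    for prop in concept_index[anchor]:
        if all(q is FREE or p == q for p, q in zip(prop, pattern)):
            results.extend(p for p, q in zip(prop, pattern) if q is FREE)
    return results

# "What treats diabetes?"  ->  (X, treats, diabetes)
print(answer((FREE, "treats", "diabetes")))
```

A Boolean combination of such patterns, plus WordNet-based relaxation of the relation and entity terms, would bring this closer to the formalism the thesis describes.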
|
5 |
Rule-based data augmentation for document-level medical concept extraction. Shao, Qiwei. 08 1900 (has links)
Document-level medical concept extraction identifies the distinct medical concepts that appear across an entire document. It is crucial for enhancing information retrieval and question-answering models, which must understand the concepts in queries and documents without needing precise mention annotations.
Traditional research has focused on Named Entity Recognition (NER) or Entity Linking (EL) separately, relying heavily on extensive manual annotations that are often unavailable in question-answering datasets. Moreover, most NER and EL methods are limited in their ability to take context into account when matching text to concept IDs, which complicates the identification of polysemous terms and non-canonical concept names that require contextual disambiguation.
Our approach addresses three challenges: the scarcity of labeled training data, non-canonical concept names, and polysemy. We treat document-level concept extraction as a concept-document embedding matching problem, enabling the model to learn from context without extensive manual annotations. To tackle the lack of labeled data for many test concepts, we use pseudo-annotations generated by MetaMapLite. The assumption is that although MetaMapLite's annotations are noisy, as long as the majority of them are correct they can provide useful signal for training a neural matching model.
Our experiments show that our data augmentation method surpasses baseline models such as BioBERT, BiomedBERT, BioLinkBERT, and SapBERT by 6.8% to 46.7%, both in general concept extraction and in specific scenarios involving undertrained concepts, non-canonical names, and polysemous terms. Our model proves robust across various configurations, including the quantity and weighting of augmented training samples, the embedding methods, and the pseudo-annotation filters.
We establish a solid foundation for document-level medical concept extraction through data augmentation. Our study points to a promising avenue of exploiting diverse data augmentation techniques to improve document-level concept extraction.
|
6 |
Extraction automatique et visualisation des thèmes abordés dans des résumés de mémoires et de thèses en anthropologie au Québec, de 1985 à 2009. Samson, Anne-Renée. 06 1900 (has links)
Taking advantage of recent developments in computer-assisted text analysis, electronic document management, information visualization, and, in part, anthropology, this exploratory study uses text mining techniques to create a thematic map of anthropological documents. More precisely, we evaluate the method of agglomerative hierarchical clustering (HCA) for thematic analysis and information extraction. We built our study on a database of 1,240 thesis abstracts granted from 1985 to 2009 by the anthropology departments of Université de Montréal and Université Laval, as well as the history department of Université Laval (for archaeological and ethnological abstracts). In the first section, we present our theoretical framework: we define text mining, its origins, its practical applications, and its methodology, and we close with a literature review. The second section is devoted to the methodological framework, where we discuss the various stages through which the project was conducted: construction of the database, linguistic and statistical filtering, automated classification, and so on. Finally, in the last section, we present the results of two specific experiments along with our interpretations. We also discuss thematic navigation and conceptual approaches to thematization, for example, the culture/biology dichotomy in anthropology. We conclude with the limitations we faced in this project and paths of interest for future research.
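The agglomerative hierarchical clustering this abstract evaluates can be shown in a compact plain-Python sketch. Toy 2-D points stand in for the high-dimensional term vectors the thesis builds from its 1,240 abstracts, so only the merge logic is illustrated.

```python
def dist(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def average_linkage(c1, c2, points):
    """Average pairwise distance between the members of two clusters."""
    return sum(dist(points[i], points[j]) for i in c1 for j in c2) / (len(c1) * len(c2))

def hca(points, n_clusters):
    clusters = [[i] for i in range(len(points))]  # start: one point per cluster
    while len(clusters) > n_clusters:
        # find and merge the closest pair of clusters
        i, j = min(
            ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))),
            key=lambda ab: average_linkage(clusters[ab[0]], clusters[ab[1]], points),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
print(hca(points, 2))  # two thematic clusters of point indices
```

Run bottom-up to a single cluster, the sequence of merges yields the dendrogram that thematic navigation of the corpus would be built on.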
|
8 |
[pt] GERAÇÃO AUTOMÁTICA DE CONEXÕES PARA GESTÃO DE CONHECIMENTO / [en] ON AUTOMATIC GENERATION OF KNOWLEDGE CONNECTIONS. Fraga, Felipe Poggi de Aragao. 10 November 2022 (has links)
[en] Recently, the topic of Personal Knowledge Management (PKM) has seen
a surge in popularity. This is illustrated by the accelerated growth of apps
such as Notion, Obsidian, and Roam Research, and the appearance of books
like How to Take Smart Notes and Building a Second Brain.
However, the area of PKM has not seen much integration with the field of
Natural Language Processing (NLP). This opens up an interesting opportunity
to apply NLP techniques to knowledge operations tasks.
Our objective is the development of a Software System that uses NLP and
note-taking apps to transform a siloed text collection into an interconnected
and inter-navigable text collection. The system uses navigation mechanisms
based on shared concepts and semantic relatedness between texts.
In this study, we present a methodology to build this system, the research
context, demonstrations using examples, and an evaluation to determine if the
system functions properly and if the proposed connections are coherent.
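The shared-concept linking mechanism described above can be sketched as follows. In the actual system, concepts are extracted from the notes with NLP and complemented by semantic-relatedness scores; here, the concept sets are given by hand and only the shared-concept links are built. All note names and concepts are invented examples.

```python
from itertools import combinations

# Hand-labeled concept sets standing in for NLP-extracted concepts.
notes = {
    "zettelkasten.md": {"note-taking", "knowledge management"},
    "obsidian-setup.md": {"obsidian", "note-taking"},
    "embeddings.md": {"nlp", "semantic search"},
}

def build_links(notes):
    """Map each pair of notes to the concepts they share."""
    links = {}
    for a, b in combinations(sorted(notes), 2):
        shared = notes[a] & notes[b]
        if shared:
            links[(a, b)] = shared
    return links

print(build_links(notes))
```

Each link would become a navigable connection in the note-taking app; pairs with no shared concept (like `embeddings.md` here) are candidates for the semantic-relatedness mechanism instead.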
|