1 |
Ontology Enrichment Based on Unstructured Text Data. Lukšová, Ivana. January 2013.
Title: Ontology Enrichment Based on Unstructured Text Data
Author: Ivana Lukšová
Department: Department of Software Engineering
Supervisor: Mgr. Martin Nečaský, Ph.D., Department of Software Engineering
Abstract: Semantic annotation, attaching semantic information to text data, is a fundamental task in knowledge extraction. Several ontology-based semantic annotation platforms have been proposed in recent years. However, automated ontology engineering is still a challenging problem. In this work, a new semi-automatic method for ontology enrichment based on unstructured text is presented to facilitate this process. NLP and machine learning methods are employed to extract new ontological elements, such as concepts and relations, from text. Our method achieves an F-measure of up to 71% for concept extraction and up to 68% for relation extraction.
Keywords: ontology, machine learning, knowledge extraction
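The reported F-measure combines precision and recall. As an illustration only (not code from the thesis), the following minimal sketch shows how such concept- and relation-extraction scores would be computed against a gold-standard ontology; the function name and set contents are hypothetical.

    def precision_recall_f1(extracted, gold):
        """Score a set of extracted ontology elements against a gold standard."""
        extracted, gold = set(extracted), set(gold)
        tp = len(extracted & gold)                      # correctly extracted elements
        precision = tp / len(extracted) if extracted else 0.0
        recall = tp / len(gold) if gold else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    # Illustrative use: concepts proposed by the extractor vs. concepts in a reference ontology.
    p, r, f = precision_recall_f1({"Person", "Company", "Contract"},
                                  {"Person", "Company", "Invoice"})

An F-measure of 71% for concept extraction therefore means the harmonic mean of precision and recall reached 0.71 on that task.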
|
2 |
Knowledge Extraction from Logged Truck Data using Unsupervised Learning Methods. Grubinger, Thomas. January 2008.
The goal was to extract knowledge from data that is logged by the electronic system of every Volvo truck. This allowed the evaluation of large populations of trucks without requiring additional measuring devices and facilities. An evaluation cycle, similar to the knowledge discovery in databases model, was developed and applied to extract knowledge from the data. The focus was on extracting information in the logged data that relates to the class labels of different populations, but the cycle also supported knowledge extraction within the given classes. The methods used come from the field of unsupervised learning, a sub-field of machine learning, and include self-organizing maps, multidimensional scaling and fuzzy c-means clustering. The developed evaluation cycle was exemplified by the evaluation of three data sets. Two data sets were arranged from populations of trucks differing in their operating environment regarding road condition or gross combination weight. The results showed that there is relevant information in the logged data that describes these differences in the operating environment. A third data set consisted of populations with different engine configurations, making the two groups of trucks unequally powerful. Using the knowledge extracted in this task, engines that were sold in one of the two configurations and modified later could be detected. Information in the logged data that describes a vehicle's operating environment makes it possible to detect trucks that are operated differently from their intended use. Initial experiments to find such vehicles were conducted and recommendations for an automated application were given.
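Among the methods named above, fuzzy c-means assigns each sample a graded membership in every cluster rather than a hard label. The following is a minimal NumPy sketch of the standard fuzzy c-means iteration on a generic feature matrix; the feature dimensions, parameters, and synthetic data are illustrative assumptions, not the thesis's actual setup.

    import numpy as np

    def fuzzy_c_means(X, n_clusters=3, m=2.0, max_iter=100, tol=1e-5, seed=0):
        """Standard fuzzy c-means: returns cluster centers and the membership matrix."""
        rng = np.random.default_rng(seed)
        U = rng.random((X.shape[0], n_clusters))
        U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
        for _ in range(max_iter):
            Um = U ** m
            centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
            new_U = 1.0 / (dist ** (2 / (m - 1)))
            new_U /= new_U.sum(axis=1, keepdims=True)
            shift = np.abs(new_U - U).max()
            U = new_U
            if shift < tol:
                break
        return centers, U

    # Illustrative use on synthetic "logged" features (e.g. speed, load, brake usage).
    X = np.random.default_rng(1).random((200, 3))
    centers, memberships = fuzzy_c_means(X, n_clusters=2)

Graded memberships are what distinguish the method from hard clustering: a truck whose usage lies between two operating environments is not forced into either.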
|
3 |
Extracting Structured Knowledge from Textual Data in Software Repositories. Hasan, Maryam. 06 1900.
Software team members, as they communicate and coordinate their work with others throughout the life-cycle of their projects, generate different kinds of textual artifacts. Despite the variety of work in the area of mining software artifacts, relatively little research has focused on communication artifacts. Software communication artifacts, in addition to source code artifacts, contain useful semantic information that is not fully explored by existing approaches.
This thesis presents the development of a text analysis method and tool to extract and represent useful pieces of information from a wide range of textual data sources associated with software projects. Our text analysis system integrates Natural Language Processing techniques and statistical text analysis methods with software domain knowledge. The extracted information is represented as RDF-style triples which capture relations of interest between developers and software products. We applied the developed system to analyze five kinds of textual data, i.e., source code commits, bug reports, email messages, chat logs, and wiki pages. In the evaluation of our system, we found its precision to be 82%, its recall 58%, and its F-measure 68%.
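As a consistency check, the reported precision of 82% and recall of 58% give F = 2(0.82)(0.58)/(0.82 + 0.58) ≈ 0.68, matching the stated F-measure. The abstract also says the extracted information is represented as RDF-style triples relating developers to software products; below is a minimal rdflib sketch of what such a triple could look like. The namespace, predicates, and entities are invented for illustration and are not the thesis's actual schema.

    from rdflib import Graph, Literal, Namespace

    # Hypothetical namespace and predicates, for illustration only.
    EX = Namespace("http://example.org/swrepo/")

    g = Graph()
    g.bind("ex", EX)
    # Developer-to-artifact relations of the kind described above (names are invented).
    g.add((EX.developer_alice, EX.fixes, EX.bug_1234))
    g.add((EX.bug_1234, EX.discussedIn, Literal("email thread #42")))

    print(g.serialize(format="turtle"))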
|
4 |
Extracting Structured Knowledge from Textual Data in Software Repositories. Hasan, Maryam. Unknown Date.
No description available.
|
5 |
Domain-specific Knowledge Extraction from the Web of Data. Lalithsena, Sarasi. 07 June 2018.
No description available.
|
6 |
Extração de conhecimento de redes neurais artificiais / Knowledge extraction from artificial neural networks. Martineli, Edmar. 20 August 1999.
This work describes experiments carried out with Artificial Neural Networks and symbolic learning algorithms. Two algorithms for extracting knowledge from Artificial Neural Networks are also investigated. The experiments are performed on three data sets in order to compare the performance obtained. The data sets used in this work are: bankruptcy data for Brazilian banks, the tic-tac-toe data set, and a credit analysis data set. Three techniques for improving performance are applied to the data: partitioning by the smallest class, adding noise to the examples of the smallest class, and selecting the most relevant attributes. Besides the analysis of the performance obtained, the difficulty of understanding the knowledge extracted by each method on each data set is also analysed.
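One of the three techniques above, adding noise to examples of the smallest class, amounts to oversampling the minority class with small perturbations before training. Below is a minimal NumPy sketch of that idea under assumed shapes and noise scale; it is an illustration, not the procedure actually used in the thesis.

    import numpy as np

    def augment_minority_class(X, y, noise_scale=0.01, factor=2, seed=0):
        """Oversample the smallest class by duplicating its examples with small Gaussian noise."""
        rng = np.random.default_rng(seed)
        classes, counts = np.unique(y, return_counts=True)
        minority = classes[np.argmin(counts)]
        X_min = X[y == minority]
        noisy = [X_min + rng.normal(0.0, noise_scale, X_min.shape) for _ in range(factor)]
        X_aug = np.vstack([X] + noisy)
        y_aug = np.concatenate([y, np.full(factor * len(X_min), minority)])
        return X_aug, y_aug

    # Illustrative use with synthetic features and an imbalanced label vector.
    X = np.random.default_rng(1).random((100, 5))
    y = np.array([0] * 90 + [1] * 10)
    X_aug, y_aug = augment_minority_class(X, y)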
|
7 |
Développement de méthodes évolutionnaires d'extraction de connaissance et application à des systèmes biologiques complexes / Development of evolutionary knowledge extraction methods and their application to complex biological systems. Linard, Benjamin. 15 October 2012.
Systems biology has developed enormously over the last ten years, with studies covering diverse biological levels (molecule, network, tissue, organism, ecosystem...). From an evolutionary point of view, systems biology provides unequalled opportunities. This thesis describes new methodologies and tools to study the evolution of biological systems while taking into account the multidimensional nature of biological data. It thus addresses a clear methodological gap for high-throughput studies in the recent field of evolutionary systems biology. By considering the multi-level aspects of biological systems, this work highlights new evolutionary trends associated with both intra- and inter-process constraints. In particular, this thesis includes (i) the development of an algorithm and a bioinformatics tool dedicated to comprehensive inference and analysis of orthology relationships between the genes of hundreds of species, (ii) the development of an original formalism for the integration of multidimensional biological variables, allowing a synthetic representation of the evolutionary history of a given gene, and (iii) the combination of this integrative tool with mathematical knowledge discovery approaches in order to highlight evolutionary perturbations in documented human biological processes (metabolic pathways, signalling pathways...).
|
8 |
Gestion de l'incertitude dans le processus d'extraction de connaissances à partir de textes / Uncertainty management in the knowledge extraction process from text. Kerdjoudj, Fadhela. 08 December 2015.
The multiplication of textual sources on the Web offers an opportunity for knowledge extraction from text and for the creation of knowledge bases. Recently, many works in this area have appeared or intensified. To extract relevant and precise information from text, linguistic approaches, used for example to extract concepts related to named entities and to temporal and spatial aspects, must be combined with methods from semantic processing in order to bring out the relevance and precision of the information conveyed. However, the imperfections inherent in natural language must be handled effectively. To this end, we propose a method for qualifying and quantifying the uncertainty of the different portions of the analysed texts. Finally, to be useful at Web scale, the linguistic processing must be multi-source and cross-lingual. This thesis addresses this problem in its entirety: our contributions cover the extraction and representation of uncertain knowledge as well as the visualization of the generated graphs and their querying. The research was carried out under a CIFRE grant involving the Laboratoire d'Informatique Gaspard Monge (LIGM) of the Université Paris-Est Marne-la-Vallée and the GEOLSemantics company, building on several years of accumulated experience in natural language processing (GEOLSemantics) and semantics (LIGM). In this context, our contributions are the following:
- participation in the development of the GEOLSemantics knowledge extraction system, in particular: (1) an expressive ontology for knowledge representation, (2) a coherence-checking module, and (3) a graphical visualization tool;
- the integration of the qualification of different forms of uncertainty, based on ontology processing, within the process of knowledge extraction from text;
- the quantification of the different forms of uncertainty identified, based on a set of heuristics;
- a representation, using RDF graphs, of the extracted knowledge and its associated uncertainty;
- a SPARQL querying method integrating the different forms of uncertainty;
- an evaluation and analysis of the results obtained with our approach.
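Two of the contributions above are a representation of uncertain knowledge in RDF graphs and a SPARQL querying method that takes the uncertainty into account. The following rdflib sketch shows one generic way to attach a confidence score to an extracted statement and filter on it in SPARQL; the reification-style modelling, namespace, and threshold are illustrative assumptions, not the thesis's actual representation.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import XSD

    # Hypothetical namespace and confidence predicate, for illustration only.
    EX = Namespace("http://example.org/uncertain/")

    g = Graph()
    g.bind("ex", EX)
    # Reify an extracted statement as a node and annotate it with a confidence score
    # (this modelling choice is ours, not necessarily the thesis's).
    g.add((EX.stmt1, EX.subject, EX.JohnSmith))
    g.add((EX.stmt1, EX.predicate, EX.worksFor))
    g.add((EX.stmt1, EX.object, EX.GEOLSemantics))
    g.add((EX.stmt1, EX.confidence, Literal(0.72, datatype=XSD.double)))

    # A SPARQL query that keeps only statements above a confidence threshold.
    results = g.query("""
        PREFIX ex: <http://example.org/uncertain/>
        SELECT ?s ?p ?o ?c WHERE {
            ?stmt ex:subject ?s ; ex:predicate ?p ; ex:object ?o ; ex:confidence ?c .
            FILTER (?c >= 0.5)
        }
    """)
    for row in results:
        print(row)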
|
9 |
Extrakcia nešpecifikovaných znalostí z webu / Extraction of unspecified relations from the web. Ovečka, Marek. January 2013.
The subject of this thesis is non-specific knowledge extraction from the web. In recent years, tools that improve the results of this type of knowledge extraction have been created. The aim of this thesis is to become familiar with these tools, test them, and propose uses for their results. The tools are described and compared, and extraction is carried out using OLLIE. Based on the results of the extractions, two methods of enriching the extractions using named entity recognition are proposed. The first modifies the confidence weights of the extractions; the second enriches the extractions with named entities. The thesis also proposes an ontology that captures the structure of the enriched extractions. In the last part, a practical experiment is carried out to demonstrate the proposed methods. Future research in this field would be useful in the areas of extraction and categorization of relational phrases.
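The two proposed enrichments, adjusting extraction weights and attaching named entities to the arguments, can be pictured with a small Python sketch. The data structure, entity labels, and boosting rule below are illustrative assumptions only; no real OLLIE or NER output format is reproduced here.

    from dataclasses import dataclass, replace

    @dataclass
    class Extraction:
        arg1: str
        relation: str
        arg2: str
        confidence: float

    def enrich_with_entities(extraction, ner_labels, boost=0.1):
        """Raise the confidence of an extraction whose arguments are recognized named entities,
        and return the entity types; ner_labels maps surface strings to entity types."""
        types = (ner_labels.get(extraction.arg1), ner_labels.get(extraction.arg2))
        recognized = sum(t is not None for t in types)
        new_conf = min(1.0, extraction.confidence + boost * recognized)
        return replace(extraction, confidence=new_conf), types

    # Illustrative use with a hand-written extraction and hand-written NER output;
    # in practice the labels would come from a named entity recognizer run over the sentence.
    ext = Extraction("Barack Obama", "was born in", "Honolulu", 0.62)
    enriched, types = enrich_with_entities(ext, {"Barack Obama": "PERSON", "Honolulu": "LOCATION"})
    print(enriched, types)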
|