271.
Ontology Integration with Non-Violation Check and Context Extraction
Wu, Dan. January 2013.
Matching and integrating ontologies is a desirable technique in areas such as data fusion, knowledge integration, the Semantic Web and the development of advanced services in distributed systems. Unfortunately, the heterogeneity of ontologies poses major obstacles to developing this technique. This licentiate thesis describes an approach that tackles the problem of ontology integration using description logics and production rules, on both a syntactic and a semantic level. Concepts in ontologies are matched and integrated to generate ontology intersections. Context is extracted, and rules for reasoning over heterogeneous ontologies with context are developed. Ontologies are integrated by two processes. The first generates an ontology intersection from two OWL ontologies: an independent ontology containing non-contradictory assertions based on the original ontologies. The second is carried out by rules that use extracted context, such as ontology content and ontology description data, e.g. time and ontology creator. The integration is designed for conceptual ontology integration; instance information is considered neither in the integration process nor in its results. An ontology reasoner is used for the non-violation check of the two OWL ontologies, and a rule engine handles conflicts according to production rules. The reasoner checks the satisfiability of concepts with the help of anchors, i.e. synonyms and string-identical entities; production rules then integrate the ontologies under the constraint that the original ontologies must not be violated. The second integration process applies production rules to the context data of the ontologies. Ontology reasoning in a repository is normally conducted within the boundary of each ontology.
Nonetheless, with context rules, reasoning is carried out across ontologies. The contents of an ontology provide context for its defined entities and are extracted, with the help of an ontology reasoner, to serve as that context. Ontology metadata provide further criteria for describing ontologies. Rules that use context, also called context rules, are developed and built into the repository; new rules can be added. The scientific contribution of the thesis is an approach that applies semantics-based techniques as a complementary method for matching and integrating ontologies semantically. Through an illustration of the integration process and the context rules, together with a few manually integrated ontology results, the approach shows the potential to support the development of advanced knowledge-based services.
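The anchor-plus-non-violation idea can be sketched as a toy procedure (all ontology data and names are invented for illustration, and a dictionary lookup of disjointness stands in for the OWL reasoner the thesis uses):

```python
def find_anchors(ont_a, ont_b, synonyms):
    """Anchors are string-identical concept names plus known synonym pairs."""
    anchors = {c: c for c in ont_a["concepts"] & ont_b["concepts"]}
    for a, b in synonyms:
        if a in ont_a["concepts"] and b in ont_b["concepts"]:
            anchors[a] = b
    return anchors

def integrate(ont_a, ont_b, synonyms):
    """Build an 'ontology intersection': subclass assertions from the first
    ontology, mapped through anchors, kept only if the mapped assertion
    does not contradict a disjointness declared in the second ontology."""
    anchors = find_anchors(ont_a, ont_b, synonyms)
    intersection = {"concepts": set(anchors), "subclass_of": set()}
    for child, parent in ont_a["subclass_of"]:
        if child in anchors and parent in anchors:
            mapped = (anchors[child], anchors[parent])
            if mapped not in ont_b.get("disjoint", set()):  # non-violation check
                intersection["subclass_of"].add(mapped)
    return intersection

# Toy ontologies: "Car" and "Automobile" are synonym anchors
ont_a = {"concepts": {"Car", "Vehicle"}, "subclass_of": {("Car", "Vehicle")}}
ont_b = {"concepts": {"Automobile", "Vehicle"},
         "subclass_of": {("Automobile", "Vehicle")}}
merged = integrate(ont_a, ont_b, synonyms=[("Car", "Automobile")])
```

Here the "reasoner" is reduced to a disjointness lookup; a real non-violation check would test concept satisfiability after adding each mapped assertion.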
272.
Automated Methods of Textual Content Analysis and Description of Text Structures (Automatizované metody popisu struktury odborného textu a vztah některých prvků ke kvalitě textu)
Chýla, Roman. January 2012.
Universal Semantic Language (USL) is a semi-formalized approach to the description of knowledge (a knowledge representation tool). The idea of USL was introduced by Vladimir Smetacek in the system called SEMAN, which was used for keyword extraction tasks in the former Information Centre of the Czechoslovak Republic. However, after the dissolution of the centre in the early 1990s, the system was lost. This thesis reintroduces the idea of USL in the new context of quantitative content analysis. First, we introduce the historical background and the problems of semantics and knowledge representation: semes, semantic fields, semantic primes and universals. The basic methodology of content analysis studies is illustrated on the example of three content analysis tools, and we describe the architecture of a new system. The application was built specifically for USL discovery, but it can also work in the context of classical content analysis. It contains Natural Language Processing (NLP) components and employs an algorithm for collocation discovery adapted to the search for co-occurrences between semantic annotations. The software is evaluated by comparing its pattern matching mechanism against another existing and established extractor. The semantic translation mechanism is evaluated in the task of...
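Collocation discovery adapted to co-occurrences between semantic annotations might look like the following sketch, using pointwise mutual information, a standard collocation measure (the annotation data is invented; SEMAN's actual scoring is not documented here):

```python
import math
from collections import Counter

def pmi_cooccurrences(annotated_sentences):
    """Score pairs of semantic annotations occurring in the same sentence
    with pointwise mutual information: log2(P(a,b) / (P(a) * P(b)))."""
    single = Counter()
    pair = Counter()
    n = len(annotated_sentences)
    for tags in annotated_sentences:
        uniq = sorted(set(tags))          # one count per sentence
        single.update(uniq)
        pair.update((a, b) for i, a in enumerate(uniq) for b in uniq[i + 1:])
    return {(a, b): math.log2(c * n / (single[a] * single[b]))
            for (a, b), c in pair.items()}

# Invented sentence-level annotation sets
sentences = [["AGENT", "ACTION"], ["AGENT", "ACTION"],
             ["AGENT", "OBJECT"], ["OBJECT"]]
scores = pmi_cooccurrences(sentences)
```

A positive PMI marks annotations that co-occur more often than chance, which is the collocation criterion carried over from word pairs to semantic tags.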
273.
Using Semantic Web Technologies for Classification Analysis in Social Networks
Opuszko, Marek. January 2011.
The Semantic Web enables people and computers to interact and exchange information. Different machine learning applications have been designed on top of Semantic Web technologies. Particularly noteworthy is the possibility of creating complex metadata descriptions for any problem domain, based on predefined ontologies. In this paper we evaluate the use of a semantic similarity measure based on predefined ontologies as input for a classification analysis. A link prediction between actors of a social network is performed, which could serve as a recommendation system. We measure the prediction performance based on ontology-based metadata modeling as well as feature vector modeling. The findings demonstrate that prediction accuracy based on ontology-based metadata is comparable to traditional approaches and show that data mining using ontology-based metadata is a very promising approach.
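As a sketch of the approach, here is link prediction driven by an ontology-based similarity measure: a Wu-Palmer-style score over a tiny invented interest taxonomy, with a plain threshold standing in for the trained classifier used in the paper:

```python
def taxonomy_similarity(a, b, parents):
    """Wu-Palmer-style similarity: 2*depth(lcs) / (depth(a) + depth(b)),
    where depth is path length to the taxonomy root."""
    def path_to_root(c):
        path = [c]
        while c in parents:
            c = parents[c]
            path.append(c)
        return path
    pa, pb = path_to_root(a), path_to_root(b)
    lcs = next((c for c in pa if c in pb), None)  # lowest common subsumer
    if lcs is None:
        return 0.0
    return 2 * len(path_to_root(lcs)) / (len(pa) + len(pb))

def predict_link(interests_a, interests_b, parents, threshold=0.5):
    """Predict a tie between two actors if any pair of their interests
    is sufficiently close in the concept taxonomy."""
    best = max(taxonomy_similarity(x, y, parents)
               for x in interests_a for y in interests_b)
    return best >= threshold

# Invented interest taxonomy for two actor profiles
parents = {"Jazz": "Music", "Rock": "Music", "Music": "Interest",
           "Hiking": "Sport", "Sport": "Interest"}
```

In the paper's setting the similarity values would form a feature vector fed to a classifier rather than being thresholded directly.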
274.
Semantic Components: A Model for Enhancing Retrieval of Domain-Specific Information
Price, Susan Loucette. 01 March 2008.
Despite the success of general Internet search engines, information retrieval remains an incompletely solved problem. Our research focuses on supporting domain experts when they search domain-specific libraries to satisfy targeted information needs. The semantic components model introduces a schema specific to a particular document collection. A semantic component schema consists of a two-level hierarchy, document classes and semantic components. A document class represents a document grouping, such as topic type or document purpose. A semantic component is a characteristic type of information that occurs in a particular document class and represents an important aspect of the document’s main topic. Semantic component indexing identifies the location and extent of semantic component instances within a document and can supplement traditional full text and keyword indexing techniques. Semantic component searching allows a user to refine a topical search by indicating a preference for documents containing specific semantic components or by indicating terms that should appear in specific semantic components.
We investigate four aspects of semantic components in this research. First, we describe lessons learned from using two methods for developing schemas in two domains. Second, we demonstrate use of semantic components to express domain-specific concepts and relationships by mapping a published taxonomy of questions asked by family practice physicians to the semantic component schemas for two document collections about medical care. Third, we report the results of a user study, showing that manual semantic component indexing is comparable to manual keyword indexing with respect to time and perceived difficulty and suggesting that semantic component indexing may be more accurate and consistent than manual keyword indexing. Fourth, we report the results of an interactive searching study, demonstrating the ability of semantic components to enhance search results compared to a baseline system without semantic components.
In addition, we contribute a formal description of the semantic components model, a prototype implementation of semantic component indexing software, and a prototype implementation adding semantic components to an existing commercial search engine. Finally, we analyze metrics for evaluating instances of semantic component indexing and keyword indexing and illustrate use of a session-based metric for evaluating multiple-query search sessions.
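The search side of the model can be sketched as follows; the medical documents, component names, and scoring are invented, and real semantic component indexing marks spans far more carefully:

```python
def component_search(documents, topic, prefer_component=None):
    """Semantic-component search: a topical match scores higher when the
    topic term falls inside an instance of the preferred component.
    Component instances are (start, end) character offsets in the text."""
    ranked = []
    for doc in documents:
        text = doc["text"].lower()
        pos = text.find(topic.lower())
        if pos < 0:
            continue                      # no topical match at all
        score = 1.0
        spans = doc.get("components", {}).get(prefer_component, [])
        if any(start <= pos < end for start, end in spans):
            score += 1.0                  # boost: inside preferred component
        ranked.append((score, doc["id"]))
    ranked.sort(key=lambda t: (-t[0], t[1]))
    return [doc_id for _, doc_id in ranked]

# Invented two-document collection with semantic component annotations
docs = [
    {"id": "d1", "text": "history of amoxicillin allergy",
     "components": {"Diagnosis": [(0, 30)]}},
    {"id": "d2", "text": "amoxicillin dosing in children",
     "components": {"Treatment": [(0, 18)]}},
]
```

Preferring the hypothetical "Treatment" component promotes the dosing document over the allergy document even though both mention the drug.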
275.
Deriving a Better Metric to Assess the Quality of Word Embeddings Trained on Limited Specialized Corpora
Munbodh, Mrinal. January 2020.
No description available.
276.
Ontology Generation, Information Harvesting and Semantic Annotation for Machine-Generated Web Pages
Tao, Cui. 17 December 2008.
The current World Wide Web is a web of pages. Users have to guess keywords that might lead, through search engines, to pages containing information of interest, and then browse hundreds or even thousands of returned pages to find what they want. This frustrating problem motivates an approach that turns the web of pages into a web of knowledge, so that web users can query the information of interest directly. This dissertation provides a step in that direction and a way to partially overcome the challenges. Specifically, it shows how to turn machine-generated web pages, like those on the hidden web, into semantic web pages for the web of knowledge. We design and develop three systems to address this challenge: TISP (Table Interpretation for Sibling Pages), TISP++, and FOCIH (Form-based Ontology Creation and Information Harvesting). TISP automatically interprets hidden-web tables. Given interpreted tables, TISP++ generates ontologies and semantically annotates the information in those tables automatically, making hidden information publicly accessible. FOCIH lets users generate personalized ontologies: it provides an interface through which users express their own view by creating a form that specifies the information they want. Based on the form, FOCIH generates user-specific ontologies, and based on patterns in machine-generated pages, it harvests information and annotates those pages with respect to the generated ontology. Users can then query the annotated information directly. With these contributions, this dissertation serves as a foundational pillar for turning the current web of pages into a web of knowledge.
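The key observation behind sibling-page table interpretation, that cells which stay constant across machine-generated sibling pages act as labels while varying cells carry values, can be sketched as follows (the page data is invented, and real TISP handles far more general table layouts):

```python
def interpret_sibling_tables(pages):
    """Interpret two-column tables scraped from machine-generated sibling
    pages: a left-hand cell identical on every page is taken as an
    attribute label; the right-hand cell supplies each page's value."""
    n_rows = len(pages[0])
    is_label = [all(page[i][0] == pages[0][i][0] for page in pages)
                for i in range(n_rows)]
    return [{page[i][0]: page[i][1] for i in range(n_rows) if is_label[i]}
            for page in pages]

# Two sibling pages of a hypothetical chemistry site, same generated layout
pages = [
    [["Element", "Hydrogen"], ["Symbol", "H"], ["Atomic number", "1"]],
    [["Element", "Helium"], ["Symbol", "He"], ["Atomic number", "2"]],
]
records = interpret_sibling_tables(pages)
```

The extracted attribute-value records are exactly the material TISP++ would then lift into an ontology and annotate.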
277.
Orthographic Similarity and False Recognition for Unfamiliar Words
Perrotte, Jeffrey. 01 December 2015.
There is evidence of false recognition (FR) driven by orthographic similarities within languages (Lambert, Chang, & Lin, 2001; Raser, 1972) and some evidence that FR crosses languages (Parra, 2013). No study has investigated whether FR based on orthographic similarities occurs for unknown words in an unknown language. This study aimed to answer that question. It further explored whether FR based on orthographic similarities is more likely in a known language (English) than in an unknown one (Spanish). Forty-six English monolinguals participated. They studied 50 English and 50 Spanish words during a study phase. A recognition test was given immediately afterwards, consisting of 40 Spanish and 40 English words: list words (words presented at study); homographs (words not presented at study but orthographically similar to studied words); and unrelated words (words neither presented at study nor orthographically similar to studied words). LSD post-hoc tests showed significant results supporting the hypothesis that false recognition based on orthographic similarities occurs for words in a known language (English) and in an unknown language (Spanish), and further supported the hypothesis that such false recognition is more likely in a known language than in an unknown one. The results provide evidence that both meaning and orthographic form are used when information is encoded, thereby influencing recognition decisions, and they underscore the significance of orthography in encoding and retrieval.
278.
Detecting Semantic Method Clones in Java Code Using Method IOE-Behavior
Elva, Rochelle. 01 January 2013.
The determination of semantic equivalence is an undecidable problem; however, this dissertation shows that a reasonable approximation can be obtained using a combination of static and dynamic analysis. This study investigates the detection of functional duplicates, referred to as semantic method clones (SMCs), in Java code. My algorithm extends the input-output notion of observable behavior, used in related work [1, 2], to include the effects of the method, i.e. the persistent changes to the heap brought about by the method's execution. To differentiate this from the typical input-output behavior used by other researchers, I have coined the term method IOE-Behavior, meaning input-output and effects behavior [3]. Two methods are semantic method clones if they have identical IOE-Behavior; that is, for the same inputs (actual parameters and initial heap state), they produce the same output (a result, for non-void methods, and a final heap state). The detection process consists of two static pre-filters used to identify candidate clone sets, followed by dynamic tests that actually run the candidate methods to determine semantic equivalence. The first filter groups the methods by type; the second refines the output of the first, grouping methods by their effects. The algorithm is implemented in my tool JSCTracker, which automates the SMC detection process. The algorithm and tool are validated in a case study comprising 12 open-source Java projects from different application domains, ranging in size from 2 KLOC (thousand lines of code) to 300 KLOC. The objectives of the case study are posed as four research questions: 1. Can method IOE-Behavior be used in SMC detection? 2. What is the impact of the pre-filters on the efficiency of the algorithm? 3. How does the performance of method IOE-Behavior compare to using only input-output for identifying SMCs? 4. How reliable are the results obtained when method IOE-Behavior is used in SMC detection? These questions are answered by checking each software sample with JSCTracker and analyzing the results. The number of SMCs detected ranges from 0 to 45, with an average execution time of 8.5 seconds. The two pre-filters reduce the number of methods that reach the dynamic test phase by an average of 34%. The IOE-Behavior approach takes an average of 0.010 seconds per method, while the input-output approach takes an average of 0.015 seconds. The former also yields an average of 32% false positives, while the SMCs identified using input-output alone average 92% false positives. In terms of reliability, the IOE-Behavior method produces results with average precision of 68% and average recall of 76%. These values represent an improvement of over 37% (for precision) and 30% (for recall) over the values in related work [4, 5]. I therefore conclude that IOE-Behavior can be used to detect SMCs in Java code with reasonable reliability.
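The pipeline's shape can be sketched in miniature, with Python standing in for Java, arity standing in for the full type filter, and the effects (heap) filter omitted; the candidate methods are invented:

```python
import inspect
from collections import defaultdict

def detect_semantic_clones(functions, probe_inputs):
    """Two-stage sketch: a static pre-filter groups candidates by 'type'
    (here just arity), then a dynamic test runs each group on shared
    probe inputs and reports groups with identical input-output behavior."""
    by_type = defaultdict(list)
    for f in functions:
        by_type[len(inspect.signature(f).parameters)].append(f)
    clones = []
    for group in by_type.values():
        by_behavior = defaultdict(list)
        for f in group:
            outputs = tuple(f(*args) for args in probe_inputs)
            by_behavior[outputs].append(f.__name__)
        clones.extend(names for names in by_behavior.values() if len(names) > 1)
    return clones

# Invented candidates: two semantic clones and one non-clone
def double_a(x): return x * 2
def double_b(x): return x + x
def square(x): return x * x
```

The pre-filter keeps `square` in the same candidate set, but the dynamic probe at x = 3 separates it; only the two doubling methods share identical observable behavior.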
279.
Semantic Methods for Intelligent Distributed Design Environments
Witherell, Paul W. 01 September 2009.
Continuous advancements in technology have led to increasingly comprehensive and distributed product development processes in pursuit of improved products at reduced costs. Information associated with these products is ever changing, and structured frameworks have become integral to managing such fluid information. Ontologies and the Semantic Web have emerged as key alternatives for capturing product knowledge in both a human-readable and computable manner. The primary focus of this research is to characterize the relationships formed within methodically developed distributed design knowledge frameworks, ultimately providing pervasive real-time awareness in distributed design processes. Utilizing formal logics in the form of the Semantic Web’s OWL and SWRL, causal relationships are expressed to guide and facilitate knowledge acquisition and to identify contradictions within a knowledge base. To improve efficiency during both the development and operational phases of these “intelligent” frameworks, a semantic relatedness algorithm is designed specifically to identify and rank underlying relationships within product development processes. Several semantic relatedness measures are reviewed, and three techniques, including a novel meronomic technique, are combined to create AIERO, the Algorithm for Identifying Engineering Relationships in Ontologies. To determine its applicability and accuracy, AIERO was applied to three separate, independently developed ontologies. The results indicate that AIERO consistently returns relatedness values one would intuitively expect. To assess AIERO’s effectiveness in exposing underlying causal relationships across product development platforms, a case study involving the development of an industry-inspired printed circuit board (PCB) is presented.
After instantiating the PCB knowledge base and developing an initial set of rules, FIDOE, the Framework for Intelligent Distributed Ontologies in Engineering, was employed to identify additional causal relationships through extensional relatedness measurements. In a concluding PCB redesign, the resulting “intelligent” framework demonstrates its ability to pass values between instances, identify inconsistencies among instantiated knowledge, and flag conflicting values within product development frameworks. The results highlight how the introduced semantic methods can enhance the knowledge acquisition, knowledge management, and knowledge validation capabilities of traditional knowledge bases.
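AIERO's overall shape, blending several relatedness measures into one ranked score, might be sketched like this (two toy measures and invented weights stand in for the three techniques actually combined; the ontology fragment is invented):

```python
from collections import deque

def path_relatedness(a, b, edges):
    """Inverse shortest-path distance over ontology relations (BFS)."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return 1 / (1 + dist)
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return 0.0

def label_relatedness(a, b):
    """Lexical measure: Jaccard overlap of tokens in the entity labels."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)

def blended_relatedness(a, b, edges, weights=(0.6, 0.4)):
    """Weighted blend of the individual measures (weights are invented)."""
    w_path, w_label = weights
    return (w_path * path_relatedness(a, b, edges)
            + w_label * label_relatedness(a, b))

# Tiny PCB-flavored ontology fragment (invented)
edges = {"circuit_board": ["copper_layer"], "copper_layer": ["circuit_board"]}
```

Ranking entity pairs by such a blended score is the step AIERO uses to surface candidate engineering relationships for a designer to confirm.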
280.
Choosing Among Related Foils in Aphasia: The Role of Common and Distinctive Semantic Features
Mason-Baughman, Mary Beth. 30 April 2009.
No description available.