271

Module extraction for inexpressive description logics

Nortje, Riku 08 1900 (has links)
Module extraction is an important reasoning task, aiding in the design, reuse and maintenance of ontologies. Reasoning services such as subsumption testing and MinA extraction have been shown to benefit from module extraction methods. Though various syntactic traversal-based module extraction algorithms exist, many consider only the subsumee of a subsumption statement as a selection criterion for reducing the axioms in the module. In this dissertation we extend the bottom-up reachability-based module extraction heuristic for the inexpressive Description Logic EL by introducing a top-down version of the heuristic, which utilises the subsumer of a subsumption statement as a selection criterion to minimize the number of axioms in a module. We then introduce a combined bidirectional heuristic which uses both operands of a subsumption statement in order to extract very small modules, and investigate the relationship between MinA extraction and bidirectional reachability-based module extraction. We provide empirical evidence that bidirectional reachability-based module extraction for subsumption entailments in EL yields a significant reduction in module size at almost no additional cost in the running time of the original algorithms. / Computer Science / M. Sc. (Computer Science)
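The three heuristics can be pictured with a small sketch. Below is a minimal, illustrative Python rendering in which each EL axiom is reduced to a pair of signature sets; the exact reachability conditions and all names are assumptions made for exposition, not the dissertation's implementation.

```python
# Illustrative sketch of reachability-based module extraction.
# Each axiom is modelled as (lhs_signature, rhs_signature), two sets
# of concept/role names. This is an assumed simplification of EL.

def bottom_up_module(axioms, subsumee_sig):
    """Collect axioms reachable forwards from the subsumee's signature."""
    reachable, module = set(subsumee_sig), set()
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs) in enumerate(axioms):
            # An axiom fires once every symbol on its left-hand side
            # is reachable; its right-hand side then becomes reachable.
            if i not in module and lhs <= reachable:
                module.add(i)
                reachable |= rhs
                changed = True
    return module

def top_down_module(axioms, subsumer_sig):
    """Collect axioms reachable backwards from the subsumer's signature."""
    reachable, module = set(subsumer_sig), set()
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs) in enumerate(axioms):
            # Backwards: an axiom is relevant if its right-hand side
            # touches a symbol we are still trying to derive.
            if i not in module and rhs & reachable:
                module.add(i)
                reachable |= lhs
                changed = True
    return module

def bidirectional_module(axioms, subsumee_sig, subsumer_sig):
    # Keep only axioms reachable from both directions, which is what
    # makes the bidirectional modules so small.
    return bottom_up_module(axioms, subsumee_sig) & top_down_module(axioms, subsumer_sig)
```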
272

Enhanced root extraction and document classification algorithm for Arabic text

Alsaad, Amal January 2016 (has links)
Many text extraction and classification systems have been developed for English and other international languages, most of which are based on Roman letters. Arabic, however, is a difficult language with special rules and a morphology more complicated than English, being one of the Semitic languages, and few systems have been developed for Arabic text categorization. Due to its complex morphology, pre-processing routines are needed to extract the roots of words before classifying them according to groups of actions or meanings. In this thesis, a system has been developed and tested for text classification. The system is based on two stages: the first extracts the roots from text, and the second classifies the text according to predefined categories. The linguistic root extraction stage is composed of two main phases. The first phase handles the removal of affixes, including prefixes, suffixes and infixes; prefixes and suffixes are removed depending on the length of the word, while its morphological pattern is checked after each deduction to remove infixes. In the second phase, the root extraction algorithm is formulated to handle weak, defined, eliminated-long-vowel and two-letter geminated words, as there is a substantial amount of irregular Arabic words in texts. Once the roots are extracted, they are checked against a predefined list of 3800 triliteral and 900 quadriliteral roots. A series of experiments was conducted to improve and test the performance of the proposed algorithm, and the results revealed that the developed algorithm is more accurate than the existing stemming algorithm. The second stage is document classification, in which two non-parametric classifiers are tested, namely Artificial Neural Networks (ANN) and Support Vector Machine (SVM). The system is trained on 6 categories: culture, economy, international, local, religion and sports, using 80% of the available data; from each category, the 10 most frequent terms are selected as features. The classification algorithms were tested on the remaining 20% of the documents, and the results of ANN and SVM were compared to the standard term-frequency-based method for text classification. Results show that ANN and SVM have better accuracy (80-90%) compared to the standard method (60-70%). The proposed method proves able to categorize Arabic text documents into the appropriate categories with a high precision rate.
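To make the affix-removal stage concrete, here is a highly simplified, hypothetical Python sketch of length-aware prefix/suffix stripping followed by a root-list lookup. The affix lists and the set of valid roots are tiny placeholders standing in for the thesis's morphological patterns and its 3800 triliteral and 900 quadriliteral roots.

```python
# Hypothetical sketch of length-aware affix stripping with a root-list
# check. The affix and root data below are small illustrative samples,
# not the thesis's actual resources.

PREFIXES = ["وال", "بال", "ال"]   # sample Arabic prefixes
SUFFIXES = ["ات", "ون", "ها"]     # sample Arabic suffixes
VALID_ROOTS = {"كتب", "درس"}      # stand-in for the full root lists

def strip_affixes(word, min_len=3):
    # Strip the longest matching prefix/suffix, but only while the
    # remainder stays at least as long as a triliteral root.
    for p in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(p) and len(word) - len(p) >= min_len:
            word = word[len(p):]
            break
    for s in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(s) and len(word) - len(s) >= min_len:
            word = word[:-len(s)]
            break
    return word

def extract_root(word):
    candidate = strip_affixes(word)
    # Accept the candidate only if it matches a known root.
    return candidate if candidate in VALID_ROOTS else None

print(extract_root("الكتب"))  # -> "كتب"
```

A real implementation would also verify the morphological pattern after each deduction and handle weak, geminated and eliminated-long-vowel forms, which this sketch omits.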
273

Postgraduate student success rate with free-form information searching

Uwimana, Iris January 2017 (has links)
Thesis (MTech (Business Information Systems))--Cape Peninsula University of Technology, 2017. / The Internet has become a useful instrument for connecting users regardless of their geographical locations, and has thus made the world a small village where users can interact and search for information. Its popularity also rests on its role as a global resource: millions of users surf the Web daily, searching for and sharing information. A successful search for information depends on the user's ability to search effectively, an ability shaped by computer competency, knowledge of Information Technology (IT), perceptions of IT usage, and the user's demographics. These user characteristics tend to influence the overall user experience. Although different groups of users turn to the Internet with different information-search objectives, not all of them achieve these objectives. The main aim of this study was to determine the success rate of postgraduate students using free-form information searching to find academic reference materials.
274

Applying a semantic layer in a source code retrieval tool

Durão, Frederico Araujo 31 January 2008 (has links)
Software reuse is a software engineering research area that aims to improve application productivity and quality by reducing effort: existing artefacts are reused, rather than built from scratch, to create new applications. To obtain the benefits of reuse, however, some obstacles must be overcome, such as the search for and retrieval of components. In general, there is a gap between the formulation of the problem in the developer's mind and its retrieval from the repository, which produces irrelevant results and reduces the chances of reuse. Mechanisms that assist query formulation and bring retrieval closer to the developer's need are therefore well suited to solving these problems. In this context, this work proposes extending a keyword-based search tool with a semantic layer whose main objective is to increase search precision and, consequently, the chances of reusing the retrieved component. The semantic layer consists essentially of two main components: one that assists the user in formulating the query, through the use of a domain ontology, and another that makes retrieval more efficient, through semantic indexing of the components in the repository.
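As an illustration of the query-formulation component, the following sketch shows how a semantic layer could expand a keyword query with related concepts from a domain ontology before delegating to the underlying keyword engine. The ontology contents and function names are illustrative assumptions, not the tool described in the dissertation.

```python
# Minimal sketch of ontology-based query expansion in front of a
# keyword search engine. The ontology entries are made-up examples.

DOMAIN_ONTOLOGY = {
    # concept -> directly related concepts (synonyms, sub/superclasses)
    "sort": ["order", "quicksort", "mergesort"],
    "parse": ["tokenize", "lexer", "grammar"],
}

def expand_query(terms, max_expansions=3):
    expanded = list(terms)
    for t in terms:
        expanded.extend(DOMAIN_ONTOLOGY.get(t, [])[:max_expansions])
    return expanded

def semantic_search(engine, query):
    # 'engine' is any keyword-based search callable; the semantic layer
    # sits in front of it and never touches its internals.
    return engine(expand_query(query.split()))

# Usage: semantic_search(my_keyword_engine, "sort list")
# would query for ["sort", "list", "order", "quicksort", "mergesort"].
```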
275

Compression models and complexity criteria for the description and the inference of music structure

Guichaoua, Corentin 19 September 2017 (has links)
A very broad definition of music structure is to consider everything that distinguishes music from random noise as part of its structure. In this thesis, we focus on the macroscopic aspects of music structure, especially the decomposition of musical pieces into autonomous segments (typically, sections) and their characterisation as groupings of jointly compressible elementary units. An important assumption of this work is that a link can be established between the inference of music structure and information-theoretic concepts such as complexity and entropy. We thus build upon the hypothesis that structural segments can be inferred through compression schemes.
In the first part of this work, we study Straight-Line Grammars (SLGs), a family of formal grammars originally used for structure discovery in biological sequences (Gallé, 2011), and we explore their use for modelling musical sequences. The SLG approach compresses sequences according to their occurrence statistics, modelling their hierarchical organisation as a tree. We develop several adaptations of this method to handle approximate repetitions, and study several regularisation criteria aimed at improving the solutions obtained. The second part of the thesis develops and explores a novel approach to music structure inference based on the optimisation of a tensorial compression criterion. This approach aims to compress the musical information on several time scales simultaneously by exploiting the similarity relations, logical progressions and analogy systems embedded in musical segments. The proposed method is first introduced from a formal point of view, then presented as a compression scheme rooted in a multi-scale extension of the System & Contrast model (Bimbot et al., 2012) to hypercubic tensorial patterns. We further generalise the approach to other, irregular tensorial patterns in order to account for the great variety of structural organisations observed in musical segments. The methods studied in this thesis are evaluated on a structural segmentation task over symbolic data: chord sequences from pop music (RWC-Pop). They are compared on several sets of chord sequences, and the results establish the attractiveness of complexity-criterion approaches for structure analysis and music information retrieval, with the best variants reaching F-measures of around 70%. We conclude by recapitulating the main contributions and discussing possible extensions of the studied paradigms, including their application to other musical dimensions, the inclusion of musicological knowledge, and their possible use on audio data.
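As an illustration of the SLG idea, the sketch below applies a Re-Pair-style grammar compressor to a chord sequence: the most frequent adjacent pair of symbols is repeatedly replaced by a fresh nonterminal, yielding a tree-shaped description of the repetitions. This is only the textbook scheme for exact repetitions; the thesis's adaptations for approximate repetitions and its tensorial criteria are not reflected here.

```python
# Re-Pair-style straight-line grammar compression of a symbol sequence.
from collections import Counter

def repair(sequence):
    rules = {}          # nonterminal -> the pair of symbols it rewrites
    seq = list(sequence)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break       # no pair repeats; the grammar is final
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair
        # Replace non-overlapping occurrences left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

# Example: a I-IV-V-I progression repeated twice compresses into a
# short start sequence plus hierarchical rules.
start, rules = repair(["C", "F", "G", "C", "C", "F", "G", "C"])
```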
276

A user-centered design of a web-based interface to bibliographic databases

Ahmed, S. M. Zabed January 2002 (has links)
This thesis reports the results of a research study into the usefulness of a user-centred approach for designing information retrieval interfaces. The main objective of the research was to examine the usability of an existing Web-based IR system in order to design a user-centred prototype Web interface. A series of usability experiments was carried out with the Web of Science. The first experiment used both novice and experienced users to compare their performance and satisfaction with the interface; a set of search tasks was obtained from a user survey and used in the study. The results showed no significant differences between the two groups in the time taken to complete the tasks or in the number of different search terms used. Novice users were significantly more satisfied with the interface than the experienced group; however, the experienced group was significantly more successful and made fewer errors than the novice users. The second experiment examined novices' learning and retention with the Web of Science using the same equipment, tasks and environment. The results of the learning phase showed that novices could readily pick up interface functionality when brief training was provided; however, their retention of search skills weakened over time, and their subjective satisfaction with the interface also diminished from learning to retention. These findings suggested that the fundamental difficulties of searching IR systems remain in the Web-based version. A heuristic evaluation, in which three human factors experts evaluated the interface, was carried out to identify usability problems in the Web of Science interface. The heuristic evaluation was very helpful in identifying interface design issues for Web IR systems, the most fundamental being the need to increase the match between the system and the real world. The results of both the usability testing and the heuristic evaluation served as a baseline for designing a prototype Web interface, which was based on a conceptual model of users' information seeking. Various evaluation methods were used to test the usability of the prototype system, and after each round of testing the interface was modified in accordance with the findings. A summative evaluation of the prototype interface showed that both novice and experienced users improved their search performance. Comparative analysis with the earlier usability studies also showed significant improvements in performance and satisfaction with the prototype. These results show that user-centred methods can yield better interface design for IR systems.
277

Trust-aware information retrieval in peer-to-peer environments

Zhang, Ye January 2011 (has links)
Information Retrieval in P2P environments (P2PIR) has become an active field of research due to the observation that P2P architectures have the potential to become as appealing as traditional centralised architectures. P2P networks are formed by voluntary peers that exchange information and accomplish various tasks; some of them may be malicious peers spreading untrustworthy resources. However, existing P2PIR systems focus only on finding relevant documents, while the trustworthiness of documents and document providers has been ignored. Without prior experience and knowledge of the network, users run the risk of reviewing, downloading and using untrustworthy documents, even if those documents are relevant. The work presented in this dissertation provides the first integrated framework for trust-aware Information Retrieval in P2P environments, which can retrieve not only relevant but also trustworthy documents. The proposed content trust models extend an existing P2P trust management system, PeerTrust, to the P2PIR context in order to compute the trust values of documents and document providers for given queries. A method is proposed to estimate global term statistics, which are integrated with existing relevance-based approaches for document ranking and peer selection. Different approaches are explored to find optimal parameter settings in the proposed trust-aware P2PIR systems. Moreover, system architectures and data management protocols are designed to implement the proposed trust-aware P2PIR systems in structured P2P networks. The experimental evaluation demonstrates that P2PIR can benefit significantly from trust-aware systems: the likelihood of untrustworthy documents appearing in the top-ranked result list is markedly reduced, and the proposed estimated global term statistics provide acceptable and competitive retrieval accuracy across different P2PIR scenarios.
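The final ranking step can be sketched as a weighted combination of relevance and provider trust. The linear combination and the weight alpha below are assumptions for illustration; the dissertation's content trust models extend PeerTrust with query-specific trust values rather than a fixed linear mix.

```python
# Illustrative sketch of trust-aware ranking: blend each document's
# relevance score with its provider's trust score.

def trust_aware_rank(results, trust, alpha=0.7):
    """results: list of (doc_id, provider_id, relevance in [0, 1]);
    trust: provider_id -> trust value in [0, 1];
    alpha: assumed weight on relevance versus trust."""
    scored = [
        (alpha * rel + (1 - alpha) * trust.get(provider, 0.0), doc)
        for doc, provider, rel in results
    ]
    # Highest combined score first; untrusted providers sink even when
    # their documents are relevant.
    return [doc for _, doc in sorted(scored, reverse=True)]
```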
278

Relevance judgements in information retrieval

Cosijn, Erica 19 September 2005 (has links)
Please read the abstract in the section 00front of this document / Thesis (DPhil (Information Science))--University of Pretoria, 2005. / Information Science / unrestricted
279

Conditional random fields based method for feature-level opinion mining and results visualization

Qi, Luole 01 January 2012 (has links)
No description available.
280

Parallel Analysis of Aspect-Based Sentiment Summarization from Online Big-Data

Wei, Jinliang 05 1900 (has links)
Consumers' opinions and sentiments about products can reflect product performance in general or in particular aspects. Analyzing these data has become feasible given the availability of immense amounts of data and the power of natural language processing; however, retailers have not taken full advantage of online comments. This work is dedicated to a solution for automatically analyzing and summarizing these valuable data at both the product and category levels. In this research, a system was developed to retrieve and analyze extensive data from public online resources. A parallel framework was created to make this system extensible and efficient. In this framework, a star topology was adopted in which each computing unit was assigned to retrieve a fraction of the data and to assess sentiment. Finally, the preprocessed data were collected and summarized by the central machine, which generates the final result rendered through a web interface. The system was designed for sound performance, robustness, manageability, extensibility, and accuracy.
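The star topology can be sketched with a worker pool: each worker scores its fraction of the comments, and the central process aggregates the partial results. The keyword-based scorer below is a placeholder assumption standing in for the system's actual sentiment analysis.

```python
# Illustrative sketch of the star topology: workers score disjoint
# fractions of the comments; the central process sums their results.

from multiprocessing import Pool

def score_chunk(comments):
    # Placeholder polarity scorer: +1/-1 per matched keyword, a
    # stand-in for the real sentiment model.
    pos, neg = {"good", "great"}, {"bad", "poor"}
    return sum((w in pos) - (w in neg) for c in comments for w in c.split())

def summarize(comments, workers=4):
    chunks = [comments[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partial_scores = pool.map(score_chunk, chunks)  # fan out to workers
    return sum(partial_scores)                          # central aggregation

if __name__ == "__main__":
    print(summarize(["good product", "bad battery", "great screen"]))
```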
