Global ETD Search

11	Incremental Aspect Model Learning on Streaming Documents Wu, Cheng-Wei 16 August 2010 (has links) Owing to the development of Internet, excessive online data drive users to apply tools to assist them in obtaining desired and useful information. Information retrieval techniques serve as one of the major assistance tools that ease users' information processing loads. However, most current IR models do not consider processing streaming information which essentially characterizes today's Web environment. The approach to re-building models based on the full knowledge of data at hand triggered by the new incoming information every time is impractical, inefficient, and costly. Instead, IR models that can be adapted to streaming information incrementally should be considered under the dynamic environment. Therefore, this research is to propose an IR related technique, the incremental aspect model (ISM), which not only uncovers latent aspects from the collected documents but also adapts the aspect model on streaming documents chronologically. There are two stages in ISM: in Stage I, we employ probabilistic latent semantic indexing (PLSI) technique to build a primary aspect model; and in Stage II, with out-of-date data removing and new data folding-in, the aspect model can be expanded using the derived spectral method if new aspects significantly exist. Three experiments are conducted accordingly to verify ISM. Results from the first two experiments show the robust performance of ISM in incremental text clustering tasks. In Experiment III, ISM performs the task of storylines tracking on the 2010 Soccer World Cup event. It illustrates ISM's incremental learning ability to discover different themes around the event at any time. The feasibility of our proposed approach in real applications is thus justified. Incremental Clustering Probabilistic Latent Semantic Indexing Aspect Model
12	Concept Extraction With Change Detection From Navigated Information Lin, Tzu-hsiang 07 July 2005 (has links) To manage the information flood in the Internet, we usually navigate specific information using the provided search engines. Search engines are convenient but with limited functions. For example, it is impractical and impossible to browse through the entire collected information for us to gain an overall picture about what the navigated information stands for. To do so, we need an appropriate approach to automatically extracting concepts from the navigated information to assist users to easily and quickly gain the primary understanding toward a topic that interests users. In this research, we propose an approach to extracting concepts from the navigated web information and detecting the concept changes over time. It basically includes two stages. In the first stage, information is decomposed into paragraphs and they are clustered with key terms identified through the aid of latent semantic indexing method. Concepts are represented in the form of paragraph summary and associated key terms, which allows the user to easily comprehend what they describe. The second stage is to adaptively modify the concept structure to detect concept changes. With new information added, the concepts could be merging, splitting, or even emerging with time. Three experiments are conducted in this research to verify the proposed approach. Results of the first and second experiments show both high recall and high precision that matches the predefined concept categories. The last one is an illustrated real case application on the tsunami event. It shows that we can easily grasp different concepts of the tsunami reports and realize their changes by using our approach. The feasibility of employing our approach is thus justified. Internet Concepts Extraction Latent Semantic Indexing Concept Change Detection
13	Latent semantic sentence clustering for multi-document summarization Geiss, Johanna January 2011 (has links) No description available. 004
14	Using Information Retrieval to Improve Integration Testing Alazzam, Iyad January 2012 (has links) Software testing is an important factor of the software development process. Integration testing is an important and expensive level of the software testing process. Unfortunately, developers have limited time to perform integration testing and debugging, and integration testing becomes very hard as the combinations grow in size and the chains of calls from one module to another grow in number, length, and complexity. This research provides a new methodology for integration testing that significantly reduces the number of test cases needed while retaining as much of its effectiveness as possible. The proposed approach shows the best order in which to integrate the classes currently available for integration, and the external method calls that should be tested and their order, for maximum effectiveness. Our approach limits the number of integration test cases. The number of integration test cases depends mainly on the dependency among modules and on the number of integrated classes in the application. The dependency among modules is determined by using an information retrieval technique called Latent Semantic Indexing (LSI). In addition, this research extends mutation testing for use in integration testing as a method to evaluate the effectiveness of the integration testing process. We have developed a set of integration mutation operators to support development of integration mutation testing. We have conducted experiments based on ten Java applications. To evaluate the proposed methodology, we have created mutants using new mutation operators that exercise the integration testing. Our experiments show that the test cases killed more than 60% of the created mutants. Information retrieval. Latent semantic indexing. Mutation testing of computer programs.
15	Enhancing User Search Experience in Digital Libraries with Rotated Latent Semantic Indexing Polyakov, Serhiy 08 1900 (has links) This study investigates a semi-automatic method for creation of topical labels representing the topical concepts in information objects. The method is called rotated latent semantic indexing (rLSI). rLSI has found application in text mining but has not been used for topical labels generation in digital libraries (DLs). The present study proposes a theoretical model and an evaluation framework which are based on the LSA theory of meaning and investigates rLSI in a DL environment. The proposed evaluation framework for rLSI topical labels is focused on human-information search behavior and satisfaction measures. The experimental systems that utilize those topical labels were built for the purposes of evaluating user satisfaction with the search process. A new instrument was developed for this study and the experiment showed high reliability of the measurement scales and confirmed the construct validity. Data was collected through the information search tasks performed by 122 participants using two experimental systems. A quantitative method of analysis, partial least squares structural equation modeling (PLS-SEM), was used to test a set of research hypotheses and to answer research questions. The results showed a not significant, indirect effect of topical label type on both guidance and satisfaction. The conclusion of the study is that topical labels generated using rLSI provide the same levels of alignment, guidance, and satisfaction with the search process as topical labels created by the professional indexers using best practices. Digital libraries latent semantic analysis user search experience Information behavior. Latent semantic indexing. Digital libraries.
16	Recomendação de conteúdos : aplicação de agrupamento distribuído a conteúdos de TV [Content recommendation: applying distributed clustering to TV content] Rodrigues, Alexandre José Monteiro January 2010 (has links) Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 2010 Television programmes Films Clustering technique MinHash IPTV (Internet Protocol Television) MEO
17	Clustering Multilingual Documents: A Latent Semantic Indexing Based Approach Lin, Chia-min 09 February 2006 (has links) Document clustering automatically organizes a document collection into distinct groups of similar documents on the basis of their contents. Most of existing document clustering techniques deal with monolingual documents (i.e., documents written in one language). However, with the trend of globalization and advances in Internet technology, an organization or individual often generates/acquires and subsequently archives documents in different languages, thus creating the need for multilingual document clustering (MLDC). Motivated by its significance and need, this study designs a Latent Semantic Indexing (LSI) based MLDC technique. Our empirical evaluation results show that the proposed LSI-based multilingual document clustering technique achieves satisfactory clustering effectiveness, measured by both cluster recall and cluster precision. Document clustering Latent semantic analysis Latent semantic indexing Text mining Document management Multilingual document clustering
18	An Ontology-based Retrieval System Using Semantic Indexing Kara, Soner 01 July 2010 (has links) (PDF) In this thesis, we present an ontology-based information extraction and retrieval system and its application to soccer domain. In general, we deal with three issues in semantic search, namely, usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inference and rules. Scalability is achieved by adapting a semantic indexing approach. The system is implemented using the state-of-the-art technologies in SemanticWeb and its performance is evaluated against traditional systems as well as the query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inference. Finally, we show how we use semantic indexing to solve simple structural ambiguities. QA Computer Software 76.75-76.765
19	A machine-aided approach to intelligent index generation using natural language processing and latent semantic analysis to determine the contexts and relationships among words in a corpus / Lukon, Shelly Candita. January 2006 (has links) Thesis (M.S.)--Duquesne University, 2006. / Title from document title page. Abstract included in electronic submission form. Includes bibliographical references (p. 38-40) and index.
20	Log analysis aided by latent semantic mapping Buys, Stephanus 14 April 2013 (has links) In an age of zero-day exploits and increased on-line attacks on computing infrastructure, operational security practitioners are becoming increasingly aware of the value of the information captured in log events. Analysis of these events is critical during incident response, forensic investigations related to network breaches, hacking attacks and data leaks. Such analysis has led to the discipline of Security Event Analysis, also known as Log Analysis. There are several challenges when dealing with events, foremost being the increased volumes at which events are often generated and stored. Furthermore, events are often captured as unstructured data, with very little consistency in the formats or contents of the events. In this environment, security analysts and implementers of Log Management (LM) or Security Information and Event Management (SIEM) systems face the daunting task of identifying, classifying and disambiguating massive volumes of events in order for security analysis and automation to proceed. Latent Semantic Mapping (LSM) is a proven paradigm shown to be an effective method of, among other things, enabling word clustering, document clustering, topic clustering and semantic inference. This research is an investigation into the practical application of LSM in the discipline of Security Event Analysis, showing the value of using LSM to assist practitioners in identifying types of events, classifying events as belonging to certain sources or technologies and disambiguating different events from each other. The culmination of this research presents adaptations to traditional natural language processing techniques that resulted in improved efficacy of LSM when dealing with Security Event Analysis. 
This research provides strong evidence supporting the wider adoption and use of LSM, as well as further investigation into Security Event Analysis assisted by LSM and other natural language or computer-learning processing techniques. Latent semantic indexing Data mining Computer networks -- Security measures Computer hackers Computer security
