341 |
Towards efficient distributed search in a peer-to-peer network.
January 2007
Cheng Chun Kong. / Thesis submitted in: November 2006. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (leaves 62-64). / Abstracts in English and Chinese. / Abstract --- p.1 / 槪要 [Abstract in Chinese] --- p.2 / Acknowledgement --- p.3 / Chapter 1. --- Introduction --- p.5 / Chapter 2. --- Literature Review --- p.10 / Chapter 3. --- Design / Chapter A. --- Overview --- p.22 / Chapter B. --- Basic idea --- p.23 / Chapter C. --- Follow-up design --- p.30 / Chapter D. --- Summary --- p.40 / Chapter 4. --- Experimental Findings / Chapter A. --- Goal --- p.41 / Chapter B. --- Analysis Methodology --- p.41 / Chapter C. --- Validation --- p.47 / Chapter D. --- Results --- p.47 / Chapter 5. --- Deployment / Chapter A. --- Limitations --- p.58 / Chapter B. --- Miscellaneous Design Issues --- p.59 / Chapter 6. --- Future Directions and Conclusions --- p.61 / Reference --- p.62 / Appendix --- p.65
|
342 |
Automated analysis and transcription of rhythm data and their use for composition
Boenn, Georg January 2011
No description available.
|
343 |
Why did they cite that?
Lovering, Charles 26 April 2018
We explore a machine learning task, evidence recommendation (ER), the extraction of evidence from a source document to support an external claim. This task is an instance of the question answering machine learning task. We apply ER to academic publications because they cite other papers for the claims they make. Reading cited papers to corroborate claims is time-consuming and an automated ER tool could expedite it. Thus, we propose a methodology for collecting a dataset of academic papers and their references. We explore deep learning models for ER and achieve 77% accuracy with pairwise models and 75% pairwise accuracy with document-wise models.
|
344 |
Query processing in a distributed environment
Chao, Han Ying January 2010
Typescript (photocopy). / Digitized by Kansas Correctional Industries
|
345 |
Explaining listener differences in the perception of musical structure
Smith, Jordan January 2014
State-of-the-art models for the perception of grouping structure in music do not attempt to account for disagreements among listeners. But understanding these disagreements, sometimes regarded as noise in psychological studies, may be essential to fully understanding how listeners perceive grouping structure. Over the course of four studies in different disciplines, this thesis develops and presents evidence to support the hypothesis that attention is a key factor in accounting for listeners' perceptions of boundaries and groupings, and hence a key to explaining their disagreements. First, we conduct a case study of the disagreements between two listeners. By studying the justifications each listener gave for their analyses, we argue that the disagreements arose directly from differences in attention, and indirectly from differences in information, expectation, and ontological commitments made in the opening moments. Second, in a large-scale corpus study, we study the extent to which acoustic novelty can account for the boundary perceptions of listeners. The results indicate that novelty is correlated with boundary salience, but that novelty is a necessary but not sufficient condition for being perceived as a boundary. Third, we develop an algorithm that optimally reconstructs a listener's analysis in terms of the patterns of similarity within a piece of music. We demonstrate how the output can identify good justifications for an analysis and account for disagreements between two analyses. Finally, having introduced and developed the hypothesis that disagreements between listeners may be attributable to differences in attention, we test the hypothesis in a sequence of experiments. We find that by manipulating the attention of participants, we are able to influence the groupings and boundaries they find most salient.
From the sum of this research, we conclude that a listener's attention is a crucial factor affecting how listeners perceive the grouping structure of music.
|
346 |
Passage Retrieval: en litteraturstudie av ett forskningsområde inom information retrieval / Passage Retrieval: a study of a research topic in information retrieval
Åkesson, Mattias January 2000
The aim of this thesis is to describe passage retrieval (PR), based on results from various empirical experiments, and to critically examine different approaches to PR. The main questions addressed are: (1) What characterizes PR? (2) What approaches have been proposed? (3) How well do the approaches work in experimental information retrieval (IR)? PR is a research topic in information retrieval that, instead of retrieving the full text of documents, which can lead to information overload for the user, tries to retrieve the most relevant passages in the documents. The technique was investigated by studying a number of central articles in the research field. PR can be divided into three types of approaches based on how the documents are segmented. First, the text can be divided according to its semantics, at the points where the topic changes. Second, the text can be divided according to the explicit structure of the documents, with the help of, e.g., a markup language such as SGML. Third, the text can be divided into parts containing a fixed number of words; this method is called unmotivated segmentation. The study showed that unmotivated segmentation resulted in the best retrieval effectiveness, although the results are difficult to compare because of differing evaluation methods and test collections. A combination of full-text retrieval and PR also showed improved results. / Thesis level: D
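The third approach in the abstract above, dividing the text into parts with a fixed number of words, can be sketched roughly as follows. This is only an illustration and is not taken from the thesis: the function names, the default window size, the optional overlap, and the simple term-count scoring are all assumptions made for the example.

```python
def fixed_window_passages(text, window=50, overlap=0):
    """Split text into passages of `window` words (hypothetical helper).

    `overlap` is the number of words shared by consecutive passages;
    both parameters are illustrative choices, not values from the thesis.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    words = text.split()
    step = window - overlap
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # the last window already reaches the end of the text
    return passages


def term_overlap_score(passage, query):
    """Count passage words that also occur in the query (a toy score)."""
    query_terms = set(query.lower().split())
    return sum(1 for word in passage.lower().split() if word in query_terms)


def best_passage(text, query, window=50, overlap=0):
    """Return the highest-scoring fixed-size passage, or "" for empty text."""
    passages = fixed_window_passages(text, window, overlap)
    if not passages:
        return ""
    return max(passages, key=lambda p: term_overlap_score(p, query))
```

For instance, `best_passage(document_text, "query routing", window=100)` would return the 100-word window containing the most query-term occurrences. A real system would replace the toy score with a proper ranking function.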
|
347 |
A Nearest-Neighbor Approach to Indicative Web Summarization
Petinot, Yves January 2016
Through their role of content proxy, in particular on search engine result pages, Web summaries play an essential part in the discovery of information and services on the Web. In their simplest form, Web summaries are snippets based on a user-query and are obtained by extracting from the content of Web pages. The focus of this work, however, is on indicative Web summarization, that is, on the generation of summaries describing the purpose, topics and functionalities of Web pages. In many scenarios — e.g. navigational queries or content-deprived pages — such summaries represent a valuable commodity to concisely describe Web pages while circumventing the need to produce snippets from inherently noisy, dynamic, and structurally complex content. Previous approaches have identified linking pages as a privileged source of indicative content from which Web summaries may be derived using traditional extractive methods. To be reliable, these approaches require sufficient anchortext redundancy, ultimately showing the limits of extractive algorithms for what is, fundamentally, an abstractive task. In contrast, we explore the viability of abstractive approaches and propose a nearest-neighbors summarization framework leveraging summaries of conceptually related (neighboring) Web pages. We examine the steps that can lead to the reuse and adaptation of existing summaries to previously unseen pages. Specifically, we evaluate two Text-to-Text transformations that cover the main types of operations applicable to neighbor summaries: (1) ranking, to identify neighbor summaries that best fit the target; (2) target adaptation, to adjust individual neighbor summaries to the target page based on neighborhood-specific template-slot models. For this last transformation, we report on an initial exploration of the use of slot-driven compression to adjust adapted summaries based on the confidence associated with token-level adaptation operations. 
Overall, this dissertation explores a new research avenue for indicative Web summarization and shows the potential value, given the diversity and complexity of the content of Web pages, of transferring, and, when necessary, of adapting, existing summary information between conceptually similar Web pages.
|
348 |
Essays on information acquisition
Zhong, Weijie January 2019
This dissertation studies information acquisition when the choice of information is fully flexible. Throughout the dissertation, I consider a theoretical framework where a decision maker (DM) acquires costly information (signal process) about the payoffs of different alternatives before making a choice. In Chapter 1, I solve a general model where the DM pays a cost that depends on the rate of uncertainty reduction and discounts delayed payoffs. The main finding is that the optimal signal process resembles a Poisson signal: the signal arrives occasionally according to a Poisson process, and it drives the inferred posterior belief to jump discretely. The optimal signal is chosen to confirm the DM's prior belief of the most promising state. Once seeing the signal, the decision maker is discretely surer about the state and stops learning immediately. When the signal is otherwise absent, the decision maker becomes gradually less sure about the state, and continues learning by seeking more precise but less frequently arriving signals. In Chapter 2, I study the sequential implementation of a target information structure. I characterize the set of decision time distributions induced by all signal processes that satisfy a per-period learning capacity constraint on the rate of uncertainty reduction. I find that all decision time distributions have the same mean, and the maximal and minimal elements by mean-preserving spread order are the exponential distribution and the deterministic distribution. The result implies that when the time preference is risk-loving (e.g. standard or hyperbolic discounting), a Poisson signal is optimal since it induces the riskiest exponential decision time distribution. When the time preference is risk-neutral (e.g. constant delay cost), all signal processes are equally optimal. In Chapter 3, I relax the assumption on information cost by assuming that the measure of signal informativeness is an indirect measure from sequential minimization.
I first show that an indirect information measure is supported by sequential minimization if and only if it satisfies: 1) monotonicity in Blackwell order, 2) sub-additivity in compound experiments and 3) linearity in mixing with no information. Then I study a dynamic information acquisition problem where the cost of information depends on an indirect information measure and the delay cost is fixed (the DM is time-risk neutral). The optimal strategy is to acquire Poisson-type signals. The result implies that when the cost of information is measured by an indirect measure, Poisson signals are intrinsically cheaper than other signal processes. Chapter 4 introduces a set of useful technical results on constrained information design that are used to derive the main results in the first three chapters.
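The Chapter 2 statement that a risk-loving time preference favors the exponential decision time can be checked with a small calculation. The sketch below is an illustration rather than anything from the dissertation: it assumes standard exponential discounting exp(-r t) at a hypothetical rate r, and compares the expected discount factor of an exponentially distributed decision time with that of a deterministic decision time of the same mean. The function names are invented for this example.

```python
import math


def discount_deterministic(rate, mean_time):
    """Discount factor exp(-r * m) when the decision time is exactly m."""
    return math.exp(-rate * mean_time)


def discount_exponential(rate, mean_time):
    """Expected discount factor E[exp(-r * T)] for T exponential with mean m.

    With arrival rate 1/m this is (1/m) / (1/m + r) = 1 / (1 + r * m).
    """
    return 1.0 / (1.0 + rate * mean_time)


if __name__ == "__main__":
    r, m = 0.1, 5.0  # illustrative values only
    print("deterministic:", discount_deterministic(r, m))
    print("exponential:  ", discount_exponential(r, m))
```

Because exp(x) >= 1 + x for every x, the exponential case never yields a smaller expected discount factor than the deterministic case with the same mean, which matches the abstract's point that discounting makes the riskier decision time distribution more attractive.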
|
349 |
Multi-lingual text retrieval and mining.
January 2003
Law Yin Yee. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. / Includes bibliographical references (leaves 130-134). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cross-Lingual Information Retrieval (CLIR) --- p.2 / Chapter 1.2 --- Bilingual Term Association Mining --- p.5 / Chapter 1.3 --- Our Contributions --- p.6 / Chapter 1.3.1 --- CLIR --- p.6 / Chapter 1.3.2 --- Bilingual Term Association Mining --- p.7 / Chapter 1.4 --- Thesis Organization --- p.8 / Chapter 2 --- Related Work --- p.9 / Chapter 2.1 --- CLIR Techniques --- p.9 / Chapter 2.1.1 --- Existing Approaches --- p.9 / Chapter 2.1.2 --- Difference Between Our Model and Existing Approaches --- p.13 / Chapter 2.2 --- Bilingual Term Association Mining Techniques --- p.13 / Chapter 2.2.1 --- Existing Approaches --- p.13 / Chapter 2.2.2 --- Difference Between Our Model and Existing Approaches --- p.17 / Chapter 3 --- Cross-Lingual Information Retrieval (CLIR) --- p.18 / Chapter 3.1 --- Cross-Lingual Query Processing and Translation --- p.18 / Chapter 3.1.1 --- Query Context and Document Context Generation --- p.20 / Chapter 3.1.2 --- Context-Based Query Translation --- p.23 / Chapter 3.1.3 --- Query Term Weighting --- p.28 / Chapter 3.1.4 --- Final Weight Calculation --- p.30 / Chapter 3.2 --- Retrieval on Documents and Automated Summaries --- p.32 / Chapter 4 --- Experiments on Cross-Lingual Information Retrieval --- p.38 / Chapter 4.1 --- Experimental Setup --- p.38 / Chapter 4.2 --- Results of English-to-Chinese Retrieval --- p.45 / Chapter 4.2.1 --- Using Mono-Lingual Retrieval as the Gold Standard --- p.45 / Chapter 4.2.2 --- Using Human Relevance Judgments as the Gold Standard --- p.49 / Chapter 4.3 --- Results of Chinese-to-English Retrieval --- p.53 / Chapter 4.3.1 --- Using Mono-lingual Retrieval as the Gold Standard --- p.53 / Chapter 4.3.2 --- Using Human Relevance Judgments as the Gold Standard --- p.57 / Chapter 5 --- Discovering Comparable Multi-lingual Online News for Text Mining --- p.61 / Chapter 5.1 --- Story Representation --- p.62 / Chapter 5.2 --- Gloss Translation --- p.64 / Chapter 5.3 --- Comparable News Discovery --- p.67 / Chapter 6 --- Mining Bilingual Term Association Based on Co-occurrence --- p.75 / Chapter 6.1 --- Bilingual Term Cognate Generation --- p.75 / Chapter 6.2 --- Term Mining Algorithm --- p.77 / Chapter 7 --- Phonetic Matching --- p.87 / Chapter 7.1 --- Algorithm Design --- p.87 / Chapter 7.2 --- Discovering Associations of English Terms and Chinese Terms --- p.93 / Chapter 7.2.1 --- Converting English Terms into Phonetic Representation --- p.93 / Chapter 7.2.2 --- Discovering Associations of English Terms and Mandarin Chinese Terms --- p.100 / Chapter 7.2.3 --- Discovering Associations of English Terms and Cantonese Chinese Terms --- p.104 / Chapter 8 --- Experiments on Bilingual Term Association Mining --- p.111 / Chapter 8.1 --- Experimental Setup --- p.111 / Chapter 8.2 --- Result and Discussion of Bilingual Term Association Mining Based on Co-occurrence --- p.114 / Chapter 8.3 --- Result and Discussion of Phonetic Matching --- p.121 / Chapter 9 --- Conclusions and Future Work --- p.126 / Chapter 9.1 --- Conclusions --- p.126 / Chapter 9.1.1 --- CLIR --- p.126 / Chapter 9.1.2 --- Bilingual Term Association Mining --- p.127 / Chapter 9.2 --- Future Work --- p.128 / Bibliography --- p.134 / Chapter A --- Original English Queries --- p.135 / Chapter B --- Manual translated Chinese Queries --- p.137 / Chapter C --- Pronunciation symbols used by the PRONLEX Lexicon --- p.139 / Chapter D --- Initial Letter-to-Phoneme Tags --- p.141 / Chapter E --- English Sounds with their Chinese Equivalents --- p.143
|
350 |
Information retrieval and query routing in peer-to-peer networks. / Information retrieval & query routing in peer-to-peer networks
January 2005
Wong Wan Yeung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (leaves 118-122). / Abstracts in English and Chinese. / Chapter 1. --- Introduction --- p.1 / Chapter 1.1 --- Problem Definition --- p.1 / Chapter 1.2 --- Major Contributions --- p.5 / Chapter 1.2.1 --- S2S Searching --- p.6 / Chapter 1.2.2 --- GAroute --- p.8 / Chapter 1.3 --- Thesis Chapter Organization --- p.10 / Chapter 2. --- Related Work --- p.11 / Chapter 2.1 --- P2P Networks --- p.11 / Chapter 2.2 --- Query Routing Strategies --- p.20 / Chapter 2.3 --- P2P Network Security --- p.22 / Chapter 3. --- S2S Searching --- p.24 / Chapter 3.1 --- System Architecture --- p.24 / Chapter 3.1.1 --- Administration Module --- p.24 / Chapter 3.1.2 --- Search Module --- p.27 / Chapter 3.2 --- Indexing and Matching --- p.32 / Chapter 3.2.1 --- Background of Indexing and Matching --- p.32 / Chapter 3.2.2 --- Indexing Algorithm --- p.33 / Chapter 3.2.3 --- Matching Algorithm --- p.34 / Chapter 3.3 --- Query Routing --- p.36 / Chapter 3.3.1 --- Background of Query Routing --- p.36 / Chapter 3.3.2 --- Distributed Registrars and Content Summary --- p.38 / Chapter 3.3.3 --- Query Routing Algorithm --- p.41 / Chapter 3.3.4 --- Registrar Maintenance --- p.44 / Chapter 3.4 --- Communication Protocol --- p.45 / Chapter 3.4.1 --- Starting CGI --- p.46 / Chapter 3.4.2 --- Searching CGI --- p.47 / Chapter 3.4.3 --- Pinging CGI --- p.48 / Chapter 3.4.4 --- Joining CGI --- p.48 / Chapter 3.4.5 --- Leaving CGI --- p.48 / Chapter 3.4.6 --- Updating CGI --- p.49 / Chapter 3.5 --- Experiments and Discussions --- p.49 / Chapter 3.5.1 --- Performance of Indexing --- p.50 / Chapter 3.5.2 --- Performance of Matching --- p.52 / Chapter 3.5.3 --- Performance of S2S Searching --- p.54 / Chapter 3.5.4 --- Quality of Content Summary --- p.57 / Chapter 4. 
--- GAroute --- p.59 / Chapter 4.1 --- Proposed Hybrid P2P Network Model --- p.59 / Chapter 4.1.1 --- Background of Hybrid P2P Networks --- p.60 / Chapter 4.1.2 --- Roles of Zone Managers --- p.62 / Chapter 4.2 --- Proposed GAroute --- p.65 / Chapter 4.2.1 --- Genetic Representation --- p.69 / Chapter 4.2.2 --- Population Initialization --- p.70 / Chapter 4.2.3 --- Mutation --- p.72 / Chapter 4.2.4 --- Crossover --- p.74 / Chapter 4.2.5 --- Fission --- p.77 / Chapter 4.2.6 --- Creation --- p.80 / Chapter 4.2.7 --- Selection --- p.81 / Chapter 4.2.8 --- Stopping Criteria --- p.83 / Chapter 4.2.9 --- Optimization --- p.86 / Chapter 4.3 --- Experiments and Discussions --- p.89 / Chapter 4.3.1 --- Property of Different Topologies --- p.91 / Chapter 4.3.2 --- Scalability and Quality in Different Topologies --- p.92 / Chapter 4.3.3 --- Scalability and Quality in Different Quantities --- p.96 / Chapter 4.3.4 --- Verification of Lower Bandwidth Consumption --- p.101 / Chapter 4.3.5 --- Verification of Better Parallel Search --- p.105 / Chapter 5. --- Discussion --- p.110 / Chapter 6. --- Conclusion --- p.114 / Chapter 7. --- Bibliography --- p.118 / Chapter 8. --- Appendix --- p.123 / Chapter 8.1 --- S2S Search Engine --- p.123 / Chapter 8.1.1 --- Site Owner Perspective --- p.123 / Chapter 8.1.2 --- Search Engine User Perspective --- p.128 / Chapter 8.2 --- GAroute Library --- p.129
|