161 |
Internet multimedia information retrieval based on link analysis. / January 2004
Chan Ka Yan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves i-iv (3rd gp.)). / Abstracts in English and Chinese. / ACKNOWLEDGEMENT --- p.I / ABSTRACT --- p.II / 摘要 --- p.IV / TABLE OF CONTENT --- p.VI / LIST OF FIGURE --- p.VIII / LIST OF TABLE --- p.IX / Chapter CHAPTER 1. --- INTRODUCTION --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.2 --- Importance of hyperlink analysis --- p.2 / Chapter CHAPTER 2. --- RELATED WORK --- p.4 / Chapter 2.1 --- Crawling --- p.4 / Chapter 2.1.1 --- Crawling method for HITS Algorithm --- p.4 / Chapter 2.1.2 --- Crawling method for Page Rank Algorithm --- p.7 / Chapter 2.2 --- Ranking --- p.7 / Chapter 2.2.1 --- Page Rank Algorithm --- p.8 / Chapter 2.2.2 --- HITS Algorithm --- p.11 / Chapter 2.2.3 --- PageRank-HITS Algorithm --- p.15 / Chapter 2.2.4 --- SALSA Algorithm --- p.16 / Chapter 2.2.5 --- Average and Sim --- p.18 / Chapter 2.2.6 --- Netscape Approach --- p.19 / Chapter 2.2.7 --- Cocitation Approach --- p.19 / Chapter 2.3 --- Multimedia Information Retrieval --- p.20 / Chapter 2.3.1 --- Octopus --- p.21 / Chapter CHAPTER 3. --- RESEARCH METHODOLOGY --- p.25 / Chapter 3.1 --- Research Objective --- p.25 / Chapter 3.2 --- Proposed Crawling Methodology --- p.26 / Chapter 3.2.1 --- Collecting Media Objects --- p.26 / Chapter 3.2.2 --- Filtering the collection of links --- p.29 / Chapter 3.3 --- Proposed Ranking Methodology --- p.34 / Chapter 3.3.1 --- Identifying the factors affect ranking --- p.34 / Chapter 3.3.2 --- Modified Ranking Algorithms --- p.37 / Chapter CHAPTER 4. --- EXPERIMENTAL RESULTS AND DISCUSSIONS --- p.52 / Chapter 4.1 --- Experimental Setup --- p.52 / Chapter 4.1.1 --- Assumptions for the Experiment --- p.53 / Chapter 4.2 --- Some Observations from Experiment --- p.54 / Chapter 4.2.1 --- Dangling links --- p.55 / Chapter 4.2.2 --- "Good Hub = bad Authority, Good Authority = bad Hub?" 
--- p.55 / Chapter 4.2.3 --- Setting of weights --- p.56 / Chapter 4.3 --- Discussion on Experimental Results --- p.57 / Chapter 4.3.1 --- Relevance --- p.57 / Chapter 4.3.2 --- Precision and recall --- p.58 / Chapter 4.3.3 --- Significance testing --- p.61 / Chapter 4.3.4 --- Ranking --- p.63 / Chapter 4.4 --- Limitations and Difficulties --- p.67 / Chapter 4.4.1 --- Small size of the base set --- p.68 / Chapter 4.4.2 --- Parameter settings --- p.68 / Chapter 4.4.3 --- Unable to remove all the meaningless links from base set --- p.68 / Chapter 4.4.4 --- Resources and time-consuming --- p.69 / Chapter 4.4.5 --- TKC Effect --- p.69 / Chapter 4.4.6 --- Continuously updated format of HTML codes and file types --- p.70 / Chapter 4.4.7 --- The object citation habit of authors --- p.70 / Chapter CHAPTER 5. --- CONCLUSION --- p.71 / Chapter 5.1 --- Contribution of our Methodology --- p.71 / Chapter 5.2 --- Possible Improvement --- p.71 / Chapter 5.3 --- Conclusion --- p.72 / BIBLIOGRAPHY --- p.I / APPENDIX --- p.A-I / Chapter A.1 --- One-tailed paired t-test results --- p.A-I / Chapter A2. --- Anova results --- p.A-IV
|
162 |
Efficient techniques for video shot segmentation and retrieval. / CUHK electronic theses & dissertations collection / January 2007
Video segmentation is the first step in most content-based video analysis. In this thesis, several methods are proposed to detect shot transitions, including cuts and wipes. In particular, a new cut detection method is proposed that applies multi-adaptive thresholds during three-step processing of frame-by-frame discontinuity values. A "likelihood value", which measures the possibility of the presence of a cut at each step of processing, is used to reduce the influence of threshold selection on the detection performance. A wipe detection algorithm is also proposed to detect various wipe effects with accurate frame ranges. In the algorithm, we carefully model a wipe based on its properties and then use the model to remove possible confusion caused by motion or other transition effects. / With the segmented video shots, video indexing and retrieval systems retrieve video shots using shot-based similarity matching based on the features of shot key-frames. Most shot-based similarity matching methods focus on low-level features such as color and texture. Those methods are often not effective enough in video retrieval because of the large gap between the semantic interpretation of videos and the low-level features. In this thesis, we propose an attention-driven video retrieval method that uses an efficient spatiotemporal attention detection framework. Within the framework, we propose an efficient method for focus of attention (FOA) detection, which adaptively combines spatial and motion attention to form an overall attention map. Without computing motion explicitly, it detects motion attention using the rank deficiency of gray-scale gradient tensors. We also propose an attention-driven shot matching method that primarily uses FOA. The matching method boosts the attended regions in the respective shots by converting attention values to importance factors in the process of shot similarity matching. Experimental results demonstrate the advantages of the proposed method in shot similarity matching. / Li, Shan. / "September 2007." / Adviser: Moon-Chuen Lee. / Source: Dissertation Abstracts International, Volume: 69-02, Section: B, page: 1108. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 150-168). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
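Editorial illustration: the abstract above describes thresholding frame-by-frame discontinuity values to find cuts. The sketch below is not the thesis's three-step, multi-threshold method; it shows only the simpler general idea of a single locally adaptive threshold. The function name `detect_cuts`, the parameters `window` and `k`, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def detect_cuts(discontinuity, window=5, k=3.0):
    """Return frame indices whose discontinuity exceeds a local adaptive threshold.

    Hypothetical sketch: the threshold for frame i is mean + k * std of the
    values within `window` frames on either side (frame i itself excluded).
    """
    values = list(discontinuity)
    cuts = []
    for i, value in enumerate(values):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        if len(neighbours) < 2:
            continue  # not enough context to estimate a threshold
        threshold = mean(neighbours) + k * pstdev(neighbours)
        if value > threshold:
            cuts.append(i)
    return cuts


# Made-up discontinuity values with one obvious spike at frame 4.
sample = [0.10, 0.12, 0.11, 0.09, 0.95, 0.10, 0.13, 0.11, 0.10, 0.12]
print(detect_cuts(sample))
```

Because the threshold is recomputed from each frame's neighbourhood, a spike raises the bar for the frames around it while still standing out itself, which is one reason adaptive thresholds are less sensitive to a single global setting.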
|
163 |
Fast algorithms for sequence data searching. / January 1997
by Sze-Kin Lam. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 71-76). / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Related Work --- p.6 / Chapter 2.1 --- Sequence query processing --- p.8 / Chapter 2.2 --- Text sequence searching --- p.8 / Chapter 2.3 --- Numerical sequence searching --- p.11 / Chapter 2.4 --- Indexing schemes --- p.17 / Chapter 3 --- Sequence Data Searching using the Projection Algorithm --- p.21 / Chapter 3.1 --- Sequence Similarity --- p.21 / Chapter 3.2 --- Searching Method --- p.24 / Chapter 3.2.1 --- Sequential Algorithm --- p.24 / Chapter 3.2.2 --- Projection Algorithm --- p.25 / Chapter 3.3 --- Handling Scaling Problem by the Projection Algorithm --- p.33 / Chapter 4 --- Sequence Data Searching using Hashing Algorithm --- p.37 / Chapter 4.1 --- Sequence Similarity --- p.37 / Chapter 4.2 --- Hashing algorithm --- p.39 / Chapter 4.2.1 --- Motivation of the Algorithm --- p.40 / Chapter 4.2.2 --- Hashing Algorithm using dynamic hash function --- p.44 / Chapter 4.2.3 --- Handling Scaling Problem by the Hashing Algorithm --- p.47 / Chapter 5 --- Comparisons between algorithms --- p.50 / Chapter 5.1 --- Performance comparison with the sequence searching algorithms --- p.54 / Chapter 5.2 --- Comparison between indexing structures --- p.54 / Chapter 5.3 --- Comparison between sequence searching algorithms in coping some deficits --- p.55 / Chapter 6 --- Performance Evaluation --- p.58 / Chapter 6.1 --- Performance Evaluation using Projection Algorithm --- p.58 / Chapter 6.2 --- Performance Evaluation using Hashing Algorithm --- p.61 / Chapter 7 --- Conclusion --- p.66 / Chapter 7.1 --- Motivation of the thesis --- p.66 / Chapter 7.1.1 --- Insufficiency of Euclidean distance --- p.67 / Chapter 7.1.2 --- Insufficiency of orthonormal transforms --- p.67 / Chapter 7.1.3 --- Insufficiency of multi-dimensional indexing structure --- p.68 / 
Chapter 7.2 --- Major contribution --- p.68 / Chapter 7.2.1 --- Projection algorithm --- p.68 / Chapter 7.2.2 --- Hashing algorithm --- p.69 / Chapter 7.3 --- Future work --- p.70 / Bibliography --- p.71
|
164 |
Indexing methods for multimedia data objects given pair-wise distances. / January 1997
by Chan Mei Shuen Polly. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 67-70). / Abstract --- p.ii / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Definitions --- p.3 / Chapter 1.2 --- Thesis Overview --- p.5 / Chapter 2 --- Background and Related Work --- p.6 / Chapter 2.1 --- Feature-Based Index Structures --- p.6 / Chapter 2.2 --- Distance Preserving Methods --- p.8 / Chapter 2.3 --- Distance-Based Index Structures --- p.9 / Chapter 2.3.1 --- The Vantage-Point Tree Method --- p.10 / Chapter 3 --- The Problem of Distance Preserving Methods in Querying --- p.12 / Chapter 3.1 --- Some Experimental Results --- p.13 / Chapter 3.2 --- Discussion --- p.15 / Chapter 4 --- Nearest Neighbor Search in VP-trees --- p.17 / Chapter 4.1 --- The sigma-factor Algorithm --- p.18 / Chapter 4.2 --- The Constant-α Algorithm --- p.22 / Chapter 4.3 --- The Single-Pass Algorithm --- p.24 / Chapter 4.4 --- Discussion --- p.25 / Chapter 4.5 --- Performance Evaluation --- p.26 / Chapter 4.5.1 --- Experimental Setup --- p.27 / Chapter 4.5.2 --- Results --- p.28 / Chapter 5 --- Update Operations on VP-trees --- p.41 / Chapter 5.1 --- Insert --- p.41 / Chapter 5.2 --- Delete --- p.48 / Chapter 5.3 --- Performance Evaluation --- p.51 / Chapter 6 --- Minimizing Distance Computations --- p.57 / Chapter 6.1 --- A Single Vantage Point per Level --- p.58 / Chapter 6.2 --- Reuse of Vantage Points --- p.59 / Chapter 6.3 --- Performance Evaluation --- p.60 / Chapter 7 --- Conclusions and Future Work --- p.63 / Chapter 7.1 --- Future Work --- p.65 / Bibliography --- p.67
|
165 |
Product record normalization across different web sites. / January 2008
Wong, Tik Shun. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 57-62). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation --- p.1 / Chapter 1.2 --- Thesis Contributions --- p.10 / Chapter 1.3 --- Thesis Organization --- p.11 / Chapter 2 --- Literature Review --- p.12 / Chapter 2.1 --- Related Work on Product Record Normalization --- p.12 / Chapter 2.2 --- Related Work on Information Extraction --- p.15 / Chapter 2.2.1 --- Information Extraction Methods for Unstructured Documents --- p.16 / Chapter 2.2.2 --- Wrappers for Information Extraction --- p.16 / Chapter 2.2.3 --- Supervised Methods for Information Extraction --- p.17 / Chapter 2.2.4 --- Semi-supervised Methods for Information Extraction --- p.20 / Chapter 2.2.5 --- Unsupervised Methods for Information Extraction --- p.21 / Chapter 2.2.6 --- Probabilistic Methods for Information Extraction --- p.23 / Chapter 3 --- Background and Problem Definition --- p.26 / Chapter 3.1 --- Background --- p.26 / Chapter 3.2 --- Problem Definition --- p.29 / Chapter 4 --- Our Approach --- p.31 / Chapter 4.1 --- Generative Model --- p.31 / Chapter 4.2 --- Our Inference Method --- p.34 / Chapter 5 --- Experiments --- p.41 / Chapter 5.1 --- Experimental Setup --- p.41 / Chapter 5.2 --- Experimental Results --- p.49 / Chapter 5.3 --- The Effect of Reference Product Prior --- p.52 / Chapter 5.4 --- The Effect of Layout Information --- p.53 / Chapter 6 --- Conclusions and Future Work --- p.55 / Bibliography --- p.57 / Chapter A --- Detailed Performance of Product Record Normalization --- p.63
|
166 |
Web mining techniques for query log analysis and expertise retrieval. / Web挖掘技術及其在搜索引擎查詢日誌和專家搜索中的應用 / CUHK electronic theses & dissertations collection / Web wa jue ji shu ji qi zai sou suo yin qing cha xun ri zhi he zhuan jia sou suo zhong de ying yong / January 2009
Deng, Hongbo. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 156-175). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
|
167 |
Cross-Domain Content-Based Retrieval of Audio Music through Transcription / Suyoto, Iman S. H., ishs@ishs.net / January 2009
Research in the field of music information retrieval (MIR) is concerned with methods to effectively retrieve a piece of music based on a user's query. An important goal in MIR research is the ability to successfully retrieve music stored as recorded audio using note-based queries. In this work, we consider the searching of musical audio using symbolic queries. We first examined the effectiveness of using a relative pitch approach to represent queries and pieces. Our experimental results revealed that this technique, while effective, is optimal when the whole tune is used as a query. We then suggested an algorithm involving the use of pitch classes in conjunction with the longest common subsequence algorithm between a query and target, also using the whole tune as a query. We also proposed an algorithm that works effectively when only a small part of a tune is used as a query. The algorithm makes use of a sliding window in addition to pitch classes and the longest common subsequence algorithm between a query and target. We examined the algorithm using queries based on the beginning, middle, and ending parts of pieces. We performed experiments on an audio collection and manually constructed symbolic queries. Our experimental evaluation revealed that our techniques are highly effective, with most queries used in our experiments being able to retrieve a correct answer in the first rank position. In addition, we examined the effectiveness of duration-based features for improving retrieval effectiveness over the use of pitch only. We investigated note durations and inter-onset intervals. For this purpose, we used solely symbolic music so that we could focus on the core of the problem. A relative pitch approach alongside a relative duration representation was used in our experiments. Our experimental results showed that durations fail to significantly improve retrieval effectiveness, whereas inter-onset intervals significantly improve retrieval effectiveness.
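Editorial illustration: the abstract above combines pitch classes, the longest common subsequence (LCS), and a sliding window. The following is a minimal sketch of how those three pieces can fit together, not the author's implementation. It assumes that both query and target are already available as lists of MIDI note numbers (the transcription step is out of scope), and the function names, the default window length of twice the query length, and the normalisation by query length are all hypothetical choices.

```python
def pitch_classes(midi_notes):
    """Map MIDI note numbers to pitch classes (0-11), discarding the octave."""
    return [n % 12 for n in midi_notes]


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def best_window_score(query, target, window=None):
    """Best LCS score of the query against any window of the target, in [0, 1].

    Hypothetical scoring: LCS length divided by query length, maximised over
    all windows of `window` notes (default: twice the query length).
    """
    q = pitch_classes(query)
    t = pitch_classes(target)
    if not q:
        return 0.0
    window = window or 2 * len(q)
    if len(t) <= window:
        return lcs_length(q, t) / len(q)
    starts = range(len(t) - window + 1)
    return max(lcs_length(q, t[s:s + window]) for s in starts) / len(q)


# Made-up target melody and a short query taken from its ending.
target = [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60]
print(best_window_score([64, 62, 60], target))
```

The window keeps a short query from matching notes scattered across the whole piece, and the use of pitch classes makes the comparison insensitive to the octave in which a note was transcribed.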
|
168 |
Topic-focused and summarized web information retrieval / Yoo, Seung Yeol, Computer Science & Engineering, Faculty of Engineering, UNSW / January 2007
As the Web grows, with a rapidly increasing number of heterogeneous Web pages, Web users often suffer from two problems: P1) irrelevant information and P2) information overload. Irrelevant information indicates weak relevance between the retrieved information and a user's information need. Information overload indicates that the retrieved information may contain 1) redundant information (e.g., information common to two retrieved Web pages) or 2) more information than a user can easily understand. We consider four major causes of problems P1) and P2):
• Firstly, ambiguous query terms.
• Secondly, ambiguous terms in a Web page.
• Thirdly, a query and a Web page cannot be semantically matched because of the first and second causes.
• Fourthly, the whole content of a Web page is a coarse context boundary for measuring the similarity between the Web page and a query.
To address problems P1) and P2), we consider that the meanings of words in a Web page and a query are primitive hints for understanding the related semantics of the Web page. Thus, in this dissertation, we developed three cooperative technologies: Word Sense Based Web Information Retrieval (WSBWIR), Subjective Segment Importance Model (SSIM) and Topic Focused Web Page Summarization (TFWPS).
• WSBWIR 1) allows a user to describe their information needs at the sense level and 2) provides one way for users to conceptually explore information existing within Web pages.
• SSIM discovers a semantic structure of a Web page. A semantic structure respects not only the Web page author's logical presentation structure but also a user's specific topic interests in the Web page at query time.
• TFWPS dynamically generates extractive summaries respecting a user's topic interests.
The WSBWIR, SSIM and TFWPS technologies were implemented and evaluated through several case studies and through classification and clustering tasks. Our experiments demonstrated 1) the comparable effectiveness of exploring Web pages using word senses, and 2) that the segments partitioned by SSIM and the summaries generated by TFWPS can provide more topically coherent features for classification and clustering purposes.
|
169 |
Document management and retrieval for specialised domains: an evolutionary user-based approach / Kim, Mihye, Computer Science & Engineering, Faculty of Engineering, UNSW / January 2003
Browsing marked-up documents by traversing hyperlinks has become probably the most important means by which documents are accessed, both via the World Wide Web (WWW) and organisational Intranets. However, there is a pressing demand for document management and retrieval systems to deal appropriately with the massive number of documents available. There are two classes of solution: general search engines, whether for the WWW or an Intranet, which make little use of specific domain knowledge; and hand-crafted specialised systems, which are costly to build and maintain. The aim of this thesis was to develop a document management and retrieval system suitable for small communities as well as individuals in specialised domains on the Web. The aim was to allow users to easily create and maintain their own organisation of documents while ensuring continual improvement in the retrieval performance of the system as it evolves. The system developed is based on the free annotation of documents by users and is browsed using the concept lattice of Formal Concept Analysis (FCA). A number of annotation support tools were developed to aid the annotation process so that a suitable system evolved. Experiments were conducted in using the system to assist in finding staff and student home pages at the School of Computer Science and Engineering, University of New South Wales. Results indicated that the annotation tools provided a good level of assistance so that documents were easily organised, and that a lattice-based browsing structure that evolves in an ad hoc fashion provided good efficiency in retrieval performance. An interesting result suggested that although an established external taxonomy can be useful in proposing annotation terms, users appear to be very selective in their use of the terms proposed. Results also supported the hypothesis that the concept lattice of FCA helped take users beyond a narrow search to find other useful documents.
In general, lattice-based browsing was considered as a more helpful method than Boolean queries or hierarchical browsing for searching a specialised domain. We conclude that the concept lattice of Formal Concept Analysis, supported by annotation techniques is a useful way of supporting the flexible open management of documents required by individuals, small communities and in specialised domains. It seems likely that this approach can be readily integrated with other developments such as further improvements in search engines and the use of semantically marked-up documents, and provide a unique advantage in supporting autonomous management of documents by individuals and groups - in a way that is closely aligned with the autonomy of the WWW.
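Editorial illustration: the abstract above browses user-annotated documents through the concept lattice of Formal Concept Analysis. A formal concept pairs a set of documents (the extent) with exactly the set of terms they all share (the intent). The sketch below enumerates the concepts of a tiny annotation table by brute force; it is an assumption-laden toy, not the system described in the thesis. The function name, the page names, and the annotation terms are hypothetical, and the enumeration is exponential in the number of documents, so it only suits very small examples.

```python
from itertools import combinations


def formal_concepts(annotations):
    """Enumerate all (extent, intent) pairs of a small formal context.

    `annotations` maps each document name to the set of terms attached to it.
    Brute force over every subset of documents (hypothetical sketch).
    """
    docs = list(annotations)
    all_terms = set().union(*annotations.values()) if annotations else set()

    def common_terms(group):
        # Terms shared by every document in the group (all terms if empty).
        sets = [set(annotations[d]) for d in group]
        return set.intersection(*sets) if sets else set(all_terms)

    def docs_with(terms):
        # Documents annotated with every term in `terms`.
        return frozenset(d for d in docs if terms <= set(annotations[d]))

    concepts = set()
    for size in range(len(docs) + 1):
        for group in combinations(docs, size):
            intent = frozenset(common_terms(group))
            concepts.add((docs_with(intent), intent))
    return concepts


# Made-up annotations for three home pages.
pages = {
    "page_a": {"staff", "databases"},
    "page_b": {"staff", "ai"},
    "page_c": {"student", "ai"},
}
for extent, intent in sorted(formal_concepts(pages), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by extent inclusion gives the lattice: moving down from a concept adds terms and narrows the set of documents, which is the browsing step the abstract refers to.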
|
170 |
Enhancing information retrieval effectiveness through use of context / Chanana, Vivek, University of Western Sydney, College of Science, Technology and Environment, School of Computing and Information Technology / January 2004
Information available in digital form has grown phenomenally in recent years. Finding the required information has become a difficult and challenging task. This is primarily due to the diversity and enormous volume of information available and the change in the nature of people now seeking information – from experts to ordinary users of desktop computers with varying interests and objectives. The problem of finding relevant information is made worse by the poor retrieval effectiveness of most current information retrieval (IR) systems, which are primarily based on keyword indexing techniques. Though these systems retrieve documents that contain the keywords specified in the query, the documents retrieved may not be in the context that the user wanted. This research argues that exploiting the user’s context of the information need has the potential to improve the performance of information retrieval systems. Context can reduce ambiguity by associating meanings with request/query terms, and thus limit the scope of possible misinterpretations of query terms. A new way of defining context categories based on information type is proposed. This notion of context differs from the conventional way of defining information categories based on subject topics, as it is closely linked with the situation in which the user’s need for information originates. A new context-based information retrieval system in which users can specify the context in which they are seeking information is presented. This work also includes a full-scale development, implementation and evaluation of the new context-based information system. / Doctor of Philosophy (PhD)
|