Global ETD Search

151	Accessing Information on the World Wide Web: Predicting Usage Based on Involvement Langford, James David 05 1900 (has links) Advice for Web designers often includes an admonition to use short, scannable, bullet-pointed text, reflecting the common belief that browsing the Web most often involves scanning rather than reading. Literature from several disciplines focuses on the myriad combinations of factors related to online reading but studies of the users' interests and motivations appear to offer a more promising avenue for understanding how users utilize information on Web pages. This study utilized the modified Personal Involvement Inventory (PII), a ten-item instrument used primarily in the marketing and advertising fields, to measure interest and motivation toward a topic presented on the Web. Two sites were constructed from Reader's Digest Association, Inc. online articles and a program written to track students' use of the site. Behavior was measured by the initial choice of short versus longer versions of the main page, the number of pages visited and the amount of time spent on the site. Data were gathered from students at a small, private university in the southwest part of the United States to answer six hypotheses which posited that subjects with higher involvement in a topic presented on the Web and a more positive attitude toward the Web would tend to select the longer text version, visit more pages, and spend more time on the site. While attitude toward the Web did not correlate significantly with any of the behavioral factors, the level of involvement was associated with the use of the sites in two of three hypotheses, but only partially in the manner hypothesized. Increased involvement with a Web topic did correlate with the choice of a longer, more detailed initial Web page, but was inversely related to the number of pages viewed so that the higher the involvement, the fewer pages visited. 
An additional indicator of usage, the average amount of time spent on each page, was measured and revealed that more involved users spent more time on each page. Internet searching. Information retrieval. Information Web involvement
152	An analysis of Linux RAM forensics Urrea, Jorge Mario. 03 1900 (has links) During a forensic investigation of a computer system, the ability to retrieve volatile information can be of critical importance. The contents of RAM could reveal malicious code running on the system that has been deleted from the hard drive or, better yet, that was never resident on the hard drive at all. RAM can also provide the programs most recently run and files most recently opened in the system. However, due to the nature of modern operating systems, these programs and files are not typically stored contiguously, which makes most efforts to retrieve files larger than one page size futile. To date, analysis of RAM images has been largely restricted to searching for ASCII string content, which typically only yields text information such as document fragments, passwords or scripts. This thesis explores the memory management structures in a SUSE Linux system (kernel version 2.6.13-15) to make sense out of the chaos in RAM and facilitate the retrieval of files/programs larger than one page size. The analysis includes methods for incorporating swap space information for files that may not reside completely within physical memory. The results of this thesis will become the basis of later research efforts in RAM forensics. This includes the creation of tools that will provide forensic analysts with a clear map of what is resident in the volatile memory of a system. Computer science Computer crimes Information retrieval Investigation
153	Automatické doporučování ilustračních snímků / Automatic suggestion of illustrative images Odcházel, Ondřej January 2014 (has links) The objective of this thesis is to implement a web application designed for recommendation of stock photos. The application takes newspaper articles in Czech or English as input and, based on the text itself, suggests appropriate stock photos. The implemented application also searches for images by visual similarity. The thesis deals with theoretical aspects of keyword extraction and text language detection. It further analyzes the possibilities for efficient search of similar vectors, which are used in the search component for visually similar images. It also describes the options for developing a modern web frontend and backend. The quality of the stock photo recommendation algorithm is tested on users.
154	Music metadata capture in the studio from audio and symbolic data Hargreaves, Steven January 2014 (has links) Music Information Retrieval (MIR) tasks, in the main, are concerned with the accurate generation of one of a number of different types of music metadata: beat onsets, or melody extraction, for example. Almost always, they operate on fully mixed digital audio recordings. Commonly, this means that a large amount of signal processing effort is directed towards the isolation, and then identification, of certain highly relevant aspects of the audio mix. In some cases, results of one MIR algorithm are useful, if not essential, to the operation of another; a chord detection algorithm, for example, is highly dependent upon accurate pitch detection. Although not clearly defined in all cases, certain rules exist which we may take from music theory in order to assist the task (the particular note intervals which make up a specific chord, for example). On the question of generating accurate, low level music metadata (e.g. chromatic pitch and score onset time), a potentially huge advantage lies in the use of multitrack, rather than mixed, audio recordings, in which the separate instrument recordings may be analysed in isolation. Additionally, in MIR, as in many other research areas currently, there is an increasing push towards the use of the Semantic Web for publishing metadata using the Resource Description Framework (RDF). Semantic Web technologies, though, also facilitate the querying of data via the SPARQL query language, as well as logical inferencing via the careful creation and use of web ontology language (OWL) ontologies. This, in turn, opens up the intriguing possibility of deferring our decision regarding which particular type of MIR query to ask of our low-level music metadata until some point later down the line, long after all the heavy signal processing has been carried out.
In this thesis, we describe an over-arching vision for an alternative MIR paradigm, built around the principles of early, studio-based metadata capture, and exploitation of open, machine-readable Semantic Web data. Using the specific example of structural segmentation, we demonstrate that by analysing multitrack rather than mixed audio, we are able to achieve a significant and quantifiable increase in the accuracy of our segmentation algorithm. We also provide details of a new multitrack audio dataset with structural segmentation annotations, created as part of this research, and available for public use. Furthermore, we show that it is possible to fully implement a pair of pattern discovery algorithms (the SIA and SIATEC algorithms, highly applicable, but not restricted to, symbolic music data analysis) using only Semantic Web technologies: the SPARQL query language, acting on RDF data, in tandem with a small OWL ontology. We describe the challenges encountered by taking this approach, the particular solution we've arrived at, and we evaluate the implementation both in terms of its execution time, and also within the wider context of our vision for a new MIR paradigm. 621.389
155	Rival penalized competitive learning for content-based indexing. January 1998 (has links) by Lau Tak Kan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 100-108). / Abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.2 --- Problem Defined --- p.5 / Chapter 1.3 --- Contributions --- p.5 / Chapter 1.4 --- Thesis Organization --- p.7 / Chapter 2 --- Content-based Retrieval Multimedia Database Background and Indexing Problem --- p.8 / Chapter 2.1 --- Feature Extraction --- p.8 / Chapter 2.2 --- Nearest-neighbor Search --- p.10 / Chapter 2.3 --- Content-based Indexing Methods --- p.15 / Chapter 2.4 --- Indexing Problem --- p.22 / Chapter 3 --- Data Clustering Methods for Indexing --- p.25 / Chapter 3.1 --- Proposed Solution to Indexing Problem --- p.25 / Chapter 3.2 --- Brief Description of Several Clustering Methods --- p.26 / Chapter 3.2.1 --- K-means --- p.26 / Chapter 3.2.2 --- Competitive Learning (CL) --- p.27 / Chapter 3.2.3 --- Rival Penalized Competitive Learning (RPCL) --- p.29 / Chapter 3.2.4 --- General Hierarchical Clustering Methods --- p.31 / Chapter 3.3 --- Why RPCL? 
--- p.32 / Chapter 4 --- Non-hierarchical RPCL Indexing --- p.33 / Chapter 4.1 --- The Non-hierarchical Approach --- p.33 / Chapter 4.2 --- Performance Experiments --- p.34 / Chapter 4.2.1 --- Experimental Setup --- p.35 / Chapter 4.2.2 --- Experiment 1: Test for Recall and Precision Performance --- p.38 / Chapter 4.2.3 --- Experiment 2: Test for Different Sizes of Input Data Sets --- p.45 / Chapter 4.2.4 --- Experiment 3: Test for Different Numbers of Dimensions --- p.49 / Chapter 4.2.5 --- Experiment 4: Compare with Actual Nearest-neighbor Results --- p.53 / Chapter 4.3 --- Chapter Summary --- p.55 / Chapter 5 --- Hierarchical RPCL Indexing --- p.56 / Chapter 5.1 --- The Hierarchical Approach --- p.56 / Chapter 5.2 --- The Hierarchical RPCL Binary Tree (RPCL-b-tree) --- p.58 / Chapter 5.3 --- Insertion --- p.61 / Chapter 5.4 --- Deletion --- p.63 / Chapter 5.5 --- Searching --- p.63 / Chapter 5.6 --- Experiments --- p.69 / Chapter 5.6.1 --- Experimental Setup --- p.69 / Chapter 5.6.2 --- Experiment 5: Test for Different Node Sizes --- p.72 / Chapter 5.6.3 --- Experiment 6: Test for Different Sizes of Data Sets --- p.75 / Chapter 5.6.4 --- Experiment 7: Test for Different Data Distributions --- p.78 / Chapter 5.6.5 --- Experiment 8: Test for Different Numbers of Dimensions --- p.80 / Chapter 5.6.6 --- Experiment 9: Test for Different Numbers of Database Objects Retrieved --- p.83 / Chapter 5.6.7 --- Experiment 10: Test with VP-tree --- p.86 / Chapter 5.7 --- Discussion --- p.90 / Chapter 5.8 --- A Relationship Formula --- p.93 / Chapter 5.9 --- Chapter Summary --- p.96 / Chapter 6 --- Conclusion --- p.97 / Chapter 6.1 --- Future Works --- p.97 / Chapter 6.2 --- Conclusion --- p.98 / Bibliography --- p.100 Multimedia systems Indexing Cluster analysis Information retrieval
156	An effective Chinese indexing method based on partitioned signature files. January 1998 (has links) Wong Chi Yin. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 107-114). / Abstract also in Chinese. / Abstract --- p.ii / Acknowledgements --- p.vi / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Introduction to Chinese IR --- p.1 / Chapter 1.2 --- Contributions --- p.3 / Chapter 1.3 --- Organization of this Thesis --- p.5 / Chapter 2 --- Background --- p.6 / Chapter 2.1 --- Indexing methods --- p.6 / Chapter 2.1.1 --- Full-text scanning --- p.7 / Chapter 2.1.2 --- Inverted files --- p.7 / Chapter 2.1.3 --- Signature files --- p.9 / Chapter 2.1.4 --- Clustering --- p.10 / Chapter 2.2 --- Information Retrieval Models --- p.10 / Chapter 2.2.1 --- Boolean model --- p.11 / Chapter 2.2.2 --- Vector space model --- p.11 / Chapter 2.2.3 --- Probabilistic model --- p.13 / Chapter 2.2.4 --- Logical model --- p.14 / Chapter 3 --- Investigation of Segmentation on the Vector Space Retrieval Model --- p.15 / Chapter 3.1 --- Segmentation of Chinese Texts --- p.16 / Chapter 3.1.1 --- Character-based segmentation --- p.16 / Chapter 3.1.2 --- Word-based segmentation --- p.18 / Chapter 3.1.3 --- N-Gram segmentation --- p.21 / Chapter 3.2 --- Performance Evaluation of Three Segmentation Approaches --- p.23 / Chapter 3.2.1 --- Experimental Setup --- p.23 / Chapter 3.2.2 --- Experimental Results --- p.24 / Chapter 3.2.3 --- Discussion --- p.29 / Chapter 4 --- Signature File Background --- p.32 / Chapter 4.1 --- Superimposed coding --- p.34 / Chapter 4.2 --- False drop probability --- p.36 / Chapter 5 --- Partitioned Signature File Based On Chinese Word Length --- p.39 / Chapter 5.1 --- Fixed Weight Block (FWB) Signature File --- p.41 / Chapter 5.2 --- Overview of PSFC --- p.45 / Chapter 5.3 --- Design Considerations --- p.50 / Chapter 6 --- New Hashing Techniques for Partitioned Signature Files --- p.59 / Chapter 6.1 --- 
Direct Division Method --- p.61 / Chapter 6.2 --- Random Number Assisted Division Method --- p.62 / Chapter 6.3 --- Frequency-based hashing method --- p.64 / Chapter 6.4 --- Chinese character-based hashing method --- p.68 / Chapter 7 --- Experiments and Results --- p.72 / Chapter 7.1 --- Performance evaluation of partitioned signature file based on Chinese word length --- p.74 / Chapter 7.1.1 --- Retrieval Performance --- p.75 / Chapter 7.1.2 --- Signature Reduction Ratio --- p.77 / Chapter 7.1.3 --- Storage Requirement --- p.79 / Chapter 7.1.4 --- Discussion --- p.81 / Chapter 7.2 --- Performance evaluation of different dynamic signature generation methods --- p.82 / Chapter 7.2.1 --- Collision --- p.84 / Chapter 7.2.2 --- Retrieval Performance --- p.86 / Chapter 7.2.3 --- Discussion --- p.89 / Chapter 8 --- Conclusions and Future Work --- p.91 / Chapter 8.1 --- Conclusions --- p.91 / Chapter 8.2 --- Future work --- p.95 / Chapter A --- Notations of Signature Files --- p.96 / Chapter B --- False Drop Probability --- p.98 / Chapter C --- Experimental Results --- p.103 / Bibliography --- p.107 Chinese language--Data processing Indexing Information retrieval
157	Learning with social media. / 基於社會化媒體的學習 / CUHK electronic theses & dissertations collection / Ji yu she hui hua mei ti de xue xi January 2013 (has links) 隨著Web 2.0系統在過去十年的迅猛發展，社會化媒體，比如社會化評分系統、社會化標籤系統、在線論壇和社會化問答系統，已經革命性地改變了人們在互聯網上創造和分享內容的方式。但是，面對社會化媒體數據的飛速增長，用戶面臨嚴重的信息過載的問題。現在，基於社會化媒體學習的社會化計算，已經發展成爲了幫助社會化媒體用戶有效解決信息需求的一個重要的研究領域。一般來說，用戶在社會化媒體中發佈信息，期望通過社會化計算尋找到合適的項目。爲了更好地理解用戶的興趣，分析不同類型的用戶產生數據是非常重要的。另一方面，返回給用戶的可以是項目，或是擁有相似興趣的其他用戶。除了基於用戶的分析，進行基於項目的分析也是非常有趣和重要的，比如理解項目的屬性，將語義相關的項目聚在一起爲了更好地滿足用戶的信息需求等。 / 本論文的目地是提出自動化和可擴展的模型來幫助社會化媒體用戶更有效的解決信息需求。這些模型基於社會化媒體中兩個重要的組成提出：用戶和項目。因此，基於以下兩個目標，我們提出一個統一的框架來整合用戶信息和項目信息：1) 通過用戶的行為找出用戶的興趣，并為之推薦可能感興趣的項目和相似興趣的用戶；2) 理解項目的屬性，並將語義相關的項目聚合在一起從而能更好的滿足用戶的信息需求。 / 爲了完成第一個目標，我們提出了一個新的矩陣分解的框架來整合不同的用戶行為數據，從而預測用戶對新項目的興趣。這個框架有效地解決了數據稀疏性以及傳統方法中信息來源單一的問題，其次，爲了給社會化媒體用戶提供自動發現類似興趣的其他用戶的方式，通過利用社會化標籤信息，我們提出了基於用戶興趣挖掘和基於興趣的用戶推薦的框架。大量的真實數據實驗驗證了提出的基於用戶的模型的有效性。 / 爲了完成第二個目標，我們在具問答性質的社會化媒體中提出了問題推薦的應用。問題推薦的目標是基於一個用戶問題推薦語義相關的問題。傳統的詞袋模型不能有效地解決相關問題中用詞不同的問題。因此，我們提出了兩個模型來結合詞法分析以及潛在語義分析，從而有效地衡量問題間的語義相關度。在問題分析中，當前研究缺少對問題屬性的認識。爲了解決這個問題，我們提出了一個有監督學習的方法來識別問題的主觀性。具體來說，我們提出了一種基於社會化信號的無人工參與的自動收集訓練數據的方法。大量實驗證實了提出的方法的效果超過了之前的其他算法。 / 概括起來，圍繞社會化媒體中兩個重要的組成，我們提出了兩個基於用戶的模型和兩個基於項目的模型來幫助社會化媒體的用戶更準確更有效地解決信息需求。我們通過不同社會化媒體中的大量實驗證實了提出模型的有效性。 / With the astronomical growth of Web 2.0 over the past decade, social media systems, such as rating systems, social tagging systems, online forums, and community-based question answering (Q&A) systems, have revolutionized people’s way of creating and sharing contents on the Web. However, due to the explosive growth of data in social media systems, users are drowning in information and encountering information overload problem. Currently, social computing techniques, achieved through learning with social media, have emerged as an important research area to help social media users find their information needs. 
In general, users post content which reflects their interests in social media systems, and expect to obtain suitable items through social computing techniques. To better understand users’ interests, it is essential to analyze different types of user-generated content. On the other hand, the returned information may be items, or users with similar interests. Beyond the user-based analysis, it would be quite interesting and important to conduct item-oriented study, such as understanding items’ characteristics, and grouping items that are semantically related for better addressing users’ information needs. / The objective of this thesis is to establish automatic and scalable models to help social media users find their information needs more effectively. These models are proposed based on the two key entities in social media systems: user and item. One important aspect of this thesis is therefore to develop a framework to combine the user information and the item information with the following two purposes: 1) modeling users’ interests with respect to their behavior, and recommending items or users they may be interested in; and 2) understanding items’ characteristics, and grouping items that are semantically related for better addressing users’ information needs. / For the first purpose, a novel unified matrix factorization framework, which fuses different types of users’ behavior data, is proposed for predicting users’ interests in new items. The framework tackles the data sparsity problem and non-flexibility problem confronted by traditional algorithms. Furthermore, to provide users with an automatic and effective way to discover other users with common interests, we propose a framework for user interest modeling and interest-based user recommendation by utilizing users’ tagging information. Extensive evaluations on real world data demonstrate the effectiveness of the proposed user-based models.
/ For the second purpose, a new functionality, question suggestion, which aims to suggest questions that are semantically related to a queried question, is proposed in social media systems with Q&A functionalities. Existing bag-of-words approaches suffer from the shortcoming that they cannot bridge the lexical chasm between semantically related questions. Therefore, we present two models which combine both the lexical and latent semantic knowledge to measure the semantic relatedness among questions. In question analysis, there is a lack of understanding of questions’ characteristics. To tackle this problem, a supervised approach is developed to identify questions’ subjectivity. Moreover, we come up with an approach to collect training data automatically by utilizing social signals without involving any manual labeling. The experimental results show that our methods perform better than the state-of-the-art approaches. / In summary, based on the two key entities in social media systems, we present two user-based models and two item-oriented models to help social media users find their information needs more accurately and effectively through learning with social media. Extensive experiments on various social media systems confirm the effectiveness of the proposed models. / Detailed summary in vernacular field only. / Zhou, Chao. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 130-163). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese.
/ Abstract --- p.i / Acknowledgement --- p.vi / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview --- p.1 / Chapter 1.2 --- Thesis Contribution --- p.9 / Chapter 1.3 --- Thesis Organization --- p.11 / Chapter 2 --- Background Review --- p.15 / Chapter 2.1 --- Recommender System Techniques --- p.15 / Chapter 2.1.1 --- Content-based Filtering --- p.16 / Chapter 2.1.2 --- Collaborative Filtering --- p.17 / Chapter 2.2 --- Information Retrieval Models --- p.23 / Chapter 2.2.1 --- Vector Space Model --- p.24 / Chapter 2.2.2 --- Probabilistic Model and Language Model --- p.27 / Chapter 2.2.3 --- Translation Model --- p.31 / Chapter 2.3 --- Machine Learning --- p.32 / Chapter 2.3.1 --- Supervised Learning --- p.32 / Chapter 2.3.2 --- Semi-Supervised Learning --- p.34 / Chapter 2.3.3 --- Unsupervised Learning --- p.36 / Chapter 2.4 --- Rating Prediction --- p.37 / Chapter 2.5 --- User Recommendation --- p.39 / Chapter 2.6 --- Automatic Question Answering --- p.40 / Chapter 2.6.1 --- Automatic Question Answering (Q&A) from the Web --- p.40 / Chapter 2.6.2 --- Proliferation of community-based Q&A services and online forums --- p.41 / Chapter 2.6.3 --- Automatic Question Answering (Q&A) in social media --- p.42 / Chapter 3 --- Item Recommendation with Tagging Ensemble --- p.44 / Chapter 3.1 --- Problem and Motivation --- p.44 / Chapter 3.2 --- TagRec Framework --- p.45 / Chapter 3.2.1 --- Preliminaries --- p.45 / Chapter 3.2.2 --- User-Item Rating Matrix Factorization --- p.45 / Chapter 3.2.3 --- User-Tag Tagging Matrix Factorization --- p.47 / Chapter 3.2.4 --- Item-Tag Tagging Matrix Factorization --- p.49 / Chapter 3.2.5 --- A Unified Matrix Factorization for TagRec --- p.50 / Chapter 3.2.6 --- Complexity Analysis --- p.53 / Chapter 3.3 --- Experimental Analysis --- p.54 / Chapter 3.3.1 --- Description of Data Set and Metrics --- p.54 / Chapter 3.3.2 --- Performance Comparison --- p.55 / Chapter 3.3.3 --- Impact of Parameters and --- p.56 / Chapter 3.4 --- 
Summary --- p.58 / Chapter 4 --- User Recommendation via Interest Modeling --- p.60 / Chapter 4.1 --- Problem and Motivation --- p.60 / Chapter 4.2 --- UserRec Framework --- p.61 / Chapter 4.2.1 --- User Interest Modeling --- p.61 / Chapter 4.2.2 --- Interest-based User Recommendation --- p.65 / Chapter 4.3 --- Experimental Analysis --- p.67 / Chapter 4.3.1 --- Dataset Description and Analysis --- p.67 / Chapter 4.3.2 --- Experimental Results --- p.70 / Chapter 4.4 --- Summary --- p.75 / Chapter 5 --- Item Suggestion with Semantic Analysis --- p.76 / Chapter 5.1 --- Problem and Motivation --- p.76 / Chapter 5.2 --- Question Suggestion Framework --- p.77 / Chapter 5.2.1 --- Question Suggestion in Online Forums --- p.77 / Chapter 5.2.2 --- Question Suggestion in Community-based Q&A Services --- p.84 / Chapter 5.3 --- Experiments And Results --- p.88 / Chapter 5.3.1 --- Experiments in Online Forums --- p.88 / Chapter 5.3.2 --- Experiments in Community-based Q&A Services --- p.96 / Chapter 5.4 --- Summary --- p.105 / Chapter 6 --- Item Modeling via Data-Driven Approach --- p.106 / Chapter 6.1 --- Problem and Motivation --- p.106 / Chapter 6.2 --- Question Subjectivity Identification --- p.107 / Chapter 6.2.1 --- Social Signal Investigation --- p.107 / Chapter 6.2.2 --- Feature Investigation --- p.110 / Chapter 6.3 --- Experimental Evaluation --- p.112 / Chapter 6.3.1 --- Experimental Setting --- p.112 / Chapter 6.3.2 --- Effectiveness of Social Signals --- p.114 / Chapter 6.3.3 --- Effectiveness of Heuristic Features --- p.116 / Chapter 6.4 --- Summary --- p.122 / Chapter 7 --- Conclusion --- p.123 / Chapter 7.1 --- Summary --- p.123 / Chapter 7.2 --- Future Work --- p.124 / Chapter A --- List of Publications --- p.126 / Chapter A.1 --- Conference Publications --- p.126 / Chapter A.2 --- Journal Publications --- p.127 / Chapter A.3 --- Under Review --- p.128 / Bibliography --- p.129 Web 2.0 Online social networks Information retrieval
158	A commonsense aboutness theory for information retrieval modeling. / CUHK electronic theses & dissertations collection January 2000 (has links) Song Dawei. / "August 2000." / Thesis (Ph.D.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (p. 159-162). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web. / Abstracts in English and Chinese. Information retrieval
159	Similarity searching in sequence databases under time warping. January 2004 (has links) Wong, Siu Fung. / Thesis submitted in: December 2003. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 77-84). / Abstracts in English and Chinese. / Abstract --- p.ii / Acknowledgement --- p.vi / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Preliminary --- p.6 / Chapter 2.1 --- Dynamic Time Warping (DTW) --- p.6 / Chapter 2.2 --- Spatial Indexing --- p.10 / Chapter 2.3 --- Relevance Feedback --- p.11 / Chapter 3 --- Literature Review --- p.13 / Chapter 3.1 --- Searching Sequences under Euclidean Metric --- p.13 / Chapter 3.2 --- Searching Sequences under Dynamic Time Warping Metric --- p.17 / Chapter 4 --- Subsequence Matching under Time Warping --- p.21 / Chapter 4.1 --- Subsequence Matching --- p.22 / Chapter 4.1.1 --- Sequential Search --- p.22 / Chapter 4.1.2 --- Indexing Scheme --- p.23 / Chapter 4.2 --- Lower Bound Technique --- p.25 / Chapter 4.2.1 --- Properties of Lower Bound Technique --- p.26 / Chapter 4.2.2 --- Existing Lower Bound Functions --- p.27 / Chapter 4.3 --- Point-Based indexing --- p.28 / Chapter 4.3.1 --- Lower Bound for subsequences matching --- p.28 / Chapter 4.3.2 --- Algorithm --- p.35 / Chapter 4.4 --- Rectangle-Based indexing --- p.37 / Chapter 4.4.1 --- Lower Bound for subsequences matching --- p.37 / Chapter 4.4.2 --- Algorithm --- p.41 / Chapter 4.5 --- Experimental Results --- p.43 / Chapter 4.5.1 --- Candidate ratio vs Width of warping window --- p.44 / Chapter 4.5.2 --- CPU time vs Number of subsequences --- p.45 / Chapter 4.5.3 --- CPU time vs Width of warping window --- p.46 / Chapter 4.5.4 --- CPU time vs Threshold --- p.46 / Chapter 4.6 --- Summary --- p.47 / Chapter 5 --- Relevance Feedback under Time Warping --- p.49 / Chapter 5.1 --- Integrating Relevance Feedback with DTW --- p.49 / Chapter 5.2 --- Query Reformulation --- p.53 / Chapter 5.2.1 --- Constraint Updating --- 
p.53 / Chapter 5.2.2 --- Weight Updating --- p.55 / Chapter 5.2.3 --- Overall Strategy --- p.58 / Chapter 5.3 --- Experiments and Evaluation --- p.59 / Chapter 5.3.1 --- Effectiveness of the strategy --- p.61 / Chapter 5.3.2 --- Efficiency of the strategy --- p.63 / Chapter 5.3.3 --- Usability --- p.64 / Chapter 5.4 --- Summary --- p.71 / Chapter 6 --- Conclusion --- p.72 / Chapter A --- Deduction of Data Bounding Hyper-rectangle --- p.74 / Chapter B --- Proof of Theorem 2 --- p.76 / Bibliography --- p.77 / Publications --- p.84 Time-series analysis Database searching Information retrieval
160	Named entity translation matching and learning with mining from multilingual news. January 2004 (has links) Cheung Pik Shan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 79-82). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Named Entity Translation Matching --- p.2 / Chapter 1.2 --- Mining New Translations from News --- p.3 / Chapter 1.3 --- Thesis Organization --- p.4 / Chapter 2 --- Related Work --- p.5 / Chapter 3 --- Named Entity Matching Model --- p.9 / Chapter 3.1 --- Problem Nature --- p.9 / Chapter 3.2 --- Matching Model Investigation --- p.12 / Chapter 3.3 --- Tokenization --- p.15 / Chapter 3.4 --- Hybrid Semantic and Phonetic Matching Algorithm --- p.16 / Chapter 4 --- Phonetic Matching Model --- p.22 / Chapter 4.1 --- Generating Phonetic Representation for English --- p.22 / Chapter 4.1.1 --- Phoneme Generation --- p.22 / Chapter 4.1.2 --- Training the Tagging Lexicon and Transformation Rules --- p.25 / Chapter 4.2 --- Generating Phonetic Representation for Chinese --- p.29 / Chapter 4.3 --- Phonetic Matching Algorithm --- p.31 / Chapter 5 --- Learning Phonetic Similarity --- p.37 / Chapter 5.1 --- The Widrow-Hoff Algorithm --- p.39 / Chapter 5.2 --- The Exponentiated-Gradient Algorithm --- p.41 / Chapter 5.3 --- The Genetic Algorithm --- p.42 / Chapter 6 --- Experiments on Named Entity Matching Model --- p.43 / Chapter 6.1 --- Results for Learning Phonetic Similarity --- p.44 / Chapter 6.2 --- Results for Named Entity Matching --- p.46 / Chapter 7 --- Mining New Entity Translations from News --- p.48 / Chapter 7.1 --- Metadata Generation --- p.52 / Chapter 7.2 --- Discovering Comparable News Cluster --- p.54 / Chapter 7.2.1 --- News Preprocessing --- p.54 / Chapter 7.2.2 --- Gloss Translation --- p.55 / Chapter 7.2.3 --- Comparable News Cluster Discovery --- p.62 / Chapter 7.3 --- Named Entity Cognate Generation --- p.64 / Chapter 7.4 --- Entity 
Matching --- p.66 / Chapter 7.4.1 --- Matching Algorithm --- p.66 / Chapter 7.4.2 --- Matching Result Production --- p.68 / Chapter 8 --- Experiments on Mining New Translations --- p.69 / Chapter 9 --- Experiments on Context-based Gloss Translation --- p.72 / Chapter 9.1 --- Results on Chinese News Translation --- p.73 / Chapter 9.2 --- Results on Arabic News Translation --- p.75 / Chapter 10 --- Conclusions and Future Work --- p.77 / Bibliography --- p.79 / A --- p.83 / B --- p.85 / C --- p.87 / D --- p.89 / E --- p.91 / F --- p.94 / G --- p.95 Machine translating Cross-language information retrieval

Search results