Spelling suggestions: "subject:"ford mismatch"" "subject:"ford dismatch""
1 |
Investigations of Term Expansion on Text Mining TechniquesYang, Chin-Sheng 02 August 2002 (has links)
Recent advances in computer and network technologies have contributed significantly to global connectivity and stimulated the amount of online textual document to grow extremely rapidly. The rapid accumulation of textual documents on the Web or within an organization requires effective document management techniques, covering from information retrieval, information filtering and text mining. The word mismatch problem represents a challenging issue to be addressed by the document management research. Word mismatch has been extensively investigated in information retrieval (IR) research by the use of term expansion (or specifically query expansion). However, a review of text mining literature suggests that the word mismatch problem has seldom been addressed by text mining techniques. Thus, this thesis aims at investigating the use of term expansion on some text mining techniques, specifically including text categorization, document clustering and event detection. Accordingly, we developed term expansion extensions to these three text mining techniques. The empirical evaluation results showed that term expansion increased the categorization effectiveness when the correlation coefficient feature selection was employed. With respect to document clustering, techniques extended with term expansion achieved comparable clustering effectiveness to existing techniques and showed its superiority in improving clustering specificity measure. Finally, the use of term expansion for supporting event detection has degraded the detection effectiveness as compared to the traditional event detection technique.
|
2 |
Cluster-based Query Expansion TechniqueHuang, Chun-Neng 14 August 2003 (has links)
As advances in information and networking technologies, huge amount of information typically in the form of text documents are available online. To facilitate efficient and effective access to documents relevant to users¡¦ information needs, information retrieval systems have been imposed a more significant role than ever. One challenging issue in information retrieval is word mismatch that refers to the phenomenon that concepts may be described by different words in user queries and/or documents. The word mismatch problem, if not appropriately addressed, would degrade retrieval effectiveness critically of an information retrieval system.
In this thesis, we develop a cluster-based query expansion technique to solve the word mismatch problem. Using the traditional query expansion techniques (i.e., global analysis and local feedback) as performance benchmarks, the empirical results suggest that when a user query only consists of one query term, the global analysis technique is more effective. However, if a user query consists of two or more query terms, the cluster-based query expansion technique can provide a more accurate query result, especially within the first few top-ranked documents retrieved.
|
3 |
Fuzzy Cluster-Based Query ExpansionTai, Chia-Hung 29 July 2004 (has links)
Advances in information and network technologies have fostered the creation and availability of a vast amount of online information, typically in the form of text documents. Information retrieval (IR) pertains to determining the relevance between a user query and documents in the target collection, then returning those documents that are likely to satisfy the user¡¦s information needs. One challenging issue in IR is word mismatch, which occurs when concepts can be described by different words in the user queries and/or documents. Query expansion is a promising approach for dealing with word mismatch in IR.
In this thesis, we develop a fuzzy cluster-based query expansion technique to solve the word mismatch problem. Using existing expansion techniques (i.e., global analysis and non-fuzzy cluster-based query expansion) as performance benchmarks, our empirical results suggest that the fuzzy cluster-based query expansion technique can provide a more accurate query result than the benchmark techniques can.
|
Page generated in 0.0638 seconds