1 |
Extracting Episodic Knowledge from Documents to Support Decision Making. Chuang, Kun-Han. 27 July 2006
Knowledge management is an important tool for business competition, and many organizations are adopting knowledge management systems. Document management is a key foundation of knowledge management. Decision documents contain a large amount of procedural knowledge that illustrates the process and considerations in a decision situation; this is called episodic knowledge. Episodic knowledge can help decision makers understand historical decision processes and considerations when making future decisions. How to discover decision episodes from existing documents is therefore a major research issue in knowledge management.
This research proposes a method for episode mining that integrates automatic document summarization techniques, knowledge ontology, and index structures to build the relations and processes of events, and uses Gantt charts and flow charts to portray event processes. We build a prototype system and use a news event as an example to illustrate the feasibility of the proposed approach and demonstrate the results.
|
2 |
A Semantic Graph Model for Text Representation and Matching in Document Mining. Shaban, Khaled. January 2006
The explosive growth in the number of documents produced daily necessitates the development of effective alternatives to explore, analyze, and discover knowledge from documents. Document mining research has emerged to devise automated means of discovering and analyzing useful information in documents. This work has been mainly concerned with constructing text representation models, developing distance measures to estimate similarities between documents, and using both in mining processes such as document clustering, document classification, information retrieval, information filtering, and information extraction.
Conventional text representation methodologies consider documents as bags of words and ignore the meanings and ideas their authors want to convey. This deficiency causes similarity measures to miss the contextual similarity of text passages when the words they contain differ, or, conversely, to perceive contextually dissimilar passages as similar because the words they contain resemble each other.
This thesis presents a new paradigm for mining documents by exploiting the semantic information of their texts. A formal semantic representation of linguistic inputs is introduced and used to build a semantic representation scheme for documents. The representation scheme is constructed by accumulating the outputs of syntactic and semantic analysis. A new distance measure is developed to determine the similarities between the contents of documents. The measure is based on inexact matching of attributed trees. It involves the computation of all distinct similarity common sub-trees, and can be computed efficiently. It is believed that the proposed representation scheme, along with the proposed similarity measure, will enable more effective document mining processes.
The proposed techniques for mining documents were implemented as vital components of a mining system. A case study of semantic document clustering is presented to demonstrate the working and the efficacy of the framework. Experimental work is reported, and its results are presented and analyzed.
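The bag-of-words deficiency described in this abstract can be illustrated with a minimal sketch. It is not the thesis's semantic graph model or its tree-matching measure; it only shows, under assumed example sentences and a plain cosine measure over word counts, how word-level similarity can miss a paraphrase and over-rate a reordering. The function names and passages are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text and count word occurrences, ignoring word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical passages, not taken from the thesis.
paraphrase_a = "the physician treated the patient"
paraphrase_b = "a doctor cured someone ill"
reordered_a = "the dog bit the man"
reordered_b = "the man bit the dog"

# Similar meaning, no shared words: the score is zero.
sim_paraphrase = cosine_similarity(bag_of_words(paraphrase_a), bag_of_words(paraphrase_b))

# Different meaning, identical words: the score is (numerically) one.
sim_reordered = cosine_similarity(bag_of_words(reordered_a), bag_of_words(reordered_b))

print(sim_paraphrase, sim_reordered)
```

Both failure modes come from discarding structure and meaning, which is the gap a semantic representation is intended to close.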
|
4 |
Constructing Event Ontology and Episodic Knowledge from Document. Yang, Yi-cheng. 20 July 2007
Knowledge is an increasingly important asset for organizational competition, and knowledge management has become one of the most important issues for an organization. Building a knowledge ontology is a good way to increase knowledge reusability. An ontology explicitly defines concepts and their relationships, which facilitates user understanding and further analysis.
Based on previous research (Wu, 2006; Chuang, 2006), this research proposes a refined method for constructing an event ontology. The method comprises three modules: text pre-processing, event ontology construction, and event ontology presentation. The text pre-processing module includes a POS tagger, a word filter, and term analysis. In the event ontology construction module, the concept of the sub-event is used to build a three-level architecture consisting of sub-events, events, and topics. This module also provides a user-friendly environment for editing the concepts and attributes of an event, which may cover "who," "what," "where," and "what object." In the event ontology presentation module, an event episode may be illustrated by event frames, flow charts, and Gantt charts.
To verify the feasibility of the proposed method, a prototype system was built. The Alexander Poison Event was used as an example to demonstrate the value of the prototype system.
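The three-level architecture described in this abstract (topics containing events, events containing sub-events, each with "who," "what," "where," and "what object" attributes) might be sketched as a small data structure. The class names, field names, and placeholder values below are assumptions made for illustration; the abstract does not specify the prototype's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubEvent:
    """Lowest level: one occurrence described by the four attributes."""
    who: str
    what: str
    where: str
    what_object: str


@dataclass
class Event:
    """Middle level: a named event made up of sub-events."""
    name: str
    sub_events: List[SubEvent] = field(default_factory=list)


@dataclass
class Topic:
    """Top level: a topic grouping related events."""
    name: str
    events: List[Event] = field(default_factory=list)

    def participants(self) -> List[str]:
        """Return each distinct 'who' value in order of first appearance."""
        seen: List[str] = []
        for event in self.events:
            for sub_event in event.sub_events:
                if sub_event.who not in seen:
                    seen.append(sub_event.who)
        return seen


# Placeholder data only; not drawn from the thesis's example.
topic = Topic(
    name="example topic",
    events=[
        Event("event 1", [SubEvent("person A", "meets", "city X", "person B")]),
        Event("event 2", [SubEvent("person B", "reports", "city Y", "incident"),
                          SubEvent("person A", "responds", "city X", "report")]),
    ],
)

print(topic.participants())
```

Keeping the attributes on the sub-event level means an event frame, flow chart, or Gantt chart can each be derived from the same records.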
|
5 |
由職官年表中利用循序共現樣式探勘人脈網絡 / Social Network Analysis from Official Chronology Using Sequential Co-occurrence Pattern Mining. 宋邡熏 (Song, Fang Shiun). Unknown Date
In a political power structure, chief officials and cliques play important roles in the social network of political figures and strongly influence politics. This thesis proposes an approach to mining social networks from official chronologies in order to discover chief officials and cliques. We propose a data mining algorithm that discovers sequential co-occurrence patterns in official chronologies, capturing co-occurring promotions and demotions of government officials. A social network of officials is then constructed from the discovered patterns. Chief officials are identified by network centrality analysis, and cliques by community detection on the constructed network. Experiments use the official chronology of the Kangxi period of the Qing dynasty, and visual analysis shows that the proposed methods are helpful to historians in their research.
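The general idea of building a network from co-occurrences and ranking nodes by centrality can be sketched as follows. This is a deliberately simplified stand-in, not the thesis's algorithm: it counts pairs of officials whose posts changed in the same year and ignores the sequential ordering that the thesis's patterns capture. The records, the `min_support` threshold, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (year, official), meaning the official's post changed that year.
records = [
    (1, "A"), (1, "B"),
    (2, "A"), (2, "B"), (2, "C"),
    (3, "A"), (3, "C"),
    (4, "D"),
]


def co_occurrence_counts(rows):
    """Count, for each pair of officials, the years in which both appear."""
    by_year = defaultdict(set)
    for year, official in rows:
        by_year[year].add(official)
    counts = defaultdict(int)
    for officials in by_year.values():
        for pair in combinations(sorted(officials), 2):
            counts[pair] += 1
    return dict(counts)


def build_network(counts, min_support):
    """Keep an undirected edge for every pair seen at least min_support times."""
    graph = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_support:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph, nodes):
    """Degree divided by the maximum possible degree (number of nodes minus one)."""
    denom = max(len(nodes) - 1, 1)
    return {node: len(graph.get(node, ())) / denom for node in nodes}


counts = co_occurrence_counts(records)
network = build_network(counts, min_support=2)
nodes = sorted({official for _, official in records})
centrality = degree_centrality(network, nodes)

# The highest-scoring node is the candidate "chief official" in this toy data.
most_central = max(centrality, key=centrality.get)
print(most_central, centrality)
```

Degree centrality is only one of several centrality measures; the abstract does not say which one the thesis uses, and clique discovery would need a separate community detection step.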
|