1 |
Automatic Text Ontological Representation and Classification via Fundamental to Specific Conceptual Elements (TOR-FUSE). Razavi, Amir Hossein. 16 July 2012.
In this dissertation, we introduce a novel text representation method intended mainly for text classification. The representation is initially built from a variety of closeness relationships between pairs of words in text passages across the entire corpus. It then serves as the basis for our multi-level lightweight ontological representation method (TOR-FUSE), in which documents are represented according to their contexts and the goal of the learning task. This contrasts with traditional representation methods, in which every document is represented solely by its constituent words, in isolation from the goal the representation is meant to serve. We believe that choosing the correct granularity of representation features is an important aspect of text classification. Interpreting data in a more general space with fewer dimensions can convey more discriminative knowledge and reduce learning perplexity. The multi-level model allows data to be interpreted in a more conceptual space, rather than one containing only the scattered words that occur in the texts. It aims to extract knowledge tailored to the classification task by automatically creating a lightweight ontological hierarchy of representations. In the last step, we train a tailored ensemble learner over a stack of representations at different conceptual granularities. The final result is a mapping and weighting of the target concept of the original learning task over that stack of representations and the granular conceptual elements of its different levels (a hierarchical mapping instead of a linear mapping over a vector). Finally, the entire algorithm is applied to a variety of general text classification tasks, and its performance is evaluated against well-known algorithms.
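As a rough illustration of what a "closeness relationship between pairs of words" could look like, the sketch below scores each word pair by summing the inverse of the token distance whenever the two words fall within a small window. The abstract does not say which closeness measures the dissertation uses, so the inverse-distance weighting, the `window` parameter, and the function name `closeness_scores` are all assumptions made for this example.

```python
from collections import defaultdict


def closeness_scores(tokens, window=3):
    """Sum 1/distance over every pair of distinct words at most `window` tokens apart.

    Hypothetical closeness measure, not taken from the dissertation.
    """
    scores = defaultdict(float)
    for i, left in enumerate(tokens):
        # Look ahead at most `window` positions from token i.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            right = tokens[j]
            if left == right:
                continue  # a word paired with itself carries no information
            pair = tuple(sorted((left, right)))  # order-independent key
            scores[pair] += 1.0 / (j - i)
    return dict(scores)


# Made-up passage, used only to exercise the function.
passage = "text classification needs good text representation".split()
scores = closeness_scores(passage, window=2)
```

Pairs that appear next to each other contribute 1.0 per occurrence and pairs two tokens apart contribute 0.5, so a table like `scores` could feed a later grouping step that merges closely related words into coarser conceptual features.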
|
3 |
運用社會網絡技術由文集中探勘觀念:以新青年為例 / Concept Discovery from Essays Based on Social Network Mining: Using New Youth as an Example. 陳柏聿 (Chen, Po Yu). Unknown date.
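In the past, scholars in the humanities and history studied and analyzed their materials by laborious manual effort. That approach is workable while the volume of material is small, but with the progress of digital archiving and the rise of big data, traditional books, ancient texts, and documents have been digitized in large numbers. Continuing to analyze them item by item would take a great deal of time and labor. Because the material is now digital, however, people in the computing field can use information technology to assist.
In the study of the history of ideas, keyword clusters are one of the main focuses, because a concept can be expressed by keywords or by sentences containing keywords. Studying keywords therefore helps humanities scholars understand the meaning behind historical documents and grasp the context of the time. The purpose of this thesis is to examine, for a collection containing many essays, how terms occur together within those essays. Using five kinds of co-occurrence relationships, it brings the idea of social networks into text analysis: each term is treated as a node and the association between two terms as an edge, forming a term network from which the concepts formed by the terms are identified. Finally, a system for discovering concepts from an essay collection is implemented. It provides three main analysis functions: multi-keyword concept query, single-keyword concept query, and latent concept mining.
This study uses the magazine New Youth (《新青年》) as its main collection for observation and experimental case analysis. The concepts in New Youth shifted from liberalism toward Marxism-Leninism, and with this system we were indeed able to trace that change and to mine the key terms under the two concepts.
With the development of digital archives, essays have been digitized. Analyzing the contents of essays by hand takes much time, so analysis by computer is beneficial. This thesis investigates an approach to discovering the concepts in essays based on social network mining techniques. Since a concept can be represented as a set of keywords, the proposed approach measures the co-occurrence relationships between pairs of keywords and represents the relationships among keywords as keyword networks. Social network mining techniques are then employed to discover the concepts of the essays. We also develop a concept discovery system that provides discovery by multiple keywords, discovery by a single keyword, and latent concept mining. New Youth is taken as an example to demonstrate the capability of the developed system.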
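The keyword-network idea described above can be sketched in a few lines: keywords are nodes, and an edge weight counts how many essays contain both keywords. The thesis mentions five co-occurrence relationships without naming them in the abstract, so this example assumes only the simplest one (co-occurrence within the same essay). The function names `build_keyword_network` and `related_keywords`, the `min_weight` threshold, and the toy essays are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_keyword_network(essays, keywords):
    """Return {keyword: {neighbor: weight}}; weight = number of essays containing both."""
    network = defaultdict(lambda: defaultdict(int))
    for essay in essays:
        words = set(essay.split())
        present = sorted(k for k in keywords if k in words)
        # Every pair of keywords found in the same essay gets one more co-occurrence.
        for a, b in combinations(present, 2):
            network[a][b] += 1
            network[b][a] += 1
    return network


def related_keywords(network, keyword, min_weight=1):
    """Single-keyword query: neighbors ordered by descending weight, then name."""
    neighbors = network.get(keyword, {})
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [k for k, w in ranked if w >= min_weight]


# Made-up miniature "collection", used only to exercise the functions.
essays = [
    "liberty and democracy shape the individual",
    "democracy needs science and liberty",
    "class struggle and revolution",
    "revolution follows class conflict",
]
keywords = {"liberty", "democracy", "science", "class", "revolution"}
network = build_keyword_network(essays, keywords)
```

Calling `related_keywords(network, "liberty")` then plays the role of a single-keyword concept query: it returns the keywords most strongly tied to the query term. A latent concept mining step would presumably look for densely connected groups in the same network, which this sketch does not attempt.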
|
4 |
Automatic Text Ontological Representation and Classification via Fundamental to Specific Conceptual Elements (TOR-FUSE). Razavi, Amir Hossein. January 2012.
|