偵測新興議題對於研究者而言是一個相當重要的問題,研究者如何在有限的時間和資源下探討同一領域內的新興議題將比解決已經成熟的議題帶來較大的貢獻和影響力。本研究將致力於協助研究者偵測新興且具有未來潛力的研究議題,並且從學術論文中探究對於研究者在做研究中有幫助的學術智慧。在搜尋可能具有研究潛力的議題時,我們假設具有研究潛力的議題將會由同一領域中較具有影響力的作者和刊物發表出,因此本研究使用貝式估計的方法去推估同一領域中相關的研究者和學術刊物對於該領域的影響力,進而藉由這些資訊可以找出未來具有潛力的新興候選議題。此外就我們所知的議題偵測文獻中對於認定一個議題是否已經趨於成熟或者是否新穎且具有研究的潛力仍然缺乏有效及普遍使用的衡量工具,因此本研究試圖去發展有效的衡量工具以評估議題就本身的發展生命週期是否仍然具有繼續投入的學術價值。
本研究從許多重要的資料庫中挑選了和資料探勘和資訊檢索相關的論文並且驗證這些在會議論文中所涵蓋的議題將會領導後續幾年期刊論文相似的議題。此外本研究也使用了一些已經存在的演算法並且結合這些演算法發展一個檢測的流程幫助研究者去偵測學術論文中的領導趨勢並發掘學術智慧。本研究使用貝式估計的方法試圖從已經發表的資訊和被引用的資訊來建構估計作者和刊物的影響力的事前機率與概似函數,並且計算出同一領域重要的作者和刊物的影響力,當這些作者和刊物的論文發表時將會相對的具有被觀察的價值,進而檢定這些新興候選議題是否會成為新興議題。而找出的重要研究議題雖然已經縮小探索的範圍,但是仍然有可能是發展成熟的議題使得具有影響力的作者和刊物都必須討論,因此需要評估議題未來潛力的指標或工具。然而目前文獻中對於評估議題成熟的方法僅著重在議題的出現頻率而忽視了議題的新穎度也是重要的指標,另一方面也有只為了找出新議題並沒有顧及這個議題是否具有未來的潛力。更重要的是單一的使用出現頻率的曲線只能在議題已經成熟之後才能確定這是一個重要的議題,使得這種方法成為落後的指標。
本研究試圖提出解決這些困境的指標進而發展成衡量新興議題潛力的方法。這些指標包含了新穎度指標、發表量指標和偵測點指標,藉由這些指標和曲線可以在新興議題的偵測中提供更多前導性的資訊幫助研究者去建構各自領域中新興議題的偵測標準。偵測點所代表的意義並非這個議題開始新興的正確日期,它代表了這個議題在自己發展的生命週期上最具有研究的潛力和價值的時間點,因此偵測點會根據後來的蓬勃發展而在時間上產生遞延的結果,這表示我們的指標可以偵測出議題生命力的延續。相對於傳統的次數分配曲線可以看出議題的崛起和衰退,本研究的發表量指標更能以生命週期的概念去看出議題在各個時間點的發展潛力。本研究希望從這些過程中所發現的學術智慧可以幫助研究者建構各自領域的議題偵測標準,節省大量人力與時間於探究新興議題。本研究所提出的新方法不僅可以解決影響因子這個指標的缺點,此外還可以使用作者和刊物的影響力去針對一個尚未累積任何索引次數的論文進行潛力偵測,解決Google 學術搜尋目前總是在論文已經被很多檢索之後才能確定論文重要性的缺點,學者總是希望能夠領先發現重要的議題或論文。然而,我們以議題為導向的檢索方法相信可以更確實的滿足研究者在搜尋議題或論文上的需求。 / This research presents endeavors that seek to identify the emerging topics for researchers and pinpoint research intelligence via academic papers. It is intended to reveal the connection between topics investigated by conference papers and journal papers which can help the research decrease the plenty of time and effort to detect all the academic papers. In order to detect the emerging research topics the study uses the Bayesian estimation approach to estimate the impact of the authors and publications may have on a topic and to discover candidate emerging topics by the combination of the impact authors and publications. Finally the research also develops the measurement tools which could assess the research potential of these topics to find the emerging topics.
This research selected huge of papers in data mining and information retrieval from well-known databases and showed that the topics covered by conference papers in a year often leads to similar topics covered by journal papers in the subsequent year and vice versa. This study also uses some existing algorithms and combination of these algorithms to propose a new detective procedure for the researchers to detect the new trend and get the academic intelligence from conferences and journals. The research uses the Bayesian estimation approach and citation analysis methods to construct the prior distribution and likelihood function of the authors and publications in a topic. Because the topics published by these authors and publications will get more attention and valuable than others. Researchers can assess the potential of these candidate emerging topics. Although the topics we recommend decrease the range of the searching space, these topics may so popular that even all of the impact authors and publications discuss it. The measurement tools or indices are need. But the current methods only focus on the frequency of subjects, and ignore the novelty of subjects which is critical and beyond the frequency study or only focus one of them and without considering the potential of the topics. Some of them only use the curve of published frequency will make the index as a backward one. This research tackles the inadequacy to propose a set of new indices of novelty for emerging topic detection. They are the novelty index (NI) and the published volume index (PVI). These indices are then utilized to determine the detection point (DP) of emerging topics. The detection point (DP) is not the real time which the topic starts to be emerging, but it represents the topic have the highest potential no matter in novelty or hotness for research in its life cycle. Different from the absolute frequent method which can really find the exact emerging period of the topic, the PVI uses the accumulative relative frequency and tries to detect the research potential timing of its life cycle. Following the detection points, the intersection decides the worthiness of a new topic. Readers following the algorithms presented this thesis will be able to decide the novelty and life span of an emerging topic in their field. The novel methods we proposed can improve the limitations of impact factor proposed by ISI. Besides, it uses the impact power of the authors and the publication in a topic to measure the impact power of a paper before it really has been an impact paper can solve the limitations of Google scholar’s approach. We suggest that the topic oriented thinking of our methods can really help the researchers to solve their problems of searching the valuable topics.
Identifer | oai:union.ndltd.org:CHENGCHI/G0094356509 |
Creators | 杜逸寧, Tu, Yi-Ning |
Publisher | 國立政治大學 |
Source Sets | National Chengchi University Libraries |
Language | 中文 |
Detected Language | English |
Type | text |
Rights | Copyright © nccu library on behalf of the copyright holders |
Page generated in 0.0027 seconds