Global ETD Search

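1	知識倉儲的知識結構之研究-以某行政部門為例 [A Study of the Knowledge Structure of a Knowledge Repository: The Case of an Administrative Department] 盧美惠 Unknown Date With the rapid development of information technology and the spread of information media, enterprises and organizations have accumulated large volumes of internal information. A knowledge repository is a store for documents of all types, used mainly to manage and organize information; databases, reports, documents, and forms can all be kept in it in digital form. Its function is to manage the knowledge content of an organization's documents and, in turn, to help the organization provide web services, including retrieval services that offer catalogs and indexes to help users find information and addressing services that identify and confirm where information is located. As knowledge is continually produced by organizational operations and its volume keeps growing, managing it, including its representation, structure, storage, and access, becomes increasingly important. This thesis surveys the background knowledge relevant to the "knowledge structure" of a knowledge repository and builds a knowledge map from the cross-references among documents and from a thesaurus. Domain knowledge such as terminology and relations is stored as structured knowledge and used to attach semantic relations to document content, so that during information retrieval the domain knowledge structure helps users find useful or relevant content accurately. Applying the archival theory of fonds and control levels, the study proposes a file-system directory structure that accommodates changes in organizational structure, divided into fonds, series, file, and document levels. The repository system maps each document's virtual location (DL) to its physical location (URL) in order to manage documents dynamically as the organization changes. The study further classifies groups of related documents within a file and uses the resulting file-type structures to describe both single documents and compound documents, such as meeting minutes and laws and regulations. It uses Dublin Core to describe the documents and build the metadata structure, and then processes the semantic relations among thesaurus terms to provide concept-based semantic information retrieval. Knowledge Repository; Knowledge Structure; Thesaurus; Ontology; Dublin Core; Metadata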
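The mapping from a document's virtual location (DL) to its physical location (URL) described in entry 1 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the class name, the DL format (fonds/series/file/document), and the example URLs are all assumptions made for the sketch.

```python
# Hypothetical sketch: class name, DL format and URLs are illustrative only.

class LocationResolver:
    """Maps stable document locations (DL) to their current physical URLs."""

    def __init__(self):
        self._table = {}  # DL -> URL

    def register(self, dl, url):
        self._table[dl] = url

    def relocate(self, old_prefix, new_prefix):
        """Rewrite physical URLs after a reorganization; DLs stay unchanged."""
        for dl, url in list(self._table.items()):
            if url.startswith(old_prefix):
                self._table[dl] = new_prefix + url[len(old_prefix):]

    def resolve(self, dl):
        return self._table[dl]


resolver = LocationResolver()
# The DL follows the fonds / series / file / document levels named in the abstract.
dl = "fonds-A/series-01/file-003/doc-0007"
resolver.register(dl, "http://files.example.org/dept-x/minutes/0007.pdf")

# A department is renamed: only the physical side of the mapping changes.
resolver.relocate("http://files.example.org/dept-x/",
                  "http://files.example.org/dept-y/")
print(resolver.resolve(dl))
```

Because users and cross-references hold only the DL, a reorganization touches the mapping table and nothing else.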
2	以進階的文句推薦方式使用語料庫做為英文寫作之輔助 / A Sentence Recommendation Approach to Using Corpora for English Writing Assistance 洪培鈞, Hung, Pei Chun Unknown Date For most people, writing is a process of in-depth expression that demands detailed description and precise, rigorous semantics. For learners of English as a second or foreign language (ESL/EFL), writing in English is particularly difficult: insufficient knowledge of vocabulary, collocations, and sentence structures, or interference from the mother tongue, often leads to errors in word choice, syntax, and even meaning. In recent years the development of corpora has provided rich language resources, which not only supply statistical information on language use but can also serve learners as a reference for how specific words are used. Current corpus tools rely mainly on concordance, which lists example sentences together with the context surrounding a given word; this gives ESL/EFL learners only limited help during writing. Moreover, without a well-designed input and matching mechanism for different query conditions, the quantity and quality of the retrieved examples often fail to meet learners' needs. Corpus tools also do not evaluate or rank the retrieved results, so learners must sift through a large amount of information to find what is useful. In short, current corpus technology cannot meet the reference needs of ESL/EFL writers at different proficiency levels. This study proposes a sentence recommendation approach that, when ESL/EFL writers lack knowledge or are unsure during writing, finds usable reference sentences in a corpus and so offers targeted help. The approach comprises three modules. The first is a flexible expression element module, which lets writers state their language information needs using a defined set of expression elements. The second is a retrieval module, which searches the corpus for example sentences that match the writer's intended meaning as expressed by those elements. The third is a ranking module, which scores the retrieved sentences by how well they match that meaning and orders them, so that consulting the examples is more efficient. A prototype system was implemented on the basis of this approach, using the British National Corpus and the Science America corpus (科學人雜誌語料庫) as sources of example sentences. The prototype was evaluated in two ways: an objective evaluation by the authors and a questionnaire survey of learners. Both indicate that the prototype gives ESL/EFL writers a measurable degree of help during writing, which supports the claim that the sentence recommendation approach achieves its goal of writing assistance. Writing Assistance; Corpora; Information Retrieval; Concordance; Recommender System
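The retrieval and ranking modules of entry 2 can be sketched as a small pipeline. The abstract does not specify the matching rule or the scoring function, so both are assumptions here: an expression element is taken to be a lowercase word, retrieval keeps sentences that contain every element, and ranking prefers sentences in which the elements occur close together.

```python
import re

def tokenize(sentence):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z']+", sentence.lower())

def retrieve(corpus, elements):
    """Retrieval step: keep sentences that contain every requested element."""
    wanted = set(elements)
    return [s for s in corpus if wanted <= set(tokenize(s))]

def span(sentence, elements):
    """Distance between the first occurrences of the outermost elements."""
    tokens = tokenize(sentence)
    positions = [tokens.index(e) for e in elements]
    return max(positions) - min(positions)

def recommend(corpus, elements):
    """Ranking step (assumed score): tighter spans first, then shorter sentences."""
    hits = retrieve(corpus, elements)
    return sorted(hits, key=lambda s: (span(s, elements), len(tokenize(s))))

# Toy sentences written for this sketch, not drawn from any corpus.
corpus = [
    "She decided to make a quick decision about the offer.",
    "They make toys, and the decision to expand came later.",
    "He took a walk.",
]
ranked = recommend(corpus, ["make", "decision"])
for sentence in ranked:
    print(sentence)
```

A plain concordance would stop after `retrieve`; the ordering step is what spares the learner from scanning every hit.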
3	引用文獻索引資料庫之比較研究 / A Comparison Study of Citation Indexing Database 陳薇竹, Chen, Wei-Chu Unknown Date After the Institute for Scientific Information (ISI) built the Science Citation Index (SCI) and the Social Science Citation Index (SSCI), citation indexing databases gradually gained attention in academia and prompted traditional commercial companies to develop their own; Science Citation Index-Expanded (SCIE) and the later Scopus are the best known. Because these companies charged scholars and libraries very high fees, a backlash followed, and open access (OA) citation indexing databases, which anyone can use free of charge, were developed by research projects and a few companies. The most prominent are Google Scholar, produced by Google, and CiteSeer, built by NEC as part of a project. / This study uses hands-on testing to compare the search interfaces and search options of four citation indexing databases, two traditional commercial ones and two open access ones, focusing on computing machinery and electrical engineering. Taking the 50 winners of the Turing Award, given by the Association for Computing Machinery (ACM), as the sample, it ran author searches in SCIE, Scopus, CiteSeer, and Google Scholar, filtered the results one by one, and analyzed the correct results to compare duplication and completeness within each database and, pairwise across the four, their overlap, uniqueness, and completeness. It also summarizes the causes of the findings. / The results show that SCIE and Scopus are easier to search, place less burden on the user, and offer more varied and detailed search options, with Scopus's author search the most convenient; Google Scholar and CiteSeer both rely mainly on a single simple search box, which makes it harder to retrieve exactly what is needed. In terms of completeness, Google Scholar indexes the most varied material, while SCIE covers scholarly resources most completely. The cross-comparison shows that Google Scholar has the most unique records and that CiteSeer is the least complete. Apart from SCIE, the other three databases all index a large amount of web resources. ACM publications play an important role in all four databases. / Based on these results, the study recommends that traditional commercial citation indexing databases index more web resources and adjust their pricing; that open access citation indexing databases correct their bibliographic description formats; and that libraries do more to promote citation indexing databases and teach users to use the open access ones correctly. / The literature indexed by citation indexing databases already has a large influence on academic evaluation. Libraries should use these databases themselves and guide users toward using them correctly and toward a sound approach to web resources, so that users do not lose their way in the vast resources of the web. Citation Indexing Database; Overlap; Scholarly Publishing; Open Access
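The pairwise overlap, uniqueness, and completeness comparison in entry 3 reduces to set operations once each database's correct search results are collected. The sketch below shows one way to compute them; the record identifiers are invented for illustration and are not data from the thesis, and the three definitions are assumptions, since the abstract does not give formulas.

```python
from itertools import combinations

# Made-up record identifiers for illustration only; not data from the thesis.
holdings = {
    "SCIE": {"p1", "p2", "p3"},
    "Scopus": {"p2", "p3", "p4"},
    "Google Scholar": {"p1", "p2", "p4", "p5", "p6"},
    "CiteSeer": {"p2", "p6"},
}

def overlap(a, b):
    """Records indexed by both databases."""
    return holdings[a] & holdings[b]

def unique(name):
    """Records indexed by `name` and by no other database."""
    others = set().union(*(recs for db, recs in holdings.items() if db != name))
    return holdings[name] - others

def completeness(name):
    """Share of all records found anywhere that `name` covers (assumed definition)."""
    everything = set().union(*holdings.values())
    return len(holdings[name]) / len(everything)

for a, b in combinations(holdings, 2):
    print(f"{a} / {b}: {len(overlap(a, b))} shared")
for name in holdings:
    print(f"{name}: {len(unique(name))} unique, "
          f"completeness {completeness(name):.2f}")
```

In practice the hard part is the step before this one: deciding when two differently formatted citations refer to the same publication, which is why the study filtered results by hand.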
4	在高度分散式環境下對高維度資料建立索引 / Indexing high-dimensional data in highly distributed environments 黃齡葦, Huang, Ling Wei Unknown Date As data volumes grow rapidly, large-scale, scalable, highly distributed database services are becoming a trend. When data are this widely distributed, a good index plays an important role in keeping queries efficient. In addition, a growing number of database applications, such as time-series, biological, image, music, and video databases, handle high-dimensional data. These applications frequently issue similarity queries, and the k-nearest-neighbor (KNN) query, which finds the objects in the database most similar to a given query object, is among the most frequent; over high-dimensional, distributed data it is costly in time and computation. As in a conventional DBMS, indexes can improve query performance, but they cannot be deployed directly in highly distributed systems because the environment is more complex. To support efficient KNN queries on high-dimensional data in a highly distributed environment, we propose RP-CAN (Reference Point-Content Addressable Network), an indexing strategy based on reference points [2,13]. RP-CAN combines the routing protocol of CAN [14] with reference-point indexing. We implemented the RP-CAN index and compared it experimentally with RT-CAN [1]; in these experiments RP-CAN processed KNN queries on high-dimensional data more efficiently than RT-CAN. Index; KNN Query; P2P; Highly Distributed Environments; Reference Point
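The reference-point idea behind entry 4 can be illustrated on a single machine. The sketch below is not RP-CAN: it omits the CAN routing layer entirely and uses one reference point, both simplifying assumptions. It shows only the pruning rule that reference-point indexes generally rely on, namely that by the triangle inequality |d(q, r) - d(p, r)| is a lower bound on d(q, p), so points whose bound already exceeds the current k-th best distance need not be examined.

```python
import heapq
import math

def build_index(points, ref):
    """Precompute each point's distance to the reference point."""
    return [(math.dist(p, ref), p) for p in points]

def knn(index, ref, query, k):
    """Return the k nearest points and how many distances were computed."""
    dq = math.dist(query, ref)
    # Visit candidates in increasing order of the lower bound |d(q,r) - d(p,r)|.
    candidates = sorted(index, key=lambda entry: abs(dq - entry[0]))
    best = []      # max-heap of (-distance, point), at most k entries
    checked = 0
    for d_ref, p in candidates:
        bound = abs(dq - d_ref)
        if len(best) == k and bound >= -best[0][0]:
            break  # every remaining candidate is at least this far away
        checked += 1
        d = math.dist(query, p)
        if len(best) < k:
            heapq.heappush(best, (-d, p))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), checked

# Toy 4-dimensional points chosen for this sketch.
points = [(0, 0, 0, 0), (1, 1, 1, 1), (5, 5, 5, 5), (9, 9, 9, 9), (2, 2, 2, 2)]
ref = (0, 0, 0, 0)
query = (1.2, 1.2, 1.2, 1.2)
index = build_index(points, ref)
neighbors, checked = knn(index, ref, query, k=2)
print(neighbors, "distance computations:", checked)
```

In a distributed setting the precomputed distance to the reference point is a one-dimensional key, which is what makes it possible to map points onto an overlay's key space; how RP-CAN does this is not described in the abstract.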
5	子樹查詢的索引結構設計 / PCS-trie: An Index Structure for Sub-Tree Query 張詩宜, Shih-i Chang Unknown Date With the spread of computers and the Internet, more and more data in various domains are being digitized, so that computers can help store and manage them and the Internet can be used to disseminate them. Much of this data is represented and stored as trees, so querying such large collections has become an important problem. This thesis addresses the indexing of tree-structured data that are rooted, labeled, and either ordered or unordered. We propose PCS-trie (PREOD code Search trie), a new index structure for main-memory databases, together with algorithms for insertion, deletion, and search, in order to speed up sub-tree queries against a tree database. The approach encodes each tree in the database as a PREOD code string that fully preserves its structure, and then indexes these codes with the PCS-trie. PCS-trie supports dynamic insertion and deletion, for which we provide efficient algorithms. We also present several search algorithms, so that PCS-trie can handle three types of query: exact sub-tree matching, queries whose query tree contains don't-care parts, and fault-tolerant queries. Finally, we ran a series of experiments on synthetic data to evaluate the time and space efficiency of PCS-trie and its algorithms; the results show good performance for the proposed approach.
Index Structure; Tree; Sub-tree Query; Information Retrieval; XML
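The general pattern in entry 5, encoding trees as strings and indexing the strings in a trie, can be sketched as follows. The abstract does not define the PREOD code, so the encoding here is a stand-in assumption: a preorder listing of labels with an explicit end marker per node, which is enough to represent an ordered labeled tree unambiguously. The sketch covers only exact matching of complete subtrees (a node with all of its descendants); don't-care and fault-tolerant queries are not attempted.

```python
def encode(tree):
    """Preorder encoding with an end marker per node (stand-in for PREOD code)."""
    label, children = tree
    code = [label]
    for child in children:
        code.extend(encode(child))
    code.append(")")
    return code

def subtrees(tree):
    """Yield the tree rooted at every node."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

class Trie:
    """Symbol trie; assumes no node label equals the reserved key '$ids'."""

    def __init__(self):
        self.root = {}

    def insert(self, code, tree_id):
        node = self.root
        for sym in code:
            node = node.setdefault(sym, {})
        node.setdefault("$ids", set()).add(tree_id)

    def lookup(self, code):
        node = self.root
        for sym in code:
            if sym not in node:
                return set()
            node = node[sym]
        return node.get("$ids", set())

# Trees are (label, [children]) pairs; these two are toy examples.
t1 = ("a", [("b", []), ("c", [("d", [])])])
t2 = ("x", [("c", [("d", [])])])
database = {1: t1, 2: t2}

trie = Trie()
for tree_id, tree in database.items():
    for sub in subtrees(tree):
        trie.insert(encode(sub), tree_id)

print(trie.lookup(encode(("c", [("d", [])]))))  # ids of trees containing c(d)
```

Indexing every subtree trades memory for lookup speed: a query costs time proportional to the length of its code, independent of the number of trees stored.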
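6	透過專利、學術論文分析技術發展趨勢－以蝕刻技術為例 / Technology Trends Analysis via Patent and Scientific Publication - A Case Study of Etching 徐竣祈 Unknown Date Competition is ubiquitous in modern society. Only through industry and corporate competitive analysis, market analysis, and technological forecasting can a nation or a company know itself and its rivals and make sound decisions. In technology industries, a company that fails to keep track of technology trends, invest early in R&D, or adjust its business strategy will soon find its market taken by competitors. Fortunately, no invention jumps straight from the inventor's mind to widespread use; it always passes through several successive stages, each of which makes it more practical and more useful. A company that understands the course of a technology's development and invests in R&D early can therefore maintain its competitive advantage. Patent information can be used to assess and forecast technology development, plan R&D projects, avoid wasting R&D resources through inadvertent patent infringement, and track corporate moves and market demand. Many companies and government agencies have recognized the importance of patent analysis and devote substantial manpower and resources to it. However, there is a lag of at least 18 months between a patent's filing date and its publication date, so a company that relies too heavily on patent analysis may end up directing its R&D resources into a fiercely competitive technological "red ocean". To keep pace with emerging technologies, trend analysis of basic research is therefore at least as important as patent trend analysis. As for methods, most existing bibliographic records concern science and technology, so bibliometrics has naturally become the mainstream tool for studying the overall development of science and technology. Beyond traditional bibliometric analysis, automated methods for mining implicit and useful knowledge from large document collections are a current research topic, and mining techniques such as association analysis, clustering, and prediction are becoming indispensable tools for technological forecasting. Many earlier studies used bibliometrics to analyze scholarly papers or patent information, and in recent years text mining has been applied to the same material, but the results of such analyses are fragmentary and incomplete. This study proposes an integrated approach that combines bibliometrics and text mining and applies both to two kinds of literature, scholarly papers (Science Citation Index Expanded) and patent information (Derwent World Patents Index), comparing the results to understand technology trends. Through a case study it also examines how the elements of the proposed methodology relate to one another. As for the choice of case, nanotechnology is currently the most prominent direction of development in the technology industries, and its most representative industries are semiconductors and micro-electro-mechanical systems (MEMS). The quality of etching process and equipment technology directly affects wafer yield and is one of the key technologies for the future of nanotechnology, so trends in etching technology merit close study. The conclusions are as follows: 1. Technology trend analysis should proceed from the distant view (bibliometric analysis) to the close view (text mining) and should give equal weight to theory (paper data) and application (patent data), so that the results complement one another. 2. Scientific development and market demand are leading indicators of the patent technology life cycle. 3. Scientific development increases the commercial applications of a technology, while market demand strengthens the diffusion of innovation. 4. Basic research on etching is currently in the maturity stage, while technology development is in the growth stage. Given the demand for lighter, thinner, smaller, and higher-performance electronic products, etching technology is expected to continue to be applied commercially. 5. US firms are gradually being displaced by Asian companies as leaders in etching technology, with South Korean semiconductor makers the most active recently. Taiwan's TSMC and UMC have built a solid base of technology development, but Taiwan still needs to strengthen basic research and industry-academia cooperation. Bibliometrics; Text Mining; Patent; Science Citation Index; Technological Forecast; Etching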
7	人文社會科學引文索引資料庫之系統結構與欄位設計研究 / The structure and field description of humanity and social science citation database research 林佳怡, Lin, Chia Yi Unknown Date This study examines seven citation index databases, WoS, Scopus, CSSCI, CCD, TSSCI, THCI, and ACI, observing and tabulating their source literature, search functions, index fields, output fields, citation analysis functions, search-result analysis, and personalization features, in order to understand the current state and shortcomings of existing citation index databases. It supplements this with interviews with database users, librarians, and database construction experts to learn about the obstacles to building a humanities and social sciences citation index database, the problems of subsequent maintenance and management, the problems of promotion, and suggested ways of aligning with international practice, and on that basis offers recommendations for building such a database. The findings are: (1) Comprehensive source literature is preferable, because it yields a more complete citation network. (2) Search functions, index fields, output fields, citation analysis, search-result analysis, and personalization should be modeled on WoS, Scopus, and CCD, the three most fully developed databases, to give users more comprehensive system functions. (3) Obstacles to construction include differences in reference citation formats, delayed journal issues, journal selection, the manpower and resources needed for data entry, and name authority files. (4) Maintenance and management problems include manpower and funding, promotion of the database, its operating model, its data sources, and the entry of citation data. (5) To promote the database, it must be complete in both function and content, so that users experience its searches as structured by a citation network. (6) From the outset, construction should take into account what foreign databases emphasize when handling citation data, such as index and output fields, so that later linking or merging with foreign databases becomes more feasible. The results can serve journal publishers as a reference for improving their own quality and provide recommendations on the system functions of a humanities and social sciences citation index database, in the hope that Taiwan will one day have one comparable to the large foreign citation index databases. / Citation index databases have been developed and have influenced academic research for several decades. The Web of Science (WoS) from Thomson Reuters ISI is one of the best-known citation databases in the world, and Elsevier's Scopus is also well known. Mainland China has developed the Chinese Social Sciences Citation Index (CSSCI) and the Chinese Citation Database (CCD) over the past twenty years. Under the sponsorship of the National Science Council in Taiwan, the Taiwan Social Science Research Center established the Taiwan Social Sciences Citation Index (TSSCI) in 1996, and the Center for Humanities Research built the Taiwan Humanities Citation Index (THCI) in 1999. In addition, Airiti Incorporation has developed the Academic Citation Index (ACI). However, TSSCI, THCI, and ACI show many design flaws and usage limitations: some search functions differ between them or are unavailable, so users cannot easily obtain information through a common search interface. It is therefore important to build an integrated citation index database for humanities and social science researchers in Taiwan. Working with an information service company, this study proposes a design plan for a Taiwan humanities and social science citation database. It investigates the following issues: 1. Collecting source literature and citation literature from humanities and social science journals published by Taiwan academic institutions. 2. Designing the structure of the citation index system and the types and contents of the database. 3. Setting standards for the description of database fields. 4. Developing basic, advanced, and other information retrieval functions. 5. Seeking integration with WoS in order to share information resources and increase global visibility. 6. Building quantitative indicators for evaluating humanities and social science research. Citation Index Databases; Citation Analysis; WoS; Scopus; CSSCI; CCD; TSSCI; THCI; ACI
