Global ETD Search

1	針對複合式競賽挑選最佳球員組合的方法 / Selecting the best group of players for a composite competition
鄧雅文, Teng, Ya Wen. Unknown Date.
In a large database, a top-k query is an important mechanism for retrieving the most valuable information for users. It scores data objects with a ranking function and reports the k objects with the highest scores. However, when an object has multiple scores, ranking objects without information loss becomes challenging. In this paper, we model an object with multiple scores as an uncertain data object whose uncertainty is a probability distribution over its scores, and we consider a novel problem named the Best-kGROUP query. Imagine a composite competition consisting of several games, each of which requires a different number of players, with k being the largest number required; we want to select the best group of k players from all the players for the competition. A group x is considered better than another group y if x has a higher aggregated probability than y of taking the top places in more games, and the best group is one that no other group is better than. To speed up the selection, groups that are definitely worse than some other group are discarded first; we identify them using a dynamic-programming-based approach and a filtering algorithm. The remaining groups, for which no other group has a higher aggregated probability of taking the top places in all games, are called skyline groups. From these skyline groups, the best group for the composite competition can be selected easily by comparison. The experiments show that our approach outperforms the other approaches in selecting the best group to defeat the other groups in composite competitions.
Keywords: Top-k query; uncertain data; Best-kGROUP query; skyline kGROUP
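The two generic building blocks named in this abstract, a top-k query and a skyline (non-dominated) filter, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's Best-kGROUP algorithm: the function names `top_k`, `dominates`, and `skyline` are hypothetical, and the skyline here uses plain Pareto dominance over per-game probability vectors rather than the thesis's own definition of a "better" group.

```python
import heapq


def top_k(objects, score, k):
    """Return the k objects with the highest score(obj), best first."""
    return heapq.nlargest(k, objects, key=score)


def dominates(a, b):
    """True if vector a is >= b in every position and > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(vectors):
    """Keep only the vectors that no other vector dominates.

    Assumption: each vector holds one group's probability of taking the
    top places in each game. A vector never dominates itself, so no
    identity check is needed.
    """
    return [v for v in vectors if not any(dominates(w, v) for w in vectors)]
```

For example, `skyline([(0.5, 0.2), (0.4, 0.1), (0.1, 0.9)])` drops `(0.4, 0.1)` because `(0.5, 0.2)` is at least as good in both games. The quadratic pairwise comparison is fine for a sketch; the abstract's dynamic-programming and filtering steps exist precisely to avoid enumerating every group.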
2	從搜尋引擎查詢紀錄中學習Ontology / Ontology Learning from Query Logs of Search Engines
陳茂富. Unknown Date.
Ontologies can be used to organize, manage, and share knowledge. Ontology Engineering is the process of constructing an ontology; it is usually a time-consuming and error-prone task in which most of the work must be done by hand, so machine assistance has become an important topic. Using Knowledge Discovery methods to assist Ontology Engineering is called Ontology Learning. In this thesis, Ontology Learning is done by analyzing the querying behavior of users on search engines together with the web pages related to the query terms. The ontology consists of user query terms and the relations among them; the relations we consider include hypernymy, hyponymy, and synonymy, among others. The goal of this thesis is to learn the correct relations among these query terms automatically. In addition, we implemented a complete Ontology Learning system that automatically collects query logs, extracts and analyzes query keywords, determines the relations among keywords, and produces the final ontology.
Keywords: Search engine; Query log; Ontology learning
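3	根據食材搭配與替代關係設計食譜搜尋的自動完成機制 / Autocomplete Mechanism for Recipe Search by Ingredients Based on Ingredient Complement and Substitution
周冠嶔, Chou, Kuan Chin. Unknown Date.
"Food is the first necessity of the people," as the saying goes, and diet is closely tied to daily life. In recent years, a wave of food-safety scandals has increased the demand for home cooking. People cooking at home, however, often do not know what dish to cook, so a convenient recipe search system would be very helpful. When using such a system, users typically know only that they want to cook with certain ingredients, not which recipes contain them, so a query with only a few ingredients inevitably returns too many recipes to find a preferred one quickly. We built an autocomplete mechanism for recipe search and implemented a recipe search engine based on it. When a user searches, the system recommends ingredients that go well with the ingredients already entered, helping the user form a more complete query so that the search returns fewer, more precise recipes and the user finds a preferred recipe faster. Recommending only complementary ingredients, however, may suggest ingredients that are substitutes for ingredients in the query, that is, ingredients that usually do not appear together; we therefore also studied substitute ingredients. Given a pair of two ingredients, we study how to determine automatically whether they are substitutes. We cast this as a classification problem and use One-Class Classification to handle the class imbalance problem in it. We use the F1-score to compare One-Class Classification with conventional classifiers. Experiments show that One-Class Classification handles the imbalance problem better than conventional classifiers.
Keywords: Data Mining; Query Autocomplete; Recipe Search Engine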
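4	地理資訊系統在不動產查詢與分析上之應用 [Application of Geographic Information Systems to Real Estate Query and Analysis]
王琬宜, Wang, Wan-I. Unknown Date.
This study integrates real estate information with related spatial data and, from the user's point of view, uses a geographic information system to help build a real estate query and analysis system that combines a property's own attributes with its spatial relationships. The geographic database for the system was built with ARC/INFO and covers both the basic attribute factors and the spatial-environment attribute factors considered in home buying, so that it captures both "point" and "area" information about a property. The system itself was developed with Visual Basic (VB) 6.0 Chinese Enterprise Edition. It was developed without any commercial GIS software, being written entirely in Visual Basic; besides a low-cost platform and easily obtained tools, its greatest benefit is its portability and ease of dissemination. The system's most distinctive feature is that users can choose the case conditions they need and set the weight of each factor according to how much importance they attach to it.
Keywords: Geographic Information Systems (GIS); Real Estate; Query and Analysis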
5	在高度分散式環境下對高維度資料建立索引 / Indexing high-dimensional data in highly distributed environments
黃齡葦, Huang, Ling Wei. Unknown Date.
With the rapid growth of data, there has been increasing interest in deploying large-scale, scalable storage services in highly distributed environments. Many database applications, such as time-series, biological, and multimedia (image, music, video) databases, handle high-dimensional data. In these systems the k-nearest-neighbors (KNN) query, which finds the objects in a high-dimensional database that are most similar to a given query object, is one of the most frequent queries but also a costly operation. As in a conventional DBMS, indexes can improve query performance, but they cannot be deployed directly in highly distributed systems because the environment is more complex. To support KNN queries efficiently in such an environment, we propose a high-dimensional indexing strategy, RP-CAN (Reference Point-Content Addressable Network), which combines the routing protocol of CAN [14] with reference-point-based indexing [2,13]. We implemented RP-CAN and compared it experimentally with RT-CAN [1]; the results show that RP-CAN answers KNN queries on high-dimensional data more efficiently than RT-CAN.
Keywords: Index; KNN query; Highly distributed environments; Reference point; P2P
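The general idea behind reference-point indexing for KNN queries can be sketched on a single machine as follows. This is a hedged illustration of the generic technique (pruning with the triangle inequality against one reference point), not the RP-CAN design, which distributes the index over a CAN overlay; the names `build_index` and `knn` and the choice of a single reference point are assumptions made for the example.

```python
import heapq
import math


def build_index(points, ref):
    """Store each point with its precomputed distance to the reference point."""
    return sorted((math.dist(p, ref), p) for p in points)


def knn(index, ref, query, k):
    """Return the k points nearest to query, nearest first.

    By the triangle inequality, |d(p, ref) - d(query, ref)| is a lower
    bound on d(p, query), so points are visited in order of that bound
    and the scan stops once the bound cannot beat the current k-th best.
    """
    dq = math.dist(query, ref)
    best = []  # max-heap of (-distance, point), at most k entries
    for d_ref, p in sorted(index, key=lambda e: abs(e[0] - dq)):
        lower_bound = abs(d_ref - dq)
        if len(best) == k and lower_bound >= -best[0][0]:
            break  # no remaining point can be closer than the k-th best
        d = math.dist(p, query)
        if len(best) < k:
            heapq.heappush(best, (-d, p))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, p))
    return [p for _, p in sorted((-neg_d, p) for neg_d, p in best)]
```

The sketch re-sorts the index per query for clarity; a real index would instead search outward from the query's position in the sorted distance list, and would typically use several reference points.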
6	土地資訊系統中含時間維度之資料管理 [Management of Data with a Time Dimension in Land Information Systems]
陳怡茹, Che, I-ju. Unknown Date.
As cadastral management work grows in breadth and depth, queries and retrieval of historical cadastral and land-use data become increasingly frequent, but the way current cadastral management systems are built makes historical data queries inefficient. This study explores how to build a database with a time dimension. By adding fields such as change time and change annotation and by creating a change log table, it enables the current cadastral management system to query and retrieve cadastral data by time or by a specified parcel number. The primary purpose of the study is the retrieval of parcel lineage and areas at any point in time; the research targets the design of a spatio-temporal data model and the analysis of the efficiency of data retrieval and query.
Keywords: Cadastral management; Spatio-temporal data model; Data query and retrieval
7	用於混合式耐延遲網路之適地性服務資料搜尋方法 / Location-based content search approach in hybrid delay tolerant networks
李欣諦, Lee, Hsin Ti. Unknown Date.
In Delay Tolerant Networks (DTNs), offline users can deliver messages to a destination through encounters between nodes, using a peer-to-peer message routing approach. This solves the problem of delivering messages while users are temporarily unable to connect to the Internet. By exploiting these characteristics of DTNs, a user who wants information related to the area they are in but cannot get online can still issue a query with the help of nearby users of the same service. In this thesis, we propose a location-based content search approach. Based on a three-tier area concept and hybrid node types, we present four strategies to solve the query problem: Data Replication, Query Replication, Data Reply, and Data Synchronization. For message transfer in particular, we propose a Message Queue Selection algorithm that assigns a priority to every message so that more important messages are sent first, which increases the query success ratio and reduces the query delay. Finally, we evaluated our approach against other routing schemes; the simulation results show that it achieves better query efficiency and shorter delay.
Keywords: Delay Tolerant Networks; Location-based service; Content Query; Routing protocol
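The idea of sending more important messages first can be sketched with a priority queue. This is only an illustration of priority-ordered sending under assumed names: the class `MessageQueue`, the integer priorities, and the first-in-first-out tie-breaking are hypothetical and are not taken from the thesis's Message Queue Selection algorithm.

```python
import heapq
import itertools


class MessageQueue:
    """Outgoing message buffer that releases the highest priority first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: earlier message first

    def push(self, message, priority):
        # heapq is a min-heap, so the priority is negated to pop the largest.
        heapq.heappush(self._heap, (-priority, next(self._order), message))

    def pop(self):
        """Remove and return the next message to send."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

When two nodes meet only briefly, draining such a queue from the top means that whatever contact time is available goes to the messages ranked most important.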
8	影像內容檢索中以社群網絡演算法為基礎之多張影像搜尋 / Query by Multiple Images for Content-Based Image Retrieval Based on Social Network Algorithms
張瑋鈴, Chang, Wei Ling. Unknown Date.
In recent years, with the rapid development of digital technology, the number of digital images has grown rapidly, and Content-Based Image Retrieval has become one of the important multimedia technologies. Traditional Content-Based Image Retrieval uses low-level image features such as color, texture, and shape to describe image content and to compare image similarity. Much research has been done on Content-Based Image Retrieval, but most of it supports only single-image queries, and little addresses query by multiple images. This thesis investigates a mechanism for query by multiple images. First, MPEG-7 feature descriptors and SIFT features are extracted from the images. Then, we compute the similarity between images to construct a proximity graph that represents the similarity structure among them. Finally, queries by multiple images are processed using social network algorithms. Experimental results indicate that the proposed method retrieves similar images with high accuracy and precision.
Keywords: Content-Based Image Retrieval; Multiple Images Search; Social Network
9	子樹查詢的索引結構設計 / PCS-trie: An Index Structure for Sub-Tree Query
張詩宜, Shih-i Chang. Unknown Date.
With the popularization of computers and the Internet, more and more data in various domains are digitized to take advantage of computing and storage and to be disseminated over the Internet. Much of this data is represented and stored as trees, so querying these large collections is a challenge. In this paper, we address the indexing problem for rooted, labeled trees, whether ordered or unordered. We present a novel main-memory database index structure, the PREOD code Search trie (PCS-trie), together with algorithms for insertion, deletion, and search, to speed up sub-tree queries on a tree database. The basis of this research is to encode each tree in the database as a PREOD code string, which preserves the tree's structural information completely, and then to index these PREOD codes with the PCS-trie. The PCS-trie supports dynamic insertion and deletion of PREOD codes and handles three types of query: exact sub-tree queries, queries with don't cares, and fault-tolerant queries. Finally, we conducted a series of experiments to evaluate the time and space efficiency of the PCS-trie and its algorithms; results on synthetic data demonstrate the good performance of the proposed approach.
Keywords: Index structure; Tree; Sub-tree query; Information Retrieval; XML
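The encode-then-index idea can be illustrated with a small sketch. The abstract does not specify the PREOD code, so the encoding below is a stand-in: a preorder sequence of (label, child count) pairs, which determines an ordered labeled tree uniquely. The names `encode` and `Trie` are hypothetical, and the trie supports only insertion and exact whole-tree lookup, not the sub-tree, don't-care, or fault-tolerant queries of the PCS-trie.

```python
def encode(tree):
    """Preorder-encode a tree given as (label, [children]).

    Returns a tuple of (label, child_count) pairs; this is an assumed
    stand-in for the thesis's PREOD code.
    """
    label, children = tree
    code = [(label, len(children))]
    for child in children:
        code.extend(encode(child))
    return tuple(code)


class Trie:
    """Trie keyed on code symbols; each node records the trees ending there."""

    def __init__(self):
        self.children = {}
        self.ids = set()

    def insert(self, code, tree_id):
        node = self
        for symbol in code:
            node = node.children.setdefault(symbol, Trie())
        node.ids.add(tree_id)

    def find(self, code):
        """Return the ids of stored trees whose code equals `code` exactly."""
        node = self
        for symbol in code:
            node = node.children.get(symbol)
            if node is None:
                return set()
        return set(node.ids)
```

Because trees that share a preorder prefix share a path in the trie, a lookup costs time proportional to the length of the query code rather than to the number of stored trees.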
10	開放型XML資料庫績效評估工作量之模型 / An Open Workload Model for XML Database Benchmark
尤靖雅. Unknown Date.
XML (eXtensible Markup Language) is an emerging data format for data processing on the Internet. XML provides rich data semantics and independence of data from presentation. Thanks to these features, XML has become a new data exchange standard and raises new storage and query processing issues for the database research community. In this paper, we study performance evaluation for XML databases and develop a generic, requirement-driven XML workload model that is applicable to any application scenario and portable across platforms. The workload model has three sub-models: the XML data model, the query model, and the control model. The XML data model formulates the generic hierarchical structure of XML documents and supports a flexible document structure for the test database. The query model contains a flexible selector of classical query modules and an open query input, so that users can define requirement-driven test queries that exercise the query processing ability of the XML database. The control model defines variables that set up the execution environment of a benchmark. This open, flexible, and systematic workload method lets users in various application domains predict or profile the performance of XML database systems.
Keywords: XML database; performance evaluation; workload model; data model; query model; control model; XML benchmark
