網頁地理關聯性之分析與研究 / The Analysis of Geographic Relations of Internet Information
黃建達, Huang, Jian Da
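In recent years, web search involving geographic information has received growing attention. Traditional web search engines cannot reflect the geographic relevance between a user query and web documents. In some cases, we would like a web search engine to take the geographic relevance between the user query and web documents into account in order to improve search accuracy.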
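Our research uses the Bounding Rectangle Model (BR Model) to search for web documents that are highly geographically relevant to the user query. Users need only enter a text query to obtain matching web documents. First, we build a gazetteer to find the place names and spatial data that appear in the user query and in web documents. Next, we use the spatial data to build sets of spatial index terms, which represent the geographic scope of the user query and of the web documents. Finally, we compute the degree of geographic similarity between the two from their sets of spatial index terms, in order to find the web documents with higher geographic relevance to the user query.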
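The first two steps (gazetteer lookup, then deriving a geographic scope from the matched places) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify how the gazetteer is stored or how spatial index terms are encoded, so the `Rect` type, the `GAZETTEER` entries (fictional places with made-up coordinates), and the functions `find_places` and `enclosing_rect` are all hypothetical, and naive substring matching stands in for whatever place-name recognition the thesis actually uses.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned bounding rectangle in (longitude, latitude) degrees."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


# Hypothetical toy gazetteer: fictional place names with made-up rectangles.
GAZETTEER = {
    "northtown": Rect(10.0, 50.0, 10.5, 50.4),
    "southport": Rect(10.2, 48.0, 10.9, 48.6),
}


def find_places(text: str, gazetteer: dict[str, Rect]) -> dict[str, Rect]:
    """Return the gazetteer entries whose names occur in the text.

    Assumption: case-insensitive substring matching is good enough here.
    """
    lowered = text.lower()
    return {name: rect for name, rect in gazetteer.items() if name in lowered}


def enclosing_rect(rects: list[Rect]) -> Rect:
    """Smallest rectangle that contains every input rectangle."""
    if not rects:
        raise ValueError("need at least one rectangle")
    return Rect(
        min(r.min_lon for r in rects),
        min(r.min_lat for r in rects),
        max(r.max_lon for r in rects),
        max(r.max_lat for r in rects),
    )


# The query mentions one place; the document mentions two.
query_places = find_places("hotels near Northtown", GAZETTEER)
doc_places = find_places("A travel guide to Northtown and Southport.", GAZETTEER)

# One possible notion of a document's geographic scope: the rectangle
# enclosing all places it mentions.
doc_scope = enclosing_rect(list(doc_places.values()))
print(sorted(query_places), doc_scope)
```

Keeping the per-place rectangles alongside the enclosing rectangle leaves room for either a coarse or a fine-grained comparison later; which of these the BR Model uses is not stated in the abstract.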
The contribution of this thesis is a complete information retrieval model framework for analyzing the geographic relevance between user queries and web documents; by entering a text query, users obtain web documents with matching geographic relevance.
Geographic web search has become increasingly popular in recent years. Traditional web search engines, such as Google and Yahoo, cannot account for the geographic relevance between user queries and internet documents, and therefore cannot retrieve geographically related information in response to user queries. In many cases, however, taking this geographic relevance into account could improve the accuracy of such searches.
In this thesis, we propose a mechanism that uses the Bounding Rectangle Model (BR Model) to retrieve geographically relevant internet documents in response to user queries. Users provide only conventional keyword queries, and our search engine returns geographically relevant results. Our method consists of three steps. First, we create a gazetteer and use it to identify the place names and spatial data that appear in user queries and internet documents. Next, we use the spatial data to build a set of spatial index terms that represents the geographic scope of the user query and of each internet document. Finally, we use these spatial index terms to calculate the degree of geographic similarity between the user query and the internet documents, and so identify the documents with high geographic relevance.
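The final step, scoring geographic similarity and ranking documents by it, might look something like the sketch below. The abstract does not give the similarity formula, so this example swaps in a common overlap measure (intersection area divided by union area) as a stand-in; it is not the thesis's actual measure. The names `Rect`, `rect_similarity`, `geo_similarity`, and `rank`, as well as the sample rectangles, are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned bounding rectangle; units are arbitrary in this sketch."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def area(r: Rect) -> float:
    return max(0.0, r.max_x - r.min_x) * max(0.0, r.max_y - r.min_y)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two rectangles, or 0.0 if they do not overlap."""
    w = min(a.max_x, b.max_x) - max(a.min_x, b.min_x)
    h = min(a.max_y, b.max_y) - max(a.min_y, b.min_y)
    return w * h if w > 0 and h > 0 else 0.0


def rect_similarity(a: Rect, b: Rect) -> float:
    """Intersection over union: 1.0 for identical rectangles, 0.0 if disjoint."""
    inter = overlap_area(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def geo_similarity(query_rects: list[Rect], doc_rects: list[Rect]) -> float:
    """Average, over the query's rectangles, of the best-matching document rectangle."""
    if not query_rects or not doc_rects:
        return 0.0
    best = [max(rect_similarity(q, d) for d in doc_rects) for q in query_rects]
    return sum(best) / len(best)


def rank(query_rects: list[Rect], docs: dict[str, list[Rect]]) -> list[str]:
    """Document ids ordered from most to least geographically similar."""
    return sorted(docs, key=lambda d: geo_similarity(query_rects, docs[d]), reverse=True)


query = [Rect(0, 0, 2, 2)]
docs = {
    "partial": [Rect(1, 1, 3, 3)],   # overlaps the query region in one corner
    "exact": [Rect(0, 0, 2, 2)],     # same region as the query
    "far": [Rect(5, 5, 6, 6)],       # disjoint from the query region
}
ranking = rank(query, docs)
print(ranking)
```

In a full search engine this geographic score would presumably be combined with an ordinary keyword relevance score; how the two are weighted is not described in the abstract.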
We implemented a prototype search engine using our approach. The experimental results show that this mechanism successfully retrieves geographically relevant data and provides more accurate search results.