Global ETD Search

1	A Semantic Web based search engine with X3D visualisation of queries and results Gkoutzis, Konstantinos January 2013 The Semantic Web project has introduced new techniques for managing information. Data can now be organised more efficiently, in such a way that computers can exploit the relationships that characterise the given input to present more relevant output. Semantic Web based search engines can quickly extract exactly what needs to be found and retrieve it while avoiding information overload. Up until now, search engines have interacted with their users by asking them to look for words and phrases. We propose the creation of a new-generation Semantic Web search engine that will offer a visual interface for queries and results. To create such an engine, information input must be viewed not merely as keywords, but as specific concepts and objects which are all part of the same universal system. To make the manipulation of the interconnected visual objects simpler and more natural, 3D graphics are utilised, based on the X3D Web standard, allowing users to semantically synthesise their queries faster and in a way that is more logical both for them and for the computer. 025.042
2	Enhanced Web Search Engines with Query-Concept Bipartite Graphs Chen, Yan 16 August 2010 With the rapid growth of information on the Web, Web search engines have gained great momentum for exploiting valuable Web resources. Although keyword-based Web search engines provide relevant search results in response to users' queries, further enhancement is still needed. Three important issues are: (1) search results can be diverse because ambiguous keywords in queries can be interpreted in different ways; (2) identifying keywords in long queries is difficult for search engines; and (3) generating query-specific Web page summaries is desirable for previews of Web search results. Based on clickthrough data, this thesis proposes a query-concept bipartite graph for representing relations between queries, and applies these relations to three applications: (1) personalized query suggestions, (2) Web searches with long queries, and (3) query-specific Web page summarization. Experimental results show that query-concept bipartite graphs improve performance in all three applications. Queries; Query-concept bipartite graph; Web search engine; Text mining; Computational intelligence; Computer Sciences
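As a rough illustration of the query-concept bipartite graph described in the abstract above, the sketch below links queries to concepts and suggests related queries by the overlap of their concept sets. The input format (query, concept pairs taken from a click log), the function names, and the Jaccard scoring are assumptions made for this example; the thesis may construct and score the graph differently.

```python
from collections import defaultdict


def build_bipartite(clicks):
    """Build both sides of a query-concept graph.

    clicks: iterable of (query, concept) pairs (hypothetical log format).
    """
    query_to_concepts = defaultdict(set)
    concept_to_queries = defaultdict(set)
    for query, concept in clicks:
        query_to_concepts[query].add(concept)
        concept_to_queries[concept].add(query)
    return query_to_concepts, concept_to_queries


def related_queries(query, query_to_concepts, concept_to_queries):
    """Rank other queries by Jaccard overlap of their concept sets."""
    own = query_to_concepts.get(query, set())
    scores = {}
    for concept in own:
        for other in concept_to_queries[concept]:
            if other != query:
                theirs = query_to_concepts[other]
                scores[other] = len(own & theirs) / len(own | theirs)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

An ambiguous query such as "jaguar" would be connected to both a "car" and an "animal" concept, so queries on either side of the ambiguity appear as suggestions.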
3	Using Wikipedia Knowledge and Query Types in a New Indexing Approach for Web Search Engines Al-Akashi, Falah Hassan Ali January 2014 The Web comprises a vast quantity of text. Modern search engines struggle to index it independently of the structure of queries and the type of Web data, and commonly use indexing based on the Web's graph structure to identify high-quality relevant pages. However, despite the apparent widespread use of these algorithms, Web indexing based on human feedback and document content is controversial. There are many fundamental questions that need to be addressed, including: How many types of domains/websites are there in the Web? What type of data is in each type of domain? For each type, which segments/HTML fields in the documents are most useful? What are the relationships between the segments? How can Web content be indexed efficiently in all forms of document configurations? Our investigation of these questions has led to a novel way to use Wikipedia to find the relationships between query structures and document configurations throughout the document indexing process, and to use them to build an efficient index that allows fast indexing and searching and optimizes the retrieval of highly relevant results. We consider the top page on the ranked list to be highly important in determining the types of queries. Our aim is to design a powerful search engine with a strong focus on how to make the first page highly relevant to the user, and on how to retrieve other pages based on that first page. By processing the user query with the Wikipedia index and determining the type of the query, our approach can trace the path of a query in our index and retrieve specific results for each type. We use two kinds of data to increase the relevancy and efficiency of the ranked results: offline and real-time.
Traditional search engines find it difficult to use these two kinds of data together, because building a real-time index from social data and integrating it with the index for the offline data is difficult in a traditional distributed index. As a source of offline data, we use data from the Text Retrieval Conference (TREC) evaluation campaign. The Web track at TREC offers researchers a chance to investigate different retrieval approaches for Web indexing and searching. The crawled offline dataset makes it possible to design powerful search engines that extend current methods, and to evaluate and compare them. We propose a new indexing method based on the structures of the queries and the content of documents. Our search engine uses a core index for offline data and a hash index for real-time data, which leads to improved performance. The TREC Web track evaluation of our experiments showed that our approach can be successfully employed for different types of queries. We evaluated our search engine on different sets of queries from the TREC 2010, 2011 and 2012 Web tracks. Our approach achieved very good results on the TREC 2010 training queries. On the TREC 2011 testing queries, our approach was one of the six best compared to all other approaches (including those that used a very large corpus of 500 million documents), and it was second best when compared to approaches that used only part of the corpus (50 million documents), as ours did. On the TREC 2012 testing queries, our approach was second best compared to all the approaches, and first compared only to systems that used the subset of 50 million documents. Web Search Engine; Indexing and Searching; Wikipedia Knowledge; Term Impact; Centralized Index; Real Time and Social Index; Query Structures; User Interface
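The split between a core index for offline data and a hash index for real-time data can be pictured with a small in-memory sketch. Everything here is an assumption for illustration: the class name HybridIndex, the whitespace tokenizer, and the choice to return offline and live hits separately are not taken from the thesis.

```python
from collections import defaultdict


class HybridIndex:
    """Toy two-part index: a batch-built core plus an incrementally updated hash map."""

    def __init__(self, offline_docs):
        core = defaultdict(set)
        for doc_id, text in offline_docs.items():
            for term in text.lower().split():
                core[term].add(doc_id)
        # The core index is built once and frozen as sorted posting lists.
        self.core = {term: tuple(sorted(ids)) for term, ids in core.items()}
        # The real-time side is a plain hash map that accepts updates at any time.
        self.realtime = defaultdict(set)

    def add_realtime(self, doc_id, text):
        for term in text.lower().split():
            self.realtime[term].add(doc_id)

    def search(self, term):
        """Return (offline hits, real-time hits) for a single term."""
        term = term.lower()
        offline = list(self.core.get(term, ()))
        live = sorted(self.realtime.get(term, ()))
        return offline, live
```

Keeping the two structures separate means a new social post can be searched immediately without rebuilding the large offline index; how the two result lists are merged and ranked is left open in this sketch.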
4	Uma arquitetura para mecanismos de buscas na web usando integração de esquemas e padrões de metadados heterogêneos de recursos educacionais abertos em repositórios dispersos / An architecture for web search engines using integration of heterogeneous metadata schemas and standards of open educational resources in scattered repositories Gazzola, Murilo Gleyson 18 November 2015 Open Educational Resources (OER) can be defined as teaching, learning and research materials, in any storage medium, that are widely available under an open license permitting reuse, adaptation and redistribution with no or limited restrictions. Many teaching and research institutions have invested in OER to broaden access to knowledge. However, users still have difficulty finding OER with current search engines, mainly because Web search engines are generic: they look for information anywhere, from sales pages to material written by anonymous authors. These engines do not take into account the intrinsic characteristics of OER, such as the different metadata standards, repositories and platforms in existence, license types, and the granularity and quality of the resources. This dissertation presents the development of a Web search engine designed specifically for retrieving OER, named SeeOER. Its distinguishing features include schema-level conflict resolution arising from the heterogeneity of OER, search across OER repositories, queries over data provenance, and an effective crawler for obtaining specific metadata.
The work also contributes to bringing OER search to the Brazilian context, to the mapping of metadata standards for Web search engines, and to the publication of a Web search engine architecture. SeeOER provides a service backed by an inverted index that helps locate OER in repositories scattered across the Web, along with a search API that supports keyword queries and Boolean operators. Validation was both qualitative and quantitative, covering the search engine as a whole and its individual components. The qualitative validation involved 10 participants from distinct groups by level of education and field of study. Quantitatively, SeeOER indexed 23,618 OER compared with 15,955 for Jorum. In terms of quality, SeeOER outperformed Jorum by 27 points under the penalized function and score used in this research. Data integration; Integration schemes; Web search engine; Metadata standards; Open educational resources; Data provenance
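The inverted index and Boolean keyword queries mentioned for SeeOER can be illustrated in a few lines. This is a generic sketch, not the SeeOER API: the function names, the record format (an ID mapped to a title string), and the must / any_of / must_not parameters are all hypothetical.

```python
def build_index(records):
    """Map each lowercase term to the set of record IDs containing it."""
    index = {}
    for rec_id, text in records.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(rec_id)
    return index


def query(index, all_ids, must=(), any_of=(), must_not=()):
    """Evaluate a simple Boolean query: AND over must, OR over any_of, NOT over must_not."""
    result = set(all_ids)
    for term in must:
        result &= index.get(term.lower(), set())
    if any_of:
        union = set()
        for term in any_of:
            union |= index.get(term.lower(), set())
        result &= union
    for term in must_not:
        result -= index.get(term.lower(), set())
    return sorted(result)
```

For example, with records titled "Open algebra course", "Open physics lecture" and "Algebra exercises", a query requiring "open" and excluding "physics" returns only the first record.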
5	論網路新聞搜尋引擎的合理使用-以Google News美國版的著作權法相關爭議為中心 / The fair use analysis on the web search engine 林佩蓉, Lin, Patricia Unknown Date The Internet has changed the way people read the news in many respects: news search engines such as Google News offer readers a diverse, one-stop, real-time aggregated view, and most Web users are now accustomed to selecting the information they need from several different Web pages when browsing or carrying out research. However, as newspaper print circulation falls sharply and traffic to news Web sites falls short of expectations, news publishers are increasingly frustrated that search engines profit from others' works. Google began placing advertisements on the US edition of Google News in 2009. Will this be the last straw for traditional news media? Does it undermine the legitimacy of Google's earning money from others' content, and can it still qualify under the fair use doctrine of US copyright law? This thesis analyzes the dispute over advertising on Google News from the perspectives of the development of online news, Google's business model, and the purpose of the fair use doctrine, focusing on US copyright law, to examine the legality of Web search engines excerpting others' works and the future of fair use in the digital era. web journalism; web search engine; fair use
7	L'implicite dans la requête adressée à un moteur de recherche Web / The implicit in query sent to Web engine Zouhri, Talal 04 July 2013 The object of our study is the query that a user sends to a Web search engine while seeking information. We aim to better understand the stage of information seeking that lies between the information need and the formulation or reformulation of the query. Our thesis is built around two research hypotheses. First, we hypothesized that a query sent to a Web search engine expresses the user's information need only partially and may therefore carry implicit content. Second, we considered that users can exploit this implicit content in their query formulation and reformulation tactics. We analyzed the accounts given by 61 students whom we interviewed about their search intent. These accounts consisted mainly of a semantic level (describing the topic of the search) and a pragmatic level (a single goal, or a goal with one or more sub-goals). The terms representing the semantic level could be fully or partially formulated in the query, but those representing the pragmatic level were generally not formulated. This communication situation resembles a negotiation between the search engine and the user: the search engine tries to gather clues about the user's information need, while the user tries to obtain, from the content explicitly formulated in the query, a set of information that helps make progress on the problem at hand. Information need; Implicit; Goal; Query; Web search engine; Information seeking 025.04
8	網頁地理關聯性之分析與研究 / The Analysis of Geographic Relations of Internet Information 黃建達, Huang, Jian Da Unknown Date Geographic Web search has become increasingly popular in recent years. Traditional Web search engines, such as Google and Yahoo, cannot reflect the geographic relevance between user queries and Web documents, so they cannot retrieve geographically related information for a query. In many cases, however, taking this relevance into account could improve search accuracy. In this thesis, we propose a mechanism that uses the Bounding Rectangle Model (BR Model) to retrieve Web documents that are geographically relevant to user queries. Users provide only conventional text queries (keywords), and our search engine returns geographically relevant results. The method has three steps. First, we create a gazetteer and use it to find the place names and spatial data that appear in the user query and in Web documents. Next, we use the spatial data to build a set of spatial index terms that represents the geographic scope of the user query and of each Web document. Finally, we use these spatial index terms to calculate the degree of geographic similarity between the query and the documents, in order to identify the documents with the highest geographic relevance. We implemented a prototype search engine using this approach. The experimental results show that the mechanism successfully retrieves geographically relevant data and provides more accurate search results. geographic information system; information retrieval; web search engine; BR-Model
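The three steps in the abstract above (gazetteer lookup, geographic scope, similarity) can be sketched with bounding rectangles. This is only an illustration under stated assumptions: the gazetteer entries use rough, hypothetical coordinates, the scope of a text is taken as the rectangle enclosing all place names found in it, and similarity is computed as intersection area over union area. The thesis works with spatial index terms and may define similarity quite differently.

```python
# Hypothetical gazetteer: place name -> (min_lon, min_lat, max_lon, max_lat).
# Coordinates are rough values for illustration only.
GAZETTEER = {
    "taipei": (121.45, 24.96, 121.67, 25.21),
    "taiwan": (120.0, 21.9, 122.0, 25.3),
    "tokyo": (139.56, 35.52, 139.92, 35.82),
}


def scope(text):
    """Return the rectangle enclosing every gazetteer place named in text, or None."""
    rects = [GAZETTEER[w] for w in text.lower().split() if w in GAZETTEER]
    if not rects:
        return None
    min_lons, min_lats, max_lons, max_lats = zip(*rects)
    return (min(min_lons), min(min_lats), max(max_lons), max(max_lats))


def area(rect):
    """Area of a rectangle; zero if it is empty or inverted."""
    return max(0.0, rect[2] - rect[0]) * max(0.0, rect[3] - rect[1])


def geo_similarity(query, document):
    """Intersection-over-union of the two geographic scopes (an assumed measure)."""
    a, b = scope(query), scope(document)
    if a is None or b is None:
        return 0.0
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    inter = area(overlap)
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0
```

A query mentioning Taipei scores highest against documents about Taipei, lower against documents whose scope is all of Taiwan, and zero against documents about Tokyo or documents with no recognized place names.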
9	K problematice kolokability u adjektivních vstupů ve Velkém německo-českém akademickém slovníku (VNČAS) / On Collocation Problems in Adjective Entries. Problematic Cases in the Large German-Czech Academic Dictionary Oliva, Jakub January 2012 The aim of this work was to point out the problems with the collocability of German adjectives in dictionaries and, on the basis of the analysis carried out, to suggest possible solutions that could be used in the entries. The primary information sources were the German dictionary Duden and the German-Czech dictionary Siebenschein; the secondary sources were the Internet corpus DeReKo and the Web search engine Google. Dictionary collocations should be chosen not by their quantity but by their usefulness. They should exemplify the differences between the two languages and serve as a reliable reference for the dictionary user.
