  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Information Retrieval with Query Hypergraphs

Bendersky, Michael 01 September 2012
Current information retrieval models are optimized for retrieval with short keyword queries. In contrast, in this dissertation we focus on longer, verbose queries with more complex structure that are becoming more common in both mobile and web search. To this end, we propose an expressive query representation formalism based on query hypergraphs. Unlike the existing query representations, query hypergraphs model the dependencies between arbitrary concepts in the query, rather than dependencies between single query terms. Query hypergraphs are parameterized by importance weights, which are assigned to concepts and concept dependencies in the query hypergraph, based on their contribution to the overall retrieval effectiveness. Query hypergraphs are not limited to modeling the explicit query structure. Accordingly, we develop two methods for query expansion using query hypergraphs. In these methods, the expansion concepts in the query hypergraph may come either from the retrieval corpus alone or from a combination of multiple information sources such as Wikipedia or the anchor text extracted from a large-scale web corpus. Experiments on newswire and web corpora demonstrate that query hypergraphs are consistently and significantly more effective than many of the current state-of-the-art retrieval methods. Query hypergraphs improve the retrieval performance for all query types, and, in particular, they exhibit the highest effectiveness gains for verbose queries.
72

EFFICIENT K-WORD PROXIMITY SEARCH

Gupta, Chirag January 2008
No description available.
73

Detecting Internet visual plagiarism in higher education photography with Google™ Search by Image : proposed upload methods and system evaluation

Van Heerden, Leanri. January 2014
Thesis (M. Tech. (Design and Studio Art)) - Central University of Technology, Free State, 2014 / The Information Age has presented those in the discipline of photography with very many advantages. Digital photographers enjoy all the perquisites of convenience while still producing high-quality images. Lecturers find themselves the authorities of increasingly archaic knowledge in a perpetual race to keep up with technology. When inspiration becomes imitation and visual plagiarism occurs, lecturers may find themselves at a loss for taking action as content-based image retrieval systems, like Google™ Search by Image (SBI), have not yet been systematically tested for the detection of visual plagiarism. Currently there exists no efficacious method available to photography lecturers in higher education for detecting visual plagiarism. As such, the aim of this study is to ascertain the most effective uploading methods and precision of the Google™ SBI system which lecturers can use to establish a systematic workflow that will combat visual plagiarism in photography programmes. Images were selected from the Google™ Images database by means of random sampling and uploaded to Google™ SBI to determine if the system can match the images to their Internet source. Each of the images received a black and white conversion, a contrast adjustment and a hue shift to ascertain whether the system can also match altered images. Composite images were compiled to establish whether the system can detect images from the salient feature. Results were recorded and the precision values calculated to determine the system’s success rate and accuracy. The results were favourable and 93.25% of the adjusted images retrieved results with a precision value of 0.96. The composite images had a success rate of 80% when uploaded intact with no dissections and a perfect precision value of 1.00. 
Google™ SBI can successfully be used by the photography lecturer as a functional visual plagiarism detection system to match images unethically appropriated by students from the Internet.
74

Web search engines as teaching and research resources : a perceptions survey of IT and CS staff from selected universities of the KwaZulu-Natal and Eastern Cape provinces of South Africa

Tamba, Paul A. Tamba January 2011
A dissertation submitted in fulfillment of the requirements for the degree of Master in Technology: Information Technology, Durban University of Technology, 2011. / This study examines the perceived effect of the following factors on the web searching ability of academic staff in the computing discipline: demographic attributes such as gender, age group, position held, and highest qualification; lecturing experience; research experience; English language proficiency; and web searching experience. The research objectives are achieved using a Likert-scale questionnaire administered to 61 academic staff from Information Technology and Computer Science departments at four universities in the KwaZulu-Natal and Eastern Cape provinces of South Africa. Descriptive and inferential statistics were computed from the questionnaire data after reliability and validity tests using factor analysis and Cronbach's coefficient methods in PASW Statistics 18.0 (SPSS). Descriptive statistics revealed a majority of staff from IT as compared to CS, and a majority of underqualified, middle-aged male staff in junior positions with considerable years of lecturing experience but little research experience. Inferential statistics show an association between web searching ability and demographic attributes such as academic qualifications, positions, and years of research experience, and also reveal a relationship between web searching ability and lecturing experience, and between web searching ability and English language ability. However, the association between position, English language ability, and searching ability was found to be the strongest of all. The novel finding of this study is the effect of lecturing experience on web searching ability, which has not been claimed by the existing research reviewed.
Ideas for future research include mentoring of academic staff by more experienced staff, training of novice web searchers, designing and using semantic search systems both in English and in local languages, publishing more web content in local languages, and triangulating various research strategies for the analysis of the usability of web search engines.
75

Google searches and financial markets: IPOs and uncertainty

Vakrman, Tomáš January 2014
This thesis studies how investor attention, proxied by Google search volume, affects different aspects of market behavior. My results show that a surge in online attention is associated with an increase in trading activity and stock price volatility, but no effect is detected for daily returns. Yet, if market sentiment is taken into account, the relationship comes to the surface for returns as well. The returns tend to decrease with attention hikes in negative sentiment periods and the opposite is observed for periods of positive sentiment, suggesting that Google web search predominantly captures the attention of sentiment investors. Moreover, I demonstrate that with the outbreak of the financial crisis, the interdependence between attention and trading activity intensified. Lastly, I provide evidence that web search may shed some light on IPO-related puzzles. The initial returns seem to be higher for IPOs that receive above-average attention, and are likely to be reversed in the long term. In addition, web search volume may act as a proxy for market overreaction to the offerings.
76

Projeção multidimensional aplicada a visualização de resultados de busca textual / Multidimensional projection applied to textual search results visualization

Nieto, Erick Mauricio Gómez 30 August 2012
Internet users are very familiar with the results of a search query displayed as a ranked list of snippets. Each textual snippet shows a content summary of the referred document (or web page) and a link to it. This display has many advantages, e.g., it affords easy navigation and is straightforward to interpret. Nonetheless, any user of search engines could possibly report some experience of disappointment with this metaphor. Indeed, it has limitations in particular situations, as it fails to provide an overview of the document collection retrieved. Moreover, depending on the nature of the query - e.g., it may be too general, or ambiguous, or ill expressed - the desired information may be poorly ranked, or results may cover varied topics. Several search tasks would be easier if users were shown an overview of the returned documents, organized so as to reflect how related they are, content-wise. We propose a visualization technique to display the results of web queries aimed at overcoming such limitations. It combines the neighborhood preservation capability of multidimensional projections with the familiar snippet-based representation by employing a multidimensional projection to derive two-dimensional layouts of the query search results that preserve text similarity relations, or neighborhoods. Similarity is computed by applying the cosine similarity over a bag-of-words vector representation of the collection built from the snippets. If the snippets are displayed directly according to the derived layout they will overlap considerably, producing a poor visualization.
We overcome this problem by defining an energy functional that considers both the overlap among snippets and the preservation of the neighborhood structure as given in the projected layout. Minimizing this energy functional provides a neighborhood-preserving two-dimensional arrangement of the textual snippets with minimum overlap. The resulting visualization conveys both a global view of the query results and visual groupings that reflect related results, as illustrated in several examples.
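The similarity step named in this abstract (cosine similarity over a bag-of-words representation of the snippets) can be illustrated with a minimal sketch. The tokenizer, the raw term counts, and the sample snippets below are assumptions made for illustration; the thesis may use a different weighting or preprocessing.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens (assumed tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty snippet is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical snippets standing in for search results.
snippets = [
    "Jaguar is a luxury car maker based in England",
    "The jaguar is a large cat native to the Americas",
    "Used car prices for luxury models",
]
vectors = [bag_of_words(s) for s in snippets]

# Pairwise similarity matrix; a projection method would take this
# (or the corresponding distances) as input.
similarity = [[cosine_similarity(u, v) for v in vectors] for u in vectors]
for row in similarity:
    print(" ".join(f"{value:.2f}" for value in row))
```

A matrix like this only supplies the neighborhood information; the two-dimensional layout and the overlap-removal step described in the abstract are separate stages not sketched here.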
77

Improving the relevance of search results via search-term disambiguation and ontological filtering

Zhu, Dengya January 2007
With the exponential growth of the Web and the inherent polysemy and synonymy problems of natural languages, search engines face many challenges such as information overload, mismatched search results, missing relevant documents, poorly organized search results, and a mismatch between human mental models and clustering engines. To address these issues, much effort, including employing different information retrieval (IR) models, information categorization/clustering, personalization, the semantic Web, and ontology-based IR, has been devoted to improving the relevance of search results. The major focus of this study is to dynamically re-organize Web search results under a socially constructed hierarchical knowledge structure, to help information seekers access and manipulate the retrieved search results, and consequently to improve the relevance of search results. / To achieve the above research goal, a special search-browser is developed, and its retrieval effectiveness is evaluated. The hierarchical structure of the Open Directory Project (ODP) is employed as the socially constructed knowledge structure, which is represented by the Tree component of Java. The Yahoo! Search Web Services API is utilized to obtain search results directly from Yahoo! search engine databases. The Lucene text search engine calculates similarities between each returned search result and the semantic characteristics of each category in the ODP, and the search results are then assigned to the corresponding ODP categories by a Majority Voting algorithm. When a category of interest is selected by a user, only search results categorized under that category are presented to the user, and the quality of the search results is consequently improved. / Experiments demonstrate that the proposed approach can improve the precision of Yahoo! search results at the 11 standard recall levels from an average of 41.7 per cent to 65.2 per cent, an improvement of 23.5 percentage points.
This conclusion is verified by comparing the P@5 and P@10 of Yahoo! search results with those of the categorized search results of the special search-browser. The improvements in P@5 and P@10 are 38.3 percentage points (85 per cent - 46.7 per cent) and 28 percentage points (70 per cent - 42 per cent) respectively. The experiment of this research is well designed and controlled. To minimize the subjectiveness of relevance judgments, five judges (experts) are asked to make their relevance judgments independently, and the final relevance judgment is a combination of the five judges’ judgments. The judges are presented with only search-terms, information needs, and the 50 search results of the Yahoo! Search Web Service API. They are asked to make relevance judgments based on this information alone; no categorization information is provided. / The first contribution of this research is to use an extracted category-document to represent the semantic characteristics of each of the ODP categories. A category-document is composed of the topic of the category, the description of the category, and the titles and brief descriptions of the submitted Web pages under this category. Experimental results demonstrate that the category-documents of the ODP can represent the semantic characteristics of the ODP in most cases. Furthermore, for machine learning algorithms, the extracted category-documents can be utilized as training data, which would otherwise demand much human labor to create in order to ensure that the learning algorithm is properly trained. The second contribution of this research is the suggestion of the new concepts of relevance judgment convergent degree and relevance judgment divergent degree, which are used to measure how well different judges agree with each other when they are asked to judge the relevance of a list of search results. When the relevance judgment convergent degree of a search-term is high, an IR algorithm should obtain a higher precision as well.
On the other hand, if the relevance judgment convergent degree is low, or the relevance judgment divergent degree is high, it is questionable whether the data should be used to evaluate the IR algorithm. This intuition is supported by the experiment of this research. The last contribution of this research is that the developed search-browser is, to the best of my knowledge, the first IR system (IRS) to utilize the ODP hierarchical structure to categorize and filter search results.
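The P@5 and P@10 figures quoted in this abstract are precision-at-k values: the fraction of the top k ranked results judged relevant. A minimal sketch of that calculation follows, using the standard definition; the relevance lists are hypothetical and are not data from the thesis.

```python
def precision_at_k(relevance, k):
    """Fraction of the top-k results judged relevant.

    `relevance` is a ranked list of booleans, best-ranked result first.
    Rankings shorter than k are still divided by k (standard convention).
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return sum(1 for judged in relevance[:k] if judged) / k


# Hypothetical judgments for ten ranked results (True = judged relevant).
baseline = [True, False, False, True, False, False, True, False, False, True]
filtered = [True, True, False, True, True, True, False, True, False, True]

for k in (5, 10):
    before = precision_at_k(baseline, k)
    after = precision_at_k(filtered, k)
    gain = (after - before) * 100
    print(f"P@{k}: {before:.2f} -> {after:.2f} ({gain:.1f} percentage points)")
```

The gain is reported as a difference between two percentages, which is how the bracketed subtractions in the abstract are formed.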
78

Learning an integrated hybrid image retrieval system

Jing, Yushi 06 January 2012
Current Web image search engines, such as Google or Bing Images, adopt a hybrid search approach in which a text-based query (e.g. "apple") is used to retrieve a set of relevant images, which are then refined by the user (e.g. by re-ranking the retrieved images based on similarity to a selected example). This approach makes it possible to use both text information (e.g. the initial query) and image features (e.g. as part of the refinement stage) to identify images which are relevant to the user. One limitation of these current systems is that text and image features are treated as independent components and are often used in a decoupled manner. This work proposes to develop an integrated hybrid search method which leverages the synergies between text and image features. Recently, there has been tremendous progress in the computer vision community in learning models of visual concepts from collections of example images. While impressive performance has been achieved on standardized data sets, scaling these methods so that they are capable of working at web scale remains a significant challenge. This work will develop approaches to visual modeling that can be scaled to address the task of retrieving billions of images on the Web. Specifically, we propose to address two research issues related to integrated text- and image-based retrieval. First, we will explore whether models of visual concepts which are learned from collections of web images can be utilized to improve the image ranking associated with a text-based query. Second, we will investigate the hypothesis that the click-patterns associated with standard web image search engines can be utilized to learn query-specific image similarity measures that support improved query-refinement performance. We will evaluate our research by constructing a prototype integrated hybrid retrieval system based on the data from 300K real-world image queries. 
We will conduct user studies to evaluate the effectiveness of our learned similarity measures and quantify the benefit of our method in real-world search tasks such as target search.
79

Surfing for knowledge : how undergraduate students use the internet for research and study purposes.

Phillips, Genevieve. January 2013
The developments in technology and concomitant access to the Internet have reshaped the way people research in their personal and academic lives. The ever-expanding amount of information on the Internet is creating an environment where users are able to find what they seek, add to the body of knowledge, or both. Researching, especially for academic purposes, has been greatly impacted by the Internet’s rapid growth and expansion. This project stemmed from a desire to understand how students’ research methods have evolved when taking into account their busy schedules and needs. The availability and accessibility of the Internet have increased its use considerably as a straightforward medium from which users obtain desired information. The aim of this thesis was to ascertain in what manner senior undergraduate students at the University of KwaZulu-Natal, Pietermaritzburg campus, use the Internet for academic research purposes, which is largely determined by the individual’s personal preference and access to the Internet. The literature review raised pertinent questions that required answers. Students were interviewed to determine when, why and how they began using the Internet, and how this usage contributes to their academic work; whether it aids or inhibits students’ research. Through collection and analysis of data, evidence emerged that students followed contemporary research methods, making extensive use of the Internet, while a few use both forms of resources, unless compelled by lecturers when following assignment requirements. As a secondary phase, based on the results received from the students, lecturers were interviewed. Differing levels of restrictions on students were evident; the lecturers themselves use the Internet for academic research purposes. Lecturers were convinced they had the understanding and experience to discern what was relevant and factual. Referring to the Internet for research is becoming more popular.
This should continue to increase as students’ lives become more complex. This research project offers a suggestion to academic staff: equip students from their early University years with the standards they should follow in order to research correctly, rather than limiting their use of the Internet, which leads in part to students committing plagiarism while unaware of the wealth of reputable resources available for their use and benefit on the Internet. / Thesis (M.A.)-University of KwaZulu-Natal, Pietermaritzburg, 2013.
80

Building a search engine for music and audio on the World Wide Web

Knopke, Ian January 2005
The main contribution of this dissertation is a system for locating and indexing audio files on the World Wide Web. The idea behind this system is that the use of both web page and audio file analysis techniques can produce more relevant information for locating audio files on the web than is used in full-text search engines. / The most important part of this system is a web crawler that finds materials by following hyperlinks between web pages. The crawler is distributed and operates using multiple computers across a network, storing results to a database. There are two main components: a set of retrievers that retrieve pages and audio files from the web, and a central crawl manager that coordinates the retrievers and handles data storage tasks. / The crawler is designed to locate three types of audio files: AIFF, WAVE, and MPEG-1 (MP3), but other types can be easily added to the system. Once audio files are located, analyses are performed of both the audio files and the associated web pages that link to these files. Information extracted by the crawler can be used to build search indexes for resolving user queries. A set of results demonstrating aspects of the performance of the crawler is presented, as well as some statistics and points of interest regarding the nature of audio files on the web.
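One way a crawler of this kind might decide which hyperlinks point at audio files is to inspect the file extension in the URL path. The sketch below assumes that approach; the abstract does not say how the dissertation's crawler detects file types, and the extension table and the `audio_type` helper are hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# Assumed mapping from file extension to the audio types named in the abstract.
# Supporting another type would mean adding an entry here.
AUDIO_EXTENSIONS = {
    ".aif": "AIFF",
    ".aiff": "AIFF",
    ".wav": "WAVE",
    ".wave": "WAVE",
    ".mp3": "MP3",
}


def audio_type(url):
    """Return the audio type suggested by the URL's extension, or None."""
    path = urlparse(url).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return AUDIO_EXTENSIONS.get(extension)


# Hypothetical links extracted from a crawled page.
links = [
    "http://example.org/music/Track01.MP3?session=42",
    "http://example.org/samples/drum.aiff",
    "http://example.org/index.html",
]
audio_links = {url: audio_type(url) for url in links if audio_type(url)}
print(audio_links)
```

Extension checks are cheap but can be wrong; a retriever could confirm the type from the response headers or the file contents after download.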
