  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Building a search engine for music and audio on the World Wide Web

Knopke, Ian January 2005 (has links)
The main contribution of this dissertation is a system for locating and indexing audio files on the World Wide Web. The idea behind this system is that combining web page and audio file analysis techniques can produce more relevant information for locating audio files on the web than the full-text techniques used by conventional search engines. / The most important part of this system is a web crawler that finds materials by following hyperlinks between web pages. The crawler is distributed and operates using multiple computers across a network, storing results to a database. There are two main components: a set of retrievers that retrieve pages and audio files from the web, and a central crawl manager that coordinates the retrievers and handles data storage tasks. / The crawler is designed to locate three types of audio files: AIFF, WAVE, and MPEG-1 (MP3), but other types can easily be added to the system. Once audio files are located, analyses are performed of both the audio files and the associated web pages that link to them. Information extracted by the crawler can be used to build search indexes for resolving user queries. A set of results demonstrating aspects of the crawler's performance is presented, as well as some statistics and points of interest regarding the nature of audio files on the web.
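The two-component design described above, a set of retrievers coordinated by a central crawl manager, can be sketched in miniature. This is an illustrative reconstruction, not the dissertation's code: `CrawlManager`, `retriever`, and `AUDIO_EXTENSIONS` are invented names, and network fetching is replaced by a caller-supplied `fetch` function so the sketch stays self-contained.

```python
import queue
import threading

AUDIO_EXTENSIONS = (".aiff", ".wav", ".mp3")  # AIFF, WAVE, MPEG-1 (MP3)

class CrawlManager:
    """Central coordinator: holds the URL frontier and stores located audio."""
    def __init__(self, seeds):
        self.frontier = queue.Queue()
        self.seen = set()
        self.audio_files = []
        self.lock = threading.Lock()
        for url in seeds:
            self.enqueue(url)

    def enqueue(self, url):
        # Only schedule URLs we have not seen before.
        with self.lock:
            if url not in self.seen:
                self.seen.add(url)
                self.frontier.put(url)

    def store(self, url):
        # Stands in for the database the real crawler writes to.
        with self.lock:
            if url not in self.audio_files:
                self.audio_files.append(url)

def retriever(manager, fetch):
    """Worker loop: fetch a page, record audio links, enqueue page links."""
    while True:
        try:
            url = manager.frontier.get(timeout=0.5)
        except queue.Empty:
            return  # frontier drained; this worker exits
        for link in fetch(url):  # fetch() returns the hyperlinks on the page
            if link.lower().endswith(AUDIO_EXTENSIONS):
                manager.store(link)
            else:
                manager.enqueue(link)
```

Several `retriever` threads (or, as in the dissertation, processes on multiple machines) can share one manager, since the frontier queue and the lock serialize access to shared state.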
82

Websites are capable of reflecting a particular human temperament: fact or fad?

Theron, Annatjie. January 2008 (has links)
Thesis (MIT(Informatics))--University of Pretoria, 2008. / Abstract in English and Afrikaans. Includes bibliographical references.
83

Search engine poisoning and its prevalence in modern search engines

Blaauw, Pieter January 2013 (has links)
The prevalence of Search Engine Poisoning in trending topics and popular search terms on the web within search engines is investigated. Search Engine Poisoning is the act of manipulating search engines in order to display search results from websites infected with malware. Research done between February and August 2012, using both manual and automated techniques, shows how easily the criminal element manages to insert malicious content into web pages related to popular search terms within search engines. In order to provide the reader with a clear overview and understanding of the motives and methods of the operators of Search Engine Poisoning campaigns, an in-depth review of automated and semi-automated web exploit kits is done, as well as a look into the motives for running these campaigns. Three high-profile case studies are examined, and the various Search Engine Poisoning campaigns associated with these case studies are discussed in detail. From February to August 2012, data was collected from the top trending topics on Google's search engine along with the top listed sites related to these topics, and then passed through various automated tools to discover whether these results had been infiltrated by the operators of Search Engine Poisoning campaigns; the results of these automated scans are then discussed in detail. During the research period, manual searching for Search Engine Poisoning campaigns was also done, using high-profile news events and popular search terms. These results are analysed in detail to determine the methods of attack, the purpose of the attack and the parties behind it.
84

Search engine strategies: a model to improve website visibility for SMME websites

Chambers, Rickard January 2005 (has links)
THESIS Submitted in fulfilment of the requirements for the degree MAGISTER TECHNOLOGIAE in INFORMATION TECHNOLOGY in the FACULTY OF BUSINESS INFORMATICS at the CAPE PENINSULA UNIVERSITY OF TECHNOLOGY 2005 / The Internet has become the fastest growing technology the world has ever seen. It also has the ability to permanently change the face of business, including e-business. The Internet has become an important tool required to gain potential competitiveness in the global information environment. Companies could improve their levels of functionality and customer satisfaction by adopting e-commerce, which ultimately could improve their long-term profitability. Those companies who do adopt the use of the Internet often fail to gain the advantage of providing a visible website. Research has also shown that even though the web provides numerous opportunities, the majority of SMMEs (small, medium and micro enterprises) are often ill equipped to exploit the web's commercial potential. It was determined in this research project, through the analysis of 300 websites, that only 6.3% of SMMEs in the Western Cape Province of South Africa appear within the top 30 results of six search engines when searching for services/products. This lack of ability to produce a visible website is believed to be due to the lack of education and training, financial support and availability of time prevalent in SMMEs. For this reason a model was developed to facilitate the improvement of SMME website visibility. To develop the visibility model, this research project was conducted to identify potential elements which could provide a possible increase in website visibility. A criteria list of these elements was used to evaluate a sample of websites, to determine to what extent they made use of these potential elements. An evaluation was then conducted with 144 different SMME websites by searching for nine individual keywords within four search engines (Google, MSN, Yahoo, Ananzi), and using the first four results of every keyword from every search engine for analysis. Elements gathered through academic literature were then listed according to the usage of these elements in the top-ranking websites when searching for predetermined keywords. Further qualitative research was conducted to triangulate the data gathered from the literature and the quantitative research. The evaluative results provided the researcher with possible elements / designing techniques to formulate a model for developing a visible website that is supported not only by research, but also by real current applications. The research concluded that, as time progresses and technology improves, new ways to improve website visibility will evolve. Furthermore, there is no quick method for businesses to produce a visible website, as there are many aspects that should be considered when developing "visible" websites.
85

An exploratory study of techniques in passive network telescope data analysis

Cowie, Bradley January 2013 (has links)
Careful examination of the composition and concentration of malicious traffic in transit on the channels of the Internet provides network administrators with a means of understanding and predicting damaging attacks directed towards their networks. This allows for action to be taken to mitigate the effect that these attacks have on the performance of their networks and the Internet as a whole, by readying network defences and providing early warning to Internet users. One approach to malicious traffic monitoring that has garnered some success in recent times, as exhibited by the study of fast-spreading Internet worms, involves analysing data obtained from network telescopes. While some research has considered using measures derived from network telescope datasets to study large-scale network incidents such as Code-Red, SQLSlammer and Conficker, there is very little documented discussion on the merits and weaknesses of approaches to analysing network telescope data. This thesis is an introductory study in network telescope analysis and aims to consider the variables associated with the data received by network telescopes and how these variables may be analysed. The core research of this thesis considers both novel and previously explored analysis techniques from the fields of security metrics, baseline analysis, statistical analysis and technical analysis as applied to network telescope datasets. These techniques were evaluated as approaches to recognizing unusual behaviour by observing their ability to identify notable incidents in network telescope datasets.
86

Using Wikipedia Knowledge and Query Types in a New Indexing Approach for Web Search Engines

Al-Akashi, Falah Hassan Ali January 2014 (has links)
The Web is comprised of a vast quantity of text. Modern search engines struggle to index it independent of the structure of queries and type of Web data, and commonly use indexing based on the Web's graph structure to identify high-quality relevant pages. However, despite the apparent widespread use of these algorithms, Web indexing based on human feedback and document content is controversial. There are many fundamental questions that need to be addressed, including: How many types of domains/websites are there in the Web? What type of data is in each type of domain? For each type, which segments/HTML fields in the documents are most useful? What are the relationships between the segments? How can web content be indexed efficiently in all forms of document configurations? Our investigation of these questions has led to a novel way to use Wikipedia to find the relationships between the query structures and document configurations throughout the document indexing process and to use them to build an efficient index that allows fast indexing and searching, and optimizes the retrieval of highly relevant results. We consider the top page on the ranked list to be highly important in determining the types of queries. Our aim is to design a powerful search engine with a strong focus on how to make the first page highly relevant to the user, and on how to retrieve other pages based on that first page. Through processing the user query using the Wikipedia index and determining the type of the query, our approach could trace the path of a query in our index, and retrieve specific results for each type. We use two kinds of data to increase the relevancy and efficiency of the ranked results: offline and real-time. Traditional search engines find it difficult to use these two kinds of data together, because building a real-time index from social data and integrating it with the index for the offline data is difficult in a traditional distributed index.
As a source of offline data, we use data from the Text Retrieval Conference (TREC) evaluation campaign. The web track at TREC offers researchers the chance to investigate different retrieval approaches for web indexing and searching. The crawled offline dataset makes it possible to design powerful search engines that extend current methods, and to evaluate and compare them. We propose a new indexing method, based on the structures of the queries and the content of documents. Our search engine uses a core index for offline data and a hash index for real-time data, which leads to improved performance. The TREC Web track evaluation of our experiments showed that our approach can be successfully employed for different types of queries. We evaluated our search engine on different sets of queries from the TREC 2010, 2011 and 2012 Web tracks. Our approach achieved very good results on the TREC 2010 training queries. On the TREC 2011 testing queries, our approach was one of the six best compared to all other approaches (including those that used a very large corpus of 500 million documents), and it was second best when compared to approaches that used only part of the corpus (50 million documents), as ours did. On the TREC 2012 testing queries, our approach was second best compared to all the approaches, and first compared only to systems that used the subset of 50 million documents.
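A minimal sketch of the two-index arrangement named above, a prebuilt ("core") inverted index over the offline collection plus a mutable hash index for real-time documents, might look as follows. All class and function names here are illustrative assumptions; the thesis's actual index structures are considerably more elaborate.

```python
# Hedged sketch, assuming a simple term -> document-id postings layout for
# both indexes and conjunctive (AND) query semantics.
from collections import defaultdict

def tokenize(text):
    return text.lower().split()

class CoreIndex:
    """Inverted index built once over the offline (e.g. TREC) collection."""
    def __init__(self, docs):
        self.postings = defaultdict(set)
        for doc_id, text in docs.items():
            for term in tokenize(text):
                self.postings[term].add(doc_id)

    def lookup(self, term):
        return self.postings.get(term, set())

class HashIndex:
    """Mutable hash index that accepts real-time documents as they arrive."""
    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def lookup(self, term):
        return self.postings.get(term, set())

def search(query, core, realtime):
    """AND-query over both indexes; union of offline and real-time hits."""
    terms = tokenize(query)
    hits_core = set.intersection(*(core.lookup(t) for t in terms))
    hits_rt = set.intersection(*(realtime.lookup(t) for t in terms))
    return hits_core | hits_rt
```

The design point this illustrates is that the core index can be built and optimized once, while the hash index only ever grows by cheap insertions, so the two can be merged at query time rather than at indexing time.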
87

A multi-agent collaborative personalized web mining system model.

Oosthuizen, Ockmer Louren 02 June 2008 (has links)
The Internet and world wide web (WWW) have, in recent years, grown exponentially in size and in the volume of information available on them. In order to deal effectively with the huge amount of information on the web, so-called web search engines have been developed for the task of retrieving useful and relevant information for their users. Unfortunately, these web search engines have not kept pace with the rapid growth and commercialization of the web. The main goal of this dissertation is the development of a model for a collaborative personalized meta-search agent (COPEMSA) system for the WWW. This model enables the personalization of web search for users. Furthermore, the model aims to leverage current search engines on the web, as well as to enable collaboration between users of the search system for the purpose of sharing useful resources. The model also employs multiple intelligent agents and web content mining techniques. This enables the model to autonomously retrieve useful information for its user(s) and present this information in an effective manner. To achieve the above, the COPEMSA model employs multiple intelligent agents. COPEMSA consists of five core components: a user agent, a query agent, a community agent, a content mining agent and a directed web spider. The user agent learns about the user in order to introduce personal preference into user queries. The query agent is a scaled-down meta-search engine with the task of submitting the personalized queries it receives from the user agent to multiple search services on the WWW. The community agent enables the search system to communicate with, and leverage, the search experiences of a community of searchers. The content mining agent is responsible for analysis of the results retrieved from the WWW and the presentation of these results to the system user.
Finally, a directed web spider is used by the content mining agent to retrieve from the WWW the actual web pages it analyzes. In this dissertation an additional model is also presented to deal with a specific problem all web spidering software must address, namely content and link encapsulation. / Prof. E.M. Ehlers
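The user-agent/query-agent interaction described above can be sketched as follows. This is a hypothetical miniature, not the COPEMSA implementation: `UserAgent`, `QueryAgent`, and the stub search services are invented for illustration, and the community, content-mining, and spider components are omitted.

```python
class UserAgent:
    """Learns terms reflecting the user's interests and injects them
    into queries (the learning itself is out of scope for this sketch)."""
    def __init__(self, preferences):
        self.preferences = preferences

    def personalize(self, query):
        # Append preference terms not already present in the query.
        extra = [p for p in self.preferences if p not in query.split()]
        return " ".join([query] + extra)

class QueryAgent:
    """Scaled-down meta-search: submit a query to several services
    and merge their result lists, deduplicating across services."""
    def __init__(self, services):
        self.services = services  # callables: query -> list of result URLs

    def search(self, query):
        merged, seen = [], set()
        for service in self.services:
            for url in service(query):
                if url not in seen:
                    seen.add(url)
                    merged.append(url)
        return merged
```

In this arrangement each underlying search engine is wrapped as a plain callable, so the query agent can leverage existing services without knowing their internals, which mirrors the meta-search role the abstract assigns to it.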
88

Multidimensional projection applied to textual search results visualization

Erick Mauricio Gómez Nieto 30 August 2012 (has links)
Internet users are very familiar with the results of a search query displayed as a ranked list of snippets. Each textual snippet shows a content summary of the referred document (or web page) and a link to it. This display has many advantages, e.g., it affords easy navigation and is straightforward to interpret. Nonetheless, any user of search engines could report some experience of disappointment with this metaphor. Indeed, it has limitations in particular situations, as it fails to provide an overview of the retrieved document collection. Moreover, depending on the nature of the query, e.g., it may be too general, ambiguous, or ill expressed, the desired information may be poorly ranked, or the results may span varied topics. Several search tasks would be easier if users were shown an overview of the returned documents, organized so as to reflect how related they are, content-wise. We propose a visualization technique to display the results of web queries that aims to overcome such limitations. It combines the neighborhood-preservation capability of multidimensional projections with the familiar snippet-based representation, employing a multidimensional projection to derive two-dimensional layouts of the query search results that preserve text similarity relations, or neighborhoods. Similarity is computed by applying the cosine similarity over a bag-of-words vector representation of the collection built from the snippets. If the snippets are displayed directly according to the derived layout they overlap considerably, producing a poor visualization. We overcome this problem by defining an energy functional that considers both the overlap amongst snippets and the preservation of the neighborhood structure given by the projected layout. Minimizing this energy functional yields a neighborhood-preserving two-dimensional arrangement of the textual snippets with minimum overlap. The resulting visualization conveys both a global view of the query results and visual groupings that reflect related results, as illustrated in several of the examples shown.
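The similarity computation named in the abstract, cosine similarity over bag-of-words vectors built from snippet text, is a standard construction and can be shown concretely. The function names here are illustrative; the multidimensional projection and the overlap-removal energy minimization are beyond this small fragment.

```python
import math
from collections import Counter

def bag_of_words(snippet):
    """Sparse term-frequency vector of a snippet (no stemming or stop-word
    removal in this sketch)."""
    return Counter(snippet.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors:
    dot(a, b) / (|a| * |b|), ranging from 0 (no shared terms) to 1."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A projection technique then takes the resulting pairwise similarities (or the corresponding distances) and produces a 2D point per snippet that preserves the neighborhoods.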
89

Building a search engine for music and audio on the World Wide Web

Knopke, Ian January 2005 (has links)
No description available.
90

A Generic Construct based Workload Model for Web Search

柯怡芬, Ke, I Fen Unknown Date (has links)
Web search services are a vital way to find information on the web. However, not every piece of information found is relevant or useful. In order to improve search accuracy, most designers of web search engines devote themselves to developing and optimizing search algorithms. From the literature, we find that there are few open or flexible performance evaluation methods for web search services. The objective of this research is to develop a more flexible workload model, based on generic constructs, for web search benchmarking, and to build an automated benchmarking environment for performance evaluation. Generic constructs are the major components that represent a web search algorithm. We collect and review literature related to web search algorithms and benchmarking, and identify the generic constructs of key web search algorithms. The workload model consists of a page model, a query model and a control model. The page model describes the web page structure in web search. The query model defines the criteria used to query the web search engines. The control model defines the variables used to set up the benchmark environment. Finally, we validate the research model through a prototype implementation.
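The three-part workload model named above (page, query and control models) could be represented, in the spirit of the abstract, as plain configuration dataclasses. Every field name below is an invented assumption for illustration; the thesis defines its own generic constructs.

```python
from dataclasses import dataclass, field

@dataclass
class PageModel:
    """Describes the web page structure the benchmark searches over."""
    page_count: int = 1000
    avg_links_per_page: int = 10
    avg_terms_per_page: int = 300

@dataclass
class QueryModel:
    """Criteria used to query the search engine under test."""
    query_terms: list = field(default_factory=lambda: ["example"])
    queries_per_second: float = 5.0

@dataclass
class ControlModel:
    """Variables that configure the benchmark environment itself."""
    warmup_seconds: int = 30
    run_seconds: int = 300
    concurrent_clients: int = 4

@dataclass
class WorkloadModel:
    """Bundles the three sub-models into one workload description."""
    page: PageModel = field(default_factory=PageModel)
    query: QueryModel = field(default_factory=QueryModel)
    control: ControlModel = field(default_factory=ControlModel)
```

Keeping the three sub-models separate lets a benchmark vary one dimension (say, query rate) while holding the page corpus and environment fixed, which is the flexibility the abstract argues for.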
