11 |
Web Search Using Genetic Programming / Wu, Jain-Shing. 11 September 2002 (has links)
Locating and retrieving needed information from the Internet is an important issue. Existing search engines may return too much useless and redundant information. Because search features differ from one search engine to another, it's very difficult to find an optimal search scheme for all subjects. In this paper, we propose a genetic programming web search system (GPWS) to generate exact queries according to a user's interests. The system can retrieve information from the search engines, filter the retrieved results, and remove redundant and useless results. The filtered results are displayed on a uniform user interface. Compared with randomly generated queries, the similarity between the results and the user's interests is improved.
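The abstract gives no implementation details, so the following is only a rough sketch of the genetic-programming idea behind a system like GPWS: evolve keyword queries against a fitness that rewards agreement with a user's interests. The interest profile, vocabulary, operators and parameters below are all invented stand-ins, not GPWS internals.

```python
# A minimal sketch of evolving search queries with genetic operators.
# Everything here (interest profile, fitness, operators, parameters) is a
# hypothetical stand-in for the GPWS components described in the abstract.
import random

INTERESTS = {"genetic", "programming", "web", "search", "query"}  # assumed profile
VOCABULARY = list(INTERESTS | {"cat", "weather", "music", "stocks", "recipe"})

def fitness(query):
    # Stand-in for "similarity of results and user's interests":
    # here, simply the fraction of query terms that match the profile.
    return len(set(query) & INTERESTS) / len(query)

def mutate(query):
    q = query[:]
    q[random.randrange(len(q))] = random.choice(VOCABULARY)
    return q

def crossover(a, b):
    cut = random.randrange(1, min(len(a), len(b)))
    return a[:cut] + b[cut:]

def evolve(pop_size=20, query_len=4, generations=30):
    population = [[random.choice(VOCABULARY) for _ in range(query_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

print(evolve())  # e.g. ['web', 'search', 'genetic', 'query']
```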
|
12 |
Search engine optimisation or paid placement systems: user preference / Neethling, Riaan. January 2007 (has links)
Thesis (MTech (Information Technology))--Cape Peninsula University of Technology, 2007. / Includes bibliographical references (leaves 98-113). Also available online.
|
13 |
Search algorithms for discovery of Web services / Hicks, Janette M. January 2005 (has links)
Thesis (M.S.)--State University of New York at Binghamton, Watson School of Engineering and Applied Science (Computer Science), 2005. / Includes bibliographical references.
|
14 |
Evaluation and comparison of search engines / Mtshontshi, Lindiwe. 12 1900 (has links)
Thesis (MPhil)--Stellenbosch University, 2004. / ENGLISH ABSTRACT: A growing body of studies is developing approaches to evaluate human interaction with Web search engines. Measuring the information retrieval effectiveness of World Wide Web search engines is costly because of the human relevance judgements involved. However, it is important for both business enterprises and individuals to know the most effective Web search engines, since such engines help their users find a higher number of relevant Web pages with less effort. Furthermore, this information can be used for several practical purposes. This study does not attempt to describe all the currently available search engines, but provides a comparison of some that are deemed to be among the most useful. It concentrates on search engines and their characteristics only. The goal is to help a new user get the most useful "hits" when using the various tools. / AFRIKAANS ABSTRACT: More and more studies are developing approaches for evaluating human interaction with Web search engines. Measuring how effectively a search engine retrieves information on the World Wide Web is expensive because of the human relevance judgements involved. It is nevertheless important for managers of business enterprises and other people to know which search engines are the most effective, since such engines help their users find a higher number of relevant Web pages with less effort. This information can also be used to achieve several practical goals. No attempt is made to describe all currently available search engines; rather, some of those considered the most useful are compared. Only search engines and their characteristics are considered. The goal is to help the new user obtain the most useful information by making use of various tools.
|
15 |
Search engine optimisation or paid placement systems: user preference / Neethling, Riaan. January 2007 (has links)
Thesis submitted in fulfilment of the requirements for the degree Magister Technologiae in Information Technology in the Faculty of Informatics and Design at the Cape Peninsula University of Technology, 2007. / The objective of this study was to investigate and report on user preference for Search Engine Optimisation (SEO) versus Pay Per Click (PPC) results. This will assist online advertisers in identifying the optimal Search Engine Marketing (SEM) strategy for their specific target market.
Research shows that online advertisers perceive PPC as a more effective SEM strategy than SEO. However, empirical evidence exists that PPC may not be the best strategy for online advertisers, creating confusion for advertisers considering an SEM campaign. Furthermore, not all advertisers have the funds to implement a dual strategy, and as a result advertisers need to choose between an SEO and a PPC campaign. In order for online advertisers to choose the most relevant SEM strategy, it is important to understand user perceptions of these strategies.
A quantitative research design was used to conduct the study, with the purpose of collecting and analysing data. A questionnaire was designed and hosted on a busy website to ensure maximum exposure. The questionnaire focused on how search engine users perceive SEM and on their click response towards SEO and PPC results respectively. A qualitative research method was also used, in the form of an interview conducted with representatives of a leading South African search engine to verify the results and obtain expert opinions.
The data was analysed and the results interpreted. Results indicated that users' perceived relevancy is split 45% for PPC results and 55% for SEO results, regardless of demographic factors. Failing to invest in either one could cause a significant loss of website traffic, which indicates that advertisers should invest in both PPC and SEO. Advertisers can invest in a PPC campaign for immediate results, and then implement an SEO campaign over a period of time. The results can further be used to adjust an SEM strategy according to the target market profile of an advertiser, which will ensure maximum effectiveness.
|
16 |
The crossover point between keyword rich website text and spamdexing / Zuze, Herbert. January 2011 (has links)
Thesis submitted in fulfilment of the requirements for the degree Magister Technologiae in Business Information Systems in the Faculty of Business at the Cape Peninsula University of Technology, 2011. / With over a billion Internet users surfing the Web daily in search of information, buying, selling and accessing social networks, marketers focus intensively on developing websites that appeal to both searchers and search engines. Millions of webpages are submitted to search engines for indexing each day. The success of a search engine lies in its ability to provide accurate search results. Search engines' algorithms constantly evaluate websites and webpages that could violate their respective policies, and for this reason some websites and webpages are blacklisted from their indices.
Websites are increasingly being utilised as marketing tools, which results in major competition amongst websites. Website developers strive to develop websites of high quality that are unique and content rich, as this will assist them in obtaining a high ranking from search engines. In focusing on websites of a high standard, website developers utilise search engine optimisation (SEO) strategies to earn a high search engine ranking.
From time to time SEO practitioners abuse SEO techniques in order to trick the search engine algorithms, but the algorithms are programmed to identify and flag these techniques as spamdexing. Search engines do not clearly explain how they interpret keyword stuffing (one form of spamdexing) in a webpage. They regard spamdexing in many different ways and do not provide enough detail to clarify what crawlers take into consideration when interpreting the spamdexing status of a website. Furthermore, search engines differ in the way that they interpret spamdexing, and offer no clear quantitative evidence for the crossover point from keyword-dense website text to spamdexing. Scholars have expressed different views on spamdexing, characterised by different keyword density measurements in the body text of a webpage. This raised several fundamental questions that form the basis of this research.
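The thesis's own measurement code is not reproduced in the abstract; as a minimal sketch under the common definition of keyword density (occurrences of a keyword divided by the total number of body-text words), a calculator might look like this. The tokenisation and the example text are assumptions.

```python
# A minimal keyword-density sketch using the common definition:
# occurrences of the keyword divided by the total number of body-text words.
# The tokenisation and example text are assumptions, not the thesis's code.
import re

def keyword_density(body_text: str, keyword: str) -> float:
    words = re.findall(r"[a-z0-9']+", body_text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w == keyword.lower())
    return hits / len(words)

text = "cheap flights cheap flights book cheap flights today"
print(f"{keyword_density(text, 'cheap'):.1%}")  # 37.5%: 3 of the 8 words
```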
This research was carried out using triangulation in order to determine how scholars, search engines and SEO practitioners interpret spamdexing. Five websites with varying keyword densities were designed and submitted to Google, Yahoo! and Bing. Two phases of the experiment were conducted and the results were recorded. During both phases almost all of the webpages, including the one with a 97.3% keyword density, were indexed. This enabled the research to conclusively rule out keyword stuffing, blacklisting and any form of penalisation. Designers are urged to concentrate instead on usability and on the sound values behind building a website.
The research explored the fundamental contribution of keywords to webpage indexing and visibility. Keywords, whether or not they are used at an optimum level of richness, result in website ranking and indexing. However, the focus should be on the way in which the end user would interpret the displayed content, rather than on how the search engine would react to it. Furthermore, spamdexing is likely to scare away potential clients and end users instead of attracting them, which is why the time spent on spamdexing should rather be used to produce quality content.
|
17 |
Using reinforcement learning to learn relevance ranking of search queries / Sandupatla, Hareesh. 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Web search has become a part of everyday life for hundreds of millions of users around the world. However, the effectiveness of a user's search depends vitally on the quality of the search result ranking. Even though enormous efforts have been made to improve ranking quality, there is still significant misalignment between search engine rankings and an end user's preference order. This is evident from the fact that, for many search results on major search and e-commerce platforms, many users ignore the top ranked results and click on lower ranked ones. Nevertheless, finding a ranking that suits all users is a difficult problem, as every user's need is different; an ideal ranking, then, is one preferred by the majority of users. This emphasizes the need for an automated approach that improves the search engine ranking dynamically by incorporating user clicks into the ranking algorithm. In existing search result ranking methodologies, this direction has not been explored profoundly.
A key challenge in using user clicks in search result ranking is that the relevance feedback learned from click data is imperfect. This is because a user is more likely to click a top ranked result than a lower ranked one, irrespective of the actual relevance of those results. This phenomenon is known as position bias, and it poses a major difficulty in obtaining an automated method for dynamically updating search rank orders.
In my thesis, I propose a set of methodologies which incorporate user clicks for the dynamic update of search rank orders. The updates are based on adaptive randomization of results using a reinforcement learning strategy, with user click activity treated as the reinforcement signal. Beginning at any rank order of the search results, the proposed methodologies are guaranteed to converge to a ranking which is close to the ideal rank order. Moreover, the use of a reinforcement learning strategy enables the proposed methods to overcome the position bias phenomenon.
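The thesis's exact algorithms are not given in the abstract; as a hedged sketch of the general idea (occasionally randomize adjacent results for exploration, and treat clicks as the reward that can promote a result past position bias), one simplified variant might look like the following. The swap rule, the update rule and the parameters are all assumptions.

```python
# A hedged sketch of click-driven re-ranking: with probability EPSILON a pair
# of adjacent results is swapped (exploration), and clicks act as the
# reinforcement signal raising a result's estimated relevance. These rules and
# parameters are assumptions, not the thesis's exact algorithms.
import random

EPSILON = 0.2   # exploration rate (assumed)
ALPHA = 0.1     # learning rate (assumed)

def present(ranking):
    """Return the list actually shown, sometimes with one exploratory swap."""
    shown = ranking[:]
    if random.random() < EPSILON:
        i = random.randrange(len(shown) - 1)
        shown[i], shown[i + 1] = shown[i + 1], shown[i]
    return shown

def update(value, shown, clicked_doc):
    """Reinforce the clicked document; decay every result ranked above it."""
    for doc in shown:
        if doc == clicked_doc:
            value[doc] += ALPHA * (1.0 - value[doc])
            break
        value[doc] += ALPHA * (0.0 - value[doc])  # examined but skipped

docs = ["d1", "d2", "d3", "d4"]
value = {d: 0.5 for d in docs}
for _ in range(1000):  # simulated search sessions
    shown = present(sorted(docs, key=value.get, reverse=True))
    update(value, shown, clicked_doc="d3")  # toy user who always wants d3
print(sorted(docs, key=value.get, reverse=True))  # d3 ends up on top
```

Because the toy user clicks d3 even when it is ranked low, d3's estimated value rises while the results skipped above it decay, which is the sense in which click feedback can counteract position bias.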
To measure the effectiveness of the proposed methods, I perform experiments using a simplified user behavior model which I call the color ball abstraction model. I evaluate the quality of the proposed methodologies using standard information retrieval metrics such as Precision at n (P@n), Kendall tau rank correlation, Discounted Cumulative Gain (DCG) and Normalized Discounted Cumulative Gain (NDCG). The experimental results clearly demonstrate the success of the proposed methodologies.
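These metrics are standard in information retrieval; for reference, a minimal sketch of two of them, P@n and NDCG under the common log2-discount formulation, is shown below. The relevance labels are invented for illustration.

```python
# Minimal implementations of two of the cited metrics, using their common
# formulations: P@n, and NDCG with a log2 position discount. The toy graded
# relevance labels below are invented for illustration.
import math

def precision_at_n(relevances, n):
    # Fraction of the top-n results that are relevant (relevance > 0).
    return sum(1 for r in relevances[:n] if r > 0) / n

def dcg(relevances):
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances))

def ndcg(relevances):
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

ranking = [3, 0, 2, 1, 0]          # graded relevance of results, top first
print(precision_at_n(ranking, 3))  # 0.666...
print(round(ndcg(ranking), 3))     # ~0.93; 1.0 only for the ideal order
```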
|
18 |
A Semantic Web based search engine with X3D visualisation of queries and results / Gkoutzis, Konstantinos. January 2013 (has links)
The Semantic Web project has introduced new techniques for managing information. Data can now be organised more efficiently, and in such a way that computers can take advantage of the relationships that characterise the given input to present more relevant output. Semantic Web based search engines can quickly educe exactly what needs to be found and retrieve it while avoiding information overload. Up until now, search engines have interacted with their users by asking them to look for words and phrases. We propose the creation of a new-generation Semantic Web search engine that will offer a visual interface for queries and results. To create such an engine, information input must be viewed not merely as keywords, but as specific concepts and objects which are all part of the same universal system. To make the manipulation of the interconnected visual objects simpler and more natural, 3D graphics are utilised, based on the X3D Web standard, allowing users to semantically synthesise their queries faster and in a more logical way, both for themselves and for the computer.
|
19 |
Finding structure and characteristic of web documents for classification. January 2000 (has links)
by Wong, Wai Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 91-94). / Abstracts in English and Chinese. / Contents: Abstract; Acknowledgments; Chapter 1 Introduction (1.1 Semistructured Data; 1.2 Problem Addressed in the Thesis: 1.2.1 Labels and Values, 1.2.2 Discover Labels for the Same Attribute, 1.2.3 Classifying A Web Page; 1.3 Organization of the Thesis); Chapter 2 Background (2.1 Related Work on Web Data: 2.1.1 Object Exchange Model (OEM), 2.1.2 Schema Extraction, 2.1.3 Discovering Typical Structure, 2.1.4 Information Extraction of Web Data; 2.2 Automatic Text Processing: 2.2.1 Stopwords Elimination, 2.2.2 Stemming); Chapter 3 Web Data Definition (3.1 Web Page; 3.2 Problem Description); Chapter 4 Hierarchical Structure (4.1 Types of HTML Tags; 4.2 Tag-tree; 4.3 Hierarchical Structure Construction; 4.4 Hierarchical Structure Statistics); Chapter 5 Similar Labels Discovery (5.1 Expression of Hierarchical Structure; 5.2 Labels Discovery Algorithm: Phase 1 Remove Non-label Nodes, Phase 2 Identify Label Nodes, Phase 3 Discover Similar Labels; 5.3 Performance Evaluation of Labels Discovery Algorithm: Phase 1 Results, Phase 2 Results, Phase 3 Results; 5.4 Classifying a Web Page: 5.4.1 Similarity Measurement, 5.4.2 Performance Evaluation); Chapter 6 Conclusion.
|
20 |
A Nearest-Neighbor Approach to Indicative Web Summarization / Petinot, Yves. January 2016 (has links)
Through their role of content proxy, in particular on search engine result pages, Web summaries play an essential part in the discovery of information and services on the Web. In their simplest form, Web summaries are snippets based on a user-query and are obtained by extracting from the content of Web pages. The focus of this work, however, is on indicative Web summarization, that is, on the generation of summaries describing the purpose, topics and functionalities of Web pages. In many scenarios — e.g. navigational queries or content-deprived pages — such summaries represent a valuable commodity to concisely describe Web pages while circumventing the need to produce snippets from inherently noisy, dynamic, and structurally complex content. Previous approaches have identified linking pages as a privileged source of indicative content from which Web summaries may be derived using traditional extractive methods. To be reliable, these approaches require sufficient anchortext redundancy, ultimately showing the limits of extractive algorithms for what is, fundamentally, an abstractive task. In contrast, we explore the viability of abstractive approaches and propose a nearest-neighbors summarization framework leveraging summaries of conceptually related (neighboring) Web pages. We examine the steps that can lead to the reuse and adaptation of existing summaries to previously unseen pages. Specifically, we evaluate two Text-to-Text transformations that cover the main types of operations applicable to neighbor summaries: (1) ranking, to identify neighbor summaries that best fit the target; (2) target adaptation, to adjust individual neighbor summaries to the target page based on neighborhood-specific template-slot models. For this last transformation, we report on an initial exploration of the use of slot-driven compression to adjust adapted summaries based on the confidence associated with token-level adaptation operations. Overall, this dissertation explores a new research avenue for indicative Web summarization and shows the potential value, given the diversity and complexity of the content of Web pages, of transferring, and, when necessary, of adapting, existing summary information between conceptually similar Web pages.
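The dissertation's models are not reproduced in the abstract; as a hedged illustration of the ranking transformation (scoring each neighbor's existing summary against the target page and reusing the best fit), a simple bag-of-words cosine ranker might look like the following. The similarity measure and the texts are assumptions standing in for the actual neighborhood ranking model.

```python
# A hedged sketch of the "ranking" transformation: score each neighbor's
# existing summary against the target page text and reuse the best fit.
# Bag-of-words cosine similarity is an assumed stand-in for the
# dissertation's actual ranking model; the texts are invented.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

target_page = "book cheap flights and hotel deals online"
neighbor_summaries = [
    "An online store selling used textbooks.",
    "A travel site for booking cheap flights and hotels.",
    "A forum discussing flight simulators.",
]
best = max(neighbor_summaries, key=lambda s: cosine(target_page, s))
print(best)  # the travel-site summary fits the target best
```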
|