11 |
Web Search Using Genetic Programming / Wu, Jain-Shing. 11 September 2002 (has links)
Locating and retrieving needed information from the Internet is an important issue. Existing search engines may return too much useless and redundant information. Because search features differ from one search engine to another, it's very difficult to find an optimal search scheme for all subjects. In this paper, we propose a genetic programming web search system (GPWS) to generate exact queries according to a user's interests. The system can retrieve information from the search engines, filter the retrieved results, and remove redundant and useless results. The filtered results are displayed on a uniform user interface. Compared with randomly generated queries, the similarity between the results and the user's interests is improved.
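The abstract gives no implementation details, so the following is only a rough sketch of the genetic-programming idea behind a system like GPWS: evolve keyword queries against a fitness that rewards agreement with a user's interests. The interest profile, vocabulary, operators and parameters below are all invented stand-ins, not GPWS internals.

```python
# A minimal sketch of evolving search queries with genetic operators.
# Everything here (interest profile, fitness, operators, parameters) is a
# hypothetical stand-in for the GPWS components described in the abstract.
import random

INTERESTS = {"genetic", "programming", "web", "search", "query"}  # assumed profile
VOCABULARY = list(INTERESTS | {"cat", "weather", "music", "stocks", "recipe"})

def fitness(query):
    # Stand-in for "similarity of results and user's interests":
    # here, simply the fraction of query terms that match the profile.
    return len(set(query) & INTERESTS) / len(query)

def mutate(query):
    q = query[:]
    q[random.randrange(len(q))] = random.choice(VOCABULARY)
    return q

def crossover(a, b):
    cut = random.randrange(1, min(len(a), len(b)))
    return a[:cut] + b[cut:]

def evolve(pop_size=20, query_len=4, generations=30):
    population = [[random.choice(VOCABULARY) for _ in range(query_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

print(evolve())  # e.g. ['web', 'search', 'genetic', 'query']
```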
|
12 |
Search engine optimisation or paid placement systems: user preference / Neethling, Riaan. January 2007 (has links)
Thesis (MTech (Information Technology))--Cape Peninsula University of Technology, 2007. / Includes bibliographical references (leaves 98-113). Also available online.
|
13 |
Search algorithms for discovery of Web services / Hicks, Janette M. January 2005 (has links)
Thesis (M.S.)--State University of New York at Binghamton, Watson School of Engineering and Applied Science (Computer Science), 2005. / Includes bibliographical references.
|
14 |
Evaluation and comparison of search engines / Mtshontshi, Lindiwe. 12 1900 (has links)
Thesis (MPhil)--Stellenbosch University, 2004. / ENGLISH ABSTRACT: A growing body of studies is developing approaches to evaluate human interaction with Web search engines. Measuring the information retrieval effectiveness of World Wide Web search engines is costly because of the human relevance judgements involved. However, it is important for both business enterprises and individuals to know the most effective Web search engines, since such engines help their users find a higher number of relevant Web pages with less effort. Furthermore, this information can be used for several practical purposes. This study does not attempt to describe all the currently available search engines, but provides a comparison of some that are deemed to be among the most useful. It concentrates on search engines and their characteristics only. The goal is to help a new user get the most useful "hits" when using the various tools. / AFRIKAANS ABSTRACT: More and more studies are developing approaches for evaluating human interaction with Web search engines. Measuring how effectively a search engine retrieves information on the World Wide Web is expensive because of the human relevance judgements involved. It is nevertheless important for managers of business enterprises and other people to know which search engines are the most effective, since such engines help their users find a higher number of relevant Web pages with less effort. This information can also be used to achieve several practical goals. No attempt is made to describe all currently available search engines; rather, some of those considered the most useful are compared. Only search engines and their characteristics are considered. The goal is to help the new user obtain the most useful information by making use of various tools.
|
15 |
Search engine optimisation or paid placement systems: user preference / Neethling, Riaan. January 2007 (has links)
Thesis submitted in fulfilment of the requirements for the degree Magister Technologiae in Information Technology in the Faculty of Informatics and Design at the Cape Peninsula University of Technology, 2007. / The objective of this study was to investigate and report on user preference for Search Engine Optimisation (SEO) versus Pay Per Click (PPC) results. This will assist online advertisers in identifying the optimal Search Engine Marketing (SEM) strategy for their specific target market.
Research shows that online advertisers perceive PPC as a more effective SEM strategy than SEO. However, empirical evidence exists that PPC may not be the best strategy for online advertisers, creating confusion for advertisers considering an SEM campaign. Furthermore, not all advertisers have the funds to implement a dual strategy, and as a result advertisers need to choose between an SEO and a PPC campaign. In order for online advertisers to choose the most relevant SEM strategy, it is important to understand user perceptions of these strategies.
A quantitative research design was used to conduct the study, with the purpose of collecting and analysing data. A questionnaire was designed and hosted on a busy website to ensure maximum exposure. The questionnaire focused on how search engine users perceive SEM and on their click response towards SEO and PPC results respectively. A qualitative research method was also used, in the form of an interview conducted with representatives of a leading South African search engine to verify the results and obtain expert opinions.
The data was analysed and the results interpreted. Results indicated that users' perceived relevancy is split 45% for PPC results and 55% for SEO results, regardless of demographic factors. Failing to invest in either one could cause a significant loss of website traffic, which indicates that advertisers should invest in both PPC and SEO. Advertisers can invest in a PPC campaign for immediate results, and then implement an SEO campaign over a period of time. The results can further be used to adjust an SEM strategy according to the target market profile of an advertiser, which will ensure maximum effectiveness.
|
16 |
The crossover point between keyword rich website text and spamdexing / Zuze, Herbert. January 2011 (has links)
Thesis submitted in fulfilment of the requirements for the degree Magister Technologiae in Business Information Systems in the Faculty of Business at the Cape Peninsula University of Technology, 2011. / With over a billion Internet users surfing the Web daily in search of information, buying, selling and accessing social networks, marketers focus intensively on developing websites that appeal to both searchers and search engines. Millions of webpages are submitted to search engines for indexing each day. The success of a search engine lies in its ability to provide accurate search results. Search engines' algorithms constantly evaluate websites and webpages that could violate their respective policies, and for this reason some websites and webpages are blacklisted from their indices.
Websites are increasingly being utilised as marketing tools, which results in major competition amongst websites. Website developers strive to develop websites of high quality that are unique and content rich, as this will assist them in obtaining a high ranking from search engines. In focusing on websites of a high standard, website developers utilise search engine optimisation (SEO) strategies to earn a high search engine ranking.
From time to time SEO practitioners abuse SEO techniques in order to trick the search engine algorithms, but the algorithms are programmed to identify and flag these techniques as spamdexing. Search engines do not clearly explain how they interpret keyword stuffing (one form of spamdexing) in a webpage. They regard spamdexing in many different ways and do not provide enough detail to clarify what crawlers take into consideration when interpreting the spamdexing status of a website. Furthermore, search engines differ in the way that they interpret spamdexing, and offer no clear quantitative evidence for the crossover point from keyword-dense website text to spamdexing. Scholars have expressed different views on spamdexing, characterised by different keyword density measurements in the body text of a webpage. This raised several fundamental questions that form the basis of this research.
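The thesis's own measurement code is not reproduced in the abstract; as a minimal sketch under the common definition of keyword density (occurrences of a keyword divided by the total number of body-text words), a calculator might look like this. The tokenisation and the example text are assumptions.

```python
# A minimal keyword-density sketch using the common definition:
# occurrences of the keyword divided by the total number of body-text words.
# The tokenisation and example text are assumptions, not the thesis's code.
import re

def keyword_density(body_text: str, keyword: str) -> float:
    words = re.findall(r"[a-z0-9']+", body_text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w == keyword.lower())
    return hits / len(words)

text = "cheap flights cheap flights book cheap flights today"
print(f"{keyword_density(text, 'cheap'):.1%}")  # 37.5%: 3 of the 8 words
```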
This research was carried out using triangulation in order to determine how scholars, search engines and SEO practitioners interpret spamdexing. Five websites with varying keyword densities were designed and submitted to Google, Yahoo! and Bing. Two phases of the experiment were conducted and the results were recorded. During both phases almost all of the webpages, including the one with a 97.3% keyword density, were indexed. This enabled the research to conclusively rule out keyword stuffing, blacklisting and any form of penalisation. Designers are urged to concentrate instead on usability and on the sound values behind building a website.
The research explored the fundamental contribution of keywords to webpage indexing and visibility. Keywords, whether or not they are used at an optimum level of richness, result in website ranking and indexing. However, the focus should be on the way in which the end user would interpret the displayed content, rather than on how the search engine would react to it. Furthermore, spamdexing is likely to scare away potential clients and end users instead of attracting them, which is why the time spent on spamdexing should rather be used to produce quality content.
|
17 |
Using reinforcement learning to learn relevance ranking of search queries / Sandupatla, Hareesh. 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Web search has become a part of everyday life for hundreds of millions of users around the world. However, the effectiveness of a user's search depends vitally on the quality of the search result ranking. Even though enormous efforts have been made to improve ranking quality, there is still significant misalignment between search engine rankings and an end user's preference order. This is evident from the fact that, for many search results on major search and e-commerce platforms, many users ignore the top ranked results and click on lower ranked ones. Nevertheless, finding a ranking that suits all users is a difficult problem, as every user's need is different; an ideal ranking, then, is one preferred by the majority of users. This emphasizes the need for an automated approach that improves the search engine ranking dynamically by incorporating user clicks into the ranking algorithm. In existing search result ranking methodologies, this direction has not been explored profoundly.
A key challenge in using user clicks in search result ranking is that the relevance feedback learned from click data is imperfect. This is because a user is more likely to click a top ranked result than a lower ranked one, irrespective of the actual relevance of those results. This phenomenon is known as position bias, and it poses a major difficulty in obtaining an automated method for dynamically updating search rank orders.
In my thesis, I propose a set of methodologies which incorporate user clicks for the dynamic update of search rank orders. The updates are based on adaptive randomization of results using a reinforcement learning strategy, with user click activity treated as the reinforcement signal. Beginning at any rank order of the search results, the proposed methodologies are guaranteed to converge to a ranking which is close to the ideal rank order. Moreover, the use of a reinforcement learning strategy enables the proposed methods to overcome the position bias phenomenon.
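The thesis's exact algorithms are not given in the abstract; as a hedged sketch of the general idea (occasionally randomize adjacent results for exploration, and treat clicks as the reward that can promote a result past position bias), one simplified variant might look like the following. The swap rule, the update rule and the parameters are all assumptions.

```python
# A hedged sketch of click-driven re-ranking: with probability EPSILON a pair
# of adjacent results is swapped (exploration), and clicks act as the
# reinforcement signal raising a result's estimated relevance. These rules and
# parameters are assumptions, not the thesis's exact algorithms.
import random

EPSILON = 0.2   # exploration rate (assumed)
ALPHA = 0.1     # learning rate (assumed)

def present(ranking):
    """Return the list actually shown, sometimes with one exploratory swap."""
    shown = ranking[:]
    if random.random() < EPSILON:
        i = random.randrange(len(shown) - 1)
        shown[i], shown[i + 1] = shown[i + 1], shown[i]
    return shown

def update(value, shown, clicked_doc):
    """Reinforce the clicked document; decay every result ranked above it."""
    for doc in shown:
        if doc == clicked_doc:
            value[doc] += ALPHA * (1.0 - value[doc])
            break
        value[doc] += ALPHA * (0.0 - value[doc])  # examined but skipped

docs = ["d1", "d2", "d3", "d4"]
value = {d: 0.5 for d in docs}
for _ in range(1000):  # simulated search sessions
    shown = present(sorted(docs, key=value.get, reverse=True))
    update(value, shown, clicked_doc="d3")  # toy user who always wants d3
print(sorted(docs, key=value.get, reverse=True))  # d3 ends up on top
```

Because the toy user clicks d3 even when it is ranked low, d3's estimated value rises while the results skipped above it decay, which is the sense in which click feedback can counteract position bias.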
To measure the effectiveness of the proposed methods, I perform experiments using a simplified user behavior model which I call the color ball abstraction model. I evaluate the quality of the proposed methodologies using standard information retrieval metrics such as Precision at n (P@n), Kendall tau rank correlation, Discounted Cumulative Gain (DCG) and Normalized Discounted Cumulative Gain (NDCG). The experimental results clearly demonstrate the success of the proposed methodologies.
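These metrics are standard in information retrieval; for reference, a minimal sketch of two of them, P@n and NDCG under the common log2-discount formulation, is shown below. The relevance labels are invented for illustration.

```python
# Minimal implementations of two of the cited metrics, using their common
# formulations: P@n, and NDCG with a log2 position discount. The toy graded
# relevance labels below are invented for illustration.
import math

def precision_at_n(relevances, n):
    # Fraction of the top-n results that are relevant (relevance > 0).
    return sum(1 for r in relevances[:n] if r > 0) / n

def dcg(relevances):
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances))

def ndcg(relevances):
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

ranking = [3, 0, 2, 1, 0]          # graded relevance of results, top first
print(precision_at_n(ranking, 3))  # 0.666...
print(round(ndcg(ranking), 3))     # ~0.93; 1.0 only for the ideal order
```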
|
18 |
A Semantic Web based search engine with X3D visualisation of queries and results / Gkoutzis, Konstantinos. January 2013 (has links)
The Semantic Web project has introduced new techniques for managing information. Data can now be organised more efficiently, and in such a way that computers can take advantage of the relationships that characterise the given input to present more relevant output. Semantic Web based search engines can quickly educe exactly what needs to be found and retrieve it while avoiding information overload. Up until now, search engines have interacted with their users by asking them to look for words and phrases. We propose the creation of a new-generation Semantic Web search engine that will offer a visual interface for queries and results. To create such an engine, information input must be viewed not merely as keywords, but as specific concepts and objects which are all part of the same universal system. To make the manipulation of the interconnected visual objects simpler and more natural, 3D graphics are utilised, based on the X3D Web standard, allowing users to semantically synthesise their queries faster and in a more logical way, both for themselves and for the computer.
|
19 |
Finding structure and characteristic of web documents for classification. January 2000 (has links)
by Wong, Wai Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 91-94). / Abstracts in English and Chinese. / Contents: Abstract; Acknowledgments; Chapter 1 Introduction (1.1 Semistructured Data; 1.2 Problem Addressed in the Thesis: 1.2.1 Labels and Values, 1.2.2 Discover Labels for the Same Attribute, 1.2.3 Classifying A Web Page; 1.3 Organization of the Thesis); Chapter 2 Background (2.1 Related Work on Web Data: 2.1.1 Object Exchange Model (OEM), 2.1.2 Schema Extraction, 2.1.3 Discovering Typical Structure, 2.1.4 Information Extraction of Web Data; 2.2 Automatic Text Processing: 2.2.1 Stopwords Elimination, 2.2.2 Stemming); Chapter 3 Web Data Definition (3.1 Web Page; 3.2 Problem Description); Chapter 4 Hierarchical Structure (4.1 Types of HTML Tags; 4.2 Tag-tree; 4.3 Hierarchical Structure Construction; 4.4 Hierarchical Structure Statistics); Chapter 5 Similar Labels Discovery (5.1 Expression of Hierarchical Structure; 5.2 Labels Discovery Algorithm: Phase 1 Remove Non-label Nodes, Phase 2 Identify Label Nodes, Phase 3 Discover Similar Labels; 5.3 Performance Evaluation of Labels Discovery Algorithm: Phase 1 Results, Phase 2 Results, Phase 3 Results; 5.4 Classifying a Web Page: 5.4.1 Similarity Measurement, 5.4.2 Performance Evaluation); Chapter 6 Conclusion.
|
20 |
A Nearest-Neighbor Approach to Indicative Web Summarization / Petinot, Yves. January 2016 (has links)
Through their role of content proxy, in particular on search engine result pages, Web summaries play an essential part in the discovery of information and services on the Web. In their simplest form, Web summaries are snippets based on a user-query and are obtained by extracting from the content of Web pages. The focus of this work, however, is on indicative Web summarization, that is, on the generation of summaries describing the purpose, topics and functionalities of Web pages. In many scenarios — e.g. navigational queries or content-deprived pages — such summaries represent a valuable commodity to concisely describe Web pages while circumventing the need to produce snippets from inherently noisy, dynamic, and structurally complex content. Previous approaches have identified linking pages as a privileged source of indicative content from which Web summaries may be derived using traditional extractive methods. To be reliable, these approaches require sufficient anchortext redundancy, ultimately showing the limits of extractive algorithms for what is, fundamentally, an abstractive task. In contrast, we explore the viability of abstractive approaches and propose a nearest-neighbors summarization framework leveraging summaries of conceptually related (neighboring) Web pages. We examine the steps that can lead to the reuse and adaptation of existing summaries to previously unseen pages. Specifically, we evaluate two Text-to-Text transformations that cover the main types of operations applicable to neighbor summaries: (1) ranking, to identify neighbor summaries that best fit the target; (2) target adaptation, to adjust individual neighbor summaries to the target page based on neighborhood-specific template-slot models. For this last transformation, we report on an initial exploration of the use of slot-driven compression to adjust adapted summaries based on the confidence associated with token-level adaptation operations. Overall, this dissertation explores a new research avenue for indicative Web summarization and shows the potential value, given the diversity and complexity of the content of Web pages, of transferring, and, when necessary, of adapting, existing summary information between conceptually similar Web pages.
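The dissertation's models are not reproduced in the abstract; as a hedged illustration of the ranking transformation (scoring each neighbor's existing summary against the target page and reusing the best fit), a simple bag-of-words cosine ranker might look like the following. The similarity measure and the texts are assumptions standing in for the actual neighborhood ranking model.

```python
# A hedged sketch of the "ranking" transformation: score each neighbor's
# existing summary against the target page text and reuse the best fit.
# Bag-of-words cosine similarity is an assumed stand-in for the
# dissertation's actual ranking model; the texts are invented.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

target_page = "book cheap flights and hotel deals online"
neighbor_summaries = [
    "An online store selling used textbooks.",
    "A travel site for booking cheap flights and hotels.",
    "A forum discussing flight simulators.",
]
best = max(neighbor_summaries, key=lambda s: cosine(target_page, s))
print(best)  # the travel-site summary fits the target best
```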
|