51 |
Search engine exclusion policies: implications on indexing e-commerce websites
Mbikiwa, Fernie Neo January 2005
Thesis submitted in fulfilment of the requirements for the degree Magister Technologiae in Information Technology in the Faculty of Business Informatics at the Cape Peninsula University of Technology, 2005. / The aim of this research was to determine how search engine exclusion
policies and spam affect the indexing of e-Commerce websites. The Internet
has brought along new ways of doing business. The unexpected growth of
the World Wide Web made it essential for firms to adopt e-commerce as a
means of obtaining a competitive edge. The introduction of e-commerce in
turn facilitated the breaking down of physical barriers that were evident in
traditional business operations.
It is important for e-commerce websites to attract visitors, otherwise the
website content is irrelevant. Websites can be accessed through the use of
search engines, and it is estimated that 88% of users start with search
engines when completing tasks on the web. This has resulted in web
designers aiming to have their websites appear in the top ten search engine
result list, as a high placement of websites in search engines is one of the
strongest contributors to a commercial website’s success.
To achieve such high rankings, web designers often adopt Search Engine
Optimization (SEO) practices. Some of these practices invariably culminate in
undeserving websites achieving top rankings. It is not clear how these SEO
practices are viewed by search engines, as some practices that are deemed
unacceptable by certain search engines are accepted by others. Furthermore,
there are no clear standards for assessing what are considered good or bad
SEO practices. This leaves web designers unsure of what constitutes spam,
and the amount of search engine spam has consequently increased over time,
impacting adversely on search engine results.
From the literature reviewed in this thesis, as well as the policies of five top
search engines (Google, Yahoo!, AskJeeves, AltaVista, and Ananzi), this
author was able to compile a list of what is generally considered as spam.
Furthermore, 47 e-commerce websites were analysed to determine if they
contain any form of spam. The five major search engines indexed some of
these websites. This enabled the author to determine to what extent search
engines adhere to their policies. This analysis returned two major findings. A
small number of websites contained spam, and from the pre-compiled list of
spam tactics, only two were identified in the websites, namely keyword
stuffing and page redirects. Of the total number of websites analysed, it was
found that 21.3% of the websites contained spam.
From these findings, the research contained in this thesis concluded that
search engines adhere to their own policies but lack stringent controls, as the
majority of websites that contained spam were still listed by search
engines. In this study, the author only analysed e-commerce websites, and
therefore cannot generalise the results to websites outside e-commerce.
|
52 |
Usability of a Keyphrase Browsing Tool Based on a Semantic Cloud Model
Johnston, Onaje Omotola 08 1900
The goal of this research was to facilitate the scrutiny and utilization of Web search engine retrieval results. I used a graphical keyphrase browsing interface to visualize the conceptual information space of the results, presenting document characteristics that make document relevance determinations easier.
|
53 |
The internet as a resource for research, teaching and learning : a comparative study between the University of Zimbabwe and University of Zululand
Mugwisi, Tinashe January 2002
A dissertation submitted in partial fulfillment of the requirements for the award of a degree of Masters of Arts in Library and Information Science from the Department of Information Studies at the University of Zululand, 2002. / The Internet has been described as a collection of sprawling computer networks that link millions of computers used by tens of millions of people all over the world (Leedy 1997:66). From an initial few hundred computers, the Internet has grown exponentially, enabling users to communicate with each other and share information. Libraries have embraced the Internet in order to deliver improved services and to extend and expand the scope of what they offer. The purpose of this study was to explore and examine, through a comparison, the use of the Internet for teaching, learning and research by academics and students at the Universities of Zimbabwe and Zululand. It also explored how their libraries could contribute towards achieving this aim. The survey method was largely used, with both qualitative and quantitative data collected. Two sets of questionnaires were distributed, one to academics and students, and the second to professional librarians in the two institutions. Interviews were also conducted with IT divisions. Data were then analysed using the SAS programme and Microsoft Excel.
The study found that computer and Internet skills were high among all respondents: academics, students, and librarians. The Internet was used in both institutions for study and work purposes. Among the resources used, e-mail and the web were the most used by the majority of respondents. The study found no recognisable relationship between Internet use and academic discipline, either between or within the two institutions. This was contrary to studies in the literature reviewed, where the Sciences were found to use the Internet more than the Humanities. Nor were significant differences noticed when Internet use was analysed by level of study and status of faculty academics. The study did, however, establish that the Internet had changed the information seeking behaviour of the majority of respondents in all categories. There was evidence of use of other services such as telnet, electronic journals and other library OPACs by librarians for work purposes. There was, however, a poor link between librarians and their users with regard to use of Internet resources. The study also highlighted rather similar problems facing the two institutions in terms of Internet accessibility. Access was a major concern, due to inadequate provision of computers and of existing connections to the Internet. The need for more formalised training in the use of Internet resources and the creation of awareness among academics and other potential users were also highlighted. Despite these problems, the study revealed that there is great potential for Internet use and appreciation among academic librarians and users in the two institutions. Recommendations were put forward, among them the need for management in the two institutions to make resources, both financial and material, available in order to sustain Internet use programmes and initiatives that are already in place.
|
54 |
Privacy-Aware Data Analysis: Recent Developments for Statistics and Machine Learning
Lut, Yuliia January 2022
Due to technological development, personal data has become easier to collect, store and analyze. Companies can collect detailed browsing behavior data, health-related data from smartphones and smartwatches, and voice and movement recordings from smart home devices. Analysis of such data can bring numerous advantages to society and further the development of science and technology. However, given the often sensitive nature of the collected data, people have become increasingly concerned about the data they share and how they interact with new technology.
These concerns have motivated companies and public institutions to provide services and products with privacy guarantees. Many institutions and research communities have therefore adopted the notion of differential privacy, which has emerged as a powerful technique for enabling data analysis while preventing information leakage about individuals. In simple words, differential privacy allows us to use and analyze sensitive data while maintaining privacy guarantees for every individual data point. As a result, numerous algorithmic private tools have been developed for various applications. However, multiple open questions and research areas remain to be explored around differential privacy in machine learning, statistics, and data analysis, which the existing literature has not covered.
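As an illustration of the idea described above, the following is a minimal sketch of the Laplace mechanism applied to a counting query. It is a standard textbook construction and is not taken from the thesis; the function names `laplace_noise` and `private_count` and the sample data are hypothetical. The sketch assumes that adding or removing one record changes the count by at most 1 (sensitivity 1), so noise with scale 1/epsilon suffices.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: a counting query has sensitivity 1, so Laplace noise
    with scale 1/epsilon is added to the true count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 52, 67, 38]  # hypothetical sensitive data
    noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
    print(f"Noisy count of people aged 40 or over: {noisy:.2f}")
```

Smaller values of `epsilon` add more noise and give a stronger privacy guarantee, at the cost of a less accurate answer.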
In Chapter 1, we provide a brief discussion of the problems and the main contributions that are presented in this thesis. Additionally, we briefly recap the notion of differential privacy with some useful results and algorithms.
In Chapter 2, we study the problem of differentially private change-point detection for unknown distributions. The change-point detection problem seeks to identify distributional changes in streams of data. Non-private tools for change-point detection have been widely applied in several settings. However, in certain applications, such as identifying disease outbreaks based on hospital records or IoT devices detecting home activity, the collected data is highly sensitive, which motivates the study of privacy-preserving tools. Much of the prior work on change-point detection, including the only private algorithms for this problem, requires complete knowledge of the pre-change and post-change distributions. However, this assumption is not realistic for many practical applications of interest. In this chapter, we present differentially private algorithms for solving the change-point problem when the data distributions are unknown to the analyst. Additionally, we study the case when data may be sampled from distributions that change smoothly over time rather than fixed pre-change and post-change distributions. Furthermore, our algorithms can be applied to detect changes in linear trends of such data streams. Finally, we also provide a computational study to empirically validate the performance of our algorithms.
In Chapter 3, we study the problem of learning from imbalanced datasets, in which the classes are not equally represented, through the lens of differential privacy. A widely used method to address imbalanced data is resampling from the minority class instances. However, when confidential or sensitive attributes are present, data replication can lead to privacy leakage, disproportionally affecting the minority class. This challenge motivates the study of privacy-preserving pre-processing techniques for imbalanced learning. In this work, we present a differentially private synthetic minority oversampling technique (DP-SMOTE) which is based on a widely used non-private oversampling method known as SMOTE. Our algorithm generates differentially private synthetic data from the minority class. We demonstrate the impact of our pre-processing technique on the performance and privacy leakage of various classification methods in a detailed computational study.
In Chapter 4, we focus on the analysis of sensitive data that is generated from online internet activity. Accurately analyzing and modeling online browsing behavior play a key role in understanding users and technology interactions. Towards this goal, in this chapter, we present an up-to-date measurement study of online browsing behavior. We study both self-reported and observational browsing data and analyze what underlying features can be learned from statistical analysis of this potentially sensitive data. For this, we empirically address the following questions: (1) Do structural patterns of browsing differ across demographic groups and types of web use?, (2) Do people have correct perceptions of their behavior online?, and (3) Do people change their browsing behavior if they are aware of being observed?
In response to these questions, we found little difference across most demographic groups and website categories, suggesting that these features cannot be inferred solely from clickstream data. We find that users significantly overestimate the time they spend online but have relatively accurate perceptions of how they spend their time online. We find no significant changes in behavior throughout the study, which may indicate that observation had no effect on behavior or that users were consciously aware of being observed throughout the study.
|
55 |
Wrapper application generation for semantic web
Han, Wei 01 December 2003
No description available.
|
56 |
The effects of search strategies and information interaction on sensemaking
Wilson, Mathew J. January 2015
No description available.
|
57 |
The impact of modes of mediation on the web retrieval process
Pannu, M. January 2011
This research is an integral part of the effort aimed at overcoming the limitations of the classic search engines. This thesis is concerned with the investigation of the impact of different modes of mediation on the web search process. Conceptually, it is divided into three main parts. The first part details the investigation of methods and mechanisms in user profile generation and in filtering search results. The second part deals with the presentation of an approach and its application in the development of a mediation framework between the user and the classic Web Search engines. This involved the integration of the explicit, implicit and hybrid modes of mediation within a content-based method, and was facilitated by the adoption of the Vector Space Model. The third part presents an extensive comparative evaluation of the impact of the different types of mediation systems on web search, in terms of precision, recall and F-measure. The thesis concludes by identifying the contribution of the research programme and the satisfaction of the stated objectives.
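For readers unfamiliar with the three evaluation measures named above, the following minimal sketch shows how precision, recall and the balanced F-measure (F1) are commonly computed for a single query. It uses the standard definitions and is not taken from the thesis; the function name `precision_recall_f1` and the document identifiers are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    """Compute precision, recall and F1 for one query.

    `retrieved` holds the document ids returned by the system and
    `relevant` holds the ids judged relevant for the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
    print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

In this hypothetical example, two of the four retrieved documents are relevant and two of the three relevant documents are retrieved, so precision is 1/2, recall is 2/3, and F1 is 4/7.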
|
58 |
Faculty Use of the World Wide Web: Modeling Information Seeking Behavior in a Digital Environment
Fortin, Maurice G. 12 1900
There has been a long history of studying library users and their information seeking behaviors and activities. Researchers developed models to better understand these information seeking behaviors and activities of users. Most of these models were developed before the onset of the Internet. This research project studied faculty members' use of the Internet, and their information seeking behaviors and activities on it, at Angelo State University, a Master's I institution. Using both a quantitative and qualitative methodology, differences were found between tenured and tenure-track faculty members on the perceived value of the Internet to meet their research and classroom information needs. Similar differences were also found among faculty members in the broad discipline areas of the humanities, social sciences, and sciences. Tenure-track faculty members reported a higher average Internet use per week than tenured faculty members. Based on in-depth, semi-structured interviews with seven tenured and seven tenure-track faculty members, an Internet Information Seeking Activities Model was developed to describe the information seeking activities on the Internet by faculty members at Angelo State University. The model consisted of four basic stages of activities: "Gathering," "Validating," "Linking" with a sub-stage of "Re-validating," and "Monitoring." There were two parallel stages included in the model. These parallel stages were "Communicating" and "Mentoring." The Internet Information Seeking Activities Model was compared to the behavioral model of information seeking by faculty members developed by Ellis. The Internet Model placed a greater emphasis on validating information retrieved from the Internet. Otherwise, there were no substantive changes to Ellis' model.
|
59 |
University Students and the Internet: Information Seeking Study
Shamo, Esmaeel 05 1900
This study explored university students' information needs and seeking behaviors on the Internet. A Web-based survey was administered once. Two hundred responses were received from the target sample within the two-week period of the study. Data were analyzed with descriptive statistics, factor analysis, and graphical representation. The study explored various issues related to the usability, preferences, and activities of the Internet, such as searching tools, e-mail, search engines, and preferred primary sources of everyday-life information needs. The study explored the perceptions of the students toward the Internet and the traditional library. Kuhlthau's model of the information-seeking process, which includes six stages and affective components, was utilized and modified in the construction of the Web survey. A study by Presno (1998), which includes the four types of Internet anxiety, was utilized in the construction of the Web survey. With regard to the six stages of Kuhlthau's model, the majority of the respondents experienced stage 5, which was about information gathering; stage 3 had the next highest number of respondents. Very few respondents experienced stages 1 and 2. There was a systematic pattern in which the earlier the stage the respondents were in, the more negative adjectives they selected, and vice versa. The feeling adjectives section showed a difference in behavior between males and females. The results indicated that most students had Internet time delay anxiety. In general, the study found that students have a great interest in the Internet and consider it an important source of information for their personal, educational, and communication activities.
|
60 |
Three-dimensional Information Space : An Exploration of a World Wide Web-based, Three-dimensional, Hierarchical Information Retrieval Interface Using Virtual Reality Modeling Language
Scannell, Peter 12 1900
This study examined the differences between a 3-D, VRML search interface, similar to Cone Trees, as a front-end to Yahoo on the World Wide Web and a conventional text-based, 1-D interface to the same database. The study sought to determine how quickly users could find information using both interfaces, their degree of satisfaction with both search interfaces, and which interface they preferred.
|