The usage of Web sites has interested Web administrators and researchers since the Web began. Analysing Web site usage data helps in understanding the behaviour of a site's users, and many important decisions can be based on that understanding. User behaviour may be deduced from all the activities a user performs from the time he or she starts a visit to the Web site until he or she leaves it; these activities are collectively called a user session. Because Web server logs explicitly record the browsing behaviour of site users and are readily and economically available, this thesis explores the use of Web server logs to capture user sessions on the Web. To protect users’ privacy, standard Web server logs generally do not record user identities or other information that uniquely identifies users. This thesis therefore concentrates on heuristic strategies to infer user sessions. The heuristics exploit background knowledge of user navigational behaviour recorded in standard Web server logs, without requiring additional information from cookies, logins or session ids. They identify relationships that may exist among the log data and use them to assess whether requests registered by the Web server can belong to the same individual and whether these requests were made during the same visit. Researchers have proposed several heuristics, but these are adversely affected by proxy servers, caching and undefined referrers. The thesis proposes new heuristics that effectively address all of these limitations, thus extending the work in this field. It also introduces a set of measures to quantify the performance of the heuristics, uses them to investigate the efficiency of the heuristics on logs from three Web sites, and makes recommendations for Web sites that wish to devise their own heuristics. The investigation has shown satisfactory results, and the new heuristics are applicable to a wider range of Web sites. / Doctor of Philosophy (PhD)
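To make the idea of inferring sessions from log data concrete, here is a minimal sketch of a generic time-gap heuristic. It is not the method proposed in the thesis, whose heuristics are not described in this abstract. The function name `sessionize`, the tuple layout, the use of IP address plus user agent as a stand-in for a user, and the 30-minute threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed threshold: a gap longer than this starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)

def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Group (ip, user_agent, timestamp, url) tuples into sessions.

    Requests sharing an IP address and user agent are treated as one
    presumed user. This is an assumption and breaks down behind proxies.
    """
    by_user = defaultdict(list)
    for ip, agent, ts, url in requests:
        by_user[(ip, agent)].append((ts, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort(key=lambda h: h[0])  # order each user's requests by time
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Gap too long: close the current session, start another.
                sessions.append((user, [u for _, u in current]))
                current = []
            current.append(hit)
        sessions.append((user, [u for _, u in current]))
    return sessions

if __name__ == "__main__":
    log = [
        ("10.0.0.1", "Mozilla", datetime(2005, 1, 1, 10, 0), "/"),
        ("10.0.0.1", "Mozilla", datetime(2005, 1, 1, 10, 5), "/a"),
        ("10.0.0.1", "Mozilla", datetime(2005, 1, 1, 11, 0), "/b"),
    ]
    for user, urls in sessionize(log):
        print(user, urls)
```

A sketch like this shows why the limitations named in the abstract matter: users behind a shared proxy collapse into a single (IP, user agent) key, and pages served from a cache never appear in the log at all.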
Identifier | oai:union.ndltd.org:ADTP/244923 |
Date | January 2005 |
Creators | Caldera, Amithalal, University of Western Sydney, College of Science, Technology and Environment, School of Computing and Information Technology |
Source Sets | Australasian Digital Theses Program |
Language | English |
Detected Language | English |