This study proposes and evaluates a new method for Web archiving that leverages the caching infrastructure in Web servers. Redis is used as the page cache, and its persistence mechanism is exploited for archiving. We experimentally evaluate the performance of our archival technique using the Greek version of Wikipedia deployed on Amazon cloud infrastructure. We show that archiving slightly increases the latency of rendered pages. Server performance is comparable at larger page cache sizes, but at smaller cache sizes the maximum throughput the server can handle decreases significantly, because archiving causes more disk write operations. Since pages are dynamically rendered and the technology stack of Wikipedia is extensively used in a number of Web applications, our results should have broad impact.

Master of Science

This study proposes and evaluates a new method for Web archiving. To reduce the response time for serving webpages, Web servers store recently rendered pages in memory. This process is known as caching. We modify the caching mechanism of Web servers to support archiving. We then experimentally evaluate the impact of our archival technique on Web servers. We observe that the time to render a Web page increases only slightly as long as the Web server is under moderate load. Through our experiments, we establish limits on the maximum number of requests a Web server can handle without increasing the response time. We conduct our experiments on Web servers using technologies that are widely used today, so our results should have broad impact.
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/78252 |
Date | 23 June 2017 |
Creators | Vishwasrao, Saket Dilip |
Contributors | Electrical and Computer Engineering, Fox, Edward A., Hou, Yiwei Thomas, Xie, Zhiwu |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf, application/x-zip-compressed |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |