Component-Based Crawling of Complex Rich Internet Applications

During the past decade, web applications have evolved substantially. Taking advantage of new technologies, Rich Internet Applications (RIAs) make heavy use of client-side code to present content. Web crawlers, however, face new challenges in crawling RIAs, such as how to explore and identify different client states. The problem of crawling RIAs has been a focus for researchers in recent years, and solutions have been proposed based on constructing a state-transition model with DOMs as states and JavaScript events as transitions. When faced with real-life RIAs, however, a major problem prevalent in current solutions is state space explosion caused by the complexity of the RIAs. This problem prevents automated crawlers from being usable on complex RIAs, as they fail to produce useful results in a timely fashion. This research addresses the challenge of efficiently crawling complex RIAs with two main ideas: component-based crawling and similarity detection. Our experimental results show that these ideas lead to a drastic reduction in the time required to produce results, enabling the crawler to explore RIAs previously too complex for automated crawling.

AJAX

crawl

ria

Identifier	oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/30636
Date	January 2014
Creators	Moosavi Byooki, Seyed Ali
Contributors	Jourdan, Guy-Vincent, Onut, Iosif-Viorel
Publisher	Université d'Ottawa / University of Ottawa
Source Sets	Université d’Ottawa
Language	English
Detected Language	English
Type	Thesis
