
Relationship analysis for web content adaptation

The use of mobile devices to access the World Wide Web is becoming more prevalent. When browsing webpages on small-screen devices, it is difficult to locate information of interest because the limited screen space is densely packed with information. Browsing Web tables on small-screen devices is also a non-trivial problem: fitting a large table onto a small screen can disrupt the association between data values and their corresponding headers, and information is hard to locate accurately once the meaning of the data is lost. For visually impaired users, the problem is even more challenging. Sequential presentation of a webpage by a screen reader is too time-consuming when the information of interest is placed at or near the end of the page. There is therefore a need to re-organize the useful information in webpages in order to make information finding easier on small-screen devices. In this thesis, various adaptations are proposed that explore and exploit the relationships between Web elements in a webpage.

In the current literature, some proposed heuristics are based on specific HTML elements and cannot be generalized. Other algorithms assume a correct DOM structure and fail when the webpage is not properly marked up. Many algorithms extract blocks without assigning them proper titles. This gap needs to be filled, so that extracted blocks are given a proper title by exploring the relationships between semantic elements. In this thesis, I propose to integrate relationship analysis and DOM-tree structure traversal to identify logical sections together with their section headings. By extracting all the section headings, a table of contents can be constructed that provides efficient, direct access to the sections of interest. Relationship analysis is a critical complement to the DOM structure for identifying the semantic content hierarchy when a webpage is not properly marked up. By exploring relationships between table cells, the structure of an unstructured Web table can be extracted. The semantic meanings of the data values are retained by preserving the data values together with their corresponding headers. A novel way of accessing a webpage, which converts the page itself and its Web tables into a menu-based presentation, is then proposed. Converting the webpage into an Interactive Voice Response System introduces yet another mode of access that can enhance the accessibility of the webpage. In addition to improving mobile accessibility, the proposed adaptations can also benefit visually impaired users.
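To make the menu-based presentation of a Web table concrete, here is a minimal Python sketch. It is an illustration only, not the thesis's algorithm: it assumes the table structure is already known (first row holds column headers, first column holds row headers), whereas the thesis recovers that structure through relationship analysis. The function names and the timetable data are hypothetical.

```python
def table_to_menu(rows):
    """Map each row header to a list of (column header, value) pairs.

    Assumption: rows[0] holds the column headers and the first cell of
    every later row is that row's header. Real Web tables often violate
    this, which is the case relationship analysis is meant to handle.
    """
    if not rows:
        return {}
    column_headers = rows[0][1:]
    menu = {}
    for row in rows[1:]:
        row_header, values = row[0], row[1:]
        # Keep each value paired with its header so its meaning survives
        # when the table is split into small pages.
        menu[row_header] = list(zip(column_headers, values))
    return menu


def render_top_menu(menu):
    """First menu page: one numbered entry per row header."""
    return [f"{i}. {header}" for i, header in enumerate(menu, start=1)]


def render_submenu(menu, row_header):
    """Second menu page: the chosen row as 'header: value' lines."""
    return [f"{col}: {val}" for col, val in menu[row_header]]


# Hypothetical example data, not taken from the thesis.
timetable = [
    ["Route", "Departs", "Arrives"],
    ["Bus 1", "08:00", "08:45"],
    ["Bus 2", "09:30", "10:10"],
]
menu = table_to_menu(timetable)
print(render_top_menu(menu))
print(render_submenu(menu, "Bus 2"))
```

Each page shows only a few short lines, which suits a small screen, and the same two-level structure could be read aloud as numbered options in a voice menu.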

Experiments show that adaptation with direct access improves average effectiveness and efficiency by 18% and 15% respectively, compared with the case without adaptation. Adapting a Web table into a series of menu pages improves effectiveness and efficiency by 61% and 37% respectively. In the evaluations with visually impaired users, adaptation with direct access improves efficiency by 85%. Some complicated Web tables could not be properly interpreted by visually impaired users at all; the Web table adaptation makes them accessible. Information finding becomes more efficient and more effective when using the adapted versions.

published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy
Date: January 2014
Creators: Lai, Po-yan, 賴寶欣
Contributors: Lau, FCM
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Source Sets: Hong Kong University Theses
Detected Language: English
Rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works. Creative Commons: Attribution 3.0 Hong Kong License
Relation: HKU Theses Online (HKUTO)
