
Visual exploratory analysis of large data sets : evaluation and application

Large data sets are difficult to analyze. Visualization has been proposed to assist exploratory data analysis (EDA) as our visual systems can process signals in
parallel to quickly detect patterns. Nonetheless, designing an effective visual
analytic tool remains a challenge.

This challenge is partly due to our incomplete understanding of how common
visualization techniques are used by human operators during analyses, either in
laboratory settings or in the workplace.
This thesis aims to further understand how visualizations can support EDA. More specifically, we used a range of methods to study techniques that display multiple visual information resolutions (VIRs) for analysis.

The first study is a summary synthesis conducted to obtain a snapshot of
knowledge in multiple-VIR use and to identify research questions for the thesis:
(1) low-VIR use and creation; (2) spatial arrangements of VIRs. The next two
studies are laboratory studies that investigate two topics: the visual memory
cost of image transformations frequently used to create low-VIR displays, and
overview use with single-level data displayed in multiple-VIR interfaces.

For a more well-rounded evaluation, we needed to study these techniques in
ecologically valid settings. We therefore selected the application domain of web
session log analysis and applied our knowledge from our first three evaluations
to build a tool called Session Viewer. Taking the multiple coordinated view
and overview + detail approaches, Session Viewer displays multiple levels of
web session log data and multiple views of session populations to facilitate data
analysis from the high-level statistical to the low-level detailed session analysis
approaches.

Our fourth and last study for this thesis is a field evaluation conducted at
Google Inc. with seven session analysts using Session Viewer to analyze their
own data with their own tasks. Study observations suggested that displaying
web session logs at multiple levels using the overview + detail technique helped bridge high-level statistical and low-level detailed session analyses, and
the simultaneous display of multiple session populations at all data levels using
multiple views allowed quick comparisons between session populations. We also
identified design and deployment considerations to meet the needs of diverse
data sources and analysis styles.

  1. http://hdl.handle.net/2429/839
Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:BVAU.2429/839
Date: 11 1900
Creators: Lam, Heidi Lap Mun
Publisher: University of British Columbia
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Electronic Thesis or Dissertation
