101.
Semantic Interaction for Visual Analytics: Inferring Analytical Reasoning for Model Steering. Endert, Alex. 18 July 2012.
User interaction in visual analytic systems is critical to enabling visual data exploration. Through interacting with visualizations, users engage in sensemaking, a process of developing and understanding relationships within datasets through foraging and synthesis. For example, two-dimensional layouts of high-dimensional data can be generated by dimension reduction models, and provide users with an overview of the relationships between information. However, exploring such spatializations can require expertise with the internal mechanisms and parameters of these models.
The core contribution of this work is semantic interaction, which is capable of steering such models without requiring expertise in dimension reduction, instead leveraging the domain expertise of the user. Semantic interaction infers the analytical reasoning of the user and responds with model updates, steering the dimension reduction model for visual data exploration. As such, it is an approach to user interaction that leverages interactions designed for synthesis and couples them with the underlying mathematical model to provide computational support for foraging. As a result, semantic interaction performs incremental model learning to enable synergy between the user's insights and the mathematical model. The contributions of this work are organized as a description of the principles of semantic interaction, design guidelines derived from the development of a visual analytic prototype, ForceSPIRE, and an evaluation of the impact of semantic interaction on the analytic process. The positive results of semantic interaction open a fundamentally new design space for designing user interactions in visual analytic systems.
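The steering idea above can be sketched in a few lines. This is a toy illustration under stated assumptions, not ForceSPIRE's actual algorithm: when the user drags two documents together, the system upweights the terms they share, so a weighted term-distance model comes to reflect the similarity the user implied.

```python
from collections import Counter

def shared_term_upweight(weights, doc_a, doc_b, bump=0.1):
    """Toy semantic-interaction update (hypothetical): when the user drags
    doc_a and doc_b together, upweight the terms the two documents share,
    then renormalize so the weights sum to 1."""
    for term in set(doc_a) & set(doc_b):
        weights[term] = weights.get(term, 0.0) + bump
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def weighted_distance(weights, doc_a, doc_b):
    """Weighted term-mismatch distance under the current term weights."""
    a, b = Counter(doc_a), Counter(doc_b)
    return sum(weights.get(t, 0.0) * abs(a[t] - b[t]) for t in set(a) | set(b))
```

After an update, the two interacted documents measure as closer, which is the incremental-learning loop the abstract describes: synthesis-style interactions feed back into the foraging model.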
This research was funded in part by the National Science Foundation, CCF-0937071 and CCF-0937133, the Institute for Critical Technology and Applied Science at Virginia Tech, and the National Geospatial-Intelligence Agency contract #HMI1582-05-1-2001. / Ph. D.
102.
Using Data Analytics in Agriculture to Make Better Management Decisions. Liebe, Douglas Michael. 19 May 2020.
The goal of this body of work is to explore various aspects of data analytics (DA) and its applications in agriculture. In our research, we produce decisions with mathematical models, create models, evaluate existing models, and review how certain models are best applied. The increasing granularity in decisions being made on farm, like individualized feeding, sub-plot level crop management, and plant and animal disease prevention, creates complex systems requiring DA to identify variance and patterns in data collected. Precision agriculture requires DA to make decisions about how to feasibly improve efficiency or performance in the system. Our research demonstrates ways to provide recommendations and make decisions in such systems.
Our first research goal was to clarify research on endophyte-infected tall fescue by relating different infection-measuring techniques and quantifying the effect of infection level on grazing cattle growth. Cattle graze endophyte-infected tall fescue in many parts of the U.S., and this feedstuff is thought to limit growth performance in those cattle. Our results suggest that ergovaline concentration accounts for close to 80% of the effect of measured total ergot alkaloids, and that cattle average daily gain decreased 33 g/d for each 100 ppb increase in ergovaline concentration. By comparing the value of the lost weight gain to the cost of reseeding a pasture, producers can make decisions about the management of infected pastures.
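The reported dose-response estimate supports a simple break-even calculation. In the sketch below, only the 33 g/d per 100 ppb slope comes from the abstract; the grazing period, beef price, and reseeding cost are hypothetical placeholders a producer would replace with their own figures.

```python
def gain_loss_g_per_day(ergovaline_ppb, slope_g_per_100ppb=33.0):
    """Daily reduction in average daily gain (g/d) attributable to
    ergovaline, using the linear estimate reported above."""
    return slope_g_per_100ppb * ergovaline_ppb / 100.0

def reseed_pays_off(ergovaline_ppb, grazing_days, beef_price_per_kg, reseed_cost):
    """Compare the value of lost gain over a grazing period with the cost
    of reseeding the pasture (prices are hypothetical illustrations)."""
    lost_kg = gain_loss_g_per_day(ergovaline_ppb) * grazing_days / 1000.0
    return lost_kg * beef_price_per_kg > reseed_cost
```

For example, at 500 ppb ergovaline over a 150-day grazing season, the model predicts about 24.75 kg of forgone gain per animal, which can then be valued against the reseeding cost.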
The next research goal was to evaluate experimental and feed factors that affect measurements associated with ruminant protein digestion. Measurements explored were 0-h washout, potentially degradable, and undegradable protein fractions, protein degradation rate and digestibility of rumen undegradable protein. Our research found that the aforementioned measurements were significantly affected by feedstuff characteristics like neutral detergent fiber content and crude protein content, and also measurement variables like bag pore size, incubation time, bag area, and sample size to bag area ratio. Our findings suggest that current methods to measure and predict protein digestion lack robustness and are therefore not reliable to make feeding decisions or build research models.
The first two research projects involved creating models to help researchers and farmers make better decisions. Next, we aimed to produce a summary of existing DA frameworks and propose future areas for model building in agriculture. Machine learning models were discussed along with potential applications in animal agriculture. Additionally, we discuss the importance of model evaluation when producing applicable models. We propose that the future of DA in agriculture comes with increasing decision making done without human input and better integration of DA insights into farmer decision-making.
After detailing how mathematical models and machine learning could be used to further research, models were used to predict cases of clinical mastitis (CM) in dairy cows. Machine learning models took daily inputs relating to activity and production to produce probabilities of CM. By considering the economic costs of treatment and non-treatment in CM cases, we provide insight into the lack of applicable models being produced, and why smarter data collection, representative datasets, and validation that reflects how the model will be used are needed.
The overall goal of this body of work was to advance our understanding of agriculture and the complex decisions involved through the use of DA. Each project sheds light on model building, model evaluation, or model applicability. By relating modeling techniques in other fields to agriculture, this research aims to improve translation of these techniques in future research. As data collection in agriculture becomes even more commonplace, the need for good modeling practices will increase. / Doctor of Philosophy / Data analytics (DA) has become more popular with increasing data collection capabilities using technologies like sensors, improvements in data storage techniques, and an expanding literature on algorithms that can be used in prediction and summarization. This body of work explores many aspects of agricultural DA and its applications on-farm. The field of precision agriculture has risen from an influx of data and new possibilities for using these data. Even small farms are now able to collect data using technologies like sensor-equipped tractors and drones, which are relatively inexpensive. Our research shows how using mathematical models combined with these data can help researchers produce more applicable tools and, in turn, help producers make more targeted decisions. We examine cases where models improve the understanding of a system, specifically, the effect of endophyte infection in tall fescue pastures, the effect of measurement on protein digestibility for ration formulation, and methods to predict sparse diseases using big data. Although DA is widely applied, specific agricultural research on topics such as model types, model performance, and model utility needs to be done. The research presented herein expands on these topics in detail, using DA and mathematical models to make predictions and understand systems while utilizing applicable DA frameworks for future research.
103.
Visual Analytics with Biclusters: Exploring Coordinated Relationships in Context. Sun, Maoyuan. 06 September 2016.
Exploring coordinated relationships is an important task in data analytics. For example, an intelligence analyst may want to find three suspicious people who all visited the same four cities. However, existing techniques that display individual relationships, such as between lists of entities, require repetitious manual selection and significant mental aggregation in cluttered visualizations to find coordinated relationships.
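The "three people who all visited the same four cities" example is exactly a bicluster: a set of entities on one side paired with the set of items they all share on the other. A brute-force sketch makes the idea concrete (real systems use closed-itemset or biclustering algorithms; this illustration simply enumerates subsets, which is only feasible for small data):

```python
from itertools import combinations

def biclusters(visits, min_people=2, min_places=2):
    """Enumerate coordinated sets: every group of at least min_people
    whose common destinations number at least min_places.
    Brute force, for illustration on small datasets only."""
    found = []
    people = sorted(visits)
    for r in range(min_people, len(people) + 1):
        for group in combinations(people, r):
            common = set.intersection(*(visits[p] for p in group))
            if len(common) >= min_places:
                found.append((set(group), common))
    return found
```

Each result aggregates many individual person-city relationships into one coordinated set, which is the unit that BiSet visualizes and bundles rather than forcing the analyst to assemble it mentally.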
This work presents a visual analytics approach that applies biclusters to support the exploration of coordinated relationships. Each computed bicluster aggregates individual relationships into coordinated sets; thus, coordinated relationships can be formalized as biclusters. However, incorporating biclusters into a visual analytics tool to support sensemaking tasks is challenging. To address this, this work makes three key contributions: 1) a five-level design framework for bicluster visualizations, 2) BiSet, a prototype featuring bicluster-based edge bundling, seriation-based ordering of multiple lists, and interactions for dynamic information foraging and management, and 3) an evaluation of BiSet. / Ph. D.
104.
Data Mining Academic Emails to Model Employee Behaviors and Analyze Organizational Structure. Straub, Kayla Marie. 06 June 2016.
Email correspondence has become the predominant method of communication for businesses. If not for the inherent privacy concerns, this electronically searchable data could be used to better understand how employees interact. After the Enron dataset was made available, researchers were able to provide great insight into employee behaviors despite the many challenges with that dataset. The work in this thesis applies a suite of methods to an appropriately anonymized academic email dataset created from volunteers' email metadata. This new dataset, from an internal email server, is first used to validate feature extraction and machine learning algorithms in order to generate insight into the interactions within the center. Based solely on email metadata, a random forest approach models behavior patterns and predicts employee job titles with 96% accuracy. This result represents classifier performance not only on participants in the study but also on other members of the center who were connected to participants through email. Furthermore, the data revealed relationships not present in the center's formal operating structure. The culmination of this work is an organic organizational chart, which conveys a fuller understanding of the center's internal structure than can be found in the official organizational chart. / Master of Science
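The shape of such a metadata-only classifier can be sketched as follows. This is a minimal illustration on synthetic data, not the thesis's actual feature set or pipeline; the four features named in the comments are hypothetical examples of metadata-derived signals.

```python
# Sketch: predicting job titles from email-metadata features.
# Synthetic data stands in for the anonymized dataset described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300
# Hypothetical per-person features: messages sent per day, distinct
# contacts, fraction sent outside 9-5, median thread length.
X = rng.normal(size=(n, 4))
# Synthetic labels tied to the features so the classifier has signal
# to learn; 0 = "student", 1 = "faculty" (illustrative labels only).
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
```

Cross-validated accuracy on held-out people is the relevant measure here, since the thesis's headline result covers center members outside the participant pool as well.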
105.
On Grouped Observation Level Interaction and a Big Data Monte Carlo Sampling Algorithm. Hu, Xinran. 26 January 2015.
Big Data is transforming the way we live. From medical care to social networks, data plays a central role in various applications. As the volume and dimensionality of datasets keep growing, designing effective data analytics algorithms emerges as an important research topic in statistics. In this dissertation, I summarize our research on two data analytics algorithms: a visual analytics algorithm named Grouped Observation Level Interaction with Multidimensional Scaling, and a big data Monte Carlo sampling algorithm named Batched Permutation Sampler. These two algorithms are designed to enhance the capability of generating meaningful insights and of utilizing massive datasets, respectively. / Ph. D.
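Observation-level interaction can be illustrated with a toy inverse step. This is a hypothetical simplification, not the dissertation's algorithm: when the user drags a group of points together, infer dimension weights inversely proportional to the group's per-dimension variance, so the dimensions on which the grouped observations already agree come to dominate the scaling.

```python
import numpy as np

def infer_weights(X, group_idx, eps=1e-9):
    """Toy grouped observation-level interaction: given data X and the
    indices of points the user grouped, return normalized dimension
    weights favoring dimensions where the group has low variance."""
    var = X[group_idx].var(axis=0) + eps  # eps avoids division by zero
    w = 1.0 / var
    return w / w.sum()
```

The inferred weights would then feed a weighted multidimensional scaling pass, updating the layout to honor the user's grouping.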
106.
Multi-Model Semantic Interaction for Scalable Text Analytics. Bradel, Lauren C. 28 May 2015.
Learning from text data often involves a loop of tasks that iterate between foraging for information and synthesizing it into incremental hypotheses. Past research has shown the advantages of using spatial workspaces as a means for synthesizing information through externalizing hypotheses and creating spatial schemas. However, spatializing an entire dataset becomes prohibitive as the number of documents available to the analyst grows, particularly when only a small subset is relevant to the tasks at hand. To address this issue, we developed the multi-model semantic interaction (MSI) technique, which leverages user interactions to aid in the display layout (as in previous semantic interaction work), to forage for new, relevant documents implied by those interactions, and to place them in the context of the user's existing spatial layout. As a result, the user can conduct both implicit queries and traditional explicit searches. A comparative user study of StarSPIRE found that while adding implicit querying did not impact the quality of the foraging, it enabled users to 1) synthesize more information than users with only explicit querying, 2) externalize more hypotheses, and 3) complete more synthesis-related semantic interactions. When implicit querying was available, 18% of relevant documents were found by implicitly generated queries. StarSPIRE has also been integrated with web-based search engines, allowing users to work across vastly different levels of data scale to complete exploratory data analysis tasks (e.g., literature review, investigative journalism).
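The implicit-querying idea can be sketched minimally. This is a toy under stated assumptions, not StarSPIRE's model: pool the terms of documents the user has interacted with into a weighted query, then score unseen documents against it to decide what to forage into the workspace.

```python
from collections import Counter

def implicit_query(interacted_docs):
    """Toy implicit query: pool the terms of documents the user
    highlighted or moved, weighted by how often each term appears."""
    query = Counter()
    for doc in interacted_docs:
        query.update(doc)
    return query

def forage(query, corpus, k=1):
    """Return the k unseen documents scoring highest against the
    implicit query (score = sum of query weights over the doc's terms)."""
    scored = sorted(corpus.items(),
                    key=lambda kv: -sum(query[t] for t in kv[1]))
    return [name for name, _ in scored[:k]]
```

Foraged documents would then be placed near the interacted documents in the spatial layout, keeping implicit retrieval in the context of the user's existing schema.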
The core contribution of this work is multi-model semantic interaction (MSI) for usable big data analytics. This work has expanded the understanding of how user interactions can be interpreted and mapped to underlying models to steer multiple algorithms simultaneously and at varying levels of data scale. This is represented in an extendable multi-model semantic interaction pipeline. The lessons learned from this dissertation work can be applied to other visual analytics systems, promoting direct manipulation of the data in context of the visualization rather than tweaking algorithmic parameters and creating usable and intuitive interfaces for big data analytics. / Ph. D.
107.
Designing Display Ecologies for Visual Analysis. Chung, HaeYong. 07 May 2015.
The current proliferation of connected displays and mobile devices, from smartphones and tablets to wall-sized displays, presents a number of exciting opportunities for information visualization and visual analytics. When a user employs heterogeneous displays collaboratively to achieve a goal, the displays form what is known as a display ecology. A display ecology enables multiple displays to function in concert within a broader technological environment to accomplish tasks and goals. However, since information and tasks are scattered and disconnected among separate displays, one of the inherent challenges of visual analysis in display ecologies is enabling users to seamlessly coordinate, connect, and integrate information across displays. This research addresses these challenges through the creation of interaction and visualization techniques and systems for display ecologies in order to support sensemaking with visual analysis.
This dissertation explores essential visual analysis activities and design considerations in order to inform the design of display ecologies for visual analysis. Based on the identified design considerations, we designed and developed two visual analysis systems. First, VisPorter supports intuitive gesture interactions for sharing and integrating information in a display ecology. Second, Spatially Aware Visual Links (SAViL) is a cross-display visual link technique capable of guiding the user's attention to relevant information across displays. It also enables the user to visually connect related information across displays, facilitating the synthesis of information scattered over separate displays and devices. Together, the techniques described herein help users transform the multiple displays of a display ecology into a more powerful environment for visual analysis and sensemaking. / Ph. D.
108.
Immersive Space to Think: Immersive Analytics for Sensemaking with Non-Quantitative Datasets. Lisle, Lorance Richard. 09 February 2023.
Analysts often work with large, complex, non-quantitative datasets in order to better understand the concepts, themes, and other forms of insight contained within them. As defined by Pirolli and Card, this act of sensemaking is cognitively difficult, and is performed iteratively and repetitively through various stages of understanding. Immersive analytics purports to assist with this process by putting users in virtual environments that allow them to sift through and explore data in three-dimensional interactive settings. Most previous research, however, has focused on quantitative data, where users interact with mostly numerical representations. We designed Immersive Space to Think, an immersive analytics approach that assists users in sensemaking with non-quantitative datasets, affording analysts the ability to manipulate data artifacts, annotate them, search through them, and present their findings. We performed several studies to understand and refine our approach and how it affects users' sensemaking strategies. An exploratory virtual reality study found that users place documents in 2.5-dimensional structures, where we saw semicircular, environmental, and planar layouts. The environmental layout, in particular, used features of the environment as scaffolding for users' sensemaking process. In a study comparing levels of mixed reality as defined by Milgram and Kishino's reality-virtuality continuum, we found that an augmented virtuality solution best fits users' preferences while still supporting external tools. Lastly, we explored how users deal with varying amounts of space and three-dimensional interaction techniques in a comparative study of small virtual monitors, large virtual monitors, and a seated implementation of Immersive Space to Think.
Our participants found IST best supported the task of sensemaking, with evidence that users leveraged spatial memory and utilized depth to denote additional meaning in the immersive condition. Overall, Immersive Space to Think affords an effective three-dimensional sensemaking space whose 3D user interaction techniques leverage embodied cognition and spatial memory, aiding users' understanding. / Doctor of Philosophy / Humans are constantly trying to make sense of the world around them. Whether they're a detective trying to understand what happened at a crime scene or a shopper trying to find the best office chair, people are consuming vast quantities of data to assist them with their choices. This process can be difficult, and people often return to various pieces of data repeatedly to remember why they are making the choice they decided upon. With the advent of cheap virtual reality products, researchers have pursued the technology as a way for people to better understand large sets of data. However, most mixed reality applications looking into this problem focus on numerical data, whereas much of the data people process is multimedia or text-based in nature. We designed and developed a mixed reality approach for analyzing this type of data called Immersive Space to Think. Our approach allows users to look at and move various documents around in a virtual environment, take notes or highlight those documents, search those documents, and create reports that summarize what they've learned. We also performed several studies to investigate and evolve our design. First, we ran a study in virtual reality to understand how users interact with documents using Immersive Space to Think. We found users arranging documents around themselves in a semicircular or flat plane pattern, or using various cues in the virtual environment as a way to organize the document set.
Furthermore, we performed a study to understand user preferences with augmented and virtual reality. We found a mix of the two, also known as augmented virtuality, would best support user preferences and ability. Lastly, we ran two comparative studies to understand how three-dimensional space and interaction affect user strategies. We ran a small user study looking at how a single student uses a desktop computer with a single display, as well as Immersive Space to Think, to write essays. We found that the student wrote essays with a better understanding of the source data with Immersive Space to Think than with the desktop setup. We then conducted a larger study where we compared a small virtual monitor simulating a traditional desktop screen, a large virtual monitor eight times the size of traditional desktop monitors, and Immersive Space to Think. We found participants engaged with documents more in Immersive Space to Think, and used the space to denote the importance of documents. Overall, Immersive Space to Think provides a compelling environment that assists users in understanding sets of documents.
109.
Narrative Maps: A Computational Model to Support Analysts in Narrative Sensemaking. Keith Norambuena, Brian Felipe. 08 August 2023.
Narratives are fundamental to our understanding of the world, and they are pervasive in all activities that involve representing events in time. Narrative analysis has a series of applications in computational journalism, intelligence analysis, and misinformation modeling. In particular, narratives are a key element of the sensemaking process of analysts.
In this work, we propose a narrative model and visualization method to aid analysts with this process. In particular, we propose the narrative maps framework—an event-based representation that uses a directed acyclic graph to represent the narrative structure—and a series of empirically defined design guidelines for map construction obtained from a user study.
Furthermore, our narrative extraction pipeline is based on maximizing coherence—modeled as a function of surface text similarity and topical similarity—subject to coverage—modeled through topical clusters—and structural constraints through the use of linear programming optimization. For the purposes of our evaluation, we focus on the news narrative domain and showcase the capabilities of our model through several case studies and user evaluations.
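The coherence objective can be made concrete with a toy version. In this sketch, an illustrative stand-in for the combined text/topic coherence above, events are bags of words, edges connect consecutive events in a storyline, and a candidate map is scored by its weakest edge, the kind of quantity the extraction pipeline maximizes under coverage and structural constraints.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two bag-of-words event descriptions."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def map_coherence(events, edges):
    """Minimum edge coherence of a candidate narrative map: the map is
    only as coherent as its weakest storyline transition."""
    return min(cosine(events[u], events[v]) for u, v in edges)
```

In the full model, edges of a directed acyclic graph carry these scores, and linear programming selects the structure that maximizes coherence subject to topical coverage.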
Moreover, we augment the narrative maps framework with interactive AI techniques—using semantic interaction and explainable AI—to create an interactive narrative model that is capable of learning from user interactions to customize the narrative model based on the user's needs and providing explanations for each core component of the narrative model. Throughout this process, we propose a general framework for interactive AI that can handle similar models to narrative maps—that is, models that mix continuous low-level representations (e.g., dimensionality reduction) with more abstract high-level discrete structures (e.g., graphs).
Finally, we evaluate our proposed framework through an insight-based user study. In particular, we perform a quantitative and qualitative assessment of the behavior of users and explore their cognitive strategies, including how they use the explainable AI and semantic interaction capabilities of our system. Our evaluation shows that our proposed interactive AI framework for narrative maps is capable of aiding users in finding more insights from data when compared to the baseline. / Doctor of Philosophy / Narratives are essential to how we understand the world. They help us make sense of events that happen over time. This research focuses on developing a method to assist people, like journalists and analysts, in understanding complex information.
To do this, we introduce a new approach called narrative maps. This model allows us to extract and visualize stories from text data. To improve our model, we use interactive artificial intelligence techniques. These techniques allow our model to learn from user feedback and be customized to fit different needs. We also use these methods to explain how the model works, so users can understand it better.
We evaluate our approach by studying how users interact with it when doing a task with news stories. We consider how useful the system is in helping users gain insights. Our results show that our method aids users in finding important insights compared to traditional methods.
110.
Comparison of Computational Notebook Platforms for Interactive Visual Analytics: Case Study of Andromeda Implementations. Liu, Han. 22 September 2022.
Existing notebook platforms differ in their support for visual analytics, and it is not clear which platform to choose when implementing visual analytics notebooks. In this work, we investigated the problem using Andromeda, an interactive dimension reduction algorithm, implementing it on three notebook platforms: 1) Python-based Jupyter Notebook, 2) JavaScript-based Observable Notebook, and 3) Jupyter Notebook embedding both Python (for data science use) and JavaScript (for visual analytics use). We compared the platforms in a case study using metrics such as programming difficulty, notebook organization, interactive performance, and UI design choices. We then provide guidelines for data scientists choosing a notebook platform for implementing visual analytics notebooks in various situations, and, laying the groundwork for future developers, advice on architecting better notebook platforms. / Master of Science / Data scientists are interested in developing visual analytics notebooks. However, different notebook platforms provide different support for visual analytics components, such as visualizations and user interactions. To investigate which notebook platform to use for visual analytics, we built notebooks on three platforms: Jupyter Notebook (with Python), Observable Notebook (with JavaScript), and Jupyter Notebook (with Python and JavaScript). Based on the implementations and user interactions, we explain where significant differences exist using metrics such as programming difficulty, notebook organization, interactive performance, and UI design choices. Our work will help future researchers choose suitable notebook platforms for implementing visual analytics notebooks.
|