Global ETD Search

1	2D Jupyter: Design and Evaluation of 2D Computational Notebooks
Christman, Elizabeth, 12 June 2023
Computational notebooks are a popular tool for data analysis. However, the 1D linear structure used by many computational notebooks can lead to challenges and pain points in data analysis, including messiness, tedious navigation, inefficient use of screen space, and presentation of non-linear narratives. To address these problems, we designed a prototype Jupyter Notebooks extension called 2D Jupyter that enables a 2D organization of code cells in a multi-column layout, as well as freeform cell placement. We conducted a user study using this extension to evaluate the usability of 2D computational notebooks and understand the advantages and disadvantages that it provides over a 1D layout. As a result of this study, we found evidence that the 2D layout provides enhanced usability and efficiency in computational notebooks. Additionally, we gathered feedback on the design of the prototype that can be used to inform future work. Overall, 2D Jupyter was positively received, and users not only enjoyed using the extension but also expressed a desire to use 2D notebook environments in the future.
/ Master of Science /
Computational notebooks are a tool commonly used by data analysts that allows them to construct computational narratives through a combination of code, text, and visualizations. Many computational notebooks use a 1D linear layout; however, data analysis work is often conducted in a non-linear fashion due to the need to debug code, test new theories, and evaluate and compare results. In this work, we present a prototype extension for Jupyter Notebooks called 2D Jupyter that enables the user to arrange their notebook in a 2D multi-column layout. A user study was conducted to evaluate the usability of this extension and understand the benefits that a 2D layout may provide. Feedback on the extension's design was also collected to inform future design opportunities. The prototype received a positive reaction overall, and users expressed a desire to use 2D computational notebooks in their future work.
Keywords: computational notebooks, software, data science
2	Comparison of Computational Notebook Platforms for Interactive Visual Analytics: Case Study of Andromeda Implementations
Liu, Han, 22 September 2022
Existing notebook platforms have different capabilities for supporting visual analytics use. It is not clear which platform to choose for implementing visual analytics notebooks. In this work, we investigated the problem using Andromeda, an interactive dimension reduction algorithm, and implemented it using three different notebook platforms: 1) Python-based Jupyter Notebook, 2) JavaScript-based Observable Notebook, and 3) Jupyter Notebook embedding both Python (data science use) and JavaScript (visual analytics use). We also made comparisons for all the notebook platforms via a case study based on metrics such as programming difficulty, notebook organization, interactive performance, and UI design choice. Furthermore, guidelines are provided for data scientists to choose one notebook platform for implementing their visual analytics notebooks in various situations. Laying the groundwork for future developers, advice is also given on architecting better notebook platforms.
/ Master of Science /
Data scientists are interested in developing visual analytics notebooks. However, different notebook platforms have different support for visual analytics components, such as visualizations and user interactions. To investigate which notebook platform to use for visual analytics, we built notebooks based on three different notebook platforms, i.e., Jupyter Notebook (with Python), Observable Notebook (with JavaScript), and Jupyter Notebook (with Python and JavaScript). Based on the implementation and user interactions, we explained why significant differences exist via specific metrics, such as programming difficulty, notebook organization, interactive performance, and the UI design choice. Furthermore, our work will benefit future researchers in choosing suitable notebook platforms for implementing visual analytics notebooks.
Keywords: Visual Analytics, Data Science, Computational Notebooks
3	Automatic Restoration and Management of Computational Notebooks
Venkatesan, Satish, 03 March 2022
Computational notebook platforms are very commonly used by programmers and data scientists. However, due to the interactive development environment of notebooks, developers struggle to maintain effective code organization, which has an adverse effect on their productivity. In this thesis, we research and develop techniques to help solve the code organization issues that developers face, in an effort to improve productivity. Notebooks are often executed out of order, which adversely affects their portability. To determine cell execution orders in computational notebooks, we develop a technique that determines the execution order for a given cell and, if need be, attempts to rearrange the cells to match the intended execution order. With such a tool, users would not need to manually determine the execution orders themselves. In a user study with 9 participants, our approach on average saves users about 95% of the time required to determine execution orders manually. We also developed a technique to support insertion of cells in rows, in addition to the standard column insertion, to help better represent multiple contexts. In a user study with 9 participants, this technique was rated on average 8.44 on a scale of one to ten for representing multiple contexts, as opposed to the standard view, which was rated 4.77.
/ Master of Science /
In the field of data science, computational notebooks are a very commonly used tool. They allow users to create programs to perform computations and to display graphs, tables, and other visualizations to supplement their analysis. Computational notebooks have some limitations in the development environment which can make it difficult for users to organize their code. This can make it very difficult to read through and analyze the code to find or fix any errors, which in turn can have a very negative effect on developer productivity. In this thesis, we research methods to improve the development environment and increase developer productivity. We achieve this by offering tools to the user that can help organize and clean up their code, making it easier to comprehend the code and make any necessary changes.
Keywords: Computational Notebooks, Dependency Analysis, Version Control
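The abstract above does not say how execution order is recovered. As a rough illustration only, and not the thesis's method, one common approach is to find which names each cell defines and uses, then order the cells so that definitions come first. The function names below (`names_defined_and_used`, `infer_execution_order`) are hypothetical, and the sketch assumes each name is defined in a single cell and never reassigned elsewhere.

```python
import ast
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def names_defined_and_used(source):
    """Return (defined, used) name sets for one cell's source code."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    # Simplifying assumption: a name the cell defines itself is not
    # treated as a dependency on another cell.
    return defined, used - defined


def infer_execution_order(cells):
    """Return cell indices ordered so that defining cells run first."""
    info = [names_defined_and_used(src) for src in cells]
    definer = {}
    for idx, (defined, _) in enumerate(info):
        for name in defined:
            definer.setdefault(name, idx)  # first cell defining a name wins
    # Map each cell to the set of cells it depends on.
    graph = {idx: set() for idx in range(len(cells))}
    for idx, (_, used) in enumerate(info):
        for name in used:
            if name in definer and definer[name] != idx:
                graph[idx].add(definer[name])
    return list(TopologicalSorter(graph).static_order())


cells = ["print(total)", "total = sum(values)", "values = [1, 2, 3]"]
order = infer_execution_order(cells)  # cell 2, then cell 1, then cell 0
```

Names such as `print` and `sum` are ignored because no cell defines them. Circular dependencies between cells make `static_order()` raise `graphlib.CycleError`, so a real tool would need a fallback such as recorded execution counts.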
4	Code duplication and reuse in Jupyter notebooks
Koenzen, Andreas Peter, 21 September 2020
Reusing code can expedite software creation, analysis and exploration of data. Expediency can be particularly valuable for users of computational notebooks, where duplication allows them to quickly test hypotheses and iterate over data, without creating code from scratch. In this thesis, I’ll explore the topic of code duplication and the behaviour of code reuse for Jupyter notebooks, quantifying and describing snippets of code and exploring potential barriers for reuse. As part of this thesis I conducted two studies into Jupyter notebook use. In my first study, I mined GitHub repositories, quantifying and describing code duplicates contained within repositories that contained at least one Jupyter notebook. For my second study, I conducted an observational user study using a contextual inquiry, where my participants solved specific tasks using notebooks, while I observed and took notes. The work in this thesis can be categorized as exploratory, since both my studies were aimed at generating hypotheses upon which further studies can build. My contribution with this thesis is two-fold: a thorough description of code duplicates contained within GitHub repositories and an exploration of the behaviour behind code reuse in Jupyter notebooks. It is my desire that others can build upon this work to provide new tools, addressing some of the issues outlined in this thesis.
/ Graduate /
Keywords: Jupyter, computational notebooks, code duplication, code clones, code reuse, data analysis, data exploration, exploratory programming
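The abstract above does not describe how duplicates were detected. As a minimal sketch only, and not the thesis's method, exact duplicates that differ just in formatting or comments can be grouped by hashing a canonical form of each snippet's syntax tree. The helper names `fingerprint` and `find_duplicates` are hypothetical, and the sketch assumes every snippet is valid Python.

```python
import ast
import hashlib
from collections import defaultdict


def fingerprint(source):
    """Hash a snippet's syntax tree, ignoring whitespace and comments."""
    # ast.dump omits line and column positions by default, so two snippets
    # that differ only in layout or comments produce the same string.
    canonical = ast.dump(ast.parse(source))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(snippets):
    """Return lists of snippet indices that share a fingerprint."""
    groups = defaultdict(list)
    for idx, src in enumerate(snippets):
        groups[fingerprint(src)].append(idx)
    return [idxs for idxs in groups.values() if len(idxs) > 1]


snippets = ["x = 1  # set x\n", "x=1", "y = 2"]
duplicates = find_duplicates(snippets)  # snippets 0 and 1 form one group
```

This only catches clones that are identical up to layout and comments. Clones with renamed variables or small edits would need a looser comparison, such as normalizing identifiers before hashing.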
