271 |
Designing and Evaluating Object-Level Interaction to Support Human-Model Communication in Data Analysis. Self, Jessica Zeitz, 09 May 2016
High-dimensional data appear in every domain, and they are challenging to explore. As the number of dimensions in a dataset increases, it becomes harder to discover patterns and develop insights. Data analysis and exploration is an important skill given the amount of data collected in every field of work, but learning this skill is difficult without an understanding of high-dimensional data. Users naturally tend to characterize data in simplistic one-dimensional terms using metrics such as mean, median, and mode, yet real-world data is more complex. To gain the most insight from data, users need to recognize and create high-dimensional arguments. Data exploration methods can encourage thinking beyond traditional one-dimensional insights. Dimension reduction algorithms, such as multidimensional scaling, support data exploration by reducing datasets to two dimensions for visualization. Because these algorithms rely on underlying parameterizations, they can be manipulated to assess the data from multiple perspectives. Such manipulation is difficult for users without strong knowledge of the underlying algorithms. Visual analytics tools that afford object-level interaction (OLI) allow for the generation of more complex insights, despite inexperience with multivariate data or the underlying algorithm.
The goal of this research is to develop and test variations on types of interaction for interactive visual analytic systems that enable users to tweak model parameters, directly or indirectly, so that they may explore high-dimensional data. To study interactive data analysis, we present an interface, Andromeda, that enables non-experts in statistical modeling to explore domain-specific, high-dimensional data. This application implements interactive weighted multidimensional scaling (WMDS) and allows for both parametric and observation-level interaction to provide in-depth data exploration.
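As a rough illustration of the model underlying this kind of interface (not Andromeda's actual implementation), a weighted MDS layout can be sketched with classical (Torgerson) scaling over weighted Euclidean distances; the function name and API below are hypothetical:

```python
import numpy as np

def weighted_mds(X, weights, n_components=2):
    """Classical (Torgerson) MDS over weighted Euclidean distances.
    `weights` gives the relative importance of each data dimension,
    which is the parameter WMDS exposes for interaction."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                 # normalize dimension weights
    diff = X[:, None, :] - X[None, :, :]            # pairwise differences
    D2 = (w * diff ** 2).sum(axis=-1)               # squared weighted distances
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    B = -0.5 * J @ D2 @ J                           # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]     # largest eigenvalues first
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
```

Raising a dimension's weight stretches the layout along differences in that dimension, which corresponds to the parametric interaction described above; object-level interaction works in the inverse direction, inferring weights from where the user places points.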
We performed multiple user studies to examine how parametric and object-level interaction aid in data analysis. In each study we found usability issues and designed solutions for the next. Each critique uncovered design principles for effective, interactive, visual analytic tools. The final part of this research presents these principles, supported by the results of our multiple informal and formal usability studies. The established design principles focus on human-centered usability for developing interactive visual analytic systems that enable users to analyze high-dimensional data through object-level interaction. / Ph. D.
|
272 |
Compressive Sensing Approaches for Sensor-Based Predictive Analytics in Manufacturing and Service Systems. Bastani, Kaveh, 14 March 2016
Recent advancements in sensing technologies offer new opportunities for quality improvement and assurance in manufacturing and service systems. These sensor advances provide vast amounts of data, supporting quality improvement decisions such as fault diagnosis (root cause analysis) and real-time process monitoring. These decisions are typically made based on predictive analysis of the sensor data, so-called sensor-based predictive analytics. Sensor-based predictive analytics encompasses a variety of statistical, machine learning, and data mining techniques that identify patterns between the sensor data and historical facts. Given these patterns, predictions are made about the quality state of the process, and corrective actions are taken accordingly.
Although recent advances in sensing technologies have facilitated quality improvement decisions, they typically produce high-dimensional sensor data, making the use of sensor-based predictive analytics challenging due to its inherently intensive computation. This research begins in Chapter 1 by raising an interesting question: are all these sensor data required for making effective quality improvement decisions, and if not, is there a way to systematically reduce the number of sensors without affecting the performance of the predictive analytics? Chapter 2 addresses this question by reviewing related research in the area of signal processing, namely compressive sensing (CS), a novel sampling paradigm that departs from the traditional sampling strategy following the Shannon-Nyquist rate. By CS theory, a signal can be reconstructed from a reduced number of samples, which motivates developing CS-based approaches to facilitate predictive analytics using a reduced number of sensors. The proposed research methodology in this dissertation encompasses CS approaches developed to deliver two major contributions: (1) CS sensing to reduce the number of sensors while capturing the most relevant information, and (2) CS predictive analytics to conduct predictive analysis on the reduced sensor data.
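To illustrate the CS claim that a sparse signal can be recovered from far fewer measurements than the Shannon-Nyquist rate suggests, here is a minimal sketch using orthogonal matching pursuit, a standard CS decoder; it is purely illustrative and not the dissertation's own algorithm:

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily recover a k-sparse signal x
    from compressed linear measurements y = Phi @ x."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))   # most correlated column
        support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef          # orthogonalize residual
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 100, 60, 5                         # signal length, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)  # k-sparse signal
Phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
y = Phi @ x                                  # only m < n measurements taken
x_hat = omp(Phi, y, k)
```

In the sensor-reduction setting described above, the rows of the sensing matrix play the role of the retained sensors: far fewer measurements than signal coefficients still suffice to reconstruct the sparse signal.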
The proposed methodology has a generic framework that can be utilized for numerous real-world applications. However, for the sake of brevity, its validity has been verified with real sensor data from multi-station assembly processes (Chapters 3 and 4), additive manufacturing (Chapter 5), and wearable sensing systems (Chapter 6). Chapter 7 summarizes the contributions of the research and outlines potential future research directions with applications to big data analytics. / Ph. D.
|
273 |
Towards Support of Visual Analytics for Synthetic Information. Agashe, Aditya Vidyanand, 15 September 2015
This thesis describes a scalable system for visualizing and exploring global synthetic populations. The implementation described here addresses the following limitations of the Synthetic Information Viewer (SIV): (i) it adds the ability to support synthetic populations for the entire globe by resolving data inconsistencies, (ii) it introduces opportunities to explore and find patterns in the data, and (iii) it allows the addition of new synthetic population centers with minimal effort. We propose the following extensions to the system: (i) a Data Registry, an abstraction layer for handling the heterogeneity of data across countries and for adding new population centers for visualization, and (ii) a Visual Query Interface for exploring and analyzing patterns to gain insights. With these additions, our system is capable of visual exploration and querying of heterogeneous, temporal, spatial, and social data for 14 countries with a total population of 830 million. The work in this thesis takes a step towards providing visual analytics capability for synthetic information. This system will assist urban planners, public health analysts, and any individuals interested in socially-coupled systems by empowering them to make informed decisions through exploration of synthetic information. / Master of Science
|
274 |
Efficient Spatio-Temporal Network Analytics in Epidemiological Studies using Distributed Databases. Khan, Mohammed Saquib Akmal, 26 January 2015
Real-time spatio-temporal analytics has become an integral part of epidemiological studies. The size of spatio-temporal data has been increasing tremendously over the years, gradually evolving into Big Data. Processing in such domains is highly data- and compute-intensive, and high-performance computing resources are actively being used to handle such workloads over massive datasets. This confluence of high-performance computing and datasets with Big Data characteristics poses great challenges in data handling and processing. The resource management of supercomputers is in conflict with the data-intensive nature of spatio-temporal analytics, which is further exacerbated by the fact that data management is decoupled from the computing resources. Problems of this nature have provided great opportunities for the growth and development of tools and concepts centered around MapReduce-based solutions. However, we believe that advanced relational concepts can still be employed to provide an effective solution to these issues and challenges.
In this study, we explore distributed databases to efficiently handle spatio-temporal Big Data for epidemiological studies. We propose DiceX (Data Intensive Computational Epidemiology using supercomputers), which couples high-performance, Big Data, and relational computing by embedding distributed data storage and processing engines within the supercomputer. It is characterized by scalable strategies for data ingestion, a unified framework to set up and configure various processing engines, and the ability to pause, materialize, and restore images of a data session. In addition, we have successfully configured DiceX to support approximation algorithms from the MADlib Analytics Library [54], primarily the Count-Min Sketch (CM Sketch) [33][34][35].
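For readers unfamiliar with CM Sketch, the data structure can be sketched in a few lines. This toy version (hash choice and table sizes are arbitrary) only illustrates the idea of approximate counting in sublinear space, not MADlib's implementation:

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min Sketch: approximate frequency counts in sublinear
    space. Queries can only overestimate, never underestimate, and the
    error is bounded with high probability by the table dimensions."""
    def __init__(self, width=1000, depth=5):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _hash(self, item, row):
        # One independent-ish hash per row, derived from a keyed digest.
        h = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
        return int(h, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._hash(item, row)] += count

    def query(self, item):
        # Min across rows bounds the effect of hash collisions.
        return min(self.table[row][self._hash(item, row)]
                   for row in range(self.depth))
```

In a setting like the spatio-temporal queries above, such a sketch can report approximate item frequencies over a massive stream without storing per-item counters.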
DiceX enables a new style of Big Data processing, centered around the use of clustered databases, that exploits supercomputing resources. It can effectively exploit the cores, memory, and compute nodes of supercomputers to scale the processing of spatio-temporal queries on large datasets. It thus provides a scalable and efficient tool for the management and processing of spatio-temporal data. Although DiceX has been designed for computational epidemiology, it can easily be extended to other data-intensive domains facing similar issues and challenges.
We thank our external collaborators and members of the Network Dynamics and Simulation Science Laboratory (NDSSL) for their suggestions and comments. This work has been partially supported by DTRA CNIMS Contract HDTRA1-11-D-0016-0001, DTRA Validation Grant HDTRA1-11-1-0016, NSF - Network Science and Engineering Grant CNS-1011769, NIH and NIGMS - Models of Infectious Disease Agent Study Grant 5U01GM070694-11.
Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. / Master of Science
|
275 |
Data-Driven Park Planning: Comparative Study of Survey with Social Media Data. Sim, Jisoo, 05 May 2020
The purpose of this study was (1) to identify visitors’ behaviors in and perceptions of linear parks, (2) to identify social media users’ behaviors in and perceptions of linear parks, and (3) to compare small data with big data. This chapter discusses the main findings and their implications for practitioners such as landscape architects and urban planners. It has three sections. The first addresses the main findings in the order of the research questions at the center of the study. The second describes implications and recommendations for practitioners. The final section discusses the limitations of the study and suggests directions for future work.
This study compares two methods of data collection, focusing on activities and benefits. The survey asked respondents to check all the activities they did in the park, while social media users' activities were detected by term frequency in social media data. Both results ordered the activities similarly. For example, social interaction and art viewing were most popular on the High Line, then the 606, then the High Bridge, according to both methods. Both methods also reported that High Line visitors engaged most in viewing from overlooks. As for benefits, both methods found that visitors to the 606 were more satisfied than High Line visitors with the parks' social and natural benefits. These results suggest social media analytics can replace surveys when the textual information is sufficient for analysis.
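As an illustration of the kind of term-frequency detection described above (the activity lexicon, example posts, and function are hypothetical, not the study's actual instrument):

```python
from collections import Counter
import re

# Hypothetical lexicon mapping each park activity to indicative terms.
ACTIVITY_TERMS = {
    "social interaction": {"friends", "meet", "together", "party"},
    "art viewing": {"art", "mural", "sculpture", "gallery"},
    "walking": {"walk", "stroll", "jog"},
}

def activity_frequencies(posts):
    """Count how often each activity's indicative terms appear in posts."""
    counts = Counter()
    for post in posts:
        tokens = set(re.findall(r"[a-z']+", post.lower()))
        for activity, terms in ACTIVITY_TERMS.items():
            counts[activity] += len(tokens & terms)
    return counts

posts = ["Met friends at the High Line to see the new mural",
         "A quiet stroll past the sculpture garden"]
print(activity_frequencies(posts).most_common())
```

Ranking activities by these counts yields the kind of ordering that the study compares against survey checkboxes.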
Social media analytics also differs from surveys in the accuracy of its results. For example, social media revealed that 606 users were interested in events and worried about housing prices and crime, facts the pre-designed survey could not capture. Social media analytics can also catch hidden and more general information: through cluster analysis, we found possible reasons for the High Line's success in the arts and in New York City itself. These results involve general information that would be hard to identify through a survey.
On the other hand, surveys provide specific information and can describe visitors' demographics, motivations, travel information, and specific benefits. For example, 606 users tend to be young, high-income, well-educated, white, and female. These data cannot be collected through social media. / Doctor of Philosophy / Turning unused infrastructure into green infrastructure, such as linear parks, is not a new approach to managing brownfields. In the last few decades, changes in industrial structure and the development of transportation have had a profound effect on urban spatial structure. As the need for infrastructure that played an important role in the development of past industry has decreased, many industrial sites, power plants, and military bases have fallen out of use. This study identifies new ways of collecting information about a new type of park, linear parks, using a new method, social media analytics. The results are then compared with survey results to establish the credibility of social media analytics. Lastly, shortcomings of social media analytics are identified. This study is meaningful in helping us understand the users of new types of parks and suggesting design and planning strategies. Regarding methodology, this study also evaluates the use of social media analytics and its advantages, disadvantages, and reliability.
|
276 |
Automated extraction of product feedback from online reviews: Improving efficiency, value, and total yield. Goldberg, David Michael, 25 April 2019
In recent years, the expansion of online media has presented firms with rich and voluminous new datasets with profound business applications. Among these, online reviews provide nuanced details on consumers' interactions with products. Analysis of these reviews has enormous potential, but the sheer volume of the data and the nature of unstructured text make mining these insights challenging and time-consuming. This dissertation presents three studies examining this problem and suggesting techniques for the automated extraction of vital insights.
The first study examines the problem of identifying mentions of safety hazards in online reviews. Discussions of hazards may have profound importance for firms and regulators as they seek to protect consumers. However, as most online reviews do not pertain to safety hazards, identifying this small portion of reviews is a challenging problem. Much of the literature in this domain focuses on selecting "smoke terms," or specific words and phrases closely associated with the mentions of safety hazards. We first examine and evaluate prior techniques to identify these reviews, which incorporate substantial human opinion in curating smoke terms and thus vary in their effectiveness. We propose a new automated method that utilizes a heuristic to curate smoke terms, and we find that this method is far more efficient than the human-driven techniques. Finally, we incorporate consumers' star ratings in our analysis, further improving prediction of safety hazard-related discussions.
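A minimal sketch of the kind of frequency-based heuristic described above, scoring candidate smoke terms by how much more often they occur in hazard-labeled reviews than elsewhere (the example reviews are invented and the dissertation's actual scoring function may differ):

```python
from collections import Counter
import re

def smoke_term_scores(hazard_reviews, other_reviews, min_count=2):
    """Score terms by their relative frequency in hazard-labeled reviews
    versus the rest; high-scoring terms are smoke-term candidates."""
    def freqs(reviews):
        c = Counter()
        for r in reviews:
            c.update(re.findall(r"[a-z']+", r.lower()))
        return c, max(sum(c.values()), 1)
    hz, hz_total = freqs(hazard_reviews)
    ot, ot_total = freqs(other_reviews)
    # Add-one smoothing in the denominator avoids division by zero for
    # terms that never appear in non-hazard reviews.
    return {t: (hz[t] / hz_total) / ((ot[t] + 1) / ot_total)
            for t in hz if hz[t] >= min_count}

hazard = ["the heater caught fire", "smoke and fire from the unit"]
other = ["great product the heater works well", "the unit looks nice"]
scores = smoke_term_scores(hazard, other)
```

A threshold on these scores would then flag new reviews for prioritized human review, which is the efficiency gain the study measures.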
The second study examines the identification of consumer-sourced innovation ideas and opportunities from online reviews. We build upon a widely accepted attribute mapping framework from the entrepreneurship literature for evaluating and comparing product attributes. We first adapt this framework for use in the analysis of online reviews. Then, we develop analytical techniques based on smoke terms for the automated identification of innovation opportunities mentioned in online reviews. These techniques can be used to profile products based on attributes that affect, or have the potential to affect, their competitive standing. In collaboration with a large countertop appliances manufacturer, we assess and validate the usefulness of these suggestions, tying together the theoretical value of the attribute mapping framework and the practical value of identifying innovation-related discussions in online reviews.
The third study addresses safety hazard monitoring for use cases in which a higher yield of safety hazards detected is desirable. We note a trade-off between the efficiency of the hazard techniques described in the first study and their depth: a high proportion of identified records refer to true hazards, but several important hazards may go undetected. We suggest several techniques for handling this trade-off, including alternate objective functions for heuristics and fuzzy term matching, which improve the total yield. We examine the efficacy of each of these techniques and contrast their merits with past techniques. Finally, we test the capability of these methods to generalize to online reviews across different product categories. / Doctor of Philosophy / This dissertation presents three studies that utilize text analytic methods to analyze and derive insights from online reviews. The first study aims to detect distinctive words and phrases particularly prevalent in online reviews that describe safety hazards. This study proposes algorithmic and heuristic methods for identifying words and phrases that are especially common in these reviews, allowing an automated process to prioritize these reviews for practitioners more efficiently. The second study extends these methods for use in detecting mentions of product innovation opportunities in online reviews. We show that these techniques can be used to profile products based on attributes that differentiate them from the competition or have the potential to do so in the future. Additionally, we validate that product managers find this attribute profiling useful to their innovation processes. Finally, the third study examines automated safety hazard monitoring for situations in which the yield, or total number of safety hazards detected, is an important consideration in addition to efficiency.
We propose a variety of new techniques for handling these situations and contrast them with the techniques used in prior studies. Lastly, we test these methods across diverse product categories.
|
277 |
Dynamic Behavior Visualizer: A Dynamic Visual Analytics Framework for Understanding Complex Networked Models. Maloo, Akshay, 04 February 2014
Dynamic Behavior Visualizer (DBV) is a visual analytics environment for visualizing the spatial and temporal movements and behavioral changes of an individual or a group, e.g., a family, within a realistic urban environment. DBV is specifically designed to visualize adaptive behavioral changes, as they pertain to interactions with multiple interdependent infrastructures, in the aftermath of a large crisis, e.g., a hurricane or the detonation of an improvised nuclear device. DBV is web-enabled and thus easily accessible to any user with a web browser. A novel aspect of the system is its scale and fidelity. The goal of DBV is to synthesize information and derive insight from it; to detect the expected and discover the unexpected; and to provide timely, easily understandable assessments and the ability to piece together all this information. / Master of Science
|
278 |
Solving Mysteries with Crowds: Supporting Crowdsourced Sensemaking with a Modularized Pipeline and Context Slices. Li, Tianyi, 28 July 2020
The increasing volume and complexity of text data are challenging the cognitive capabilities of expert analysts. Machine learning and crowdsourcing present new opportunities for large-scale sensemaking, but it remains a challenge to model the overall process so that many distributed agents can contribute to suitable components asynchronously and meaningfully. In this work, I explore how to crowdsource sensemaking for intelligence analysis. Specifically, I focus on the complex processes that include developing hypotheses and theories from a raw dataset and iteratively refining the analysis. I first developed Connect the Dots, a web application that implements the concept of "context slices" and supports novice crowds in building relationship networks for exploratory analysis. Then I developed CrowdIA, a software platform that implements the entire crowd sensemaking pipeline and the context slicing for each step, to enable unsupervised crowd sensemaking. Using the pipeline as a testbed, I probed the errors and bottlenecks in crowdsourced sensemaking, and suggested design recommendations for integrated crowdsourcing systems. Building on these insights, and to support iterative crowd sensemaking, I developed the concept of "crowd auditing," in which an auditor examines a pipeline of crowd analyses and diagnoses the problems to steer future refinement. I explored the design space to support crowd auditing and developed CrowdTrace, a crowd auditing tool that enables novice auditors to effectively identify the important problems with the crowd analysis and create microtasks for crowd workers to fix the problems. The core contributions of this work include a pipeline that enables distributed crowds to collaborate on holistic sensemaking processes, the two novel concepts of "context slices" and "crowd auditing," web applications that support crowd sensemaking and auditing, and design implications for crowd sensemaking systems.
The hope is that the crowd sensemaking pipeline can serve to accelerate research on sensemaking, and contribute to helping people conduct in-depth investigations of large collections of information. / Doctor of Philosophy / In today's world, we have access to large amounts of data that provide opportunities to solve problems at unprecedented depths and scales. While machine learning offers powerful capabilities to support data analysis, to extract meaning from raw data is cognitively demanding and requires significant person-power. Crowdsourcing aggregates human intelligence, yet it remains a challenge for many distributed agents to collaborate asynchronously and meaningfully.
The contribution of this work is to explore how to use crowdsourcing to make sense of copious and complex data. I first implemented the concept of "context slices," which split up complex sensemaking tasks by context, to support meaningful division of work. I developed a web application, Connect the Dots, which generates relationship networks from text documents with crowdsourcing and context slices. Then I developed a crowd sensemaking pipeline based on the expert sensemaking process. I implemented the pipeline as a web platform, CrowdIA, which guides crowds to solve mysteries without expert intervention. Using the pipeline as a testbed, I probed the errors and bottlenecks in crowd sensemaking and provided design recommendations for crowd intelligence systems. Finally, I introduced the concept of "crowd auditing," in which an auditor examines a pipeline of crowd analyses and diagnoses the problems to steer a top-down refinement of the crowd analysis. The hope is that the crowd sensemaking pipeline can serve to accelerate research on sensemaking, and contribute to helping people conduct in-depth investigations of large collections of data.
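The context-slicing idea above can be sketched as splitting an ordered document collection into overlapping, microtask-sized chunks, so each crowd worker sees local context without the full dataset. This is a hypothetical toy function, not CrowdIA's actual slicing logic:

```python
def make_context_slices(documents, slice_size=3, overlap=1):
    """Split an ordered document list into overlapping slices so each
    crowd microtask gets enough local context to be meaningful on its
    own, without requiring the worker to read the whole dataset."""
    slices, step = [], slice_size - overlap
    for start in range(0, len(documents), step):
        chunk = documents[start:start + slice_size]
        if chunk:
            slices.append(chunk)
        if start + slice_size >= len(documents):
            break
    return slices

docs = [f"doc{i}" for i in range(7)]
print(make_context_slices(docs))
```

The overlap between adjacent slices is what lets independently produced partial analyses be stitched back together in later pipeline steps.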
|
279 |
Sensemaking in Immersive Space to Think: Exploring Evolution, Expertise, Familiarity, and Organizational Strategies. Davidson, Kylie Marie, 20 August 2024
Sensemaking is the way in which we understand the world around us. Pirolli and Card developed a sensemaking model for intelligence analysis, which involves taking raw, unstructured data, analyzing it, and presenting a report of the findings. With lower-cost immersive technologies becoming more popular, new opportunities exist to leverage embodied and distributed cognition to better support sensemaking by providing vast, immersive space for creating meaningful schemas (organizational structures) during an analysis task. This work builds on prior work in immersive analytics on the concept of Immersive Space to Think (IST), which provides analysts with immersive space to physically navigate and use to organize information during a sensemaking task. We performed several studies aimed at understanding how IST supports sensemaking and how additional features can better aid analysts completing sensemaking in immersive analytics systems, focusing on non-quantitative data analysis. In a series of exploratory user studies, we examined how users' sensemaking processes evolve across multi-session analyses, identifying how participants refined their use of the immersive space in later stages of the sensemaking process. Another exploratory user study highlighted how professional analysts and novice users share many similarities in immersive analytic tool usage during sensemaking within IST. Beyond multi-session analysis tasks, we also explored how sensemaking strategies change as users become more familiar with the immersive analytics tool, in an exploratory study with multiple analysis tasks completed over three user study sessions. Lastly, we conducted a comparative user study to evaluate how the addition of new organizational features, clustering and linking, affects sensemaking within IST.
Overall, our studies expanded the IST tool set and gathered an enhanced understanding of how immersive space is utilized during analysis tasks within IST. / Doctor of Philosophy / Sensemaking is a process we engage in daily. It is how we understand the world around us, make decisions, and complete complex analyses, like journalists writing stories or detectives solving cases. Sensemaking involves gathering information, making sense of it, developing hypotheses, and drawing conclusions, similar to writing a report. This work builds on prior work in Immersive Space to Think (IST), a concept that uses immersive technologies (Virtual/Augmented Reality) to support sensemaking by providing vast 3D space for organizing the data used in a sensemaking task. Using these technologies to support sensemaking also provides benefits such as increased space for analysis, increased engagement, and natural user interaction, which allow us to interact with information in new ways during sensemaking tasks. In IST, users can move virtual documents around in the space surrounding them to support their analysis process. In this work, we ran a study focused on multi-session analysis within IST, revealing how users refined their document placements over time while completing sensemaking tasks. We also ran a study to understand how professional analysts and novice users differed in their IST tool usage. In another user study, we explored how users' sensemaking strategies and document layouts changed as they became more familiar with the IST tool. Lastly, we conducted a comparative user study to evaluate how new features like clustering and linking affected analysis within IST. Overall, our work contributed to an enhanced understanding of how immersive space is utilized during analysis tasks within IST.
|
280 |
The impact of big data analytics on firms' high value business performance. Popovic, A., Hackney, R., Tassabehji, Rana, Castelli, M., October 2016
Big Data Analytics (BDA) is an emerging phenomenon with the reported potential to transform how firms manage and enhance high-value business performance. The purpose of our study is to investigate the impact of BDA on operations management in the manufacturing sector, an acknowledged but infrequently researched context. Using an interpretive qualitative approach, this empirical study leverages a comparative case study of three manufacturing companies with varying levels of BDA usage (experimental, moderate, and heavy). The information technology (IT) business value literature and a resource-based view informed the development of our research propositions and the conceptual framework that illuminated the relationships between BDA capability and organizational readiness and design. Our findings indicate that BDA capability (in terms of data sourcing, access, integration, and delivery; analytical capabilities; and people's expertise), along with organizational readiness and design factors (such as BDA strategy, top management support, financial resources, and employee engagement), facilitated better use of BDA in manufacturing decision making and thus enhanced high-value business performance. Our results also highlight important managerial implications related to the impact of BDA on the empowerment of employees, and how BDA can be integrated into organizations to augment rather than replace management capabilities. Our research will benefit academics and practitioners by furthering understanding of BDA utilization in transforming operations and production management. It adds to the limited empirically based knowledge by highlighting the real business value of applying BDA in manufacturing firms, thus encouraging beneficial economic and societal changes.
|