121

Demographic Transparency to Combat Discriminatory Data Analytics Recommendations

Ebrahimi, Sepideh January 2018 (has links)
Data Analytics (DA) has been blamed for contributing to discriminatory managerial decisions in organizations. To date, most studies have focused on the technical antecedents of such discrimination. As a result, little is known about how to ameliorate the problem by focusing on the human aspects of decision making when using DA in organizational settings. This study represents an effort to address this gap. Drawing on the cognitive elaboration model of ethical decision-making, construal level theory, and the literature on moral intensity, this study investigates how the availability and the design of demographic transparency (a form of decisional guidance) can lower DA users' likelihood of agreement with discriminatory recommendations from DA tools. In addition, this study examines the role of users' mindfulness and organizational ethical culture in this process. In an experimental study, users interacted with a DA tool that provided them with a discriminatory recommendation. The results confirm that demographic transparency significantly impacts both recognition of the moral issue at hand and perceived proximity to the subject of the decision, which in turn help decrease the likelihood of users' approval of the discriminatory recommendation. Moreover, the results suggest that users' mindfulness and organizational ethical culture enhance the positive impacts of demographic transparency. / Dissertation / Doctor of Philosophy (PhD)
122

Integrated Process Modeling and Data Analytics for Optimizing Polyolefin Manufacturing

Sharma, Niket 19 November 2021 (has links)
Polyolefins are one of the most widely used commodity polymers, with applications in films, packaging, and the automotive industry. The modeling of polymerization processes producing polyolefins, including high-density polyethylene (HDPE), polypropylene (PP), and linear low-density polyethylene (LLDPE) using Ziegler-Natta catalysts with multiple active sites, is a complex and challenging task. In our study, we integrate process modeling and data analytics for improving and optimizing polyolefin manufacturing processes. Most of the current literature on polyolefin modeling does not consider all of the commercially important production targets when quantifying the relevant polymerization reactions and their kinetic parameters based on measurable plant data. We develop an effective methodology to estimate the kinetic parameters that have the most significant impacts on specific production targets, and to develop the kinetics using all commercially important production targets, validated over industrial polyolefin processes. We showcase the utility of dynamic models for efficient grade transition in polyolefin processes. We also use the dynamic models for inferential control of polymer processes. Thus, we showcase the methodology for making first-principles polyolefin process models which are scientifically consistent but tend to be less accurate due to the many modeling assumptions in a complex system. Data analytics and machine learning (ML) have been applied in the chemical process industry for accurate predictions in data-based soft sensors and process monitoring/control. They are especially useful for polymer processes, since polymer quality measurements such as melt index and molecular weight are usually less frequent than the continuous process variable measurements. We showcase the use of predictive machine learning models like neural networks for predicting polymer quality indicators, and demonstrate the utility of causal models like partial least squares to study the causal effect of the process parameters on the polymer quality variables. ML models produce accurate results but can over-fit the data and also produce scientifically inconsistent results beyond the operating data range. Thus, it is increasingly important to develop hybrid models combining data-based ML models and first-principles models. We present a broad perspective on hybrid process modeling and optimization combining scientific knowledge and data analytics in bioprocessing and chemical engineering with a science-guided machine learning (SGML) approach, rather than just direct combinations of first-principles and ML models. We present a detailed review of the scientific literature relating to the hybrid SGML approach, and propose a systematic classification of hybrid SGML models according to their methodology and objective. We identify themes and methodologies that have not been explored much in chemical engineering applications, such as the use of scientific knowledge to help improve the ML model architecture and learning process for more scientifically consistent solutions. We apply hybrid SGML techniques such as inverse modeling and science-guided loss, which have not previously been applied to such polymer applications, to industrial polyolefin processes. / Doctor of Philosophy / Almost everything we see around us, from furniture and electronics to bottles and cars, is made fully or partially from plastic polymers. The two most popular polymers, which together make up almost two-thirds of global polymer production, are polyethylene (PE) and polypropylene (PP), collectively known as polyolefins. Hence, the optimization of polyolefin manufacturing processes with the aid of simulation models is critical and profitable for the chemical industry. Modeling of a chemical/polymer process is helpful for process scale-up, product quality estimation/monitoring, and new process development. To make a good simulation model, we need to validate its predictions against actual industrial data. The polyolefin process has complex reaction kinetics with multiple parameters that need to be estimated to accurately match the industrial process. We have developed a novel strategy for estimating the kinetics for the model, including the reaction chemistry and the polymer quality information, validated against the industrial process. Thus, we have developed a science-based model which includes knowledge of reaction kinetics, thermodynamics, and heat and mass balances for the polyolefin process. The science-based model is scientifically consistent, but may not be very accurate due to many model assumptions. Therefore, for applications requiring very high accuracy in predicting polymer quality targets such as melt index (MI) and density, data-based techniques might be more appropriate. Recently, we have heard a lot about artificial intelligence (AI) and machine learning (ML); the basic principle behind these methods is to make the model learn from data for prediction. The process data measured in a chemical/polymer plant can be utilized for data analysis. We can build ML models to predict polymer targets like MI as a function of the input process variables. ML model predictions are very accurate within the operating range of the dataset on which the model is trained, but outside that range they tend to give scientifically inconsistent results. Thus, there is a need to combine data-based models and scientific models. In our research, we showcase novel approaches to integrate science-based models and data-based ML methodology, which we term hybrid science-guided machine learning (SGML) methods. The hybrid SGML methods applied to polyolefin processes yield not only accurate but also scientifically consistent predictions, which can be used for polyolefin process optimization in applications like process development and quality monitoring.
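To make the data-driven quality modeling described above concrete, here is a minimal illustrative sketch (not the author's code) of a partial least squares model relating plant process variables to a polymer melt index. The process variables, the synthetic data, and the chosen number of latent components are all assumptions made for illustration.

```python
# Hedged sketch: PLS regression as a data-based soft sensor for melt index.
# The "plant data" below are synthetic stand-ins for real process measurements.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical process variables: reactor temperature, pressure,
# H2/monomer ratio, catalyst feed rate (standardized units)
X = rng.normal(size=(500, 4))
# Synthetic melt index with an assumed dependence on the inputs plus noise
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pls = PLSRegression(n_components=2)  # latent variables absorb correlated inputs
pls.fit(X_train, y_train)

print("held-out R^2:", pls.score(X_test, y_test))
print("X loadings (how each input drives the quality target):")
print(pls.x_loadings_)
```

As the abstract notes, such a purely data-based model is only trustworthy inside the operating range it was trained on; a hybrid SGML approach would constrain or correct it with the first-principles kinetic model.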
123

Analysis of Information Diffusion through Social Media

Khalili, Nastaran 16 June 2021 (has links)
Changes in the way people communicate have changed the world from different perspectives. Public participation on social media means the generation of, diffusion of, and exposure to a tremendous amount of user-generated content without supervision. This four-essay dissertation analyzes information diffusion through social media, and its opportunities and challenges, using management systems engineering and data analytics. First, we evaluate how information can be shared to reach maximum exposure for the case of online petitions. We use system dynamics modeling and propose policies for campaign managers to schedule the reminders they send so as to obtain the highest number of petition signatures. We find that sending reminders is more effective while the signature rate is increasing. In the second essay, we investigate how people build trust/mistrust in science during an emergency. We use data analytics methods on more than 700,000 tweets containing the keywords hydroxychloroquine and chloroquine, two candidate medicines for preventing and treating COVID-19. We show that people's opinions are concentrated in the case of polarity and spread out in the case of subjectivity. Also, they tend to share subjective tweets more than objective ones. In the third essay, building on the same dataset as essay two, we study the changes in science communication during the coronavirus pandemic. We used topic modeling and clustered the tweets into seven different groups. Our analysis suggests that a highly scientific and health-related subject can become political in the case of an emergency. We found that the medical information and research-and-study groups have fewer tweets than the political one. Fourth, we investigated fake news diffusion as one of the main challenges of user-generated content. We built a system dynamics model and analyzed the effects of competition and correction in combating fake news. We show that correction of misinformation and competition with fake news need a high percentage of participation to be effective in dealing with fake news. / Doctor of Philosophy / The prevalence of social media among people has changed information diffusion in several ways. This change has caused the emergence of a variety of opportunities and challenges, instances of which we discuss in this dissertation in four main essays. In the first essay, we study online social and political campaigns. Given that the main goal of campaign managers is to gain the highest reach and number of signatures, we generate a model to show the effects of sending reminders after the initial announcement, and of their schedule, on the final total number of signatures. We found that the best policy for online petition success is sending reminders when people are increasingly signing it rather than when people lose interest in it. In the second essay, we investigated how people build trust/mistrust in scientific information in emergency cases. We collected and analyzed public tweets about two candidate medicines for preventing and treating COVID-19. Our results suggest that people trust and retweet information that is based on emotions and judgments more than information containing facts. We evaluated science communication during the mentioned emergency by further investigating the same dataset in the third essay. We clustered all the tweets, based on the words they used, into seven different groups and labeled each of them. Then, we focused on three groups: medical, research and study, and political. Our analysis suggests that although the subject is a health-related scientific one, the number of tweets in the political group is greater than in the other clusters. In the fourth essay, we analyzed fake news diffusion through social media and the effects of correction and competition on it. In this context, correction means the reaction to misinformation that states its falsity or provides counter-facts based on truth. We created a model and simulated the competition, considering novelty as one influential factor in sharing. The results of this study reveal that active participation in correction and competition is needed to combat fake news effectively.
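As a concrete illustration of the topic-clustering step described above, the following is a minimal sketch (not the dissertation's actual pipeline) that groups a handful of invented tweets with latent Dirichlet allocation. The tweets, the vectorizer settings, and the use of three topics for this toy corpus are all assumptions; the study itself clustered the real tweets into seven groups.

```python
# Hedged sketch: bag-of-words + LDA topic modeling on placeholder tweets.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "new study on hydroxychloroquine dosing and safety",
    "this is all politics, not medicine",
    "doctors report mixed results in a small trial",
    "the government should ban this drug now",
    "preprint suggests no benefit for hospitalized patients",
]

vectorizer = CountVectorizer(stop_words="english")  # drop very common words
doc_term = vectorizer.fit_transform(tweets)

# Three topics keeps the toy example readable; on the real corpus the number
# of topics (seven in the dissertation) would be chosen by model fit.
lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(doc_term)

terms = vectorizer.get_feature_names_out()
for k, component in enumerate(lda.components_):
    top_terms = [terms[i] for i in component.argsort()[-5:][::-1]]
    print(f"topic {k}: {', '.join(top_terms)}")
```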
124

Data Integration Methodologies and Services for Evaluation and Forecasting of Epidemics

Deodhar, Suruchi 31 May 2016 (has links)
Most epidemiological systems described in the literature are built for evaluation and analysis of specific diseases, such as influenza-like illness. The modeling environments that support these systems are implemented for specific diseases and epidemiological models, and hence are not reusable or extendable. This thesis focuses on the design and development of an integrated analytical environment with flexible data integration methodologies and multi-level web services for evaluation and forecasting of various epidemics in different regions of the world. The environment supports analysis of epidemics based on any combination of disease, surveillance sources, epidemiological models, geographic regions, and demographic factors. The environment also supports evaluation and forecasting of epidemics when various policy-level and behavioral interventions are applied that may inhibit the spread of an epidemic. First, we describe data integration methodologies and schema design for flexible experiment design, storage, and query retrieval mechanisms related to large-scale epidemic data. We describe novel techniques for data transformation, optimization, pre-computation, and automation that enable the flexibility, extendibility, and efficiency required in different categories of query processing. Second, we describe the design and engineering of adaptable middleware platforms based on service-oriented paradigms for interactive workflow, communication, and decoupled integration. This supports large-scale multi-user applications with provision for online analysis of interventions as well as analytical processing of forecast computations. Using a service-oriented architecture, we have provided a platform-as-a-service representation for evaluation and forecasting of epidemics. We demonstrate the applicability of our integrated environment through the development of two applications, DISIMS and EpiCaster. DISIMS is an interactive web-based system for evaluating the effects of dynamic intervention strategies on epidemic propagation. EpiCaster is a situation assessment and forecasting tool for projecting the state of evolving epidemics such as flu and Ebola in different regions of the world. We discuss how our platform uses existing technologies to solve a novel problem in epidemiology, and provides a unique solution on which different applications can be built for analyzing epidemic containment strategies. / Ph. D.
125

The Impact of Sleep Disorders on Driving Safety - Findings from the SHRP2 Naturalistic Driving Study

Liu, Shuyuan 15 June 2017 (has links)
This study is the first examination of the association between seven types of sleep disorder and driving risk using large-scale naturalistic driving study data involving more than 3,400 participants. Regression analyses revealed that female drivers with restless leg syndrome or sleep apnea, and drivers with insomnia, shift work sleep disorder, or periodic limb movement disorder, have significantly higher driving risk than drivers without those conditions. Furthermore, despite a small number of observations, there is a strong indication of increased risk for narcoleptic drivers. The findings confirm results from simulator and epidemiological studies that driving risk increases among people with certain types of sleep disorders. However, this study did not yield evidence in naturalistic driving settings to confirm the significantly increased driving risk associated with migraine reported in prior research. The inconsistency may be an indication that the significant decline in cognitive performance among drivers with sleep disorders observed in laboratory settings does not necessarily translate to an increase in actual driving risk. Further research is necessary to determine how to incentivize drivers with specific sleep disorders to balance road safety and personal mobility. / Master of Science
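The regression analyses referred to above can be illustrated with a small sketch. This is not the SHRP2 analysis itself: the data frame is synthetic, the variable names are invented stand-ins, and a simple logistic model with a sex interaction merely mirrors the general form of the reported sex-specific associations.

```python
# Hedged sketch: logistic regression of a binary crash/near-crash outcome on
# sleep-disorder indicators, with a sex interaction term. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "insomnia": rng.integers(0, 2, n),
    "sleep_apnea": rng.integers(0, 2, n),
    "female": rng.integers(0, 2, n),
})
# Assumed data-generating process: elevated risk for insomnia overall and for
# sleep apnea specifically among female drivers
log_odds = -2.0 + 0.6 * df.insomnia + 0.8 * df.sleep_apnea * df.female
df["event"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-log_odds)))

model = smf.logit("event ~ insomnia + sleep_apnea * female", data=df).fit(disp=0)
print(model.summary())  # exponentiated coefficients give odds ratios
```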
126

Theoretical framework and case studies for improving the optimization of the branch network of a banking company in Lima Metropolitana

Briones Gallegos, Fernando David 15 June 2021 (has links)
The research is motivated by the major digital transformation process that banks are undergoing, which involves a new channel strategy and educating customers to make greater use of digital applications. This is key if these organizations want to survive in the medium term, given that new competitors are entering the market. The objective of the research is to identify the theoretical sources that help propose the best solution to the problem identified when diagnosing the processes at Banco ABC: improving the physical-channel optimization process using marketing analytics and data mining. As theoretical foundations, it draws on clustering machine learning algorithms related to k-means models and multivariate regression. The procedure consists of researching, across different academic sources, process diagnostic tools as well as tools for the improvement proposal, such as marketing analytics and data mining concepts and algorithms like regressions and clustering. Finally, three cases posing problems similar to the one to be addressed, from different industries, are analyzed in order to compare the methodologies to follow. As a result, a complete list of solid theoretical-framework concepts supporting the proposed solution was consolidated; in addition, across the three cases a clear procedure for approaching a clustering problem was identified. The main conclusion is that there is now abundant information on these topics, and practical cases such as those addressed here, to support any marketing analytics proposal for a specific problem. Readers are advised to have prior theoretical knowledge of applied statistics and of simpler algorithms such as linear regressions, so that the theory covered can be easily understood when searching for information of this kind.
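Because the proposed improvement rests on k-means clustering, a minimal illustrative sketch follows. It is not Banco ABC's model: the branch-level features, the synthetic data, and the choice of four clusters are assumptions made purely for illustration.

```python
# Hedged sketch: segmenting bank branches with k-means on invented features.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Hypothetical features per branch: monthly teller transactions,
# share of digitally active customers, average daily footfall
branches = rng.normal(loc=[5000, 0.4, 300], scale=[1500, 0.15, 80], size=(120, 3))

scaler = StandardScaler().fit(branches)
X = scaler.transform(branches)  # standardize so no feature dominates the distance

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# Centroids back in original units suggest which branch segments are candidates
# for downsizing, digital-first conversion, or expansion
print(scaler.inverse_transform(kmeans.cluster_centers_))
```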
127

WiSDM: a platform for crowd-sourced data acquisition, analytics, and synthetic data generation

Choudhury, Ananya 15 August 2016 (has links)
Human behavior is a key factor influencing the spread of infectious diseases. Individuals adapt their daily routine and typical behavior during the course of an epidemic -- the adaptation is based on their perception of the risk of contracting the disease and its impact. As a result, it is desirable to collect behavioral data before and during a disease outbreak. Such data can help in creating better computer models that can, in turn, be used by epidemiologists and policy makers to better plan and respond to infectious disease outbreaks. However, traditional data collection methods are not well suited to the task of acquiring human behavior related information, especially as it pertains to epidemic planning and response. Internet-based methods are an attractive complementary mechanism for collecting behavioral information. Systems such as Amazon Mechanical Turk (MTurk) and online survey tools provide simple ways to collect such information. This thesis explores new methods for information acquisition, especially of behavioral information, that leverage this recent technology. Here, we present the design and implementation of a crowd-sourced surveillance data acquisition system -- WiSDM. WiSDM is a web-based application and can be used by anyone with access to the Internet and a browser. Furthermore, it is designed to leverage online survey tools and MTurk; WiSDM can be embedded within MTurk in an iFrame. WiSDM has a number of novel features, including (i) the ability to support a model-based abductive reasoning loop: a flexible and adaptive information acquisition scheme driven by causal models of epidemic processes; (ii) question routing: an important feature to increase data acquisition efficacy and reduce survey fatigue; and (iii) integrated surveys: interactive surveys that provide additional information on the survey topic and improve user motivation. We evaluate the framework's performance using Apache JMeter and present our results. We also discuss three other extensions of WiSDM: the API Adapter, the Synthetic Data Generator, and WiSDM Analytics. The API Adapter is an ETL extension of WiSDM which enables extracting data from disparate data sources and loading it into the WiSDM database. The Synthetic Data Generator allows epidemiologists to build synthetic survey data using NDSSL's Synthetic Population as agents. WiSDM Analytics empowers users to perform analysis on the data by writing simple Python code using Versa APIs. We also propose a data model that is conducive to survey data analysis. / Master of Science
128

Exploring the Landscape of Big Data Analytics Through Domain-Aware Algorithm Design

Dash, Sajal 20 August 2020 (has links)
Experimental and observational data emerging from various scientific domains necessitate fast, accurate, and low-cost analysis of the data. While exploring the landscape of big data analytics, multiple challenges arise from three characteristics of big data: the volume, the variety, and the velocity. High volume and velocity of the data warrant a large amount of storage, memory, and compute power, while a large variety of data demands cognition across domains. Addressing domain-intrinsic properties of data can help us analyze the data efficiently through the frugal use of high-performance computing (HPC) resources. In this thesis, we present our exploration of the data analytics landscape with domain-aware approximate and incremental algorithm design. We propose three guidelines targeting three properties of big data for domain-aware big data analytics: (1) explore geometric and domain-specific properties of high dimensional data for succinct representation, which addresses the volume property, (2) design domain-aware algorithms through mapping of domain problems to computational problems, which addresses the variety property, and (3) leverage incremental arrival of data through incremental analysis and invention of problem-specific merging methodologies, which addresses the velocity property. We demonstrate these three guidelines through the solution approaches of three representative domain problems. We present Claret, a fast and portable parallel weighted multi-dimensional scaling (WMDS) tool, to demonstrate the application of the first guideline. It combines algorithmic concepts extended from stochastic force-based multi-dimensional scaling (SF-MDS) and Glimmer. Claret computes approximate weighted Euclidean distances by combining a novel data mapping called stretching with the Johnson-Lindenstrauss lemma to reduce the complexity of WMDS from O(f(n)d) to O(f(n) log d). In demonstrating the second guideline, we map the problem of identifying multi-hit combinations of genetic mutations responsible for cancers to the weighted set cover (WSC) problem by leveraging the semantics of cancer genomic data obtained from cancer biology. Solving the mapped WSC with an approximate algorithm, we identified a set of multi-hit combinations that differentiate between tumor and normal tissue samples. To identify three- and four-hit combinations, which require orders of magnitude more computational power, we scaled out the WSC algorithm on a hundred nodes of the Summit supercomputer. In demonstrating the third guideline, we developed a tool, iBLAST, to perform incremental sequence similarity search. Developing new statistics to combine search results over time makes incremental analysis feasible. iBLAST performs (1+δ)/δ times faster than NCBI BLAST, where δ represents the fraction of database growth. We also explored various approaches to mitigate catastrophic forgetting in incremental training of deep learning models. / Doctor of Philosophy / Experimental and observational data emerging from various scientific domains necessitate fast, accurate, and low-cost analysis of the data. While exploring the landscape of big data analytics, multiple challenges arise from three characteristics of big data: the volume, the variety, and the velocity. Here, volume represents the data's size, variety represents the various sources and formats of the data, and velocity represents the data arrival rate. High volume and velocity of the data warrant a large amount of storage, memory, and computational power. In contrast, a large variety of data demands cognition across domains. Addressing domain-intrinsic properties of data can help us analyze the data efficiently through the frugal use of high-performance computing (HPC) resources. This thesis presents our exploration of the data analytics landscape with domain-aware approximate and incremental algorithm design. We propose three guidelines targeting three properties of big data for domain-aware big data analytics: (1) explore geometric (pair-wise distance and distribution-related) and domain-specific properties of high dimensional data for succinct representation, which addresses the volume property, (2) design domain-aware algorithms through mapping of domain problems to computational problems, which addresses the variety property, and (3) leverage incremental data arrival through incremental analysis and invention of problem-specific merging methodologies, which addresses the velocity property. We demonstrate these three guidelines through the solution approaches of three representative domain problems. We demonstrate the application of the first guideline through the design and development of Claret, a fast and portable parallel weighted multi-dimensional scaling (WMDS) tool that can reduce the dimension of high-dimensional data points. In demonstrating the second guideline, we identify combinations of cancer-causing gene mutations by mapping the problem to a well-known computational problem, the weighted set cover (WSC) problem. We scaled out the WSC algorithm on a hundred nodes of the Summit supercomputer to solve the problem in less than two hours instead of an estimated hundred years. In demonstrating the third guideline, we developed a tool, iBLAST, to perform incremental sequence similarity search, an analysis made possible by developing new statistics to combine search results over time. We also explored various approaches to mitigate catastrophic forgetting in deep learning models, where a model forgets how to perform machine learning tasks efficiently on older data in a streaming setting.
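To make the second guideline concrete, here is a minimal sketch of the classic greedy approximation for weighted set cover, the computational problem the multi-hit mutation analysis is mapped onto. This is not Claret or the Summit-scale implementation: the gene names, samples, and weights below are hypothetical; in the thesis the sets and weights come from actual cancer genomic data.

```python
# Hedged sketch: greedy weighted set cover, picking the set with the best
# ratio of newly covered elements to weight at each step.
def greedy_weighted_set_cover(universe, sets, weights):
    covered, chosen = set(), []
    while covered != universe:
        candidates = [name for name in sets if name not in chosen]
        best = max(candidates, key=lambda name: len(sets[name] - covered) / weights[name])
        if not sets[best] - covered:
            raise ValueError("remaining sets cannot cover the universe")
        chosen.append(best)
        covered |= sets[best]
    return chosen

# Hypothetical instance: tumor samples to cover, and the samples in which each
# gene is mutated; lower weight = mutation is rarer in normal tissue
tumor_samples = {"s1", "s2", "s3", "s4", "s5"}
gene_to_samples = {
    "TP53": {"s1", "s2", "s3"},
    "KRAS": {"s3", "s4"},
    "PIK3CA": {"s4", "s5"},
    "BRAF": {"s5"},
}
gene_weights = {"TP53": 1.0, "KRAS": 2.0, "PIK3CA": 1.5, "BRAF": 3.0}

print(greedy_weighted_set_cover(tumor_samples, gene_to_samples, gene_weights))
# -> ['TP53', 'PIK3CA'] for this toy instance
```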
129

Building and Evaluating a Learning Environment for Data Structures and Algorithms Courses

Fouh Mbindi, Eric Noel 29 April 2015 (has links)
Learning technologies in computer science education have been most closely associated with the teaching of programming, including automatic assessment of programming exercises. However, when it comes to teaching computer science content and concepts, learning technologies have not been heavily used. Perhaps the best known application today is Algorithm Visualization (AV), of which there are hundreds of examples. AVs tend to focus on presenting the procedural aspects of how a given algorithm works, rather than more conceptual content. There are also new electronic textbooks (eTextbooks) that incorporate the ability to edit and execute program examples. For many traditional courses, a longstanding problem is the lack of sufficient practice exercises with feedback to the student. Automated assessment provides a way to increase the number of exercises on which students can receive feedback. Interactive eTextbooks have the potential to make it easy for instructors to introduce both visualizations and practice exercises into their courses. OpenDSA is an interactive eTextbook for data structures and algorithms (DSA) courses. It integrates tutorial content with AVs and automatically assessed interactive exercises. Since Spring 2013, OpenDSA has been regularly used to teach a fundamental data structures and algorithms course (CS2) as well as a more advanced data structures, algorithms, and analysis course (CS3) at various institutions of higher education. In this thesis, I report on findings from early adoption of the OpenDSA system. I describe how OpenDSA's design addresses obstacles in the use of AV systems. I identify a wide variety of uses for OpenDSA in the classroom. I found that instructors used OpenDSA exercises as graded assignments in all the courses where it was used. Some instructors assigned an OpenDSA assignment before lectures and started spending more time teaching higher-level concepts. Some instructors also used OpenDSA to implement a ``flipped classroom''. I found that students are enthusiastic about OpenDSA and voluntarily used the AVs embedded within it. Students found OpenDSA beneficial and expressed a preference for a class format that included using OpenDSA as part of the assigned graded work. The relationship between OpenDSA and students' performance was inconclusive, but I found that students with higher grades tend to complete more exercises. / Ph. D.
130

Big data-driven fuzzy cognitive map for prioritising IT service procurement in the public sector

Choi, Y., Lee, Habin, Irani, Zahir 2016 August 1917 (has links)
Yes / The prevalence of big data is starting to spread across the public and private sectors; however, an impediment to its widespread adoption revolves around a lack of appropriate big data analytics (BDA) and the resulting skills to exploit the full potential of big data availability. In this paper, we propose a novel BDA to contribute towards filling this void, using a fuzzy cognitive map (FCM) approach that will enhance decision-making and thus prioritise IT service procurement in the public sector. This is achieved through the development of decision models that capture the strengths of both data analytics and the established intuitive qualitative approach. By taking advantage of both data analytics and FCM, the proposed approach captures the strength of data-driven decision-making and intuitive model-driven decision modelling. This approach is then validated through a decision-making case regarding IT service procurement in the public sector, which is the fundamental step of IT infrastructure supply for the public in a regional government in the Russian Federation. The analysis result for the given decision-making problem is then evaluated by decision makers and e-government experts to confirm the applicability of the proposed BDA, demonstrating the value of this approach in contributing towards robust public decision-making regarding IT service procurement. / EU FP7 project Policy Compass (Project No. 612133)
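For readers unfamiliar with fuzzy cognitive maps, the following is a minimal sketch of the standard FCM inference step: concept activations are repeatedly pushed through a signed weight matrix and a sigmoid squashing function until they stabilise. The concepts and weights here are invented placeholders, not the paper's procurement model, and the update rule shown is one common convention among several.

```python
# Hedged sketch: basic fuzzy cognitive map iteration with a sigmoid transfer function.
import numpy as np

def run_fcm(weights, state, lam=1.0, max_iter=100, tol=1e-5):
    """Iterate A <- sigmoid(A + A @ W) until the activation vector converges."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-lam * x))
    for _ in range(max_iter):
        new_state = sigmoid(state + state @ weights)
        if np.max(np.abs(new_state - state)) < tol:
            return new_state
        state = new_state
    return state

# Hypothetical concepts: 0 budget pressure, 1 citizen demand, 2 vendor capability,
# 3 priority of the IT service; W[i, j] is the causal influence of concept i on j
W = np.array([
    [0.0, 0.0, 0.0, -0.6],
    [0.0, 0.0, 0.0,  0.8],
    [0.0, 0.0, 0.0,  0.5],
    [0.0, 0.0, 0.0,  0.0],
])
initial = np.array([0.7, 0.9, 0.4, 0.0])  # initial activation of each concept

final = run_fcm(W, initial)
print(final)  # the last entry gives the steady-state priority of the service
```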
