251

Human Learning-Augmented Machine Learning Frameworks for Text Analytics

Xia, Long 18 May 2020 (has links)
Artificial intelligence (AI) has made astonishing breakthroughs in recent years and achieved performance comparable to, or even better than, humans on many real-world tasks and applications. However, it is still far from reaching human-level intelligence in many ways. Specifically, although AI may take inspiration from neuroscience and cognitive psychology, it differs dramatically from humans in both what it learns and how it learns. Given that current AI cannot learn as effectively and efficiently as humans do, a natural solution is to analyze human learning processes and project them into AI design. This dissertation presents three studies that examined cognitive theories and established frameworks to integrate crucial human cognitive learning elements into AI algorithms to build human learning–augmented AI in the context of text analytics. The first study examined compositionality—how information is decomposed into small pieces, which are then recomposed to generate larger pieces of information. Compositionality is considered a fundamental cognitive process and one of the best explanations for humans' quick learning abilities. Thus, integrating compositionality, which AI has not yet mastered, could potentially improve AI's learning performance. Focusing on text analytics, we first examined three levels of compositionality that can be captured in language. We then adopted design science paradigms to integrate these three types of compositionality into a deep learning model to build a unified learning framework. Lastly, we extensively evaluated the design on a series of text analytics tasks and confirmed its superiority in improving AI's learning effectiveness and efficiency. The second study focused on transfer learning, a core process in human learning. People can efficiently and effectively use knowledge learned previously to solve new problems. Although transfer learning has been extensively studied in AI research and is often a standard procedure in building machine learning models, existing techniques cannot transfer knowledge as effectively and efficiently as humans. To solve this problem, we first drew on the theory of transfer learning to analyze the human transfer learning process and identify the key elements that elude AI. Then, following the design science paradigm, we proposed a novel transfer learning framework that explicitly captures these cognitive elements. Finally, we assessed the design artifact's capability to improve transfer learning performance and validated that our proposed framework outperforms state-of-the-art approaches on a broad set of text analytics tasks. The two studies above researched knowledge composition and knowledge transfer, while the third study addressed knowledge itself by focusing on knowledge structure, retrieval, and utilization processes. We identified that, despite the great progress achieved by current knowledge-aware AI algorithms, they do not deal with complex knowledge in a way that is consistent with how humans manage knowledge. Grounded in schema theory, we proposed a new design framework to enable AI-based text analytics algorithms to retrieve and utilize knowledge in a more human-like way. We confirmed that our framework outperformed current knowledge-based algorithms by large margins with strong robustness. In addition, we evaluated the efficacy of each key design element in greater detail.
/ Doctor of Philosophy / This dissertation presents three studies that examined cognitive theories and established frameworks to integrate crucial human cognitive learning elements into artificial intelligence (AI) algorithm designs to build human learning–augmented AI in the context of text analytics. The first study examined compositionality—how information is decomposed into small pieces, which are then recomposed to generate larger pieces of information. A design science research methodology was adopted to propose a novel deep learning–based framework that can incorporate three levels of compositionality in language with significantly improved learning performance on a series of text analytics tasks. The second study went beyond that basic element and focused on transfer learning—how humans can efficiently and effectively use knowledge learned previously to solve new problems. Our novel transfer learning framework, which is grounded in the theory of transfer learning, was validated on a broad set of text analytics tasks with improved learning effectiveness and efficiency. Finally, the third study directly addressed knowledge itself by focusing on knowledge structure, retrieval, and utilization processes. We drew on schema theory and proposed a new design framework to enable AI-based text analytics algorithms to retrieve and utilize knowledge in a more human-like way. Lastly, we confirmed our design's superiority in dealing with knowledge on several common text analytics tasks compared to existing knowledge-based algorithms.
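The second study's starting point, reusing previously learned knowledge on a new task, is the standard transfer-learning recipe in machine learning. Below is a minimal sketch of that baseline recipe (the dissertation's own framework extends beyond it; class and parameter names here are illustrative, not taken from the thesis):

```python
# A hypothetical sketch of the standard transfer-learning recipe: freeze a
# pretrained text encoder and train only a small task head. All names here
# (TransferClassifier, hidden_dim) are illustrative, not from the dissertation.
import torch
import torch.nn as nn

class TransferClassifier(nn.Module):
    def __init__(self, pretrained_encoder: nn.Module, hidden_dim: int, num_classes: int):
        super().__init__()
        self.encoder = pretrained_encoder           # knowledge from the source task
        for p in self.encoder.parameters():         # freeze: transfer, don't relearn
            p.requires_grad = False
        self.head = nn.Linear(hidden_dim, num_classes)  # new target-task head

    def forward(self, x):
        with torch.no_grad():
            h = self.encoder(x)                     # reuse source representations
        return self.head(h)

# Usage with a stand-in "pretrained" encoder (illustrative only):
encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())
model = TransferClassifier(encoder, hidden_dim=128, num_classes=4)
logits = model(torch.randn(8, 300))                 # 8 pooled text embeddings
```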
252

A Framework for Automated Discovery and Analysis of Suspicious Trade Records

Datta, Debanjan 27 May 2022 (has links)
Illegal logging and timber trade present a persistent threat to global biodiversity and national security due to their ties with illicit financial flows, and cause revenue loss. The scale of global commerce in timber and associated products, combined with the complexity and geographical spread of the supply chain entities, presents a non-trivial challenge in detecting such transactions. International shipment records, specifically those containing bills of lading, are a key source of data that can be used to detect, investigate and act upon such transactions. The comprehensive problem can be described as building a framework that can perform automated discovery and facilitate actionability on detected transactions. A data-driven, machine learning based approach is necessitated by the volume, velocity and complexity of international shipping data. Such an automated framework can immensely benefit our targeted end-users---specifically the enforcement agencies. This overall problem comprises multiple connected sub-problems with associated research questions. We incorporate crucial domain knowledge---in terms of data as well as modeling---by employing the expertise of collaborating domain specialists from ecological conservation agencies. The collaborators provide formal and informal inputs spanning the stages from requirement specification to design. Following the paradigm of similar problems such as fraud detection explored in prior literature, we formulate the core problem of discovering suspicious transactions as an anomaly detection task. The first sub-problem is to build a system that can be used to find suspicious transactions in shipment data pertaining to imports and exports of multiple countries with different country-specific schemas. We present a novel anomaly detection approach for multivariate categorical data, following constraints of data characteristics, combined with a data pipeline that incorporates domain knowledge. The second problem focuses on U.S.-specific imports, where data characteristics differ from the prior sub-problem, with heterogeneous attributes present. This problem is important since the U.S. is a top consumer and there is scope for actionable enforcement. For this, we present a contrastive learning based anomaly detection model for heterogeneous tabular data, with performance and scalability characteristics applicable to real-world trade data. While the first two problems address the task of detecting suspicious trades through anomaly detection, a practical challenge with anomaly detection based systems is that of relevancy, or scenario-specific precision. The third sub-problem addresses this through a human-in-the-loop approach augmented by visual analytics, which re-ranks anomalies in terms of relevance, providing explanations for the cause of anomalies and soliciting feedback. The last sub-problem pertains to explainability and actionability for suspicious records through algorithmic recourse. Algorithmic recourse aims to provide meaningful alternatives for flagged anomalous records, such that those counterfactual examples are not judged anomalous by the underlying anomaly detection system. This can help enforcement agencies advise verified trading entities in modifying their trading patterns to avoid false detection, thus streamlining the process. We present a novel formulation and metrics for this unexplored problem of algorithmic recourse in anomaly detection, and a deep learning based approach to explaining anomalies and generating counterfactuals. Thus, the overall research contributions presented in this dissertation address the requirements of the framework and have general applicability in similar scenarios beyond its scope. / Doctor of Philosophy / Illegal timber trade presents multiple global challenges to ecological biodiversity, vulnerable ecosystems, national security and revenue collection. Enforcement agencies---the target end-users of this framework---face a myriad of challenges in discovering and acting upon shipments with illegal timber that violate national and transnational laws, due to the volume and complexity of shipment data coupled with logistical hurdles. This necessitates an automated framework based upon shipment data that can address this task by solving problems of discovery, analysis and actionability. The overall problem is decomposed into self-contained sub-problems that address the associated specific research questions. These comprise anomaly detection in multiple types of high-dimensional tabular data, improving the precision of anomaly detection through expert feedback, and algorithmic recourse for anomaly detection. We present data mining and machine learning solutions to each of the sub-problems that overcome the limitations and inapplicability of prior approaches. Further, we address two broader research questions. The first is the incorporation of domain knowledge into the framework, which we accomplish through collaboration with domain experts from environmental conservation organizations. The second is the issue of explainability in anomaly detection for tabular data in multiple contexts. Such real-world data presents challenges of complexity and scalability, especially given its tabular format, which brings its own set of challenges for machine learning. The solutions presented to the machine learning problems associated with each component of the framework provide an end-to-end solution to its requirements. More importantly, the models and approaches presented in this dissertation have applicability beyond the application scenario, to similar data and application-specific challenges.
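As a rough illustration of the recourse formulation described above, the sketch below searches for a counterfactual that an anomaly scorer no longer flags, while penalizing distance from the original record. This is a generic gradient-based sketch under assumed interfaces (a differentiable score function and a fixed threshold), not the dissertation's method:

```python
# A generic sketch of algorithmic recourse for anomaly detection: nudge a
# flagged record x until an (assumed differentiable) anomaly scorer drops it
# below threshold, while staying close to the original. Not the thesis method.
import torch

def recourse(x, anomaly_score, threshold, lam=0.1, steps=200, lr=0.05):
    """Return a counterfactual x' with anomaly_score(x') < threshold."""
    x_cf = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        loss = anomaly_score(x_cf) + lam * torch.norm(x_cf - x)  # score + proximity
        opt.zero_grad()
        loss.backward()
        opt.step()
        if anomaly_score(x_cf).item() < threshold:   # no longer judged anomalous
            break
    return x_cf.detach()
```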
253

Visual Analytics for High Dimensional Simulation Ensembles

Dahshan, Mai Mansour Soliman Ismail 10 June 2021 (has links)
Recent advancements in data acquisition, storage, and computing power have enabled scientists from various scientific and engineering domains to simulate more complex and longer phenomena. Scientists are usually interested in understanding the behavior of a phenomenon under different conditions. To do so, they run multiple simulations with different configurations (i.e., parameter settings, boundary/initial conditions, or computational models), resulting in an ensemble dataset. An ensemble empowers scientists to quantify the uncertainty in the simulated phenomenon in terms of the variability between ensemble members, parameter sensitivity and optimization, and the characteristics and outliers within the ensemble members, which could lead to valuable insights about the simulated model. The size, complexity, and high dimensionality (e.g., simulation input and output parameters) of simulation ensembles pose a great challenge in their analysis and exploration. Ensemble visualization provides a convenient way to convey the main characteristics of the ensemble for enhanced understanding of the simulated model. The majority of current ensemble visualization techniques focus on analyzing either the ensemble space or the parameter space. Most parameter space visualizations are not designed for high-dimensional data sets or do not show the intrinsic structures in the ensemble. Conversely, the ensemble space has been visualized either as a comparative visualization of a limited number of ensemble members or as an aggregation of multiple ensemble members that omits potential details of the original ensemble. Thus, to unfold the full potential of simulation ensembles, we designed and developed an approach to the visual analysis of high-dimensional simulation ensembles that merges sensemaking, human expertise, and intuition with machine learning and statistics. In this work, we explore how semantic interaction and sensemaking could be used to build interactive and intelligent visual analysis tools for simulation ensembles. Specifically, we focus on the complex processes that derive meaningful insights from exploring and iteratively refining the analysis of high-dimensional simulation ensembles when prior knowledge about ensemble features and correlations is limited or unavailable. We first developed GLEE (Graphically-Linked Ensemble Explorer), an exploratory visualization tool that enables scientists to analyze and explore correlations and relationships between non-spatial ensembles and their parameters. Then, we developed Spatial GLEE, an extension to GLEE that explores spatial data while simultaneously considering spatial characteristics (i.e., autocorrelation and spatial variability) and the dimensionality of the ensemble. Finally, we developed Image-based GLEE to explore exascale simulation ensembles produced by in-situ visualization. We collaborated with domain experts to evaluate the effectiveness of GLEE using real-world case studies and experiments from different domains. The core contributions of this work are: a visual approach that enables the simultaneous exploration of parameter and ensemble spaces for 2D/3D high-dimensional ensembles; three interactive visualization tools that support searching, filtering, and sensemaking for non-spatial, spatial, and image-based ensembles; and real-world cases from different domains that demonstrate the effectiveness of the proposed approach.
The aim of the proposed approach is to help scientists gain insights by answering questions or testing hypotheses about different aspects of the simulated phenomenon and to facilitate knowledge discovery in complex datasets. / Doctor of Philosophy / Scientists run simulations to understand complex phenomena and processes that are expensive, difficult, or even impossible to reproduce in the real world. Current advancements in high-performance computing have enabled scientists from various domains, such as climate, computational fluid dynamics, and aerodynamics, to run more complex simulations than before. However, a single simulation run is not enough to capture all features of a simulated phenomenon. Therefore, scientists run multiple simulations using perturbed input parameters, initial and boundary conditions, or different models, resulting in what is known as an ensemble. An ensemble empowers scientists to understand a model's behavior by studying relationships between and among ensemble members, the optimal parameter settings, and the influence of input parameters on the simulation output, which could lead to useful knowledge and insights about the simulated phenomenon. To effectively analyze and explore simulation ensembles, visualization techniques play a significant role in facilitating knowledge discovery through graphical representations. Ensemble visualization offers scientists a better way to understand the simulated model. Most current ensemble visualization techniques are designed to analyze or explore either the ensemble space or the parameter space. Therefore, we designed and developed a visual analysis approach for exploring and analyzing high-dimensional parameter and ensemble spaces simultaneously by integrating machine learning and statistics with sensemaking and human expertise. The contribution of this work is to explore how semantic interaction and sensemaking can be used to explore and analyze high-dimensional simulation ensembles. To do so, we designed and developed a visual analysis approach that manifested in an exploratory visualization tool, GLEE (Graphically-Linked Ensemble Explorer), that allows scientists to explore, search, filter, and make sense of high-dimensional 2D/3D simulation ensembles. GLEE's visualization pipeline and interaction techniques use deep learning, feature extraction, spatial regression, and Semantic Interaction (SI) techniques to support the exploration of non-spatial, spatial, and image-based simulation ensembles. GLEE's different visualization tools were evaluated with domain experts from different fields using real-world case studies and experiments.
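To make the idea of jointly examining ensemble and parameter spaces concrete, here is a minimal sketch (not GLEE itself) that projects high-dimensional ensemble outputs to 2D and relates each input parameter to the resulting layout; the data shapes and the choice of PCA are assumptions for illustration:

```python
# A minimal sketch (not GLEE) of linking ensemble space to parameter space:
# project high-dimensional outputs to 2D, then check how each input parameter
# relates to the layout. Data shapes and the PCA choice are assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
params = rng.uniform(size=(100, 5))        # 100 runs x 5 input parameters
outputs = rng.normal(size=(100, 5000))     # 100 runs x high-dimensional output

coords = PCA(n_components=2).fit_transform(outputs)   # ensemble-space layout
for j in range(params.shape[1]):           # crude parameter-sensitivity hint
    r = np.corrcoef(params[:, j], coords[:, 0])[0, 1]
    print(f"parameter {j}: corr with embedding dim 1 = {r:+.2f}")
```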
254

Designing a Management and Referral Tool for Patients with Multiple Chronic Illnesses in Primary Care Settings

Owolabi, Flavien 11 1900 (has links)
Some local health organizations in Ontario (e.g., Local Health Integration Networks or LHINs) have put forward a strategic objective to identify patients with preventable, high-cost healthcare service usage (e.g., hospitalizations, emergency department [ED] visits). To attain this goal, primary care service providers, who are considered the entry point to the health system, need tools to help diagnose, treat and refer those patients identified as potential high users of the health care system. The goal of this study was to develop a management and referral tool to identify, manage and refer patients living with multiple comorbidities to specialized care teams such as Health Links. Data used in this analysis were obtained from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN) primary care data holdings. The dataset created for this study contained 14,004 patient records. Data analysis techniques included both statistical and predictive analytic tools. The base models included four data mining classification algorithms: Decision Tree, Naïve Bayes, Neural Network and Clustering. The predictive modeling approach was complemented by an association analysis. The one-way ANOVA analysis indicated that age and health status (number of conditions, and individual medical conditions) identified statistically significant differences in patient utilization of health services. Results from the predictive analytics showed that patient age and patient medical conditions, as well as the number of medical conditions for each patient (5 or more), could be used as criteria to develop tools (e.g., searches, reminders). Specifically, Parkinson's disease, dementia and epilepsy were found to be important predictors of (i.e., most frequently associated with) the top 4 most prevalent conditions (hypertension, osteoarthritis, depression and diabetes) within the study population. The association analysis also revealed that chronic obstructive pulmonary disease (COPD) was closely associated with the top 4 most prevalent conditions. Based on the findings of this study, Parkinson's disease, dementia, epilepsy and COPD can be used to identify patients with complex medical needs who are likely to be high users of the healthcare system and should be considered for early, personalized intervention. / Thesis / Master of Health Sciences (MSc)
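As a rough illustration of the study's base-model setup, the sketch below trains two of the named classifiers on synthetic condition-flag data and computes a simple association measure; the columns, the "5 or more conditions" proxy label, and the data are invented for illustration and are not the CPCSSN schema:

```python
# An illustrative sketch of the base-model setup: two of the named classifiers
# on synthetic condition-flag data, plus a simple association measure. The
# columns and the "5 or more conditions" proxy label are not the CPCSSN schema.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(1000, 6))    # binary flags, one per condition
y = (X.sum(axis=1) >= 5).astype(int)      # proxy: 5+ conditions -> high user

for model in (DecisionTreeClassifier(max_depth=4), BernoulliNB()):
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(type(model).__name__, f"accuracy ~ {acc:.2f}")

# Association analysis in miniature: co-occurrence of conditions 0 and 1.
support = np.mean(X[:, 0] & X[:, 1])
confidence = support / np.mean(X[:, 0])
print(f"support={support:.2f}, confidence={confidence:.2f}")
```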
255

Supporting K-12 Teachers’ Decision Making through Interactive Visualizations : A case study to improve the usability of a real-time analytic dashboard

Luo, Xinyan January 2020 (has links)
Recent research has focused on supporting teachers in the classroom. Such support has been shown to benefit from the development and deployment of teacher-facing analytic dashboards that help teachers make fast and effective decisions regarding in-class student learning activities. The evolving interest in this field has facilitated the emergence of the Teaching Analytics area of practice and research. However, current research efforts have indicated that the use of such dashboards usually adds another layer to the already dynamic and complex situation for teachers, which can divert their attention and can often be experienced as a disturbing factor in class. Therefore, it is highly important to examine how such teacher-facing dashboards can be improved from the user experience perspective, in a way that would allow teachers to grasp student learning activities easily and with good perceived usability. The aim of this study is to understand how we can better design teacher-facing dashboards to more adequately support K-12 teachers in decisions that provide relevant, timely student support. The study applies Nielsen's three-round iterative design approach to understand the existing usability problems and further develop the dashboard, originally designed by the company. In order to investigate users' perceived attitudes towards the redesigned dashboard, the final prototype was evaluated through a Technology Acceptance Model questionnaire and semi-structured interviews with nine participants. As a result, the redesigned teacher-facing dashboard was shown to have high potential to support teachers' decisions. The effectiveness of the Technology Acceptance Model was verified and placed in the general context of how tools for teachers should be designed for classroom use. Additionally, some major challenges for teachers in using external tools during class were discovered and are discussed in the context of the newly designed dashboard. / Existing research supports teachers in the classroom by developing analytic visualization tools (a.k.a. dashboards) that teachers can use to make fast and effective decisions with respect to students' learning activities. The growing interest in this area has led to the emergence of the Teaching Analytics field in practice and research. However, research has shown that using these tools usually adds another layer to the already dynamic and complex situation for teachers, which can divert their attention and often act as a disturbing factor in the classroom. It is therefore very important to examine how such visualization tools for teachers can be improved from a user perspective, in a way that would allow teachers to understand students' learning activities easily and with good perceived usability. The aim of this study is to improve the user interface of an existing dashboard so that it can more adequately support teachers in their decisions and offer relevant support to students. The study applies Nielsen's three-round iterative design method to understand the existing usability problems and further develop an existing dashboard, originally developed by the company. To investigate users' attitudes towards the redesigned tool, the final prototype was evaluated through a questionnaire and semi-structured interviews with nine participants.
The results show that the redesigned tool has great potential to support teachers' decisions in the classroom. The effectiveness of the Technology Acceptance Model (TAM) was verified and placed in the general context of how different tools for teachers should be designed for classroom use. In addition, major challenges for teachers in using external tools during lessons are discussed in the context of the new tool.
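For readers unfamiliar with how a Technology Acceptance Model questionnaire is typically scored, here is a minimal sketch that averages Likert items into the two core TAM constructs; the item groupings and responses are invented for illustration and may not match the study's actual instrument:

```python
# A minimal sketch of scoring a TAM questionnaire: average Likert items into
# the two core constructs. Item groupings and responses are invented for
# illustration; the study's actual instrument may differ.
import numpy as np

responses = np.array([          # 3 example participants x 6 items (1-5 Likert)
    [4, 5, 4, 3, 4, 4],
    [5, 4, 5, 4, 4, 5],
    [3, 3, 4, 3, 3, 4],
])

perceived_usefulness = responses[:, :3].mean(axis=1)   # items 1-3 (assumed)
perceived_ease_of_use = responses[:, 3:].mean(axis=1)  # items 4-6 (assumed)
print("PU per participant:  ", perceived_usefulness)
print("PEOU per participant:", perceived_ease_of_use)
```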
256

Understanding the Implication of Blockchain Technology on the Audit Profession

Jackson, Brittany 01 January 2018 (has links)
The purpose of this research is to identify the implications of blockchain technology for the auditing profession. By conducting interviews with current professionals in the auditing profession, as well as those in academia with a background in auditing, primary data were collected to aggregate the potential effects on the auditing profession over the next five years and the next decade. The data include assumptions about how the accounting major itself, the audit planning phase, assumptions of risk, and audit completion will change with the developing technology. The goal of this research is a better understanding of how auditing will be affected by blockchain technology, for students, current audit professionals, and those in academia. The results suggest that training of new and current employees will need to evolve with more emphasis on IT skills and analytical reasoning, that blockchain's development is on the cusp of adoption within the next decade, and that there is a current gap in the regulation of blockchain technology.
257

Advanced Data Analytics Methodologies for Anomaly Detection in Multivariate Time Series Vehicle Operating Data

Alizadeh, Morteza 06 August 2021 (has links)
Early detection of faults in vehicle operating systems is a research domain of high significance for sustaining full control of the systems, since anomalous behaviors usually cause performance loss for a long time before they are detected as critical failures. In other words, operating systems exhibit degradation when failure begins to occur. Indeed, repeated occurrences of failures in system performance are not only signals of anomalous behavior but also indicate that maintenance actions are vital to preserving system performance. Maintaining systems at nominal performance over their lifetime with the lowest maintenance cost is extremely challenging, and it is important to be aware of imminent failure before it arises and to implement the best countermeasures to avoid extra losses. In this context, timely anomaly detection in the performance of the operating system is worthy of investigation. Early detection of imminent anomalous behaviors is difficult without appropriate modeling, prediction, and analysis of the system's time series records. Data-based technologies have laid a strong foundation for developing advanced methods for modeling and predicting time series data streams. In this research, we propose novel methodologies to predict the patterns of multivariate time series operational data of a vehicle and recognize second-wise unhealthy states. These approaches support the early detection of abnormalities in vehicle behavior based on multiple data channels whose second-wise records correspond to different functional working groups in the vehicle's operating systems. Furthermore, a real case study data set is used to validate the accuracy of the proposed prediction and anomaly detection methodologies.
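As a minimal illustration of the predict-then-flag pattern the abstract describes, the sketch below uses a naive persistence forecast as a stand-in for the learned predictor and flags time steps with unusually large prediction residuals; the forecast model, the residual norm, and the threshold rule are all assumptions, not the dissertation's methods:

```python
# A minimal sketch of the predict-then-flag pattern: a naive persistence
# forecast stands in for the learned predictor, and seconds with unusually
# large residuals are flagged. Model and threshold rule are assumptions.
import numpy as np

def detect_anomalies(series: np.ndarray, k: float = 3.0) -> np.ndarray:
    """series: (T, channels) second-wise readings; returns flagged indices."""
    pred = series[:-1]                              # predict next = current
    residual = np.linalg.norm(series[1:] - pred, axis=1)
    threshold = residual.mean() + k * residual.std()
    return np.where(residual > threshold)[0] + 1

data = np.cumsum(np.random.default_rng(2).normal(size=(500, 4)), axis=0)
data[250] += 15.0                                   # inject a fault at t=250
print(detect_anomalies(data))                       # flags around t=250
```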
258

Enhancing Individualized Instruction through Hidden Markov Models

Lindberg, David Seaman, III 26 December 2014 (has links)
No description available.
259

USING GRAPH MODELING IN SEVERAL VISUAL ANALYTIC TASKS

Huang, Xiaoke 18 July 2016 (has links)
No description available.
260

AN ANALYTICAL FRAMEWORK FOR OPTIMAL PLANNING OF LONG-TERM CARE FACILITIES IN ONTARIO

Zargoush, Mohsen January 2019 (has links)
The long-term care facility network in Ontario, and in Canada as a whole, encounters critical issues in balancing demand with capacity. Even worse, it faces rising demand in the coming years. Moreover, there is an urgent need to provide long-term care for patients in their own language (particularly French). This study proposes a dynamic Mixed-Integer Linear Programming model, based on the current standing of the long-term care system in Ontario, that simultaneously optimizes the time and location of constructing new long-term care facilities, the dynamic adjustment of each facility's capacity (namely, human resources and beds), and the assignment of patients to facilities based on their demand region, gender, language, and age group over a finite time horizon. We apply diversity-support constraints, based on patients' gender and language, to protect patients from loneliness and to comply with the Canadian values of providing care. Finally, we validate the model by performing a case study in Hamilton, Ontario. An extensive set of numerical analyses is explored to provide deeper insights into the whole issue. One such analysis is an extensive simulation study examining the effect of distributional uncertainty in some of the input parameters on the optimal results, hence providing a much more realistic understanding of the optimization model. / Thesis / Master of Science (MSc)
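To make the model's build-and-assign structure concrete, here is a toy sketch of such a MILP in PuLP; the sets, costs, and capacities are invented, and the thesis's model additionally handles time periods, gender, language, and age groups:

```python
# A toy sketch of the build-and-assign core of such a MILP, in PuLP. Sets,
# costs, and capacities are invented; the thesis model also handles time
# periods, gender, language, and age groups.
import pulp

sites, regions = ["s1", "s2"], ["r1", "r2", "r3"]
demand = {"r1": 80, "r2": 50, "r3": 60}
capacity, build_cost, assign_cost = 120, 1000, 1

prob = pulp.LpProblem("ltc_planning", pulp.LpMinimize)
build = pulp.LpVariable.dicts("build", sites, cat="Binary")
assign = pulp.LpVariable.dicts(
    "assign", [(s, r) for s in sites for r in regions], lowBound=0)

prob += (pulp.lpSum(build_cost * build[s] for s in sites)
         + pulp.lpSum(assign_cost * assign[s, r] for s in sites for r in regions))
for r in regions:                     # every region's demand is met
    prob += pulp.lpSum(assign[s, r] for s in sites) == demand[r]
for s in sites:                       # assign only to built sites, within capacity
    prob += pulp.lpSum(assign[s, r] for r in regions) <= capacity * build[s]

prob.solve()
print({s: build[s].value() for s in sites})
```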
