71

A Spreadsheet Model for Using Web Services and Creating Data-Driven Applications

Chang, Kerry Shih-Ping 01 April 2016 (has links)
Web services have made many kinds of data and computing services available. However, to use web services often requires significant programming efforts and thus limits the people who can take advantage of them to only a small group of skilled programmers. In this dissertation, I will present a tool called Gneiss that extends the spreadsheet model to support four challenging aspects of using web services: programming two-way data communications with web services, creating interactive GUI applications that use web data sources, using hierarchical data, and using live streaming data. Gneiss contributes innovations in spreadsheet languages, spreadsheet user interfaces and interaction techniques to allow programming tasks that currently require writing complex, lengthy code to instead be done using familiar spreadsheet mechanisms. Spreadsheets are arguably the most successful and popular data tools among people of all programming levels. This work advances the use of spreadsheets to new domains and could benefit a wide range of users from professional programmers to end-user programmers.
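Gneiss itself is a spreadsheet system, so textual code can only approximate it. As a rough sketch of the core idea — binding a web service's JSON response to a two-dimensional grid of cells — the following Python snippet flattens a service response into a header row plus data rows; the response is inlined here and the endpoint and field names are hypothetical, not part of Gneiss.

```python
import json

# Inlined stand-in for a web service's JSON response; a real call would
# fetch the document over HTTP (e.g. with urllib.request). The fields and
# values below are hypothetical, for illustration only.
RESPONSE = """
{"items": [
  {"title": "Spreadsheet Tools",    "author": "Doe, J.", "year": 2014},
  {"title": "End-User Programming", "author": "Roe, A.", "year": 2015}
]}
"""

def to_grid(records, columns):
    """Flatten a list of JSON objects into a header row plus data rows,
    the way a spreadsheet binds each field to a column of cells."""
    rows = [[rec.get(col, "") for col in columns] for rec in records]
    return [columns] + rows

data = json.loads(RESPONSE)
for row in to_grid(data["items"], ["title", "author", "year"]):
    print("\t".join(str(c) for c in row))
```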
72

Data-driven approaches to load modeling and monitoring in smart energy systems

Tang, Guoming 23 January 2017 (has links)
In smart energy systems, a load curve is the time series reported by a smart meter, indicating a customer's energy consumption over a period of time. The widespread use of load curve data in demand-side management and demand response programs makes it one of the most important resources. To capture load behavior and energy consumption patterns, load curve modeling is widely applied to help utilities and residents make better plans and decisions. In this dissertation, building on load curve modeling, we develop data-driven solutions to three load monitoring problems in different scenarios of smart energy systems, covering residential and datacenter power systems and the research fields of i) data cleansing, ii) energy disaggregation, and iii) fine-grained power monitoring.

First, to improve the data quality for load curve modeling on the supply side, we question the efficiency of regression-based approaches to load curve cleansing and propose a new approach to analyzing and organizing load curve data. Our approach adopts a new view, termed a portrait, on the load curve data by analyzing its inherent periodic patterns and re-organizing the data for ease of analysis. We further introduce strategies for building virtual portrait datasets and demonstrate how this technique can be used for outlier detection in load curves. To identify corrupted load curve data, we propose an appliance-driven approach that takes particular advantage of information available on the demand side. It identifies corrupted data in the smart meter readings by solving a carefully designed optimization problem. To solve the problem efficiently, we develop a sequential local optimization algorithm that tackles the original NP-hard problem by solving an approximate problem in polynomial time.

Second, to separate the aggregated energy consumption of a residential house into that of individual appliances, we propose a practical and universal energy disaggregation solution that refers only to readily available information about the appliances. Based on the sparsity of appliances' switching events, we first build a sparse switching event recovering (SSER) model. Then, making use of the active epochs of switching events, we develop an efficient parallel local optimization algorithm to solve the model and obtain individual appliances' energy consumption. To explore the benefit of introducing low-cost energy meters for energy disaggregation, we propose a semi-intrusive appliance load monitoring (SIALM) approach for settings with many appliances. Instead of using only one meter, multiple meters are distributed in the power network to collect aggregated load data from sub-groups of appliances. The proposed SSER model and parallel optimization algorithm are used for energy disaggregation within each sub-group. We further provide sufficient conditions for unambiguous state recovery of multiple appliances, under which a minimum number of meters is obtained via a greedy clique-covering algorithm.

Third, to achieve fine-grained, server-level power monitoring in legacy datacenters, we present a zero-cost, purely software-based solution that eliminates the need for power monitoring hardware, reducing operating cost and hardware complexity. In detail, we establish power mapping functions (PMFs) between the states of servers and their power consumption, and infer the power consumption of each server from the aggregated power of the entire datacenter. We implement and evaluate our solution on a real-world datacenter with 326 servers. The results show that our solution provides high-precision power estimation at both the rack level and the server level. Specifically, with PMFs including only two nonlinear terms, our power estimation has a mean relative error of 2.18% at the rack level, and mean relative errors of 9.61% and 7.53% at the server level for idle and peak power, respectively.
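As a rough illustration of the PMF idea, the Python sketch below regresses a simulated aggregate power reading on per-server state features and then attributes power back to individual servers. The choice of CPU utilization as the state signal and of u and u² as the two nonlinear terms is an assumption; the thesis's actual features, model form, and fitting procedure are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_servers, n_samples = 4, 200

# Simulated per-server CPU utilization in [0, 1] (stand-in for real telemetry).
U = rng.uniform(0.0, 1.0, size=(n_samples, n_servers))

def features(U):
    # One shared idle-power intercept, plus two nonlinear terms (u, u**2)
    # per server -- this specific "two nonlinear terms" form is an assumption.
    cols = [np.ones(U.shape[0])]
    for j in range(U.shape[1]):
        u = U[:, j]
        cols += [u, u ** 2]
    return np.column_stack(cols)

X = features(U)

# Ground-truth coefficients, used only to simulate the single aggregate meter.
true_coef = rng.uniform(20.0, 120.0, size=X.shape[1])
P_total = X @ true_coef + rng.normal(0.0, 1.0, n_samples)

# Fit all PMFs jointly from the aggregate reading alone.
coef, *_ = np.linalg.lstsq(X, P_total, rcond=None)

# Attribute dynamic (above-idle) power to each server for a new observation.
u_new = rng.uniform(0.0, 1.0, size=n_servers)
dyn = [u_new[j] * coef[1 + 2 * j] + u_new[j] ** 2 * coef[2 + 2 * j]
       for j in range(n_servers)]
print("estimated per-server dynamic power:", np.round(dyn, 1))
```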
73

A Graphical Analysis of Simultaneously Choosing the Bandwidth and Mixing Parameter for Semiparametric Regression Techniques

Rivers, Derick L. 31 July 2009 (has links)
There has been extensive research in the area of semiparametric regression. These techniques deliver substantial improvements over previously developed methods such as ordinary least squares and kernel regression. Two of these hybrid techniques, Model Robust Regression 1 (MRR1) and Model Robust Regression 2 (MRR2), require the choice of an appropriate bandwidth for smoothing and a mixing parameter that allows a portion of a nonparametric fit to be used in fitting a model that may be misspecified by other regression methods. The current method of choosing the bandwidth and mixing parameter does not guarantee the optimal choice of either. The immediate objective of the current work is to address this process of simultaneously choosing the optimal bandwidth and mixing parameter and to examine the behavior of these estimates using 3D plots. The 3D plots allow us to examine how the semiparametric techniques MRR1 and MRR2 behave under the optimal (AVEMSE) selection process when compared to data-driven selectors such as PRESS* and PRESS**. It was found that MRR2 behaved consistently under all conditions. MRR2 displayed a wider range of "acceptable" values for the choice of bandwidth, as opposed to a much more limited choice when using MRR1. These results provide general support for earlier findings by Mays et al. (2000).
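Concretely, MRR2 adds a fraction λ of a nonparametric smooth of the parametric residuals back onto the OLS fit, so the bandwidth and mixing parameter must be chosen jointly. The sketch below illustrates that joint grid search with a leave-one-out, PRESS-style criterion on simulated data; the simulated data, the Gaussian Nadaraya-Watson smoother (standing in for the nonparametric component), and the simplified criterion (the thesis's AVEMSE, PRESS*, and PRESS** differ in detail) are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 60))
# A straight-line model is deliberately misspecified for this curved signal.
y = 2 + 3 * x + 0.5 * np.sin(4 * np.pi * x) + rng.normal(0.0, 0.2, x.size)

# Parametric stage: ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_ols = X @ beta
resid = y - y_ols

def loo_smooth(x, r, h):
    # Leave-one-out Nadaraya-Watson smooth of the residuals: zeroing the
    # diagonal keeps each point's own residual out of its prediction.
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(W, 0.0)
    return (W @ r) / W.sum(axis=1)

# Joint grid search over bandwidth h and mixing parameter lambda.
best = None
for h in np.linspace(0.02, 0.3, 15):
    smooth = loo_smooth(x, resid, h)
    for lam in np.linspace(0.0, 1.0, 21):
        press = np.sum((y - (y_ols + lam * smooth)) ** 2)  # MRR2-style fit
        if best is None or press < best[0]:
            best = (press, h, lam)

print(f"criterion={best[0]:.3f}, bandwidth={best[1]:.3f}, lambda={best[2]:.2f}")
```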
74

Data Driven Visual Recognition

Aghazadeh, Omid January 2014 (has links)
This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one that models categories and one that is category-free. We are interested in data-driven solutions to both kinds of problems. In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data-driven models, we demonstrate that, based on a few reference exemplars, our methods are able to detect novelties in the ego-motions of people and changes in the static environments surrounding them. In the category-level part, we study object recognition. We consider both object category classification and localization, and propose scalable data-driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines, such as the same model initialized with less subtly designed procedures. A nonparametric large-margin classifier is introduced and demonstrated to have a multitude of advantages over its competitors: lower training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity are the main features of the proposed model. We also propose a rather realistic model of recognition problems, which quantifies the interplay between representations, classifiers, and recognition performance. Based on data-describing measures, which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and to correlate significantly with the observed recognition performance. Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the classifiers' performance on the test sets, are reasonably accurate. We discuss various aspects of visual recognition problems: what is the interplay between representations and classification tasks, how can different models better adapt to the training data, etc. We describe and analyze the aforementioned methods, which are designed to tackle different visual recognition problems but share one common characteristic: being data-driven.
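As a hedged illustration of what an aggregate-of-pairwise-similarities measure can look like, the sketch below scores a synthetic two-class problem by the ratio of mean within-class to mean between-class similarity. The RBF similarity and this particular aggregate are stand-ins, not the thesis's actual measures.

```python
import numpy as np

rng = np.random.default_rng(2)
# Two synthetic 16-dimensional classes, 50 exemplars each.
X = np.vstack([rng.normal(0.0, 1.0, (50, 16)), rng.normal(1.5, 1.0, (50, 16))])
y = np.array([0] * 50 + [1] * 50)

# Pairwise RBF similarities, scaled by the mean squared distance.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
S = np.exp(-sq / sq.mean())

eye = np.eye(len(y), dtype=bool)
same = (y[:, None] == y[None, :]) & ~eye   # within-class pairs, no self-pairs
diff = y[:, None] != y[None, :]            # between-class pairs

# Higher ratio -> more compact, better-separated classes -> easier problem.
print("intra/inter similarity ratio:", round(S[same].mean() / S[diff].mean(), 3))
```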
75

A robust and reliable data-driven prognostics approach based on Extreme Learning Machine and Fuzzy Clustering

Javed, Kamran 09 April 2014 (has links)
Prognostics and Health Management (PHM) aims at extending the life cycle of engineering assets while reducing exploitation and maintenance costs. For this reason, prognostics is considered a key process with predictive capabilities. Indeed, accurate estimates of the Remaining Useful Life (RUL) of equipment enable defining a further plan of action to increase safety, minimize downtime, and ensure mission completion and efficient production. Recent advances show that data-driven approaches (mainly based on machine learning methods) are increasingly applied for fault prognostics. They can be seen as black-box models that learn system behavior directly from Condition Monitoring (CM) data, use that knowledge to infer the current state of the system, and predict the future progression of failure. However, approximating the behavior of critical machinery is a challenging task that can result in poor prognostics. Some key issues of data-driven prognostics modeling are as follows. 1) How to effectively process raw monitoring data to obtain suitable features that clearly reflect the evolution of degradation? 2) How to discriminate degradation states and define failure criteria (which can vary from case to case)? 3) How to ensure that learned models are robust enough to show steady performance on uncertain inputs that deviate from the learned experiences, and reliable enough to handle unknown data (i.e., operating conditions, engineering variations, etc.)? 4) How to achieve ease of application under industrial constraints and requirements? These issues constitute the problems addressed in this thesis and have led to the development of a novel approach that goes beyond the limits of conventional data-driven prognostics methods.
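The approach developed in the thesis combines an Extreme Learning Machine with fuzzy clustering; as a minimal illustration of the ELM half alone, the sketch below trains a random-hidden-layer network for RUL-style regression. The synthetic degradation data, the network size, and the single-output target are assumptions for illustration, not the thesis's model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic run-to-failure data: a noisy monotone degradation indicator and
# a second correlated feature; RUL decays linearly with time.
t = np.linspace(0.0, 1.0, 300)
X = np.column_stack([t ** 2 + 0.05 * rng.normal(size=t.size),
                     0.1 * np.sin(6 * t) + t])
rul = 1.0 - t  # normalized Remaining Useful Life target

# ELM: input weights and biases are drawn at random and never trained.
n_hidden = 40
W = rng.normal(size=(X.shape[1], n_hidden))
b = rng.normal(size=n_hidden)

def hidden(X):
    return np.tanh(X @ W + b)  # random nonlinear feature map

# Only the output weights are learned, in closed form via the pseudo-inverse.
H = hidden(X)
beta = np.linalg.pinv(H) @ rul

pred = hidden(X) @ beta
print("training RMSE:", round(float(np.sqrt(np.mean((pred - rul) ** 2))), 4))
```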
76

Revenue Generation in Data-driven Healthcare : An exploratory study of how big data solutions can be integrated into the Swedish healthcare system

Jonsson, Hanna, Mazomba, Luyolo January 2019 (has links)
The purpose of this study is to investigate how big data solutions in the Swedish healthcare system can generate revenue. As technology continues to evolve, the use of big data is beginning to transform processes in many different industries, making them more efficient and effective. The opportunities presented by big data have been researched to a large extent in commercial fields; however, research on the use of big data in healthcare is scarce, particularly in the case of Sweden. Furthermore, there is a lack of research exploring the interface between big data, healthcare, and revenue models. This interface is important because innovation and the integration of big data in healthcare could be affected by the ability of companies to generate revenue from developing such solutions. This thesis aims to fill this gap and contribute to the limited body of knowledge on the topic. The study was conducted using qualitative methods: a literature search was done, and interviews were conducted with individuals who hold managerial positions at Region Västerbotten. The purpose of the interviews was to establish a better understanding of the Swedish healthcare system and how its structure has influenced the use, or lack thereof, of big data in the healthcare delivery process, as well as how this structure enables revenue generation through big data solutions. The data collected was analysed using a grounded theory approach, in which the empirical data was coded and thematised to identify the key areas of discussion. The findings revealed that the current state of the Swedish healthcare system does not present an environment in which big data solutions developed for the system can thrive and generate revenue. However, if action is taken to change the current state of the system, revenue generation may be possible in the future. The findings also identified key barriers that need to be overcome to increase the integration of big data into the healthcare system: (i) a lack of big data knowledge and expertise, (ii) data protection regulations, (iii) national budget allocation, and (iv) a lack of structured data. Through collaborative work between actors in the public and private sectors, these barriers can be overcome, and Sweden could be on its way to transforming its healthcare system with big data solutions, improving the quality of care provided to its citizens. Key words: big data, healthcare, Swedish healthcare system, AI, revenue models, data-driven revenue models
77

Practice-driven solutions for inventory management problems in data-scarce environments

Wang, Le 03 June 2019 (has links)
Many firms are challenged to make inventory decisions with limited data and high customer service level requirements. This thesis focuses on heuristic solutions for inventory management problems in data-scarce environments, employing rigorous mathematical frameworks and taking advantage of information that is available in practice but often ignored in the literature. We define a class of inventory models and solutions with demonstrable value in helping firms meet these challenges.
78

Self-Service Business Intelligence : A study of the basic knowledge end users should possess when using SSBI

Johansson, Linus January 2019 (has links)
As today's business climate is constantly evolving amid increasing competition, organizations need to make data-based decisions at an early stage. Business Intelligence (BI) provides decision-makers within organizations with fast and accurate information that can be used as decision support. As the scope of BI has expanded from individual departments to entire organizations, it puts great pressure on experts in IT departments. As a result, end users need an environment that gives them direct access to data for their own analyses and decisions. That environment is achieved by implementing Self-Service Business Intelligence (SSBI), which streamlines the decision-making process. When SSBI is implemented, the end users affected by it need to expand their knowledge in order to exploit the potential that SSBI brings. There is currently a lack of research on what knowledge end users need to possess, which has led to the following research question being examined in this study: Which basic knowledge should an end user possess when using Self-Service Business Intelligence? The study is based on a literature review and a case study in which interviews with six respondents, all with good knowledge of SSBI, were used for data collection. The results present four basic competencies that end users should possess to increase their chances of adopting SSBI successfully.
79

The Influence of Participation in Structured Data Analysis on Teachers' Instructional Practice

Napier, Percy January 2011 (has links)
Thesis advisor: Diana Pullin / The current high-stakes testing environment has resulted in intense pressure on schools to become more data-driven. As a result, an increasing number of schools are implementing systems in which teachers and school leaders collaboratively analyze assessment data and use the results to inform instructional practice. This study examined how teacher participation in the analysis of assessment data influences instructional outcomes. It also examined how levels of capacity in the areas of data use, professional learning, and leadership interact to influence the ability to respond to data. The method is a qualitative case study of an elementary school in the southeastern United States that has implemented formal structures for analyzing and collaborating around assessment data. Data were collected through teacher and administrator interviews, observations of data analysis meetings, and examination of school and district documents. The school in this study responded to data analysis results through three major actions: large-scale initiatives designed to improve instruction in various content areas, remediation, and individual teacher variations in instructional practices. Findings show that while teachers express support for data analysis and suggest positive benefits for the school, they also indicate that participation in data analysis and the resultant improvement efforts have had minimal to modest impact on their teaching practices. Possibly contributing to this outcome was the finding that the school had uneven capacity in the areas of data use, professional learning, and leadership. The school has a well-developed system for data access and reporting. However, it has been less successful in providing the professional learning experiences that would enable more substantial changes in teacher beliefs and practices. Furthermore, a lack of clarity regarding the instructional purpose of data analysis at multiple levels of district and school leadership, together with the procedural nature of the data analysis process, has reduced the ability of school leaders to effectively leverage data analysis for substantive and sustained instructional improvement. / Thesis (PhD) — Boston College, 2011. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Leadership and Higher Education.
80

The Effect of a Data-Based Instructional Program on Teacher Practices: The Roles of Instructional Leadership, School Culture, and Teacher Characteristics

Morton, Beth A. January 2016 (has links)
Thesis advisor: Henry I. Braun / Data-based instructional programs, including interim assessments, are a common tool for improving teaching and learning. However, few studies have rigorously examined whether they achieve those ends or what contributes to their effectiveness. This study conducts a secondary analysis of data from a matched-pair school-randomized evaluation of the Achievement Network (ANet). Year-two teacher surveys (n=616) and interviews with a subset of ANet school leaders and teachers (n=40) are used to examine the impact of ANet on teachers' data-based instructional practices and the mediating roles of instructional leadership, professional and achievement cultures, and teacher attitudes and confidence. Survey results showed an impact of ANet on the frequency with which teachers reviewed and used data, but not on their instructional planning or differentiation. Consistent with the program model, ANet had a modest impact on school-mean teacher ratings of their leaders' instructional leadership abilities and school culture, but no impact on individual teachers' attitudes toward assessment or confidence with data-based instructional practices. It was therefore not surprising that these school and teacher characteristics only partially accounted for ANet's impact on teachers' data practices. Interview findings were consistent. Teachers described numerous opportunities to review students' ANet assessment results and gave examples of how they used these data (e.g., to pinpoint skills on which their students struggled). However, there were fewer examples of strategies such as differentiated instruction. Interview findings also suggested some ways leadership, culture, and teacher characteristics influenced ANet teachers' practices. Leaders' roles seemed as much about holding teachers accountable for implementation as about offering instructional support, and while teachers had opportunities to collaborate, a few schools' implementation efforts were likely hampered by poor collegial trust. Teacher confidence and attitudes varied but improved over the two years, the latter following from a perceived connection between ANet practices and better student performance. However, some teachers were concerned that the assessments were too difficult for their students or poorly aligned with the curriculum, resulting in data that were not always instructionally useful. / Thesis (PhD) — Boston College, 2016. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Research, Measurement and Evaluation.
