51 |
A Multi-Site Case Study: Acculturating Middle Schools to Use Data-Driven Instruction for Improved Student Achievement. James, Rebecca C., 05 January 2011.
In the modern era of high-stakes accountability, test data have become much more than a simple comparison (Schmoker, 2006; Payne & Miller, 2009). The information provided in modern data reports has become an invaluable tool for driving classroom instruction. However, educators often lack adequate training in evaluating data and translating findings into sound practices that improve student learning (Blair, 2006; Dynarski, 2008; Light, Wexler, & Heinze, 2005; Payne & Miller, 2009). Some schools are good at collecting data but often fall short in deciding what to do next. It is the role of the principal to serve as an instructional leader and guide teachers in answering the recurring question of "now what?"
The purpose of this study was to investigate, through a qualitative multi-site case study, the ways in which principals build successful data-driven instructional systems within their schools. The research used a triangulation approach combining structured interviews, on-site visits, and document reviews involving middle school supervisors, principals, and teachers.
The findings are presented as four common themes and patterns identified as essential components administrators used to implement data-driven instructional systems that improve student achievement. The themes are: 1) administrators must clearly define the vision and set the expectation of using data to improve student achievement, 2) administrators must take an active role in the data-driven process, 3) data must be easily accessible to stakeholders, and 4) stakeholders must devote time on a regular basis to the data-driven process. The four themes led to ten common steps administrators can use to acculturate their school or school division to the data-driven instruction process. / Ed. D.
|
52 |
Data-Driven Diagnosis for Fuel Injectors of Diesel Engines in Heavy-Duty Trucks. Eriksson, Felix; Björkkvist, Emely, January 2024.
The diesel engine in heavy-duty trucks is a complex system with many components working together, and a malfunction in any of these components can impact engine performance and result in increased emissions. Fault detection and diagnosis have therefore become essential in modern vehicles, ensuring optimal performance and compliance with progressively stricter legal requirements. One of the most common faults in a diesel engine is faulty injectors, which can lead to fluctuations in the amount of fuel injected. Detecting these issues is crucial, prompting a growing interest in exploring additional signals beyond the currently used one to enhance the performance and robustness of diagnosing this fault. In this work, an investigation was conducted to identify signals that correlate with faulty injectors causing over- and underfueling. It was found that the NOx, O2, and exhaust pressure signals are sensitive to this fault and could potentially serve as additional diagnostic signals. With these signals, two diagnostic methods were evaluated to assess their effectiveness in detecting injector faults: data-driven residuals and a Random Forest classifier. The data-driven residuals, when combined with the CUSUM algorithm, demonstrated promising results in detecting faulty injectors. The O2 signal proved effective in identifying both fault cases, while NOx and exhaust pressure were more effective at detecting overfueling. The Random Forest classifier also showed good performance in detecting both over- and underfueling. However, it was observed that using a classifier requires more extensive data preprocessing. Two preprocessing methods were employed: integrating previous measurements and calculating statistical measures over a defined time span. Both showed promising results, with the latter proving to be the better choice. Additionally, the generalization capabilities of these methods across different operating conditions were evaluated. The data-driven residuals yielded better results than the classifier, which required training on new cases to perform effectively.
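As an illustration of the residual-plus-CUSUM idea summarized above (not the authors' implementation), the sketch below feeds a residual between a measured signal and a nominal-model prediction to a two-sided CUSUM detector; the O2-based residual, drift, and threshold values are hypothetical.

```python
import numpy as np

def cusum_alarm(residual, drift=0.5, threshold=5.0):
    """Two-sided CUSUM over a residual sequence.

    residual  : measurement minus nominal-model prediction (about zero when healthy)
    drift     : slack term that ignores small, normal deviations
    threshold : alarm level for the accumulated sums
    Returns the index of the first alarm, or None if no alarm is raised.
    """
    g_pos, g_neg = 0.0, 0.0
    for k, r in enumerate(residual):
        g_pos = max(0.0, g_pos + r - drift)   # accumulates positive drift (e.g. overfueling)
        g_neg = max(0.0, g_neg - r - drift)   # accumulates negative drift (e.g. underfueling)
        if g_pos > threshold or g_neg > threshold:
            return k
    return None

# Hypothetical residual between measured O2 and a nominal-engine estimate:
# healthy noise for 300 samples, then a persistent offset caused by a faulty injector
rng = np.random.default_rng(0)
o2_residual = rng.normal(0.0, 0.2, 500)
o2_residual[300:] += 1.0
print("fault flagged at sample:", cusum_alarm(o2_residual))
```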
|
53 |
Data driven modelling for environmental water management. Syed, Mofazzal, January 2007.
Management of water quality is generally based on physically-based equations or hypotheses describing the behaviour of water bodies. In recent years, models built on larger amounts of collected data have gained popularity; this modelling approach can be called data driven modelling. Observational data represent specific knowledge, whereas a hypothesis represents a generalization of this knowledge that implies and characterizes all such observational data. Traditionally, deterministic numerical models have been used for predicting flow and water quality processes in inland and coastal basins. These models generally take a long time to run and cannot be used as on-line decision support tools that would enable imminent threats, such as public health risks and flooding, to be predicted in time. Data driven models, in contrast, are data intensive and have some limitations: their extrapolation capability is a matter of conjecture, and collecting the extensive data required to build them can be time- and resource-consuming, or, in the case of predicting the impact of a future development, the data are unlikely to exist. The main objective of the study was to develop an integrated approach for rapid prediction of bathing water quality in estuarine and coastal waters. Faecal Coliforms (FC) were used as a water quality indicator, and two of the most popular data mining techniques, namely Genetic Programming (GP) and Artificial Neural Networks (ANNs), were used to predict the FC levels in a pilot basin. In order to provide enough data for training and testing the neural networks, a calibrated hydrodynamic and water quality model was used to generate input data for the neural networks. A novel non-linear data analysis technique, called the Gamma Test, was used to determine the data noise level and the number of data points required for developing smooth neural network models. Details are given of the data driven models, the numerical models and the Gamma Test. Details are also given of a series of experiments undertaken to test data driven model performance for different numbers of input parameters and time lags. The response time of the receiving water quality to the input boundary conditions obtained from the hydrodynamic model proved to be useful knowledge for developing accurate and efficient neural networks. It is known that a natural phenomenon such as bacterial decay is affected by a whole host of parameters that cannot be captured accurately using deterministic models alone. Therefore, the data-driven approach was investigated using field survey data collected in Cardiff Bay to examine the relationship between bacterial decay and other parameters. Both the GP and ANN models gave similar, if not better, predictions of the field data in comparison with the deterministic model, with the added benefit of almost instant prediction of the bacterial levels for this recreational water body. The models were also investigated using idealised and controlled laboratory data for the velocity distributions along compound channel reaches, with idealised rods located on the floodplain to replicate large vegetation (such as mangrove trees).
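A minimal sketch of the kind of lagged-input, data-driven model described above, using a small feed-forward network in place of the thesis's GP/ANN setup; the driver variables, lags, and synthetic data are hypothetical stand-ins for the hydrodynamic-model outputs used in the study.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged_features(series, lags):
    """Stack time-lagged copies of a driver series as model inputs."""
    return np.column_stack([series[max(lags) - l: len(series) - l] for l in lags])

# Hypothetical drivers (e.g. discharge and solar radiation) and a synthetic FC target
rng = np.random.default_rng(0)
n = 1000
discharge = rng.random(n)
radiation = rng.random(n)
fc = 200 * discharge + 50 * (1 - radiation) + rng.normal(0, 5, n)

lags = [1, 2, 3]                       # lags chosen to reflect the basin's response time
X = np.hstack([make_lagged_features(discharge, lags),
               make_lagged_features(radiation, lags)])
y = fc[max(lags):]                     # align target with the lagged inputs

model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X[:800], y[:800])
print("held-out R^2:", model.score(X[800:], y[800:]))
```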
|
54 |
DDD metodologija paremto projektavimo įrankio kodo generatoriaus kūrimas ir tyrimas / DDD methodology based design tool's code generator development and research. Valinčius, Kęstutis, 13 August 2010.
The Data Driven Design methodology is widely used in various software systems. Its aim is to separate and parallelize the work of software developers and scenario designers: core functionality is implemented via interfaces, and dynamics via scenario support. This introduces a level of abstraction that makes the software product more flexible and easier to maintain and improve; in addition, these activities can be performed in parallel. The main aim of this work was to create an automatic code generator that transforms a graphically modelled scenario into software code. Generating code automatically greatly reduces the probability of syntactic and logical errors, since the result depends only on the modelled scenario; the code is generated almost instantly and requires no developer intervention. This aim was achieved by moving business-logic design into the scenario-design process and implementing the code generation subsystem as a web service. Using a cartridge (plug-in) based system, code is generated without being tied to a specific architecture, technology, or application domain. A scenario is modelled in the graphical scenario modelling tool and transformed into a metalanguage, from which the final software code is generated; the metalanguage is an XML-based language defined by specific rules. The experimental system was implemented without major problems, and modelling a new system with the design tool sped up the development process roughly sevenfold, demonstrating the modelling tool's advantage over manual programming.
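A minimal sketch of the kind of metalanguage-to-code transformation described above: a scenario serialized as XML is walked and rendered into source text through a simple template. The element names and generated target are invented for illustration and are not the thesis's actual metalanguage or cartridge system.

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario serialized in an XML-based metalanguage
SCENARIO_XML = """
<scenario name="OrderFlow">
  <step action="validate" target="order"/>
  <step action="persist"  target="order"/>
  <step action="notify"   target="customer"/>
</scenario>
"""

STEP_TEMPLATE = "    {action}({target})  # generated from scenario step"

def generate_code(xml_text):
    """Render a scenario description into a Python function body."""
    root = ET.fromstring(xml_text)
    lines = [f"def run_{root.get('name').lower()}():"]
    for step in root.findall("step"):
        lines.append(STEP_TEMPLATE.format(action=step.get("action"),
                                          target=repr(step.get("target"))))
    return "\n".join(lines)

print(generate_code(SCENARIO_XML))
```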
|
55 |
A Spreadsheet Model for Using Web Services and Creating Data-Driven Applications. Chang, Kerry Shih-Ping, 01 April 2016.
Web services have made many kinds of data and computing services available. However, using web services often requires significant programming effort, which limits the people who can take advantage of them to a small group of skilled programmers. In this dissertation, I present a tool called Gneiss that extends the spreadsheet model to support four challenging aspects of using web services: programming two-way data communication with web services, creating interactive GUI applications that use web data sources, using hierarchical data, and using live streaming data. Gneiss contributes innovations in spreadsheet languages, spreadsheet user interfaces, and interaction techniques that allow programming tasks which currently require writing complex, lengthy code to be done instead with familiar spreadsheet mechanisms. Spreadsheets are arguably the most successful and popular data tools among people of all programming levels. This work advances the use of spreadsheets to new domains and could benefit a wide range of users, from professional programmers to end-user programmers.
|
56 |
Data-driven approaches to load modeling and monitoring in smart energy systems. Tang, Guoming, 23 January 2017.
In smart energy systems, a load curve refers to the time series reported by a smart meter, indicating a customer's energy consumption over a certain period of time. The widespread use of load curve data in demand-side management and demand response programs makes it one of the most important resources. To capture load behavior and energy consumption patterns, load curve modeling is widely applied to help utilities and residents make better plans and decisions. In this dissertation, with the help of load curve modeling, we focus on data-driven solutions to three load monitoring problems in different scenarios of smart energy systems, including residential and datacenter power systems, covering the research fields of i) data cleansing, ii) energy disaggregation, and iii) fine-grained power monitoring.
First, to improve the data quality for load curve modeling on the supply side, we challenge regression-based approaches as an efficient way to cleanse load curve data and propose a new approach to analyzing and organizing load curve data. Our approach adopts a new view of the load curve data, termed a portrait, by analyzing the inherent periodic patterns and re-organizing the data for ease of analysis. Furthermore, we introduce strategies to build virtual portrait datasets and demonstrate how this technique can be used for outlier detection in load curves. To identify corrupted load curve data, we propose an appliance-driven approach that takes particular advantage of information available on the demand side. It identifies corrupted data in the smart meter readings by solving a carefully designed optimization problem. To solve the problem efficiently, we further develop a sequential local optimization algorithm that tackles the original NP-hard problem by solving an approximate problem in polynomial time.
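A minimal sketch, under simplifying assumptions, of exploiting a load curve's periodic structure for outlier detection in the spirit of the portrait view described above; this is a generic robust-statistics stand-in, not the dissertation's method. Readings are grouped by position in the daily cycle and flagged when they deviate strongly from the typical value at that position.

```python
import numpy as np

def flag_periodic_outliers(load, period=24, k=4.0):
    """Flag readings that deviate strongly from the typical value observed
    at the same position in the daily cycle.

    load   : 1-D array of meter readings, length a multiple of `period`
    period : samples per cycle (24 for hourly data with a daily pattern)
    k      : robust z-score threshold
    """
    days = load.reshape(-1, period)                   # rows = days, columns = hour-of-day
    med = np.median(days, axis=0)                     # typical profile per hour
    mad = np.median(np.abs(days - med), axis=0) + 1e-9
    robust_z = np.abs(days - med) / (1.4826 * mad)
    return (robust_z > k).reshape(-1)

# Hypothetical usage: 30 days of hourly readings with two corrupted samples
load = np.tile(np.sin(np.linspace(0, 2 * np.pi, 24)) + 2.0, 30)
load[100] = 15.0
load[500] = -3.0
print(np.where(flag_periodic_outliers(load))[0])      # expected: [100 500]
```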
Second, to separate the aggregated energy consumption of a residential house into that of individual appliances, we propose a practical and universal energy disaggregation solution that refers only to readily available information about the appliances. Based on the sparsity of appliances' switching events, we first build a sparse switching event recovering (SSER) model. Then, by making use of the active epochs of switching events, we develop an efficient parallel local optimization algorithm to solve our model and obtain individual appliances' energy consumption. To explore the benefit of introducing low-cost energy meters for energy disaggregation, we propose a semi-intrusive appliance load monitoring (SIALM) approach for settings with large numbers of appliances. Instead of using only one meter, multiple meters are distributed in the power network to collect aggregated load data from sub-groups of appliances. The proposed SSER model and parallel optimization algorithm are used for energy disaggregation within each sub-group of appliances. We further provide sufficient conditions for unambiguous state recovery of multiple appliances, under which a minimum number of meters is obtained via a greedy clique-covering algorithm.
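For intuition only, a naive event-matching baseline for the disaggregation problem described above; this is not the SSER model, and it assumes appliance power ratings are known and switching events are well separated.

```python
import numpy as np

# Hypothetical appliance ratings (W), assumed known from nameplates
RATINGS = {"fridge": 120, "kettle": 1800, "washer": 500}

def match_events(aggregate, min_step=80):
    """Attribute large steps in the aggregate power signal to the appliance
    whose rated power is closest to the step size (on/off events)."""
    events = []
    for t, step in enumerate(np.diff(aggregate)):
        if abs(step) < min_step:
            continue                                  # ignore noise and small drifts
        name = min(RATINGS, key=lambda a: abs(RATINGS[a] - abs(step)))
        events.append((t + 1, name, "on" if step > 0 else "off"))
    return events

# Aggregate signal: fridge switches on, then the kettle on, then the kettle off
agg = np.concatenate([np.full(10, 50), np.full(10, 170),
                      np.full(10, 1970), np.full(10, 170)])
for t, name, state in match_events(agg):
    print(t, name, state)
```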
Third, to achieve fine-grained, server-level power monitoring in legacy datacenters, we present a zero-cost, purely software-based solution. With our solution, no power monitoring hardware is needed, leading to much reduced operating cost and hardware complexity. In detail, we establish power mapping functions (PMFs) between the states of servers and their power consumption, and infer the power consumption of each server from the aggregated power of the entire datacenter. We implement and evaluate our solution on a real-world datacenter with 326 servers. The results show that our solution provides high-precision power estimation at both the rack level and the server level. Specifically, with PMFs including only two nonlinear terms, our power estimation i) at the rack level has a mean relative error of 2.18%, and ii) at the server level has mean relative errors of 9.61% and 7.53% for the idle and peak power, respectively. / Graduate
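A minimal sketch of fitting per-server power mapping functions from a single aggregate meter. It assumes CPU utilization as the server-state feature, a quadratic PMF with two utilization terms, and known idle powers; these modeling choices are illustrative assumptions, not the dissertation's exact PMFs.

```python
import numpy as np

# Hypothetical setup: per-server CPU utilization is logged in software,
# but only the datacenter-level power draw is metered.
rng = np.random.default_rng(0)
T, n_servers = 2000, 5
util = rng.random((T, n_servers))

idle = rng.uniform(50, 80, n_servers)                  # assumed known (e.g. from spec sheets)
b_true = rng.uniform(30, 60, n_servers)
c_true = rng.uniform(10, 30, n_servers)
per_server = idle + b_true * util + c_true * util ** 2  # ground truth, unknown in practice
aggregate = per_server.sum(axis=1) + rng.normal(0, 2.0, T)

# Fit the dynamic PMF coefficients (two utilization terms per server) jointly
# from the aggregate reading with ordinary least squares.
design = np.hstack([util, util ** 2])                   # columns: u_1..u_n, u_1^2..u_n^2
target = aggregate - idle.sum()
coef, *_ = np.linalg.lstsq(design, target, rcond=None)
b_est, c_est = coef[:n_servers], coef[n_servers:]

est = idle + b_est * util + c_est * util ** 2           # per-server power estimates
rel_err = np.abs(est - per_server) / per_server
print("mean relative error per server:", rel_err.mean(axis=0).round(3))
```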
|
57 |
A Graphical Analysis of Simultaneously Choosing the Bandwidth and Mixing Parameter for Semiparametric Regression Techniques. Rivers, Derick L., 31 July 2009.
There has been extensive research done in the area of semiparametric regression. These techniques deliver substantial improvements over previously developed methods such as Ordinary Least Squares and Kernel Regression. Two of these hybrid techniques, Model Robust Regression 1 (MRR1) and Model Robust Regression 2 (MRR2), require the choice of an appropriate bandwidth for smoothing and a mixing parameter that allows a portion of a nonparametric fit to be used in fitting a model that may be misspecified by other regression methods. The current method of choosing the bandwidth and mixing parameter does not guarantee the optimal choice of either. The immediate objective of the current work is to address this process of choosing the optimal bandwidth and mixing parameter and to examine the behavior of these estimates using 3D plots. The 3D plots allow us to examine how the semiparametric techniques MRR1 and MRR2 behave under the optimal (AVEMSE) selection process when compared to data-driven selectors such as PRESS* and PRESS**. It was found that the structure of MRR2 behaved consistently under all conditions. MRR2 displayed a wider range of "acceptable" values for the choice of bandwidth, as opposed to a much more limited choice when using MRR1. These results provide general support for earlier findings by Mays et al. (2000).
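A minimal sketch of an MRR1-style fit: a parametric OLS line plus a mixing-parameter-weighted kernel smooth of its residuals. The bandwidth and mixing parameter are fixed by hand here; selecting them (e.g., by AVEMSE or PRESS*-type criteria) is exactly the problem the thesis studies. The data and parameter values are hypothetical.

```python
import numpy as np

def kernel_smooth(x, y, x_eval, h):
    """Nadaraya-Watson estimate with a Gaussian kernel and bandwidth h."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def mrr1(x, y, h, lam):
    """MRR1-style fit: parametric OLS line plus `lam` times a kernel smooth of
    the OLS residuals (lam = 0 is pure OLS, lam = 1 adds the full smooth)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ols_fit = X @ beta
    return ols_fit + lam * kernel_smooth(x, y - ols_fit, x, h)

# Data from a slightly misspecified linear model (a curved component the
# parametric part misses)
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 150))
y = 1.5 * x + 2.0 * np.sin(x) + rng.normal(0, 0.4, 150)

fit = mrr1(x, y, h=0.8, lam=0.7)   # bandwidth and mixing parameter fixed here
print("OLS-only MSE:", round(np.mean((y - mrr1(x, y, 0.8, 0.0)) ** 2), 3))
print("MRR1 MSE    :", round(np.mean((y - fit) ** 2), 3))
```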
|
58 |
A Systematic Examination of Data-Driven Decision-Making within a School Division: The Relationships among Principal Beliefs, School Characteristics, and Accreditation Status. Teigen, Beth, 23 November 2009.
This non-experimental census survey included the elementary, middle, and high school principals at the comprehensive schools within a large, suburban school division in Virginia. The focus of this study was the factors that influence building administrators in using data to make instructional decisions. The purpose was to discover whether elementary, middle, and high school principals differ in their perceptions of data use for instructional decisions within their buildings. McLeod's (2006) Statewide Data-Driven Readiness Study: Principal Survey was used to assess the principals' beliefs about the data-driven readiness of their individual schools. Each principal indicated the degree to which they agreed or disagreed with statements about acting upon data, data support systems, and the school's data culture. Twenty-two items aligned with four constructs identified by White (2008) in her study of elementary school principals in Florida. These four constructs, or factors, were used to determine whether there were significant differences in principal beliefs concerning teacher use of data to improve student achievement, principal beliefs regarding a data-driven culture within their building, the existence of systems for supporting data-driven decision-making, and collaboration among teachers to make data-driven decisions. For each of the survey items, a majority of the responses (≥62%) were in agreement with the statements, indicating the principals agreed slightly, moderately, or strongly that data-driven decision-making by teachers to improve student achievement was occurring within the building, that a data-driven culture and data support systems existed, and that teachers were collaborating and using data to make decisions. Multiple analyses of variance showed significant differences in the means, some of which were based on the principals' assignment levels. While both groups responded positively to the statement that teachers collaborate to make data-driven decisions, the elementary principals agreed more strongly than the high school principals. When mediating variables were examined, significant differences were found in principals' beliefs concerning teacher use of data to improve student achievement depending on years of experience as a principal. Principals with six or more years of experience had a mean response for Construct 1 of 4.84, while those with five or fewer years of experience had a mean of 4.38, suggesting that, on average, principals with more experience held a stronger belief that teachers are using data to improve student achievement. There was also a significant difference between the means of principals with three or fewer years versus those with more than three years in their current assignment on two of the constructs: a data-driven culture and collaboration among teachers. Principals with less time in their current position reported slightly higher agreement than their longer-serving colleagues with statements about the data-driven culture within their school. Significant differences were also found in principals' beliefs about teacher collaboration to improve student achievement and in their beliefs regarding collaboration among teachers using data-driven decision-making, depending on the school's AYP status for 2008-2009. Principals assigned to schools that had made AYP for 2008-2009 moderately agreed that teachers were collaborating to make data-driven decisions.
In comparison, principals assigned to schools that had not made AYP only slightly agreed that this level of collaboration was occurring in their schools.
|
59 |
Data Driven Visual Recognition. Aghazadeh, Omid, January 2014.
This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one that models categories and one that is not category based. We are interested in data driven solutions for both kinds of problems. In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data driven models, we demonstrate that, based on a few reference exemplars, our methods are able to detect novelties in the ego-motions of people and changes in the static environments surrounding them. In the category-level part, we study object recognition. We consider both object category classification and localization, and propose scalable data driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines, such as the same model initialized with less sophisticated procedures. A nonparametric large-margin classifier is introduced and demonstrated to have several advantages over its competitors: lower training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity. We also propose a fairly realistic model of recognition problems that quantifies the interplay between representations, classifiers, and recognition performance. Based on data-describing measures, which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and to correlate significantly with the observed recognition performance. Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the test performance of the classifiers on the test sets, are reasonably accurate. We discuss various aspects of visual recognition problems, such as the interplay between representations and classification tasks and how different models can better adapt to the training data. We describe and analyze the aforementioned methods, which are designed to tackle different visual recognition problems but share one common characteristic: being data driven. / QC 20140604
|
60 |
A robust and reliable data-driven prognostics approach based on Extreme Learning Machine and Fuzzy Clustering / Une approche robuste et fiable de pronostic guidé par les données robustes et basée sur l'apprentissage automatique extrême et la classification floue. Javed, Kamran, 09 April 2014.
Prognostics and Health Management (PHM) aims at extending the life cycle of engineering assets while reducing exploitation and maintenance costs. For this reason, prognostics is considered a key process with predictive capabilities. Indeed, accurate estimates of the Remaining Useful Life (RUL) of an equipment enable defining further plans of action to increase safety, minimize downtime, ensure mission completion and achieve efficient production. Recent advances show that data-driven approaches (mainly based on machine learning methods) are increasingly applied for fault prognostics. They can be seen as black-box models that learn system behavior directly from Condition Monitoring (CM) data, use that knowledge to infer the current state and predict the future progression of failure. However, approximating the behavior of critical machinery is a challenging task that can result in poor prognostics. Some issues of data-driven prognostics modeling are highlighted as follows. 1) How to effectively process raw monitoring data to obtain suitable features that clearly reflect the evolution of degradation? 2) How to discriminate degradation states and define failure criteria (which can vary from case to case)? 3) How to be sure that learned models will be robust enough to show steady performance over uncertain inputs that deviate from the learned experiences, and reliable enough to handle unknown data (i.e., operating conditions, engineering variations, etc.)? 4) How to achieve ease of application under industrial constraints and requirements? Such issues constitute the problems addressed in this thesis and have led to the development of a novel approach going beyond conventional methods of data-driven prognostics.
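For context, a minimal Extreme Learning Machine regressor of the kind the title refers to: hidden-layer weights are drawn at random and only the output weights are solved for in closed form. The health-indicator feature and RUL target below are synthetic placeholders, and this sketch omits the thesis's fuzzy clustering and robustness/reliability machinery.

```python
import numpy as np

class ELMRegressor:
    """Minimal Extreme Learning Machine: random hidden layer, least-squares output."""
    def __init__(self, n_hidden=30, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid activations

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        self.beta = np.linalg.pinv(self._hidden(X)) @ y        # one least-squares solve
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Hypothetical degradation feature (a noisy health indicator) vs. normalized RUL
t = np.linspace(0, 1, 300)
rng = np.random.default_rng(1)
health = np.exp(-3 * t) + rng.normal(0, 0.02, t.size)
rul = 1.0 - t
model = ELMRegressor().fit(health[:200, None], rul[:200])
pred = model.predict(health[200:, None])
print("mean abs. error on held-out cycles:", round(np.abs(pred - rul[200:]).mean(), 3))
```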
|