21

Turning Smart Water Meter Data Into Useful Information : A case study on rental apartments in Södertälje

Söderberg, Anna, Dahlström, Philip January 2017 (has links)
Managing water in urban areas is an increasingly complex challenge. Technology enables sustainable urban water management, and with integrated smart metering solutions, massive amounts of water consumption data can be collected from end users. However, the possibility of generating data from the end user holds no value in itself; it is through data analysis that the vast amount of collected data can be turned into more insightful information and potential benefits. A deeper understanding of the end user could benefit operational managers as well as the end users themselves. A single case study of a data set containing high-frequency end-user water consumption data from rental apartments has been conducted, where the data set was analyzed to see what information could be extracted and interpreted through an exploratory data analysis (EDA). Furthermore, an interview with the operational manager of the buildings under study and a literature review were carried out to understand how the gathered data is used today and to which contexts it could be extrapolated to provide potential benefits at the building level. The results suggest that EDA is a powerful approach when starting out without strong preconceptions of the data under study, and it successfully revealed patterns and a fundamental understanding of the data and its structure. Through the analysis, variations over time, water consumption patterns and excessive water users were identified, and a leak identification process was developed. Even more challenging than making meaning of the data is triggering actions, decisions and measures based on the data analysis. The unveiled information could be applied to improve operational building management, to empower customers, for business and campaign opportunities, and as input to an integrated decision support system. In summary, it is concluded that smart water metering data holds an untapped opportunity to save water, energy and money. In the drive towards a more sustainable and smarter city, smart water meter data from end users has the potential to enable smarter building management as well as smarter water services.
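As an illustration of the kind of screening such an exploratory analysis can support, the sketch below flags apartments with persistent night-time flow as leak candidates. It is not the thesis's actual procedure: the column names, thresholds and hourly-reading assumption are all illustrative.

```python
import pandas as pd

def flag_possible_leaks(readings: pd.DataFrame, night_hours=(2, 5), min_flow_l=1.0, min_nights=7):
    """Flag apartments whose night-time consumption never drops to zero.

    readings: DataFrame with columns ['apartment', 'timestamp', 'litres']
    holding hourly consumption per apartment. Persistent non-zero flow at
    night is a common heuristic indicator of a leak.
    """
    df = readings.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    night = df[df["timestamp"].dt.hour.between(*night_hours)]

    # Lowest hourly consumption per apartment and night
    nightly_min = night.groupby(["apartment", night["timestamp"].dt.date])["litres"].min()

    # Nights on which even the minimum flow exceeded the threshold
    suspicious = (nightly_min > min_flow_l).groupby(level="apartment").sum()
    return suspicious[suspicious >= min_nights].index.tolist()
```

In practice the flow threshold and the required number of suspicious nights would be tuned against the consumption patterns observed in the EDA.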
22

Piece-wise Linear Approximation for Improved Detection in Structural Health Monitoring

Essegbey, John W. 08 October 2012 (has links)
No description available.
23

Análise de agrupamentos baseada na topologia dos dados e em mapas auto-organizáveis. / Data clustering based on data topology and self organizing-maps.

Boscarioli, Clodis 16 May 2008 (has links)
More and more, in the context of major decision making, the analysis of massively stored data has become a necessity in the most varied areas of knowledge. Data analysis involves different tasks, which can be carried out with different techniques and strategies, such as data clustering. This research focuses on the data clustering task using Self-Organizing Maps (SOM) as its main artifact. SOM is an artificial neural network based on competitive, unsupervised learning, meaning that training is entirely driven by the data and that the neurons of the map compete with each other. This neural network has the ability to form mappings that quantize the data while preserving its topology. This work introduces a new SOM-based clustering methodology that considers both the topological map produced by the SOM and the topology of the data in the clustering process. An experimental and comparative analysis is presented, demonstrating the potential of the proposal and highlighting the main contributions of the work.
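The thesis's specific methodology is not reproduced here, but the generic two-stage idea it builds on (train a SOM, then cluster its prototype vectors and let each sample inherit the cluster of its best-matching unit) can be sketched as follows; the minisom package, the grid size and the cluster count are assumptions for illustration.

```python
import numpy as np
from minisom import MiniSom              # pip install minisom
from scipy.cluster.hierarchy import fcluster, linkage

def som_clustering(data: np.ndarray, grid=(10, 10), n_clusters=3, n_iter=5000):
    """Two-stage clustering: train a SOM, then cluster its prototype vectors.

    data: array of shape (n_samples, n_features), ideally standardized.
    Returns one cluster label per input sample.
    """
    som = MiniSom(grid[0], grid[1], data.shape[1], sigma=1.0, learning_rate=0.5)
    som.random_weights_init(data)
    som.train_random(data, n_iter)

    # Cluster the SOM prototype (codebook) vectors hierarchically
    codebook = som.get_weights().reshape(-1, data.shape[1])
    unit_labels = fcluster(linkage(codebook, method="ward"), n_clusters, criterion="maxclust")

    # Each sample inherits the cluster of its best-matching unit
    labels = [unit_labels[i * grid[1] + j] for i, j in (som.winner(sample) for sample in data)]
    return np.array(labels)
```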
24

Adaptive Prefetching for Visual Data Exploration

Doshi, Punit Rameshchandra 31 January 2003 (has links)
Loading data from slow persistent memory (disk storage) into main memory represents a bottleneck for current interactive visual data exploration applications, especially when applied to huge volumes of data. Semantic caching of queries at the client side is a recently emerging technology that can significantly improve the performance of such systems, though it may not in all cases fully achieve the near real-time responsiveness required by such interactive applications. We hence propose to augment semantic caching with prefetching: the system predicts the user's next requested data and loads it into the cache as a background process before the next user request is made. Our experimental studies confirm that prefetching indeed improves performance for interactive visual data exploration. However, a given prefetching technique is not always able to correctly predict changes in a user's navigation pattern; in particular, different users may have different navigation patterns, so a strategy that works for one user might fail for another. In this research, we tackle this shortcoming by applying the adaptation concept of strategy selection, allowing the choice of prefetching strategy to change over time both across and within user sessions. While other adaptive prefetching research has focused on refining a single strategy, we have instead developed a framework that facilitates strategy selection. For this, we explored various metrics to measure the performance of prefetching strategies in action and thus guide the adaptive selection process. This work is the first to study caching and prefetching in the context of visual data exploration. In particular, we have implemented and evaluated our proposed approach within XmdvTool, a freeware visualization system for visually exploring hierarchical multivariate data. We have tested our technique on real user traces gathered by the logging tool of our system as well as on synthetic user traces. Our results confirm that our adaptive approach improves system performance by selecting a good combination of prefetching strategies that adapts to the user's changing navigation patterns.
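A minimal sketch of the strategy-selection idea described above, not of the XmdvTool framework itself: prefetching strategies are assumed to be callables returning a set of candidate requests, and the selector simply delegates to whichever strategy has the best recent hit rate.

```python
from collections import deque

class AdaptiveStrategySelector:
    """Pick the prefetching strategy with the best recent hit rate.

    strategies: dict mapping a name to a callable that, given the current
    user position, returns a set of items to prefetch. Each strategy is
    scored on whether its last prediction contained the user's actual next
    request, over a sliding window of recent requests.
    """

    def __init__(self, strategies, window=20):
        self.strategies = strategies
        self.history = {name: deque(maxlen=window) for name in strategies}
        self.last_predictions = {}

    def prefetch(self, current_position):
        # Record every strategy's prediction so all of them can be scored later
        self.last_predictions = {
            name: strategy(current_position) for name, strategy in self.strategies.items()
        }
        best = max(self.history, key=self._hit_rate)
        return self.last_predictions[best]

    def record_actual_request(self, requested):
        # Score each strategy against what the user actually asked for next
        for name, predicted in self.last_predictions.items():
            self.history[name].append(requested in predicted)

    def _hit_rate(self, name):
        h = self.history[name]
        return sum(h) / len(h) if h else 0.0
```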
26

Spatial classification methods for efficient infiltration measurements and transfer of measuring results / Räumlich orientierte Klassifikationsverfahren für effiziente Fremdwassermessungen und für die Übertragung von Messergebnissen

Franz, Torsten 13 June 2007 (has links) (PDF)
A comprehensive knowledge of the infiltration situation in a sewer system is required for sustainable operation and cost-effective maintenance. Due to the high expenditure of infiltration measurements, an optimisation of the necessary measurement campaigns and a reliable transfer of measurement results to comparable areas are essential. Suitable methods were developed to improve the information yield of measurements by identifying appropriate measuring point locations and to assign measurement results to other potential measuring points by comparing sub-catchments and classifying reaches. The methods are based on the introduced similarity approach "similar sewer conditions lead to similar infiltration/inflow rates" and on modified multivariate statistical techniques. The developed methods have a high degree of freedom with respect to data needs. They were successfully tested on real and generated data. For suitable catchments it is estimated that the optimisation potential amounts to an accuracy improvement of up to 40 % compared to non-optimised measuring point configurations. With an acceptable error, the transfer of measurement results was successful for up to 75 % of the investigated sub-catchments. With the proposed methods it is possible to improve the information about the infiltration status of sewer systems and to reduce measurement-related uncertainty, which results in significant cost savings for the operator.
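As a rough illustration of the similarity approach (not the dissertation's actual classification method), the sketch below transfers measured infiltration/inflow rates to unmeasured sub-catchments via their nearest neighbour in a standardized attribute space; the attribute set is an assumption.

```python
import numpy as np

def transfer_infiltration_rates(measured_attrs, measured_rates, target_attrs):
    """Assign each unmeasured sub-catchment the infiltration rate of its most
    similar measured sub-catchment (nearest neighbour in standardized
    attribute space). Attributes could be e.g. pipe length, age or
    groundwater depth -- whatever numeric descriptors are available.

    measured_attrs: (n_measured, n_features) array
    measured_rates: (n_measured,) array of measured infiltration/inflow rates
    target_attrs:   (n_target, n_features) array
    """
    stacked = np.vstack([measured_attrs, target_attrs])
    mean, std = stacked.mean(axis=0), stacked.std(axis=0) + 1e-12
    m = (measured_attrs - mean) / std
    t = (target_attrs - mean) / std

    # Euclidean distance from every target to every measured sub-catchment
    dists = np.linalg.norm(t[:, None, :] - m[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return measured_rates[nearest], dists.min(axis=1)
```

The returned distances give a crude confidence measure: transfers over large attribute distances would be treated with caution.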
27

Statistical decisions in optimising grain yield

Norng, Sorn January 2004 (has links)
This thesis concerns Precision Agriculture (PA) technology, which involves methods developed to optimise grain yield by examining data quality and modelling the protein/yield relationship of wheat and sorghum fields in central and southern Queensland. An important part of developing strategies to optimise grain yield is an understanding of PA technology. This covers the major aspects of PA, which include all the components of a Site-Specific Crop Management (SSCM) system: 1. spatial referencing, 2. crop, soil and climate monitoring, 3. attribute mapping, 4. decision support systems and 5. differential action. Understanding how all five components fit into PA significantly aids the development of data analysis methods. The development of PA depends on the collection, analysis and interpretation of information. A preliminary data analysis step is described which covers both non-spatial and spatial data analysis methods. The non-spatial analysis involves plotting methods (maps, histograms), standard distributions and statistical summaries (mean, standard deviation). The spatial analysis covers both undirected and directional variogram analyses. In addition to the data analysis, a theoretical investigation into GPS error is given. GPS plays a major role in the development of PA. A number of sources of error affect the GPS and therefore the positioning measurements. An understanding of the distribution of the errors and how they are related to each other over time is therefore needed to complement the understanding of the nature of the data. Understanding the error distribution and the data gives useful insights for model assumptions with regard to position measurement errors. A review of filtering methods is given and new methods are developed, namely a strip analysis and a double harvesting algorithm. These methods are designed specifically for controlled traffic and normal traffic respectively, but can be applied to all kinds of yield monitoring data. The data resulting from the strip analysis and the double harvesting algorithm are used to investigate the relationship between on-the-go yield and protein. The strategy is to use protein and yield in determining decisions with respect to nitrogen management. The agronomic assumption, based on plot trials, is that protein and yield have a significant relationship. We investigate whether there is any significant relationship between protein and yield at the local level to warrant this kind of assumption. Understanding PA technology and being aware of the sources of error that exist in data collection and data analysis are all very important steps in developing management decision strategies.
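For illustration, an undirected empirical semivariogram of yield-monitor data, one of the spatial analyses mentioned above, can be computed as in the following sketch; the bin count and maximum lag are arbitrary choices, and the pairwise computation is only suitable for a few thousand points.

```python
import numpy as np

def empirical_variogram(coords, values, n_bins=15, max_dist=None):
    """Undirected empirical semivariogram of point data.

    gamma(h) = 0.5 * mean((z_i - z_j)^2) over point pairs whose separation
    falls in the lag bin around h.

    coords: (n, 2) array of x/y positions (e.g. yield monitor points)
    values: (n,) array of the measured variable (e.g. yield or protein)
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    sq_diff = 0.5 * (values[:, None] - values[None, :]) ** 2

    iu = np.triu_indices(len(values), k=1)        # count each pair once
    d, g = dist[iu], sq_diff[iu]
    if max_dist is None:
        max_dist = d.max() / 2                    # common rule of thumb
    bins = np.linspace(0, max_dist, n_bins + 1)
    idx = np.digitize(d, bins) - 1

    lags, gammas = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            lags.append(d[mask].mean())
            gammas.append(g[mask].mean())
    return np.array(lags), np.array(gammas)
```

A directional variogram follows the same pattern, with the pairs additionally filtered by the bearing between points.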
28

Visualization of Learning Paths as Networks of Topics

García, Sara January 2020 (has links)
Interactive visualizations have become one of the most widely used tools in Big Data fields for finding relationships and structured information in large datasets of unstructured information. In this project, these tools are applied to extract structured information from students following Self-Regulated Learning (SRL). By means of an interactive graph, we are able to study the paths that the students follow through the learning materials. Our visualization supports the investigation of students' behaviour patterns, which could later be used, for example, to adapt the study program to the student's needs in a dynamic way or to offer guidance where necessary.
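A minimal sketch of the underlying data structure, assuming the learning paths are available as per-student topic sequences; the networkx library and the input format are assumptions, not details from the thesis.

```python
import networkx as nx

def build_learning_path_graph(sessions):
    """Build a weighted directed graph of topic transitions.

    sessions: iterable of per-student topic sequences, e.g.
              [["intro", "loops", "functions"], ["intro", "functions"], ...]
    Edge weights count how often students moved from one topic to the next,
    which is what an interactive graph visualization would then render.
    """
    graph = nx.DiGraph()
    for sequence in sessions:
        for src, dst in zip(sequence, sequence[1:]):
            if graph.has_edge(src, dst):
                graph[src][dst]["weight"] += 1
            else:
                graph.add_edge(src, dst, weight=1)
    return graph
```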
29

Building predictive models for dynamic line rating using data science techniques

Doban, Nicolae January 2016 (has links)
Traditional power systems are statically rated, and renewable energy sources (RES) are sometimes curtailed in order not to exceed this static rating. The RES are curtailed because of their intermittent character, which makes it difficult to predict their output at specific time periods throughout the day. Dynamic Line Rating (DLR) technology can overcome this constraint by leveraging the available weather data and the technical parameters of the transmission line. The main goal of the thesis is to present prediction models of Dynamic Line Rating (DLR) capacity one day and two days ahead. The models are evaluated based on their error rate profiles. DLR provides the capability to up-rate the line(s) according to the environmental conditions and always yields a much higher rating than the static rating. By implementing DLR a power utility can increase the efficiency of the power system, decrease RES curtailment and optimize their integration within the grid. DLR mainly depends on the weather parameters; specifically, at high wind speeds and low ambient temperatures the DLR reaches its highest values. This is especially profitable for wind energy producers, who can both produce more (until pitch control limits output) and transmit more during high wind speed periods over the same given line(s), thus increasing energy efficiency. The DLR was calculated by employing modern Data Science and Machine Learning tools and techniques, leveraging historical weather and transmission line data provided by SMHI and Vattenfall respectively. An initial phase of Exploratory Data Analysis (EDA) was carried out to understand data patterns and relationships between different variables, as well as to determine the most predictive variables for DLR. All the predictive models and data processing routines were built in open source R and are available on GitHub. Three types of models were built: for historical data, for the one-day-ahead and for the two-days-ahead time horizons. The models built for the two forecasting horizons registered low error rates of 9% (one day ahead) and 11% (two days ahead). As expected, the predictive models built on historical data were more accurate, with errors as low as 2%-3%. In conclusion, the implemented models met the requirement set by Vattenfall of a maximum error of 20% and can be applied in the control room for that specific line. Moreover, predictive models can also be built for other lines if the required data is available. The findings and outcomes of this Master Thesis project can therefore be reproduced for other power lines and geographic locations in order to achieve a more efficient power system and an increased share of RES in the energy mix.
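The thesis's models were built in R; purely as an illustration of the same idea, a day-ahead DLR regression on forecast weather features might be sketched in Python as below. The feature names, the target column and the choice of a random forest are assumptions, not the thesis's actual model.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

def train_day_ahead_dlr_model(data: pd.DataFrame):
    """Fit a regression model that predicts line ampacity from forecast weather.

    data: one row per hour with forecast weather features and the realized
    dynamic line rating ('ampacity_a'). Column names are illustrative.
    """
    features = ["wind_speed_ms", "wind_direction_deg", "ambient_temp_c", "solar_radiation_wm2"]
    X, y = data[features], data["ampacity_a"]
    # Keep the split chronological: never train on data from the future
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

    model = RandomForestRegressor(n_estimators=300, random_state=0)
    model.fit(X_train, y_train)
    error = mean_absolute_percentage_error(y_test, model.predict(X_test))
    return model, error
```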
30

Computing Random Forests Variable Importance Measures (VIM) on Mixed Numerical and Categorical Data / Beräkning av Random Forests variable importance measures (VIM) på kategoriska och numeriska prediktorvariabler

Hjerpe, Adam January 2016 (has links)
The Random Forest model is commonly used as a predictor function and has proven useful in a variety of applications. Its popularity stems from the combination of high prediction accuracy, the ability to model high-dimensional complex data, and applicability under predictor correlations. This report investigates the random forest variable importance measure (VIM) as a means to rank important variables. The robustness of the VIM under imputation of categorical noise, and its capability to differentiate informative predictors from non-informative variables, are investigated. The selection of variables may improve the robustness of the predictor, improve prediction accuracy, reduce computational time, and may serve as an exploratory data analysis tool. In addition, the partial dependence plot obtained from the random forest model is examined as a means to find underlying relations in a non-linear simulation study.
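As an illustration, the sketch below computes a permutation-based variable importance on mixed numerical and categorical predictors by wrapping the encoding and the forest in one pipeline, so each score refers to an original variable. This is one common way to obtain a VIM, not necessarily the exact measure studied in the report.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

def variable_importance(X: pd.DataFrame, y, n_repeats=10):
    """Permutation-based VIM for a random forest on mixed-type predictors.

    Categorical columns are one-hot encoded inside the pipeline, while the
    permutation is applied to the original DataFrame columns, so each
    importance score refers to one original variable.
    """
    categorical = X.select_dtypes(include=["object", "category"]).columns.tolist()
    preprocess = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough",            # numeric columns pass through untouched
    )
    model = Pipeline([
        ("prep", preprocess),
        ("forest", RandomForestClassifier(n_estimators=300, random_state=0)),
    ])
    model.fit(X, y)

    result = permutation_importance(model, X, y, n_repeats=n_repeats, random_state=0)
    return pd.Series(result.importances_mean, index=X.columns).sort_values(ascending=False)
```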
