1 |
DISTRICT HEAT PRICE MODEL ANALYSIS: A risk assessment of Mälarenergi's new district heat price model. Landelius, Erik; Åström, Magnus. January 2019 (has links)
Energy efficiency measures in buildings and alternative heating methods have led to a decreased demand for district heating (DH). Furthermore, due to a recent increase in extreme weather events, it is harder for DH providers to maintain a steady production, leading to increased costs. These issues have led DH companies to change their price models. This thesis investigated such a price model change, made by Mälarenergi (ME) on the 1st of August 2018. The aim was to compare the old price model (PM1) with the new price model (PM2) by investigating the choice of base and peak loads a customer can make for the upcoming year, and whether they should let ME choose for them. A prediction method, based on predicting the hourly DH demand, was chosen after a literature study, and several method comparisons were made using weather parameters as independent variables. Consumption data from Mälarenergi for nine customers of different sizes were gathered, and eight weather parameters from 2014 to 2018 were used to build the prediction model. The method comparison results from Unscrambler showed that multilinear regression was the most accurate statistical modelling method, and it was later used for all predictions. These predictions from Unscrambler were then used in MATLAB to estimate the total annual cost for each customer and outcome. For PM1, the results showed that the flexible cost for the nine customers accounts for 76 to 85 % of the total cost, with the remaining cost as fixed fees. For PM2, the flexible cost for the nine customers accounts for 46 to 61 % of the total cost, with the remainder as fixed cost. Regarding the total cost, PM2 is on average 7.5 % cheaper than PM1 for smaller customers, 8.6 % cheaper for medium customers and 15.9 % cheaper for larger customers.
By finding the lowest cost case for each customer, their optimal base and peak loads were found, and with the use of a statistical inference method (bootstrapping) a 95 % confidence interval for the base load and the total yearly cost could be established. The conclusion regarding choices is that the customer should always choose their own base load within the recommended confidence interval, with ME's choice seen as a recommendation. Moreover, ME should always make the peak load choice, because they are willing to pay for an excess fee that the customer would otherwise have to pay themselves.
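The bootstrap step described above, resampling a customer's data to get a percentile confidence interval, can be sketched as follows. The load values, resample count and statistic are illustrative stand-ins, not the thesis's actual data:

```python
import random

def bootstrap_ci(sample, stat=lambda xs: sum(xs) / len(xs),
                 n_resamples=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(sample)
    stats = sorted(
        stat([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical hourly base-load observations (kW) for one customer:
loads = [310, 295, 330, 305, 298, 340, 312, 301, 290, 325]
low, high = bootstrap_ci(loads)   # 95 % CI for the mean base load
```

The same resampling loop applies to the total yearly cost by swapping in a cost statistic.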
|
2 |
Modeling and analysis of actual evapotranspiration using data driven and wavelet techniques. Izadifar, Zohreh. 22 July 2010
Large-scale mining practices have disturbed many natural watersheds in northern Alberta, Canada. To restore disturbed landscapes and ecosystem functions, reconstruction strategies have been adopted with the aim of establishing sustainable reclaimed lands. The success of the reconstruction process depends on the design of reconstruction strategies, which can be optimized by improving the understanding of the controlling hydrological processes in the reconstructed watersheds. Evapotranspiration is one of the important components of the hydrological cycle; its estimation and analysis are crucial for better assessment of the reconstructed landscape hydrology, and for more efficient design. The complexity of the evapotranspiration process and its variability in time and space have imposed some limitations on previously developed evapotranspiration estimation models. The vast majority of the available models estimate the rate of potential evapotranspiration, which occurs under unlimited water supply conditions. However, the rate of actual evapotranspiration (AET) depends on the available soil moisture, which makes its physical modeling more complicated than that of potential evapotranspiration. The main objective of this study is to estimate and analyze the AET process in a reconstructed landscape.
Data driven techniques can model the process without a complete understanding of its physics. In this study, three data driven models: genetic programming (GP), artificial neural networks (ANNs), and multilinear regression (MLR), were developed and compared for estimating the hourly eddy covariance (EC)-measured AET using meteorological variables. The AET was modeled as a function of five meteorological variables: net radiation (Rn), ground temperature (Tg), air temperature (Ta), relative humidity (RH), and wind speed (Ws) in a reconstructed landscape located in northern Alberta, Canada. Several ANN models were evaluated using two training algorithms, Levenberg-Marquardt and Bayesian regularization. The GP technique was employed to generate mathematical equations correlating AET to the five meteorological variables. Furthermore, the available data were statistically analyzed to obtain MLR models and to identify the meteorological variables that have a significant effect on the evapotranspiration process. The utility of the investigated data driven models was also compared with that of the HYDRUS-1D model, a physically based model that uses the conventional Penman-Monteith (PM) method for the prediction of AET. The HYDRUS-1D model was examined for estimating AET using meteorological variables, leaf area index, and soil moisture information. Furthermore, wavelet analysis (WA), as a multiresolution signal processing tool, was examined to improve the understanding of the temporal variations in the available time series, by identifying the significant cyclic features, and to explore possible correlations between AET and the meteorological signals. WA was used with the purpose of determining the inputs of the AET models, a priori.
The results of this study indicated that all three proposed data driven models were able to approximate the AET reasonably well; however, the GP and MLR models had better generalization ability than the ANN model. The GP models demonstrated that the complex process of hourly AET can be efficiently modeled as simple semi-linear functions of a few meteorological variables. The results of the HYDRUS-1D model showed that a physically based model, such as HYDRUS-1D, might perform on par with or even worse than the data driven models in terms of overall prediction accuracy. The developed equation-based models, GP and MLR, revealed the larger contribution of net radiation and ground temperature, compared to the other variables, to the estimation of AET. It was also found that the interaction effects of meteorological variables are important for AET modeling. The results of wavelet analysis demonstrated the presence of both small-scale (2 to 8 hours) and larger-scale (e.g. diurnal) cyclic features in most of the investigated time series. Larger-scale cyclic features were found to be the dominant source of temporal variations in the AET and in most of the meteorological variables. The results of cross wavelet analysis indicated that the cause and effect relationship between AET and the meteorological variables might vary with the time-scale of variation under consideration. At small time-scales, significant linear correlations were observed between AET and the Rn, RH, and Ws time series, while at larger time-scales significant linear correlations were observed between AET and the Rn, RH, Tg, and Ta time series.
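As an illustration of MLR with an interaction term, the approach the study applies to AET, here is a minimal sketch on synthetic data. The coefficients, variable ranges and noise level are invented for the example and do not come from the study:

```python
import numpy as np

# Synthetic stand-in data: "AET" as a noisy linear function of net
# radiation (Rn) and ground temperature (Tg), plus their interaction.
rng = np.random.default_rng(0)
n = 200
Rn = rng.uniform(0, 600, n)        # W/m^2, illustrative range
Tg = rng.uniform(5, 30, n)         # deg C, illustrative range
aet = 0.05 * Rn + 0.8 * Tg + 0.001 * Rn * Tg + rng.normal(0, 1, n)

# Design matrix: intercept, main effects, and the interaction term
X = np.column_stack([np.ones(n), Rn, Tg, Rn * Tg])
coef, *_ = np.linalg.lstsq(X, aet, rcond=None)

pred = X @ coef
r2 = 1 - np.sum((aet - pred) ** 2) / np.sum((aet - aet.mean()) ** 2)
```

Including the `Rn * Tg` column is what lets an otherwise linear model capture the interaction effects the study found important.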
|
4 |
Perceptual map teaching strategy. Chipoco Quevedo, Mario. 15 November 2016 (has links)
This document presents the design of a strategy for teaching perceptual maps in a brand management course, together with a modeling technique for constructing them. Perceptual maps are tools for analyzing brand positioning and are taught in undergraduate and graduate courses. However, a purely descriptive and theoretical framework is commonly used, without explaining the mechanisms for building the maps. Methods based on multilinear regression and on factor analysis are presented as modeling tools, to be explained in class and to provide a better understanding of the subject.
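One common way to build such a perceptual map, sketched here with a hypothetical brand-by-attribute rating matrix (the brands, attributes and scores are invented for illustration), is to project centered ratings onto their first two principal components:

```python
import numpy as np

# Hypothetical rating matrix: rows are brands, columns are attributes
# (e.g. price, quality, innovation, service), on a 1-5 scale.
ratings = np.array([
    [2.0, 4.5, 4.0, 3.5],
    [4.5, 2.5, 2.0, 3.0],
    [3.0, 3.5, 4.5, 4.0],
    [4.0, 3.0, 2.5, 2.0],
    [2.5, 4.0, 3.5, 4.5],
])

# Center each attribute, then take the first two principal components,
# which become the two axes of the perceptual map.
centered = ratings - ratings.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
map_coords = centered @ Vt[:2].T   # one (x, y) point per brand
```

Plotting `map_coords` places each brand on the map; the attribute loadings in `Vt[:2]` indicate what each axis represents.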
|
5 |
Automatic Differential Diagnosis Model of Patients with Parkinsonian Syndrome: A model using multiple linear regression and classification tree learning. Löwe, Rakel; Schneider, Ida. January 2020 (has links)
Parkinsonian syndrome is an umbrella term covering several diseases with similar symptoms. PET images are key when differentially diagnosing patients with parkinsonian syndrome. In this work, two automatic diagnosis models are developed and evaluated, with PET images as input and a diagnosis as output. The two developed models are evaluated based on performance, in terms of sensitivity, specificity and misclassification error. Each model consists of 1) a regression model and 2) either a decision tree or a random forest. Two coefficients, alpha and beta, are introduced to train and test the models. The coefficients are the output of the regression model. They are calculated with multiple linear regression, with the patient images as dependent variables and the mean images of four patient groups as explanatory variables; the coefficients capture the underlying relationship between the two. The four patient groups consisted of 18 healthy controls, 21 patients with Parkinson's disease, 17 patients with dementia with Lewy bodies and 15 patients with vascular parkinsonism. The models predict the patients with misclassification errors of 27% for the decision tree and 34% for the random forest. The patient group that is easiest to classify according to both models is healthy controls; the hardest to classify is vascular parkinsonism. These results imply that alpha and beta are informative outcomes of PET scans and, after further development of the models, could be used as a guide when diagnosing.
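The regression step that produces the alpha and beta coefficients can be sketched as an ordinary least-squares fit of a patient image against group-mean images. The voxel count, two-group setup and mixing weights below are invented for illustration; the thesis uses four group means:

```python
import numpy as np

# Toy stand-in: each "PET image" is a flattened vector of voxel values.
rng = np.random.default_rng(1)
n_vox = 500
mean_healthy = rng.uniform(0.5, 1.5, n_vox)      # group-mean image 1
mean_parkinson = rng.uniform(0.5, 1.5, n_vox)    # group-mean image 2

# A synthetic patient: mostly Parkinson-like, with some healthy signal
patient = (0.3 * mean_healthy + 0.7 * mean_parkinson
           + rng.normal(0, 0.01, n_vox))

# Regress the patient image on the group-mean images; the fitted
# coefficients (alpha, beta) summarise the patient for the classifier.
X = np.column_stack([mean_healthy, mean_parkinson])
(alpha, beta), *_ = np.linalg.lstsq(X, patient, rcond=None)
```

The pair `(alpha, beta)` is then the low-dimensional feature fed into the decision tree or random forest.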
|
6 |
EFFECT OF SOCIOECONOMIC AND DEMOGRAPHIC FACTORS ON KENTUCKY CRASHES. Cambron, Aaron Berry. 01 January 2018 (has links)
The goal of this research was to examine the potential of drivers' socioeconomic and demographic data to predict Kentucky crash occurrence. Identifying unique background characteristics of at-fault drivers that contribute to crash rates and crash severity may lead to improved and more specific interventions to reduce the negative impacts of motor vehicle crashes. The driver-residence zip code was used as a spatial unit to connect five years of Kentucky crash data with socioeconomic factors from the U.S. Census, such as income, employment, education, and age, along with terrain and vehicle age. At-fault driver crash counts, normalized over the driving population, were used as the dependent variable in a multivariate linear regression modeling socioeconomic variables and their relationship with motor vehicle crashes. The final model consisted of nine socioeconomic and demographic variables and resulted in an R-square of 0.279, which indicates linear correlation but a lack of strong predictive power. The model produced both positive and negative correlations of socioeconomic variables with crash rates. Positive associations were found with the terrain index (a composite measure of road curviness), travel time, high school graduation and vehicle age. Negative associations were found with younger drivers, unemployment, college education, and terrain difference, which compares the terrain index at the driver residence and at the crash location. Further research seems warranted to fully understand the role that socioeconomic and demographic characteristics play in driving behavior and crash risk.
|
7 |
Use of the MPEG-4 ALS standard and inter-channel prediction for multi-channel electrocardiogram coding. Κωνσταντίνου, Ιωάννης (Konstantinou, Ioannis). 03 July 2009 (has links)
The electrocardiogram (ECG) is one of the most well studied medical signals. Especially in recent years, a large number of algorithms have been proposed for ECG processing, compression, filtering, denoising, automated diagnostic interpretation and coding.
In this master's thesis, we propose a robust lossless encoder architecture that operates on 12-channel ECG data. The encoder utilizes a highly efficient, patient-specific multilinear model for intra-channel prediction and applies the MPEG-4 Audio Lossless Coding (ALS) standard for inter-channel prediction and coding. Compared with standard lossless entropy coding techniques, the results show improved coding performance.
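The core idea behind predictive lossless coding of a single channel can be sketched with a toy order-2 linear predictor fitted by least squares; only the (much smaller) residual is left for the entropy coder. The signal here is a synthetic sinusoid with noise, not real ECG data:

```python
import numpy as np

# Synthetic quasi-periodic signal standing in for one ECG channel
rng = np.random.default_rng(2)
t = np.arange(1000)
signal = np.sin(2 * np.pi * t / 80) + 0.01 * rng.normal(size=t.size)

# Predict each sample from the previous two (lags 1 and 2),
# fitting the predictor coefficients by least squares.
X = np.column_stack([signal[1:-1], signal[:-2]])
y = signal[2:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

residual = y - X @ coef
# A good predictor shrinks the variance the entropy coder must handle
gain = signal.var() / residual.var()
```

The variance ratio `gain` is a rough proxy for how much a downstream entropy coder benefits from the prediction stage.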
|
8 |
Subject-specific geometric modeling of the human body (external and internal) from external data. Nérot, Agathe. 08 September 2016 (has links)
Digital human models have become instrumental tools in the analysis of posture and motion in many areas of biomechanics, with applications in ergonomics and clinical settings. These models include a geometric representation of the body surface and an internal linkage composed of rigid segments and joints that allows simulation of human movement. The customization of human models starts with the external anthropometric dimensions, which are then used as input data for adjusting the lengths of the internal skeletal segments. While external data points are increasingly easy to measure using current 3D scanning tools, the scientific challenge is to predict characteristic points of the internal skeleton from external data only. The Institut de Biomécanique Humaine Georges Charpak (Arts et Métiers ParisTech) has developed 3D reconstruction methods for bone and the external envelope from biplanar radiographs obtained with the low-dose EOS system (EOS Imaging, Paris). Using this technology, this work proposed new external-internal statistical relationships to predict points of the longitudinal skeleton, in particular the complete set of spine joint centers, from a database of 80 subjects. This work could improve the realism of current digital human models used for biomechanical analyses, mainly in ergonomics, that require information dependent on joint center location, such as estimates of range of motion and joint loading.
|
9 |
Prediction of mean hourly values of surface ozone concentrations from passive sampler measurements. Sinkulová, Michaela. January 2020
In terms of air pollution, ground-level ozone, according to current knowledge, contributes the most to damage to ecosystems. To calculate the key indicators of potential ecosystem damage, such as the exposure index AOT40 and stomatal flux, it is important to know the hourly ozone concentrations, which are the input data for both calculations. For environmental studies, O3 concentrations are usually not measured continuously but with passive (diffusion) samplers, which are exposed for a longer period (usually 1 week to 1 month) and thus indicate the average concentration over that period. The aim of this diploma thesis is the prediction of hourly concentrations of ground-level ozone from measurements by diffusive samplers, carried out in the period 2006-2010 in the Jizerské hory mountains. Monitoring always took place for 2 weeks during the vegetation seasons (April-October) at localities at various altitudes (714-1,000 m above sea level). Ogawa diffusive samplers were used. From these average concentrations and meteorological data, hourly values of ground-level ozone concentrations were calculated according to a model from a previous study, and these were compared with measurements from an...
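The thesis's prediction model is not detailed here, but the basic constraint any such disaggregation must satisfy can be sketched simply: scale a normalized diurnal profile so that its mean reproduces the sampler's measured average. The diurnal shape and the two-week mean below are invented for illustration, not the model used in the thesis:

```python
import math

sampler_mean = 42.0  # hypothetical two-week mean O3 concentration, ppb

# Hypothetical diurnal shape: 24 relative weights peaking in the
# afternoon, normalized so their mean is exactly 1.0.
shape = [0.8 + 0.4 * math.sin(2 * math.pi * (h - 8) / 24)
         for h in range(24)]
mean_shape = sum(shape) / 24
shape = [w / mean_shape for w in shape]

# Hourly estimates whose mean matches the passive sampler average
hourly = [sampler_mean * w for w in shape]
```

A real model would derive the shape from meteorology and continuous-monitor data; this sketch only shows the mean-preservation constraint.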
|
10 |
A contribution to modeling barley and malt quality for malting process control. Ajib, Budour. 18 December 2013 (has links)
In a continuously growing market, and in order to meet brewers' needs for high-quality malt, control of the malting process is essential. Malt quality is highly dependent on the operating conditions of the malting process, especially the steeping conditions, but also on the quality of the raw material, barley. In this study, we established polynomial models relating the operating conditions to malt quality. These models were coupled with our genetic algorithms to determine the optimal steeping conditions, either to reach a targeted malt quality (friability), or to allow malting at low water content while maintaining acceptable malt quality (to reduce water consumption and control the environmental costs of malt production). However, the variability of the raw material is a limiting factor for our approach. The established models are very sensitive to the barley species (spring or winter) and variety, and above all to the crop year. Variations in properties from one crop year to the next are poorly characterized and are therefore not incorporated in our models, which prevents us from capitalizing on experimental information over time. Some structural properties of barley (porosity, hardness) were considered as new factors to better characterize the raw material, but they did not explain the variations observed in the malthouse.
To characterize the raw material, 394 barley samples from the 2009, 2010 and 2011 crop years were analysed by MIR spectroscopy. PCA confirmed the significant effect of crop year, species, variety and even place of harvest on barley properties. For some years and species, a PLS regression made it possible to predict the protein and beta-glucan contents of barley from the MIR spectra. These results still face product variability; however, the new PLS models are promising and could be exploited to implement control strategies for the malting process based on MIR spectroscopic measurements.
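A minimal PLS1 regression sketch (NIPALS-style) on synthetic "spectra" illustrates the kind of model used to predict a property such as protein content from MIR data. The latent-factor structure, sample sizes and noise levels are invented and do not reflect the thesis's dataset:

```python
import numpy as np

def pls1(X, y, n_components):
    """Minimal NIPALS-style PLS1. Returns coefficients b such that
    (X - X.mean(0)) @ b + y.mean() approximates y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xc.T @ yc                 # weight: covariance direction
        w /= np.linalg.norm(w)
        t = Xc @ w                    # scores
        tt = t @ t
        p = Xc.T @ t / tt             # X loadings
        q_k = yc @ t / tt             # y loading
        Xc = Xc - np.outer(t, p)      # deflate X
        yc = yc - q_k * t             # deflate y
        W.append(w); P.append(p); q.append(q_k)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic spectra: 60 samples x 120 wavenumbers, with the predicted
# property driven by two latent spectral factors plus noise.
rng = np.random.default_rng(3)
T = rng.normal(size=(60, 2))
X = T @ rng.normal(size=(2, 120)) + 0.01 * rng.normal(size=(60, 120))
y = 3.0 * T[:, 0] - 1.5 * T[:, 1] + 0.01 * rng.normal(size=60)

b = pls1(X, y, n_components=2)
pred = (X - X.mean(axis=0)) @ b + y.mean()
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
```

PLS is preferred over ordinary least squares here because spectral channels are strongly collinear and far outnumber the samples.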
|