  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

FORECASTING AMERICAN INDUSTRIAL PRODUCTION WITH HIGH DIMENSIONAL ENVIRONMENTS FROM FINANCIAL MARKETS, SENTIMENTS, EXPECTATIONS, AND ECONOMIC VARIABLES

EDUARDO OLIVEIRA MARINHO 20 February 2020 (has links)
This thesis presents 6 different forecasting techniques for the monthly variation of the American Industrial Production Index in 3 different environments, totaling 18 models. In the first environment, the explanatory variables were the lags of the monthly variation of the Industrial Production Index together with 55 other market and expectation variables, such as sector returns, the market risk premium, implied volatility, interest rate risk premiums (corporate and long-term), consumer sentiment, and an uncertainty index. In the second environment, the FRED database with 130 economic variables was used as the set of explanatory variables. In the third environment, the most relevant variables from environments 1 and 2 were used. An improvement in predicting IP against an AR model was observed, along with some interpretations regarding the behavior of the American economy over the last 45 years (the importance of economic sectors, periods of uncertainty, and changes in the response to the risk premium, volatility, and interest rates).
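The abstract's central comparison is a high-dimensional forecast against an AR benchmark. A minimal sketch of that setup, with synthetic data standing in for the 55 market and expectation variables (all names and numbers below are illustrative, not from the thesis):

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
# Hypothetical monthly series driven by 55 noisy predictors, only a
# few of them informative -- a stand-in for the market and
# expectation variables in the abstract. All numbers are illustrative.
n, p = 300, 55
X = rng.standard_normal((n, p))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.1 * rng.standard_normal(n)

train, test = slice(0, 240), slice(240, 300)

# AR(1) benchmark: regress y_t on y_{t-1}.
ar = LinearRegression().fit(y[train][:-1].reshape(-1, 1), y[train][1:])
ar_pred = ar.predict(y[239:299].reshape(-1, 1))

# Lasso forecast using the high-dimensional predictors.
lasso = Lasso(alpha=0.05).fit(X[train], y[train])
lasso_pred = lasso.predict(X[test])

def mse(pred):
    return float(np.mean((y[test] - pred) ** 2))

print(mse(ar_pred), mse(lasso_pred))
```

When the target truly depends on the external predictors, the penalized regression beats the pure autoregression out of sample, which is the kind of improvement over the AR model the abstract reports.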

Predicting PV self-consumption in villas with machine learning

GALLI, FABIAN January 2021 (has links)
In Sweden, there is a strong and growing interest in solar power. In recent years, photovoltaic (PV) system installations have increased dramatically, and a large share are distributed, grid-connected PV systems, i.e. rooftop installations. Currently, the electricity export rate is significantly lower than the import rate, which has made the amount of self-consumed PV electricity a critical factor when assessing system profitability. Self-consumption (SC) is calculated using hourly or sub-hourly timesteps and is highly dependent on the solar patterns of the location of interest, the PV system configuration, and the building load. As these vary across potential installations, it is difficult to make estimations without historical data on both load and local irradiance, which are often hard to acquire or simply unavailable. A method to predict SC using information commonly available at the planning phase is therefore preferred. There is a scarcity of documented SC data and only a few reports treating the subject of mapping or predicting SC. This thesis therefore investigates the possibility of using machine learning to create models able to predict SC from the following inputs: annual load, annual PV production, tilt angle and azimuth angle of the modules, and latitude. Using the programming language Python, seven models are created with regression techniques, trained on real load data and simulated PV data from the south of Sweden, and evaluated using the coefficient of determination (R2) and the mean absolute error (MAE). The techniques are linear regression, polynomial regression, Ridge regression, Lasso regression, k-nearest neighbors (kNN), Random Forest, and Multi-Layer Perceptron (MLP), as well as the only other SC prediction model found in the literature. A parametric analysis of the models is conducted, removing one variable at a time to assess each model's dependence on that variable.
The results are promising: five out of eight models achieve an R2 value above 0.9 and can be considered good predictors of SC. The best-performing model, Random Forest, has an R2 of 0.985 and an MAE of 0.0148. The parametric analysis also shows that while additional input data is helpful, using only the annual load and the annual PV production is sufficient to make good predictions. These conclusions hold only for the southern region of Sweden, however, and are not applicable to areas outside the latitudes or country tested.
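The pipeline described above — tabular inputs, a Random Forest regressor, and R2/MAE evaluation — can be sketched with synthetic data. The self-consumption target below is a toy function of the load/PV ratio, not the thesis's real data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 2000
# Hypothetical inputs: annual load (kWh), annual PV production (kWh),
# module tilt (deg), azimuth (deg from south), latitude (deg).
load = rng.uniform(3000, 15000, n)
pv = rng.uniform(2000, 12000, n)
tilt = rng.uniform(10, 60, n)
azimuth = rng.uniform(-90, 90, n)
lat = rng.uniform(55, 60, n)
# Toy SC target: grows with the load/PV ratio, bounded in (0, 1).
sc = 1.0 / (1.0 + np.exp(-(load / pv - 1.0))) + 0.02 * rng.standard_normal(n)
sc = np.clip(sc, 0.0, 1.0)

X = np.column_stack([load, pv, tilt, azimuth, lat])
X_tr, X_te, y_tr, y_te = train_test_split(X, sc, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)
r2 = float(r2_score(y_te, pred))
mae = float(mean_absolute_error(y_te, pred))
print(round(r2, 3), round(mae, 4))
```

Rerunning the fit with columns dropped one at a time mirrors the thesis's parametric analysis: in this toy setup, only the load and PV columns matter, matching the finding that those two inputs suffice.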

PV self-consumption: Regression models and data visualization

Tóth, Martos January 2022 (has links)
In Sweden, the installed capacity of residential PV systems is increasing every year. The lack of a feed-in tariff scheme means that the techno-economic optimization of PV systems is mainly based on self-consumption. Calculating this parameter requires hourly building loads and hourly PV generation, data that cannot easily be obtained from households. A predictive model based on already available data would therefore be preferable. Existing machine learning models may be suitable and have been tested, but the literature on this topic is fairly sparse. The machine learning models use a dataset that includes real measured building load data, simulated PV generation data, and the self-consumption calculated from these two inputs. The simulation of PV generation can be based on a Typical Meteorological Year (TMY) weather file or on measured weather data. The TMY file can be generated more quickly and easily, but it is only spatially matched to the building load, while measured data is matched both temporally and spatially. This thesis investigates whether using a TMY file has any major impact on the performance of the regression models by comparing it to a model built on measured weather data; here the buildings are single-family houses from the south of Sweden. Different building types can have different load profiles, which can affect model performance, and because of this the effect of using a TMY file may be more significant. This thesis therefore also compares the impact of TMY file usage in the case of multi-family houses, and compares the two building types in terms of machine learning model performance. PV and battery prices are decreasing from year to year, and subsidies in Sweden offer a significant tax credit on battery investments combined with PV systems, which can make batteries profitable.
Lastly, this thesis evaluates the performance of the machine learning models after adding a battery to the system, for both TMY and measured data. The optimal system is also predicted based on self-consumption, PV generation, and battery size. The models have high accuracy: the random forest model is above 0.9 R2 for all cases. The results confirm that using the TMY file leads only to marginal errors, so it can be used for training the models. The battery model shows promising results, with R2 above 0.9 for four models: random forest, k-NN, MLP, and polynomial. The prediction of the optimal system also shows promising results for the polynomial model, with an 18% error in the predicted payback time compared to the reference.

Dynamics of elastic and viscous filaments

Brun, Pierre-Thomas 17 September 2012 (has links) (PDF)
This manuscript deals jointly with the dynamics of elastic and viscous filaments. Their dynamics are described by the Kirchhoff equations, which are discussed at length in the manuscript. The Lagrangian description of viscous filaments led to a numerical code for their simulation: Discrete Viscous Rods. It is validated on the example of coiling viscous filaments, similar to what is seen when honey falls onto toast. We then study a variant: the fluid sewing machine. The filament falls onto a conveyor belt whose speed is controlled, and forms a myriad of complex patterns resembling those of a real sewing machine. We produce a numerical phase diagram of these patterns and describe them within a unified kinematic formalism. The kinematics of the filament's contact with the belt is, in fact, at the origin of their formation: the filament combines exact fractions of its natural frequency to adapt to the speed prescribed by the belt, and the patterns are a trace of this adaptation. Next, we study an elastic example of filament dynamics: the unrolling of a spring of constant curvature, initially laid flat on a surface and held at one end, the other being released at t = 0. Elastic bending energy is converted as the spring lifts off the support. Its coiling is inertial and consists of a region whose radius is twice that of the spring at rest. The coil has a self-similar shape and advances at constant speed. We finish with a study of the dynamics of the lasso, investigated with a minimal model of a chain with a slip knot.

Contributions to statistical learning in sparse models

Alquier, Pierre 06 December 2013 (has links) (PDF)
This habilitation thesis covers various contributions to estimation and statistical learning in high-dimensional models, under different sparsity assumptions. The first part introduces the problem of high-dimensional statistics in a generic linear regression model. After reviewing the popular estimation methods for this model, new results from (Alquier & Lounici 2011) on aggregated estimators are presented. The second part essentially extends the results of the first part to the estimation of various time series models (Alquier & Doukhan 2011, Alquier & Wintenberger 2013, Alquier & Li 2012, Alquier, Wintenberger & Li 2012). Finally, the third part presents several extensions to non-parametric models or to more specific applications such as quantum statistics (Alquier & Biau 2013, Guedj & Alquier 2013, Alquier, Meziani & Peyré 2013, Alquier, Butucea, Hebiri, Meziani & Morimae 2013, Alquier 2013, Alquier 2008). In each section, estimators are proposed and, whenever possible, optimal oracle inequalities are established.

Non-parametric inference for Poissonian interactions

Sansonnet, Laure 14 June 2013 (has links) (PDF)
The purpose of this thesis is to study various non-parametric statistical problems within the framework of a model of Poissonian interactions. Such models are used, for example, in neuroscience to analyze the interactions between two neurons through their emission of action potentials during the recording of brain activity, or in genomics to study favored or avoided distances between two motifs along the genome. In this framework, we introduce a so-called reproduction function that quantifies the preferential positions of the motifs and can be modeled by the intensity of a Poisson process. First, we are interested in estimating this function, which is assumed to be very localized. We propose an adaptive estimation procedure based on thresholding wavelet coefficients, which is optimal from both the oracle and minimax points of view. Simulations and a genomics application on real data from the bacterium E. coli demonstrate the good practical behavior of our procedure. We then address the associated testing problems, which consist of testing whether the reproduction function is null. To this end, we construct a testing procedure that is minimax-optimal over weak Besov spaces and which has also shown good practical performance. Finally, we extend this work by studying a discrete, high-dimensional version of the previous model, for which we propose an adaptive Lasso-type procedure.

Learning algorithms for sparse classification

Sanchez Merchante, Luis Francisco 07 June 2013 (has links) (PDF)
This thesis deals with the development of estimation algorithms with embedded feature selection in the context of high-dimensional data, in both the supervised and unsupervised frameworks. The contributions of this work are materialized in two algorithms: GLOSS for the supervised domain and Mix-GLOSS for its unsupervised counterpart. Both algorithms are based on solving an optimal scoring regression regularized with a quadratic formulation of the group-Lasso penalty, which encourages the removal of uninformative features. The theoretical foundations proving that a group-Lasso penalized optimal scoring regression can be used to solve a linear discriminant analysis were first developed in this work. The theory that adapts this technique to the unsupervised domain by means of the EM algorithm is not new, but it had never been clearly exposed for a sparsity-inducing penalty. This thesis solidly demonstrates that using group-Lasso penalized optimal scoring regression inside an EM algorithm is possible. Our algorithms have been tested on real and artificial high-dimensional databases, with impressive results in terms of parsimony without compromising prediction performance.
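The group-Lasso penalty at the core of GLOSS can be illustrated with a generic proximal-gradient solver for group-sparse least squares — a simplified stand-in for the penalized optimal scoring problem, with all data sizes and parameter values synthetic and illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy group-sparse problem: 10 groups of 3 features each, where only
# the first two groups carry signal. All sizes are illustrative.
n, p, n_groups = 200, 30, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:6] = [2.0, -2.0, 1.5, 1.0, -1.0, 0.5]  # groups 0 and 1 active
y = X @ beta_true + 0.1 * rng.standard_normal(n)

groups = np.arange(p) // 3
lam = 5.0
t = 1.0 / np.linalg.eigvalsh(X.T @ X).max()  # step size 1/L

beta = np.zeros(p)
for _ in range(2000):
    v = beta - t * (X.T @ (X @ beta - y))      # gradient step
    for g in range(n_groups):                  # group-wise soft-thresholding
        idx = groups == g
        norm = np.linalg.norm(v[idx])
        v[idx] = 0.0 if norm <= t * lam else v[idx] * (1 - t * lam / norm)
    beta = v

active = sorted(g for g in range(n_groups)
                if np.linalg.norm(beta[groups == g]) > 1e-10)
print(active)
```

The proximal step zeroes out entire groups at once, which is exactly the "removal of uninformative features" behavior the abstract describes: whole blocks of coefficients are either kept together or discarded together.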

Probabilistic solar power forecasting using partially linear additive quantile regression models: an application to South African data

Mpfumali, Phathutshedzo 18 May 2019 (has links)
MSc (Statistics) / Department of Statistics / This study discusses an application of partially linear additive quantile regression models in predicting medium-term global solar irradiance, using data from the Tellerie radiometric station in South Africa for the period August 2009 to April 2010. Variables are selected using the least absolute shrinkage and selection operator (Lasso) via hierarchical interactions, and the parameters of the developed models are estimated using Barrodale and Roberts's algorithm. The best models are selected based on the Akaike information criterion (AIC), the Bayesian information criterion (BIC), adjusted R squared (AdjR2), and generalised cross-validation (GCV). The accuracy of the forecasts is evaluated using the mean absolute error (MAE) and the root mean square error (RMSE). To improve forecast accuracy, a convex forecast combination algorithm is used, in which the average loss suffered by the models is based on the pinball loss function. A second forecast combination method, quantile regression averaging (QRA), is also used. The best set of forecasts is selected based on the prediction interval coverage probability (PICP), the prediction interval normalised average width (PINAW), and the prediction interval normalised average deviation (PINAD). The results show that QRA is the best model, since it produces more robust prediction intervals than the other models. The percentage improvement is calculated, and the results demonstrate that the QRA model yields a small improvement over the GAM with interactions and a larger percentage improvement over the convex forecast combination model. A major contribution of this dissertation is the inclusion of a non-linear trend variable and the extension of forecast combination models to include QRA. / NRF

Statistical and Machine Learning for assessment of Traumatic Brain Injury Severity and Patient Outcomes

Rahman, Md Abdur January 2021 (has links)
Traumatic brain injury (TBI) is a leading cause of death across all age groups and a serious public concern. However, medical science still lacks reliable TBI diagnostics and patient outcome prediction. In this thesis, I used a subset of the TBIcare data from Turku University Hospital in Finland to classify injury severity, patient outcomes, and CT (computerized tomography) findings as positive/negative. The dataset was derived from comprehensive metabolic profiling of serum samples from TBI patients. The study included 96 TBI patients, diagnosed as 7 severe (sTBI=7), 10 moderate (moTBI=10), and 79 mild (mTBI=79). Among them, there were 85 good recoveries (Good_Recovery=85) and 11 bad recoveries (Bad_Recovery=11), as well as 49 CT positive (CT.Positive=49) and 47 CT negative (CT.Negative=47). There was a total of 455 metabolites (features), excluding the three response variables. Feature selection techniques were applied to retain the most important features while discarding the rest. Subsequently, four methods were used for classification: Ridge regression, Lasso regression, a neural network, and deep learning. Ridge regression yielded the best results for the binary classifications, i.e., patient outcomes and CT positive/negative: the accuracy for CT positive/negative was 74% (AUC of 0.74), while the accuracy for patient outcomes was 91% (AUC of 0.91). For severity classification (a multi-class problem), the neural network performed well, with a total accuracy of 90%. Despite the limited number of data points, the overall results were satisfactory.

Forecasting hourly electricity demand in South Africa using machine learning models

Thanyani, Maduvhahafani 12 August 2020 (has links)
MSc (Statistics) / Department of Statistics / Short-term load forecasting in South Africa using machine learning and statistical models is discussed in this study. The research focuses on a comparative analysis of models for forecasting hourly electricity demand, carried out using South Africa's aggregated hourly load data from Eskom. The comparison covers support vector regression (SVR), stochastic gradient boosting (SGB), and artificial neural networks (NN), with a generalized additive model (GAM) as the benchmark. In both modelling frameworks, variable selection is done using the least absolute shrinkage and selection operator (Lasso). The SGB model yielded the lowest root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) on the testing data, and also the lowest RMSE, MAE, and MAPE on the training data. The models' forecasts are combined using a convex combination and quantile regression averaging (QRA). QRA was found to be the best forecast combination model based on the RMSE, MAE, and MAPE. / NRF
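The convex-combination step used here weights the individual models' forecasts so the blend does at least as well as its best member. A minimal numpy sketch with toy forecasts standing in for the SGB and NN models (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy hourly demand with a daily cycle, plus two hypothetical model
# forecasts with different error levels (stand-ins for e.g. the SGB
# and NN forecasts; all numbers are illustrative).
hours = np.arange(500)
y = 30 + 5 * np.sin(2 * np.pi * hours / 24)
f1 = y + rng.normal(0.0, 1.0, 500)  # stronger model
f2 = y + rng.normal(0.0, 2.0, 500)  # weaker model

# Convex combination: choose w in [0, 1] minimizing the RMSE of
# the combined forecast w*f1 + (1-w)*f2 on a validation set.
grid = np.linspace(0.0, 1.0, 101)
rmses = [float(np.sqrt(np.mean((y - (w * f1 + (1 - w) * f2)) ** 2)))
         for w in grid]
w_best = float(grid[int(np.argmin(rmses))])
rmse_best = min(rmses)
print(w_best, round(rmse_best, 3))
```

With independent errors, the optimal weight tilts toward the stronger model but stays strictly inside (0, 1): even the weaker forecast contributes, because its errors partly cancel those of the stronger one. QRA generalizes this idea by fitting quantile regressions on the pool of forecasts instead of a single weighted mean.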
