Global ETD Search

391	The Effects of Long-Term Exposure of an Artificially Assembled Microbial Community to Uranium or Low pH Brzoska, Ryann Michelle 27 July 2015 (has links) No description available. Microbiology Ecology Biology Environmental Science Artificial microbial community consortia Oak Ridge interactions evolution long-term uranium low pH Sediminibacterium genomics physiology coculture
392	Sélection bayésienne de variables et méthodes de type Parallel Tempering avec et sans vraisemblance Baragatti, Meïli 10 November 2011 (has links) Cette thèse se décompose en deux parties. Dans un premier temps nous nous intéressons à la sélection bayésienne de variables dans un modèle probit mixte. L'objectif est de développer une méthode pour sélectionner quelques variables pertinentes parmi plusieurs dizaines de milliers tout en prenant en compte le design d'une étude, et en particulier le fait que plusieurs jeux de données soient fusionnés. Le modèle de régression probit mixte utilisé fait partie d'un modèle bayésien hiérarchique plus large et le jeu de données est considéré comme un effet aléatoire. Cette méthode est une extension de la méthode de Lee et al. (2003). La première étape consiste à spécifier le modèle ainsi que les distributions a priori, avec notamment l'utilisation de l'a priori conventionnel de Zellner (g-prior) pour le vecteur des coefficients associé aux effets fixes (Zellner, 1986). Dans une seconde étape, nous utilisons un algorithme Metropolis-within-Gibbs couplé à la grouping (ou blocking) technique de Liu (1994) afin de surmonter certaines difficultés d'échantillonnage. Ce choix a des avantages théoriques et computationnels. La méthode développée est appliquée à des jeux de données microarray sur le cancer du sein. Cependant elle a une limite : la matrice de covariance utilisée dans le g-prior doit nécessairement être inversible. Or il y a deux cas pour lesquels cette matrice est singulière : lorsque le nombre de variables sélectionnées dépasse le nombre d'observations, ou lorsque des variables sont combinaisons linéaires d'autres variables. Nous proposons donc une modification de l'a priori de Zellner en y introduisant un paramètre de type ridge, ainsi qu'une manière de choisir les hyper-paramètres associés.
L'a priori obtenu est un compromis entre le g-prior classique et l'a priori supposant l'indépendance des coefficients de régression, et se rapproche d'un a priori précédemment proposé par Gupta et Ibrahim (2007). Dans une seconde partie nous développons deux nouvelles méthodes MCMC basées sur des populations de chaînes. Dans le cas de modèles complexes ayant de nombreux paramètres, mais où la vraisemblance des données peut se calculer, l'algorithme Equi-Energy Sampler (EES) introduit par Kou et al. (2006) est apparemment plus efficace que l'algorithme classique du Parallel Tempering (PT) introduit par Geyer (1991). Cependant, il est difficile d'utilisation lorsqu'il est couplé avec un échantillonneur de Gibbs, et nécessite un stockage important de valeurs. Nous proposons un algorithme combinant le PT avec le principe d'échanges entre chaînes ayant des niveaux d'énergie similaires dans le même esprit que l'EES. Cette adaptation appelée Parallel Tempering with Equi-Energy Moves (PTEEM) conserve l'idée originale qui fait la force de l'algorithme EES tout en assurant de bonnes propriétés théoriques et une utilisation facile avec un échantillonneur de Gibbs. Enfin, dans certains cas complexes l'inférence peut être difficile car le calcul de la vraisemblance des données s'avère trop coûteux, voire impossible. De nombreuses méthodes sans vraisemblance ont été développées. Par analogie avec le Parallel Tempering, nous proposons une méthode appelée ABC-Parallel Tempering, basée sur la théorie des MCMC, utilisant une population de chaînes et permettant des échanges entre elles. / This thesis is divided into two main parts. In the first part, we propose a Bayesian variable selection method for probit mixed models. The objective is to select a few relevant variables among tens of thousands while taking into account the design of a study, and in particular the fact that several datasets are merged together.
The probit mixed model used is considered as part of a larger hierarchical Bayesian model, and the dataset is introduced as a random effect. The proposed method extends the work of Lee et al. (2003). The first step is to specify the model and prior distributions. In particular, we use the g-prior of Zellner (1986) for the fixed regression coefficients. In a second step, we use a Metropolis-within-Gibbs algorithm combined with the grouping (or blocking) technique of Liu (1994). This choice has both theoretical and practical advantages. The method developed is applied to merged microarray datasets of patients with breast cancer. However, this method has a limitation: the covariance matrix involved in the g-prior must not be singular. But there are two standard cases in which it is singular: if the number of observations is lower than the number of variables, or if some variables are linear combinations of others. In such situations we propose to modify the g-prior by introducing a ridge parameter, and a simple way to choose the associated hyper-parameters. The prior obtained is a compromise between the conditional independent case of the coefficient regressors and the automatic scaling advantage offered by the g-prior, and can be linked to the work of Gupta and Ibrahim (2007). In the second part, we develop two new population-based MCMC methods. In cases of complex models with many parameters, but whose likelihood can be computed, the Equi-Energy Sampler (EES) of Kou et al. (2006) seems to be more efficient than the Parallel Tempering (PT) algorithm introduced by Geyer (1991). However, it is difficult to use in combination with a Gibbs sampler, and it requires more storage. We propose an algorithm combining the PT with the principle of exchange moves between chains with similar energy levels, in the spirit of the EES.
This adaptation, which we call Parallel Tempering with Equi-Energy Moves (PTEEM), keeps the original idea of the EES method while ensuring good theoretical properties and practical use in combination with a Gibbs sampler. Finally, in some complex models whose likelihood is analytically or computationally intractable, inference can be difficult. Several likelihood-free methods (or Approximate Bayesian Computational Methods) have been developed. We propose a new algorithm, the Likelihood Free-Parallel Tempering, based on the MCMC theory and on a population of chains, by using an analogy with the Parallel Tempering algorithm. Sélection bayésienne de variables Modèle probit mixte A priori de Zellner Paramètre ridge Monte Carlo Markov Chains Parallel Tempering Equi-Energy Sampler Approximate Bayesian Computation Méthodes sans vraisemblance Bayesian variable selection Probit mixed model Zellner g-prior Ridge parameter Monte Carlo Markov Chains Parallel Tempering Equi-Energy Sampler Approximate Bayesian Computation Likelihood-Free methods
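The exchange move at the heart of Parallel Tempering, as described in entry 392, can be sketched in a few lines of Python. This is a minimal illustration of the standard PT swap rule only, assuming each chain i targets a density proportional to exp(-beta_i * E(x)); it does not implement the equi-energy restriction of PTEEM or the likelihood-free variant. The function names, the list-based layout of the chains and the example values are hypothetical.

```python
import math
import random


def swap_log_ratio(beta_i, beta_j, energy_i, energy_j):
    """Log of the Metropolis ratio for exchanging the states of two chains.

    Chain i targets exp(-beta_i * E(x)); swapping the states multiplies the
    joint density by exp((beta_i - beta_j) * (energy_i - energy_j)).
    """
    return (beta_i - beta_j) * (energy_i - energy_j)


def attempt_swap(states, energies, betas, i, j, rng):
    """Propose exchanging chains i and j in place; return True if accepted."""
    log_ratio = swap_log_ratio(betas[i], betas[j], energies[i], energies[j])
    # Accept with probability min(1, exp(log_ratio)).
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        states[i], states[j] = states[j], states[i]
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False


# Illustrative values (assumptions): chain 0 is "cold" (beta = 1.0) and is
# stuck in a high-energy state; chain 1 is "hot" (beta = 0.5) and has found
# a lower-energy state.
betas = [1.0, 0.5]
states = ["x_high", "x_low"]
energies = [3.0, 1.0]
accepted = attempt_swap(states, energies, betas, 0, 1, random.Random(0))
```

In this configuration the log ratio is positive, so the swap is always accepted and the cold chain receives the lower-energy state. An equi-energy variant would additionally restrict which pairs (i, j) may be proposed, based on their current energies.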
393	Pénalisation et réduction de la dimension des variables auxiliaires en théorie des sondages / Penalization and data reduction of auxiliary variables in survey sampling Shehzad, Muhammad Ahmed 12 October 2012 (has links) Les enquêtes par sondage sont utiles pour estimer des caractéristiques d'une population telles que le total ou la moyenne. Cette thèse s'intéresse à l'étude de techniques permettant de prendre en compte un grand nombre de variables auxiliaires pour l'estimation d'un total. Le premier chapitre rappelle quelques définitions et propriétés utiles pour la suite du manuscrit : l'estimateur de Horvitz-Thompson, qui est présenté comme un estimateur n'utilisant pas l'information auxiliaire ainsi que les techniques de calage qui permettent de modifier les poids de sondage de façon à prendre en compte l'information auxiliaire en restituant exactement dans l'échantillon leurs totaux sur la population. Le deuxième chapitre, qui est une partie d'un article de synthèse accepté pour publication, présente les méthodes de régression ridge comme un remède possible au problème de colinéarité des variables auxiliaires, et donc de mauvais conditionnement. Nous étudions les points de vue "model-based" et "model-assisted" de la ridge regression. Cette technique qui fournit de meilleurs résultats en terme d'erreur quadratique en comparaison avec les moindres carrés ordinaires peut également s'interpréter comme un calage pénalisé. Des simulations permettent d'illustrer l'intérêt de cette technique par comparaison avec l'estimateur de Horvitz-Thompson. Le chapitre trois présente une autre manière de traiter les problèmes de colinéarité via une réduction de la dimension basée sur les composantes principales. Nous étudions la régression sur composantes principales dans le contexte des sondages. Nous explorons également le calage sur les moments d'ordre deux des composantes principales ainsi que le calage partiel et le calage sur les composantes principales estimées.
Une illustration sur des données de l'entreprise Médiamétrie permet de confirmer l'intérêt de ces techniques basées sur la réduction de la dimension pour l'estimation d'un total en présence d'un grand nombre de variables auxiliaires. / Survey sampling techniques are useful for estimating population parameters such as the population total when a high-dimensional auxiliary data set is available. This thesis deals with the estimation of a population total in the presence of a large, ill-conditioned data set. In the first chapter, we give some basic definitions that are used in the later chapters. The Horvitz-Thompson estimator is defined as an estimator which does not use auxiliary variables. The calibration technique is then defined; it incorporates the auxiliary variables to improve the estimation of population totals for a fixed sample size. The second chapter is part of a review article about ridge regression estimation as a remedy for multicollinearity. We give a detailed review of the model-based, design-based and model-assisted scenarios for ridge estimation. These estimates give improved results in terms of MSE compared to the least squares estimates. Penalized calibration is also defined under survey sampling as an estimation technique equivalent to ridge regression in the classical statistics case. Simulation results confirm the improved estimation compared to the Horvitz-Thompson estimator. Another solution to the problem of ill-conditioned large auxiliary data is given in terms of principal component analysis in chapter three. Principal component regression is defined and its use in survey sampling is explored. Some new types of principal component calibration techniques are proposed to estimate a population total, such as calibration on the second moment of principal component variables, partial principal component calibration and estimated principal component calibration.
Application of these techniques to real data supports the use of these data reduction techniques for the improved estimation of population totals. Sondage Colinéarité Régression ridge Calage pénalisé Estimateur assisté par un modèle Estimateur basé sur un modèle Estimateur de Horvitz-Thompson Calage sur composantes principales Survey sampling Multicollinearity Ridge regression Penalized calibration Model-based estimator Model-assisted estimator Horvitz-Thompson estimator Principal component calibration 519
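The Horvitz-Thompson estimator that entry 393 uses as its baseline weights each sampled value by the inverse of its inclusion probability and uses no auxiliary information. A minimal Python sketch of that textbook definition follows; the function name and the three-unit sample are illustrative assumptions, not data from the thesis.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_k / pi_k over the sample."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if not 0.0 < pi <= 1.0:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        # A unit sampled with probability pi stands in for 1 / pi units.
        total += y / pi
    return total


# Hypothetical sample of three units with unequal inclusion probabilities.
sample_values = [10.0, 20.0, 30.0]
sample_probs = [0.5, 0.25, 0.5]
estimated_total = horvitz_thompson_total(sample_values, sample_probs)
```

Calibration and ridge-type (penalized) calibration, as studied in the thesis, would then adjust the weights 1 / pi_k so that weighted sample totals of the auxiliary variables match, exactly or approximately, their known population totals.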
394	At the center of American modernism: Lola Ridge's politics, poetics, and publishing Wheeler, Belinda 23 September 2008 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / Although many of Lola Ridge's poems champion the causes of minorities and the disenfranchised, it is too easy to state that politics were the sole reason for her neglect. A simple look at well-known female poets who often wrote about social or political issues during Ridge's lifetime, such as Edna St. Vincent Millay and Muriel Rukeyser, weakens such a claim. Furthermore, Ridge's five books of poetry illustrate that many of her poems focused on themes beyond the political or social. The decisions by critics to focus on selections of Ridge's poems that do not display her ability to employ multiple aesthetics in her poetry have caused them to present her work one-dimensionally. Likewise, politically motivated critics often overlook aesthetic experiments that poets like Ridge employ in their poetry. Few poets during Ridge's time made use of such drastically varied styles, and because her work resists easy categorization (as either traditional or avant-garde), her poetry has largely gone unnoticed by modern scholars. Chapter two of my thesis focuses on a selection of Ridge's social and political poems and highlights how Ridge's social poetry coupled with the multiple aesthetics she employed has played a part in her critical neglect. My findings will open up the discussion of Ridge's poetry and situate her work both politically and aesthetically, something no critic has yet attempted. Chapter three examines Ridge’s role as editor of Modern School, Others and Broom. Ridge's work for these magazines, particularly Others and Broom, places her at the center of American modernism.
My examination of Ridge's social poetry and her role as editor for two leading literary magazines, in conjunction with her use of multiple aesthetics, will build a strong case for why her work deserves to be recovered. Leftist Mentor American editor Hart Crane Jean Toomer William Carlos Williams Kay Boyle Alfred Kreymborg Harold Loeb Ballad Lola Ridge Poetry Modernism Broom Others Little Magazines Working-class Multiple aesthetics Avant-garde Women poets, American -- 20th century
395	Auto-Tuning Apache Spark Parameters for Processing Large Datasets / Auto-Optimering av Apache Spark-parametrar för bearbetning av stora datamängder Zhou, Shidi January 2023 (has links) Apache Spark is a popular open-source distributed processing framework that enables efficient processing of large amounts of data. Apache Spark has a large number of configuration parameters that are strongly related to performance. Selecting an optimal configuration for an Apache Spark application deployed in a cloud environment is a complex task. Making a poor choice may not only result in poor performance but also increase costs. Manually adjusting the Apache Spark configuration parameters can take a lot of time and may not lead to the best outcomes, particularly in a cloud environment where computing resources are allocated dynamically, and workloads can fluctuate significantly. The focus of this thesis project is the development of an auto-tuning approach for Apache Spark configuration parameters. Four machine learning models are formulated and evaluated to predict Apache Spark’s performance. Additionally, two models for Apache Spark configuration parameter search are created and evaluated to identify the most suitable parameters, resulting in the shortest execution time. The obtained results demonstrate that, with the developed auto-tuning approach and adjusted Apache Spark configuration parameters, Apache Spark applications can achieve a shorter execution time than when using the default parameters. The developed auto-tuning approach gives improved cluster utilization and shorter job execution time, with an average performance improvement of 49.98%, 53.84%, and 64.16% for the three different types of Apache Spark applications benchmarked. / Apache Spark är en populär öppen källkodslösning för distribuerad databehandling som möjliggör effektiv bearbetning av stora mängder data. Apache Spark har ett stort antal konfigurationsparametrar som starkt påverkar prestandan.
Att välja en optimal konfiguration för en Apache Spark-applikation som distribueras i en molnmiljö är en komplex uppgift. Ett dåligt val kan inte bara leda till dålig prestanda utan också ökade kostnader. Manuell anpassning av Apache Spark-konfigurationsparametrar kan ta mycket tid och leda till suboptimala resultat, särskilt i en molnmiljö där beräkningsresurser tilldelas dynamiskt och arbetsbelastningen kan variera avsevärt. Fokus för detta examensprojekt är att utveckla en automatisk optimeringsmetod för konfigurationsparametrarna i Apache Spark. Fyra maskininlärningsmodeller formuleras och utvärderas för att förutsäga Apache Sparks prestanda. Dessutom skapas och utvärderas två modeller för att söka efter de mest lämpliga konfigurationsparametrarna för Apache Spark, vilket resulterar i kortast möjliga exekveringstid. De erhållna resultaten visar att den utvecklade automatiska optimeringsmetoden, med anpassning av Apache Sparks konfigurationsparametrar, bidrar till att Apache Spark-applikationer kan uppnå kortare exekveringstider än vid användning av standard-parametrar. Den utvecklade metoden för automatisk optimering bidrar till en förbättrad användning av klustret och kortare exekveringstider, med en genomsnittlig prestandaförbättring på 49,98%, 53,84% och 64,16% för de tre olika typerna av Apache Spark-applikationer som testades. Apache Spark Cloud Environment Spark Configuration Parameter Resource Utilization Ridge Regression Elastic Net Random Forest Deep Neural Network Bayesian Optimization Particle Swarm Optimization. Apache Spark Molnmiljö Apache Spark konfigurationsparameter Resursutnyttjande Ridge-regression Elastisk nät Slumpskog Djupt neuralt nätverk Bayesiansk optimering Partikelsvärmsoptimering. Computer and Information Sciences Data- och informationsvetenskap
396	[en] FORECASTING AMERICAN INDUSTRIAL PRODUCTION WITH HIGH DIMENSIONAL ENVIRONMENTS FROM FINANCIAL MARKETS, SENTIMENTS, EXPECTATIONS, AND ECONOMIC VARIABLES / [pt] PREVENDO A PRODUÇÃO INDUSTRIAL AMERICANA EM AMBIENTES DE ALTA DIMENSIONALIDADE, ATRAVÉS DE MERCADOS FINANCEIROS, SENTIMENTOS, EXPECTATIVAS E VARIÁVEIS ECONÔMICAS EDUARDO OLIVEIRA MARINHO 20 February 2020 (has links) [pt] O presente trabalho traz 6 diferentes técnicas de previsão para a variação mensal do Índice da Produção Industrial americana em 3 ambientes diferentes totalizando 18 modelos. No primeiro ambiente foram usados como variáveis explicativas a própria defasagem da variação mensal do Índice da produção industrial e outras 55 variáveis de mercado e de expectativa tais quais retornos setoriais, prêmio de risco de mercado, volatilidade implícita, prêmio de taxa de juros (corporate e longo prazo), sentimento do consumidor e índice de incerteza. No segundo ambiente foi usado à data base do FRED com 130 variáveis econômicas como variáveis explicativas. No terceiro ambiente foram usadas as variáveis mais relevantes do ambiente 1 e do ambiente 2. Observa-se no trabalho uma melhora em prever o IP contra um modelo AR e algumas interpretações a respeito do comportamento da economia americana nos últimos 45 anos (importância de setores econômicos, períodos de incerteza, mudanças na resposta a prêmio de risco, volatilidade e taxa de juros). / [en] This thesis presents 6 different forecasting techniques for the monthly variation of the American Industrial Production Index in 3 different environments, totaling 18 models. In the first environment, the explanatory variables were the lags of the monthly variation of the industrial production index and 55 other market and expectation variables, such as sector returns, market risk premium, implied volatility, interest rate risk premiums (corporate premium and long-term premium), consumer sentiment and uncertainty index.
In the second environment, the FRED database with 130 economic variables was used as the set of explanatory variables. In the third environment, the most relevant variables of environment 1 and environment 2 were used. The thesis finds an improvement in predicting IP relative to an AR model and offers some interpretations regarding the behavior of the American economy in the last 45 years (importance of sectors, uncertainty periods, and changes in response to risk premium, volatility and interest rate). [pt] VOLATILIDADE [pt] PREMIO A TERMO [pt] RIDGE [pt] ALTA DIMENSIONALIDADE [pt] RANDOM FOREST [pt] BAGGING [pt] LASSO [pt] PRODUCAO INDUSTRIAL [pt] EXPECTATIVAS [pt] SENTIMENTO [pt] PREMIO DE RISCO [pt] INCERTEZA [pt] TAXA DE JUROS [en] VOLATILITY MODELS [en] TERM PREMIUM [en] RIDGE [en] HIGH DIMENSION [en] RANDOM FOREST [en] BAGGING [en] LASSO [en] INDUSTRIAL PRODUCTION [en] EXPECTATIONS [en] FEELING [en] EQUITY RISK PREMIUM [en] UNCERTAINTY [en] INTEREST RATES
397	Predicting PV self-consumption in villas with machine learning GALLI, FABIAN January 2021 (has links) In Sweden, there is a strong and growing interest in solar power. In recent years, photovoltaic (PV) system installations have increased dramatically, and a large share are distributed grid-connected PV systems, i.e. rooftop installations. Currently the electricity export rate is significantly lower than the import rate, which has made the amount of self-consumed PV electricity a critical factor when assessing system profitability. Self-consumption (SC) is calculated using hourly or sub-hourly timesteps and is highly dependent on the solar patterns of the location of interest, the PV system configuration and the building load. As these vary between potential installations, it is difficult to make estimations without historical data on both load and local irradiance, which are often hard to acquire or not available. A method to predict SC using information commonly available at the planning phase is therefore preferred. There is a scarcity of documented SC data, and only a few reports treat the subject of mapping or predicting SC. Therefore, this thesis investigates the possibility of using machine learning to create models able to predict SC from the following inputs: annual load, annual PV production, tilt angle and azimuth angle of the modules, and latitude. With the programming language Python, seven models are created using regression techniques, using real load data and simulated PV data from the south of Sweden, and evaluated using the coefficient of determination (R2) and the mean absolute error (MAE). The techniques are Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, K-Nearest Neighbors (kNN), Random Forest and Multi-Layer Perceptron (MLP); the only other SC prediction model found in the literature is also included.
A parametric analysis of the models is conducted, removing one variable at a time to assess the model’s dependence on each variable. The results are promising, with five out of eight models achieving an R2 value above 0.9, which can be considered good for predicting SC. The best performing model, Random Forest, has an R2 of 0.985 and a MAE of 0.0148. The parametric analysis also shows that while more input data is helpful, using only annual load and PV production is sufficient to make good predictions. This conclusion holds only for the southern region of Sweden, however, and is not applicable to areas outside the latitudes or country tested. / I Sverige finns ett starkt och växande intresse för solenergi. De senaste åren har antalet solcellsanläggningar ökat dramatiskt och en stor del är distribuerade nätanslutna solcellssystem, dvs takinstallationer. För närvarande är elexportpriset betydligt lägre än importpriset, vilket har gjort mängden egenanvänd solel till en kritisk faktor vid bedömningen av systemets lönsamhet. Egenanvändning (EA) beräknas med tidssteg upp till en timmes längd och är i hög grad beroende av solstrålningsmönstret för platsen av intresse, PV-systemkonfigurationen och byggnadens energibehov. Eftersom detta varierar för alla potentiella installationer är det svårt att göra uppskattningar utan att ha historiska data om både energibehov och lokal solstrålning, vilket ofta inte är tillgängligt. En metod för att förutsäga EA med allmän tillgänglig information är därför att föredra. Det finns en brist på dokumenterad EA-data och endast ett fåtal rapporter som behandlar kartläggning och prediktion av EA. I denna uppsats undersöks möjligheten att använda maskininlärning för att skapa modeller som kan förutsäga EA. De variabler som ingår är årlig energiförbrukning, årlig solcellsproduktion, lutningsvinkel och azimutvinkel för modulerna och latitud.
Med programmeringsspråket Python skapas sju modeller med hjälp av olika regressionstekniker, där energiförbruknings- och simulerad solelproduktionsdata från södra Sverige används. Modellerna utvärderas med hjälp av determinationskoefficienten (R2) och mean absolute error (MAE). Teknikerna som används är linjär regression, polynomregression, Ridge regression, Lasso regression, K-nearest neighbor regression, Random Forest regression, Multi-Layer Perceptron regression. En additionell linjär regressions-modell skapas även med samma metodik som används i en tidigare publicerad rapport. En parametrisk analys av modellerna genomförs, där en variabel exkluderas åt gången för att bedöma modellens beroende av varje enskild variabel. Resultaten är mycket lovande, där fem av de åtta undersökta modeller uppnår ett R2-värde över 0,9. Den bästa modellen, Random Forest, har ett R2 på 0,985 och ett MAE på 0,0148. Den parametriska analysen visar också att även om ingångsdata är till hjälp, är det tillräckligt att använda årlig energiförbrukning och årlig solcellsproduktion för att göra bra förutsägelser. Det måste dock påpekas att modellprestandan endast är tillförlitlig för södra Sverige, från var beräkningsdata är hämtad, och inte tillämplig för områden utanför de valda latituderna eller land. Self-Consumption photovoltaics machine learning solar energy Random Forest K-nearest neighbor multi-layer perceptron lasso regression ridge regression linear regression prediction Egenanvändning photovoltaics maskininlärning solenergi Random Forest K-nearest neighbors multi-layer perceptron lasso regression ridge regression linjär regression data-förutsägelse Engineering and Technology Teknik och teknologier
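The two evaluation metrics named in entry 397, the coefficient of determination (R2) and the mean absolute error (MAE), can be written out directly from their standard definitions. The Python sketch below is illustrative only: the function names and the three example self-consumption fractions are assumptions and are not results from the thesis.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted self-consumption fractions.
observed = [0.2, 0.4, 0.6]
predicted = [0.25, 0.4, 0.55]
r2 = r2_score(observed, predicted)
mae = mean_absolute_error(observed, predicted)
```

An R2 close to 1 means the model explains most of the variance in the observed values, while the MAE is expressed in the same units as the target, which is why both are typically reported together.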
398	PV self-consumption: Regression models and data visualization Tóth, Martos January 2022 (has links) In Sweden, the installed capacity of residential PV systems is increasing every year. Because there is no feed-in tariff scheme, the techno-economic optimization of PV systems is based mainly on self-consumption. The calculation of this parameter involves hourly building loads and hourly PV generation. This data cannot be obtained easily from households. A predictive model based on already available data is therefore preferred. Existing machine learning models can be suitable and have been tested, but the amount of literature on this topic is fairly small. The machine learning models use a dataset which includes real measurement data of building loads, simulated PV generation data, and the self-consumption data calculated from these two inputs. The simulation of PV generation can be based on a Typical Meteorological Year (TMY) weather file or on measured weather data. The TMY file can be generated more quickly and easily, but it is only spatially matched to the building load, while the measured data is matched both temporally and spatially. This thesis investigates whether using the TMY file has any major impact on the performance of the regression models by comparing it to the model based on the measured weather file. In this model the buildings are single-family houses from the south of Sweden. Different building types can have different load profiles, which can affect the performance of the model. Because of the different load profiles, using the TMY file may have a more significant impact. This thesis also compares the impact of the TMY file usage in the case of multifamily houses and compares the two building types by the performance of the machine learning models. PV and battery prices are decreasing from year to year. The subsidies in Sweden offer a significant tax credit on battery investments with PV systems.
This can make the batteries profitable. Lastly, this thesis evaluates the performance of the machine learning models after adding the battery to the system for both TMY and measured data. Also, the optimal system is predicted based on the self-consumption, PV generation and battery size. The models have high accuracy; the random forest model is above 0.9 R2 for all cases. The results confirm that using the TMY file only leads to marginal errors, and it can be used for the training of the models. The battery model has promising results with above 0.9 R2 for four models: random forest, k-NN, MLP and polynomial. The prediction of the optimal system model has promising results as well for the polynomial model with 18% error in predicted payback time compared to the reference. / I Sverige ökar den installerade kapaciteten för solcellsanläggningarna för bostäder varje år. Bristen på inmatningssystem gör att den tekniska ekonomiska optimeringen av solcellssystemen huvudsakligen bygger på egen konsumtion. Beräkningen av denna parameter omfattar byggnadsbelastningar per timme och PV-generering per timme. Dessa uppgifter kan inte lätt erhållas från hushållen. En prediktiv modell baserad på redan tillgängliga data skulle vara att föredra och behövas i detta fall. De redan tillgängliga maskininlärningsmodellerna kan vara lämpliga och redan testade men mängden litteratur i detta ämne är ganska låg. Maskininlärningsmodellerna använder en datauppsättning som inkluderar verkliga mätdata från byggnader och simulerad PV-genereringsdata och den beräknade egenförbrukningsdata baserad på dessa två indata. Simuleringen av PV-generering kan baseras på väderfilen Typical Meteorological Year (TMY) eller på uppmätta väderdata. TMY-filen kan genereras snabbare och enklare, men den anpassas endast rumsligt till byggnadsbelastningen, medan uppmätta data är temporärt och rumsligt.
Denna avhandling undersöker om användningen av TMY-fil leder till någon större påverkan på prestandan genom att jämföra den med den uppmätta väderfilsmodellen. I denna modell är byggnaderna småhus från södra Sverige. De olika byggnadstyperna kan ha olika belastningsprofiler vilket kan påverka modellens prestanda. På grund av dessa olika belastningsprofiler kan effekten av att använda TMY-fil ha mer betydande inverkan. Den här avhandlingen jämför också effekten av TMY-filanvändningen i fallet med flerfamiljshus och jämför också de två byggnadstyperna efter prestanda för maskininlärningsmodellerna. PV- och batteripriserna minskar från år till år. Subventionerna i Sverige ger en betydande skattelättnad på batteriinvesteringar med solcellssystem. Detta kan göra batterierna lönsamma. Slutligen utvärderar denna avhandling prestandan för maskininlärningsmodellerna efter att ha lagt till batteriet i systemet för både TMY och uppmätta data. Det optimala systemet förutsägs också baserat på egen förbrukning, årlig byggnadsbelastning, årlig PV-generering och batteristorlek. Modellerna har hög noggrannhet, den slumpmässiga skogsmodellen är över 0,9 R2 för alla fall. Resultaten bekräftar att användningen av TMY-filen endast leder till marginella fel, och den kan användas för träning av modellerna. Batterimodellen har lovande resultat med över 0,9 R2 för fyra modeller: random skog, k-NN, MLP och polynom. Förutsägelsen av den optimala systemmodellen har också lovande resultat för polynommodellen med 18 % fel i förutspådd återbetalningstid jämfört med referensen. Self-Consumption photovoltaics battery machine learning solar energy random forest k-nearest neighbor multi-layer perceptron lasso regression ridge regression linear regression Egenanvändning photovoltaics batteri maskininlärning solenergi random forest k-nearest neighbors multi-layer perceptron lasso regression ridge regression linjär regression Engineering and Technology Teknik och teknologier
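Entry 398 computes self-consumption from hourly building load and hourly PV generation, and then adds a battery. The Python sketch below shows one way such a calculation could look. It assumes self-consumption is defined as the share of PV generation used on site, and it uses a simple greedy battery rule (store any surplus, discharge whenever there is a deficit) with no losses or power limits. Neither the definition nor the dispatch rule is confirmed by the abstract, and all names and values are hypothetical.

```python
def self_consumption_with_battery(load_kwh, pv_kwh, capacity_kwh=0.0):
    """Share of PV energy used on site, with an optional ideal battery.

    Assumptions: equal-length per-timestep energy series, lossless battery,
    greedy dispatch. With capacity_kwh = 0 this is plain self-consumption.
    """
    if len(load_kwh) != len(pv_kwh):
        raise ValueError("load and PV series must have equal length")
    stored = 0.0
    used_on_site = 0.0
    for load, pv in zip(load_kwh, pv_kwh):
        direct = min(load, pv)          # PV consumed in the same timestep
        surplus = pv - direct           # PV left over after the load is met
        deficit = load - direct         # load not covered by PV
        charge = min(surplus, capacity_kwh - stored)
        stored += charge
        discharge = min(deficit, stored)
        stored -= discharge
        used_on_site += direct + discharge
    total_pv = sum(pv_kwh)
    return used_on_site / total_pv if total_pv > 0 else 0.0


# Hypothetical four-timestep example (kWh per timestep).
load = [1.0, 1.0, 1.0, 1.0]
pv = [0.0, 2.0, 3.0, 0.0]
sc_without_battery = self_consumption_with_battery(load, pv)
sc_with_battery = self_consumption_with_battery(load, pv, capacity_kwh=2.0)
```

In this toy series the battery raises self-consumption because midday surplus is shifted to the final timestep. Energy still stored at the end of the series is not counted as self-consumed, which is a design choice of this sketch.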
399	兩種正則化方法用於假設檢定與判別分析時之比較 / A comparison between two regularization methods for discriminant analysis and hypothesis testing 李登曜, Li, Deng-Yao Unknown Date (has links) 在統計學上，高維度常造成許多分析上的問題，如進行多變量迴歸的假設檢定時，當樣本個數小於樣本維度時，其樣本共變異數矩陣之反矩陣不存在，使得檢定無法進行，本文研究動機即為在進行兩群多維常態母體的平均數檢定時，所遇到的高維度問題，並引發在分類上的研究，試圖尋找解決方法。本文研究目的為在兩種不同的正則化方法中，比較何者在檢定與分類上表現較佳。本文研究方法為以 Warton 與 Friedman 的正則化方法來分別進行檢定與分類上的分析，根據其檢定力與分類錯誤的表現來判斷何者較佳。由分析結果可知，兩種正則化方法並沒有絕對的優劣，須視母體各項假設而定。 / High dimensionality causes many problems in statistical analysis. For instance, consider the testing of hypotheses about multivariate regression models. If the dimension of the multivariate response is larger than the number of observations, then the sample covariance matrix is not invertible. Since the inverse of the sample covariance matrix is often needed when computing the usual likelihood ratio test statistic (under normality), the matrix singularity makes it difficult to implement the test. The singularity of the sample covariance matrix is also a problem in classification when linear discriminant analysis (LDA) or quadratic discriminant analysis (QDA) is used. Different regularization methods have been proposed to deal with the singularity of the sample covariance matrix for different purposes. Warton (2008) proposed a regularization procedure for testing, and Friedman (1989) proposed a regularization procedure for classification. Is it true that Warton's regularization works better for testing and Friedman's regularization works better for classification? To answer this question, some simulation studies are conducted and the results are presented in this thesis. It is found that neither regularization method is superior to the other. 脊迴歸 正則化 交叉驗證 排列檢定 概似比檢定 判別分析 Ridge regression Regularization Cross-validation Permutation test Likelihood ratio test Discriminant analysis
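The singularity problem described in entry 399 is easy to reproduce: with fewer observations than dimensions, the sample covariance matrix cannot be inverted, while a ridge-type shrinkage toward the identity restores invertibility. The Python sketch below illustrates that general idea only. It is not Warton's (2008) or Friedman's (1989) exact estimator, and the function names, the shrinkage form gamma * S + (1 - gamma) * I and the data are assumptions chosen for illustration.

```python
import math


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]


def shrink_toward_identity(cov, gamma):
    """Ridge-type regularization: gamma * cov + (1 - gamma) * identity."""
    p = len(cov)
    return [[gamma * cov[a][b] + ((1.0 - gamma) if a == b else 0.0)
             for b in range(p)] for a in range(p)]


def is_positive_definite(matrix, tol=1e-12):
    """Attempt a Cholesky factorization; it succeeds only for PD matrices."""
    p = len(matrix)
    lower = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a + 1):
            s = sum(lower[a][k] * lower[b][k] for k in range(b))
            if a == b:
                pivot = matrix[a][a] - s
                if pivot <= tol:
                    return False  # zero or negative pivot: not invertible
                lower[a][b] = math.sqrt(pivot)
            else:
                lower[a][b] = (matrix[a][b] - s) / lower[b][b]
    return True


# Two observations in three dimensions: the sample covariance has rank 1.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0]]
raw = sample_covariance(data)
regularized = shrink_toward_identity(raw, gamma=0.9)
raw_ok = is_positive_definite(raw)
regularized_ok = is_positive_definite(regularized)
```

Here `raw_ok` is False and `regularized_ok` is True. In practice the shrinkage weight would be chosen by a data-driven criterion; the keywords of the entry list cross-validation.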
400	Codes of Interaction Martin, Timothy Michael 01 January 2005 (has links) The ideas within this thesis are meant to clarify my explorations, research and painting practice during my studies at Virginia Commonwealth University. I expand on my general statements about being fascinated by advancing technologies and concerned about the aftereffects of these advancements. The writing explores my curiosity about the internal, skeletal structure of things and how they operate. I explain how the paintings are idiosyncratic hybrids that evoke animation, imaginary scientific propositions, blueprints, maps, and advancing technologies. The work combines these interests with my observations of day-to-day experiences. Isolated events provide found compositions which I then manipulate: a seemingly mundane bike ride gets mapped into a well-ordered schematic of social interaction. painting diagrams schematics sculpture installation interaction codes maps mixed media glitter peter halley panofsky plumbing construction industry tennessee norris tn oak ridge jay searcy foucault narrative minimal geometric abstraction hue color joseph albers Art and Design Arts and Humanities
