Global ETD Search

91	Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods Galindo-Prieto, Beatriz January 2017 Multivariate and multiblock data analysis involves useful methodologies for analyzing large data sets in chemistry, biology, psychology, economics, sensory science, and industrial processes; among these methodologies, partial least squares (PLS) and orthogonal projections to latent structures (OPLS®) have become popular. Due to increasingly computerized instrumentation, a data set can consist of thousands of input variables which contain latent information valuable for research and industrial purposes. When analyzing a large number of data sets (blocks) simultaneously, the number of variables and the underlying connections between them grow considerably; at this point, reducing the number of variables while keeping high interpretability becomes a much-needed strategy. The main direction of research in this thesis is the development of a variable selection method, based on variable influence on projection (VIP), in order to improve the model interpretability of OnPLS models in multiblock data analysis. This new method is called multiblock variable influence on orthogonal projections (MB-VIOP), and its novelty lies in the fact that it is the first multiblock variable selection method for OnPLS models. Several milestones needed to be reached in order to successfully create MB-VIOP. The first milestone was the development of a single-block variable selection method able to handle orthogonal latent variables in OPLS models, i.e. VIP for OPLS (denoted as VIPOPLS or OPLS-VIP in Paper I), which proved to increase the interpretability of PLS and OPLS models and was afterwards successfully extended to multivariate time series analysis (MTSA) aimed at process control (Paper II). 
The second milestone was to develop the first multiblock VIP approach for enhancement of O2PLS® models, i.e. VIPO2PLS for two-block multivariate data analysis (Paper III). Finally, the third milestone and main goal of this thesis was the development of the MB-VIOP algorithm for the improvement of OnPLS model interpretability when analyzing a large number of data sets simultaneously (Paper IV). The results of this thesis and its enclosed papers showed that the VIPOPLS, VIPO2PLS, and MB-VIOP methods successfully assess the most relevant variables for model interpretation in PLS, OPLS, O2PLS, and OnPLS models. In addition, predictability, robustness, dimensionality reduction, and other variable selection purposes can potentially be improved or achieved by using these methods. Variable influence on projection VIP MB-VIOP OPLS O2PLS OnPLS variable selection
92	Variable Selection in High-Dimensional Data Reichhuber, Sarah, Hallberg, Johan January 2021 Estimating the variables of importance in inferential modelling is of significant interest in many fields of science, engineering, biology, medicine, finance and marketing. However, variable selection in high-dimensional data, where the number of variables is relatively large compared to the observed data points, is a major challenge and requires more research in order to enhance reliability and accuracy. In this bachelor thesis project, several known methods of variable selection, namely orthogonal matching pursuit (OMP), ridge regression, lasso, adaptive lasso, elastic net, adaptive elastic net and multivariate adaptive regression splines (MARS), were implemented on a high-dimensional data set. The aim of this bachelor thesis project was to analyze and compare these variable selection methods. Furthermore, their performance on the same data set but extended, with the number of variables and observations being of similar size, was analyzed and compared as well. This was done by generating models for the different variable selection methods using built-in packages in R and coding in MATLAB. The models were then used to predict the observations, and these estimations were compared to the real observations. The performances of the different variable selection methods were analyzed utilizing different evaluation methods. It could be concluded that some of the variable selection methods provided more accurate models for the implemented high-dimensional data set than others. Elastic net, for example, was one of the methods that performed better. Additionally, the combination of final models could provide further insight into which variables are crucial for the observations in the given data set, where, for example, variables 112 and 23 appeared to be of importance. 
/ Estimating which variables are important in inferential modelling is of great interest in many research fields, industries, biology, medicine, economics and marketing. Variable selection in high-dimensional data, where the number of variables is relatively large compared to the number of observed data points, is however a major challenge and requires more research to increase the credibility and accuracy of the results. In this project, several known variable selection methods, namely orthogonal matching pursuit (OMP), ridge regression, lasso, elastic net, adaptive lasso, adaptive elastic net and multivariate adaptive regression splines (MARS), were implemented on a high-dimensional data set. The aim of this bachelor thesis project was to analyze and compare the results of these methods. Furthermore, the methods' results on the same data set, but extended so that the numbers of variables and observations were roughly equal, were analyzed and compared. This was done by generating models for the different variable selection methods via built-in packages in R and programming in MATLAB. These models were then used to predict observations, and the estimates were then compared with the real observations. The results of the different variable selection methods were then analyzed using several evaluation methods. It could be established that some of the implemented variable selection methods gave more relevant models for the data than others. For example, elastic net was one of the methods that performed better. In addition, it was concluded that combining the results of the final models could give deeper insight into which variables are important for the observations, where, for example, variables 112 and 23 appeared to be of importance. / Bachelor's thesis project in electrical engineering 2021, KTH, Stockholm variable selection variable selection methods linear regression high-dimensional data variable importance Electrical engineering and electronics
93	Constructing identity: phonetic variation of the variable (ing) by Swedish L2 speakers of English Holm, Idamaria January 2015 This study investigates the use of the (ing) variable in the speech of Swedish L2 speakers of English. Developments in recent years have led to a shift in the language environment in Sweden, and the position of English has arguably evolved from a foreign language to a second language. The aim of the study is to investigate to what extent and in what ways Swedish L2 speakers’ use of the variable (ing) is affected by extra-linguistic conditioning relating to age, gender and style, in ways similar to those uncovered in various studies on native speakers of English and L2 immersion learners. Furthermore, the construction of identity is examined based on the application of the variable. Sociolinguistic interviews with twelve participants of different ages and genders were conducted to elicit the phonetic variable in different speech styles. Significantly, the study shows that the standard variant [ɪŋ] is favored by the Swedish L1 speakers, but that the choice of variant is also affected by all of the extra-linguistic variables to varying extents. The results show a tendency for the nonstandard [ɪn] to be applied more the younger the participants are, if they are male, and in less monitored speech styles. Moreover, the participants appear to be constructing their identity through the use of the variable, positioning themselves with English native peers. Sociolinguistics English in Sweden (ing) variable EFL
94	Model selection and estimation in high dimensional settings Ngueyep Tzoumpe, Rodrigue 08 June 2015 Several statistical problems can be described as estimation problems, where the goal is to learn a set of parameters from some data by maximizing a criterion. These types of problems are typically encountered in a supervised learning setting, where we want to relate an output (or many outputs) to multiple inputs. The relationship between these outputs and these inputs can be complex, and this complexity can be attributed to the high dimensionality of the space containing the inputs and the outputs; the existence of structural prior knowledge within the inputs or the outputs that, if ignored, may lead to inefficient estimates of the parameters; and the presence of a non-trivial noise structure in the data. In this thesis we propose new statistical methods to achieve model selection and estimation when there are more predictors than observations. We also design a new set of algorithms to efficiently solve the proposed statistical models. We apply the implemented methods to genetic data sets of cancer patients and to some economics data. Variable selection High dimensional statistics Regularization
95	Study of interactions of terminal units of a variable air volume air conditioning system 洪淵深, Hung, Yuen-sum. January 1997 published_or_final_version / Mechanical Engineering / Master / Master of Philosophy
96	Identifying historical financial crisis: Bayesian stochastic search variable selection in logistic regression Ho, Chi-San August 2009 This work investigates the factors that contribute to financial crises. We first study the Dow Jones index performance by grouping the daily adjusted closing value into two-month windows and finding several critical quantiles in each window. Then, we identify severe downturns in these quantiles and find that the 5th quantile is the best for identifying financial crises. We then match these quantiles with historical financial crises and give a basic explanation of them. Next, we introduce all exogenous factors that could be related to the crises. Then, we apply a rapid Bayesian variable selection technique, Stochastic Search Variable Selection (SSVS), using a Bayesian logistic regression model. Finally, we analyze the result of SSVS, leading to the conclusion that the dummy variable we created for disastrous hurricanes, crude oil price and gold price (GOLD) should be included in the model. / text Logistic regression
97	INVESTIGATIONS OF LONG-PERIOD DQ HERCULIS STARS. PENNING, WILLIAM ROY. January 1986 The magnetic rotator model has long been the favored explanation for coherent photometric modulations in the DQ Herculis class of cataclysmic variables. However, to date, all evidence supporting this model has been of the indirect variety. Unlike their synchronously rotating cousins, the AM Herculis objects, DQ Herculis stars have not yet been discovered to emit polarized radiation. Therefore, in light of this crucial lack, the evidence used to place these objects in the magnetic cataclysmic variable category has been strictly circumstantial, based primarily on the coherence of the photometric periodicities. In this work, time-resolved spectroscopy of four long-period DQ Herculis stars is performed. In addition, two of the same objects are observed with a new, sensitive circular polarimeter. Chapters II and III describe these observations and the results of each. To summarize, coherent variations in the wavelength of emission lines were found with the spectroscopic observations. A model is put forth, explaining this phenomenon as being due to varying illumination from a bright spot on the primary. This, of course, adds strength to the magnetic rotator model. Secondly, circular polarization was definitely found in one object studied, and possibly in a second. Therefore, for the first time, there is direct evidence of the magnetic nature of these binaries. In Chapter IV, the model of the rotating bright spot illuminating the disk is explored in further detail, including modeling with a minicomputer. Afterward, a problem brought out by the low polarization coupled with large amplitude photometric variations and a cool spectrum is investigated, namely, is it possible to produce large amounts of cyclotron radiation without producing large amounts of circular polarization? The results tend to show that, for a large emitting area, the answer is yes. 
Chapter V is a summary of the rest of the work. Cataclysmic variable stars. Stars -- Observations. Astronomical spectroscopy.
98	Modelling the accretion process in intermediate polars Taylor, Peter January 1997 No description available. 523.01
99	Nonlinear systems identification using the Narmax method Mao, Ke Zhi January 1998 No description available. 629.8
100	HARMONIC INVESTIGATION IN LOW AND MEDIUM VOLTAGE NETWORKS USING COMPUTER SIMULATION AND MEASUREMENT DEVICES Egner, Sean Robert William 31 October 2006 Student Number : 9811492X - MSc dissertation - School of Electrical and Information Engineering - Faculty of Engineering and the Built Environment / This dissertation discusses the development of an ATP model of a network to aid measurement techniques in a harmonic evaluation. A theoretical background discussion of various pieces of equipment and their significance to harmonics is included. National Electricity Regulator (NRS 048) standards are discussed with reference to performing a basic investigation and shortcomings. A test study was performed on the Brandspruit Mine in Secunda. ATP models are developed for equipment relevant to the test case; these include AC-AC converters, AC-DC converters, three-phase transformers and cables. Finally, the measured test case is compared to simulation results and conclusions are drawn. harmonics ATP modelling variable speed drives VSO
