81

Bayesian latent class metric conjoint analysis. A case study from the Austrian mineral water market.

Otter, Thomas, Tüchler, Regina, Frühwirth-Schnatter, Sylvia January 2002 (has links) (PDF)
This paper presents a fully Bayesian analysis of the latent class model using a new approach to MCMC estimation in the context of mixture models. The approach starts by estimating unidentified models for various numbers of classes. Exact Bayes factors are computed with the bridge sampling estimator to compare different models and select the number of classes. Estimation of the unidentified model is carried out using the random permutation sampler. From the unidentified model, estimates of model parameters that are not class-specific are derived. Exploration of the MCMC output from the unconstrained model then yields suitable identifiability constraints. Finally, the constrained version of the permutation sampler is used to estimate group-specific parameters. Conjoint data from the Austrian mineral water market serve to illustrate the method. (author's abstract) / Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
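As an illustration (not the authors' code), the core idea of the random permutation sampler is simple: after each MCMC sweep, the class labels are relabelled by a uniformly drawn permutation, so the chain visits all K! labelling subspaces and quantities that are invariant to relabelling (such as parameters that are not class-specific) can be estimated from the unidentified output. A minimal sketch with hypothetical variable names:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_permutation_step(z, theta):
    """Randomly relabel classes after one Gibbs sweep.

    z     : (n,) array of class assignments in {0, ..., K-1}
    theta : (K, d) array of class-specific parameters
    """
    K = theta.shape[0]
    perm = rng.permutation(K)            # uniform random relabelling
    z_new = perm[z]                      # relabel the assignments
    theta_new = theta[np.argsort(perm)]  # reorder parameters consistently
    return z_new, theta_new
```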
82

Machine Learning Techniques for Large-Scale System Modeling

Lv, Jiaqing 31 August 2011 (has links)
This thesis addresses several issues in system modeling. The first is a parsimonious representation of the MISO Hammerstein system, obtained by projecting the multivariate linear function into a univariate input function space. This leads to the so-called semiparametric Hammerstein model, which overcomes the well-known "curse of dimensionality" in nonparametric estimation for MISO systems. The second issue discussed in this thesis is orthogonal expansion analysis of a univariate Hammerstein model and hypothesis testing for the structure of the nonlinear subsystem. Generalizations of this technique can be used to test the validity of parametric assumptions about the nonlinear function in Hammerstein models. It can also be applied to approximate a general nonlinear function by a certain class of parametric functions in the Hammerstein model. These techniques extend to other block-oriented systems, e.g., Wiener systems, with slight modification. The third issue in this thesis is the application of machine learning and system modeling techniques to transient stability studies in power engineering. Simultaneous variable selection and estimation leads to substantially reduced complexity and yet stronger predictive power than techniques known in the power engineering literature so far.
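For readers unfamiliar with the block structure: a Hammerstein system is a static nonlinearity feeding a linear dynamic subsystem. A minimal univariate simulation (illustrative only; the thesis treats MISO systems and nonparametric estimators, and all names here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_hammerstein(u, poly_coeffs, ar_coeffs, noise_sd=0.1):
    """Simulate a univariate Hammerstein system: a static polynomial
    nonlinearity m(u) followed by a linear AR filter plus noise."""
    v = np.polyval(poly_coeffs, u)  # static nonlinear block m(u)
    n, p = len(u), len(ar_coeffs)
    y = np.zeros(n)
    for t in range(n):
        past = sum(ar_coeffs[i] * y[t - 1 - i] for i in range(min(p, t)))
        y[t] = past + v[t] + noise_sd * rng.standard_normal()
    return y

u = rng.uniform(-1, 1, 500)                                # input signal
y = simulate_hammerstein(u, [0.5, 0.0, 1.0, 0.0], [0.6])   # m(u) = 0.5u^3 + u
```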
83

Bayesian Analysis of Spatial Point Patterns

Leininger, Thomas Jeffrey January 2014 (has links)
We explore the posterior inference available for Bayesian spatial point process models. In the literature, discussion of such models is usually focused on model fitting and rejecting complete spatial randomness, with model diagnostics and posterior inference often left as an afterthought. Posterior predictive point patterns are shown to be useful in performing model diagnostics and model selection, as well as providing a wide array of posterior model summaries. We prescribe Bayesian residuals and methods for cross-validation and model selection for Poisson processes, log-Gaussian Cox processes, Gibbs processes, and cluster processes. These novel approaches are demonstrated using existing datasets and simulation studies. / Dissertation
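A minimal sketch of the idea behind posterior predictive point patterns, using the simplest case of a homogeneous Poisson process with a toy gamma posterior on the intensity (all names and the toy posterior are assumptions, not the author's code):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_poisson_pattern(lam, w=1.0, h=1.0):
    """One realization of a homogeneous Poisson process on [0,w] x [0,h]."""
    n = rng.poisson(lam * w * h)
    return np.column_stack([rng.uniform(0, w, n), rng.uniform(0, h, n)])

def posterior_predictive_pvalue(lam_draws, n_obs):
    """Compare the observed point count with counts from patterns
    simulated under each posterior draw of the intensity."""
    reps = np.array([len(sample_poisson_pattern(l)) for l in lam_draws])
    return np.mean(reps >= n_obs)  # simple posterior predictive p-value

lam_draws = rng.gamma(shape=45.0, scale=1.0, size=2000)  # toy posterior draws
print(posterior_predictive_pvalue(lam_draws, n_obs=45))
```

In practice one would compare richer summaries than the point count (e.g. Ripley's K or nearest-neighbour distances), which is precisely where predictive point patterns earn their keep.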
84

A Fully Bayesian Analysis of Multivariate Latent Class Models with an Application to Metric Conjoint Analysis

Frühwirth-Schnatter, Sylvia, Otter, Thomas, Tüchler, Regina January 2002 (has links) (PDF)
In this paper we pursue a fully Bayesian analysis of the latent class model with an a priori unknown number of classes. Estimation is carried out by means of Markov chain Monte Carlo (MCMC) methods. We deal explicitly with the consequences that the unidentifiability of this type of model has for MCMC estimation. Joint Bayesian estimation of all latent variables, model parameters, and parameters determining the probability law of the latent process is carried out by a new MCMC method called permutation sampling. In a first run we use the random permutation sampler to sample from the unconstrained posterior. We demonstrate that much important information, such as estimates of the subject-specific regression coefficients, is available from such an unidentified model. The MCMC output of the random permutation sampler is explored in order to find suitable identifiability constraints. In a second run we use the permutation sampler to sample from the constrained posterior by imposing identifiability constraints. The unknown number of classes is determined by formal Bayesian model comparison through exact model likelihoods. We apply a new method of computing model likelihoods for latent class models based on bridge sampling. The approach is applied to simulated data and to data from a metric conjoint analysis in the Austrian mineral water market. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
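As a companion illustration, the marginal (model) likelihood via bridge sampling can be sketched with the classical Meng–Wong fixed-point iteration. This is a generic sketch under assumed inputs, not the authors' implementation; for mixtures, label switching must additionally be handled:

```python
import numpy as np

def bridge_sampling_logml(log_joint, post_draws, prop_sampler, prop_logpdf,
                          n_prop=5000, iters=100):
    """Iterative bridge sampling estimate of the log marginal likelihood.

    log_joint   : theta -> log p(y, theta)
    post_draws  : list/array of posterior draws of theta
    prop_sampler/prop_logpdf : proposal g (e.g. a normal fit to post_draws)
    """
    n_post = len(post_draws)
    prop_draws = prop_sampler(n_prop)
    # log ratios l = log p(y, theta) - log g(theta) at both sets of draws
    l1 = np.array([log_joint(t) - prop_logpdf(t) for t in post_draws])
    l2 = np.array([log_joint(t) - prop_logpdf(t) for t in prop_draws])
    s1 = n_post / (n_post + n_prop)
    s2 = n_prop / (n_post + n_prop)
    log_r = 0.0  # current log-ML estimate
    for _ in range(iters):  # Meng-Wong fixed-point iteration
        num = np.mean(np.exp(l2 - np.logaddexp(np.log(s1) + l2,
                                               np.log(s2) + log_r)))
        den = np.mean(np.exp(-np.logaddexp(np.log(s1) + l1,
                                           np.log(s2) + log_r)))
        log_r = np.log(num) - np.log(den)
    return log_r
```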
85

ARMA Model Selection from a Hypothesis Testing Point of View

林芸生, Lin, Yun Sheng Unknown Date (has links)
This thesis focuses on model selection criteria for ARMA models. For information-based criteria such as AIC and BIC, model selection reduces to comparing likelihood values at the maximum likelihood estimates whenever the candidate models have the same total number of parameters. The key step in model selection is therefore determining the total number of parameters, which can be cast as a hypothesis test: the null hypothesis is that the total number of model parameters equals a given number k, and the alternative is that it equals k+1. Viewed this way, AIC has a high type I error probability but also high power, while BIC has a very low type I error probability and correspondingly lower power. This thesis proposes a model selection method that compromises between the two: the number of parameters is determined by a two-stage testing procedure constructed to control the average type I error probability at 5%, with power slightly above BIC's. When BIC is used in this testing problem, simulations indicate that its average type I error probability is well below 0.05, so the proposed test is expected to be more powerful than BIC. The first stage of the proposed test involves selecting the most likely model under the null and under the alternative hypothesis, and two methods are considered for this first-stage selection. In simulations, the first method's type I error probability can slightly exceed 0.05, but its power is significantly greater than BIC's; the second method keeps the type I error probability precisely under control, but its power gain is comparatively small. The computing time of the proposed test is rather long, but for those who want a compromise between AIC and BIC it can serve as a reasonable choice.
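To make the comparison concrete, here is an illustrative sketch of testing "k total parameters" against "k+1" over ARMA(p, q) candidates, using statsmodels. The helper names are hypothetical, and the chi-square reference distribution is a rough stand-in (the competing best models need not be nested, and the thesis calibrates its test differently):

```python
import numpy as np
from scipy.stats import chi2
from statsmodels.tsa.arima.model import ARIMA

def best_loglik(y, k):
    """Max log-likelihood over all ARMA(p, q) candidates with p + q = k."""
    fits = [ARIMA(y, order=(p, 0, k - p)).fit() for p in range(k + 1)]
    return max(f.llf for f in fits)

def lr_style_test(y, k, alpha=0.05):
    """Test H0: total parameters = k vs H1: total parameters = k + 1,
    using the best model found under each hypothesis (first stage)."""
    stat = 2 * (best_loglik(y, k + 1) - best_loglik(y, k))
    return stat > chi2.ppf(1 - alpha, df=1)  # approximate reference
```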
86

New results in dimension reduction and model selection

Smith, Andrew Korb 26 March 2008 (has links)
Dimension reduction is a vital tool in many areas of applied statistics in which the dimensionality of the predictors can be large. In such cases, many statistical methods will fail or yield unsatisfactory results. However, many data sets of high dimensionality actually contain a much simpler, low-dimensional structure. Classical methods such as principal components analysis detect linear structures very effectively but fail in the presence of nonlinear structure. In the first part of this thesis, we investigate the asymptotic behavior of two nonlinear dimensionality reduction algorithms, LTSA and HLLE. In particular, we show that both algorithms, under suitable conditions, asymptotically recover the true generating coordinates up to an isometry. We also discuss the relative merits of the two algorithms and the effects of the underlying probability distributions of the coordinates on their performance. Model selection is a fundamental problem in nearly all areas of applied statistics. In particular, a balance must be achieved between in-sample fit and out-of-sample predictive performance. It is typically very easy to achieve good fit to the sample data, but empirically we often find that such models generalize poorly. In the second part of the thesis, we propose a new procedure for the model selection problem which generalizes traditional methods. Our algorithm allows the combination of existing model selection criteria via a ranking procedure, leading to new criteria that combine measures of in-sample fit and out-of-sample prediction performance into a single value. We then propose an algorithm which provably finds the optimal combination with a specified probability. We demonstrate through simulations that these new combined criteria can be substantially more powerful than any individual criterion.
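One plausible reading of the ranking-based combination, as a minimal sketch (the thesis's actual algorithm, with its probabilistic optimality guarantee, is more involved; all names here are assumptions):

```python
import numpy as np
from scipy.stats import rankdata

def combined_criterion(scores, weights):
    """Combine several model selection criteria via a weighted rank sum.

    scores  : (n_models, n_criteria) array; lower is better for each criterion
    weights : (n_criteria,) nonnegative weights summing to 1
    """
    ranks = np.column_stack([rankdata(scores[:, j])
                             for j in range(scores.shape[1])])
    return ranks @ weights  # lower combined rank = preferred model

# e.g. combine AIC and a cross-validation error estimate with equal weight
scores = np.array([[102.3, 0.31], [98.7, 0.29], [105.1, 0.35]])
print(np.argmin(combined_criterion(scores, np.array([0.5, 0.5]))))  # -> 1
```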
87

eScience Approaches to Model Selection and Assessment : Applications in Bioinformatics

Eklund, Martin January 2009 (has links)
High-throughput experimental methods, such as DNA and protein microarrays, have become ubiquitous and indispensable tools in biology and biomedicine, and the number of high-throughput technologies is constantly increasing. They provide the power to measure thousands of properties of a biological system in a single experiment and have the potential to revolutionize our understanding of biology and medicine. However, the high expectations placed on high-throughput methods are challenged by the problem of statistically modeling the wealth of data in order to translate it into concrete biological knowledge, new drugs, and clinical practices. In particular, the huge number of properties measured in high-throughput experiments makes statistical model selection and assessment demanding. To use high-throughput data in critical applications, it must be warranted that the models we construct reflect the underlying biology and are not just hypotheses suggested by the data. We must furthermore have a clear picture of the risk of making incorrect decisions based on the models. The rapid improvement of computers and information technology has opened up new ways to approach the problem of model selection and assessment. Specifically, eScience, i.e. computationally intensive science carried out in distributed network environments, provides computational power and means to efficiently access previously acquired scientific knowledge. This thesis investigates how we can use eScience to improve our chances of constructing biologically relevant models from high-throughput data. Novel methods for model selection and assessment are proposed that leverage computational power and prior scientific information to "guide" the model selection toward models that are a priori likely to be relevant. In addition, a software system for deploying new methods and making them easily accessible to end users is presented.
88

Model selection and testing for an automated constraint modelling toolchain

Hussain, Bilal Syed January 2017 (has links)
Constraint Programming (CP) is a powerful technique for solving a variety of combinatorial problems. Automated modelling using a refinement-based approach abstracts over modelling decisions in CP by allowing users to specify their problem in a high-level specification language such as ESSENCE. This refinement process produces many models resulting from different choices, each with its own strengths. A parameterised specification represents a problem class, where the parameters of the class define the instance of the class we wish to solve. Since each model has different performance characteristics, the choice of model is crucial for solving the instance effectively. This thesis presents a method to generate instances automatically for the purpose of choosing a subset of the available models that have superior performance across the instance space. The second contribution of this thesis is a framework to automate the testing of a toolchain for automated modelling. This includes a generator of test cases that covers all aspects of the ESSENCE specification language, and it utilises our first contribution, instance generation, to generate parameterised specifications. The framework can detect errors such as inconsistencies in the model produced during the refinement process. Once a specification that causes an error has been identified, this thesis presents our third contribution: a method for reducing the specification to a much simpler form which still exhibits a similar error. Additionally, this process can generate a set of complementary specifications, including specifications that do not cause the error, to help pinpoint the root cause.
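A generic greedy reduction loop in the spirit of delta debugging conveys the flavour of the third contribution (the thesis's reducer works on ESSENCE specifications and is considerably more structure-aware; this sketch treats the specification as a flat list of lines):

```python
def reduce_spec(spec, exhibits_error):
    """Greedy reduction: repeatedly try removing chunks of the specification
    while the reduced version still triggers the same error.

    spec           : list of specification lines
    exhibits_error : callable returning True if the spec triggers the error
    """
    chunk = max(len(spec) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(spec):
            candidate = spec[:i] + spec[i + chunk:]
            if candidate and exhibits_error(candidate):
                spec = candidate  # keep the smaller failing spec; retry at i
            else:
                i += chunk        # this chunk is needed; move on
        chunk //= 2
    return spec
```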
89

Assessing Nonlinear Relationships through Rich Stimulus Sampling in Repeated-Measures Designs

Cole, James Jacob 01 August 2018 (has links)
Explaining a phenomenon often requires identification of an underlying relationship between two variables. However, it is common practice in psychological research to sample only a few values of an independent variable. Young, Cole, and Sutherland (2012) showed that this practice can impair model selection in between-subjects designs. The current study expands that line of research to within-subjects designs. In two Monte Carlo simulations, model discrimination under systematic sampling of 2, 3, or 4 levels of the IV was compared with that under random uniform sampling and sampling from a Halton sequence. The number of subjects, number of observations per subject, effect size, and between-subjects parameter variance in the simulated experiments were also manipulated. Random sampling outperformed the other methods in model discrimination, with only small, function-specific costs to parameter estimation. Halton sampling also produced good results but was less consistent. The systematic sampling methods were generally rank-ordered by the number of levels they sampled.
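The three sampling schemes compared in such simulations are easy to sketch (the range, level count, and function names here are assumptions for illustration):

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(3)

def sample_iv_values(n, scheme, lo=0.0, hi=10.0, n_levels=4):
    """Draw n stimulus values for the independent variable."""
    if scheme == "systematic":  # a few fixed, evenly spaced levels
        levels = np.linspace(lo, hi, n_levels)
        return rng.choice(levels, size=n)
    if scheme == "uniform":     # random uniform sampling over the range
        return rng.uniform(lo, hi, size=n)
    if scheme == "halton":      # low-discrepancy (quasi-random) sequence
        u = qmc.Halton(d=1, seed=3).random(n).ravel()
        return lo + (hi - lo) * u
    raise ValueError(f"unknown scheme: {scheme}")
```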
90

Mixed linear model selection using information criteria

Tatiana Kazue Yamanouchi 18 August 2017 (has links)
The mixed model is commonly used for repeated-measures data because of its flexibility in incorporating the correlation between observations measured on the same individual and the heterogeneity of variances of observations made over time. The model is composed of fixed effects, random effects, and a random error, so selecting a mixed model often requires selecting the best components of each kind so that the model represents the data well. Information criteria are widely used tools in model selection, but few studies indicate how they perform in the selection of the fixed effects, the random effects, and the covariance structure of the random error. In this work, a simulation study was performed to evaluate the performance of the AIC, BIC, and KIC information criteria in selecting the components of the mixed model, measured by the true positive (TP) rate. In general, the information criteria performed well, that is, they achieved high TP rates when the sample size was larger. In the selection of the fixed effects and of the covariance structure, BIC performed better than AIC and KIC in almost all situations. In the selection of the random effects, no criterion performed well, except under the compound symmetry structure, where BIC performed best.
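As a reference point, the three criteria compared in the simulation can all be computed from a fitted model's maximized log-likelihood. A minimal sketch; the KIC form shown is one common definition (Cavanaugh, 1999) and the thesis may use a variant:

```python
import numpy as np

def info_criteria(loglik, k, n):
    """AIC, BIC, and KIC from a maximized log-likelihood.

    loglik : maximized log-likelihood of the fitted model
    k      : number of estimated parameters
    n      : sample size
    """
    return {
        "AIC": -2 * loglik + 2 * k,
        "BIC": -2 * loglik + k * np.log(n),
        # one common form of KIC; assumed here, check against the thesis
        "KIC": -2 * loglik + 3 * (k + 1),
    }

print(info_criteria(loglik=-512.4, k=6, n=200))
```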
