401

The use of control variates in bootstrap simulation.

January 2001 (has links)
Lui Ying Kin. Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 63-65). Abstracts in English and Chinese.

Contents:
Chapter 1: Introduction (p.1)
Chapter 2: Introduction to bootstrap and efficient bootstrap simulation (p.5)
  2.1 Background of bootstrap (p.5)
  2.2 Basic idea of bootstrap (p.7)
  2.3 Variance reduction methods (p.10)
    2.3.1 Control variates (p.10)
    2.3.2 Common random numbers (p.12)
    2.3.3 Antithetic variates (p.14)
    2.3.4 Importance sampling (p.15)
  2.4 Efficient bootstrap simulation (p.17)
    2.4.1 Linear approximation (p.18)
    2.4.2 Centring method (p.19)
    2.4.3 Balanced resampling (p.20)
    2.4.4 Antithetic resampling (p.21)
Chapter 3: Methodology (p.22)
  3.1 Introduction (p.22)
  3.2 Cluster analysis (p.24)
  3.3 Regression estimator and mixture experiment (p.25)
  3.4 Estimate of standard error and bias (p.30)
Chapter 4: Simulation study (p.45)
  4.1 Introduction (p.45)
  4.2 Ratio estimation (p.46)
  4.3 Time series problem (p.50)
  4.4 Regression problem (p.54)
Chapter 5: Conclusion and discussion (p.60)
References (p.63)
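The control-variate idea named in this title can be sketched in a few lines. The following Python snippet (illustrative only; it is not the estimator developed in the thesis) reduces the Monte Carlo noise of a bootstrap estimate of the median by using the resample mean as a control variate, exploiting the fact that under resampling its expectation equals the sample mean exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(size=50)      # observed sample
B = 500                           # number of bootstrap resamples

theta = np.empty(B)               # statistic of interest: the resample median
ctrl = np.empty(B)                # control variate: the resample mean
for b in range(B):
    xb = rng.choice(x, size=x.size, replace=True)
    theta[b] = np.median(xb)
    ctrl[b] = xb.mean()

# Under bootstrap resampling, E*[resample mean] = sample mean exactly,
# so the control variate's expectation is known without error.
beta = np.cov(theta, ctrl)[0, 1] / np.var(ctrl)
naive = theta.mean()
adjusted = naive - beta * (ctrl.mean() - x.mean())
print(naive, adjusted)            # adjusted has smaller Monte Carlo variance
```

The adjustment pays off whenever the statistic and the control variate are strongly correlated, which is typically the case for location statistics.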
402

Margin variations in support vector regression for the stock market prediction.

January 2003 (has links)
Yang, Haiqin. Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 98-109). Abstracts in English and Chinese.

Contents:
Abstract (p.ii)
Acknowledgement (p.v)
Chapter 1: Introduction (p.1)
  1.1 Time Series Prediction and Its Problems (p.1)
  1.2 Major Contributions (p.2)
  1.3 Thesis Organization (p.3)
  1.4 Notation (p.4)
Chapter 2: Literature Review (p.5)
  2.1 Framework (p.6)
    2.1.1 Data Processing (p.8)
    2.1.2 Model Building (p.10)
    2.1.3 Forecasting Procedure (p.12)
  2.2 Model Descriptions (p.13)
    2.2.1 Linear Models (p.15)
    2.2.2 Non-linear Models (p.17)
    2.2.3 ARMA Models (p.21)
    2.2.4 Support Vector Machines (p.23)
Chapter 3: Support Vector Regression (p.27)
  3.1 Regression Problem (p.27)
  3.2 Loss Function (p.29)
  3.3 Kernel Function (p.34)
  3.4 Relation to Other Models (p.36)
    3.4.1 Relation to Support Vector Classification (p.36)
    3.4.2 Relation to Ridge Regression (p.38)
    3.4.3 Relation to Radial Basis Function (p.40)
  3.5 Implemented Algorithms (p.40)
Chapter 4: Margins in Support Vector Regression (p.46)
  4.1 Problem (p.47)
  4.2 General ε-insensitive Loss Function (p.48)
  4.3 Accuracy Metrics and Risk Measures (p.52)
Chapter 5: Margin Variation (p.55)
  5.1 Non-fixed Margin Cases (p.55)
    5.1.1 Momentum (p.55)
    5.1.2 GARCH (p.57)
  5.2 Experiments (p.58)
    5.2.1 Momentum (p.58)
    5.2.2 GARCH (p.65)
  5.3 Discussions (p.72)
Chapter 6: Relation between Downside Risk and Asymmetrical Margin Settings (p.77)
  6.1 Mathematical Derivation (p.77)
  6.2 Algorithm (p.81)
  6.3 Experiments (p.83)
  6.4 Discussions (p.86)
Chapter 7: Conclusion (p.92)
Appendix A: Basic Results for Solving SVR (p.94)
  A.1 Dual Theory (p.94)
  A.2 Standard Method to Solve SVR (p.96)
Bibliography (p.98)
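To make the idea of a non-fixed margin concrete: standard ε-SVR uses one global ε, so a simple way to emulate a volatility-driven margin (a rough stand-in for the GARCH-based variation studied here, not the thesis's algorithm) is to refit over rolling windows with ε scaled to recent volatility. A Python sketch with synthetic prices and illustrative parameter values:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(0, 1, 300))   # synthetic price series

def lagged(series, p=5):
    # build (X, y) pairs: p lagged values predict the next observation
    X = np.array([series[i:i + p] for i in range(len(series) - p)])
    return X, series[p:]

window, preds = 100, []
for t in range(window, len(prices) - 1):
    hist = prices[t - window:t]
    X, y = lagged(hist)
    # margin tied to recent volatility: a wider tube in noisy regimes,
    # a narrower one in calm regimes (the scale factor 0.5 is arbitrary)
    eps = 0.5 * hist[-20:].std()
    model = SVR(kernel="rbf", C=10.0, epsilon=eps).fit(X, y)
    preds.append(model.predict(prices[t - 5:t].reshape(1, -1))[0])
```

Per-sample margins, and the asymmetric up/down margins of Chapter 6, require modifying the SVR optimization problem itself rather than wrapping a standard solver.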
403

Methods for functional regression and nonlinear mixed-effects models with applications to PET data

Chen, Yakuan January 2017 (has links)
The overall theme of this thesis focuses on methods for functional regression and nonlinear mixed-effects models with applications to PET data. The first part considers the problem of variable selection in regression models with functional responses and scalar predictors. We pose the function-on-scalar model as a multivariate regression problem and use group-MCP for variable selection. We account for residual covariance by "pre-whitening" using an estimate of the covariance matrix, and establish theoretical properties for the resulting estimator. We further develop an iterative algorithm that alternately updates the spline coefficients and the covariance. Our method is illustrated by an application to two-dimensional planar reaching motions in a study of the effects of stroke severity on motor control. The second part introduces a functional data analytic approach for the estimation of the impulse response function (IRF), which is necessary for describing the binding behavior of the radiotracer. Virtually all existing methods share three aspects: summarizing the entire IRF with a single scalar measure, modeling each subject separately, and imposing parametric restrictions on the IRF. In contrast, we propose a functional data analytic approach that regards each subject's IRF as the basic analysis unit, models multiple subjects simultaneously, and estimates the IRF nonparametrically. We pose our model as a linear mixed-effects model in which shrinkage and roughness penalties are incorporated to enforce identifiability and smoothness of the estimated curves, respectively, while monotonicity and non-negativity constraints impose biological information on the estimates. We illustrate this approach by applying it to clinical PET data. The third part discusses a nonlinear mixed-effects modeling approach for PET data analysis under the assumption of a compartment model. The traditional nonlinear least squares (NLS) estimators of the population parameters are applied in a two-stage analysis, which introduces instability and neglects the variation in the rate parameters. In contrast, we propose to estimate the rate parameters by fitting nonlinear mixed-effects (NLME) models, in which all subjects are modeled simultaneously by allowing the rate parameters to have random effects, and the population parameters can be estimated directly from the joint model. Simulations are conducted to compare the power of tests based on NLS and NLME models for detecting group effects in both the rate parameters and summary measures. We apply our NLME approach to clinical PET data to illustrate the model-building procedure.
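The two-stage NLS analysis that this thesis argues against is easy to illustrate. The toy Python sketch below fits a mono-exponential impulse response per subject and then averages the individual estimates; the model, rate values, and noise levels are all invented for illustration, and a genuine NLME fit would instead estimate the population and random effects jointly with dedicated software:

```python
import numpy as np
from scipy.optimize import curve_fit

def irf(t, K1, k2):
    # toy one-tissue impulse response: K1 * exp(-k2 * t)
    return K1 * np.exp(-k2 * t)

rng = np.random.default_rng(2)
t = np.linspace(0.1, 60, 30)            # scan times in minutes
subjects = []
for _ in range(8):                      # simulate 8 subjects
    K1 = rng.normal(0.6, 0.1)           # per-subject true rate parameters
    k2 = rng.normal(0.15, 0.03)
    subjects.append(irf(t, K1, k2) + rng.normal(0, 0.02, t.size))

# stage 1: separate NLS fit per subject
fits = np.array([curve_fit(irf, t, y, p0=(0.5, 0.1))[0] for y in subjects])
# stage 2: population estimates as means of the individual fits;
# this step ignores the uncertainty carried over from stage 1
print(fits.mean(axis=0))
```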
404

Análise de regressão incorporando o esquema amostral / Regression analysis incorporating the sample design

Cléber da Costa Figueiredo 22 June 2004 (has links)
In this work we study linear regression models for the analysis of data obtained from complex sample surveys. We consider theoretical aspects as well as applications to real data sets using the SUDAAN software and the ADAC library for the R language. The applications involve normal and logistic regression models. We also carry out comparative studies of the methods studied against those that assume observations are selected by simple random sampling.
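As a rough illustration of why the sampling design matters, the Python sketch below fits a logistic regression with sampling weights via statsmodels; the data and weights are synthetic. Weighting corrects the point estimates, but full design-based standard errors, of the kind SUDAAN computes, also require the strata and cluster structure:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-0.8 * x)))
w = rng.uniform(1, 10, size=n)     # survey weights (inverse selection prob.)

X = sm.add_constant(x)
# weighted fit: estimates account for unequal selection probabilities
fit = sm.GLM(y, X, family=sm.families.Binomial(), freq_weights=w).fit()
print(fit.params)
```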
405

Functional data analytics for wearable device and neuroscience data

Wrobel, Julia Lynn January 2019 (has links)
This thesis uses methods from functional data analysis (FDA) to solve problems from three scientific areas of study. While the areas of application are quite distinct, the common thread of functional data analysis ties them together. The first chapter describes interactive open-source software for explaining and disseminating the results of functional data analyses. Chapters two and three use curve alignment, or registration, to solve common problems in accelerometry and neuroimaging, respectively. The final chapter introduces a novel regression method for modeling functional outcomes that are trajectories over time. The first chapter of this thesis details a software package for interactively visualizing functional data analyses. The software is designed to work for a wide range of datasets and several types of analyses. This chapter describes that software and provides an overview of FDA in different contexts. The second chapter introduces a framework for curve alignment, or registration, of exponential family functional data. The approach distinguishes itself from previous registration methods in its ability to handle dense binary observations with computational efficiency. Motivation comes from the Baltimore Longitudinal Study of Aging, in which accelerometer data provide valuable insights into the timing of sedentary behavior. The third chapter takes the lessons learned about curve registration from the second chapter and uses them to develop methods in an entirely new context: large multisite brain imaging studies. Scanner effects in multisite imaging studies are non-biological variability due to technical differences across sites and scanner hardware. This method identifies and removes scanner effects by registering the cumulative distribution functions of image intensity values. In the final chapter the focus shifts from curve registration to regression. Described within this chapter is an entirely new nonlinear regression framework that draws from both functional data analysis and systems of ordinary differential equations. This model is motivated by the neurobiology of skilled movement and was developed to capture the relationship between neural activity and arm movement in mice.
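The CDF-registration idea of chapter three reduces, in its simplest form, to quantile matching. The Python sketch below (a simplification, not the method developed in the thesis) maps one scanner's intensity distribution onto another's by aligning empirical CDFs:

```python
import numpy as np

def match_cdf(source, reference):
    # map source intensities onto the reference distribution by
    # aligning empirical CDFs (quantile matching)
    s_sorted = np.sort(source)
    ranks = np.searchsorted(s_sorted, source) / len(source)
    return np.quantile(reference, ranks)

rng = np.random.default_rng(4)
site_a = rng.normal(100, 15, 10_000)   # intensities from scanner A
site_b = rng.normal(120, 25, 10_000)   # scanner B: shifted and wider
site_b_harmonized = match_cdf(site_b, site_a)
```

The registration approach in the thesis goes further by treating the CDFs as functional data and aligning them smoothly, rather than point by point.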
406

Censored Regression Techniques for Credit Scoring

Glasson, Samuel, sglas@iinet.net.au January 2007 (has links)
This thesis investigates the use of newly-developed survival analysis tools for credit scoring. Credit scoring techniques are currently used by financial institutions to estimate the probability of a customer defaulting on a loan by a predetermined time in the future. While a number of classification techniques are currently used, banks are now becoming more concerned with estimating the lifetime of the loan rather than just the probability of default. Difficulties arise when using standard statistical techniques due to the presence of censoring in the data. Survival analysis, originating from the medical and engineering fields, is an area of statistics that typically deals with censored lifetime data. The theoretical developments in this thesis revolve around linear regression for censored data, in particular the Buckley-James method. The Buckley-James method is analogous to linear regression and gives estimates of the expected lifetime given a set of explanatory variables. The first development is a measure of fit for censored regression, similar to the classical R-squared of linear regression. Next, the variable-reduction technique of stepwise selection is extended to the Buckley-James method. For the last development, the Buckley-James algorithm is altered to incorporate non-linear regression methods such as neural networks and Multivariate Adaptive Regression Splines (MARS). MARS shows promise in terms of predictive power and interpretability in both simulation and empirical studies. The practical section of the thesis involves using the new techniques to predict the time to default and time to repayment of unsecured personal loans from a database obtained from a major Australian bank. The analyses are unique, being the first published work applying Buckley-James and related methods to a large-scale financial database.
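The Buckley-James iteration described here alternates between a least-squares fit and a Kaplan-Meier-based imputation of the censored responses. A minimal numpy sketch (ignoring ties, standard-error estimation, and formal convergence checks):

```python
import numpy as np

def km_survival(times, events):
    # Kaplan-Meier survival estimate evaluated at the sorted times
    order = np.argsort(times)
    t, d = times[order], events[order]
    surv, s = [], 1.0
    for i in range(len(t)):
        if d[i]:
            s *= 1 - 1 / (len(t) - i)      # drop at each event time
        surv.append(s)
    return t, np.array(surv)

def buckley_james_step(X, y, delta, beta):
    # one step: replace each censored y by its conditional expectation
    # under the KM law of the current residuals, then refit by OLS
    resid = y - X @ beta
    t, S = km_survival(resid, delta)
    mass = np.concatenate([[1.0], S[:-1]]) - S     # KM jump masses
    y_new = y.copy()
    for i in np.where(delta == 0)[0]:
        tail = t > resid[i]
        if mass[tail].sum() > 0:
            e_tail = (t[tail] * mass[tail]).sum() / mass[tail].sum()
            y_new[i] = X[i] @ beta + e_tail
    return np.linalg.lstsq(X, y_new, rcond=None)[0]

rng = np.random.default_rng(5)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y_true = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
c = X[:, 1] + rng.normal(2.5, 1.0, size=n)     # censoring times
delta = (y_true <= c).astype(int)              # 1 = observed, 0 = censored
y = np.minimum(y_true, c)

beta = np.zeros(2)
for _ in range(20):                            # iterate to near-convergence
    beta = buckley_james_step(X, y, delta, beta)
print(beta)                                    # approx. (1.0, 2.0)
```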
407

Adaptive Techniques for Enhancing the Robustness and Performance of Speciated PSOs in Multimodal Environments

Bird, Stefan Charles, stbird@seatiger.org January 2008 (has links)
This thesis proposes several new techniques to improve the performance of speciated particle swarms in multimodal environments. We investigate how these algorithms can become more robust and adaptive, easier to use, and able to solve a wider variety of optimisation problems. We then develop a technique that uses regression to vastly improve an algorithm's convergence speed without requiring extra evaluations. Speciation techniques play an important role in particle swarms. They allow an algorithm to locate multiple optima, providing the user with a choice of solutions. Speciation also provides diversity preservation, which can be critical for dynamic optimisation. By increasing diversity and tracking multiple peaks simultaneously, speciated algorithms are better able to handle the changes inherent in dynamic environments. Speciation algorithms often require the user to specify a parameter that controls how species form. This is a major drawback, since this knowledge may not be available a priori, and if the parameter is incorrectly set the algorithm's performance is likely to be highly degraded. We propose using a time-based measure to control speciation, allowing the algorithm to define species far more adaptively, using the population's characteristics and behaviour to control membership. Two new techniques presented in this thesis, ANPSO and ESPSO, use time-based convergence measures to define species. These methods are shown to be robust while still providing highly competitive performance; both algorithms effectively optimised all of our test functions without requiring any tuning. Speciated algorithms are ideally suited to optimising dynamic environments; however, the complexity of these environments makes designing algorithms for them far more difficult. To increase an algorithm's performance it is necessary to determine in what ways it should be improved. While performance metrics allow optimisation techniques to be compared, they cannot show how to improve an algorithm. Until now this has been done largely by trial and error, which is extremely inefficient, in the same way that it is inefficient to try to improve a program's speed without profiling it first. This thesis proposes a new metric that exclusively measures convergence speed. We show that an algorithm can be profiled by correlating its performance as measured by multiple metrics. By combining these two techniques, we can obtain far better insight into how best to improve an algorithm. Using this information, we then propose a local convergence enhancement that greatly increases performance by actively estimating the location of an optimum. The enhancement uses regression to fit a surface to the peak, guiding the search by estimating the peak's true location. By incorporating this technique, the algorithm is able to use the information contained within the fitness landscape far more effectively. We show that by combining the regression with an existing speciated algorithm we are able to vastly improve the algorithm's performance. This technique will greatly enhance the utility of PSO on problems where fitness evaluations are expensive, or that require fast reaction to change.
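The regression-based convergence enhancement can be illustrated generically: sample the fitness near a converging species, fit a quadratic surface, and jump to its stationary point. A Python sketch of that idea (the surface model and constants are illustrative; the thesis's integration with ANPSO/ESPSO involves more machinery):

```python
import numpy as np

def estimate_peak(points, fitness):
    # fit f(x) ~ c + b.x + x'Ax by least squares, then solve grad f = 0
    X = np.asarray(points)
    n, d = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(d)]
    idx = [(i, j) for i in range(d) for j in range(i, d)]
    cols += [X[:, i] * X[:, j] for i, j in idx]
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), fitness, rcond=None)
    b = coef[1:d + 1]
    A = np.zeros((d, d))
    for k, (i, j) in enumerate(idx):
        A[i, j] = A[j, i] = coef[d + 1 + k] if i == j else coef[d + 1 + k] / 2
    # stationary point of the fitted quadratic: 2Ax + b = 0
    return np.linalg.solve(-2 * A, b)

rng = np.random.default_rng(6)
pts = rng.normal([1.0, -2.0], 0.5, size=(30, 2))     # particles near a peak
vals = -(pts[:, 0] - 1) ** 2 - (pts[:, 1] + 2) ** 2  # their fitness values
print(estimate_peak(pts, vals))                      # approx. [1, -2]
```

No extra fitness evaluations are needed: the surface is fitted to points the swarm has already evaluated.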
408

Contributions to the estimation of probabilistic discriminative models: semi-supervised learning and feature selection

Sokolovska, Nataliya 25 February 2010 (has links) (PDF)
In this thesis we study the estimation of discriminative probabilistic models, focusing on semi-supervised learning and feature selection. The goal of semi-supervised learning is to improve the performance of supervised learning by exploiting unlabeled data; this objective is difficult to achieve with discriminative models. Discriminative probabilistic models make it possible to handle rich linguistic representations in the form of very high-dimensional feature vectors. Working in high dimension raises problems, computational ones in particular, which are exacerbated in sequence models such as conditional random fields (CRFs). Our contribution is twofold. First, we introduce an original and simple method for integrating unlabeled data into a semi-supervised objective function, and we show that the corresponding semi-supervised estimator is asymptotically optimal; the case of logistic regression is illustrated with experimental results. Second, we propose an estimation algorithm for CRFs that performs model selection through an $L_1$ penalty. We also present experimental results on natural language processing tasks (chunking and named-entity detection), analysing generalization performance and the selected features. Finally, we suggest several directions for improving the computational efficiency of this technique.
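The $L_1$-penalized selection described for CRFs has a compact analogue for plain logistic regression, which the thesis also treats. A Python sketch on synthetic data showing how the penalty drives most coefficients exactly to zero (CRF training itself requires dedicated sequence-model tools):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n, d = 500, 50
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 2.0                   # only the first 5 features are informative
y = rng.binomial(1, 1 / (1 + np.exp(-X @ w_true)))

# the L1 penalty zeroes out uninformative coefficients -> feature selection;
# C controls the penalty strength (smaller C gives a sparser model)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print(np.flatnonzero(clf.coef_))   # indices of the selected features
```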
409

Analys av hur makroekonomiska faktorer påverkar registrering av aktiebolag / Analysis of how macroeconomic factors affect the registration of limited companies

Janegren, Jonas, Borggren, Dan January 2010 (has links)
No description available.
410

Test case prioritization

Malishevsky, Alexey Grigorievich 19 June 2003 (has links)
Regression testing is an expensive software engineering activity intended to provide confidence that modifications to a software system have not introduced faults. Test case prioritization techniques help to reduce regression testing cost by ordering test cases in a way that better achieves testing objectives. In this thesis, we are interested in prioritizing to maximize a test suite's rate of fault detection, measured by the APFD (Average Percentage of Faults Detected) metric, so as to detect regression faults as early as possible during testing. In previous work, several prioritization techniques using low-level code coverage information had been developed. These techniques try to maximize APFD over a sequence of software releases, not targeting a particular release, and their effectiveness was empirically evaluated. We present a larger set of prioritization techniques that use information at arbitrary granularity levels and incorporate modification information, targeting prioritization at a particular software release. Our empirical studies show significant improvements in the rate of fault detection over randomly ordered test suites. Previous work on prioritization assumed uniform test costs and fault severities, which might not be realistic in many practical cases. We present a new cost-cognizant metric, APFD_c, and prioritization techniques, together with approaches for measuring and estimating these costs. Our empirical studies evaluate prioritization in a cost-cognizant environment. Prioritization techniques have been developed independently, with little consideration of their similarities. We present a general prioritization framework that allows us to express existing prioritization techniques as a framework algorithm with parameters and specific functions. Previous research assumed that prioritization was always beneficial if it improved the APFD metric. We introduce a prioritization cost-benefit model that more accurately captures relevant cost and benefit factors, and allows practitioners to assess whether it is economical to employ prioritization. Prioritization effectiveness varies across programs, versions, and test suites. We empirically investigate several of these factors on substantial software systems and present a classification-tree-based predictor that can help select the most appropriate prioritization technique in advance. Together, these results improve our understanding of test case prioritization and of the processes by which it is performed. / Graduation date: 2004
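The APFD metric that drives this work has a simple closed form: with n tests, m faults, and TF_i the position of the first test that reveals fault i, APFD = 1 - (TF_1 + ... + TF_m)/(nm) + 1/(2n). A small Python sketch with a made-up fault matrix (it assumes every fault is detected by at least one test):

```python
import numpy as np

def apfd(order, fault_matrix):
    # order: a permutation of test indices; fault_matrix[t, f] = 1 if
    # test t detects fault f. Higher APFD = faults exposed earlier.
    n, m = fault_matrix.shape
    first = []
    for f in range(m):
        hits = [pos for pos, t in enumerate(order, 1) if fault_matrix[t, f]]
        first.append(hits[0])              # first position exposing fault f
    return 1 - sum(first) / (n * m) + 1 / (2 * n)

faults = np.array([[1, 0, 0],      # toy matrix: 4 tests x 3 faults
                   [0, 1, 0],
                   [0, 1, 1],
                   [0, 0, 1]])
print(apfd([2, 0, 1, 3], faults))  # running test 2 first scores ~0.79
print(apfd([0, 1, 2, 3], faults))  # the unprioritized order scores 0.625
```

The cost-cognizant APFD_c introduced in the thesis generalizes this by weighting each fault by its severity and each test by its cost.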
