  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Non-uniformly expanding dynamical systems and decay of correlations for non-Hölder continuous observables

Lynch, Vincent Kevin January 2003 (has links)
No description available.
2

The reactive tabu search for efficient correlated experimental designs

Coombes, Neil Edwin January 2002 (has links)
No description available.
3

Contribution à la classification par modèles de mélange et classification simultanée d’échantillons d’origines multiples / Contribution to Model-Based Clustering and Simultaneous Clustering of Samples Arising from Multiple Origins

Lourme, Alexandre 17 June 2011 (has links)
In the first part of this thesis we review model-based clustering with mixture models. In particular, we describe a commonly used family of Gaussian mixtures whose parsimony bears on geometrically interpretable parameters. As these models suffer from major drawbacks, we set against them a new family of Gaussian mixtures whose parsimony bears on statistical parameters. These new models have many stability properties that make them mathematically consistent and facilitate their interpretation. In the second part we present a new method called simultaneous clustering. We show that the classification of a single sample can very often be seen as the partitioning of several samples; we then propose to establish a link between the populations from which the different samples originate. This link, whose nature varies with the context, always aims to formalize realistically some information common to the data to be classified. When the samples are described by variables with the same meaning and the same number of groups is sought in each of them, we establish a stochastic link between the conditional populations. When the variables differ but are semantically close from one sample to another, their discriminant power may be similar and the overlap of the conditional data comparable. We consider mixtures specific to this context, linked by a homogeneous overlap of their components.
4

Estimation adaptative pour des problèmes inverses avec des applications à la division cellulaire / Adaptive estimation for inverse problem with application to cell division

Hoang, Van Hà 28 November 2016 (has links)
This thesis is divided into two independent parts. In the first, we consider a stochastic individual-based model in continuous time describing a size-structured population of dividing cells. The random point measure describing the cell population evolves as a piecewise deterministic Markov process. We address the nonparametric estimation of the kernel governing the divisions under two observation schemes. First, when we observe the evolution of the cells up to a fixed time T and obtain the whole division tree, we construct an adaptive kernel estimator of the division kernel with fully data-driven bandwidth selection. We obtain an oracle inequality and optimal exponential rates of convergence. Second, when the division tree is not completely observed, we show that, in a large-population limit, the renormalized microscopic process describing the evolution of the cells converges to the weak solution of a partial differential equation. We propose an estimator of the division kernel based on Fourier techniques and prove its consistency. In the second part, we consider nonparametric regression with errors in variables in the multidimensional setting. We estimate the unknown multivariate regression function with an adaptive estimator based on projection kernels built from a multi-index wavelet basis and a deconvolution operator. The wavelet resolution level is selected by the Goldenshluger-Lepski method. We obtain an oracle inequality and optimal rates of convergence over anisotropic Hölder classes.
5

Sélection de groupes de variables corrélées en grande dimension / Selection of groups of correlated variables in a high dimensional setting

Grimonprez, Quentin 14 December 2016 (has links)
This thesis addresses variable selection in the high-dimensional setting using penalized regression in the presence of redundancy between explanatory variables. Among the candidate variables, we assume that only a small number are actually relevant for explaining the response. In this high-dimensional setting, the performance of classical Lasso-type approaches degrades as redundancy grows, since they do not take it into account. Grouping the variables beforehand can overcome this problem but usually requires calibrating additional parameters. The proposed approach combines variable grouping and selection in order to improve interpretability and performance. First, a hierarchical agglomerative clustering provides, at each level, a partition of the variables into groups. Then the Group-lasso is applied to the set of groups of variables from all levels of the hierarchy, with a fixed regularization parameter. Choosing this parameter yields a list of candidate groups potentially coming from different levels. The final choice of groups is made by a multiple testing procedure. The proposed procedure exploits the hierarchical structure of the clustering and weights in the Group-lasso, which greatly reduces the algorithmic complexity induced by the possibility of choosing groups from different levels of the hierarchy.
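The first step described in this abstract, collecting candidate groups of variables from every level of a hierarchical clustering, can be sketched roughly as follows. This is an illustration only, not the thesis's implementation: single linkage, the function name `candidate_groups`, and the toy distance matrix (read as 1 minus absolute correlation) are all assumptions, and the Group-lasso and multiple-testing steps are not shown.

```python
def candidate_groups(dist):
    """Agglomerative clustering (single linkage, an assumption) on a
    symmetric distance matrix. Returns every group of variables that
    appears at any level of the hierarchy, singletons included."""
    n = len(dist)
    clusters = [frozenset([i]) for i in range(n)]
    groups = list(clusters)  # lowest level: each variable on its own
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] | clusters[b]
        # Replace the two merged clusters by their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        groups.append(merged)
    return groups


# Hypothetical distances between four variables: variables 0 and 1 are
# strongly correlated, as are variables 2 and 3 (illustrative values).
dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
groups = candidate_groups(dist)
```

With p variables, a binary hierarchy of this kind contains 2p - 1 groups, so the set of candidates handed to a group-wise penalty stays small compared with the 2^p possible subsets of variables.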
6

Extensions of the case-control design in genome-wide association studies

Loizides, Charalambos January 2012 (has links)
The case-control design is one of the most commonly used designs in genome-wide association studies. When we increase the sample size of either the controls or, more importantly, the cases, the power of whatever test we use will certainly increase. However, increasing the sample size means that additional individuals need to be genotyped, which implies extra financial costs. Nowadays, with the emergence of genetic studies, a large amount of genetic data is available at low or no extra cost. Even though those data may not be completely relevant to the current study, they can still be used to increase the probability of identifying true associations. Furthermore, additional information, not necessarily genetic, can also be used to improve the power of a method. In this thesis we extend the case-control design in order to take advantage of such types of additional data and/or information. We discuss three designs: the case-cohort-control, the kin-cohort and the super-case–case–control–super-control designs. For each of these, we present methods that are adjusted or modified versions of standard case-control methods, but we also propose novel ones developed with those extended designs in mind. Ultimately, we describe how those methods can be used to increase the power of association tests, especially compared to similar methods of the case-control design.
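The point about sample size and power can be illustrated with a standard Pearson chi-square test on a 2x2 table of allele counts. This is a minimal sketch, not one of the methods proposed in the thesis; the function name `chi_square_2x2` and all counts are made up for illustration, and the extra controls are assumed to have the same allele frequency as the original ones.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the table [[a, b], [c, d]], where rows are cases and
    controls and columns are counts of the two alleles."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: 100 case alleles (60 vs 40) in both scenarios.
# Scenario 1 uses 100 control alleles, scenario 2 uses 4000 control
# alleles with the same 50/50 allele frequency.
small = chi_square_2x2(60, 40, 50, 50)
large = chi_square_2x2(60, 40, 2000, 2000)
```

In this toy setting the statistic with few controls falls below the 5% critical value of a chi-square distribution with one degree of freedom (about 3.84), while the statistic with many controls exceeds it, even though the cases are unchanged.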
7

Canonical correlation analysis of aggravated robbery and poverty in Limpopo Province

Rwizi, Tandanai 05 1900 (has links)
The study aimed to explore the relationship between poverty and aggravated robbery in Limpopo Province. Sampled secondary data on aggravated robbery offenders, obtained from the South African Police Service (SAPS), Polokwane, were used in the analysis. Empirical research on poverty and crime suggests that poverty increases vulnerability to crime. The poverty set was categorised by gender, employment status, marital status, race, age and educational attainment. The variables for aggravated robbery were house robbery, bank robbery, street/common robbery, carjacking, truck hijacking, cash-in-transit and business robbery. Canonical correlation analysis was used to make inferences about the relationship between these two sets. The results revealed a significant positive correlation of 0.219 (p-value = 0.025) between poverty and aggravated robbery at the five per cent significance level. Of the thirteen variables entered into the poverty-aggravated robbery model, five emerged as statistically significant. These were gender, marital status, employment status, common robbery and business robbery. / Mathematical Sciences / M. Sc. (Statistics)
9

Understanding patterns of aggregation in count data

Sebatjane, Phuti 06 1900 (has links)
The term aggregation refers to overdispersion, and the two are used interchangeably in this thesis. To address the prevalence of infectious parasite species faced by most rural livestock farmers, we model the distribution of faecal egg counts of 15 parasite species (13 internal parasites and 2 ticks) common in sheep and goats. Aggregation and excess zeroes are addressed through the use of generalised linear models. The abundance of each species was modelled using six different distributions: the Poisson, negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-altered Poisson (ZAP) and zero-altered negative binomial (ZANB), and their fits were then compared. Excess-zero models (ZIP, ZINB, ZAP and ZANB) were found to fit better than standard count models (Poisson and negative binomial) in all 15 cases. We further investigated how the distributional assumption affects aggregation and zero inflation. Aggregation and zero inflation (measured by the dispersion parameter k and the zero-inflation probability) were found to vary greatly with the distributional assumption; this in turn changed the fixed-effects structure. Serial autocorrelation between adjacent observations was then taken into account by fitting observation-driven time series models to the data. Simultaneously accounting for autocorrelation, overdispersion and zero inflation proved successful, as zero-inflated autoregressive models performed better than zero-inflated models in most cases. Apart from contributing to scientific knowledge, predictability of parasite burden will help farmers with effective disease management interventions. Researchers confronted with the task of analysing count data with excess zeroes can use the findings of this illustrative study as a guideline irrespective of their research discipline. Statistical methods ranging from model selection and quantification of zero inflation through to accounting for serial autocorrelation are described and illustrated. / Statistics / M.Sc. (Statistics)
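As a rough illustration of the ideas in this abstract (not code from the thesis), the sketch below compares the probability of a zero count under a Poisson model and under a zero-inflated Poisson (ZIP) model, and computes the variance-to-mean ratio as a simple indicator of overdispersion. The counts and the parameter names `lam` and `pi` are hypothetical.

```python
import math
from statistics import mean, pvariance


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson: with probability pi the
    count is a structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def dispersion_index(counts):
    """Variance-to-mean ratio; a Poisson model implies a value near 1,
    so values well above 1 suggest overdispersion (aggregation)."""
    return pvariance(counts) / mean(counts)


# Hypothetical egg counts with many zeroes (illustrative values only).
counts = [0, 0, 0, 0, 0, 0, 1, 2, 5, 12]
index = dispersion_index(counts)

# Probability of observing a zero under each model, for made-up parameters.
p_zero_poisson = poisson_pmf(0, 2.0)
p_zero_zip = zip_pmf(0, 2.0, 0.3)
```

For the same `lam`, the ZIP model always assigns at least as much probability to zero as the Poisson model, which is why it can accommodate data with more zeroes than a Poisson fit would predict.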
