11

Ranking from Pairwise Comparisons : The Role of the Pairwise Preference Matrix

Rajkumar, Arun January 2016
Ranking a set of candidates or items from pairwise comparisons is a fundamental problem that arises in many settings, such as elections, recommendation systems, sports team rankings, and document rankings. Indeed, it is well known in the psychology literature that when a large number of items are to be ranked, it is easier for humans to give pairwise comparisons than complete rankings. The problem of ranking from pairwise comparisons has been studied in multiple communities, including machine learning, operations research, linear algebra, and statistics, and several algorithms (both classic and recent) have been proposed. However, it is not well understood under what conditions these different algorithms perform well. In this thesis, we aim to fill this fundamental gap by elucidating precise conditions under which different algorithms perform well, as well as giving new algorithms that provably perform well under broader conditions. In particular, we consider a natural statistical model wherein for every pair of items (i, j) there is a probability P_ij such that each time items i and j are compared, item j beats item i with probability P_ij. Such models, which we summarize through a matrix containing all these pairwise probabilities, have been used explicitly or implicitly in much previous work in the area; we refer to the resulting matrix as the pairwise preference matrix, and elucidate the crucial role it plays in determining the performance of various algorithms. In the first part of the thesis, we consider a natural generative model where all pairs of items can be sampled and where the underlying preferences are assumed to be acyclic. Under this setting, we elucidate the conditions on the pairwise preference matrix under which popular algorithms such as matrix Borda, spectral ranking, least squares, and maximum likelihood under a Bradley-Terry-Luce (BTL) model produce optimal rankings that minimize the pairwise disagreement error.
Specifically, we derive explicit sample complexity bounds for each of these algorithms to output an optimal ranking under interesting subclasses of the class of all acyclic pairwise preference matrices. We show that none of these popular algorithms is guaranteed to produce optimal rankings for all acyclic preference matrices. We then propose a novel support vector machine based rank aggregation algorithm that provably does so. In the second part of the thesis, we consider the setting where preferences may contain cycles. Here, finding a ranking that minimizes the pairwise disagreement error is in general NP-hard. However, even in the presence of cycles, one may wish to rank 'good' items ahead of the rest. We develop a framework for this setting using notions of winners based on tournament solution concepts from social choice theory. We first show that none of the existing algorithms is guaranteed to rank winners ahead of the rest for popular tournament solution based winners such as the top cycle, the Copeland set, and the Markov set. We propose three algorithms (matrix Copeland, unweighted Markov, and parametric Markov) which provably rank winners at the top for these popular tournament solutions. In addition to ranking winners at the top, we show that the rankings output by the matrix Copeland and the parametric Markov algorithms also minimize the pairwise disagreement error for certain classes of acyclic preference matrices. Finally, in the third part of the thesis, we consider the setting where the number of items to be ranked is large and it is impractical to obtain comparisons among all pairs. Here, one samples a small set of pairs uniformly at random and compares each pair a fixed number of times; in particular, the goal is to come up with good algorithms that sample comparisons among only O(n log n) item pairs (where n is the number of items).
Existing results for such settings either assume a noisy permutation model (under which there is a true underlying ranking and the outcome of every comparison differs from the true ranking with some fixed probability) or assume a BTL or Thurstone model. In contrast, we develop a general algorithmic framework based on ideas from matrix completion, termed low-rank pairwise ranking, which provably produces a good ranking by comparing only O(n log n) pairs, O(log n) times each. This holds not only for popular classes of models such as BTL and Thurstone, but also for much more general classes of models wherein a suitable transform of the pairwise probabilities leads to a low-rank matrix; this subsumes the guarantees of many previous algorithms in this setting. Overall, our results help to understand at a fundamental level the statistical properties of various algorithms for the problem of ranking from pairwise comparisons and, under various natural settings, lead to novel algorithms with improved statistical guarantees compared to existing algorithms for this problem.
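To make the role of the pairwise preference matrix concrete, here is a minimal sketch of two score-based rankings computed directly from such a matrix. It follows the abstract's convention that P[i][j] is the probability that item j beats item i. The function names, the tie-breaking rule, and the 3-item matrix are illustrative assumptions; the thesis's matrix Borda and matrix Copeland algorithms may differ in detail (for instance, they work from estimated rather than exact probabilities).

```python
def borda_scores(P):
    """Score of item j: sum over i != j of P[i][j], the probability that j beats i."""
    n = len(P)
    return [sum(P[i][j] for i in range(n) if i != j) for j in range(n)]


def copeland_scores(P):
    """Score of item j: number of items i that j beats with probability above 1/2."""
    n = len(P)
    return [sum(1 for i in range(n) if i != j and P[i][j] > 0.5) for j in range(n)]


def ranking(scores):
    """Item indices from highest to lowest score (ties broken by index, an assumption)."""
    return sorted(range(len(scores)), key=lambda j: (-scores[j], j))


# Hypothetical acyclic example: item 0 tends to beat items 1 and 2,
# and item 1 tends to beat item 2. Note P[i][j] + P[j][i] = 1.
P = [
    [0.5, 0.3, 0.2],
    [0.7, 0.5, 0.4],
    [0.8, 0.6, 0.5],
]

borda_ranking = ranking(borda_scores(P))
copeland_ranking = ranking(copeland_scores(P))
```

On this small acyclic matrix both scores order the items the same way; the abstract's point is that this agreement with the optimal ranking is not guaranteed for every acyclic preference matrix.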
12

Análise das constatações e recomendações das auditorias nas instituições públicas no Estado do Ceará no período 2008 a 2011 / Analysis of the findings and recommendations of audits in public institutions in the state of Ceará in the period 2008-2011

Cristina Maciel Aranha 20 December 2013
Using data from the corporate systems E-controle and Sistema Folha de Pagamento (the payroll system for civil servants of the state of Ceará), together with the management audit reports prepared by the Controladoria e Ouvidoria Geral do Estado between 2008 and 2011, this study developed models to estimate the probability of irregularities and audit recommendations occurring in the public bodies of the State of Ceará. Estimates from the binary dependent variable models indicate that about 31% of the audits performed showed an above-average number of irregularities, while 35% of these audits showed an above-average number of recommendations. Efficiency in budget commitments and the budget executed per civil servant of the audited body are associated with fewer irregularities, whereas the amount executed through agreements involving voluntary transfers of resources (convênios) is associated with a larger number of audit findings and recommendations.
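For orientation, a binary dependent variable model of the kind the abstract refers to is usually written as below. The abstract does not say which link function was used, so the symbols y_i, x_i, β, and F are illustrative assumptions rather than the study's actual specification. Here y_i = 1 could mark an audit with an above-average number of irregularities, and x_i would collect explanatory variables such as commitment efficiency or budget executed per civil servant.

```latex
\[
  \Pr(y_i = 1 \mid x_i) = F\!\left(x_i^{\top}\beta\right),
  \qquad
  F(z) = \frac{1}{1 + e^{-z}} \ \text{(logit)}
  \quad\text{or}\quad
  F(z) = \Phi(z) \ \text{(probit)}
\]
```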
13

Asymptotic Analysis for Nonlinear Spatial and Network Econometric Models

Xu, Xingbai, Xu 28 September 2016
No description available.
14

Contribution à la statistique spatiale et l'analyse de données fonctionnelles / Contribution to spatial statistics and functional data analysis

Ahmed, Mohamed Salem 12 December 2017
This thesis concerns statistical inference for spatial and/or functional data. We are interested in estimating the unknown parameters of certain models from random or non-random (stratified) samples composed of independent or spatially dependent variables. The specificity of the proposed methods is that they take into account the nature of the sample under study (stratified or spatial). We begin by studying data valued in an infinite-dimensional space, so-called "functional data". First, we study a functional binary choice model in a case-control or choice-based sample design context (endogenous stratification). The specificity of this study is that the proposed method takes the sampling scheme into account. We describe a conditional likelihood function under the sampling distribution and a dimension reduction strategy to define a feasible conditional maximum likelihood estimator of the model. Asymptotic properties of the proposed estimators, as well as their application to simulated and real data, are given. Second, we explore a functional linear spatial autoregressive model whose particularity lies in the functional nature of the explanatory variable and the structure of the spatial dependence. The estimation procedure consists of reducing the infinite dimension of the functional variable and maximizing a quasi-likelihood function. We establish the consistency and asymptotic normality of the estimator. The usefulness of the methodology is illustrated via simulations and an application to real data.
In the second part of the thesis, we address estimation and prediction problems for real-valued spatial random variables. We start by generalizing the k-nearest neighbors (k-NN) method to predict a spatial process at non-observed locations using covariates. The specificity of the proposed k-NN predictor is that it is flexible and allows for heterogeneity in the covariate. We establish the almost complete convergence, with rates, of the spatial predictor, and illustrate its performance on simulated and environmental data. In addition, we generalize the partially linear probit model for independent data to the spatial case. We use a linear process for the disturbances, which allows for various spatial dependencies, and propose a semiparametric estimation approach based on weighted likelihood and the generalized method of moments. We establish the consistency and asymptotic distribution of the proposed estimators and investigate their finite-sample performance on simulated data.
We end with an application of spatial binary choice models to identify UADT (upper aerodigestive tract) cancer risk factors in the north region of France, which has the highest incidence and mortality rates for this cancer in the country.
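As a point of reference for the k-NN predictor mentioned above, the following is a minimal sketch of the textbook k-nearest-neighbors average applied to spatial sites. It is not the thesis's predictor: it ignores covariates and covariate heterogeneity, uses plain Euclidean distance between site coordinates, and the function name, sites, and values are all hypothetical.

```python
import math


def knn_predict(sites, values, target, k):
    """Average the values observed at the k sites closest to `target`."""
    if not 1 <= k <= len(sites):
        raise ValueError("k must be between 1 and the number of observed sites")
    # Sort observed sites by Euclidean distance to the target location.
    order = sorted(range(len(sites)), key=lambda i: math.dist(sites[i], target))
    nearest = order[:k]
    return sum(values[i] for i in nearest) / k


# Hypothetical observations: three sites near the origin and one far away.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
values = [1.0, 2.0, 3.0, 10.0]

# Predict at an unobserved location using the three nearest sites.
prediction = knn_predict(sites, values, target=(0.2, 0.2), k=3)
```

With k = 3 the distant site at (5, 5) is excluded, so the prediction is the mean of the three nearby values.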
