21

Alguns métodos de amostragem para populações raras e agrupadas / Some sampling methods for rare and clustered populations

Luis Henrique Teixeira Alves Affonso 11 April 2008 (has links)
In many scientific surveys, data collection is difficult because the object of study is hard to observe, as in studies of individuals with rare diseases, individuals with elusive behaviour, or individuals that are geographically sparse. This work studies sampling schemes for rare populations, with special attention to populations that are both rare and clustered. We examine adaptive cluster sampling and two-stage sequential sampling in depth, giving the reader the theoretical background needed to understand the foundations of the techniques and the efficiency of their estimators as assessed in simulation studies. In our simulation studies, two-stage sequential sampling shows no loss of efficiency when the elements are less clustered. However, the comparative studies reveal that when the population is rare and clustered, adaptive cluster sampling is more efficient for most of the parameterizations used. We conclude with recommendations for each situation regarding what is known about the rarity and clustering of the population under study.
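To make the design concrete, the sketch below (Python, NumPy only) simulates a rare, clustered population on a small grid and applies adaptive cluster sampling with a modified Hansen-Hurwitz estimator. The grid size, cluster layout, neighbourhood rule and sample size are invented for illustration and are not taken from the thesis; the estimator shown is one standard choice for this design, not necessarily the one the author evaluates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rare, clustered population on a 20x20 grid:
# a few hotspots hold all the individuals, most cells are empty.
side = 20
pop = np.zeros((side, side), dtype=int)
for _ in range(3):                                  # three clusters
    cx, cy = rng.integers(2, side - 2, size=2)
    for _ in range(30):                             # individuals near each centre
        dx, dy = rng.integers(-2, 3, size=2)
        pop[cx + dx, cy + dy] += 1

def neighbours(i, j):
    """4-neighbourhood of cell (i, j), clipped to the grid."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < side and 0 <= nj < side:
            yield ni, nj

def network(cell):
    """Cells reachable from `cell` through cells meeting the condition y > 0."""
    if pop[cell] == 0:
        return {cell}
    seen, stack = {cell}, [cell]
    while stack:
        for nb in neighbours(*stack.pop()):
            if nb not in seen and pop[nb] > 0:
                seen.add(nb)
                stack.append(nb)
    return seen

def network_mean(cell):
    cells = list(network(cell))
    rows = [c[0] for c in cells]
    cols = [c[1] for c in cells]
    return pop[rows, cols].mean()

# Initial simple random sample of cells; the adaptive design then adds the
# whole network of every sampled cell that meets the condition y > 0.
n0 = 25
flat = rng.choice(side * side, size=n0, replace=False)
initial = [(k // side, k % side) for k in flat]

# Modified Hansen-Hurwitz estimator of the mean count per cell:
# each initial unit is replaced by the mean of its network.
w = [network_mean(c) for c in initial]
print(f"true mean per cell        : {pop.mean():.3f}")
print(f"adaptive cluster estimate : {np.mean(w):.3f}")
```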
22

Cardinality estimation using sample views with quality assurance

Larson, Per-Ake, Lehner, Wolfgang, Zhou, Jingren, Zabback, Peter 13 September 2022 (has links)
Accurate cardinality estimation is critically important to high-quality query optimization. It is well known that conventional cardinality estimation based on histograms or similar statistics may produce extremely poor estimates in a variety of situations, for example, queries with complex predicates, correlation among columns, or predicates containing user-defined functions. In this paper, we propose a new, general cardinality estimation technique that combines random sampling and materialized view technology to produce accurate estimates even in these situations. As a major innovation, we exploit feedback information from query execution and process control techniques to assure that estimates remain statistically valid when the underlying data changes. Experimental results based on a prototype implementation in Microsoft SQL Server demonstrate the practicality of the approach and illustrate the dramatic effects improved cardinality estimates may have.
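As a rough illustration of the idea (not of the SQL Server implementation), the following Python sketch materializes a uniform sample of a table as a "sample view", estimates the cardinality of a correlated predicate by scaling its selectivity on the sample, and applies a simple binomial control-limit check on query feedback as a stand-in for the paper's statistical process-control machinery. The table, predicate and 3-sigma threshold are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy "table": two correlated columns, the kind of case where histogram-based
# estimators that assume column independence go badly wrong.
n_rows = 1_000_000
a = rng.integers(0, 100, size=n_rows)
b = a + rng.integers(0, 5, size=n_rows)            # b strongly correlated with a
table = np.column_stack([a, b])

# The "sample view": a uniform random sample of the table, materialized once.
sample_rate = 0.01
idx = rng.choice(n_rows, size=int(n_rows * sample_rate), replace=False)
sample_view = table[idx]

def estimate_cardinality(predicate):
    """Scale the predicate's selectivity on the sample view up to the full table."""
    sel = predicate(sample_view).mean()
    return sel * n_rows

pred = lambda t: (t[:, 0] > 50) & (t[:, 1] < 60)   # complex, correlated predicate
est = estimate_cardinality(pred)
true = int(pred(table).sum())                      # known after query execution
print(f"estimated: {est:,.0f}   true: {true:,}")

# Feedback-based quality check (a simple stand-in for the paper's process-control
# techniques): if the observed selectivity falls outside a binomial control band
# around the sample-based estimate, flag the sample view for refresh.
m = len(sample_view)
p_hat = est / n_rows
se = np.sqrt(max(p_hat * (1 - p_hat), 1e-12) / m)
outdated = abs(true / n_rows - p_hat) > 3 * se
print("sample view flagged for refresh:", bool(outdated))
```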
23

Exploiting self-monitoring sample views for cardinality estimation

Larson, Per-Ake, Lehner, Wolfgang, Zhou, Jingren, Zabback, Peter 13 December 2022 (has links)
Good cardinality estimates are critical for generating good execution plans during query optimization. Complex predicates, correlations between columns, and user-defined functions are extremely hard to handle with the traditional histogram approach. This demo illustrates the use of sample views for cardinality estimation as prototyped in Microsoft SQL Server. We show the creation of sample views, discuss how they are exploited during query optimization, and explain their potential effect on query plans. We also show our implementation of maintenance policies using statistical quality control techniques based on query feedback.
24

Modeling Confidence and Response Time in Brightness Discrimination: Testing Models of the Decision Process with Controlled Variability in Stimulus Strength

Intermaggio, Victor G. 22 June 2012 (has links)
No description available.
25

On Methods for Real Time Sampling and Distributions in Sampling

Meister, Kadri January 2004 (has links)
This thesis is composed of six papers, all dealing with the issue of sampling from a finite population. We consider two different topics: real-time sampling and distributions in sampling. The main focus is on Papers A–C, where a somewhat special sampling situation referred to as real-time sampling is studied. Here a finite population passes, or is passed by, the sampler. There is no list of the population units available, and for every unit the sampler must decide whether or not to sample it at the moment he/she meets the unit. We focus on the problem of finding suitable sampling methods for the described situation, and some new methods are proposed. Throughout, we try not to sample units close to each other too often, i.e. we sample with negative dependencies. Here the correlations between the inclusion indicators, called sampling correlations, play an important role. The new methods are evaluated by means of a simulation study and asymptotic calculations, mainly in comparison with standard Bernoulli sampling, using the sample mean as an estimator of the population mean. Assuming a stationary population model with decreasing autocorrelations, we derive the form of the nearly optimal sampling correlations by asymptotic calculations, under some restrictions on the sampling correlations. The largest gains in efficiency come from methods that give negatively correlated indicator variables such that the correlation sum is small and the sampling correlations are equal for units up to lag m apart and zero thereafter. Since the proposed methods are based on sequences of dependent Bernoulli variables, an important part of the study is devoted to the problem of how to generate such sequences; the correlation structure of these sequences is also studied.

The remainder of the thesis consists of three diverse papers, Papers D–F, where distributional properties in survey sampling are considered. Paper D is concerned with unified statistical inference, where both the model for the population and the sampling design are taken into account when considering the properties of an estimator; the framework of the sampling design as a multivariate distribution is used to outline two-phase sampling. In Paper E, we give probability functions for different sampling designs, such as the conditional Poisson, Sampford and Pareto designs, and discuss methods for sampling by using the probability function of a sampling design. Paper F focuses on the design-based distributional characteristics of the π-estimator and its variance estimator: we give formulae for the higher-order moments and cumulants of the π-estimator, as well as formulae for the design-based variance of the variance estimator and the covariance of the π-estimator and its variance estimator.
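The contrast between Bernoulli sampling and designs with negatively correlated inclusion indicators can be illustrated with the short simulation below. Systematic sampling with a random start is used here only as a familiar real-time rule with strong negative sampling correlations; it stands in for, and is not one of, the new methods proposed in the thesis. The AR(1) population, inclusion probability and replication counts are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stationary population with decreasing autocorrelation (an AR(1) process),
# the kind of setting in which negatively dependent designs pay off.
N = 10_000
phi, y = 0.8, np.empty(N)
y[0] = rng.normal()
for t in range(1, N):
    y[t] = phi * y[t - 1] + rng.normal()

p = 0.05                              # target inclusion probability
k = int(1 / p)
reps = 2_000
bern_est, syst_est = [], []

for _ in range(reps):
    # Bernoulli sampling: an independent coin flip as each unit passes by.
    take = rng.random(N) < p
    if take.any():
        bern_est.append(y[take].mean())

    # Systematic sampling: a real-time rule with strongly negatively correlated
    # inclusion indicators (every k-th unit after a random start).
    start = rng.integers(k)
    syst_est.append(y[start::k].mean())

print(f"variance of sample mean, Bernoulli : {np.var(bern_est):.5f}")
print(f"variance of sample mean, systematic: {np.var(syst_est):.5f}")
```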
26

Procedimentos sequenciais Bayesianos aplicados ao processo de captura-recaptura / Bayesian sequential procedures applied to the capture-recapture process

Santos, Hugo Henrique Kegler dos 30 May 2014 (has links)
In this work, we study the Bayes sequential decision procedure applied to the capture-recapture process with fixed sample sizes, in order to estimate the size of a finite, closed population. We present the statistical model, review Bayesian decision theory by presenting the pure decision problem, the statistical decision problem and the sequential decision procedure, and illustrate the theoretical methods discussed using simulated data.
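The Bayesian updating step behind such a procedure can be sketched as follows for a single two-sample capture-recapture experiment: a hypergeometric likelihood for the number of recaptures combined with a (truncated) uniform prior on the population size N. This shows only the estimation component, with invented numbers; the thesis embeds it in a sequential decision procedure with losses and stopping rules, which is not shown here. The sketch uses NumPy and SciPy.

```python
import numpy as np
from scipy.stats import hypergeom

rng = np.random.default_rng(3)

# Simulated closed population and a two-sample capture-recapture experiment.
N_true, n1, n2 = 500, 80, 100
marked = rng.choice(N_true, size=n1, replace=False)     # first capture, marked
second = rng.choice(N_true, size=n2, replace=False)     # second capture
m = len(np.intersect1d(marked, second))                 # recaptures

# Posterior for N on a grid, under a (truncated) uniform prior.
N_grid = np.arange(max(n1 + n2 - m, 1), 3000)
like = hypergeom.pmf(m, N_grid, n1, n2)                 # P(m | N, n1, n2)
post = like / like.sum()

post_mean = float((N_grid * post).sum())
ci = N_grid[np.searchsorted(post.cumsum(), [0.025, 0.975])]
print(f"recaptures m = {m}")
print(f"posterior mean of N = {post_mean:.0f}, 95% credible interval = {ci}")
```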
27

Metamodel-Based Multidisciplinary Design Optimization of Automotive Structures

Ryberg, Ann-Britt January 2017 (has links)
Multidisciplinary design optimization (MDO) can be used in computer-aided engineering (CAE) to efficiently improve and balance the performance of automotive structures. However, large-scale MDO is not yet generally integrated within automotive product development due to several challenges, of which excessive computing time is the most important. In this thesis, a metamodel-based MDO process that fits normal company organizations and CAE-based development processes is presented. The introduction of global metamodels offers a means to increase computational efficiency and distribute work without implementing complicated multi-level MDO methods. The presented MDO process is proven to be efficient for thickness optimization studies with the objective of minimizing mass. It can also be used for spot-weld optimization if the models are prepared correctly. A comparison of different methods reveals that topology optimization, which requires less model preparation and computational effort, is an alternative if load cases involving simulations of linear systems are judged to be of major importance. A technical challenge when performing metamodel-based design optimization is the lack of accuracy of metamodels representing complex responses that include discontinuities, which are common in, for example, crashworthiness applications. The decision boundary from a support vector machine (SVM) can be used to identify the border between different types of deformation behaviour. In this thesis, this information is used to improve the accuracy of feedforward neural network metamodels. Three different approaches are tested: splitting the design space and fitting separate metamodels for the different regions, adding estimated guiding samples to the fitting set along the boundary before a global metamodel is fitted, and using a special SVM-based sequential sampling method. Substantial improvements in accuracy are observed, and it is found that implementing SVM-based sequential sampling and estimated guiding samples can result in successful optimization studies in cases where more conventional methods fail.
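A minimal sketch of the "split the design space" idea is given below: a toy one-dimensional response with a discontinuity between two behaviour regimes, an SVM decision boundary separating the regimes, and one feedforward-network metamodel per region compared against a single global metamodel. The response function, sample sizes and scikit-learn models (SVC, MLPRegressor) are stand-ins chosen for illustration, not the crash models or metamodel code used in the thesis.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)

# Toy crash-like response: smooth within each regime, discontinuous at x = 0.5
# (standing in for two different deformation modes).
def response(x):
    return np.where(x < 0.5, 10 * x, 5 + 2 * (x - 0.5))

X = rng.random((200, 1))
y = response(X[:, 0])
regime = (X[:, 0] >= 0.5).astype(int)          # behaviour label from the simulations

# One global metamodel over the whole design space.
global_net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000,
                          random_state=0).fit(X, y)

# SVM decision boundary between the two behaviour types, then one
# metamodel per region (the "split the design space" approach).
svm = SVC(kernel="rbf").fit(X, regime)
nets = {r: MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000,
                        random_state=0).fit(X[regime == r], y[regime == r])
        for r in (0, 1)}

X_test = np.linspace(0, 1, 500).reshape(-1, 1)
y_test = response(X_test[:, 0])
pred_global = global_net.predict(X_test)
region = svm.predict(X_test)
pred_split = np.where(region == 0, nets[0].predict(X_test), nets[1].predict(X_test))

rmse = lambda p: np.sqrt(np.mean((p - y_test) ** 2))
print(f"RMSE, single global metamodel : {rmse(pred_global):.3f}")
print(f"RMSE, SVM-split metamodels    : {rmse(pred_split):.3f}")
```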
28

Dynamic Resampling for Preference-based Evolutionary Multi-objective Optimization of Stochastic Systems: Improving the efficiency of time-constrained optimization

Siegmund, Florian January 2016 (has links)
In preference-based Evolutionary Multi-objective Optimization (EMO), the decision maker is looking for a diverse, but locally focused non-dominated front in a preferred area of the objective space, as close as possible to the true Pareto-front. Since solutions found outside the area of interest are considered less important or even irrelevant, the optimization can focus its efforts on the preferred area and find the solutions that the decision maker is looking for more quickly, i.e., with fewer simulation runs. This is particularly important if the available time for optimization is limited, as is the case in many real-world applications. Although previous studies using this kind of guided search with preference information, for example with the R-NSGA-II algorithm, have shown positive results, only very few of them considered the stochastic outputs of simulated systems. In the literature, this phenomenon of stochastic evaluation functions is sometimes called noisy optimization. If an EMO algorithm is run without any countermeasure to noisy evaluation functions, the performance will deteriorate, compared to the case where the true mean objective values are known. While, in general, static resampling of solutions to reduce the uncertainty of all evaluated design solutions can allow EMO algorithms to avoid this problem, it will significantly increase the required simulation time/budget, as many samples will be wasted on candidate solutions which are inferior. In comparison, a Dynamic Resampling (DR) strategy can allow the exploration and exploitation trade-off to be optimized, since the required accuracy about objective values varies between solutions. In a dense, converged population, it is important to know the accurate objective values, whereas noisy objective values are less harmful when an algorithm is exploring the objective space, especially early in the optimization process. Therefore, a well-designed Dynamic Resampling strategy which resamples the solutions carefully, according to the resampling need, can help an EMO algorithm achieve better results than a static resampling allocation. While there are abundant studies in Simulation-based Optimization that considered Dynamic Resampling, the survey done in this study has found that there is no related work that considered how combinations of Dynamic Resampling and preference-based guided search can further enhance the performance of EMO algorithms, especially if the problems under study involve computationally expensive evaluations, like production systems simulation.

The aim of this thesis is therefore to study, design and compare new combinations of preference-based EMO algorithms with various DR strategies, in order to improve the solution quality found by simulation-based multi-objective optimization with stochastic outputs, under a limited function evaluation or simulation budget. Specifically, based on the advantages and flexibility offered by interactive, reference point-based approaches, studies of the performance enhancements of R-NSGA-II when augmented with various DR strategies, with increasing degrees of statistical sophistication, as well as several adaptive features in terms of optimization parameters, have been made. The research results have clearly shown that optimization results can be improved if a hybrid DR strategy is used and adaptive algorithm parameters are chosen according to the noise level and problem complexity.
In the case of a limited simulation budget, the results support the conclusion that both decision-maker preferences and DR should be used at the same time to achieve the best results in simulation-based multi-objective optimization.
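The core trade-off, spending few replicates on noisy evaluations early in the search and more as it converges, can be illustrated with the simple single-objective sketch below. A (1+1)-style search with a fixed budget of noisy evaluations is run once with static resampling and once with a time-based dynamic allocation. This is a stand-in for, not a reproduction of, the R-NSGA-II hybrid DR strategies studied in the thesis; the objective, noise level, budget and allocation rule are all invented.

```python
import numpy as np

rng = np.random.default_rng(5)

def noisy_f(x, sigma=1.0):
    """Stochastic objective: true value is the sphere function plus noise."""
    return np.sum(x ** 2) + rng.normal(0, sigma)

def evolve(budget, samples_at):
    """(1+1)-style search under a fixed budget of noisy evaluations.

    `samples_at(progress)` returns how many replicates to spend per solution,
    as a function of optimization progress in [0, 1] (the dynamic part).
    """
    x = rng.normal(size=5)
    n0 = samples_at(0.0)
    fx = np.mean([noisy_f(x) for _ in range(n0)])
    used = n0
    while used < budget:
        n = samples_at(used / budget)
        cand = x + rng.normal(scale=0.3, size=5)
        fc = np.mean([noisy_f(cand) for _ in range(n)])
        used += n
        if fc < fx:
            x, fx = cand, fc
    return np.sum(x ** 2)                       # true objective of the final solution

budget = 3_000
static = lambda progress: 5                     # same effort for every solution
dynamic = lambda progress: 1 + int(9 * progress ** 2)   # cheap early, accurate late

res_static = [evolve(budget, static) for _ in range(30)]
res_dynamic = [evolve(budget, dynamic) for _ in range(30)]
print(f"static resampling : mean true objective {np.mean(res_static):.3f}")
print(f"dynamic resampling: mean true objective {np.mean(res_dynamic):.3f}")
```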
29

自變數有誤差的邏輯式迴歸模型:估計、實驗設計及序貫分析 / Logistic regression models when covariates are measured with errors: Estimation, design and sequential method

簡至毅, Chien, Chih Yi Unknown Date (has links)
In this thesis, we focus on the estimation of unknown parameters, experimental designs and sequential methods in both prospective and retrospective logistic regression models when covariates are measured with errors. Imprecise measurement of exposure happens very often in practice, for example in retrospective epidemiological studies, whether because of the difficulty or the cost of measuring. It is known that imprecisely measured variables can result in biased coefficient estimates in a regression model and may therefore lead to incorrect inference; this is an important issue when the effects of the variables are of primary interest. When considering a prospective logistic regression model, we derive asymptotic results for the estimators of the regression parameters when there are mismeasured covariates. If the measurement error satisfies certain assumptions, we show that the estimators are strongly consistent and have the same asymptotic normal distribution as the estimators obtained when there is no measurement error. Contrary to the traditional assumption on measurement error, which is mainly used for proving large-sample properties, we assume that the measurement error decays gradually at a certain rate as new observations are added to the model. This kind of assumption can be fulfilled when the usual replicate-observation method is used to dilute the magnitude of the measurement errors, and it is therefore also more useful from a practical viewpoint. Moreover, independence of the measurement error and the covariate is not required in our theorems. An experimental design with measurement error satisfying the required decay rate is introduced, and a sequential design is used to obtain a procedure that saves both time and cost. In addition, this assumption allows us to apply sequential sampling, which is popular in clinical trials, to such a measurement-error logistic regression model; clearly, the sequential method cannot be applied under the assumption, common in most of the literature, that the measurement errors decay uniformly as the sample size increases. Therefore, a sequential estimation procedure based on MLEs and such moment conditions is proposed and shown to be asymptotically consistent and efficient. Case-control studies are broadly used in clinical trials and epidemiological studies, and it can be shown that the odds ratio can be consistently estimated with some exposure variables based on logistic models (see Prentice and Pyke (1979)). A two-stage case-control sampling scheme is employed to construct a confidence region for the slope coefficient beta, and the necessary sample size is calculated for a given pre-determined level. Furthermore, we consider measurement error in the covariates of a case-control retrospective logistic regression model and derive asymptotic results for the maximum likelihood estimators (MLEs) of the regression coefficients under some moment conditions on the measurement errors; under such conditions, the MLEs can be shown to be strongly consistent, asymptotically unbiased and asymptotically normally distributed. Simulation results for the proposed two-stage procedures are obtained, and numerical studies and real data are used to verify the theoretical results under different measurement error scenarios.
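The attenuation caused by covariate measurement error, and the way replicate measurements dilute it, can be seen in the short simulation below: a logistic model is fitted once on a single noisy measurement of the covariate and then on the mean of an increasing number of replicates. This illustrates only the replicate-observation idea mentioned in the abstract, not the thesis's sequential procedure or two-stage design; the parameters and the use of scikit-learn's LogisticRegression are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)

n, beta0, beta1 = 5_000, -1.0, 2.0
x = rng.normal(size=n)                                   # true covariate
p = 1 / (1 + np.exp(-(beta0 + beta1 * x)))
ybin = rng.binomial(1, p)

def fit_slope(w):
    """MLE slope from a logistic fit on the observed covariate w."""
    model = LogisticRegression(C=1e10).fit(w.reshape(-1, 1), ybin)
    return model.coef_[0, 0]

sigma_u = 1.0                                            # measurement-error sd
w1 = x + rng.normal(0, sigma_u, size=n)                  # a single noisy measurement
print(f"true slope                   : {beta1:.2f}")
print(f"naive fit, 1 measurement     : {fit_slope(w1):.2f}   (attenuated)")

# Replicate measurements and average them: the error variance of the mean is
# sigma_u^2 / k, i.e. it decays as more replicates are taken.
for k in (4, 16, 64):
    wk = x[:, None] + rng.normal(0, sigma_u, size=(n, k))
    print(f"fit on mean of {k:2d} replicates : {fit_slope(wk.mean(axis=1)):.2f}")
```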
30

A theoretical and experimental dissociation of two models of decision‐making

Carland, Matthew A. 08 1900 (has links)
Decision-making is a computational process of fundamental importance to many aspects of animal behavior. The prevailing model in the experimental study of decision-making is the drift-diffusion model (DDM), which has a long history and accounts for a broad range of behavioral and neurophysiological data. However, an alternative model, the urgency-gating model, has been offered which can account equally well for much of the same data in a more parsimonious and theoretically sound manner. In what follows, we first trace the origins and development of the DDM and give a brief overview of the manner in which it has supplied an explanatory framework for a large number of behavioral and physiological studies in the domain of decision-making. In so doing, we attempt to build a strong and clear case for its strengths so that it can be fairly and rigorously compared to potential alternative models. We then re-examine a number of the implicit and explicit theoretical assumptions made by the drift-diffusion model, as well as highlight some of its empirical shortcomings. This analysis serves as the contextual backdrop for our introduction and discussion of the urgency-gating model. Finally, we present a novel experiment whose methodological design affords a decisive empirical dissociation of the two models; the results illustrate the empirical and theoretical shortcomings of the drift-diffusion model and instead offer clear support for the urgency-gating model. We finish by discussing the potential for the urgency-gating model to shed light on a number of clinical disorders, highlighting several future directions for research.
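A schematic simulation of the two model classes is sketched below: a drift-diffusion process that integrates evidence to a fixed bound, and an urgency-gating process that low-pass filters the momentary evidence and multiplies it by a linearly growing urgency signal. All parameters are invented, and the real models include further components (non-decision time, variable gain, collapsing bounds) that are omitted here.

```python
import numpy as np

rng = np.random.default_rng(7)

dt, bound, n_trials = 0.001, 1.0, 200

def drift_diffusion(drift, noise=1.0):
    """Integrate noisy evidence to a fixed bound; return (choice, RT in seconds)."""
    x, t = 0.0, 0.0
    while abs(x) < bound:
        x += drift * dt + noise * np.sqrt(dt) * rng.normal()
        t += dt
    return x > 0, t

def urgency_gating(drift, noise=1.0, tau=0.2):
    """Low-pass filter the momentary evidence (no long-term integration) and
    multiply it by an urgency signal that grows with time; decide when the
    product crosses the bound."""
    e, t = 0.0, 0.0
    while True:
        t += dt
        sample = drift + noise / np.sqrt(dt) * rng.normal()   # momentary evidence
        e += (dt / tau) * (sample - e)                         # leaky filter
        if abs(e * t) >= bound:                                # urgency grows ~ t
            return e > 0, t

for model in (drift_diffusion, urgency_gating):
    out = [model(0.5) for _ in range(n_trials)]
    acc = np.mean([c for c, _ in out])
    rt = np.mean([t for _, t in out])
    print(f"{model.__name__:16s}: accuracy {acc:.2f}, mean RT {rt:.2f} s")
```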
