201 |
Variational Estimators in Statistical Multiscale Analysis (Li, Housen, 17 February 2016)
No description available.
|
202 |
Heuristics for offline rectangular packing problems (Ortmann, Frank, 2010)
Thesis (PhD (Logistics))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: Packing problems are common in industry and there is a large body of literature on the subject.
Two packing problems are considered in this dissertation: the strip packing problem and the
bin packing problem. The aim in both problems is to pack a specified set of small items, the
dimensions of which are all known prior to packing (hence giving rise to an offline problem),
into larger objects, called bins. The strip packing problem requires packing these items into a
single bin, one dimension of which is unbounded (the bin is therefore referred to as a strip). In
two dimensions the width of the strip is typically speci ed and the aim is to pack all the items
into the strip, without overlapping, so that the resulting packing height is a minimum. The bin
packing problem, on the other hand, is the problem of packing the items into a specified set of
bins (all of whose dimensions are bounded) so that the wasted space remaining in the bins (which
contain items) is a minimum. The bins may all have the same dimensions (in which case the
problem is known as the single bin size bin packing problem), or may have different dimensions,
in which case the problem is called the multiple bin size bin packing problem (MBSBPP). In
two dimensions the wasted space is the sum total of areas of the bins (containing items) not
covered by items.
Many solution methodologies have been developed for the above-mentioned problems, but the scope
of the solution methodologies considered in this dissertation is restricted to heuristics. Packing
heuristics follow a fixed set of rules to pack items in such a manner as to find good, feasible
(but not necessarily optimal) solutions to the strip and bin packing problems within as short
a time span as possible. Three types of heuristics are considered in this dissertation: (i) those
that pack items into levels (the heights of which are determined by the heights of the tallest
items in these levels) in such a manner that all items are packed along the bottom of the level,
(ii) those that pack items into levels in such a manner that items may be packed anywhere
between the horizontal boundaries that define the levels, and (iii) those heuristics that do not
restrict the packing of items to levels. These three classes of heuristics are known as level
algorithms, pseudolevel algorithms and plane algorithms, respectively.
A computational approach is adopted in this dissertation in order to evaluate the performances
of 218 new heuristics for the strip packing problem in relation to 34 known heuristics from
the literature with respect to a set of 1 170 benchmark problem instances. It is found that
the new level-packing heuristics do not yield significantly better solutions than the known
heuristics, but several of the newly proposed pseudolevel heuristics do yield significantly better
results than the best of the known pseudolevel heuristics in terms of both packing densities
achieved and computation times expended. During the evaluation of the plane algorithms two
classes of heuristics were identified for packing problems, namely sorting-dependent and
sorting-independent algorithms. Two new sorting techniques are proposed for the sorting-independent
algorithms and one of them yields the best-performing heuristic overall. A new heuristic approach for the MBSBPP is also proposed, which may be combined with
level and pseudolevel algorithms for the strip packing problem in order to find solutions to the
problem very rapidly. The best-performing plane-packing heuristic is modified to pack items
into the largest bins first, followed by an attempted repacking of the items in those bins into
smaller bins with the aim of further minimising wasted space. It is found that the resulting
plane-packing algorithm yields the best results in terms of time and packing density, but that
the solution differences between pseudolevel algorithms are not as marked for the MBSBPP as
for the strip packing problem. / AFRIKAANSE OPSOMMING: Inpakkingsprobleme kom algemeen in die industrie voor en daar is 'n aansienlike volume literatuur
oor hierdie onderwerp. Twee inpakkingsprobleme word in hierdie proefskrif oorweeg,
naamlik die strook-inpakkingsprobleem en die houer-inpakkingsprobleem. In beide probleme is
die doel om 'n gespesifiseerde versameling klein voorwerpe, waarvan die dimensies almal voordat
inpakking plaasvind, bekend is (en die probleem dus 'n sogenaamde aflyn-probleem is), in
een of meer groter houers te pak. In die strook-inpakkingsprobleem word hierdie voorwerpe
in een houer, waarvan een dimensie onbegrens is, ingepak (hierdie houer word dus 'n strook
genoem). In twee dimensies word die wydte van die strook gewoonlik gespesifiseer en is die doel
om al die voorwerpe sonder oorvleueling op só 'n manier in die strook te pak dat die totale
inpakkingshoogte geminimeer word. In die houer-inpakkingsprobleem, daarenteen, is die doel
om die voorwerpe op só 'n manier in 'n gespesifiseerde aantal houers (waarvan al die dimensies
begrens is) te pak dat die vermorste of oorblywende ruimte in die houers (wat wel voorwerpe
bevat) 'n minimum is. Die houers mag almal dieselfde dimensies hê (in welke geval die probleem
as die enkelgrootte houer-inpakkingsprobleem bekend staan), of mag verskillende dimensies hê
(in welke geval die probleem as die veelvuldige-grootte houer-inpakkingsprobleem bekend staan,
afgekort as VGHIP). In twee dimensies word die vermorste ruimte geneem as die somtotaal van
daardie deelareas van die houers (wat wel voorwerpe bevat) waar daar geen voorwerpe geplaas
word nie.
Verskeie oplossingsmetodologieë is al vir die bogenoemde twee inpakkingsprobleme ontwikkel,
maar die bestek van die metodologieë wat in hierdie proefskrif oorweeg word, word beperk tot
heuristieke. 'n Inpakkingsheuristiek volg 'n vaste stel reëls waarvolgens voorwerpe in houers
gepak word om so spoedig moontlik goeie, toelaatbare (maar nie noodwendig optimale) oplossings
tot die strook-inpakkingsprobleem en die houer-inpakkingsprobleem te vind. Drie tipes
inpakkingsheuristieke word in hierdie proefskrif oorweeg, naamlik (i) heuristieke wat voorwerpe
langs die onderste rande van horisontale vlakke in die houers pak (die hoogtes van hierdie vlakke
word bepaal deur die hoogtes van die hoogste item in elke vlak), (ii) heuristieke wat voorwerpe
op enige plek binne horisontale stroke in die houers pak, en (iii) heuristieke waar inpakking
nie volgens horisontale vlakke of stroke beperk word nie. Hierdie drie klasse heuristieke staan
onderskeidelik as vlakalgoritmes, pseudo-vlakalgoritmes en platvlakalgoritmes bekend.
'n Berekeningsbenadering word in hierdie proefskrif gevolg deur die werkverrigting van die
218 nuwe heuristieke vir die strook-inpakkingsprobleem met die werkverrigting van 34 bekende
heuristieke uit die literatuur te vergelyk, deur al die heuristieke op 1 170 toetsprobleme toe
te pas. Daar word bevind dat die nuwe vlakalgoritmes nie 'n noemenswaardige verbetering in
oplossingskwaliteit in vergeleke met soortgelyke bestaande algoritmes in die literatuur lewer nie,
maar dat verskeie nuwe pseudo-vlakalgoritmes wel noemenswaardige verbeteringe in terme van
beide inpakkingsdigthede en oplossingstye in vergeleke met die beste bestaande algoritmes in die
literatuur lewer. Assessering van die platvlakalgoritmes het gelei tot die identifikasie van twee
deelklasse van algoritmes, naamlik sorteringsafhanklike- en sorteringsonafhanklike algoritmes.
Twee nuwe sorteringstegnieke word ook vir die deelklas van sorteringsonafhanklike algoritmes
voorgestel, en een van hulle lewer die algeheel beste inpakkingsheuristiek.
'n Nuwe heuristiese benadering word ook vir die VGHIP ontwikkel. Hierdie benadering kan
met vlak- of pseudo-vlakalgoritmes vir die strook-inpakkingsprobleem gekombineer word om
baie vinnig oplossings vir die VGHIP te vind. Die beste platvlakheuristiek vir die strook-inpakkingsprobleem
word ook aangepas om voorwerpe eers in die grootste houers te pak, en
daarna in kleiner houers te herpak met die doel om vermorste ruimte verder te minimeer.
Daar word bevind dat die resulterende platvlakalgoritme die beste resultate in terme van
beide inpakkingsdigtheid en oplossingstyd lewer, maar dat oplossingsverskille tussen die pseudo-vlakalgoritmes
nie so opmerklik vir die VGHIP is as wat die geval met die strook-inpakkingsprobleem
was nie.
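To make the type-(i) level heuristics described in the English abstract above concrete, the following is a minimal Python sketch of a classic first-fit decreasing-height (FFDH) level algorithm for two-dimensional strip packing. It is an illustrative baseline only, not one of the 218 heuristics evaluated in the thesis; the item representation and strip width are assumptions.

```python
from typing import List, Tuple

def ffdh_strip_pack(items: List[Tuple[float, float]], strip_width: float):
    """First-fit decreasing height: a classic level heuristic.

    Each item is a (width, height) pair. Items are sorted by decreasing
    height and packed left to right along the bottom of levels; a new level
    is opened on top of the previous one whenever an item fits in no
    existing level, so each level's height is set by its tallest item.
    Returns the total packing height and the item placements.
    """
    levels = []       # each level: [y_base, level_height, used_width]
    placements = []   # (x, y, width, height) per item
    for w, h in sorted(items, key=lambda it: it[1], reverse=True):
        if w > strip_width:
            raise ValueError("item wider than the strip")
        for level in levels:
            if level[2] + w <= strip_width:        # first level with room
                placements.append((level[2], level[0], w, h))
                level[2] += w
                break
        else:                                      # open a new level on top
            y_base = levels[-1][0] + levels[-1][1] if levels else 0.0
            levels.append([y_base, h, w])
            placements.append((0.0, y_base, w, h))
    total_height = levels[-1][0] + levels[-1][1] if levels else 0.0
    return total_height, placements

if __name__ == "__main__":
    demo_items = [(3, 2), (4, 5), (2, 2), (5, 3), (1, 4)]
    height, layout = ffdh_strip_pack(demo_items, strip_width=6)
    print(f"packing height: {height}")
```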
|
203 |
Nonparametric Discovery of Human Behavior Patterns from Multimodal Data (Sun, Feng-Tso, 01 May 2014)
Recent advances in sensor technologies and the growing interest in context-aware applications, such as targeted advertising and location-based services, have led to a demand for understanding human behavior patterns from sensor data. People engage in routine behaviors. Automatic routine discovery goes beyond low-level activity recognition such as sitting or standing and analyzes human behaviors at a higher level (e.g., commuting to work). The goal of the research presented in this thesis is to automatically discover high-level semantic human routines from low-level sensor streams. One recent line of research is to mine human routines from sensor data using parametric topic models. The main shortcoming of parametric models is that they assume a fixed, pre-specified parameter regardless of the data. Choosing an appropriate parameter usually requires an inefficient trial-and-error model selection process. Furthermore, it is even more difficult to find optimal parameter values in advance for personalized applications. The research presented in this thesis offers a novel nonparametric framework for human routine discovery that can infer high-level routines without knowing the number of latent low-level activities beforehand. More specifically, the framework automatically finds the size of the low-level feature vocabulary from sensor feature vectors at the vocabulary extraction phase. At the routine discovery phase, the framework further automatically selects the appropriate number of latent low-level activities and discovers latent routines. Moreover, we propose a new generative graphical model to incorporate multimodal sensor streams for the human activity discovery task. The hypothesis and approaches presented in this thesis are evaluated on public datasets in two routine domains: two daily-activity datasets and a transportation mode dataset. Experimental results show that our nonparametric framework can automatically learn the appropriate model parameters from multimodal sensor data without any form of manual model selection procedure and can outperform traditional parametric approaches for human routine discovery tasks.
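As a rough illustration of the nonparametric idea (inferring the number of latent low-level activities from the data rather than fixing it in advance), the sketch below fits a truncated Dirichlet-process Gaussian mixture to synthetic sensor features with scikit-learn. The data, feature dimensions and prior settings are assumptions for illustration; this is not the generative graphical model proposed in the thesis.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Toy multimodal sensor features: three latent "low-level activities",
# but the model is not told how many there are.
features = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(200, 2)),   # e.g. stationary
    rng.normal([3.0, 0.5], 0.4, size=(200, 2)),   # e.g. walking
    rng.normal([0.5, 4.0], 0.5, size=(200, 2)),   # e.g. in a vehicle
])

# A truncated Dirichlet-process mixture: the concentration prior lets the
# data decide how many of the 10 candidate components are actually used.
dpgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,
    covariance_type="full",
    max_iter=500,
    random_state=0,
).fit(features)

labels = dpgmm.predict(features)
print(f"effective number of latent activities: {len(np.unique(labels))}")
print("mixture weights:", np.round(dpgmm.weights_, 3))
```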
|
204 |
Empirical Processes and ROC Curves with an Application to Linear Combinations of Diagnostic Tests (Chirila, Costel, 01 January 2008)
The Receiver Operating Characteristic (ROC) curve is the plot of Sensitivity vs. 1 - Specificity of a quantitative diagnostic test, for a wide range of cut-off points c. The empirical ROC curve is probably the most used nonparametric estimator of the ROC curve. The asymptotic properties of this estimator were first developed by Hsieh and Turnbull (1996) based on strong approximations for quantile processes. Jensen et al. (2000) provided a general method to obtain regional confidence bands for the empirical ROC curve, based on its asymptotic distribution.
Since most biomarkers do not have high enough sensitivity and specificity to qualify as a good diagnostic test, a combination of biomarkers may result in a better diagnostic test than each one taken alone. Su and Liu (1993) proved that, if the panel of biomarkers is multivariate normally distributed for both diseased and non-diseased populations, then the linear combination, using Fisher's linear discriminant coefficients, maximizes the area under the ROC curve of the newly formed diagnostic test, called the generalized ROC curve. In this dissertation, we will derive the asymptotic properties of the generalized empirical ROC curve, the nonparametric estimator of the generalized ROC curve, by using empirical process theory as in van der Vaart (1998). The pivotal result used in finding the asymptotic behavior of the proposed nonparametric estimator is the result on random functions that incorporate estimators, as developed by van der Vaart (1998). By using this powerful lemma we will be able to decompose an equivalent process into a sum of two other processes, usually called the Brownian bridge and the drift term, via Donsker classes of functions. Using a uniform convergence rate result given by Pollard (1984), we derive the limiting process of the drift term. Due to the independence of the random samples, the asymptotic distribution of the generalized empirical ROC process will be the sum of the asymptotic distributions of the decomposed processes. For completeness, we will first re-derive the asymptotic properties of the empirical ROC curve in the univariate case, using the same technique described before. The methodology is used to combine biomarkers in order to discriminate lung cancer patients from normals.
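The sketch below illustrates the Su and Liu (1993) construction on simulated data: two hypothetical biomarkers are combined linearly using Fisher's discriminant coefficients (the pooled-covariance solve of the mean difference), and the area under the empirical ROC curve is estimated by the Mann-Whitney statistic. The simulated distributions and sample sizes are assumptions; the asymptotic theory developed in the dissertation is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_roc(pos, neg):
    """Empirical ROC curve: sensitivity vs. 1 - specificity over all cut-offs."""
    cuts = np.unique(np.concatenate([pos, neg]))
    sens = np.array([(pos >= c).mean() for c in cuts])
    fpr = np.array([(neg >= c).mean() for c in cuts])
    order = np.argsort(fpr)
    return fpr[order], sens[order]

def empirical_auc(pos, neg):
    """Mann-Whitney estimate of the area under the empirical ROC curve."""
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

# Two hypothetical biomarkers, multivariate normal in each population.
mu_healthy, mu_diseased = np.array([0.0, 0.0]), np.array([1.0, 0.6])
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
healthy = rng.multivariate_normal(mu_healthy, cov, size=300)
diseased = rng.multivariate_normal(mu_diseased, cov, size=300)

# Fisher's linear discriminant coefficients from the pooled covariance;
# under binormality this combination maximises the AUC of the combined
# marker (Su and Liu, 1993).
pooled = 0.5 * (np.cov(healthy.T) + np.cov(diseased.T))
coef = np.linalg.solve(pooled, diseased.mean(axis=0) - healthy.mean(axis=0))

for name, pos, neg in [("marker 1 alone", diseased[:, 0], healthy[:, 0]),
                       ("linear combination", diseased @ coef, healthy @ coef)]:
    fpr, sens = empirical_roc(pos, neg)
    print(f"{name}: AUC = {empirical_auc(pos, neg):.3f} "
          f"({len(fpr)} points on the empirical ROC curve)")
```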
|
205 |
Facial feature localization using highly flexible yet sufficiently strict shape models (Tamersoy, Birgi, 18 September 2014)
Accurate and efficient localization of facial features is a crucial first step in many face-related computer vision tasks. Some of these tasks include, but are not limited to: identity recognition, expression recognition, and head-pose estimation. Most effort in the field has been exerted towards developing better ways of modeling prior appearance knowledge and image observations. Modeling prior shape knowledge, on the other hand, has not been explored as much. In this dissertation I primarily focus on the limitations of the existing methods in terms of modeling the prior shape knowledge. I first introduce a new pose-constrained shape model. I describe my shape model as being "highly flexible yet sufficiently strict". Existing pose-constrained shape models are either too strict, and have questionable generalization power, or they are too loose, and have questionable localization accuracies. My model tries to find a good middle-ground by learning which shape constraints are more "informative" and should be kept, and which ones are not-so-important and may be omitted. I build my pose-constrained facial feature localization approach on this new shape model using a probabilistic graphical model framework. Within this framework, observed and unobserved variables are defined as the local image observations, and the feature locations, respectively. Feature localization, or "probabilistic inference", is then achieved by nonparametric belief propagation. I show that this approach outperforms other popular pose-constrained methods through qualitative and quantitative experiments. Next, I expand my pose-constrained localization approach to the unconstrained setting using a multi-model strategy. While doing so, once again I identify and address the two key limitations of existing multi-model methods: 1) semantically and manually defining the models or "guiding" their generation, and 2) not having efficient and effective model selection strategies. First, I introduce an approach based on unsupervised clustering where the models are automatically learned from training data. Then, I complement this approach with an efficient and effective model selection strategy, which is based on a multi-class naive Bayesian classifier. This way, my method can have many more models, each with a higher level of expressive power, and consequently, provides a more effective partitioning of the face image space. This approach is validated through extensive experiments and comparisons with state-of-the-art methods on state-of-the-art datasets. In the last part of this dissertation I discuss a particular application of the previously introduced techniques: facial feature localization in unconstrained videos. I improve the frame-by-frame localization results by estimating the actual head-movement from a sequence of noisy head-pose estimates, and then using this information for detecting and fixing the localization failures.
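A minimal sketch of the multi-model selection idea described above, assuming synthetic stand-in descriptors: training examples are clustered without supervision (each cluster standing in for an automatically learned model), and a multi-class Gaussian naive Bayes classifier then selects the model for a new input. The features, cluster count and classifier settings are illustrative assumptions, not the thesis's actual pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(2)

# Synthetic stand-ins for global face descriptors (e.g. pose-indicative
# features); in the thesis these would come from real training images.
train_desc = np.vstack([rng.normal(m, 0.5, size=(150, 8))
                        for m in (-2.0, 0.0, 2.0)])

# 1) Learn the models without supervision: each cluster of training faces
#    would get its own shape/appearance model.
n_models = 4
clusters = KMeans(n_clusters=n_models, n_init=10,
                  random_state=0).fit_predict(train_desc)

# 2) Train an efficient model selector: a multi-class Gaussian naive Bayes
#    classifier mapping a descriptor to the most appropriate model index.
selector = GaussianNB().fit(train_desc, clusters)

test_desc = rng.normal(2.0, 0.5, size=(1, 8))
model_idx = selector.predict(test_desc)[0]
print(f"selected model {model_idx} of {n_models} for the test image")
# In the full pipeline, model_idx would pick which pose-constrained shape
# model runs the localization (belief propagation) step.
```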
|
206 |
Statistical Regular Pavings and their Applications (Teng, Gloria Ai Hui, January 2013)
We propose using statistical regular pavings (SRPs) as an efficient and adaptive statistical data structure for processing massive, multi-dimensional data. A regular paving (RP) is an ordered binary tree that recursively bisects a box in $\mathbb{R}^{d}$ along the first widest side. An SRP is extended from an RP by allowing mutable caches of recursively computable statistics of the data. In this study we use SRPs for two major applications: estimating histogram densities and summarising large spatio-temporal datasets.
The SRP histograms produced are $L_1$-consistent density estimators driven by a randomised priority queue that adaptively grows the SRP tree, and formalised as a Markov chain over the space of SRPs. A way to select an estimate is to run a Markov chain over the space of SRP trees, also initialised by the randomised priority queue, but here the SRP tree either shrinks or grows adaptively through pruning or splitting operations. The stationary distribution of the Markov chain is then the posterior distribution over the space of all possible histograms. We then take advantage of the recursive nature of SRPs to make computationally efficient arithmetic averages, and take the average of the states sampled from the stationary distribution to obtain the posterior mean histogram estimate.
We also show that SRPs are capable of summarizing large datasets by working with a dataset containing high frequency aircraft position information. Recursively computable statistics can be stored for variable-sized regions of airspace. The regions themselves can be created automatically to reflect the varying density of aircraft observations, dedicating more computational resources and providing more detailed information in areas with more air traffic. In particular, SRPs are able to very quickly aggregate or separate data with different characteristics so that data describing individual aircraft or collected using different technologies (reflecting different levels of precision) can be stored separately and yet also very quickly combined using standard arithmetic operations.
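A simplified sketch of the underlying data structure, assuming points in a two-dimensional box: each node stores a box, bisects it along its first widest side when split, and caches a recursively updatable count. Real SRPs grow adaptively via a randomised priority queue rather than to a fixed depth, and cache richer statistics than a count; this is only an illustration of the tree and its mutable caches.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SRPNode:
    """One node of a (statistical) regular paving over a d-dimensional box."""
    lower: List[float]
    upper: List[float]
    count: int = 0                      # mutable cached statistic
    left: Optional["SRPNode"] = None
    right: Optional["SRPNode"] = None

    def split(self) -> None:
        """Bisect the box along its first widest side."""
        widths = [u - l for l, u in zip(self.lower, self.upper)]
        d = widths.index(max(widths))   # first widest coordinate
        mid = 0.5 * (self.lower[d] + self.upper[d])
        lo_upper, hi_lower = list(self.upper), list(self.lower)
        lo_upper[d], hi_lower[d] = mid, mid
        self.left = SRPNode(list(self.lower), lo_upper)
        self.right = SRPNode(hi_lower, list(self.upper))

    def insert(self, point: List[float], max_depth: int, depth: int = 0) -> None:
        """Route a data point down the tree, updating the count caches."""
        self.count += 1
        if depth == max_depth:
            return
        if self.left is None:
            self.split()
        child = self.left if self._contains(self.left, point) else self.right
        child.insert(point, max_depth, depth + 1)

    @staticmethod
    def _contains(node: "SRPNode", p: List[float]) -> bool:
        return all(l <= x <= u for l, x, u in zip(node.lower, p, node.upper))

root = SRPNode(lower=[0.0, 0.0], upper=[4.0, 2.0])
for pt in ([0.3, 0.4], [3.2, 1.5], [3.6, 1.8], [0.1, 1.9]):
    root.insert(pt, max_depth=3)
print("root count:", root.count, "| left-child count:", root.left.count)
```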
|
207 |
CT3 as an Index of Knowledge Domain Structure: Distributions for Order Analysis and Information Hierarchies (Swartz Horn, Rebecca, 12 1900)
The problem with which this study is concerned is articulating all possible CT3 and KR21 reliability measures for every case of a 5x5 binary matrix (32,996,500 possible matrices). The study has three purposes. The first purpose is to calculate CT3 for every matrix and compare the results to the proposed optimum range of .3 to .5. The second purpose is to compare the results from the calculation of KR21 and CT3 reliability measures. The third purpose is to calculate CT3 and KR21 on every strand of a class test whose item set has been reduced using the difficulty strata identified by Order Analysis. The study was conducted by writing a computer program to articulate all possible 5 x 5 matrices. The program also calculated CT3 and KR21 reliability measures for each matrix. The nonparametric technique of Order Analysis was applied to two sections of test items to stratify the items into difficulty levels. The difficulty levels were used to reduce the item set from 22 to 9 items. All possible strands or chains of these items were identified so that both reliability measures (CT3 and KR21) could be calculated. One major finding of this study indicates that .3 to .5 is a desirable range for CT3 (cumulative p=.86 to p=.98) if cumulative frequencies are measured. A second major finding is that the KR21 reliability measure produced an invalid result more than half the time. The last major finding is that CT3, rescaled to range between 0 and 1, supports De Vellis' guidelines for reliability measures. The major conclusion is that CT3 is a better measure of reliability since it considers both inter- and intra-item variances.
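For reference, a small sketch of the KR-21 computation on one 5x5 binary response matrix. The CT3 index, whose definition is specific to the order-analysis literature used in the study, is not reproduced here, and the example matrix is hypothetical.

```python
import numpy as np

def kr21(scores: np.ndarray) -> float:
    """Kuder-Richardson formula 21 for a persons-by-items binary matrix.

    KR-21 assumes items of roughly equal difficulty and needs only the
    number of items, the mean total score, and the total-score variance.
    """
    n_persons, k = scores.shape
    totals = scores.sum(axis=1)
    mean_total = totals.mean()
    var_total = totals.var(ddof=1)      # sample variance of total scores
    if var_total == 0:
        return float("nan")             # reliability undefined
    return (k / (k - 1)) * (1 - mean_total * (k - mean_total) / (k * var_total))

# One of the many possible 5x5 binary response matrices
# (rows: examinees, columns: items; 1 = correct).
matrix = np.array([
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
])
print(f"KR-21 = {kr21(matrix):.3f}")
```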
|
208 |
A Performance Evaluation of Confidence Intervals for Ordinal Coefficient Alpha (Turner, Heather Jean, 05 1900)
Ordinal coefficient alpha is a newly derived non-parametric reliability estimate. As with any point estimate, ordinal coefficient alpha is merely an estimate of a population parameter and tends to vary from sample to sample. Researchers report the confidence interval to provide readers with the amount of precision obtained. Several methods with differing computational approaches exist for confidence interval estimation for alpha, including the Fisher, Feldt, Bonner, and Hakstian and Whalen (HW) techniques. Overall, coverage rates for the various methods were unacceptably low, with the Fisher method as the highest performer at 62%. Because of the poor performance across all four confidence interval methods, a need exists to develop a method that works well for ordinal coefficient alpha.
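As a hedged illustration of interval estimation for a reliability coefficient, the sketch below computes a percentile-bootstrap confidence interval for ordinary coefficient alpha on simulated Likert-type data. It is not one of the four analytic methods compared in the study, and it does not compute ordinal alpha (which requires polychoric correlations); the data-generating settings are assumptions.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a persons-by-items score matrix."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def bootstrap_alpha_ci(items, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for coefficient alpha."""
    rng = np.random.default_rng(seed)
    n = items.shape[0]
    boots = np.array([
        cronbach_alpha(items[rng.integers(0, n, size=n)])   # resample persons
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(boots, [100 * (1 - level) / 2, 100 * (1 + level) / 2])
    return lo, hi

# Simulated 5-point Likert responses driven by one underlying factor.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 1))
responses = np.clip(np.rint(3 + latent + rng.normal(scale=0.8, size=(200, 6))), 1, 5)

alpha = cronbach_alpha(responses)
lo, hi = bootstrap_alpha_ci(responses)
print(f"alpha = {alpha:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```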
|
209 |
Testy dobré shody při rušivých parametrech / Goodness of fit tests with nuisance parameters (Baňasová, Barbora, January 2015)
This thesis deals with goodness-of-fit tests in a nonparametric model in the presence of unknown parameters of the probability distribution. The first part is devoted to the theoretical background. We compare two methodologies for the construction of test statistics, one based on the empirical characteristic function and one on the empirical distribution function. We use kernel estimates of regression functions and a parametric bootstrap method to approximate the critical values of the tests. In the second part of the thesis, the work is complemented with a simulation study for different choices of weighting functions and parameters. Finally, we illustrate the use and comparison of the goodness-of-fit tests on an example with a real data set.
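A minimal sketch of the parametric-bootstrap idea with nuisance parameters, assuming a Cramér-von Mises-type statistic based on the empirical distribution function and a normal null model with estimated location and scale: the null distribution of the statistic is approximated by refitting the nuisance parameters on each bootstrap sample. The empirical-characteristic-function statistics and the kernel-regression construction studied in the thesis are not reproduced here.

```python
import numpy as np
from scipy import stats

def cvm_stat(x: np.ndarray) -> float:
    """Cramer-von Mises statistic against a normal with estimated parameters."""
    mu, sigma = x.mean(), x.std(ddof=1)
    u = np.sort(stats.norm.cdf(x, loc=mu, scale=sigma))
    n = len(x)
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

def parametric_bootstrap_test(x, n_boot=999, seed=0):
    """Approximate the null distribution of the statistic by resampling from
    the fitted model, re-estimating the nuisance parameters each time."""
    rng = np.random.default_rng(seed)
    observed = cvm_stat(x)
    mu, sigma = x.mean(), x.std(ddof=1)
    boot = np.array([
        cvm_stat(rng.normal(mu, sigma, size=len(x))) for _ in range(n_boot)
    ])
    p_value = (1 + np.sum(boot >= observed)) / (n_boot + 1)
    return observed, p_value

rng = np.random.default_rng(7)
for name, sample in [("normal data", rng.normal(2.0, 1.5, size=150)),
                     ("exponential data", rng.exponential(1.0, size=150))]:
    stat, p = parametric_bootstrap_test(sample)
    print(f"{name}: CvM = {stat:.4f}, bootstrap p = {p:.3f}")
```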
|
210 |
Régression isotonique itérée / Iterative isotonic regression (Jégou, Nicolas, 23 November 2012)
Ce travail se situe dans le cadre de la régression non paramétrique univariée. Supposant la fonction de régression à variation bornée et partant du résultat selon lequel une telle fonction se décompose en la somme d'une fonction croissante et d'une fonction décroissante, nous proposons de construire et d'étudier un nouvel estimateur combinant les techniques d'estimation des modèles additifs et celles d'estimation sous contraintes de monotonie. Plus précisément, notre méthode consiste à itérer la régression isotonique selon l'algorithme backfitting. On dispose ainsi à chaque itération d'un estimateur de la fonction de régression résultant de la somme d'une partie croissante et d'une partie décroissante. Le premier chapitre propose un tour d'horizon des références relatives aux outils cités à l'instant. Le chapitre suivant est dédié à l'étude théorique de la régression isotonique itérée. Dans un premier temps, on montre que, la taille d'échantillon étant fixée, augmenter le nombre d'itérations conduit à l'interpolation des données. On réussit à identifier les limites des termes individuels de la somme en montrant l'égalité de notre algorithme avec celui consistant à itérer la régression isotonique selon un algorithme de type réduction itérée du biais. Nous établissons enfin la consistance de l'estimateur. Le troisième chapitre est consacré à l'étude pratique de l'estimateur. Comme augmenter le nombre d'itérations conduit au sur-ajustement, il n'est pas souhaitable d'itérer la méthode jusqu'à la convergence. Nous examinons des règles d'arrêt basées sur des adaptations de critères usuellement employés dans le cadre des méthodes linéaires de lissage (AIC, BIC, ...) ainsi que des critères supposant une connaissance a priori sur le nombre de modes de la fonction de régression. Il en ressort un comportement intéressant de la méthode lorsque la fonction de régression possède des points de rupture. Nous appliquons ensuite l'algorithme à des données réelles de type puces CGH où la détection de ruptures est d'un intérêt crucial. Enfin, une application à l'estimation des fonctions unimodales et à la détection de mode(s) est proposée / This thesis is set in the framework of nonparametric univariate regression. Assuming that the regression function is of bounded variation, Jordan's decomposition ensures that it can be written as the sum of an increasing function and a decreasing function. We propose and analyse a novel estimator which combines the isotonic regression related to the estimation of monotone functions and the backfitting algorithm devoted to the estimation of additive models. The first chapter provides an overview of the references related to isotonic regression and additive models. The next chapter is devoted to the theoretical study of iterative isotonic regression. As a first step we show that increasing the number of iterations tends to reproduce the data. Moreover, we manage to identify the individual limits by making a connection with the general property of isotonicity of projection onto convex cones and deriving another equivalent algorithm based on iterative bias reduction. Finally, we establish the consistency of the estimator. The third chapter is devoted to the practical study of the estimator. As increasing the number of iterations leads to overfitting, it is not desirable to iterate the procedure until convergence. We examine stopping criteria based on adaptations of criteria usually used in the context of linear smoothing methods (AIC, BIC, ...)
as well as criteria assuming knowledge of the number of modes of the regression function. Since the method shows interesting behavior when the regression function has breakpoints, we apply the algorithm to CGH-array data, where breakpoint detection is of crucial interest. Finally, an application to the estimation of unimodal functions and to the detection of mode(s) is proposed.
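A rough sketch of the iterative isotonic regression idea, using scikit-learn's IsotonicRegression: an increasing and a decreasing component are re-fitted in turn on the partial residuals (backfitting), and iteration is stopped early because iterating to convergence interpolates the data. The fixed iteration count and the simulated test function are assumptions; the AIC/BIC-type and mode-based stopping rules studied in the thesis are not implemented.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def iterative_isotonic(x, y, n_iter=3):
    """Backfitting between an increasing and a decreasing isotonic component.

    Returns the two fitted components; their sum estimates the regression
    function. Iterating too long overfits, so n_iter is kept small here.
    """
    up = np.zeros_like(y, dtype=float)
    down = np.zeros_like(y, dtype=float)
    iso_up = IsotonicRegression(increasing=True)
    iso_down = IsotonicRegression(increasing=False)
    for _ in range(n_iter):
        up = iso_up.fit_transform(x, y - down)      # fit increasing part
        down = iso_down.fit_transform(x, y - up)    # fit decreasing part
    return up, down

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 200)
truth = np.where(x < 0.5, 2 * x, 2 * (1 - x))       # unimodal, breakpoint at 0.5
y = truth + rng.normal(scale=0.15, size=x.size)

up, down = iterative_isotonic(x, y, n_iter=3)
fit = up + down
print("RMSE of the fit:", round(float(np.sqrt(np.mean((fit - truth) ** 2))), 4))
```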
|