41

Judgement post-stratification for designed experiments

Du, Juan, January 2006 (has links)
Thesis (Ph. D.)--Ohio State University, 2006. / Title from first page of PDF file. Includes bibliographical references (p. 143-146).
42

Integration of ranking and selection methods with the multi-objective optimisation cross-entropy method

Von Lorne von Saint Ange, Chantel 03 1900 (has links)
Thesis (MEng)--Stellenbosch University, 2015. / ENGLISH ABSTRACT: A method for multi-objective optimisation using the cross-entropy method (MOO CEM) was recently developed by Bekker & Aldrich (2010) and Bekker (2012). The method aims to identify the non-dominated solutions of multi-objective problems, which are often dynamic and stochastic, but it does not use a statistical ranking and selection technique to account for the stochastic nature of the problems it solves. The research in this thesis investigates techniques that could be incorporated into the MOO CEM. The cross-entropy method for single-objective optimisation is studied first and applied to a problem in the soil sciences and water management domain; the purpose was to establish the fundamentals of the cross-entropy method needed later in the study. The second part of the study gives an overview of multi-objective ranking and selection methods found in the literature. The first method covered is the multi-objective optimal computing budget allocation algorithm; the second extends it with the concept of an indifference zone. Both methods aim to maximise the probability of correctly selecting the non-dominated scenarios while intelligently allocating simulation replications to minimise the required sample sizes. These techniques are applied to two problems represented by simulation models, namely the buffer allocation problem and a classic single-commodity inventory problem. Performance is measured using the hyperarea indicator and Mann-Whitney U-tests. The two techniques performed significantly differently, although this could be due to the different numbers of solutions in the Pareto sets. In the third part of the document, the aforementioned multi-objective ranking and selection techniques are incorporated into the MOO CEM, with the buffer allocation problem and the inventory problem again chosen as test problems, and the results are compared to experiments using the MOO CEM without ranking and selection. The results show that the MOO CEM with ranking and selection affects different problems differently. Investigating alternative ways of incorporating ranking and selection into the MOO CEM is recommended as future research; in addition, the combined algorithm should be tested on more stochastic problems. / Abstract also in Afrikaans.
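The hyperarea (hypervolume) indicator used above as the performance measure has a simple sweep-line form in two objectives. Below is a minimal sketch for bi-objective minimisation; the reference point and the random test data are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def pareto_front(points):
    """Non-dominated subset of bi-objective minimisation points ((n, 2) array)."""
    pts = points[np.argsort(points[:, 0])]      # sort by the first objective
    front, best_f2 = [], np.inf
    for p in pts:
        if p[1] < best_f2:                      # improves the second objective
            front.append(p)
            best_f2 = p[1]
    return np.array(front)

def hyperarea(front, ref):
    """Area dominated by a 2-D minimisation front, bounded by a reference
    point `ref` that is worse than every front point in both objectives."""
    f = front[np.argsort(front[:, 0])]
    xs = np.append(f[:, 0], ref[0])
    return float(np.sum((xs[1:] - xs[:-1]) * (ref[1] - f[:, 1])))

rng = np.random.default_rng(3)
solutions = rng.uniform(0, 1, size=(50, 2))     # hypothetical objective vectors
print(hyperarea(pareto_front(solutions), ref=(1.0, 1.0)))
```

A larger hyperarea means the front dominates more of the objective space; computing it per replication is what allows fronts from two algorithms to be compared with a Mann-Whitney U-test.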
43

A Nonparametric Test for the Non-Decreasing Alternative in an Incomplete Block Design

Ndungu, Alfred Mungai January 2011 (has links)
The purpose of this paper is to present a new nonparametric test statistic for testing against ordered alternatives in a Balanced Incomplete Block Design (BIBD). This test is then compared with the Durbin test, which tests for differences between treatments in a BIBD but without regard to order. For the comparison, Monte Carlo simulations were used to generate the BIBDs. Random samples were simulated from the normal distribution, the exponential distribution, and the t distribution with three degrees of freedom. The number of treatments considered was three, four and five, with all the possible combinations necessary for a BIBD. Small sample sizes were 20 or less and large sample sizes were 30 or more. The powers and alpha values were estimated after 10,000 repetitions. The results of the study show that the proposed test is more powerful than the Durbin test: regardless of the distribution, sample size or number of treatments, the new test tended to have higher power.
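The comparison machinery described here is easy to prototype. The sketch below estimates the power of the Durbin test alone by Monte Carlo, using its standard chi-square form; the specific BIBD, the normal shift sizes, and the 5% level are illustrative assumptions, and the paper's new ordered-alternative statistic is not reproduced here.

```python
import numpy as np
from scipy.stats import chi2, rankdata

rng = np.random.default_rng(1)

# BIBD: t = 4 treatments in b = 4 blocks of size k = 3; each treatment
# appears in r = 3 blocks.
blocks = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
t, k, r = 4, 3, 3
shifts = np.array([0.0, 0.3, 0.6, 0.9])   # non-decreasing treatment effects
crit = chi2.ppf(0.95, df=t - 1)           # asymptotic 5% critical value

def durbin_power(n_reps=10_000):
    rejections = 0
    for _ in range(n_reps):
        rank_sums = np.zeros(t)
        for blk in blocks:
            obs = rng.normal(loc=shifts[list(blk)])  # one observation per cell
            rank_sums[list(blk)] += rankdata(obs)    # within-block ranks 1..k
        d = 12 * (t - 1) / (r * t * (k - 1) * (k + 1)) \
            * np.sum((rank_sums - r * (k + 1) / 2) ** 2)
        rejections += d > crit
    return rejections / n_reps

print(f"estimated power of the Durbin test: {durbin_power():.3f}")
```

Setting all shifts to zero turns the same loop into an estimate of the test's attained alpha, which is how the size comparisons in the paper are typically checked.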
44

The rank analysis of triple comparisons

Pendergrass, Robert Nixon 12 March 2013 (has links)
General extensions of the probability model for paired comparisons, which was developed by R. A. Bradley and M. E. Terry, are considered. Four generalizations to triple comparisons are discussed. One of these models is used to develop methods of analysis of data obtained from the ranks of items compared in groups of size three. / Ph. D.
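For context, one natural way to generalize the Bradley-Terry paired-comparison model to comparisons in groups of three is the sequential (Plackett-Luce) form sketched below. This is an illustration only; it is not claimed to be one of the four generalizations discussed in the thesis, and the worth parameters are hypothetical.

```python
from itertools import permutations

def luce_triple_prob(worth, ranking):
    """Probability of observing `ranking` (item ids, best first) among three
    items under the sequential-choice model: pick the winner from all three,
    then the runner-up from the remaining two."""
    i, j, k = ranking
    w = worth
    return (w[i] / (w[i] + w[j] + w[k])) * (w[j] / (w[j] + w[k]))

worth = {"A": 3.0, "B": 2.0, "C": 1.0}   # hypothetical worth parameters
for perm in permutations("ABC"):
    print(perm, f"{luce_triple_prob(worth, perm):.3f}")
# the six ranking probabilities sum to 1
```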
45

The curve through the expected values of order statistics with special reference to problems in nonparametric tests of hypotheses

Chow, Bryant January 1965 (has links)
The expected value of the s-th largest of n ranked variates from a population with probability density f(x) occurs often in the statistical literature, especially in the theory of nonparametric statistics. A new expression for this value is obtained for any underlying density f(x), with emphasis on normal scores. A finite series representation, whose individual terms are easy to calculate, is obtained for the sum of squares of the normal scores. The derivation of this series demonstrates a technique that can also be used to obtain the expected value of Fisher's measure of correlation, as well as the expected value of the Fisher-Yates test statistic under an alternative hypothesis. / Ph. D.
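For reference, the expected value in question can be computed numerically from the classical density of an order statistic; the sketch below does this for normal scores. It uses direct quadrature rather than the finite-series representation derived in the thesis.

```python
import math
from scipy import integrate
from scipy.stats import norm

def expected_order_stat(s, n):
    """E[X_(s)] where X_(s) is the s-th largest of n iid N(0,1) variates.
    Density of the s-th largest:
    f(x) = n * C(n-1, s-1) * (1 - Phi(x))**(s-1) * Phi(x)**(n-s) * phi(x)."""
    c = n * math.comb(n - 1, s - 1)
    integrand = lambda x: x * c * (1 - norm.cdf(x)) ** (s - 1) \
        * norm.cdf(x) ** (n - s) * norm.pdf(x)
    val, _ = integrate.quad(integrand, -10, 10)
    return val

# normal scores for n = 5: the five expected ordered variates
scores = [expected_order_stat(s, 5) for s in range(1, 6)]
print([round(v, 4) for v in scores])   # by symmetry these sum to ~0
```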
46

Optimal randomized and non-randomized procedures for multinomial selection problems

Tollefson, Eric Sander 20 March 2012 (has links)
Multinomial selection problem procedures are ranking and selection techniques that aim to select the best (most probable) alternative based upon a sequence of multinomial observations. The classical formulation of the procedure design problem is to find a decision rule for terminating sampling. The decision rule should minimize the expected number of observations taken while achieving a specified indifference zone requirement on the prior probability of making a correct selection when the alternative configurations are in a particular subset of the probability space called the preference zone. We study the constrained version of the design problem in which there is a given maximum number of allowed observations. Numerous procedures have been proposed over the past 50 years, all of them suboptimal. In this thesis, we find via linear programming the optimal selection procedure for any given probability configuration. The optimal procedure turns out to be necessarily randomized in many cases. We also find via mixed integer programming the optimal non-randomized procedure. We demonstrate the performance of the methodology on a number of examples. We then reformulate the mathematical programs to make them more efficient to implement, thereby significantly expanding the range of computationally feasible problems. We prove that there exists an optimal policy which has at most one randomized decision point and we develop a procedure for finding such a policy. We also extend our formulation to replicate existing procedures. Next, we show that there is very little difference between the relative performances of the optimal randomized and non-randomized procedures. Additionally, we compare existing procedures using the optimal procedure as a benchmark, and produce updated tables for a number of those procedures. Then, we develop a methodology that guarantees the optimal randomized and non-randomized procedures for a broad class of variable observation cost functions -- the first of its kind. We examine procedure performance under a variety of cost functions, demonstrating that incorrect assumptions regarding marginal observation costs may lead to increased total costs. Finally, we investigate and challenge key assumptions concerning the indifference zone parameter and the conditional probability of correct selection, revealing some interesting implications.
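To make the objects being optimized concrete, the sketch below estimates the probability of correct selection for the naive fixed-sample multinomial rule under a slippage configuration in the preference zone. The configuration, the ratio theta = 2, and the sample sizes are illustrative assumptions; the thesis's optimal procedures are sequential and are found via linear and mixed integer programming rather than simulation.

```python
import numpy as np

rng = np.random.default_rng(7)

def prob_correct_selection(p, n, reps=20_000):
    """Monte Carlo estimate of P(correct selection) for the naive
    fixed-sample rule: take n multinomial observations and select the
    cell with the largest count, breaking ties at random."""
    p = np.asarray(p)
    best = np.argmax(p)
    correct = 0
    for _ in range(reps):
        counts = rng.multinomial(n, p)
        winners = np.flatnonzero(counts == counts.max())
        correct += rng.choice(winners) == best
    return correct / reps

# slippage configuration with theta = p_best / p_other = 2
p = np.array([2.0, 1.0, 1.0]) / 4.0
for n in (10, 25, 50):
    print(n, prob_correct_selection(p, n))
```

An optimal procedure aims to reach the same correct-selection guarantee with a smaller expected number of observations than any such fixed-sample rule.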
47

Stability analysis of feature selection approaches with low quality data

Unknown Date (has links)
One of the greatest challenges to data mining is erroneous or noisy data. Several studies have noted the weak performance of classification models trained from low-quality data. This dissertation shows that low-quality data can also impact the effectiveness of feature selection, and considers the effect of class noise on various feature ranking techniques. It presents a novel approach to feature ranking based on ensemble learning and assesses these ensemble feature selection techniques in terms of their robustness to class noise. It presents a noise-based stability analysis that measures the degree of agreement between a feature ranking technique's output on a clean dataset and its outputs on the same dataset corrupted with different combinations of noise level and noise distribution. It then considers the classification performance of models built with a subset of the original features obtained by applying feature ranking techniques to noisy data. It proposes focused ensemble feature ranking as a noise-tolerant approach to feature selection and compares focused ensembles with general ensembles in terms of the ability of the selected features to withstand the impact of class noise when used to build classification models. Finally, it explores three approaches for addressing the combined problem of high dimensionality and class imbalance. Collectively, this research shows the importance of considering class noise when performing feature selection. / by Wilker Altidor. / Thesis (Ph.D.)--Florida Atlantic University, 2011. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2011. Mode of access: World Wide Web.
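The noise-based stability analysis can be illustrated in a few lines: score features on clean labels, score them again after injecting class noise, and measure the agreement of the two rankings. The simple correlation-based ranker and the synthetic data below are stand-ins, not the ensemble techniques studied in the dissertation.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

def feature_scores(X, y):
    """Score features by |correlation| with the binary label
    (a simple filter-based ranker; larger = more relevant)."""
    return np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

# synthetic data: the first 5 of 30 features carry the signal
n, d = 500, 30
X = rng.normal(size=(n, d))
y = ((X[:, :5].sum(axis=1) + rng.normal(size=n)) > 0).astype(float)

clean = feature_scores(X, y)
for noise_level in (0.0, 0.1, 0.2, 0.3):
    flip = rng.random(n) < noise_level        # inject class noise
    y_noisy = np.where(flip, 1.0 - y, y)
    tau, _ = kendalltau(clean, feature_scores(X, y_noisy))
    print(f"noise {noise_level:.1f}: Kendall tau with clean ranking = {tau:.3f}")
```

A ranker whose Kendall tau degrades slowly as the noise level rises is, in this sense, the more stable one.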
48

Feature selection techniques and applications in bioinformatics

Unknown Date (has links)
Possibly the largest problem when working in bioinformatics is the large amount of data to sift through to find useful information. This thesis shows that feature selection (a method of removing irrelevant and redundant information from a dataset) is a useful and even necessary technique in these large datasets. It also presents a new method for comparing classes to each other through the use of their features, and provides a thorough analysis of the use of various feature selection techniques and classifiers in different bioinformatics scenarios. Overall, this thesis shows the importance of feature selection in bioinformatics. / by David Dittman. / Thesis (M.S.C.S.)--Florida Atlantic University, 2011. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2011. Mode of access: World Wide Web.
49

Uncertain data management. / CUHK electronic theses & dissertations collection

January 2011 (has links)
In this thesis, we explore the issues of uncertain data management in several different aspects. First, we propose a novel linear-time algorithm to compute the positional probability, whose computation is a primitive operator for most of the ranking definitions. Our algorithm is based on the conditional probability formulation of positional probability and a system of linear equations. Based on the formulation of conditional probability, we also prove a tight upper bound on the top-k probability of tuples, which is then used to terminate the top-k computation early. Second, we study top-k probabilistic ranking queries with joins when scores and probabilities are stored in different relations. We focus on reducing the join cost in probabilistic top-k ranking. We investigate two probabilistic score functions, namely expected rank value and probability of highest ranking. We give upper and lower bounds of such probabilistic score functions in random access and sequential access, and propose new I/O-efficient algorithms to find top-k objects. Third, we extend the possible worlds semantics to probabilistic XML ranking queries, which rank the top-k probabilities of the answers of a twig query in probabilistic XML data. The new challenge is how to compute top-k probabilities of answers of a twig query in probabilistic XML in the presence of containment (ancestor/descendant) relationships. We focus on node queries first, and propose a new dynamic programming algorithm which can compute top-k probabilities for the answers of node queries based on previously computed results in probabilistic XML data. We further propose optimization techniques to share the computational cost. We also show techniques to support path queries and tree queries. Fourth, we study how to rank documents using a set of keywords, given a context that is associated with the documents. We model the problem using a graph with two different kinds of nodes (document nodes and multi-attribute nodes), where the edges between document nodes and multi-attribute nodes exist with some probability. We discuss its score function, cost function, and ranking with uncertainty. We also propose new algorithms to rank documents that are most related to the user-given keywords by integrating the context information. / Uncertain data management has received much attention recently because, in many real applications, the data obtained can be incomplete or uncertain. Ranking of uncertain data has become an important research issue; the possible-worlds-semantics-based ranking makes it different from the ranking of deterministic data. For traditional deterministic data, we can compute a score for each object and then rank the objects based on the computed scores. In the scenario of uncertain data, however, each object has, besides the computed score, a probability of being the true answer (or of existing). A probabilistic top-k ranking query ranks objects by the interplay of score and probability based on the possible worlds semantics. Many definitions based on the possible worlds semantics have been proposed in the literature. / Chang, Lijun. / Advisers: Hong Cheng; Jeffrey Xu Yu. / Source: Dissertation Abstracts International, Volume: 73-06, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (leaves 131-139). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. 
Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [201-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
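In the simplest uncertain-data model, where tuples are independent and each is present with a known probability, top-k membership probabilities reduce to a Poisson-binomial prefix computation over tuples sorted by score. The sketch below assumes that independent-tuple model; the thesis treats more general settings, including the conditional-probability and linear-equation formulation of positional probability.

```python
def topk_probabilities(tuples, k):
    """Given (score, existence_probability) pairs of independent tuples,
    return P(tuple is present and ranks in the top-k) for each tuple.
    Uses a Poisson-binomial prefix DP over tuples sorted by score."""
    order = sorted(range(len(tuples)), key=lambda i: -tuples[i][0])
    result = [0.0] * len(tuples)
    dp = [1.0]  # dp[j] = P(exactly j of the higher-scored tuples exist)
    for i in order:
        p = tuples[i][1]
        result[i] = p * sum(dp[:k])      # in top-k iff < k better tuples exist
        new = [0.0] * (len(dp) + 1)      # fold this tuple into the prefix
        for j, q in enumerate(dp):
            new[j] += q * (1 - p)
            new[j + 1] += q * p
        dp = new
    return result

# three tuples: (score, existence probability)
data = [(0.9, 0.6), (0.8, 0.9), (0.5, 0.4)]
print(topk_probabilities(data, k=2))     # [0.6, 0.9, 0.184]
```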
50

Variance Estimation in Steady-State Simulation, Selecting the Best System, and Determining a Set of Feasible Systems via Simulation

Batur, Demet 11 April 2006 (has links)
In this thesis, we first present a variance estimation technique based on the standardized time series methodology for steady-state simulations. The proposed variance estimator has competitive bias and variance compared to the existing estimators in the literature. We also present the technique of rebatching to further reduce the bias and variance of our variance estimator. Second, we present two fully sequential indifference-zone procedures to select the best system from a number of competing simulated systems, where 'best' is defined by the maximum or minimum expected performance. These two procedures have parabola-shaped continuation regions rather than the triangular continuation regions employed in several papers. The procedures we present accommodate unequal and unknown variances across systems and the use of common random numbers; however, we assume that basic observations are independent and identically normally distributed. Finally, we present procedures for finding a set of feasible or near-feasible systems among a finite number of simulated systems in the presence of multiple stochastic constraints, especially when the number of systems or constraints is large.
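As a baseline for what such variance estimators target, the sketch below implements the classical nonoverlapping batch-means estimator and checks it on an AR(1) process whose variance parameter is known in closed form. The process and its parameters are illustrative; the thesis's standardized-time-series estimators are alternatives to this baseline and are not shown here.

```python
import numpy as np

rng = np.random.default_rng(42)

def batch_means_variance(y, n_batches):
    """Nonoverlapping batch-means estimator of the variance parameter
    sigma^2 = lim_n n * Var(mean(Y_1..Y_n)) of a stationary sequence."""
    m = len(y) // n_batches                   # batch size
    means = y[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    return m * means.var(ddof=1)

# stationary AR(1) test process: Y_t = phi * Y_{t-1} + eps_t
phi, n = 0.7, 2**16
eps = rng.normal(size=n)
y = np.empty(n)
y[0] = eps[0] / np.sqrt(1 - phi**2)           # start in stationarity
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]

true_sigma2 = 1 / (1 - phi) ** 2              # known variance parameter of AR(1)
print("batch means:", batch_means_variance(y, 32), " true:", true_sigma2)
```

The bias/variance trade-off of this estimator depends on the batch count, which is exactly the kind of tuning that rebatching and standardized-time-series estimators are designed to improve on.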
