Global ETD Search

1	Generalization error rates for margin-based classifiers Park, Changyi 24 August 2005 (has links) No description available. Statistics Bayes risk classification consistency convex nonconvex
2	Stochastic Motion Planning for Applications in Subsea Survey and Area Protection Bays, Matthew Jason 24 April 2012 (has links) This dissertation addresses high-level path planning and cooperative control for autonomous vehicles. The objective of our work is to closely and rigorously incorporate classification and detection performance into path planning algorithms, which is not addressed by the typical approaches found in the literature. We present novel path planning algorithms for two different applications in which autonomous vehicles are tasked with engaging targets within a stochastic environment. In the first application, an autonomous underwater vehicle (AUV) must reacquire and identify clusters of discrete underwater objects. Our planning algorithm ensures that mission objectives are met with a desired probability of success. The utility of our approach is verified through field trials. In the second application, a team of vehicles must intercept mobile targets before the targets enter a specified area. We provide a formal framework for solving the second problem by jointly minimizing a cost function utilizing Bayes risk. / Ph. D. Classification Bayes Risk Detection Robotics Autonomous Underwater Vehicles Decision Theory Path Planning
3	Probabilistic inference for phrase-based machine translation : a sampling approach Arun, Abhishek January 2011 (has links) Recent advances in statistical machine translation (SMT) have used dynamic programming (DP) based beam search methods for approximate inference within probabilistic translation models. Despite their success, these methods compromise the probabilistic interpretation of the underlying model, thus limiting the application of probabilistically defined decision rules during training and decoding. As an alternative, in this thesis, we propose a novel Monte Carlo sampling approach for theoretically sound approximate probabilistic inference within these models. The distribution we are interested in is the conditional distribution of a log-linear translation model; however, often, there is no tractable way of computing the normalisation term of the model. Instead, a Gibbs sampling approach for phrase-based machine translation models is developed which obviates the need to compute this term yet produces samples from the required distribution. We establish that the sampler effectively explores the distribution defined by a phrase-based model by showing that it converges in a reasonable amount of time to the desired distribution, irrespective of initialisation. Empirical evidence is provided to confirm that the sampler can provide accurate estimates of expectations of functions of interest. The mix of high probability and low probability derivations obtained through sampling is shown to provide a more accurate estimate of expectations than merely using the n-most highly probable derivations. Subsequently, we show that the sampler provides a tractable solution for finding the maximum probability translation in the model. We also present a unified approach to approximating two additional intractable problems: minimum risk training and minimum Bayes risk decoding.
Key to our approach is the use of the sampler, which allows us to explore the entire probability distribution and maintain a strict probabilistic formulation through the translation pipeline. For these tasks, sampling combines the simplicity of n-best list approaches with the extended view of the distribution that lattice-based approaches benefit from, while avoiding the biases associated with beam search. Our approach is theoretically well-motivated and can give better and more stable results than current state-of-the-art methods. 004.01
4	Tamanho amostral para estimar a concentração de organismos em água de lastro: uma abordagem bayesiana / Sample size for estimating the organism concentration in ballast water: a Bayesian approach Costa, Eliardo Guimarães da 05 June 2017 (has links) Sample size methodologies for estimating the organism concentration in ballast water and for verifying international standards are developed under a Bayesian approach. We consider the criteria of average coverage, average length and total cost minimization under the Poisson model with a gamma prior distribution and the negative binomial model with a Pearson type VI prior distribution. Furthermore, we consider a Dirichlet process as a prior distribution in the Poisson model in order to gain more flexibility and robustness. For practical applications, we implemented computational routines using the R language. Average coverage criterion Average length criterion Bayes risk Dirichlet process Negative binomial distribution Poisson distribution
5	Procedimentos sequenciais Bayesianos aplicados ao processo de captura-recaptura / Bayesian sequential procedures applied to the capture-recapture process Santos, Hugo Henrique Kegler dos 30 May 2014 (has links) Financiadora de Estudos e Projetos / In this work, we study the Bayes sequential decision procedure applied to the capture-recapture process with fixed sample sizes, in order to estimate the size of a finite, closed population. We present the statistical model and review Bayesian decision theory, covering the pure decision problem, the statistical decision problem and the sequential decision procedure. We illustrate the theoretical methods discussed using simulated data. Probabilities Capture-recapture process Bayes estimators Bayes risk Decision theory Sequential sampling
6	Tamanho amostral para estimar a concentração de organismos em água de lastro: uma abordagem bayesiana / Sample size for estimating the organism concentration in ballast water: a Bayesian approach Eliardo Guimarães da Costa 05 June 2017 (has links) Sample size methodologies for estimating the organism concentration in ballast water and for verifying international standards are developed under a Bayesian approach. We consider the criteria of average coverage, average length and total cost minimization under the Poisson model with a gamma prior distribution and the negative binomial model with a Pearson type VI prior distribution. Furthermore, we consider a Dirichlet process as a prior distribution in the Poisson model in order to gain more flexibility and robustness. For practical applications, we implemented computational routines using the R language. Average coverage criterion Average length criterion Bayes risk Dirichlet process Negative binomial distribution Poisson distribution
7	Boundary uncertainty-based classifier evaluation / 境界曖昧性に基づく分類器評価 David Ha 20 September 2019 (has links) We propose a general method for accurately evaluating any classifier model on realistic tasks, both in a theoretical sense, despite the finiteness of the available data, and in a practical sense, in terms of computation costs. The difficulty of classifier evaluation arises from the bias of the classification error estimate, which is based only on finite data. We bypass this difficulty by proposing a new classifier evaluation measure called "boundary uncertainty", whose estimate based on finite data can be considered a reliable representative of its expectation based on infinite data, and we demonstrate the potential of our approach on three classifier models and thirteen datasets. / Doctor of Philosophy in Engineering / Doshisha University Pattern classification Bayes risk Classifier evaluation Classifier selection Classification boundary Boundary uncertainty
8	On the effective deployment of current machine translation technology González Rubio, Jesús 03 June 2014 (has links) Machine translation is a fundamental technology that is gaining more importance each day in our multilingual society. Companies and individuals are turning their attention to machine translation since it dramatically cuts down their expenses on translation and interpreting. However, the output of current machine translation systems is still far from the quality of translations generated by human experts. The overall goal of this thesis is to narrow this quality gap by developing new methodologies and tools that enable broader and more efficient deployment of machine translation technology. We start by proposing a new technique to improve the quality of the translations generated by fully-automatic machine translation systems. The key insight of our approach is that different translation systems, implementing different approaches and technologies, can exhibit different strengths and limitations. Therefore, a proper combination of the outputs of such different systems has the potential to produce translations of improved quality. We present minimum Bayes' risk system combination, an automatic approach that detects the best parts of the candidate translations and combines them to generate a consensus translation that is optimal with respect to a particular performance metric. We thoroughly describe the formalization of our approach as a weighted ensemble of probability distributions and provide efficient algorithms to obtain the optimal consensus translation according to the widespread BLEU score. Empirical results show that the proposed approach is indeed able to generate statistically better translations than the provided candidates. Compared to other state-of-the-art system combination methods, our approach achieves similar performance without requiring any data beyond the candidate translations.
Then, we focus our attention on how to improve the utility of automatic translations for the end-user of the system. Since automatic translations are not perfect, a desirable feature of machine translation systems is the ability to predict at run-time the quality of the generated translations. Quality estimation is usually addressed as a regression problem where a quality score is predicted from a set of features that represents the translation. However, although the concept of translation quality is intuitively clear, there is no consensus on which features actually account for it. As a consequence, quality estimation systems for machine translation have to utilize a large number of weak features to predict translation quality. This raises several learning problems related to feature collinearity and ambiguity, and to the "curse" of dimensionality. We address these challenges by adopting a two-step training methodology. First, a dimensionality reduction method computes, from the original features, the reduced set of features that best explains translation quality. Then, a prediction model is built from this reduced set to finally predict the quality score. We study various reduction methods previously used in the literature and propose two new ones based on statistical multivariate analysis techniques. More specifically, the proposed dimensionality reduction methods are based on partial least squares regression. The results of thorough experimentation show that the quality estimation systems trained following the proposed two-step methodology obtain better prediction accuracy than systems trained using all the original features. Moreover, one of the proposed dimensionality reduction methods obtained the best prediction accuracy with only a fraction of the original features. This feature reduction ratio is important because it implies a dramatic reduction of the operating times of the quality estimation system.
An alternative use of current machine translation systems is to embed them within an interactive editing environment where the system and a human expert collaborate to generate error-free translations. This interactive machine translation approach has been shown to reduce the user's supervision effort in comparison to the conventional decoupled post-editing approach. However, interactive machine translation considers the translation system as a passive agent in the interaction process. In other words, the system only suggests translations to the user, who then makes the necessary supervision decisions. As a result, the user is bound to exhaustively supervise every suggested translation. This passive approach ensures error-free translations but it also demands a large amount of supervision effort from the user. Finally, we study different techniques to improve the productivity of current interactive machine translation systems. Specifically, we focus on the development of alternative approaches where the system becomes an active agent in the interaction process. We propose two different active approaches. On the one hand, we describe an active interaction approach where the system informs the user about the reliability of the suggested translations. The hope is that this information may help the user to locate translation errors, thus improving the overall translation productivity. We propose different scores to measure translation reliability at the word and sentence levels and study the influence of such information on the productivity of an interactive machine translation system. Empirical results show that the proposed active interaction protocol is able to achieve a large reduction in supervision effort while still generating translations of very high quality. On the other hand, we study an active learning framework for interactive machine translation.
In this case, the system is not only able to inform the user of which suggested translations should be supervised, but it is also able to learn from the user-supervised translations to improve its future suggestions. We develop a value-of-information criterion to select which automatic translations undergo user supervision. However, given its high computational complexity, in practice we study different selection strategies that approximate this optimal criterion. Results of large-scale experiments show that the proposed active learning framework is able to obtain better compromises between the quality of the generated translations and the human effort required to obtain them. Moreover, in comparison to a conventional interactive machine translation system, our proposal obtained translations of twice the quality with the same supervision effort. / González Rubio, J. (2014). On the effective deployment of current machine translation technology [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37888 Statistical machine translation Minimum Bayes' Risk System combination Partial least squares regression Quality estimation Confidence measures Interactive machine translation Interactive translation prediction Active Interaction Active learning Online learning ESTADISTICA E INVESTIGACION OPERATIVA LENGUAJES Y SISTEMAS INFORMATICOS
