  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
261

Model-based clustering and model selection for binned data. / Classification automatique à base de modèle et choix de modèles pour les données discrétisées

Wu, Jingwen 28 January 2014
This thesis studies Gaussian mixture model-based clustering approaches and model selection criteria for binned data clustering. Fourteen binned-EM algorithms and fourteen bin-EM-CEM algorithms are developed for fourteen parsimonious Gaussian mixture models. These new algorithms combine the reduced computation time of binned data with the simpler parameter estimation of parsimonious Gaussian mixture models. The complexities of the binned-EM and bin-EM-CEM algorithms are calculated and compared to the complexities of the EM and CEM algorithms, respectively. To select a model that fits the data well and satisfies the clustering precision requirements within a reasonable computation time, the AIC, BIC, ICL, NEC, and AWE criteria are extended to binned data clustering when the proposed binned-EM and bin-EM-CEM algorithms are used. The advantages of the different proposed methods are illustrated through experimental studies.
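To make the idea of a binned-data likelihood concrete, here is a minimal sketch for the one-dimensional case. It is an illustration only, not the thesis's binned-EM algorithm: the function names are hypothetical, the mixture is univariate rather than one of the fourteen parsimonious multivariate models, and the multinomial constant is omitted. Each bin contributes its count times the log of the probability mass the mixture assigns to that bin.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def binned_log_likelihood(edges, counts, weights, means, sigmas):
    """Log-likelihood of bin counts under a 1-D Gaussian mixture.

    edges  : bin boundaries, len(counts) + 1 values (may include +/- inf)
    counts : number of observations falling in each bin
    The multinomial normalising constant is left out (assumption: it does
    not depend on the mixture parameters, so it does not affect fitting).
    """
    total = 0.0
    for r, n_r in enumerate(counts):
        lo, hi = edges[r], edges[r + 1]
        # Probability mass that the whole mixture assigns to bin r.
        p_r = sum(
            w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
            for w, m, s in zip(weights, means, sigmas)
        )
        if n_r > 0:
            total += n_r * math.log(p_r)
    return total
```

Because only bin counts enter the computation, the cost per evaluation depends on the number of bins rather than the number of observations, which is the source of the time savings the abstract refers to.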
262

Análise e comparação de alguns métodos alternativos de seleção de variáveis preditoras no modelo de regressão linear / Analysis and comparison of some alternative methods of selection of predictor variables in linear regression models.

Matheus Augustus Pumputis Marques 04 June 2018
In this work, some new variable selection methods for linear regression that have appeared in the last 15 years are studied, specifically LARS (Least Angle Regression), NAMS (Noise Addition Model Selection), the False Selection Rate (FSR), the Bayesian LASSO, and the Spike-and-Slab LASSO. The methodology consisted of analyzing and comparing the studied methods. After this study, the methods are applied to real databases and evaluated in a simulation study, in which all methods are shown to be promising, with the Bayesian methods showing the best results.
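As background for how LASSO-type methods select variables, the sketch below shows the soft-thresholding operator. Under the simplifying assumption of an orthonormal design (X'X = I) and the objective (1/2)||y - Xb||^2 + lam * ||b||_1, the LASSO estimate is the soft-thresholded least-squares estimate. The function names and numbers are hypothetical; this is not one of the methods compared in the work above.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_orthonormal(ols_coefs, lam):
    """LASSO coefficients for an orthonormal design (assumption: X'X = I)."""
    return [soft_threshold(b, lam) for b in ols_coefs]

def selected_variables(coefs):
    """Indices of predictors whose coefficient was not shrunk to zero."""
    return [j for j, b in enumerate(coefs) if b != 0.0]
```

For example, `lasso_orthonormal([3.0, -0.5, 1.2], 1.0)` sets the middle coefficient exactly to zero, so only the first and third predictors remain selected.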
263

Monte Carlo simulation studies in log-symmetric regressions / Estudos de simulação de Monte Carlo em regressões log-simétricas

Ventura, Marcelo dos Santos 09 March 2018
Fundação de Amparo à Pesquisa do Estado de Goiás - FAPEG / This work deals with two Monte Carlo simulation studies in log-symmetric regression models, which are particularly useful when the response variable is continuous, strictly positive, and asymmetric, possibly with atypical observations. In log-symmetric regression models, the distribution of the multiplicative random errors belongs to the log-symmetric class, which encompasses the log-normal, log-Student-t, log-power-exponential, log-slash, and log-hyperbolic distributions, among others. The first simulation study examines the performance of the maximum-likelihood estimators of the model parameters under various scenarios. The second simulation study investigates the accuracy of popular information criteria such as AIC, BIC, HQIC, and their respective corrected versions. As an illustration, a movie data set obtained and assembled for this dissertation is analyzed to compare log-symmetric models with the normal linear model and to obtain the best model by using the mentioned information criteria.
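For reference, the information criteria named above can be computed from a fitted model's maximized log-likelihood, its number of parameters k, and the sample size n. The sketch below uses the standard textbook definitions; only the small-sample correction of AIC (AICc) is shown, since the abstract does not say which corrected versions of BIC and HQIC the dissertation uses. Function names are hypothetical.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik

def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(loglik, k) + (2 * k * (k + 1)) / (n - k - 1)

def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: k log n - 2 log L."""
    return k * math.log(n) - 2 * loglik

def hqic(loglik, k, n):
    """Hannan-Quinn information criterion: 2k log(log n) - 2 log L."""
    return 2 * k * math.log(math.log(n)) - 2 * loglik
```

In each case the candidate model with the smallest criterion value is preferred; the criteria differ only in how strongly they penalize additional parameters.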
264

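Padrões de diversidade em comunidades de aves relacionados a variáveis de habitat em campos temperados do sudeste da América do Sul / Patterns of diversity in bird communities related to habitat variables in temperate grasslands of southeastern South America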

Dias, Rafael Antunes January 2013
Individuals, populations, and species tend to use and select habitats in a non-random way. Consequently, habitat loss and degradation have different impacts on organisms according to their traits. The effects of habitat loss are straightforward: organisms are eliminated or displaced because of the lack of adequate habitat or because of low breeding success. The effects of habitat degradation are more subtle and result in a reduced capacity of an ecosystem to support some subsets of species. Since habitat loss and degradation reduce niche availability, ecologically specialized taxa with narrow niche requirements are expected to be more extinction-prone than habitat generalists. Organisms that are negatively affected by habitat loss and degradation generally exhibit very large or very small body size, low mobility, low fecundity, reduced recruitment, and narrow niche requirements. Temperate grasslands have been strongly impacted by habitat loss and degradation. In southeastern South America, as in many other regions of the world, the expansion of agriculture and industrial pulpwood plantations is the main source of habitat loss. Remnants of natural grassland vegetation are used for livestock ranching and are subject to habitat degradation from overgrazing, trampling, and inadequate management techniques. Evaluating how habitat loss and degradation affect the diversity of grassland organisms is vital for the development of management and conservation strategies. The main goal of this thesis is to evaluate how habitat degradation and loss related to cattle ranching and pulpwood plantations affect the diversity and composition of bird communities. We began by exploring the relationship between habitat variables and the composition of the bird community along a gradient of vegetation height determined by grazing in coastal grasslands of the state of Rio Grande do Sul. We then assessed how variations in relief interact with habitat variables and affect the diversity of birds in rangelands of the Campanha gaúcha. Finally, we evaluated how habitat loss related to grassland afforestation for pulpwood plantations affects the composition of grassland bird communities. Our results demonstrate that habitat degradation resulting from livestock ranching in natural grasslands affects bird communities in a differential way. Birds adapted to stunted grasslands and habitat generalists tend to benefit from grazing, whereas tall-grass specialists are negatively affected. Variations in topography reduce the impacts of habitat degradation in grasslands. These variations interact with habitat and have a differential effect on distinct components of diversity. On the other hand, the impact of habitat loss from afforestation is larger in magnitude, altering the composition of bird communities and favoring a series of non-grassland species. In this sense, protecting remaining grasslands from afforestation is imperative. Although cattle ranching increases diversity at the landscape level by creating a mosaic of vegetation patches of different heights, more attention should be given to maintaining and recovering dense formations of tall grassland plants. This can only be achieved by changing grazing regimes or developing alternative management techniques.
265

Assessing reservoir performance and modeling risk using real options

Singh, Harpreet 02 August 2012
Reservoir economic performance is based upon the future cash flows that can be generated from a reservoir. Future cash flows are a function of the hydrocarbon volumetric flow rates that a reservoir can produce and of market conditions. Both of these inputs are uncertain. Estimates of future hydrocarbon flow rates are uncertain because of uncertainty in the geological model, the limited availability and type of data, and the complexities involved in the reservoir modeling process. The second source of uncertainty in future cash flows comes from changing oil prices, rates of return, and similar quantities, which are all functions of market dynamics. Robust integration of these two sources of uncertainty, i.e. future hydrocarbon flow rates and market dynamics, in a model to predict cash flows from a reservoir is an essential part of risk assessment, but a difficult task. Current practices that assess a reservoir's economic performance using Deterministic Cash Flow (DCF) methods have been unsuccessful in their predictions because they lack the parametric capability to incorporate both types of uncertainty robustly and completely. This thesis presents a procedure that accounts for uncertainty in hydrocarbon production forecasts due to incomplete geologic information, and a novel real options methodology to assess project economics for the upstream petroleum industry. The modeling approach entails determining future hydrocarbon production rates under incomplete geologic information, with and without secondary information. The price of hydrocarbons is modeled separately, and the costs to produce them are determined based on market dynamics. A real options methodology is used to assess the effective cash flows from the reservoir and hence to determine the project economics. This methodology associates realistic probabilities, which are quantified using the method's parameters, with benefits and costs. The results from this methodology are compared against the results from the DCF methodology to examine whether the real options methodology can identify hidden potential in a reservoir's performance that DCF might not be able to uncover. This methodology is then applied to various case studies and strategies for planning and decision making.
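The contrast between a static cash-flow valuation and one that values managerial flexibility can be illustrated with a deliberately small sketch. It is not the thesis's real options methodology: the scenarios, probabilities, and function names are hypothetical, and the only flexibility modeled is the assumed ability to learn which scenario holds before committing and to decline the project at no cost.

```python
def npv(cash_flows, rate):
    """Net present value; cash_flows[t] is received at the end of year t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

def static_value(scenarios, rate):
    """Expected NPV when the project is undertaken in every scenario.

    scenarios: list of (probability, cash_flows) pairs (hypothetical inputs).
    """
    return sum(p * npv(cfs, rate) for p, cfs in scenarios)

def flexible_value(scenarios, rate):
    """Expected value when unprofitable scenarios can be declined.

    Assumption: the scenario is known before investing and walking away
    costs nothing, so each scenario contributes max(NPV, 0).
    """
    return sum(p * max(npv(cfs, rate), 0.0) for p, cfs in scenarios)
```

With two equally likely scenarios `[(0.5, [-100, 220]), (0.5, [-100, 44])]` and a 10% rate, the scenario NPVs are 100 and -60, so the static value is about 20 while the flexible value is about 50. The difference is the kind of hidden value that a valuation ignoring flexibility cannot show.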
266

Contributions to quality improvement methodologies and computer experiments

Tan, Matthias H. Y. 16 September 2013
This dissertation presents novel methodologies for five problem areas in modern quality improvement and computer experiments, i.e., selective assembly, robust design with computer experiments, multivariate quality control, model selection for split plot experiments, and construction of minimax designs. Selective assembly has traditionally been used to achieve tight specifications on the clearance of two mating parts. Chapter 1 proposes generalizations of the selective assembly method to assemblies with any number of components and any assembly response function, called generalized selective assembly (GSA). Two variants of GSA are considered: direct selective assembly (DSA) and fixed bin selective assembly (FBSA). In DSA and FBSA, the problem of matching a batch of N components of each type to give N assemblies that minimize quality cost is formulated as axial multi-index assignment and transportation problems respectively. Realistic examples are given to show that GSA can significantly improve the quality of assemblies. Chapter 2 proposes methods for robust design optimization with time consuming computer simulations. Gaussian process models are widely employed for modeling responses as a function of control and noise factors in computer experiments. In these experiments, robust design optimization is often based on average quadratic loss computed as if the posterior mean were the true response function, which can give misleading results. We propose optimization criteria derived by taking expectation of the average quadratic loss with respect to the posterior predictive process, and methods based on the Lugannani-Rice saddlepoint approximation for constructing accurate credible intervals for the average loss. These quantities allow response surface uncertainty to be taken into account in the optimization process. Chapter 3 proposes a Bayesian method for identifying mean shifts in multivariate normally distributed quality characteristics. 
Multivariate quality characteristics are often monitored using a few summary statistics. However, to determine the causes of an out-of-control signal, information about which means shifted and the directions of the shifts is often needed. We propose a Bayesian approach that gives this information. For each mean, an indicator variable that indicates whether the mean shifted upwards, shifted downwards, or remained unchanged is introduced. Default prior distributions are proposed. Mean shift identification is based on the modes of the posterior distributions of the indicators, which are determined via Gibbs sampling. Chapter 4 proposes a Bayesian method for model selection in fractionated split plot experiments. We employ a Bayesian hierarchical model that takes into account the split plot error structure. Expressions for computing the posterior model probability and other important posterior quantities that require evaluation of at most two uni-dimensional integrals are derived. A novel algorithm called combined global and local search is proposed to find models with high posterior probabilities and to estimate posterior model probabilities. The proposed method is illustrated with the analysis of three real robust design experiments. Simulation studies demonstrate that the method has good performance. The problem of choosing a design that is representative of a finite candidate set is an important problem in computer experiments. The minimax criterion measures the degree of representativeness because it is the maximum distance of a candidate point to the design. Chapter 5 proposes algorithms for finding minimax designs for finite design regions. We establish the relationship between minimax designs and the classical set covering location problem in operations research, which is a binary linear program. 
We prove that the set of minimax distances is the set of discontinuities of the function that maps the covering radius to the optimal objective function value, and optimal solutions at the discontinuities are minimax designs. These results are employed to design efficient procedures for finding globally optimal minimax and near-minimax designs.
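The minimax criterion described above is straightforward to state in code. The sketch below evaluates the minimax distance of a design over a finite candidate set and finds an optimal design by exhaustive search. The exhaustive search is a stand-in chosen for brevity and is only feasible for tiny problems; it is not the set-covering procedure the dissertation proposes, and the function names are hypothetical.

```python
import math
from itertools import combinations

def minimax_distance(design, candidates):
    """Largest distance from any candidate point to its nearest design point."""
    return max(min(math.dist(c, d) for d in design) for c in candidates)

def best_minimax_design(candidates, m):
    """Brute-force search over all m-point subsets of the candidate set."""
    return min(
        combinations(candidates, m),
        key=lambda design: minimax_distance(design, candidates),
    )
```

For five equally spaced candidates `(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)` and m = 2, the best achievable minimax distance is 1.0, attained for instance by the design `((1.0,), (3.0,))`.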
267

Regularization in reinforcement learning

Farahmand, Amir-massoud Unknown Date
No description available.
268

Pairwise Classification and Pairwise Support Vector Machines

Brunner, Carl 04 June 2012 (PDF)
Several modifications have been suggested to extend binary classifiers to multiclass classification, for instance the One Against All technique, the One Against One technique, or Directed Acyclic Graphs. A recent approach to multiclass classification is pairwise classification, which relies on two input examples instead of one and predicts whether the two input examples belong to the same class or to different classes. A Support Vector Machine (SVM) that is able to handle pairwise classification tasks is called a pairwise SVM. A common pairwise classification task is face recognition. In this area, one set of images is given for training and another set of images is given for testing. Often, one is interested in the interclass setting, which means that any person who is represented by an image in the training set is not represented by any image in the test set. Of the mentioned multiclass classification techniques, only the pairwise classification technique provides meaningful results in the interclass setting. For a pairwise classifier, the order of the two examples should not influence the classification result. A common approach to enforcing this symmetry is the use of selected kernels. Relations between such kernels and certain projections are provided. It is shown that those projections can lead to a loss of information. For pairwise SVMs, another approach to enforcing symmetry is the symmetrization of the training sets. In other words, if the pair (a,b) of examples is a training pair, then (b,a) is a training pair, too. It is proven that both approaches lead to the same decision function for selected parameters. Empirical tests show that the approach using selected kernels is three to four times faster. For good interclass generalization of pairwise SVMs, training sets with several million training pairs are needed. A technique is presented which speeds up the training of pairwise SVMs by a factor of up to 130 and thus enables learning from training sets with several million pairs. Another element affecting time is the need to select several parameters. Even with the applied speed-up techniques, a grid search over the set of parameters would be very expensive. Therefore, a model selection technique is introduced that is much less computationally expensive. In machine learning, the training set and the test set are created by some data generating process. Several pairwise data generating processes are derived from a given non-pairwise data generating process. Advantages and disadvantages of the different pairwise data generating processes are evaluated. Pairwise Bayes classifiers are introduced and their properties are discussed. It is shown that pairwise Bayes classifiers for interclass generalization tasks can differ from pairwise Bayes classifiers for interexample generalization tasks. In face recognition, the interexample task implies that each person who is represented by an image in the test set is also represented by at least one image in the training set. Moreover, the set of images of the training set and the set of images of the test set are disjoint. Pairwise SVMs are applied to four synthetic and two real-world datasets. One of the real-world datasets is the Labeled Faces in the Wild (LFW) database, while the other is provided by Cognitec Systems GmbH. The synthetic databases provide empirical evidence for the presented model selection heuristic, the discussion of the loss of information, and the provided speed-up techniques, and they show that pairwise SVM classifiers achieve a quality similar to that of pairwise Bayes classifiers. Additionally, a pairwise classifier is identified for the LFW database which leads to an average equal error rate (EER) of 0.0947 with a standard error of the mean (SEM) of 0.0057. This result is better than the result of the current state-of-the-art classifier, namely the combined probabilistic linear discriminant analysis classifier, which leads to an average EER of 0.0993 and an SEM of 0.0051.
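The two ways of enforcing symmetry mentioned above can be sketched briefly. The kernel below is one commonly used symmetric construction on pairs, built from a base kernel on single examples; the abstract does not state which kernels the thesis uses, so treat both the kernel choice and all function names as assumptions made for illustration.

```python
def linear_kernel(x, y):
    """Base kernel on single examples (plain dot product)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def symmetric_pair_kernel(pair1, pair2, base=linear_kernel):
    """Kernel on pairs that is invariant to the order within each pair.

    K((a,b),(c,d)) = k(a,c) k(b,d) + k(a,d) k(b,c)
    """
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)

def symmetrize(training_pairs):
    """Alternative approach: add (b, a, label) for every (a, b, label)."""
    return list(training_pairs) + [(b, a, y) for a, b, y in training_pairs]
```

Swapping the two examples inside either argument leaves the value of `symmetric_pair_kernel` unchanged, so a classifier built on it cannot depend on the order of the pair. `symmetrize` achieves the same goal through the data, at the cost of doubling the number of training pairs, which is consistent with the abstract's remark that the kernel approach is faster.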
269

Applying statistical and syntactic pattern recognition techniques to the detection of fish in digital images

Hill, Evelyn June January 2004
This study is an attempt to simulate aspects of human visual perception by automating the detection of specific types of objects in digital images. The success of the methods attempted here was measured by how well the results of experiments corresponded to what a typical human's assessment of the data might be. The subject of the study was images of live fish taken underwater by digital video or digital still cameras. It is desirable to be able to automate the processing of such data for efficient stock assessment for fisheries management. In this study some well-known statistical pattern classification techniques were tested and new syntactical/structural pattern recognition techniques were developed. For testing of statistical pattern classification, the pixels belonging to fish were separated from the background pixels and the EM algorithm for Gaussian mixture models was used to locate clusters of pixels. The means and the covariance matrices for the components of the model were used to indicate the location, size and shape of the clusters. Because the number of components in the mixture is unknown, the EM algorithm has to be run a number of times with different numbers of components and then the best model chosen using a model selection criterion. The AIC (Akaike Information Criterion) and the MDL (Minimum Description Length) were tested. The MDL was found to estimate the numbers of clusters of pixels more accurately than the AIC, which tended to overestimate cluster numbers. In order to reduce problems caused by initialisation of the EM algorithm (i.e. starting positions of mixtures and number of mixtures), the Dynamic Cluster Finding algorithm (DCF) was developed (based on the Dog-Rabbit strategy). This algorithm can produce an estimate of the locations and numbers of clusters of pixels. The Dog-Rabbit strategy is based on early studies of learning behaviour in neurons. The main difference between Dog-Rabbit and DCF is that DCF is based on a toroidal topology which removes the tendency of cluster locators to migrate to the centre of mass of the data set and miss clusters near the edges of the image. In the second approach to the problem, data was extracted from the image using an edge detector. The edges from a reference object were compared with the edges from a new image to determine if the object occurred in the new image. In order to compare edges, the edge pixels were first assembled into curves using an UpWrite procedure; then the curves were smoothed by fitting parametric cubic polynomials. Finally the curves were converted to arrays of numbers which represented the signed curvature of the curves at regular intervals. Sets of curves from different images can be compared by comparing the arrays of signed curvature values, as well as the relative orientations and locations of the curves. Discrepancy values were calculated to indicate how well curves and sets of curves matched the reference object. The total length of all matched curves was used to indicate what fraction of the reference object was found in the new image. The curve matching procedure gave results which corresponded well with what a human being might observe.
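The step that converts a fitted parametric cubic into an array of signed curvature values can be sketched as follows. It uses the standard formula kappa = (x'y'' - y'x'') / (x'^2 + y'^2)^(3/2). Sampling at regular steps of the parameter t on [0, 1] is an assumption made here for simplicity (the study may sample at regular arc-length intervals), and the function names are hypothetical.

```python
def signed_curvature(cx, cy, t):
    """Signed curvature of the curve (x(t), y(t)) at parameter t.

    cx, cy: coefficients (a0, a1, a2, a3) of a0 + a1*t + a2*t**2 + a3*t**3.
    Positive values mean the curve turns counter-clockwise.
    """
    def first(c):
        return c[1] + 2 * c[2] * t + 3 * c[3] * t * t

    def second(c):
        return 2 * c[2] + 6 * c[3] * t

    x1, y1 = first(cx), first(cy)
    x2, y2 = second(cx), second(cy)
    return (x1 * y2 - y1 * x2) / (x1 * x1 + y1 * y1) ** 1.5

def curvature_profile(cx, cy, n_samples):
    """Signed curvature at n_samples (>= 2) evenly spaced t values in [0, 1]."""
    return [signed_curvature(cx, cy, i / (n_samples - 1)) for i in range(n_samples)]
```

Two curves can then be compared through their profiles, for example by summing absolute differences, which is one simple way to obtain a discrepancy value of the kind the abstract mentions.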
270

Optimization of convolutional neural networks for image classification using genetic algorithms and bayesian optimization

Rawat, Waseem 01 1900
Notwithstanding the recent successes of deep convolutional neural networks for classification tasks, they are sensitive to the selection of their hyperparameters, which impose an exponentially large search space on modern convolutional models. Traditional hyperparameter selection methods include manual, grid, or random search, but these require expert knowledge or are computationally burdensome. By contrast, Bayesian optimization and evolutionary-inspired techniques have surfaced as viable alternatives for the hyperparameter problem. Thus, an alternative hybrid approach that combines the advantages of these techniques is proposed. Specifically, the search space is partitioned into discrete-architectural, and continuous and categorical hyperparameter subspaces, which are respectively traversed by a stochastic genetic search, followed by a genetic-Bayesian search. Simulations on a prominent image classification task reveal that the proposed method results in an overall classification accuracy improvement of 0.87% over unoptimized baselines, and a greater than 97% reduction in computational costs compared to a commonly employed brute force approach. / Electrical and Mining Engineering / M. Tech. (Electrical Engineering)
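A genetic search over a discrete architectural subspace can be sketched in a few lines. Everything below is hypothetical: the search space, the toy fitness function standing in for validation accuracy, and the operators are illustrative choices, and the Bayesian stage of the hybrid method described above is not shown.

```python
import random

# Hypothetical discrete architectural search space (assumed for illustration).
SPACE = {
    "n_layers": [2, 3, 4, 5],
    "n_filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
}
KEYS = list(SPACE)

def toy_fitness(ind):
    """Stand-in for validation accuracy; best at 4 layers, 64 filters, size 3."""
    target = {"n_layers": 4, "n_filters": 64, "kernel_size": 3}
    return -sum(abs(SPACE[k].index(ind[k]) - SPACE[k].index(target[k])) for k in KEYS)

def random_individual(rng):
    return {k: rng.choice(SPACE[k]) for k in KEYS}

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KEYS}

def mutate(ind, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {k: (rng.choice(SPACE[k]) if rng.random() < rate else ind[k]) for k in KEYS}

def genetic_search(fitness, generations=20, pop_size=12, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)
```

Because the better half of each generation is carried over unchanged, the best fitness found never decreases from one generation to the next. In a real setting each fitness evaluation would involve training a network, which is why reducing the number of evaluations matters.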
