  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A credit scoring model based on classifiers consensus system approach

Ala'raj, Maher A. January 2016
Managing customer credit is an important issue for every commercial bank; banks therefore take great care when dealing with customer loans to avoid improper decisions that can lead to lost opportunities or financial losses. Manual estimation of customer creditworthiness has become both time- and resource-consuming. Moreover, a manual approach is subjective (dependent on the bank employee who gives the estimate), which is why devising and implementing programmed models that provide loan estimations is the only way of eradicating the ‘human factor’ in this problem. Such a model should recommend to the bank whether or not a loan should be given, or otherwise give the probability that the loan will be repaid. A number of models have been designed, but there is no ideal classifier amongst them, since each gives some percentage of incorrect outputs; this is a critical consideration when each percent of incorrect answers can mean millions of dollars of losses for large banks. Nevertheless, Logistic Regression (LR) remains the industry-standard tool for developing credit-scoring models. For this reason, an investigation is carried out on combining the most efficient classifiers in the credit-scoring domain, in an attempt to produce a classifier that outperforms each of its component classifiers. In this work, a fusion model referred to as ‘the Classifiers Consensus Approach’ is developed, which gives much better performance than each of the single classifiers that constitute it. The difference between the consensus approach and the majority of other combiners lies in the fact that the consensus approach adopts the model of real expert group behaviour during the process of finding the consensus (aggregate) answer. The consensus model is compared not only with single classifiers, but also with traditional combiners and a quite complex combiner model known as the ‘Dynamic Ensemble Selection’ approach.
As a pre-processing step, data filtering (selecting training entries that fit the input data well and removing outliers and noisy data) and feature selection (removing useless and statistically insignificant features whose values are weakly correlated with the real quality of the loan) are used. These techniques are valuable in significantly improving the consensus approach results. Results clearly show that the consensus approach is statistically better (at the 95% confidence level, according to the Friedman test) than any other single classifier or combiner analysed; this means that for similar datasets, there is a 95% guarantee that the consensus approach will outperform all other classifiers. The consensus approach gives not only the best accuracy, but also better AUC, Brier score and H-measure values for almost all datasets investigated in this thesis. Moreover, it outperformed Logistic Regression. Thus, it has been proven that the use of the consensus approach for credit-scoring is justified and recommended in commercial banks. Along with the consensus approach, the dynamic ensemble selection approach is analysed, and the results show that, under some conditions, it can rival the consensus approach. The strengths of the dynamic ensemble selection approach include its stability and high accuracy on various datasets. The consensus approach, which is improved in this work, may be considered by banks whose data share the characteristics of the datasets used in this work, where its use could decrease the number of mistakenly rejected loans to solvent customers and the number of mistakenly accepted loans that are never repaid. Furthermore, the consensus approach is a notable step towards building a universal classifier that can fit data with any structure.
Another advantage of the consensus approach is its flexibility: even if the input data change for various reasons, the consensus approach can easily be re-trained and used with the same performance.
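To make the idea of combining classifiers concrete, the following is a minimal sketch of a plain majority-vote combiner, one of the simplest "traditional combiners" of the kind the abstract compares against. It is not the thesis's consensus approach, whose details are not given here; the function name `majority_vote` and the example predictions are hypothetical and invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier label lists into one label per applicant.

    predictions: list of equal-length lists, one list per classifier.
    Ties go to the label that appears first among the votes for that applicant.
    """
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the top-voted label
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers for four loan applicants
# (1 = good loan, 0 = bad loan); made-up numbers, not data from the thesis.
clf_a = [1, 0, 1, 1]
clf_b = [1, 1, 0, 1]
clf_c = [0, 0, 1, 1]
fused = majority_vote([clf_a, clf_b, clf_c])
print(fused)
```

A majority vote treats every classifier as equally reliable; according to the abstract, the consensus approach differs by modelling how a group of experts converges on an aggregate answer.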
2

Monte Carlo simulation of genetic drift in finite populations undergoing selection

Sather, Allan Peter January 1973
No description available.
3

Análise da cor da casca do mamão cv. Sunrise Solo por meio de modelo de regressão linear misto / Analysis of the peel color of papaya cv. Sunrise Solo using a mixed linear regression model

Nascimento, Caroline Oliveira do 30 May 2019
Papaya (Carica papaya L.) is of notable importance in fruit growing and is among the six main products that together account for more than 50% of national production in this sector. Papaya ripens relatively quickly. With a view to increasing commercial potential and possibly reducing post-harvest losses, digital image analysis is a technological resource for evaluating the hue and intensity of the fruit peel color during the ripening period, which serves as a basis for establishing functional models for measurements taken over a period of time. In this context, the motivation is a longitudinal study involving the evaluation of the intensity and hue of the peel color of papaya of the species Carica papaya L. during the ripening period. The data are analyzed using linear mixed-effects models, and the models that best fit the data were selected with the likelihood ratio test and the F test in a top-down selection method. A quadratic polynomial model with random effects on all parameters describes the hue variable satisfactorily. For the intensity variable, a cubic polynomial model was obtained for the random effects, with only the intercept as a fixed-effect parameter. Diagnostic analyses confirmed the satisfactory fit of the models.
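As a sketch of what a quadratic polynomial model with a random effect on every parameter looks like, one common way to write it is shown below. The notation is an assumption, not taken from the thesis: y_ij is the hue measured on fruit i at time t_ij, the beta terms are fixed effects, the b terms are fruit-specific random effects, and the normality assumptions are the usual ones for linear mixed models.

```latex
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + (\beta_2 + b_{2i})\, t_{ij}^{2} + \varepsilon_{ij},
\qquad
(b_{0i}, b_{1i}, b_{2i})^{\top} \sim N(\mathbf{0}, \mathbf{D}),
\quad
\varepsilon_{ij} \sim N(0, \sigma^{2})
```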
4

A comparison of Bayesian model selection based on MCMC with an application to GARCH-type models

Miazhynskaia, Tatiana, Frühwirth-Schnatter, Sylvia, Dorffner, Georg January 2003
This paper presents a comprehensive review and comparison of five computational methods for Bayesian model selection, based on MCMC simulations from posterior model parameter distributions. We apply these methods to a well-known and important class of models in financial time series analysis, namely GARCH and GARCH-t models for conditional return distributions (assuming normal and t-distributions). We compare their performance vis-à-vis the more common maximum likelihood-based model selection on both simulated and real market data. All five MCMC methods proved feasible in both cases, although differing in their computational demands. Results on simulated data show that for large degrees of freedom (where the t-distribution becomes more similar to a normal one), Bayesian model selection results in better decisions in favour of the true model than maximum likelihood. Results on market data show the feasibility of all model selection methods, mainly because the distributions appear to be decisively non-Gaussian. / Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
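For readers unfamiliar with the model class, here is a minimal sketch of simulating returns from a GARCH(1,1) process with normal innovations. The order (1,1), the function name `simulate_garch11` and the parameter names `omega`, `alpha` and `beta` are assumptions made for illustration; the abstract does not state which GARCH orders or parameter values the paper uses, and this sketch does not perform any of the model selection methods it compares.

```python
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate n returns from a GARCH(1,1) process with standard normal shocks.

    Conditional variance recursion (assumed form):
        var_t = omega + alpha * r_{t-1}**2 + beta * var_{t-1}
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    # Start from the unconditional variance, which exists when alpha + beta < 1.
    var = omega / (1.0 - alpha - beta)
    returns, variances = [], []
    for _ in range(n):
        r = rng.gauss(0.0, 1.0) * var ** 0.5  # return drawn with current variance
        returns.append(r)
        variances.append(var)
        var = omega + alpha * r * r + beta * var  # update for the next step
    return returns, variances


returns, variances = simulate_garch11(1000, omega=0.1, alpha=0.1, beta=0.8, seed=42)
```

A GARCH-t variant would replace the normal shocks with suitably scaled Student-t draws; that step is omitted to keep the sketch within the standard library.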
5

O problema da superdispersão em dados categorizados politômicos nominais em estudos agrários / The problem of overdispersion in nominal polytomous categorical data in agricultural studies

Salvador, Maria Letícia 31 May 2019
Polytomous variables are common in agronomic experiments and may be nominal or ordinal in nature. The generalized logit model is a class of models that can be used to analyze such data. One characteristic of this model is the assumption that the variance is a known function of the mean, and the observed variance is expected to be close to the variance assumed by the model. However, when the observed variance is larger than that specified by the model, the phenomenon of overdispersion occurs. In this context, the present work aimed to characterize the problem of overdispersion associated with nominal data in cross-sectional studies. As motivation, two studies adapted from the agricultural sciences are presented, relating to fruit growing and animal science, both planned in a completely randomized design. There was evidence of overdispersion in the data of both examples, and the Dirichlet-multinomial model was used as a methodological alternative. The fit of the generalized logit model and of the Dirichlet-multinomial model was assessed by means of the half-normal plot, a diagnostic graph. Additionally, an extension of the dispersion index to polytomous data was proposed, with its performance evaluated by simulation. The Dirichlet-multinomial model proved adequate for fitting overdispersed data in comparison with the generalized logit model. Despite the satisfactory results obtained, it is emphasized that this work is an introduction to the problem.
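The overdispersion described above can be illustrated with a small simulation: counts drawn from a Dirichlet-multinomial distribution vary more than multinomial counts with the same mean probabilities. This is only a sketch under assumed values (20 trials, three categories, concentration parameters of 2.0 each); the function names are hypothetical and none of the numbers come from the thesis.

```python
import random
from statistics import pvariance


def sample_multinomial(rng, n, probs):
    """Draw one multinomial count vector with n trials."""
    counts = [0] * len(probs)
    for k in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[k] += 1
    return counts


def sample_dirichlet_multinomial(rng, n, alpha):
    """Draw probabilities from Dirichlet(alpha), then multinomial counts."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return sample_multinomial(rng, n, [g / total for g in gammas])


def inflation_factor(n, alpha):
    """Theoretical variance ratio, Dirichlet-multinomial over multinomial."""
    a0 = sum(alpha)
    return (n + a0) / (1.0 + a0)


rng = random.Random(1)
n, alpha = 20, [2.0, 2.0, 2.0]          # assumed illustrative values
mean_probs = [a / sum(alpha) for a in alpha]
# Counts in the first category over 2000 simulated experiments of each kind.
mult = [sample_multinomial(rng, n, mean_probs)[0] for _ in range(2000)]
dm = [sample_dirichlet_multinomial(rng, n, alpha)[0] for _ in range(2000)]
var_mult, var_dm = pvariance(mult), pvariance(dm)
print(var_mult, var_dm, inflation_factor(n, alpha))
```

The ratio `var_dm / var_mult` should land near `inflation_factor(n, alpha)`, which is why a multinomial-based model such as the generalized logit model understates the variability of data generated this way.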
6

Classe de distribuições série de potências inflacionadas com aplicações / Class of inflated power series distributions with applications

Silva, Deise Deolindo 06 April 2009
This work has as its central theme the Inflated Modified Power Series Distributions, with the aim of studying their main properties and their applicability in the Bayesian context. This class of models encompasses the Poisson, binomial and negative binomial distributions, in both their simple and generalized forms, and is therefore widely applied in modelling discrete data with excess values. As a particular case, the zero-inflated Poisson (ZIP) distribution is explored, where the main purpose was to verify the effectiveness of its modelling when compared with the Poisson distribution. The same methodology was considered for the inflated negative binomial distribution, comparing it with the Poisson, negative binomial and ZIP distributions. The Bayes factor and the full Bayesian significance test were considered as formal criteria for model selection.
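As a small illustration of the zero-inflated Poisson (ZIP) distribution mentioned above, the sketch below computes its probability mass function in the standard mixture form: with probability `pi` an observation is an extra zero, otherwise it is a Poisson draw with mean `lam`. The names `zip_pmf`, `lam` and `pi` are assumptions for illustration and may not match the thesis's notation.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with inflation probability pi."""
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    extra_zero = pi if k == 0 else 0.0  # point mass added at zero only
    return extra_zero + (1.0 - pi) * poisson


# Probability of observing zero with and without inflation (assumed values).
p_zero_zip = zip_pmf(0, lam=2.0, pi=0.3)
p_zero_poisson = zip_pmf(0, lam=2.0, pi=0.0)
```

Setting `pi` to 0 recovers the ordinary Poisson distribution, which is what makes a ZIP-versus-Poisson comparison of the kind described in the abstract a comparison of nested models.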
7

Modelling space-use and habitat preference from wildlife telemetry data

Aarts, Geert January 2007
Management and conservation of populations of animals require information on where they are, why they are there, and where else they could be. These objectives are typically approached by collecting data on the animals’ use of space, relating these to prevailing environmental conditions and employing these relations to predict usage at other geographical regions. Technical advances in wildlife telemetry have accomplished manifold increases in the amount and quality of available data, creating the need for a statistical framework that can use them to make population-level inferences for habitat preference and space-use. This has been slow in coming because wildlife telemetry data are, by definition, spatio-temporally autocorrelated, unbalanced, presence-only observations of behaviorally complex animals, responding to a multitude of cross-correlated environmental variables. I review the evolution of techniques for the analysis of space-use and habitat preference, from simple hypothesis tests to modern modeling techniques, and outline the essential features of a framework that emerges naturally from these foundations. Within this framework, I discuss eight challenges inherent in the spatial analysis of telemetry data and, for each, I propose solutions that can work in tandem. Specifically, I propose a logistic, mixed-effects approach that uses generalized additive transformations of the environmental covariates and is fitted to a response data-set comprising the telemetry and simulated observations, under a case-control design. I apply this framework to non-trivial case-studies using data from satellite-tagged grey seals (Halichoerus grypus) foraging off the east and west coast of Scotland, and northern gannets (Morus bassanus) from Bass Rock.
I find that sea bottom depth and sediment type explain little of the variation in gannet usage, but grey seals from different regions strongly prefer coarse sediment types, the ideal burrowing habitat of sandeels, their preferred prey. The results also suggest that prey aggregation within the water column might be as important as horizontal heterogeneity. More importantly, I conclude that, despite the complex behavior of the study species, flexible empirical models can capture the environmental relationships that shape population distributions.
