21 |
Diatomées épilithiques des cours d’eau pérennes de l’île de la Réunion : taxinomie - écologie / Epilithic diatoms of Réunion Island perennial rivers : taxonomy - ecology
Gassiole, Gilles 26 March 2014
Réunion is a volcanic island in the southwest Indian Ocean, 800 km east of Madagascar; together with Mauritius and Rodrigues it forms the Mascarene archipelago. The territory is a French overseas department, so European legislation on river water quality applies there. Because water chemistry alone is not sufficient to monitor water quality, bioindicators can complement the diagnosis. In this thesis, diatoms were chosen to assess the quality of running waters. The diatoms of Réunion were previously known only from scattered articles describing new species. Over four years, six sampling campaigns were carried out on the perennial rivers. This work provided knowledge of the epilithic diatom flora of Réunion's perennial rivers and therefore of the biodiversity of these environments. In addition, diatom ecology was studied statistically with multivariable regression trees (MRT), IndVal and its extensions, and weighted averaging (WA), which yielded knowledge of community ecology, of the autoecology of individual species, and hence of river water quality. The integrated results of this thesis, together with work carried out in collaboration with Irstea, led to a new index method, the Indice Diatomique Réunion (IDR), for assessing the ecological quality of rivers.
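The two species-level statistics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of IndVal as the product of specificity (share of a species' mean abundance found in a group) and fidelity (share of the group's sites where the species occurs), and the usual weighted-averaging optimum (abundance-weighted mean of an environmental variable). The function names, site groups and conductivity values are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict

def indval(abundances, groups):
    """IndVal (in %) of one species for each group of sites."""
    by_group = defaultdict(list)
    for a, g in zip(abundances, groups):
        by_group[g].append(a)
    mean_abund = {g: sum(v) / len(v) for g, v in by_group.items()}
    total = sum(mean_abund.values())
    result = {}
    for g, v in by_group.items():
        # Specificity: share of the species' mean abundance found in group g.
        specificity = mean_abund[g] / total if total else 0.0
        # Fidelity: share of the sites of group g where the species is present.
        fidelity = sum(1 for a in v if a > 0) / len(v)
        result[g] = 100.0 * specificity * fidelity
    return result

def wa_optimum(abundances, env):
    """Weighted-averaging optimum: abundance-weighted mean of env."""
    return sum(a * x for a, x in zip(abundances, env)) / sum(abundances)

# Hypothetical counts of one diatom species at six sites.
abund = [10, 8, 0, 0, 2, 0]
group = ["upstream"] * 3 + ["downstream"] * 3
conductivity = [50, 60, 55, 200, 180, 220]  # hypothetical values

scores = indval(abund, group)
optimum = wa_optimum(abund, conductivity)
print(scores, optimum)
```

In this toy data the species is both concentrated in and frequent among the upstream sites, so its upstream IndVal is much higher than its downstream value.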
|
22 |
Habitat Suitability Modeling for the Eastern Hog-nosed Snake, 'Heterodon platirhinos', in Ontario
Thomasson, Victor 26 September 2012
With exploding human populations and landscapes that are changing, an increasing number of wildlife species are brought to the brink of extinction. In Canada, the eastern hog-nosed snake, 'Heterodon platirhinos', is found in a limited portion of southern Ontario. Designated as threatened by the Committee on the Status of Endangered Wildlife in Canada (COSEWIC), this reptile has been losing its habitat at an alarming rate. Due to the increase in development of southern Ontario, it is crucial to document what limits the snake’s habitat to direct conservation efforts better, for the long-term survival of this species. The goals of this study are: 1) to examine what environmental parameters are linked to the presence of the species at a landscape scale; 2) to predict where the snakes can be found in Ontario through GIS-based habitat suitability models (HSMs); and 3) to assess the role of biotic interactions in HSMs. Three models with high predictive power were employed: Maxent, Boosted Regression Trees (BRTs), and the Genetic Algorithm for Rule-set Production (GARP). Habitat suitability maps were constructed for the eastern hog-nosed snake for its entire Canadian distribution and models were validated with both threshold dependent and independent metrics. Maxent and BRT performed better than GARP and all models predict fewer areas of high suitability when landscape variables are used with current occurrences. Forest density and maximum temperature during the active season were the two variables that contributed the most to models predicting the current distribution of the species. Biotic variables increased the performance of models not by representing a limiting resource, but by representing the inequality of sampling and areas where forest remains. Although habitat suitability models rely on many assumptions, they remain useful in the fields of conservation and landscape management. 
In addition to helping identify critical habitat, HSMs may be used as a tool to better manage land to allow for the survival of species at risk.
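The distinction between threshold-independent and threshold-dependent validation can be sketched as follows. The abstract does not say which metrics were used; AUC and the true skill statistic (TSS) are assumed here only as common examples of each kind, and the suitability scores and presence labels are made up.

```python
def auc(scores, labels):
    """Threshold-independent: probability that a presence outranks an absence."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tss(scores, labels, threshold):
    """Threshold-dependent: sensitivity + specificity - 1 at one cut-off."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0

# Hypothetical model output: suitability per site and observed presence (1/0).
suitability = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
presence = [1, 1, 1, 0, 0, 0]

auc_value = auc(suitability, presence)
tss_value = tss(suitability, presence, threshold=0.5)
print(auc_value, tss_value)
```

AUC summarises the ranking over all possible cut-offs, whereas TSS changes with the chosen threshold, which is why the two kinds of metric are usually reported together.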
|
23 |
Habitat Suitability Modeling for the Eastern Hog-nosed Snake, 'Heterodon platirhinos', in Ontario
Thomasson, Victor January 2012
With exploding human populations and landscapes that are changing, an increasing number of wildlife species are brought to the brink of extinction. In Canada, the eastern hog-nosed snake, 'Heterodon platirhinos', is found in a limited portion of southern Ontario. Designated as threatened by the Committee on the Status of Endangered Wildlife in Canada (COSEWIC), this reptile has been losing its habitat at an alarming rate. Due to the increase in development of southern Ontario, it is crucial to document what limits the snake’s habitat to direct conservation efforts better, for the long-term survival of this species. The goals of this study are: 1) to examine what environmental parameters are linked to the presence of the species at a landscape scale; 2) to predict where the snakes can be found in Ontario through GIS-based habitat suitability models (HSMs); and 3) to assess the role of biotic interactions in HSMs. Three models with high predictive power were employed: Maxent, Boosted Regression Trees (BRTs), and the Genetic Algorithm for Rule-set Production (GARP). Habitat suitability maps were constructed for the eastern hog-nosed snake for its entire Canadian distribution and models were validated with both threshold dependent and independent metrics. Maxent and BRT performed better than GARP and all models predict fewer areas of high suitability when landscape variables are used with current occurrences. Forest density and maximum temperature during the active season were the two variables that contributed the most to models predicting the current distribution of the species. Biotic variables increased the performance of models not by representing a limiting resource, but by representing the inequality of sampling and areas where forest remains. Although habitat suitability models rely on many assumptions, they remain useful in the fields of conservation and landscape management. 
In addition to helping identify critical habitat, HSMs may be used as a tool to better manage land to allow for the survival of species at risk.
|
24 |
Regression and time estimation in the manufacturing industry
Bjernulf, Walter January 2023
This thesis analyses operation times for products of different sizes in a manufacturing company. It introduces and summarises most of the theory needed to perform regression and presents a worked example in which three different regression models are fitted, evaluated and analysed. Conformal prediction, currently a hot topic in machine learning, is also introduced and applied in the worked example.
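As an illustration of how conformal prediction wraps a regression model, here is a minimal sketch of the split-conformal variant. It assumes a simple least-squares line as the underlying model (the abstract does not name the three models it compares), and the product sizes and operation times are invented.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def conformal_radius(residuals, alpha):
    """Half-width of a split-conformal interval with target coverage 1 - alpha."""
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the calibration residual to use
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(residuals)[k - 1]

# Hypothetical data: product size (x) and operation time in hours (y).
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
cal_x = [1.5, 2.5, 3.5, 4.5, 1.0, 2.0, 3.0, 4.0]
cal_y = [3.2, 4.8, 7.1, 9.0, 2.0, 4.3, 5.8, 8.1]

# 1) Fit on the training split, 2) score absolute errors on the calibration split.
slope, intercept = fit_line(train_x, train_y)
residuals = [abs(y - (slope * x + intercept)) for x, y in zip(cal_x, cal_y)]
radius = conformal_radius(residuals, alpha=0.25)

# 3) Every new prediction gets the same +/- radius band.
new_size = 2.8
point = slope * new_size + intercept
interval = (point - radius, point + radius)
print(point, interval)
```

The model is fitted on one part of the data and the interval width is calibrated on a separate part, so the same recipe applies unchanged to any other regression model.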
|
25 |
Bayesian Regression Trees for Count Data: Models and Methods
Geels, Vincent M. 27 September 2022
No description available.
|
26 |
Comparison of MaxEnt and boosted regression tree model performance in predicting the spatial distribution of threatened plant, Telephus spurge (Euphorbia telephioides)
Mainella, Alexa Marie 29 April 2016
No description available.
|
27 |
Detection of erroneous payments utilizing supervised and unsupervised data mining techniques
Yanik, Todd E. 09 1900
Approved for public release; distribution is unlimited. / In this thesis we develop a procedure for detecting erroneous payments in the Defense Finance Accounting Service, Internal Review's (DFAS IR) Knowledge Base Of Erroneous Payments (KBOEP), using supervised (Logistic Regression) and unsupervised (Classification and Regression Trees (C & RT)) modeling algorithms. S-Plus software was used to construct a supervised model of vendor payment data using Logistic Regression, along with the Hosmer-Lemeshow Test for testing the predictive ability of the model. The Clementine Data Mining software was used to construct both supervised and unsupervised models of vendor payment data using the Logistic Regression and C & RT algorithms. The Logistic Regression algorithm in Clementine generated a model with predictive probabilities, which were compared against those of the C & RT algorithm. In addition to comparing the predictive probabilities, Receiver Operating Characteristic (ROC) curves were generated for both models to determine which model provided the best results for a Coincidence Matrix's True Positive, True Negative, False Positive and False Negative Fractions. The best modeling technique was C & RT, and it was given to DFAS IR to assist in reducing the manual record selection process currently being used. A recommended ruleset was provided, along with a detailed explanation of the algorithm selection process. / Lieutenant Commander, United States Navy
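The Hosmer-Lemeshow check mentioned above can be sketched as follows. This is a generic illustration of the statistic, not the S-Plus procedure used in the thesis: records are sorted by predicted probability, cut into groups, and observed counts of erroneous payments are compared with expected counts. The probabilities and outcomes are invented.

```python
def hosmer_lemeshow(probs, outcomes, n_groups=10):
    """Hosmer-Lemeshow statistic for predicted probabilities vs 0/1 outcomes."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        mean_p = expected / n_g
        if 0.0 < mean_p < 1.0:  # skip degenerate groups
            stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat

# Hypothetical predicted probabilities that a payment is erroneous.
probs = [0.2] * 5 + [0.8] * 5
flags = [0, 0, 0, 0, 1] + [1, 1, 1, 1, 0]

hl = hosmer_lemeshow(probs, flags, n_groups=2)
print(hl)
```

A small statistic means observed and expected counts agree group by group; in practice it is compared with a chi-square distribution on (number of groups - 2) degrees of freedom.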
|
28 |
Fatores abióticos condicionantes da distribuição de espécies arbóreas em quatro formações florestais do Estado de São Paulo / Abiotic factors determining spatial distribution of tree species in four forest formations of the State of São Paulo
Magalhães, Simone Rodrigues de 15 March 2016
In the study of forest communities, establishing the relative importance of the factors that define species composition and distribution is a challenge. Studying how tree species respond to environmental gradients is essential for understanding ecological processes and for conservation decisions. To help elucidate ecological processes in the main forest formations of the State of São Paulo (Dense Ombrophilous Forest of Lowlands, Submontane Dense Ombrophilous Forest, Seasonal Semideciduous Forest and Savanna Woodland), this study aimed to answer the following questions: (I) do floristic composition and tree species abundance in each phytogeographic unit vary along the edaphic and topographic gradient? (II) can soil characteristics and topography influence the predictability of occurrence of widely distributed tree species in different vegetation types? (III) is there a relationship between the spatial distribution pattern of tree species and soil and topographic parameters? The work was carried out in plots located in protected areas (PAs) containing stretches that are representative, in terms of conservation status and size, of the four main forest formations of the State of São Paulo. In each PA, individual trees (circumference at breast height ≥ 15 cm), topography, soil texture and soil chemical attributes were recorded in a 10.24 ha plot subdivided into 256 subplots. Canonical correspondence analyses (CCA) were applied to establish the correspondence between species abundance and the environmental gradient (soil and topography). The modified TWINSPAN method was applied to the CCA ordination diagram to evaluate the influence of the environmental variables (soil and topography) on species composition. Boosted Regression Trees (BRT) were fitted to predict species occurrence from the soil and topographic variables. The Getis-Ord index (G) was used to determine the spatial autocorrelation of the environmental variables used in the occurrence models. In the phytogeographic units analyzed, the correspondence between the environmental gradient (soil and topography) and species abundance was significant, especially in the Savanna Woodland, where the strongest relationship was observed. Soil and topography were also related to the similarity in floristic composition among subplots, except in the Seasonal Semideciduous Forest (EEC). The main soil and topographic variables related to the flora in each PA were: (1) in the Dense Ombrophilous Forest of Lowlands (PEIC), aluminium content in the deep layer (Al (80-100 cm)), which may reflect the Al content at the surface, soil acidity (pH(H2O) (5-25 cm)) and elevation, which delimited the flooded areas; (2) in the Submontane Dense Ombrophilous Forest (PECB), elevation, which, because of the rugged terrain, influences temperature and light incidence in the understory; (3) in the Savanna Woodland (EEA), fertility, aluminium tolerance and soil acidity. In the BRT prediction models, soil chemical variables were more important than texture, owing to the small variation of texture in the sampled areas. Among the soil chemical variables, cation exchange capacity was used to predict species occurrence in all four forest formations and was particularly important in the deepest soil layer of the Dense Ombrophilous Forest of Lowlands (PEIC). As for topography, elevation was included in most models and had different influences across the study areas. Overall, widely distributed species showed the same trend in their association with soil attributes, but with ranges of the edaphic descriptors that varied with the study area. The occurrence of Guapira opposita and Syagrus romanzoffiana, whose spatial pattern varied with scale, was explained by variables with aggregated spatial patterns that together accounted for 30% to 50% of relative importance in the BRT model. The presence of A. anthelmia, whose pattern also showed some degree of aggregation, was associated with only one variable with an aggregated pattern, elevation (21%), which may have strongly influenced the distribution of the species by delimiting flooded areas. T. guianensis was associated with predictor variables with aggregated spatial patterns that together accounted for about 70% of relative importance, which must have been enough to establish the aggregated pattern at all scales. However, the influence of environmental factors on the distribution pattern of a species does not depend only on its environmental optimum; it is a result of the species-environment interaction. We concluded that: (I) edaphic and topographic characteristics explained a small part of the floristic composition in each phytogeographic unit, although the occurrence of some species was associated with the edaphic and topographic gradient; (II) soil characteristics and topography made it possible to predict the presence of tree species, which showed particularities in their association with the soil of each vegetation type; (III) based on descriptive associations, soil and topography influence the spatial distribution pattern of the species to the extent that they contribute to their presence.
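The "relative importance" percentages reported for the BRT models can be illustrated with a deliberately simplified sketch. It assumes boosting of one-split trees (stumps) on squared error and credits each split's error reduction to the variable used, normalised to 100%. The thesis modelled species occurrence with dedicated BRT software, so the loss, the tree depth, the data and all names below are assumptions made for illustration only.

```python
def best_stump(X, residuals):
    """Best single split: (gain, feature, threshold, left_mean, right_mean)."""
    n = len(X)
    total = sum(residuals)
    base_sse = sum(r * r for r in residuals) - total * total / n
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            gain = base_sse - sse  # squared-error reduction of this split
            if best is None or gain > best[0]:
                best = (gain, j, t, lm, rm)
    return best

def boost(X, y, n_trees=50, learning_rate=0.1):
    """Boosted stumps; returns fitted values and relative importance in %."""
    pred = [sum(y) / len(y)] * len(y)
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, t, lm, rm = best_stump(X, residuals)
        importance[j] += gain  # credit the variable used by this tree
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for row, p in zip(X, pred)]
    total = sum(importance)
    return pred, [100.0 * v / total for v in importance]

# Hypothetical subplots: [elevation (m), cation exchange capacity] -> response.
X = [[10, 2], [20, 8], [30, 2], [40, 2], [50, 8], [60, 2]]
y = [1.0, 1.4, 1.0, 3.0, 3.4, 3.0]

fitted, rel_importance = boost(X, y)
print(rel_importance)
```

Because the toy response depends mainly on elevation, most of the accumulated error reduction, and therefore most of the relative importance, is attributed to that variable.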
|
29 |
Dinâmica temporal e influência de variáveis ambientais no recrutamento de peixes recifais do Banco dos Abrolhos, BA, Brasil. / Temporal dynamics and influence of environmental variables in the recruitment of reef fish of the Abrolhos Bank, Brazil
Sartor, Daniel 25 June 2015
Recruitment is extremely important in the reef environment because it is the main source of replenishment for adult fish populations. Reef fish recruitment is a highly complex process, and it is not clear whether it is influenced only by stochastic processes or also by deterministic ones. Here we investigated the temporal dynamics of recruitment for several reef fish species, identified nursery sites (i.e. sites with stable, high recruitment) and evaluated the influence of environmental variables on recruitment. We used data from a medium-term monitoring program (2001-2014) carried out on the Abrolhos Bank (BA, Brazil). More than 45 sites were sampled, and data were recorded on the fish community, the benthic community and other environmental variables. We assessed the variation in recruitment at each site during two distinct periods (2001-2008 and 2006-2014) and used the Boosted Regression Trees (BRT) technique to evaluate the influence of environmental variables on recruitment. We found that several reef fish species show stable recruitment at different sampling sites. BRT showed a positive effect of the density of adult conspecifics and of the relative cover of fleshy algae on the recruitment (i.e. abundance of young-of-the-year) of many of the species analyzed. Overall, recruitment appears to be species-specific to some degree, but at larger spatial scales the patterns may be linked to more general characteristics shared within a higher taxonomic group. Nursery sites varied among species, and one site stood out as a nursery for 5 different species, including Scarus trispinosus, one of the priority species for conservation in the Abrolhos region. We therefore recommend the creation of a no-take marine protected area that encompasses this site. Our findings also reinforce the theory that reef fish recruitment may be influenced by deterministic processes and does not vary only stochastically.
|
30 |
Application des arbres décisionnels en grappes pour prédire la performance des institutions microfinancières / Application of decision-trees for predicting the performance of microfinance institutions
Bou Kheir, Roy 28 June 2013
Financial and social performance are important institutional characteristics that allow the poor and the near-poor to access credit under favorable conditions and that drive sustainable operation and effective governance mechanisms in microfinance institutions (MFIs). In this context, this study was conducted to determine the financial, social and governance variables (with their relative importance in %) that most influence the financial and social performance indicators of MFIs worldwide, and to develop, for the first time, simple and practical decision-tree models that can serve as valuable tools for implementing efficient strategies among nonprofit and for-profit MFIs at a national scale.
The first part of this thesis presents the global financial and social data extracted for the five years 2007-2011 from several well-known databases (e.g., Microfinance Information Exchange, Mix Market, Rating fund) for the selected MFIs rated four or five diamonds (263 nonprofit MFIs and 135 for-profit ones), distributed across the continents. Among the 263 nonprofit MFIs, the data sample comprised 192 non-governmental organizations (NGOs), 42 non-bank institutions and 29 cooperatives. A large number of predictor variables (54) were collected, capturing aspects of the financial environment of these MFIs (e.g., administrative expense ratio, solvency ratio, cost per loan, number of depositors, write-off ratio), their social characteristics (e.g., depth, percentage of women among active borrowers, rural/urban market, poverty level) and their governance mechanisms (e.g., firm size, board size, regulation, audit, network affiliation, insurance). This first part also compares the effectiveness of the most widely used statistical methods and models (including linear regression, logistic regression, Bayesian methods, artificial neural networks, cluster analysis, principal component analysis and decision trees) for estimating financial and social performance indicators of MFIs. It also includes a detailed description of the tree-building process used for this estimation and of all related steps (evaluating splits, assigning categories to nodes, handling missing values with surrogate splitters, stopping criteria, etc.).
The second part explores the quantitative relationships between the four most commonly used financial performance indicators (operational self-sufficiency, OSS; profit margin, PM; return on assets, ROA; and return on equity, ROE) and key financial, social and governance predictor variables for the selected nonprofit MFIs (from 53 countries) through regression-tree modeling. For each financial performance indicator, several un-pruned regression trees (684) were developed: (i) using all predictor variables, (ii) using only the financial predictor variables, (iii) using only the social predictor variables, (iv) using only the governance predictor variables, (v) using a single predictor variable at a time, (vi) excluding each variable in turn from the pool of potential predictors, and (vii) forcing the initial split of the tree on the preferred predictor variable, in order to explore the predictive power of independent predictors. The results show that the strongest relationships were obtained for ROE and ROA, with 99.8% and 99.5% of variance explained respectively, followed by PM (97%) and OSS (95%). The second part also showed that the financial predictor variables contributed differently to the financial performance regression trees: administrative expense ratio influenced ROE (100%); average loan balance per borrower affected OSS (100%); and cost per borrower, number of depositors, the ratio of operating expense to loan portfolio, and risk coverage had significant impacts on ROA/ROE (98.5-100%).
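The tree-building steps listed in the abstract (evaluating splits, stopping criteria) and the "proportion of variance explained" can be illustrated with a minimal sketch of an un-pruned regression tree. It assumes splits are chosen by least squared error and that growth stops at a minimum leaf size. The predictor columns, the ROA values and all function names are hypothetical, and surrogate splits for missing values are not covered.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def grow(rows, targets, min_leaf=1):
    """Grow an un-pruned regression tree as nested dicts."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[j] <= t]
            right = [i for i, r in enumerate(rows) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # stopping criterion: minimum leaf size
            cost = (sse([targets[i] for i in left])
                    + sse([targets[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None or best[0] >= sse(targets):
        return {"value": sum(targets) / len(targets)}  # leaf: mean response
    _, j, t, left, right = best
    return {
        "feature": j,
        "threshold": t,
        "left": grow([rows[i] for i in left], [targets[i] for i in left], min_leaf),
        "right": grow([rows[i] for i in right], [targets[i] for i in right], min_leaf),
    }

def predict(tree, row):
    while "value" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["value"]

def variance_explained(tree, rows, targets):
    """1 - residual sum of squares / total sum of squares."""
    residual = sum((y - predict(tree, r)) ** 2 for r, y in zip(rows, targets))
    return 1.0 - residual / sse(targets)

# Hypothetical MFIs: [administrative expense ratio, cost per borrower] -> ROA.
rows = [[0.10, 120], [0.15, 150], [0.20, 90], [0.25, 200], [0.30, 180], [0.35, 210]]
roa = [0.08, 0.06, 0.05, 0.02, 0.01, -0.01]

r2_full = variance_explained(grow(rows, roa, min_leaf=1), rows, roa)
r2_coarse = variance_explained(grow(rows, roa, min_leaf=3), rows, roa)
print(r2_full, r2_coarse)
```

With `min_leaf=1` the tree isolates every toy observation, so the proportion of variance explained on the training data reaches 1; requiring larger leaves lowers it.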
|