About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

Estrutura da comunidade de mamíferos de médio e grande porte em uma paisagem fragmentada com matriz de eucalipto, Capão Bonito e Buri, SP / Medium to large-sized mammal community structure in a fragmented landscape with eucalyptus matrix, Capão Bonito and Buri, SP

Elson Fernandes de Lima 22 February 2013
Land use conversion is a major threat to wildlife: the landscape becomes fragmented and the area occupied by native vegetation is reduced, which can alter the structure of animal communities. In this study, the medium- to large-sized mammal community (> 1 kg) was evaluated in a fragmented landscape with a eucalyptus matrix in southern São Paulo State, in the municipalities of Buri and Capão Bonito (23°52'47" S, 48°23'24" W), using three methods installed together: sand plots, camera traps, and scent stations (the latter baited with lures targeted at carnivores and omnivores, Canine Call® and Pro's Choice®). The objectives of this work were: i) to evaluate the structure of the community as a function of landscape structure; ii) to compare the sampling methods used and discuss their application. Species were surveyed in five campaigns of five days each between 2010 and 2012, with sampling units installed in the matrix, in corridors, and in forest fragments; landscape elements were evaluated in buffers of 250, 500, 1000, and 2000 m.
We recorded 20 medium- and large-sized mammal species, the majority (n = 18) in areas of native vegetation (corridors and habitat patches), although several species were also recorded in the matrix. Other biodiversity measures, such as functional group richness and functional diversity, were evaluated but proved redundant with species richness. The proportions of native vegetation at 250 and 2000 m were the most important variables explaining the record frequency of many species. In terms of composition, the assemblages found in habitat patches and forest corridors were similar. Among the sampling methods, sand plots were the most effective for short-term assessment, but their financial cost is significantly higher for long-term studies. The use of scent-station lures is an innovation in the Neotropics, despite uncertainty about how well they attract fauna. A disadvantage of track-based methods is that several species cannot be identified accurately because their tracks are similar; this rarely occurs with camera traps, where specimens appear in photographs, allowing reliable identification. The different methods showed different record rates, i.e., they did not detect the same community; however, richness estimates evaluated separately indicate that all three would converge on the same final result. This study suggests that fragmented silvicultural landscapes can be important for mammal conservation if well planned, for example by maintaining habitat patches in good condition and corridors connecting the forest elements. Moreover, the choice of sampling method should be planned according to the purpose of the study.
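The abstract notes that the richness estimates from the three methods would converge on the same result. One classic nonparametric richness estimator — used here purely as an illustration, since the thesis does not specify which estimator it used — is Chao1, which corrects observed richness using the counts of species recorded exactly once or twice:

```python
def chao1(counts):
    """Chao1 species-richness estimate from per-species record counts."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                       # observed species
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    if f2 == 0:                               # bias-corrected form when no doubletons
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)

# hypothetical records per species from one sampling method
records = [12, 7, 5, 3, 2, 2, 1, 1, 1]
print(chao1(records))  # → 11.25 (9 observed + 3²/(2·2) estimated unseen)
```

The estimator adds more unseen species when many species are represented by a single record, which is exactly the situation short camera-trap or sand-plot campaigns tend to produce.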
162

Projeto de classificadores de padrões baseados em protótipos usando evolução diferencial / On the efficient design of a prototype-based classifier using differential evolution

Luiz Soares de Andrade Filho 28 November 2014
In this Master's dissertation we introduce an evolutionary approach for the efficient design of prototype-based classifiers using differential evolution (DE). For this purpose we combine ideas from the Learning Vector Quantization (LVQ) framework for supervised classification introduced by Kohonen (KOHONEN, 2001) with the DE-based automatic clustering approach of Das et al. (DAS; ABRAHAM; KONAR, 2008) in order to evolve supervised classifiers. The proposed approach determines both the optimal number of prototypes per class and the corresponding positions of these prototypes in the data space. Through comprehensive computer simulations on benchmark datasets commonly used in performance comparisons, we show that the resulting classifier, named LVQ-DE, achieves results equivalent to (and often better than) state-of-the-art prototype-based classifiers, with a much smaller number of prototypes.
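The LVQ building block of LVQ-DE can be illustrated with the classic LVQ1 update rule: the nearest prototype is attracted toward a training sample of its own class and repelled otherwise. The sketch below is plain LVQ1 with a fixed set of prototypes on toy data — the differential-evolution search over prototype counts and positions described in the abstract is not shown:

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, lr=0.1, epochs=20, seed=0):
    """Minimal LVQ1: move the winning prototype toward (same class)
    or away from (different class) each training sample."""
    rng = np.random.default_rng(seed)
    P = prototypes.astype(float).copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            d = np.linalg.norm(P - X[i], axis=1)
            j = int(np.argmin(d))                          # winning prototype
            sign = 1.0 if proto_labels[j] == y[i] else -1.0
            P[j] += sign * lr * (X[i] - P[j])
    return P

def lvq_predict(X, P, proto_labels):
    d = np.linalg.norm(X[:, None, :] - P[None, :, :], axis=2)
    return proto_labels[np.argmin(d, axis=1)]

# toy two-class problem: clusters around (0, 0) and (3, 3)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
P0 = np.array([[0.5, 0.5], [2.5, 2.5]])
labels = np.array([0, 1])
P = lvq1_train(X, y, P0, labels)
acc = (lvq_predict(X, P, labels) == y).mean()
print(acc)  # near-perfect on this well-separated toy set
```

In the dissertation's setting, DE would evolve the number of rows of `P` per class as well as their coordinates, rather than relying on a hand-picked initialization.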
163

Seleção de modelos multiníveis para dados de avaliação educacional / Selection of multilevel models for educational evaluation data

Fabiano Rodrigues Coelho 11 August 2017
When a dataset has a hierarchical structure, multilevel regression models are a natural approach, justified by the significant share of data variability that can be explained at macro levels. In this work we develop the selection of multilevel regression models applied to educational data. The analysis is divided into two parts, variable selection and model selection, the latter subdivided into classical and Bayesian modeling. Using criteria such as the Lasso, AIC, BIC, and WAIC, among others, we seek the factors that influence the mathematics performance of ninth-grade students in elementary education in the state of São Paulo, and we also investigate how each variable and model selection criterion behaves. We conclude that, under the frequentist approach, BIC is the most efficient model selection criterion, whereas under the Bayesian approach WAIC gave the best results. Using the Lasso for variable selection in the classical approach reduced the number of predictors in the model by 34%. Finally, we identify that the mathematics performance of ninth-grade students in the state of São Paulo is influenced by the following covariates: mother's educational level, frequency of book reading, time spent on recreation on school days, liking mathematics, the school's overall mathematics performance, the student's performance in Portuguese, the school's administrative dependence, gender, father's educational level, grade retentions, and age-grade distortion.
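The frequentist criteria compared in this work trade goodness of fit against model size. As a minimal illustration — ordinary regression rather than the multilevel models of the thesis — AIC and BIC for a Gaussian OLS fit differ only in the per-parameter penalty, 2 versus log n, which is why BIC rejects weak predictors more aggressively:

```python
import numpy as np

def gaussian_aic_bic(y, X):
    """AIC and BIC for an OLS fit with Gaussian errors (variance profiled out)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    ll = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)  # profiled log-likelihood
    p = k + 1                                          # coefficients + error variance
    return 2 * p - 2 * ll, np.log(n) * p - 2 * ll      # (AIC, BIC)

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                                # irrelevant regressor
y = 1.0 + 2.0 * x1 + 0.5 * rng.normal(size=n)

aic_s, bic_s = gaussian_aic_bic(y, np.column_stack([np.ones(n), x1]))
aic_b, bic_b = gaussian_aic_bic(y, np.column_stack([np.ones(n), x1, x2]))
# BIC penalizes the extra parameter by log(200) ≈ 5.3 versus AIC's 2,
# so it is more inclined to exclude the irrelevant x2
print(round(bic_s, 2), round(bic_b, 2))
```

The same fit-versus-complexity logic carries over to the multilevel and Bayesian criteria (WAIC) discussed in the abstract, with the likelihood and the effective parameter count computed differently.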
164

Sophisticated and small versus simple and sizeable: When does it pay off to introduce drifting coefficients in Bayesian VARs?

Feldkircher, Martin, Huber, Florian, Kastner, Gregor 01 1900
We assess the relationship between model size and complexity in the time-varying parameter VAR framework via thorough predictive exercises for the Euro Area, the United Kingdom and the United States. It turns out that sophisticated dynamics through drifting coefficients are important in small data sets, while simpler models tend to perform better in sizeable data sets. To combine the best of both worlds, novel shrinkage priors help to mitigate the curse of dimensionality, resulting in competitive forecasts for all scenarios considered. Furthermore, we discuss dynamic model selection to improve upon the best-performing individual model at each point in time. / Series: Department of Economics Working Paper Series
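The simplest instance of a drifting coefficient is a single regression slope following a random walk, which a scalar Kalman filter can track. This toy sketch (one coefficient, noise variances assumed known — far simpler than the TVP-VARs with shrinkage priors studied in the paper) shows the mechanism the abstract refers to:

```python
import numpy as np

def kalman_tvp(y, x, q=0.01, r=1.0):
    """Kalman filter for y_t = beta_t * x_t + eps_t with beta_t = beta_{t-1} + eta_t.
    q and r are the state- and observation-noise variances (assumed known here)."""
    beta, P = 0.0, 1.0
    path = []
    for yt, xt in zip(y, x):
        P += q                              # predict: state variance grows
        S = xt * P * xt + r                 # innovation variance
        K = P * xt / S                      # Kalman gain
        beta += K * (yt - xt * beta)        # update toward the new observation
        P *= (1 - K * xt)
        path.append(beta)
    return np.array(path)

rng = np.random.default_rng(42)
T = 500
x = rng.normal(size=T)
beta_true = np.cumsum(rng.normal(scale=0.1, size=T))   # random-walk coefficient
y = beta_true * x + rng.normal(size=T)
beta_hat = kalman_tvp(y, x, q=0.01, r=1.0)
rmse = float(np.sqrt(np.mean((beta_hat - beta_true) ** 2)))
print(round(rmse, 3))  # tracks the drifting slope far better than a static fit
```

In a full TVP-VAR every coefficient of every equation drifts this way, which is exactly the dimensionality problem the paper's shrinkage priors are designed to tame.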
165

Approches statistiques en segmentation : application à la ré-annotation de génome / Statistical Approaches for Segmentation : Application to Genome Annotation

Cleynen, Alice 15 November 2013
We propose to model the output of transcriptome sequencing technologies (RNA-Seq) using the negative binomial distribution, and we build segmentation models suited to their analysis at different biological scales, in a context where these technologies have become a valuable tool for genome annotation, gene expression analysis, and new transcript discovery. We develop a fast segmentation algorithm for analyzing whole-chromosome series, and we propose two methods for estimating the number of segments, a quantity directly related to the number of genes expressed in the cell, whether previously annotated or discovered on this occasion. The goal of precise gene annotation, and in particular of comparing transcription start and end sites between individuals, naturally leads us to the statistical comparison of change-point locations in independent series. Within a Bayesian segmentation framework, we build tools to address these questions, for which we are able to provide uncertainty measures. We illustrate our models, all implemented in R packages, on RNA-Seq data from experiments on yeast, showing for instance that intron boundaries are conserved across conditions while transcription starts and ends are subject to differential splicing.
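The core segmentation step can be sketched as a small dynamic program that places K − 1 change-points so as to minimize a per-segment negative log-likelihood. For brevity this sketch uses a Poisson cost instead of the negative binomial developed in the thesis, and fixes the number of segments rather than estimating it:

```python
import numpy as np

def poisson_seg_cost(cum, i, j):
    """Poisson negative log-likelihood (up to constants) of segment y[i:j]."""
    s, n = cum[j] - cum[i], j - i
    if s == 0:
        return 0.0
    lam = s / n                                  # segment-wise MLE of the rate
    return float(-s * np.log(lam) + n * lam)

def segment(y, K):
    """Split y into K segments minimizing total cost, by dynamic programming."""
    n = len(y)
    cum = np.concatenate([[0], np.cumsum(y)])
    C = np.full((K + 1, n + 1), np.inf)          # C[k, j]: best cost of y[:j] in k segments
    back = np.zeros((K + 1, n + 1), dtype=int)
    C[0, 0] = 0.0
    for k in range(1, K + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = C[k - 1, i] + poisson_seg_cost(cum, i, j)
                if c < C[k, j]:
                    C[k, j], back[k, j] = c, i
    cps, j = [], n
    for k in range(K, 0, -1):                    # walk back through segment starts
        j = int(back[k, j])
        cps.append(j)
    return sorted(cps)[1:]                       # drop the leading 0

y = np.array([2] * 30 + [15] * 30 + [3] * 30)    # e.g. read counts along a region
print(segment(y, 3))  # → [30, 60]
```

The thesis's fast algorithm avoids this O(Kn²) scan so that chromosome-length series become tractable, and its two order-selection methods choose K itself; the DP above only shows the objective being optimized.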
166

Finite Alphabet Blind Separation

Behr, Merle 06 December 2017
No description available.
167

Bayesian exploratory factor analysis

Conti, Gabriella, Frühwirth-Schnatter, Sylvia, Heckman, James J., Piatek, Rémi 27 June 2014
This paper develops and applies a Bayesian approach to Exploratory Factor Analysis that improves on ad hoc classical approaches. Our framework relies on dedicated factor models and simultaneously determines the number of factors, the allocation of each measurement to a unique factor, and the corresponding factor loadings. Classical identification criteria are applied and integrated into our Bayesian procedure to generate models that are stable and clearly interpretable. A Monte Carlo study confirms the validity of the approach. The method is used to produce interpretable low dimensional aggregates from a high dimensional set of psychological measurements. (authors' abstract)
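A quick way to see the factor-number problem the paper addresses: simulate data from a two-factor model and count eigenvalues of the correlation matrix above 1. This is the classical Kaiser rule — an ad hoc classical device of exactly the kind the Bayesian approach is meant to improve on, shown here on hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 1000, 10, 2
L = rng.normal(size=(p, k))                    # true loadings (hypothetical)
F = rng.normal(size=(n, k))                    # factor scores
X = F @ L.T + 0.5 * rng.normal(size=(n, p))    # measurements = signal + noise

# Kaiser rule: count eigenvalues of the correlation matrix exceeding 1
eig = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
n_factors = int(np.sum(eig > 1.0))
print(n_factors)  # recovers the two simulated factors here
```

Unlike this thresholding heuristic, the paper's procedure infers the number of factors, the measurement-to-factor allocation, and the loadings jointly within one Bayesian model.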
168

Projeção de preços de alumínio: modelo ótimo por meio de combinação de previsões / Aluminum price forecasting: optimal forecast combination

João Bosco Barroso de Castro 15 June 2015
Primary commodities, including metals, oil, and agricultural products, are key raw materials for the global economy. Among metals, aluminum stands out for its wide industrial use and for holding the largest contract volume on the London Metal Exchange (LME). Because its price is not directly related to production costs, volatility or economic shocks have a significant financial impact on the global aluminum industry; aluminum price forecasting is therefore critical for industrial policy as well as for producers and consumers. This work proposes an optimal forecast model for aluminum prices, using forecast combination and the Model Confidence Set (MCS) for model selection, with greater predictive power than traditional methods; this approach fills a gap in the literature on aluminum price forecasting.
Five individual models were fitted: an AR(1) benchmark, an ARIMA, two ARIMAX models, and a structural model, using monthly data from January 1999 to September 2014. For each individual model, 142 out-of-sample 12-month-ahead forecasts were generated through a 36-month rolling window. Nine forecast combinations were developed for each individual model estimation, resulting in 60 out-of-sample 12-month-ahead forecasts. Predictive performance was assessed with the MCS over the latest 60, 48, and 36 months. In total, 1,250 estimations were performed and 1,140 independent variables and their transformations were assessed. The combination of the ARIMA and one ARIMAX model was the only model that remained in the set of best-performing models for all three periods at an MCS p-value of 0.10; for the latest 36 months, the proposed combination outperformed all other models. Two covariates identified in the ARIMAX model, the 3-month forward price and global inventories, increased forecast accuracy. The optimal combination produced a small confidence interval, equivalent to 5% of the average aluminum price over the full sample, providing relevant support for decision making in the global aluminum industry.
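Forecast combination in its simplest form weights competing forecasts by their historical accuracy. This sketch — inverse-MSE weights on hypothetical data, not the nine combination schemes of the dissertation — shows why a combination can approach, and sometimes beat, the best individual model:

```python
import numpy as np

def combine_forecasts(forecasts, actual, train_frac=0.5):
    """Weight each row of `forecasts` inversely to its MSE on a training
    window (one classic combination scheme), then combine."""
    n = len(actual)
    split = int(n * train_frac)
    mse = np.mean((forecasts[:, :split] - actual[:split]) ** 2, axis=1)
    w = (1 / mse) / np.sum(1 / mse)
    return w, forecasts.T @ w

# hypothetical target series and two competing forecasters
rng = np.random.default_rng(7)
T = 200
truth = np.cumsum(rng.normal(size=T))
f1 = truth + rng.normal(scale=0.5, size=T)     # accurate model
f2 = truth + rng.normal(scale=2.0, size=T)     # noisy model
w, combo = combine_forecasts(np.vstack([f1, f2]), truth)
mse_combo = float(np.mean((combo - truth) ** 2))
print(w.round(2), round(mse_combo, 3))  # most weight goes to the accurate model
```

When forecast errors are not perfectly correlated, the weighted average cancels part of each model's noise, which is the intuition behind combining the ARIMA and ARIMAX forecasts in the dissertation.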
169

Discrepancy-based algorithms for best-subset model selection

Zhang, Tao 01 May 2013
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the probability of selecting an inappropriate model generally increases as the size of the family increases. For this reason, it is usually difficult to select an optimal model when best-subset selection is attempted based on a moderate to large number of regressor variables. Model selection criteria are often constructed to estimate discrepancy measures used to assess the disparity between each fitted candidate model and the generating model. The Akaike information criterion (AIC) and the corrected AIC (AICc) are designed to estimate the expected Kullback-Leibler (K-L) discrepancy. For best-subset selection, both AIC and AICc are negatively biased, and the use of either criterion will lead to overfitted models. To correct for this bias, we introduce a criterion AICi, which has a penalty term evaluated from Monte Carlo simulation. A multistage model selection procedure AICaps, which utilizes AICi, is proposed for best-subset selection. In the framework of linear regression models, the Gauss discrepancy is another frequently applied measure of proximity between a fitted candidate model and the generating model. Mallows' conceptual predictive statistic (Cp) and the modified Cp (MCp) are designed to estimate the expected Gauss discrepancy. For best-subset selection, Cp and MCp exhibit negative estimation bias. To correct for this bias, we propose a criterion CPSi that again employs a penalty term evaluated from Monte Carlo simulation. We further devise a multistage procedure, CPSaps, which selectively utilizes CPSi. 
In this thesis, we consider best-subset selection in two different modeling frameworks: linear models and generalized linear models. Extensive simulation studies are compiled to compare the selection behavior of our methods and other traditional model selection criteria. We also apply our methods to a model selection problem in a study of bipolar disorder.
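Exhaustive best-subset search scored by AIC can be sketched in a few lines. With several irrelevant regressors it also illustrates the overfitting tendency the thesis corrects with its Monte-Carlo-penalized AICi: the true predictors are recovered, but spurious ones may slip into the selected subset (the data below are hypothetical):

```python
import itertools
import numpy as np

def aic_ols(y, X):
    """AIC (up to constants) of a Gaussian OLS fit."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * (X.shape[1] + 1)

def best_subset(y, X):
    """Fit every subset of regressors (intercept always included), keep the
    lowest-AIC model — feasible only for a modest number of candidates."""
    n, p = X.shape
    best = (np.inf, ())
    for k in range(p + 1):
        for S in itertools.combinations(range(p), k):
            design = np.column_stack([np.ones(n)] + [X[:, j] for j in S])
            best = min(best, (aic_ols(y, design), S))
    return best

rng = np.random.default_rng(3)
n, p = 300, 6
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=n)   # only x0 and x3 matter
aic, subset = best_subset(y, X)
print(subset)  # contains 0 and 3; AIC's light penalty may admit extras
```

The number of fits here is 2^p, which is why the thesis emphasizes that selection error grows with the size of the candidate family, and why its multistage procedures prune that family before applying the simulation-based criteria.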
170

Estimation adaptative pour les modèles de Markov cachés non paramétriques / Adaptive estimation for nonparametric hidden Markov models

Lehéricy, Luc 14 December 2018
During my PhD I studied theoretical properties of nonparametric hidden Markov models. Nonparametric models avoid the loss of performance caused by an inappropriate choice of parametrization, hence a recent interest in applications. In a first part, I consider the estimation of the number of hidden states and introduce two consistent estimators: the first based on a penalized least squares criterion, the second on a spectral method. Once the order is known, the other parameters can be estimated. In a second part, I consider two adaptive estimators of the emission distributions; adaptivity means that their rate of convergence adapts to the regularity of the target distribution. Contrary to existing methods, these estimators adapt to the regularity of each distribution instead of only the worst regularity. The third part is focused on the misspecified setting, that is, when the observations may not come from a hidden Markov model. I control the prediction error of the maximum likelihood estimator when the true distribution satisfies general forgetting and mixing assumptions. Finally, I introduce a nonhomogeneous variant of hidden Markov models, hidden Markov models with trends, and show that the maximum likelihood estimator of such models is consistent.
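The spectral idea behind order estimation can be seen on a discrete HMM: the joint distribution of two consecutive observations factors through the K hidden states, so its matrix has rank at most K. A crude sketch — simulated data, a hand-picked singular-value threshold, and not the thesis's actual estimator — recovers the order from that rank:

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_sym, T = 2, 6, 50000
A = np.array([[0.9, 0.1], [0.2, 0.8]])                # hidden-state transitions
B = np.array([[0.40, 0.30, 0.10, 0.10, 0.05, 0.05],
              [0.05, 0.05, 0.10, 0.10, 0.30, 0.40]])  # emission probabilities

# simulate the chain and its observations
states = np.zeros(T, dtype=int)
for t in range(1, T):
    states[t] = rng.choice(K, p=A[states[t - 1]])
obs = np.array([rng.choice(n_sym, p=B[s]) for s in states])

# empirical joint distribution of consecutive observation pairs
J = np.zeros((n_sym, n_sym))
np.add.at(J, (obs[:-1], obs[1:]), 1.0)
J /= J.sum()

# the population version of J has rank K; count significant singular values
sv = np.linalg.svd(J, compute_uv=False)
order = int(np.sum(sv > 0.05 * sv[0]))
print(order)
```

In the nonparametric continuous-emission setting of the thesis, the same low-rank structure appears after projecting the observations onto a basis, and the consistent estimators replace this ad hoc threshold with principled penalization.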
