  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Air quality analysis: a time series study for count data / Análise da qualidade do ar : um estudo de séries temporais para dados de contagem

Silva, Kelly Cristina Ramos da 30 April 2013
The aim of this study was to investigate the monthly number of days unfavourable to pollutant dispersion in the atmosphere of the metropolitan region of São Paulo (RMSP). Two data sets derived from air quality monitoring in the RMSP were considered: (1) monthly observations of the time series for the full annual period and (2) monthly observations of the time series for the period from May to September. Two classes of models were used: Vector Autoregressive (VAR) models and Generalized Additive Models for Location, Scale and Shape (GAMLSS). The VAR techniques presented in this dissertation emphasise the modelling of stationary time series, while the GAMLSS techniques emphasise models for count data, namely Delaporte (DEL), Negative Binomial type I (NBI), Negative Binomial type II (NBII), Poisson (PO), Zero-Inflated Poisson (ZIP), Poisson Inverse Gaussian (PIG) and Sichel (SI). The VAR model was used only for data set (1) and gave a good prediction of the monthly number of unfavourable days, although the fit showed relatively large residuals. The GAMLSS were applied to both data sets; the NBII model fitted data set (1) best and the ZIP model fitted data set (2) best. In addition, a simulation study was carried out to better understand the GAMLSS investigated. The data were generated from three different Negative Binomial distributions. The results show that the NBI, NBII and PIG models all fitted the generated data well. The statistical techniques used in this dissertation were important for describing and understanding the air quality problem. / Financiadora de Estudos e Projetos
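To illustrate why Negative Binomial models are considered alongside the Poisson model for counts such as these, the sketch below compares the two on a small overdispersed series. It is not the GAMLSS procedure used in the dissertation: the counts are hypothetical, the Negative Binomial parameters come from a simple method-of-moments estimate rather than maximum likelihood, and the function names are illustrative, so the AIC comparison is only a rough guide.

```python
import math
import statistics

# Hypothetical monthly counts of unfavourable days (not data from the thesis).
counts = [0, 0, 1, 0, 3, 8, 12, 9, 4, 0, 0, 1]


def poisson_loglik(ys, lam):
    """Log-likelihood of a Poisson model with rate lam."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)


def negbin_loglik(ys, mu, k):
    """Log-likelihood of a Negative Binomial with mean mu and variance mu + mu**2 / k."""
    return sum(
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu)) + y * math.log(mu / (k + mu))
        for y in ys
    )


def poisson_aic(ys):
    """AIC of the Poisson fit (one parameter; the sample mean is the MLE)."""
    lam = statistics.mean(ys)
    return 2 * 1 - 2 * poisson_loglik(ys, lam)


def negbin_aic(ys):
    """Approximate AIC of a Negative Binomial fit using moment estimates (assumption)."""
    mu = statistics.mean(ys)
    var = statistics.variance(ys)
    if var <= mu:
        raise ValueError("no overdispersion: moment estimate of k is undefined")
    k = mu ** 2 / (var - mu)
    return 2 * 2 - 2 * negbin_loglik(ys, mu, k)


print(f"mean = {statistics.mean(counts):.2f}, variance = {statistics.variance(counts):.2f}")
print(f"Poisson AIC = {poisson_aic(counts):.1f}")
print(f"Negative Binomial AIC = {negbin_aic(counts):.1f}")
```

When the sample variance is well above the mean, as in this made-up series, the Negative Binomial fit has the lower AIC, which is the kind of evidence that motivates moving beyond the Poisson model.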
2

Assessment of foliar nitrogen as an indicator of vegetation stress using remote sensing : the case study of Waterberg region, Limpopo Province

Manyashi, Enoch Khomotso 06 1900
Vegetation status is a key indicator of ecosystem condition in a given area. The objective of this study was to estimate leaf nitrogen (N) as an indicator of vegetation water stress using vegetation indices, especially those based on the red edge, and to determine how leaf N concentration is influenced by various environmental factors. Leaf N was estimated using univariate regression and two multivariate techniques: stepwise multiple linear regression (SMLR) and random forest. The effects of environmental parameters on leaf N distribution were tested through univariate regression and analysis of variance (ANOVA). The vegetation indices evaluated were derived from analytical spectral device (ASD) data resampled to RapidEye. The best model was chosen based on the lowest root mean square error (RMSE) and the highest coefficient of determination (R²). Univariate results showed that the red edge based MERIS Terrestrial Chlorophyll Index (MTCI) yielded higher leaf N estimation accuracy than the other vegetation indices. For SMLR, the simple ratio (SR) based on the red and near-infrared bands was the best vegetation index when the red edge band was excluded, and the simple ratio SR3 was the best when the red edge band was included. The random forest model achieved the highest leaf N estimation accuracy; its best vegetation index was the Red Green Index (RGI1) when the red edge band was included and the Difference Vegetation Index (DVI1) when it was excluded. Both the univariate and the multivariate results indicated that including the red edge band provides an opportunity to estimate leaf N accurately. Analysis of variance showed that vegetation and soil types have a significant effect on leaf N distribution (p < 0.05). Red edge based indices therefore provide an opportunity to assess vegetation health using remote sensing techniques. / Environmental Sciences / M. Sc. (Environmental Management)
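The index families named in this abstract can be sketched from their commonly used textbook definitions. The abstract does not give the band combinations behind the numbered variants (SR3, RGI1, DVI1), so the functions below show only the generic forms, and the reflectance values are hypothetical.

```python
import math


def simple_ratio(nir, red):
    """Simple ratio (SR): near-infrared over red reflectance."""
    return nir / red


def difference_vegetation_index(nir, red):
    """Difference Vegetation Index (DVI): near-infrared minus red reflectance."""
    return nir - red


def red_green_index(red, green):
    """Red Green Index (RGI): red over green reflectance."""
    return red / green


def mtci(nir, red_edge, red):
    """MERIS Terrestrial Chlorophyll Index, written with generic band names."""
    return (nir - red_edge) / (red_edge - red)


# Hypothetical reflectances for one vegetation sample (not values from the thesis).
green, red, red_edge, nir = 0.08, 0.05, 0.20, 0.45

print("SR  :", round(simple_ratio(nir, red), 3))
print("DVI :", round(difference_vegetation_index(nir, red), 3))
print("RGI :", round(red_green_index(red, green), 3))
print("MTCI:", round(mtci(nir, red_edge, red), 3))
```

Only MTCI uses the red edge band here, which is why it is the natural candidate when the abstract compares results with and without that band.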
3

Assessment of foliar nitrogen as an indicator of vegetation stress using remote sensing : the case study of Waterberg region, Limpopo Province

Manyashi, Enoch Khomotšo 06 1900
Vegetation status is a key indicator of ecosystem condition in a given area. The objective of this study was to estimate leaf nitrogen (N) as an indicator of vegetation water stress using vegetation indices, especially those based on the red edge, and to determine how leaf N concentration is influenced by various environmental factors. Leaf N was estimated using univariate regression and two multivariate techniques: stepwise multiple linear regression (SMLR) and random forest. The effects of environmental parameters on leaf N distribution were tested through univariate regression and analysis of variance (ANOVA). The vegetation indices evaluated were derived from analytical spectral device (ASD) data resampled to RapidEye. The best model was chosen based on the lowest root mean square error (RMSE) and the highest coefficient of determination (R²). Univariate results showed that the red edge based MERIS Terrestrial Chlorophyll Index (MTCI) yielded higher leaf N estimation accuracy than the other vegetation indices. For SMLR, the simple ratio (SR) based on the red and near-infrared bands was the best vegetation index when the red edge band was excluded, and the simple ratio SR3 was the best when the red edge band was included. The random forest model achieved the highest leaf N estimation accuracy; its best vegetation index was the Red Green Index (RGI1) when the red edge band was included and the Difference Vegetation Index (DVI1) when it was excluded. Both the univariate and the multivariate results indicated that including the red edge band provides an opportunity to estimate leaf N accurately. Analysis of variance showed that vegetation and soil types have a significant effect on leaf N distribution (p < 0.05). Red edge based indices therefore provide an opportunity to assess vegetation health using remote sensing techniques. / Environmental Sciences / M. Sc. (Environmental Management)
4

Neural networks regularization through representation learning / Régularisation des réseaux de neurones via l'apprentissage des représentations

Belharbi, Soufiane 06 July 2018
Neural network models, and deep models in particular, are among the state-of-the-art models in machine learning and its applications. Most successful deep neural networks have many hidden layers, which greatly increases their number of parameters. Training such models therefore requires a large number of labeled training samples, which are not always available in practice. Overfitting is one of the fundamental issues in neural networks: it occurs when the model memorizes the training data, typically when a large model is trained on few samples, and it leads to poor generalization on new data. Overfitting is the main issue tackled in this thesis. Many approaches have been proposed to prevent overfitting and improve generalization, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout and batch normalization. In this thesis, we tackle neural network overfitting from a representation learning perspective, considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first, presented in chapter 2, deals with structured output problems, in which the output variable y is high-dimensional and its components are linked by structural dependencies. Our proposal aims to exploit these dependencies by learning them in an unsupervised way with autoencoders. Validated on a multivariate regression problem, facial landmark detection, the approach sped up network training and improved generalization. The second contribution, described in chapter 3, deals with classification and exploits prior knowledge about the internal representations of the hidden layers. This prior is based on the simple idea that samples of the same class should have the same internal representation. We formulate the prior as a penalty added to the training cost to be minimized. Experiments on MNIST and its variants showed improved generalization, particularly when only few training samples are used. Our last contribution, presented in chapter 4, shows the value of transfer learning in applications where only few training samples are available. The idea is to pre-train the filters of a convolutional network on a source task with a large dataset such as ImageNet, plug them into a new convolutional network with new dense layers, and then train the whole network on the target task. In collaboration with the cancer centre "Rouen Henri Becquerel Center", which provided the data, we built an automatic system based on this scheme for a medical application with little labeled data. The task is to localize the third lumbar vertebra in a 3D CT scan. The system includes a pre-processing step that produces a 2D representation of the 3D CT scan and a post-processing step that refines the decision. Transfer learning combined with this pre- and post-processing gave good results, allowing the model to be used in clinical routine.
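The class-wise representation prior from the second contribution can be illustrated with a small sketch. The abstract says only that same-class samples should share an internal representation and that this is expressed as a penalty added to the training cost; the squared Euclidean distance, the averaging over pairs, the `weight` parameter and the function names below are all assumptions made for illustration.

```python
import math
from itertools import combinations


def same_class_penalty(hidden, labels):
    """Mean squared Euclidean distance between hidden vectors of same-class pairs.

    The choice of distance is an assumption; the abstract does not specify it.
    """
    total, pairs = 0.0, 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            total += sum((a - b) ** 2 for a, b in zip(hidden[i], hidden[j]))
            pairs += 1
    return total / pairs if pairs else 0.0


def regularized_loss(task_loss, hidden, labels, weight=0.1):
    """Task loss plus the weighted same-class penalty (weight is hypothetical)."""
    return task_loss + weight * same_class_penalty(hidden, labels)


# Hypothetical hidden-layer activations for four samples from two classes.
hidden = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [5.0, 7.0]]
labels = [0, 0, 1, 1]

print("penalty:", same_class_penalty(hidden, labels))
print("total loss:", regularized_loss(1.0, hidden, labels))
```

In a real training loop the penalty would be computed on mini-batch activations inside an automatic differentiation framework so that its gradient reaches the hidden layers; the plain-Python version only shows what quantity is being minimized.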
