  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

L’arbre de régression multivariable et les modèles linéaires généralisés revisités : applications à l’étude de la diversité bêta et à l’estimation de la biomasse d’arbres tropicaux

Ouellette, Marie-Hélène 04 1900
In ecology, for example in ecosystem services studies, descriptive, explanatory and predictive modelling each have their place. Specific circumstances call for one type of modelling or another; the method must be chosen properly to ensure that the final model fits the study's goal.
In this thesis, we first explore the explanatory power of the multivariate regression tree (MRT). This modelling technique is based on a recursive bipartitioning algorithm. The tree is fully grown by successive bipartitions and then pruned by resampling in order to reveal the tree providing the best predictions. This asymmetric analysis of two tables produces groups that are homogeneous in terms of the response and are delimited by split points in the values of some of the most important explanatory variables. We show that, to calculate the explanatory power of an MRT, an appropriate adjusted coefficient of determination must include an estimate of the degrees of freedom of the MRT model, obtained through an algorithm. This estimate of the population coefficient of determination is practically unbiased. Since MRT is based upon discontinuity premises whereas canonical redundancy analysis (RDA) models continuous linear gradients, comparing their explanatory powers makes it possible to distinguish between those two patterns of species distributions along the explanatory variables. The extensive use of RDA for the study of beta diversity motivated the comparison between its explanatory power and that of MRT. Again from an explanatory perspective, we define a new procedure called a cascade of multivariate regression trees (CMRT). This procedure makes it possible to compute an MRT model in which an order is imposed on nested explanatory hypotheses. CMRT provides a framework to study the exclusive effect of a main and a subordinate set of explanatory variables by calculating their explanatory powers. The final model is interpreted as in nested MANOVA. New information may arise from this analysis about the relationship between the response and the explanatory variables, for example interaction effects between the two explanatory data sets that were not evidenced by the usual MRT model.
We also study the predictive power of generalized linear models (GLM) for predicting individual tropical tree biomass as a function of allometric shape variables. In particular, we examine the capacity of Gaussian and gamma error structures to provide the most precise predictions. We show that, for one particular species, the gamma error structure is superior in terms of predictive power. This study is part of a practical framework; it is meant to be used as a tool by managers who need to estimate precisely the amount of carbon captured by tropical tree plantations. Our conclusions could be integrated within a program of carbon emission reduction through land use changes.
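The adjusted coefficient of determination mentioned in the abstract can be illustrated with a minimal sketch. It assumes the conventional adjustment formula, 1 - (1 - R²)(n - 1)/(n - df - 1), and takes the model degrees of freedom as an input; the thesis estimates that quantity with its own algorithm, which is not reproduced here. The function name `adjusted_r2` and its parameters are hypothetical.

```python
def adjusted_r2(r2: float, n: int, df_model: float) -> float:
    """Conventional adjusted R^2 for n observations and df_model model
    degrees of freedom (assumption: for an MRT, df_model would come from
    the thesis's estimation algorithm, which is not shown here)."""
    residual_df = n - df_model - 1
    if residual_df <= 0:
        raise ValueError("need n > df_model + 1")
    # Penalize the unexplained fraction by the ratio of total to residual df.
    return 1.0 - (1.0 - r2) * (n - 1) / residual_df


if __name__ == "__main__":
    # Example: raw R^2 of 0.5, 21 observations, 4 model degrees of freedom.
    print(adjusted_r2(0.5, 21, 4))
```

Because the penalty grows with the model degrees of freedom, the same raw R² yields a lower adjusted value for a larger tree, which is what makes the comparison with RDA meaningful.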
42

[en] SUSTAINABILITY IMPACT ON MANUFACTURING OPERATIONAL PERFORMANCE: AN EMPIRICAL INVESTIGATION / [pt] IMPACTO DA SUSTENTABILIDADE NO DESEMPENHO OPERACIONAL DA MANUFATURA: UMA INVESTIGAÇÃO EMPÍRICA

RENATA BIANCHINI MAGON 09 October 2017
[pt] With the emergence of a new economic order, companies around the world have realized that they need to be committed to sustainability. External pressures range from government, through the creation of social and environmental regulations, to employees and society (media, NGOs and customers), who are increasingly connected, attentive and demanding on these issues. Sustainable companies must satisfy the needs of the present (generating profit) without compromising the future (respecting the environment and the principles of social responsibility). The manufacturing industry, the focus of this dissertation, has much to contribute to sustainability, because it has a significant social, economic and environmental impact on the places where it operates. The generation of greenhouse gases and of toxic waste is among the main culprits, though not the only ones. Internally, companies need to absorb the concept of sustainability into their production process through environmental management practices related, for example, to optimizing the use of environmental resources (e.g. water reuse and the use of alternative energy sources), reducing polluting gases and finding alternatives for waste disposal, as well as through social management practices such as measures to improve health and safety in the workplace and programs linked to employee well-being. These actions, however, must be extended to the entire process chain, and collaborative measures must be adopted with suppliers so that they too are committed and responsible. For a company to become sustainable, however, additional investments and increased costs are needed to include in its structure the staff and processes responsible for improving sustainability, whether economic, social or environmental: the so-called triple bottom line.
/ [en] Companies worldwide have realized that being committed to sustainability is becoming a source of competitive advantage. Empirical evidence exists in the literature validating a positive link between sustainable manufacturing practices and organizational performance. However, there is a lack of rigorous empirical studies directly examining the impact of both environmental and social practices on operational manufacturing performance, especially on four main competitive operational capabilities: cost, delivery, quality, and flexibility. This study analyses these relationships through a literature review and against the backdrop of the resource-based view and the natural resource-based view of the firm. For this purpose, structural equation modeling (SEM) is used to build the measurement model and hierarchical stepwise multiple regression is used to test the research hypotheses. The data were obtained from the sixth round of the International Manufacturing Strategy Survey (IMSS-VI), which includes responses from 931 manufacturing plants within the assembly industry in 22 countries. Our findings suggest that internal and external sustainability management practices are complementary. Manufacturing plants can increase their quality and flexibility performance by implementing internal sustainable practices, such as water and energy consumption reduction, environmental and social certifications, work/life balance policies and sustainability communication, and can increase their cost efficiency and delivery performance by promoting their suppliers' sustainability management. Overall, this study contributes to the investigation of strategies for sustainable management, highlighting important implications for both practice and future research.
43

Measuring long-term effects of a school improvement initiative

Svärdh, Joakim January 2013
There is a growing demand for studies applying quantitative methods to large-scale data sets for the purpose of evaluating the effects of educational reforms (UVK, 2010). In this thesis the statistical method Propensity Score Analysis (PSA) is presented and explored in the context of evaluating an extensive educational initiative within science and technology education: the Science and Technology for All program (NTA). The research question put forward reads: under what conditions is PSA a useful method for measuring the effects of a school improvement initiative in S&T? The study considers the use of PSA when looking for long-term effects that could be measured, what must be taken into consideration to be able to measure them, and how this could be done. The baseline references (outcome variables) used to measure and evaluate the long-term effects of the studied program are students' achievements in the national test (score and grades) and their grades in year 9. Some findings regarding the object of study (long-term effects of using NTA) are also presented. The PSA method is found to be a useful tool that makes it possible to create artificial control groups when experimental studies are impossible or inappropriate, which is often the case in school education research. The method makes it possible to use the rich source of registry data gathered by authorities. PSA proves reliable and relatively insensitive to the effects of covariates and to heterogeneous effects if the sample is large enough. The use of PSA (or other statistical methods) also makes it possible to measure outcomes several years after treatment. There are issues of concern when using PSA. One is the obvious demand for organized collection of measurement data. Another is the choice of outcome variables.
In this study the chosen outcome variables (pupils' scores and grades in national tests, and grades in year 9) invite discussion of aspects that might not be reflected or measured in national tests and/or teachers' grading. Findings regarding the long-term effects of using NTA show significantly positive effects in physics on test scores (average increase 16.5%) and test grades, but not in biology and chemistry. No significant effects are found for course grades. The PSA approach has proved to be a reliable method. It is, however, limited in its ability to capture more subtle aspects of learning. A combination of quantitative and qualitative approaches is therefore suggested when studying the long-term effects of educational interventions.
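The idea of building an artificial control group from propensity scores can be sketched as follows. This is only an illustration under stated assumptions: the scores are taken as already estimated, matching is 1:1 nearest-neighbour with replacement inside a caliper, and the names `match_nearest`, `att` and `caliper` are hypothetical; the abstract does not say which matching scheme the thesis used.

```python
def match_nearest(treated, control, caliper=0.05):
    """Pair each treated unit with the control unit whose propensity
    score is closest, dropping pairs further apart than the caliper.
    Both inputs are lists of (propensity_score, outcome) tuples."""
    pairs = []
    for ps_t, y_t in treated:
        # Nearest control unit on the propensity score (with replacement).
        ps_c, y_c = min(control, key=lambda unit: abs(unit[0] - ps_t))
        if abs(ps_c - ps_t) <= caliper:
            pairs.append((y_t, y_c))
    return pairs


def att(pairs):
    """Average treatment effect on the treated over the matched pairs."""
    if not pairs:
        raise ValueError("no matched pairs within the caliper")
    return sum(y_t - y_c for y_t, y_c in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up illustrative numbers, not data from the thesis.
    treated = [(0.30, 12.0), (0.70, 15.0), (0.95, 20.0)]
    control = [(0.31, 10.0), (0.68, 14.0), (0.50, 11.0)]
    pairs = match_nearest(treated, control)
    print(len(pairs), att(pairs))
```

The treated unit with score 0.95 has no comparable control unit and is left unmatched, which shows why a large sample matters for this kind of analysis.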
44

Assessment of foliar nitrogen as an indicator of vegetation stress using remote sensing : the case study of Waterberg region, Limpopo Province

Manyashi, Enoch Khomotso 06 1900
Vegetation status is a key indicator of the ecosystem condition in a particular area. The objective of the study was to estimate leaf nitrogen (N) as an indicator of vegetation water stress using vegetation indices, especially those based on the red edge, and to determine how leaf N concentration is influenced by various environmental factors. Leaf nitrogen was estimated using univariate regression and the multivariate regression techniques of stepwise multiple linear regression (SMLR) and random forest. The effects of environmental parameters on leaf nitrogen distribution were tested through univariate regression and analysis of variance (ANOVA). Vegetation indices derived from analytical spectral device (ASD) data, resampled to RapidEye bands, were evaluated. Multivariate models were also developed to predict leaf N. The best model was chosen based on the lowest root mean square error (RMSE) and the highest coefficient of determination (R²). Univariate results showed that the red edge based MERIS Terrestrial Chlorophyll Index (MTCI) yielded higher leaf N estimation accuracy than the other vegetation indices. For SMLR, the simple ratio (SR) based on the red and near-infrared bands was the best vegetation index for leaf N estimation when the red edge band was excluded, and the simple ratio SR3 was the best when the red edge band was included. The random forest model achieved the highest leaf N estimation accuracy; when all bands, including the red edge band, were considered, its best vegetation index was the Red Green Index (RGI1). When the red edge band was excluded, the best vegetation index for random forest was the Difference Vegetation Index (DVI1). The univariate and multivariate results indicated that including the red edge band provides an opportunity to estimate leaf N accurately.
Analysis of variance results showed that vegetation and soil types have a significant effect on leaf N distribution (p-values < 0.05). Red edge based indices provide an opportunity to assess vegetation health using remote sensing techniques. / Environmental Sciences / M. Sc. (Environmental Management)
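As a rough illustration of the kind of band arithmetic behind these indices, the sketch below uses the commonly cited textbook definitions of the simple ratio, the difference vegetation index and MTCI, applied to red, red edge and near-infrared reflectances. This is an assumption: the exact band combinations behind the thesis's SR3, RGI1 and DVI1 variants are not given in the abstract, and the function names are hypothetical.

```python
def simple_ratio(nir: float, red: float) -> float:
    """Simple ratio: near-infrared reflectance divided by red reflectance."""
    return nir / red


def dvi(nir: float, red: float) -> float:
    """Difference vegetation index: near-infrared minus red reflectance."""
    return nir - red


def mtci(nir: float, red_edge: float, red: float) -> float:
    """MERIS Terrestrial Chlorophyll Index in its usual three-band form,
    (NIR - red edge) / (red edge - red). Mapping it onto RapidEye-like
    bands is an assumption made for this sketch."""
    return (nir - red_edge) / (red_edge - red)


if __name__ == "__main__":
    # Made-up reflectances for a single pixel (unitless, 0..1).
    nir, red_edge, red = 0.5, 0.25, 0.125
    print(simple_ratio(nir, red), dvi(nir, red), mtci(nir, red_edge, red))
```

Only MTCI uses the red edge band here, which is the band the abstract singles out as improving leaf N estimation.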
45

Análise comparativa do impacto das variáveis atitudinais e do comportamento do consumidor nas vendas físicas de uma loja Pet Shop / Comparative analysis of the impact of attitudinal variables and consumer behavior on physical sales of a Pet Shop

Sakai, Maryanne Akemi 19 March 2018
Companies hold a wide range of information about their customers, such as registration and recent purchase data. For a service provider operating in a fragmented market such as that of pet shops, knowing its customers and the variables that most affect their purchases is of paramount importance. The study examines whether additional gains can be obtained when the purchasing model is improved with consumer attitudinal variables. RFM analysis (recency, frequency and monetary value) allowed customers to be grouped according to patterns of recency, frequency and transaction value. The methodology was complemented with semi-structured interviews, a survey with 140 respondents, and multivariate linear regression models. The object of the study was a pet shop located in São Paulo. Five regression models were run to verify the incremental gains from incorporating attitudinal variables.
The model with the highest R² (68.13%) took the mean transaction value for 2017 as the response variable and, as explanatory variables, the quintiles of recency, frequency and value, together with incomplete higher education, whether the pet is considered a child or a family member, the number of other pets, and the cluster of customers who need to be reminded. The results show that the family's perception of the pet plays a preponderant role in the decision to purchase services and products, so that variables such as the distance from home to the pet shop or household income become secondary when determining the attitudinal variables that most influence the purchase decision. Understanding each customer and how they relate to their pet makes it possible to increase their transaction value, because the customer seeks above all the well-being of the pet.
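The quintile-based RFM grouping described above could look roughly like the following sketch. It assumes a simple rank-based split into five equal-sized groups scored 1 to 5, with 5 as the most favourable; the abstract does not specify how the quintiles were computed, and the function name `quintile_scores` is hypothetical.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value from 1 (least favourable fifth) to 5 (most
    favourable fifth) by rank. For recency measured in days since the
    last purchase, pass higher_is_better=False."""
    n = len(values)
    # Indices ordered from least favourable to most favourable.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = 1 + (rank * 5) // n
    return scores


if __name__ == "__main__":
    # Made-up customers, not data from the thesis.
    recency_days = [5, 40, 200, 12, 90]
    purchases = [10, 2, 1, 7, 3]
    print(quintile_scores(recency_days, higher_is_better=False))
    print(quintile_scores(purchases))
```

The resulting scores are ordinal variables of the sort that could enter a regression as explanatory variables alongside the attitudinal ones.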
46

Análise dos indicadores de qualidade versus taxa de abandono utilizando método de regressão múltipla para serviço de banda larga

Fernandes Neto, André Pedro 20 June 2008
Telecommunications is one of the most dynamic and strategic areas in the world. Many technological innovations have modified the way information is exchanged. Information and knowledge are now shared in networks. Broadband Internet is the new way of sharing content and information. This dissertation deals with performance indicators related to maintenance services of telecommunications networks and uses multivariate regression models to estimate churn, which is the loss of customers to other companies. In a competitive environment, telecommunications companies have devised strategies to minimize the loss of customers. Losing customers presents a higher cost than obtaining new ones. Corporations have plenty of data stored in a diversity of databases. Usually the data are not explored properly. This work uses Knowledge Discovery in Databases (KDD) to establish rules and new models to explain how churn, as a dependent variable, is related to a diversity of service indicators, such as time to deploy the service (in hours), time to repair (in hours), and so on. Extraction of meaningful knowledge is, in many cases, a challenge. Models were tested and statistically analyzed. The work also shows results that allow the analysis and identification of which service quality indicators influence churn. Actions are also proposed to solve, at least in part, this problem. / Telecommunications is one of the most strategic and dynamic areas in today's world, owing to the many technological innovations that have affected the way information travels.
Knowledge is no longer perceived as a linear, logical and chronological accumulation of information and has come to be seen as a networked construction; the mass adoption of high-speed broadband Internet has consequently had a great influence on this phenomenon. This dissertation presents a study of performance measurement and maintenance services in telecommunications, using tools for knowledge discovery in databases (KDD). The aim is to transform information stored in the databases of a large telecommunications company in the country into useful knowledge. The research methodology focused on multiple regression analysis as a tool to estimate the customer churn rate in broadband Internet services, as the dependent variable, with service quality indicators as independent variables. Models were tested and statistically analyzed. The work presents results that make it possible to analyze and identify which quality indicators have the greatest influence on the customer churn rate. Suggestions are proposed that could be applied to improve perceived service quality and consequently reduce the losses associated with churn.
48

Essays on multivariate generalized Birnbaum-Saunders methods

MARCHANT FUENTES, Carolina Ivonne 31 October 2016
Submitted by Rafael Santana (rafael.silvasantana@ufpe.br) on 2017-04-26T17:07:37Z No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5) Carolina Marchant.pdf: 5792192 bytes, checksum: adbd82c79b286d2fe2470b7955e6a9ed (MD5) / Made available in DSpace on 2017-04-26T17:07:38Z (GMT). No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5) Carolina Marchant.pdf: 5792192 bytes, checksum: adbd82c79b286d2fe2470b7955e6a9ed (MD5) Previous issue date: 2016-10-31 / CAPES; BOLSA DO CHILE. / In the last decades, univariate Birnbaum-Saunders models have received considerable attention in the literature. These models have been widely studied and applied to fatigue, but they have also been applied to other areas of the knowledge. In such areas, it is often necessary to model several variables simultaneously. If these variables are correlated, individual analyses for each variable can lead to erroneous results. Multivariate regression models are a useful tool of the multivariate analysis, which takes into account the correlation between variables. In addition, diagnostic analysis is an important aspect to be considered in the statistical modeling. Furthermore, multivariate quality control charts are powerful and simple visual tools to determine whether a multivariate process is in control or out of control. A multivariate control chart shows how several variables jointly affect a process. First, we propose, derive and characterize multivariate generalized logarithmic Birnbaum-Saunders distributions. Also, we propose new multivariate generalized Birnbaum-Saunders regression models. We use the method of maximum likelihood estimation to estimate their parameters through the expectation-maximization algorithm. We carry out a simulation study to evaluate the performance of the corresponding estimators based on the Monte Carlo method. 
We validate the proposed models with a regression analysis of real-world multivariate fatigue data. Second, we conduct a diagnostic analysis for multivariate generalized Birnbaum-Saunders regression models. We consider the Mahalanobis distance as a global influence measure to detect multivariate outliers and use it to evaluate the adequacy of the distributional assumption. Moreover, we consider the local influence method and study how a perturbation may affect the estimation of model parameters. We implement the obtained results in the R software and illustrate them with real-world multivariate biomaterials data. Third and finally, we develop a robust methodology based on multivariate quality control charts for generalized Birnbaum-Saunders distributions with the Hotelling statistic. We use the parametric bootstrap method to obtain the distribution of this statistic. A Monte Carlo simulation study is conducted to evaluate the proposed methodology, which reports on its performance in providing earlier alerts of out-of-control conditions. An illustration with real-world air quality data from Santiago, Chile is provided. This illustration shows that the proposed methodology can be useful for alerting episodes of extreme air pollution. / In recent decades, the univariate Birnbaum-Saunders model has received considerable attention in the literature. This model has been widely studied and was initially applied to modeling material fatigue. Over the years, works with applications in other areas of knowledge have appeared. In many applications it is necessary to model several variables simultaneously, incorporating the correlation among them. Multivariate regression models are a useful tool of multivariate analysis that takes into account the correlation among the response variables. Diagnostic analysis is an important aspect to consider in a statistical model; it checks the assumptions adopted as well as its sensitivity. 
Furthermore, multivariate quality control charts are efficient and simple visual tools for determining whether or not a multivariate process is out of control. Such a chart shows how several variables jointly affect a process. First, we propose, derive and characterize multivariate logarithmic generalized Birnbaum-Saunders distributions. Next, we propose a multivariate generalized Birnbaum-Saunders regression model. Methods for estimating the model parameters, such as the maximum likelihood method based on the EM algorithm, were developed. Monte Carlo simulation studies were carried out to evaluate the performance of the proposed estimators. Second, we conduct a diagnostic analysis for multivariate generalized Birnbaum-Saunders regression models. We consider the Mahalanobis distance as a global influence measure for detecting multivariate outliers, using it to assess the adequacy of the model. In addition, we develop diagnostic measures based on local influence under some perturbation schemes. We implement the presented methodology in the R software and illustrate it with real multivariate biomaterials data. Third and finally, we develop a robust methodology based on multivariate quality control charts for the generalized Birnbaum-Saunders distribution using the Hotelling statistic. Based on the parametric bootstrap method, we find approximations to the distribution of this statistic and obtain control limits for the proposed chart. We conduct a Monte Carlo simulation study to evaluate the proposed methodology, indicating its good performance in providing early alerts of out-of-control processes. An illustration with real air quality data from Santiago, Chile is provided. 
This illustration shows that the proposed methodology can be useful for alerting about episodes of extreme air pollution, avoiding adverse effects on human health.
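
The Mahalanobis-distance outlier check mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation (which is in R and built on generalized Birnbaum-Saunders distributions): the function names are hypothetical, the data are restricted to two variables so the covariance matrix can be inverted in closed form, and the cutoff assumes approximate bivariate normality, under which the squared distance follows a chi-square distribution with 2 degrees of freedom.

```python
import math


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the unbiased sample covariance matrix.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    distances = []
    for px, py in points:
        dx, dy = px - mx, py - my
        # d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers(points, level=0.975):
    """Indices whose squared distance exceeds the chi-square(2) quantile.

    Assumption: approximate bivariate normality. For 2 degrees of freedom
    the chi-square quantile has the closed form -2 * ln(1 - level).
    """
    cutoff = -2.0 * math.log(1.0 - level)
    return [i for i, d in enumerate(mahalanobis_sq_2d(points)) if d > cutoff]
```

Because the mean and covariance are themselves estimated from data that include the outliers, a robust estimator would usually be preferred in practice; the plain sample estimates are used here only to keep the sketch short.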
49

GIS-integrated mathematical modeling of social phenomena at macro- and micro- levels—a multivariate geographically-weighted regression model for identifying locations vulnerable to hosting terrorist safe-houses: France as case study

Eisman, Elyktra 13 November 2015 (has links)
Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat; it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publicly available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex systems model that incorporates multiple quantitative, qualitative and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically-weighted regression analysis produced the most accurate results because it accommodates non-stationary coefficient behavior, demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism. 
This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.
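
To illustrate what "non-stationary coefficient behavior" means in geographically-weighted regression, the following is a minimal sketch, not the dissertation's model. It assumes a single predictor, a Gaussian distance kernel and a fixed bandwidth; the function name `gwr_local_fit` and all parameters are hypothetical. The idea is that a separate weighted least-squares fit is computed at each target location, so the estimated coefficient is allowed to change across space.

```python
import math


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least-squares fit of y = a + b*x centred on `target`.

    coords: list of (easting, northing) tuples, one per observation.
    Returns (intercept, slope) for the local model at `target`.
    """
    # Gaussian kernel: observations near the target get weights close to 1,
    # distant observations get weights close to 0.
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    if sxx == 0:
        raise ValueError("predictor has no weighted variation near the target")
    slope = sxy / sxx
    return ybar - slope * xbar, slope
```

Calling the function at several target locations and comparing the returned slopes shows whether the relationship between predictor and response is stable across the study area; a global regression would report only one slope for the whole region.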
50

Neural networks regularization through representation learning / Régularisation des réseaux de neurones via l'apprentissage des représentations

Belharbi, Soufiane 06 July 2018 (has links)
Neural network models, and deep models in particular, are today among the state-of-the-art models in machine learning and its applications. Recent deep neural networks have many hidden layers, which significantly increases the total number of parameters. Training such models therefore requires a large number of labeled examples, which are not always available in practice. Overfitting is one of the fundamental problems of neural networks; it occurs when the model memorizes the training data, leading to difficulty generalizing to new data. The problem of overfitting in neural networks is the main theme addressed in this thesis. In the literature, several solutions have been proposed to remedy this problem, such as data augmentation, early stopping, or techniques more specific to neural networks such as dropout or batch normalization. In this thesis, we address the overfitting of deep neural networks from the angle of representation learning, considering learning with little data. To achieve this objective, we proposed three different contributions. The first contribution, presented in chapter 2, concerns structured output problems in which the output variables are high-dimensional and are generally linked by structural relations. Our proposal aims to exploit these structural relations by learning them in an unsupervised way with autoencoders. We validated our approach on a multiple regression problem applied to the detection of landmarks in face images. Our approach showed faster training of the networks and an improvement in their generalization. 
The second contribution, presented in chapter 3, exploits prior knowledge about the representations inside the hidden layers in the context of a classification task. This prior is based on the simple idea that examples of the same class should have the same internal representation. We formalized this prior as a penalty that we added to the loss function. Empirical experiments on the MNIST dataset and its variants showed improvements in the generalization of neural networks, particularly when little training data is used. Our third and last contribution, presented in chapter 4, shows the value of transfer learning in applications in which little training data is available. The main idea consists in pre-training the filters of a convolutional network on a source task with a large dataset (ImageNet, for example), and then inserting them into a new network for the target task. As part of a collaboration with the cancer center "Henri Becquerel de Rouen", we built an automatic system based on this type of transfer learning for a medical application in which only a small labeled dataset is available. In this application, the task consists in localizing the third lumbar vertebra in a CT scan. The use of transfer learning, together with suitable pre-processing and post-processing, yielded good results, allowing the model to be deployed in routine clinical practice. / Neural network models, and deep models in particular, are among the leading state-of-the-art models in machine learning. They have been applied in many different domains. The most successful deep neural models are those with many layers, which greatly increases their number of parameters. 
Training such models requires a large number of training samples, which are not always available. One of the fundamental issues in neural networks is overfitting, which is the issue tackled in this thesis. This problem often occurs when large models are trained using few training samples. Many approaches have been proposed to prevent the network from overfitting and to improve its generalization performance, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout, batch normalization, etc. In this thesis, we tackle the neural network overfitting issue from a representation learning perspective by considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first one, presented in chapter 2, is dedicated to structured output problems, performing multivariate regression when the output variable y contains structural dependencies between its components. Our proposal aims mainly at exploiting these dependencies by learning them in an unsupervised way. Validated on a facial landmark detection problem, learning the structure of the output data is shown to improve the network's generalization and speed up its training. The second contribution, described in chapter 3, deals with the classification task, where we propose to exploit prior knowledge about the internal representation of the hidden layers in neural networks. This prior is based on the idea that samples within the same class should have the same internal representation. We formulate this prior as a penalty that we add to the training cost to be minimized. Empirical experiments on MNIST and its variants showed an improvement in network generalization when using only a few training samples. Our last contribution, presented in chapter 4, shows the value of transfer learning in applications where only a few samples are available. 
The idea consists in reusing the filters of pre-trained convolutional networks that have been trained on large datasets such as ImageNet. Such pre-trained filters are plugged into a new convolutional network with new dense layers. Then, the whole network is trained on a new task. In this contribution, we provide an automatic system based on this learning scheme with an application to the medical domain. In this application, the task consists in localizing the third lumbar vertebra in a 3D CT scan. The proposed system includes a pre-processing of the 3D CT scan to obtain a 2D representation and a post-processing to refine the decision. This work was done in collaboration with the clinic "Rouen Henri Becquerel Center", who provided us with data
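
The second contribution described above, a penalty encouraging samples of the same class to share an internal representation, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact form of the penalty, so the mean squared Euclidean distance over same-class pairs is assumed here, and the names `same_class_penalty`, `regularized_loss` and the `weight` parameter are hypothetical.

```python
def same_class_penalty(hidden, labels):
    """Mean squared Euclidean distance between hidden vectors of same-class pairs.

    hidden: list of hidden-layer activations (one list of floats per sample).
    labels: list of class labels, aligned with `hidden`.
    """
    total, pairs = 0.0, 0
    for i in range(len(hidden)):
        for j in range(i + 1, len(hidden)):
            if labels[i] == labels[j]:
                total += sum((a - b) ** 2 for a, b in zip(hidden[i], hidden[j]))
                pairs += 1
    # With no same-class pair in the batch there is nothing to penalize.
    return total / pairs if pairs else 0.0


def regularized_loss(task_loss, hidden, labels, weight=0.1):
    """Training cost = supervised loss + weighted representation penalty."""
    return task_loss + weight * same_class_penalty(hidden, labels)
```

In an actual training loop the penalty would be computed on a mini-batch inside an automatic differentiation framework so that its gradient flows back through the hidden layers; plain Python lists are used here only to keep the arithmetic visible.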
