  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
331

Estudos hidroquímicos da água produzida de um determinado campo de petróleo da bacia potiguar / Hydrochemical studies of the produced water from an oil field in the Potiguar Basin

Sena, Shirley Feitosa Machado 28 February 2011
Made available in DSpace on 2014-12-17T14:08:46Z (GMT). No. of bitstreams: 1 ShirleyFMS_DISSERT_parcial.pdf: 156702 bytes, checksum: 946fe2ea3a673657c92a1d9cf7dbf34a (MD5) Previous issue date: 2011-02-28 / Petróleo Brasileiro SA - PETROBRAS / Among the wastes generated during oil exploration and production, water stands out because of several factors, including the volume generated, the salt content, and the presence of oil and chemicals; this water associated with oil is called produced water. Its chemical composition is complex and depends strongly on the source field, because the water has been in contact with the geological formation for thousands of years. This work aims to characterize hydrochemically the water produced in different zones of a field located in the Potiguar Basin. Twenty-seven samples were collected from six producing zones (400, 600, 400/600, 400/450/500, 350/400, A) of the field designated S, and 50 parameters were measured, divided among physicochemical parameters, cations and anions. The hydrochemical characterization used ionic ratio calculations, hydrochemical classification diagrams (the Piper diagram and the Stiff diagram) and statistics, which helped identify signature patterns for each production zone, including the zone that supplies the water injected into this field for secondary oil recovery. The ionic balance error was calculated to assess the quality of the analytical results, which was considered good because 89% of the samples had an error below 5%. The hydrochemical diagrams classified the waters as sodium chloride type, with the exception of the samples from zone A, from the injection well, which were classified as sodium bicarbonate type. Through descriptive analysis and discriminant analysis it was possible to obtain a function that chemically distinguishes the production zones; this function had a good classification hit rate of 85%.
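The ionic balance check mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used charge balance definition, 100 × (Σcations − Σanions) / (Σcations + Σanions) with concentrations in meq/L; the thesis may define the error differently. The ion list, the function names and the sample concentrations are hypothetical and are not data from the thesis.

```python
# Illustrative ionic (charge) balance check for a water analysis.
# All concentrations below are hypothetical; they are not data from the thesis.

# Molar mass (g/mol) and absolute charge for a few major ions.
IONS = {
    "Na+": (22.990, 1),
    "K+": (39.098, 1),
    "Ca2+": (40.078, 2),
    "Mg2+": (24.305, 2),
    "Cl-": (35.453, 1),
    "HCO3-": (61.017, 1),
    "SO4 2-": (96.06, 2),
}
CATIONS = ("Na+", "K+", "Ca2+", "Mg2+")
ANIONS = ("Cl-", "HCO3-", "SO4 2-")


def to_meq_per_l(ion, mg_per_l):
    """Convert a concentration from mg/L to meq/L."""
    molar_mass, charge = IONS[ion]
    return mg_per_l * charge / molar_mass


def ionic_balance_error(sample_mg_per_l):
    """Return the ionic balance error in percent for one sample."""
    cations = sum(to_meq_per_l(i, sample_mg_per_l.get(i, 0.0)) for i in CATIONS)
    anions = sum(to_meq_per_l(i, sample_mg_per_l.get(i, 0.0)) for i in ANIONS)
    return 100.0 * (cations - anions) / (cations + anions)


# Hypothetical sodium chloride type water (mg/L).
sample = {"Na+": 2300.0, "K+": 40.0, "Ca2+": 200.0, "Mg2+": 50.0,
          "Cl-": 3900.0, "HCO3-": 300.0, "SO4 2-": 100.0}
error = ionic_balance_error(sample)
print(f"ionic balance error: {error:.2f}%")
print("acceptable" if abs(error) < 5.0 else "re-analyze")
```

The 5% threshold in the last line mirrors the acceptance limit quoted in the abstract; applying it to the absolute value of the error is an assumption of this sketch.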
332

Metodologias estatísticas aplicadas a dados de análises químicas da água produzida em um campo maduro de petróleo / Statistical methodologies applied to chemical analysis data for the produced water of a mature oil field

Mendes, Débora Cosme Pereira 30 March 2012
Made available in DSpace on 2014-12-17T14:08:51Z (GMT). No. of bitstreams: 1 DeboraCMP_DISSERT_Pre-textuais.pdf: 440461 bytes, checksum: de18053fd5097e8172e698a19f80a5e7 (MD5) Previous issue date: 2012-03-30 / Produced water is one of the most common wastes generated during oil exploration and production. This work aims to develop methodologies based on statistical comparison of hydrogeochemical analyses of production zones, in order to minimize high-cost interventions for performing fluid identification tests (TIF). For the study, 27 samples were collected from five different production zones, and a total of 50 chemical species were measured. After the chemical analyses, statistical treatment was applied to the data using the R statistical software, version 2.11.1. The statistical treatment was carried out in three stages. In the first stage, the objective was to examine the behavior of the chemical species under study in each production zone through descriptive graphical analysis. The second stage consisted of identifying a function that classifies the production zone of each sample, using discriminant analysis. In the training stage, the correct classification rate of the discriminant function was 85.19%. The next stage used Principal Component Analysis to try, by reducing the number of variables through linear combinations of the chemical species, to improve the discriminant function obtained in the second stage and increase the discriminating power of the data, but the result was not satisfactory. In the Profile Analysis, curves were obtained for each production zone, based on the characteristics of the chemical species present in each zone. With this study it was possible to develop a methodology, using hydrochemistry and statistical analysis, that can be used to distinguish the water produced in mature oil fields, so that the production zone contributing to the excessive rise in water volume can be identified.
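As a quick arithmetic check on the reported training hit rate, the sketch below searches for the whole-number counts of correctly classified samples that round to a given percentage. It assumes the rate was computed over all 27 samples, which the abstract does not state explicitly; the function name is hypothetical.

```python
# Which whole-number counts of correct classifications out of 27 samples
# round to the reported hit rates? (Assumes the rate was computed on 27 samples.)

def counts_matching(rate_percent, n_samples, decimals):
    """Return counts k such that round(100 * k / n_samples, decimals) equals rate_percent."""
    return [k for k in range(n_samples + 1)
            if round(100.0 * k / n_samples, decimals) == rate_percent]


print(counts_matching(85.19, 27, 2))  # two-decimal rate from this abstract
print(counts_matching(85.0, 27, 0))   # the same rate rounded to a whole percent
```

Under that assumption, both searches point to 23 of the 27 samples being classified correctly.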
333

Estudo de expansões assintóticas, avaliação numérica de momentos das distribuições beta generalizadas, aplicações em modelos de regressão e análise discriminante / Study of asymptotic expansions, numerical evaluation of moments of generalized beta distributions, and applications in regression models and discriminant analysis

BRITO, Rejane dos Santos 20 March 2009
Submitted by (ana.araujo@ufrpe.br) on 2016-08-10T13:00:13Z No. of bitstreams: 1 Rejane dos Santos Brito.pdf: 1642561 bytes, checksum: 084711a62c79f703133a032643c8d19f (MD5) / Made available in DSpace on 2016-08-10T13:00:13Z (GMT). No. of bitstreams: 1 Rejane dos Santos Brito.pdf: 1642561 bytes, checksum: 084711a62c79f703133a032643c8d19f (MD5) Previous issue date: 2009-03-20 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / We first review the Daniels, Edgeworth, Lugannani-Rice and Cordeiro-Ferrari asymptotic expansions. Using the Cordeiro-Ferrari expansion, we study the approximation of the gamma distribution G(m;f ) by the exponential distribution with mean a. In a further application, we approximate the Student's t distribution with n degrees of freedom by the standard normal distribution. We also present a study of the functionalities of the generalized beta distributions and obtain their moments using the Lauricella functions and the generalized Kampé de Fériet function. In addition, we propose a generalization of the power distribution as a new generalized beta distribution, called beta power. Finally, we present some applications in regression models, through logistic regression, and in discriminant analysis models.
334

Análise de insolvência empresarial : uma abordagem financeira fundamentalista com aplicação do método estatístico multivariado e da técnica discriminante / Corporate insolvency analysis: a fundamentalist financial approach applying the multivariate statistical method and the discriminant technique

Mateus, Regis Santos 24 May 2010
Corporate insolvency is a relevant topic for a broad and diverse set of economic agents and may result from a complex of factors internal and external to the company. In view of these factors, it is assumed that fundamental analysis plays an important role in addressing these aspects, whether they are microeconomic in character or arise in a macroeconomic context. In order to investigate the influence and behavior of these factors, identified from macroeconomic and sectoral variables and from company fundamentals, the multivariate statistical method and the technique of discriminant analysis are used. The main restrictive hypothesis concerns the relevance of including variables other than those normally used in corporate insolvency prediction models. The investigation is delimited as follows: in spatial terms and in terms of specificity, it covers large, publicly traded companies designated S.A. (Sociedade Anônima) operating in Brazil. The time frame is the year 2008 and encompasses the macroeconomic and microeconomic variables. The research design is an observational study combined with the multivariate statistical method and the statistical technique of discriminant analysis. Given the various studies on insolvency prediction that closely resemble this research, the relevance of the financial ratios representing the predictor variables normally used in the discriminant model and that of the variables distinctly included in this analysis are probably relatively similar, with the statistical significance of each of these variables being coherent and consistent in the analysis of the insolvency of Brazilian companies.
335

Expressão de grupos de genes como marcadores moleculares preditivos de resposta à quimioterapia neoadjuvante com doxorrubicina e ciclofosfamida em pacientes com câncer de mama / Expression of gene groups as predictive molecular markers response to neoadjuvant chemotherapy with doxorubicin and cyclophosphamide in breast cancer patients

Mateus de Camargo Barros Filho 16 June 2009
Patients with locally advanced breast cancer undergo neoadjuvant chemotherapy in an attempt to reduce tumor size and increase the rate of breast-conserving surgery. Our group previously identified, through cDNA microarray technology, gene trios, including BZRP, CLPTM1, MTSS1, NOTCH1, NUP210, PRSS11, RPL37A, SMYD2 and XLHSRF-1, whose expression was capable of predicting response to neoadjuvant chemotherapy with doxorubicin and cyclophosphamide in breast cancer patients. In the current study, we evaluated whether the expression of these genes reproducibly identifies responsive and non-responsive patients by real-time RT-PCR, a more accessible technique. We initially evaluated samples from 28 previously studied patients (technical validation group = 23 responsive and 5 non-responsive) and then a group of 14 new patients (biological validation group = 11 responsive and 3 non-responsive). Among the initially identified gene trios, expression of RPL37A + XLHSRF-1 + NOTCH1 and of RPL37A + XLHSRF-1 + NUP210 correctly classified 86% (24/28) of the samples from the technical validation group and 71% (10/14) of the samples from the biological validation group, through discriminant classification analysis. These trios therefore did not show the same precision as the cDNA microarray results. A new combinatorial analysis was performed in search of the best predictive model using real-time RT-PCR expression values. A new trio was identified, composed of the genes RPL37A, SMYD2 and MTSS1, whose expression correctly classified 93% of the samples from the technical validation group (22/23 responsive and 4/5 non-responsive) and 79% of the samples from the biological validation group (8/11 responsive and 3/3 non-responsive). The test therefore showed 88% sensitivity and specificity in detecting responsive patients across all samples analyzed. When the classification power of the same gene group was verified using the cDNA microarray expression values, a similar result was observed (91% sensitivity and specificity in recognizing responsive samples). We thus demonstrated that the gene expression profile obtained by cDNA microarray is reproducible by real-time RT-PCR. A study integrating a larger number of patients and a more comprehensive cDNA microarray platform may help identify a more accurate predictive model, based on gene groups, to foresee response to doxorubicin-based chemotherapy.
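The percentages reported for the RPL37A + SMYD2 + MTSS1 trio can be recomputed from the per-group counts in the abstract. The sketch below treats "responsive" as the positive class and pools the technical and biological validation groups, which is an assumption about how the 88% figure was obtained; the variable and function names are hypothetical.

```python
# Recompute the reported rates for the RPL37A + SMYD2 + MTSS1 trio
# from the per-group counts given in the abstract.

# (correctly classified, total) for each validation group
responsive = {"technical": (22, 23), "biological": (8, 11)}
non_responsive = {"technical": (4, 5), "biological": (3, 3)}


def pooled_rate(groups):
    """Pool correct/total counts over groups and return the fraction correct."""
    correct = sum(c for c, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return correct / total


def group_accuracy(name):
    """Overall fraction correctly classified in one validation group."""
    c_r, t_r = responsive[name]
    c_n, t_n = non_responsive[name]
    return (c_r + c_n) / (t_r + t_n)


sensitivity = pooled_rate(responsive)      # responsive patients classified as responsive
specificity = pooled_rate(non_responsive)  # non-responsive patients classified as non-responsive

print(f"technical accuracy:  {group_accuracy('technical'):.3f}")
print(f"biological accuracy: {group_accuracy('biological'):.3f}")
print(f"sensitivity: {sensitivity:.3f}, specificity: {specificity:.3f}")
```

With these counts the pooled sensitivity is 30/34 and the pooled specificity is 7/8, both of which round to the 88% quoted in the abstract.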
336

Application of Artificial Intelligence (Artificial Neural Network) to Assess Credit Risk : A Predictive Model For Credit Card Scoring

Islam, Md. Samsul, Zhou, Lin, Li, Fei January 2009
Credit decisions are vital for any type of financial institution because defaulters can generate huge financial losses. Some banks rely on judgmental decisions, meaning that credit analysts go through every application separately; other banks use a credit scoring system or a combination of both. Credit scoring systems use many types of statistical models, but professionals have recently started looking for alternative algorithms that can provide better classification accuracy. A neural network can be a suitable alternative. The classification outcomes of this study show that the neural network gives slightly better results than discriminant analysis and logistic regression. It is not possible to draw a general conclusion that neural networks have better predictive ability than logistic regression and discriminant analysis, because this study covers only one dataset. Moreover, a “Bad Accepted” generates much higher costs than a “Good Rejected”, and the neural network produces fewer “Bad Accepted” cases than discriminant analysis and logistic regression, so it achieves a lower cost of misclassification for the dataset used in this study. Furthermore, in the final section of this study, an optimization algorithm (Genetic Algorithm) is proposed in order to obtain better classification accuracy through the configuration of the neural network architecture. On the other hand, the success of any predictive model largely depends on the predictor variables selected as model inputs. Some points need to be considered in predictor variable selection: for example, some specific variables are prohibited in some countries, the variables together should provide the highest predictive strength, and variables may be judged through statistical analysis. This study also covers these concepts of input variable selection.
337

Cadres méthodologiques pour la conception innovante d'un Plan énergétique durable à l'échelle d'une organisation : application d'une planification énergétique pour une économie compétitive bas carbone au Sonnenhof / Methodological frameworks for the innovative design of a sustainable energy plan at a organization scale : energy planning for moving to a competitive low carbon economy in Sonnenhof

Bach, Sébastien 27 September 2017
Companies, and organizations more generally, face climate and economic challenges while being required to comply with a legal framework and guidelines defined at larger scales (regional, national and international). An organization usually knows the goal or objective to be achieved; the means of achieving it, however, may require learning or even research. The aim of this thesis is to provide organizations with a methodology for the strategic management of projects related to their energy transition. Drawing on several states of the art, on energy planning and design in particular, we point out the methodological gap an organization faces: approaches and tools exist when a problem is clearly identified, but how can one or more problems be identified from a mere statement of goals or intentions? The first part proposes an energy planning approach at the scale of an organization that brings out, in a structured way, the problems the organization may face. Our approach relies on greenhouse gas emission assessments (BEGES) and energy/GHG management methods, complemented by design approaches and tools. The latter facilitate the consolidation of the information and data needed to formulate and structure the problems to be solved. At the end of this approach, some problems are formulated as contradictions and conflicts. The approach developed is purely qualitative and suited to group work with experts. However, some numerical data reflect system behaviors that are poorly understood by the project stakeholders. The second part proposes a method combining simulation and data analysis to identify the contradictions of objectives and of causes that can, or seem to, prevent the objectives from being achieved. These contradictions are formulated so that they can be handled with inventive problem-solving methods. The identification of objective contradictions relies on transforming the experimental or simulation responses of the systems studied into binary qualitative data and on identifying the Pareto optima of the data thus transformed. Cause contradictions concern the design factors or parameters that induce the conflicts between objectives. We propose to identify them using a binary discriminant analysis method based on supervised learning, combined with ANOVA. On a case study, we show, on the one hand, how to integrate this approach into the approach presented in part 1 of the thesis and, on the other hand, how to use it to obtain solution concepts in a multi-objective context (reduction of energy consumption, GHG emissions, cost, etc.).
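The idea of Pareto optima in a multi-objective setting can be illustrated with a generic non-dominated filter. This is not the procedure of the thesis, which works on responses transformed into binary qualitative data; the sketch uses raw objective values, assumes every objective is to be minimized, and the design alternatives and function names are hypothetical.

```python
# Minimal sketch: find the Pareto-optimal (non-dominated) rows of a table
# of objective values. All objectives are minimized; the data are hypothetical.

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical design alternatives: (energy use, GHG emissions, cost)
designs = [
    (120, 30, 50),
    (100, 35, 55),
    (130, 40, 60),   # dominated by the first design
    (100, 35, 70),   # dominated by the second design
    (90, 45, 65),
]
front = pareto_front(designs)
print(front)
```

When several alternatives remain on the front, no single one is best on every objective at once, which is the kind of conflict between objectives the abstract calls a contradiction.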
338

Diskriminační a shluková analýza jako nástroj klasifikace objektů / Discriminant and cluster analysis as a tool for classification of objects

Rynešová, Pavlína January 2015
Cluster and discriminant analysis are basic classification methods. Cluster analysis organizes a disordered group of objects into several internally homogeneous classes or clusters. Discriminant analysis uses knowledge of the membership of objects in existing classes to build a classification rule, which can then be used to classify units with unknown group membership. The aim of this thesis is to compare discriminant analysis with different methods of cluster analysis. Squared Euclidean and Mahalanobis distances are used to measure the distances between objects within each cluster. In total, 28 datasets are analyzed in this thesis. When correlated variables were left in the dataset and the squared Euclidean distance was applied, Ward's method classified objects into clusters most successfully (42.0%). After changing the metric to the Mahalanobis distance, the furthest neighbor method became the most successful (37.5%). After removing highly correlated variables and applying methods with the Euclidean metric, Ward's method was again the most successful in classifying objects (42.0%). The results imply that cluster analysis is more precise when correlated variables are excluded than when they are left in a dataset. The average result of discriminant analysis, for data both with and without correlated variables, is 88.7%.
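The difference between the two metrics named in the abstract can be seen in a small two-dimensional sketch. The points and covariance matrices are hypothetical, and the function names are illustrative rather than taken from the thesis.

```python
# Squared Euclidean vs. squared Mahalanobis distance for 2-D points.
# The points and covariance matrices are hypothetical.

def squared_euclidean(x, y):
    """Sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def squared_mahalanobis_2d(x, y, cov):
    """(x - y)^T cov^{-1} (x - y) for a 2x2 covariance matrix cov."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix is singular")
    d1, d2 = x[0] - y[0], x[1] - y[1]
    # Inverse of a 2x2 matrix: 1/det * [[s22, -s12], [-s21, s11]]
    return (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det


x, y = (2.0, 2.0), (0.0, 0.0)
identity = ((1.0, 0.0), (0.0, 1.0))
correlated = ((1.0, 0.8), (0.8, 1.0))  # two strongly correlated variables

print(squared_euclidean(x, y))
print(squared_mahalanobis_2d(x, y, identity))
print(squared_mahalanobis_2d(x, y, correlated))
```

With an identity covariance the two distances coincide. With correlated variables the Mahalanobis distance shrinks for points lying along the direction of correlation, which is one reason the choice of metric interacts with whether correlated variables are kept in the dataset.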
339

Multivariate non-invasive measurements of skin disorders

Nyström, Josefina January 2006 (has links)
This thesis proposes new methods for obtaining objective and accurate diagnoses in modern healthcare. Non-invasive techniques were used to examine or diagnose three medical conditions: neuropathy among diabetics, radiotherapy-induced erythema (skin redness) among breast cancer patients, and cutaneous malignant melanoma. The techniques used were near-infrared spectroscopy (NIR), multi-frequency bioimpedance analysis of the whole body (MFBIA-body), laser Doppler imaging (LDI) and digital colour photography (DCP). Neuropathy among diabetics was studied in papers I and II. The first study was performed on diabetics and control subjects of both genders. Males and females separated in the data, so the data had to be divided by gender in order to obtain good models. Once this division was made, NIR spectroscopy was shown to be a viable technique for measuring neuropathy. The second study on diabetics, in which MFBIA-body was added to the analysis, was performed on males exclusively. Principal component analysis showed that healthy reference subjects tend to separate from diabetics, and that diabetics with severe neuropathy separate from those less affected. The preliminary study presented in paper III was performed on breast cancer patients to investigate whether NIR, LDI and DCP could detect radiotherapy-induced erythema. Its promising results motivated a new and larger study. This study, presented in papers IV and V, was intended to investigate the measurement techniques further and also to examine the effect of two skin lotions, Essex and Aloe vera, on the development of erythema. The Wilcoxon signed rank sum test showed that DCP and NIR could detect erythema developed during one week of radiation treatment, while LDI could detect erythema developed during two weeks of treatment. None of the techniques could detect any difference between the two lotions regarding the development of erythema. The use of NIR to diagnose cutaneous malignant melanoma is presented as unpublished results in this thesis. This study gave promising but inconclusive results. NIR could be of interest for the future development of instrumentation for the diagnosis of skin cancer.
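The abstract above relies on the Wilcoxon signed rank test to decide whether a technique detects a change between paired measurements (for example, the same skin site before and after a week of treatment). The following is a minimal sketch of an exact two-sided version of that test, assuming a small number of pairs. The function name `signed_rank_exact` and the readings are hypothetical and made up for illustration; they are not data or code from the thesis.

```python
from itertools import product

def signed_rank_exact(before, after):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences share their average rank.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # Test statistic: sum of the ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under the null hypothesis every rank is positive or negative with
    # probability 1/2, so enumerate all 2**n sign patterns (small n only).
    centre = sum(ranks) / 2
    observed_dev = abs(w_plus - centre)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n

# Hypothetical redness readings (arbitrary units) for eight patients.
before = [21, 30, 25, 28, 32, 29, 24, 31]
after = [29, 36, 34, 33, 43, 39, 27, 38]
w_plus, p_value = signed_rank_exact(before, after)
print(w_plus, p_value)
```

Full enumeration is only practical for a handful of pairs; for larger samples a library routine such as `scipy.stats.wilcoxon` would normally be used instead.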
340

Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation

Vitale, Raffaele 03 November 2017
This Ph.D. thesis, conceived primarily to support and reinforce the relationship between the academic and industrial worlds, was developed in collaboration with Shell Global Solutions (Amsterdam, The Netherlands). It aims to apply and, where possible, extend well-established latent variable-based approaches, namely Principal Component Analysis (PCA), Partial Least Squares regression (PLS) and Partial Least Squares Discriminant Analysis (PLSDA), to complex problem solving, not only in manufacturing troubleshooting and optimisation but also in the wider field of multivariate data analysis. To this end, novel and efficient algorithmic solutions are proposed throughout the chapters to address very disparate tasks, from calibration transfer in spectroscopy to real-time modelling of streaming data. The manuscript is divided into six parts:
Part I - Preface, which gives an overview of the research work, its main aims and its justification, together with a brief introduction to PCA, PLS and PLSDA.
Part II - On kernel-based extensions of PCA, PLS and PLSDA, which explores the potential of kernel techniques, possibly coupled to specific variants of the recently rediscovered pseudo-sample projection formulated by the English statistician John C. Gower, and compares their performance to that of more classical methodologies in four application scenarios: segmentation of Red-Green-Blue (RGB) images, discrimination of on-/off-specification batch runs, monitoring of batch processes and analysis of mixture designs of experiments.
Part III - On the selection of the number of factors in PCA by permutation testing, which provides an extensive guideline on selecting PCA components by permutation testing, illustrated through an original algorithmic procedure implemented for that purpose.
Part IV - On modelling common and distinctive sources of variability in multi-set data analysis, which discusses several practical aspects of two-block common and distinctive component analysis (carried out by methods such as Simultaneous Component Analysis (SCA), DIStinctive and COmmon Simultaneous Component Analysis (DISCO-SCA), Adapted Generalised Singular Value Decomposition (Adapted GSVD), ECO-POWER, Canonical Correlation Analysis (CCA) and 2-block Orthogonal Projections to Latent Structures (O2PLS)), describes a new computational strategy for determining the number of common factors underlying two data matrices that share the same row or column dimension, and presents two innovative approaches for calibration transfer between near-infrared spectrometers.
Part V - On the on-the-fly processing and modelling of continuous high-dimensional data streams, which designs the On-The-Fly Processing (OTFP) tool, a novel software system for the rational handling of multi-channel measurements recorded in real time.
Part VI - Epilogue, which draws final conclusions, delineates future perspectives and includes annexes.
Vitale, R. (2017). Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation [Unpublished doctoral thesis]. Universitat Politècnica de València.
https://doi.org/10.4995/Thesis/10251/90442 / TESIS
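
Part III of the thesis above concerns choosing the number of PCA components by permutation testing. The abstract does not describe the author's algorithm, so the following is only a generic sketch of the underlying idea, restricted to the first component: shuffle each column independently to destroy the correlation between variables, then check how often the shuffled data explain as much variance with one component as the real data do. All function names and the synthetic data are assumptions made for illustration.

```python
import random

def covariance(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
            for a in range(p)]

def top_eigenvalue(matrix, iterations=500):
    """Approximate largest eigenvalue of a symmetric PSD matrix (power iteration)."""
    p = len(matrix)
    v = [1.0 / (k + 1) for k in range(p)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
    return sum(x * y for x, y in zip(v, w))

def first_pc_share(rows):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance(rows)
    total = sum(cov[k][k] for k in range(len(cov)))
    return top_eigenvalue(cov) / total

def permutation_p_value(rows, n_perm=199, seed=0):
    """Permutation p-value for the first component's share of variance."""
    rng = random.Random(seed)
    observed = first_pc_share(rows)
    columns = [list(col) for col in zip(*rows)]
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for col in columns:
            col_copy = col[:]
            rng.shuffle(col_copy)  # break the link between variables
            shuffled.append(col_copy)
        permuted_rows = [list(r) for r in zip(*shuffled)]
        if first_pc_share(permuted_rows) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Synthetic data: three variables driven by one latent factor plus noise.
data_rng = random.Random(1)
latent = [data_rng.gauss(0, 1) for _ in range(40)]
data = [[t + data_rng.gauss(0, 0.1),
         2 * t + data_rng.gauss(0, 0.1),
         -t + data_rng.gauss(0, 0.1)] for t in latent]
share, p_value = permutation_p_value(data)
print(share, p_value)
```

A small p-value suggests that the first component reflects real structure and not chance. Testing further components would require removing the variance of the earlier ones first, which this sketch does not attempt.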
