41 |
Seleção de características: abordagem via redes neurais aplicada à segmentação de imagens / Feature selection: a neural approach applied to image segmentation
Davi Pereira dos Santos, 21 March 2007
A segmentação de imagens é fundamental para a visão computacional. Com essa finalidade, a textura tem sido uma propriedade bastante explorada por pesquisadores. Porém, a existência de diversos métodos de extração de textura, muitas vezes específicos para determinadas aplicações, dificulta a implementação de sistemas de escopo mais geral. Tendo esse contexto como motivação e inspirado no sucesso dos sistemas de visão naturais e em sua generalidade, este trabalho propõe a combinação de métodos por meio da seleção de características baseada na saliência das sinapses de um perceptron multicamadas (MLP). É proposto, também, um método alternativo baseado na capacidade do MLP de apreender textura que dispensa o uso de técnicas de extração de textura. Como principal contribuição, além da comparação da heurística de seleção proposta frente à busca exaustiva segundo o critério da distância de Jeffrey-Matusita, foi introduzida a técnica de Equalização da Entrada, que melhorou consideravelmente a qualidade da medida de saliência. É também apresentada a segmentação de imagens de cenas naturais, como exemplo de aplicação / Segmentation is a crucial step in computer vision, and texture is a property that many researchers have exploited to achieve it. However, the large number of texture extraction methods, many of them specific to particular applications, is a hurdle when modeling systems for more general problems. Motivated by this context and inspired by the success and generality of natural vision systems, this work adopts a feature selection method based on the saliency of the synaptic connections of a multilayer perceptron (MLP), together with an alternative method based on the MLP's ability to learn texture directly. Besides comparing the proposed method with exhaustive search under the Jeffrey-Matusita distance criterion, this work introduces, as its major contribution, the Input Equalization technique, which significantly improved the segmentation results. The segmentation of images of natural scenes is also presented as an example application of the method.
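As a rough illustration of what weight-based saliency can look like, the sketch below ranks the inputs of a one-hidden-layer MLP by the summed magnitude of the connections leaving each input, weighted by the outgoing strength of each hidden unit. This particular formula, the function names and the toy weights are assumptions made for illustration; the abstract does not state the exact saliency measure used in the thesis, and Input Equalization is not modeled here.

```python
# Minimal sketch of weight-based input saliency for a one-hidden-layer MLP.
# ASSUMPTION: saliency(i) = sum_j |w_in[i][j]| * sum_k |w_out[j][k]|.
# This is one common heuristic, not necessarily the thesis's definition.

def input_saliency(w_in, w_out):
    """w_in[i][j]: weight input i -> hidden j; w_out[j][k]: hidden j -> output k."""
    # Total outgoing strength of each hidden unit.
    hidden_strength = [sum(abs(v) for v in row) for row in w_out]
    # Each input's saliency accumulates over all hidden units it feeds.
    return [
        sum(abs(w) * s for w, s in zip(row, hidden_strength))
        for row in w_in
    ]

def select_features(w_in, w_out, keep):
    """Return the indices of the `keep` most salient inputs, in index order."""
    sal = input_saliency(w_in, w_out)
    ranked = sorted(range(len(sal)), key=lambda i: sal[i], reverse=True)
    return sorted(ranked[:keep])

# Hypothetical trained weights: 3 texture features, 2 hidden units, 1 output.
w_in = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4]]
w_out = [[1.0], [-2.0]]
saliency = input_saliency(w_in, w_out)
selected = select_features(w_in, w_out, keep=2)
```

A heuristic of this kind is only meaningful when the inputs are on comparable scales, since a feature with a small numeric range tends to receive large weights regardless of its usefulness.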
|
42 |
Uma abordagem de predição estruturada baseada no modelo perceptron
Coelho, Maurício Archanjo Nunes, 25 June 2015
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / A teoria sobre aprendizado supervisionado tem avançado significativamente nas últimas
décadas. Diversos métodos são largamente utilizados para resoluções dos mais variados
problemas, citando alguns: sistemas especialistas para obter respostas do tipo verdadeiro/
falso, o modelo Perceptron para separação de classes, Máquina de Vetores Suportes
(SVMs) e o Algoritmo de Margem Incremental (IMA) no intuito de aumentar a margem
de separação, suas versões multi-classe, bem como as redes neurais artificiais, que apresentam
possibilidades de entradas relativamente complexas. Porém, como resolver tarefas
que exigem respostas tão complexas quanto as perguntas?
Tais respostas podem consistir em várias decisões inter-relacionadas que devem ser ponderadas
uma a uma para se chegar a uma solução satisfatória e globalmente consistente.
Será visto no decorrer do trabalho que existem problemas de relevante interesse que apresentam
estes requisitos.
Uma questão que naturalmente surge é a necessidade de se lidar com a explosão combinatória
das possíveis soluções. Uma alternativa encontrada apresenta-se através da construção
de modelos que compactam e capturam determinadas propriedades estruturais
do problema: correlações sequenciais, restrições temporais, espaciais, etc. Tais modelos,
chamados de estruturados, incluem, entre outros, modelos gráficos, tais como redes de
Markov e problemas de otimização combinatória, como matchings ponderados, cortes de
grafos e agrupamentos de dados com padrões de similaridade e correlação.
Este trabalho formula, apresenta e discute estratégias on-line eficientes para predição
estruturada baseadas no princípio de separação de classes derivados do modelo Perceptron
e define um conjunto de algoritmos de aprendizado supervisionado eficientes quando
comparados com outras abordagens.
São também realizadas e descritas duas aplicações experimentais a saber: inferência dos
custos das diversas características relevantes para a realização de buscas em mapas variados
e a inferência dos parâmetros geradores dos grafos de Markov. Estas aplicações têm
caráter prático, enfatizando a importância da abordagem proposta. / The theory of supervised learning has advanced significantly in recent decades. Several methods are widely used to solve a variety of problems, for example: expert systems for true/false answers, the Perceptron model for class separation, Support Vector Machines (SVMs) and the Incremental Margin Algorithm (IMA), which aim to increase the separation margin, their multi-class versions, and artificial neural networks, which allow relatively complex inputs. But how can we solve tasks that require answers as complex as the questions? Such answers may consist of several interrelated decisions that must be weighed one by one to arrive at a satisfactory and globally consistent solution. As will be seen throughout the thesis, there are problems of relevant interest that have these requirements.
One question that naturally arises is the need to deal with the combinatorial explosion of possible solutions. One alternative is to build models that compactly capture certain structural properties of the problem: sequential correlations, temporal and spatial constraints, etc. These structured models include, among others, graphical models such as Markov networks, and combinatorial optimization problems such as weighted matchings, graph cuts and data clustering with similarity and correlation patterns.
This thesis formulates, presents and discusses efficient online strategies for structured prediction based on the class separation principle derived from the Perceptron model, and defines a set of supervised learning algorithms that are efficient compared with other approaches.
Two experimental applications are also carried out and described: inferring the costs of the features relevant to searches on various maps, and inferring the generating parameters of Markov graphs. These applications are practical in nature and emphasize the importance of the proposed approach.
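The Perceptron-style update for structured outputs mentioned in this abstract can be sketched as follows. This is a generic structured perceptron over a small, explicitly enumerated candidate set, not the algorithms defined in the thesis; the feature map, the candidate tuples and the function names are hypothetical. In realistic problems the argmax is computed by a combinatorial solver (shortest path, matching, Viterbi) rather than by enumeration.

```python
# Minimal structured perceptron sketch (generic technique, illustrative only).
# ASSUMPTION: each example lists its candidate outputs explicitly, and
# `features` maps a candidate to a fixed-length feature vector.

def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def predict(w, candidates, features):
    # Highest-scoring candidate under the current weights (ties: first wins).
    return max(candidates, key=lambda y: dot(w, features(y)))

def train(examples, features, n_features, epochs=10):
    w = [0.0] * n_features
    for _ in range(epochs):
        mistakes = 0
        for candidates, gold in examples:
            guess = predict(w, candidates, features)
            if guess != gold:
                # Move the weights toward the gold output's features
                # and away from the wrongly predicted output's features.
                fg, fh = features(gold), features(guess)
                w = [wi + a - b for wi, a, b in zip(w, fg, fh)]
                mistakes += 1
        if mistakes == 0:  # all training examples are predicted correctly
            break
    return w

# Hypothetical toy data: each candidate is a tuple of three feature counts.
features = list
examples = [
    ([(1, 1, 0), (4, 0, 0), (0, 0, 2)], (4, 0, 0)),
    ([(0, 0, 2), (2, 1, 0)], (2, 1, 0)),
]
w = train(examples, features, n_features=3)
```

The update is applied online, one example at a time, which is what makes this family of methods attractive when the output space is too large to enumerate during training.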
|
43 |
Rozpoznání textu s využitím neuronových sítí / Text recognition with artificial neural networks
Peřinová, Barbora, January 2018
This master's thesis deals with optical character recognition (OCR). The first part describes the basic types of OCR tasks and divides the algorithm into individual phases. The most commonly used methods for each phase are then described. For the character recognition phase, artificial neural networks and their use in that phase are explained, specifically the multilayer perceptron and convolutional neural networks. The second part defines the requirements for a specific application whose output is to be used as feedback for a robotic system. Convolutional neural networks and the CNTK deep learning library, used to implement the algorithm in .NET, are introduced. Finally, the test results of the individual phases of the proposed solution and a comparison with the open-source Tesseract engine are discussed.
|
44 |
Estimativa de nitrogênio em Annona emarginata (Schltdl.) H. Rainer utilizando espectroscopia no infravermelho próximo (NIRS): Uma abordagem estatística e computacional
Gomes, Rafaela Lanças, January 2020
Orientador: Gisela Ferreira / Resumo: O nitrogênio é um elemento mineral essencial para as plantas. Sua deficiência em fases iniciais do desenvolvimento pode gerar alterações fisiológicas e morfológicas que reduzem o crescimento e conflui na não expressão total do potencial genético vegetal. As técnicas mais difundidas para a quantificação do N nas plantas demandam tempo, são destrutivas e liberam compostos tóxicos para o ambiente. A NIRS (Near-Infrared Spectroscopy - Espectroscopia no Infravermelho Próximo), se apresenta como uma técnica alternativa, sendo indireta, mas instantânea, não destrutiva e que não utiliza reagentes químicos, mas necessita de calibração, que pode ser feita por métodos estatísticos e computacionais. Para mudas que são produzidas em viveiros, como as de Annona emarginata (Schltdl.) H. Rainer, é essencial manter o monitoramento de N, de forma rápida e não danosa, para garantir a qualidade e vigor das mudas. Desta forma, este trabalho visou detectar alterações na caracterização espectral foliar de A. emarginata em função do fornecimento de concentrações de nitrogênio e classificar as mudas em função dos níveis de nitrogênio, com base na caracterização espectral, utilizando algoritmos de aprendizado de máquinas e análise estatística multivariada. As mudas de A. emarginata (240) foram mantidas em sistema de hidroponia, com alterações na concentração de nitrogênio: 0 mg.L⁻¹ de N (T1); 52,5 mg.L⁻¹ de N (T2); 105 mg.L⁻¹ de N (T3) e 210 mg.L⁻¹ de N (T4), com 60 repetições (mudas) para cada tratam... (Resumo completo, clicar acesso eletrônico abaixo) / Abstract: Nitrogen is an essential mineral element for plants. Its deficiency in the early stages of development can lead to physiological and morphological changes that reduce growth and prevent the plant's genetic potential from being fully expressed. The most widespread techniques for quantifying N in plants are time-consuming and destructive, and release toxic compounds into the environment. Near-infrared spectroscopy (NIRS) is an alternative technique: it is indirect but instantaneous and non-destructive, and it uses no chemical reagents, although it requires calibration, which can be done by statistical and computational methods. For seedlings produced in nurseries, such as those of Annona emarginata (Schltdl.) H. Rainer, fast and harmless N monitoring is essential to ensure seedling quality and vigor. Thus, this work aimed to detect changes in the leaf spectral characterization of A. emarginata as a function of the nitrogen concentration supplied, and to classify seedlings by nitrogen level based on that spectral characterization, using machine learning algorithms and multivariate statistical analysis. A total of 240 A. emarginata seedlings were kept in a hydroponic system with varying nitrogen concentrations: 0 mg.L⁻¹ of N (T1); 52.5 mg.L⁻¹ of N (T2); 105 mg.L⁻¹ of N (T3) and 210 mg.L⁻¹ of N (T4), with 60 replicates (seedlings) per treatment. After 45 days in these solutions, three leaves of each plant were collected and photographed, and their spectral characterization was measured. ... (Complete abstract click electronic access below) / Mestre
|
45 |
Applicering av maskininlärning för att predicera utfall av Kickstarter-projekt / Application of machine learning to predict outcome of Kickstarter-projects
Lidén, Rickard; In, Gabriel, January 2021
Crowdfunding är i den moderna digitala världen ett populärt sätt att samla in pengar till sitt projekt. Kickstarter är en av de ledande sidorna för crowdfunding. Predicering av ett Kickstarter-projekts framgång eller misslyckande kan därav vara av stort intresse för entreprenörer. Studiens syfte är att jämföra fyra olika algoritmers prediceringsförmåga på två olika Kickstarter-dataset. Det ena datasetet sträcker sig mellan åren 2020-2021, och det andra mellan åren 2016-2021. Algoritmerna som jämförs är KNN, Naive Bayes, MLP, och Random Forest. Av dessa fyra modeller så skapades i denna studie de bästa produktionsmodellerna av KNN och Random Forest. KNN var bäst för 2020-2021-datasetet, med 77,0% träffsäkerhet. Random Forest var bäst för 2016-2021-datasetet, med 76,8% träffsäkerhet. / In the modern, digitalized world, crowdfunding has become a popular way to raise money for a project. Kickstarter is one of the leading crowdfunding websites. Predicting the success or failure of a Kickstarter project by means of machine learning could therefore be of great interest to entrepreneurs. The purpose of this study is to compare the predictive ability of four algorithms on two Kickstarter datasets. One dataset covers the years 2020-2021 and the other covers 2016-2021. The algorithms compared are KNN, Naive Bayes, MLP and Random Forest. Of these four, KNN and Random Forest produced the best-performing models. KNN was best on the 2020-2021 dataset, with 77.0% accuracy. Random Forest was best on the 2016-2021 dataset, with 76.8% accuracy. The language used in this study is Swedish.
|
46 |
A Deep Learning Approach to Predicting Diagnosis Code from Electronic Health Records / Djupinlärning för prediktion av diagnoskod utifrån elektroniska patientjournaler
Håkansson, Ellinor, January 2018
Electronic Health Record (EHR) is an umbrella term encompassing demographic and health information about a patient from many different sources in a digital format. Deep learning has been used on EHRs in many successful studies, and there is great potential in future implementations. In this study, diagnosis classification of EHRs with multilayer perceptron (MLP) models is studied. Two MLPs with different architectures are constructed and run on both a modified version of the EHR dataset and the raw data. A Random Forest is used as a baseline for comparison. The MLPs do not beat the baseline: the best-performing MLP has a classification accuracy of 48.1%, which is 13.7 percentage points lower than that of the baseline. The results indicate that this approach should not be chosen when the dataset is small. However, the dataset is growing over time, so there is potential for continued research in the future. / Elektronisk patientjournal (EHR) är ett paraplybegrepp som används för att beskriva en digital samling av demografisk och medicinsk data från olika källor för en patient. Det finns stor potential i användandet av djupinlärning på dessa journaler och många framgångsrika studier har redan gjorts på området. I denna studie undersöks diagnosklassificering av elektroniska patientjournaler med Multi-layer perceptronmodeller. Två MLP-modeller av olika arkitekturer presenteras. Dessa körs på både en anpassad version av EHR-datamängden och på den råa EHR-datan. En Random Forest-modell används som baslinje för jämförelse. MLP-modellerna lyckas inte överträffa baslinjen, då den bästa MLP-modellen ger en klassifikationsnoggrannhet på 48,1%, vilket är 13,7 procentenheter mindre än baslinjens. Resultaten visar att en liten datamängd indikerar att djupinlärning bör väljas bort för denna typ av problem. Datamängden växer dock över tid, vilket gör området attraktivt för framtida studier.
|
47 |
[en] REDUCING TEACHER-STUDENT INTERACTIONS BETWEEN TWO NEURAL NETWORKS / [pt] REDUZINDO AS INTERAÇÕES PROFESSOR-ALUNO ENTRE DUAS REDES NEURAIS
GUSTAVO MADEIRA KRIEGER, 11 October 2019
[pt] Propagação de conhecimento é um dos pilares da evolução humana. Nossas descobertas são baseadas em conhecimentos já existentes, construídas em cima deles e então se tornam a fundação para a próxima geração de aprendizado. No ramo de Inteligência Artificial, existe o interesse em replicar esse aspecto da natureza humana em máquinas. Criando um primeiro modelo e treinando ele nos dados originais, outro modelo pode ser criado e aprender a partir dele ao invés de ter que começar todo o processo do zero. Se for comprovado que esse método é confiável, ele vai permitir várias mudanças na forma que nós abordamos machine learning, em que cada inteligência não será um microcosmo independente. Essa relação entre modelos é batizada de relação Professor-Aluno. Esse trabalho descreve o desenvolvimento de dois modelos distintos e suas capacidades de aprender usando a informação dada em um ao outro. Os experimentos apresentados aqui mostram os resultados desse treino e as diferentes metodologias usadas em busca do cenário ótimo em que esse processo de aprendizado é viável para replicação futura. / [en] Propagation of knowledge is one of the pillars of human evolution. Our discoveries are all based on preexisting knowledge, are built upon it, and then become the foundation for the next generation of learning. In the field of artificial intelligence, there is interest in replicating this aspect of human nature in machines. By creating a first model and training it on the original data, another model can be created that learns from it instead of having to learn everything from scratch. If this method proves reliable, it will allow many changes in the way we approach machine learning, especially by allowing different models to work together. This relation between models is called the Teacher-Student relation. This work describes the development of two separate models and their ability to learn using incomplete data and each other. The experiments presented here show the results of this training and the different methods used in pursuit of an optimal scenario in which such a learning process is viable for future use.
|
48 |
[en] MULTILAYER PERCEPTRON FOR CLASSIFYING POLYMERS FROM TENSILE TEST DATA / [pt] PERCEPTRON DE MÚLTIPLAS CAMADAS PARA A CLASSIFICAÇÃO DE POLÍMEROS A PARTIR DE DADOS DE ENSAIOS DE TRAÇÃO
HENRIQUE MONTEIRO DE ABREU, 03 September 2024
[pt] O ensaio de tração é o ensaio mecânico mais aplicado para a obtenção
das propriedades mecânicas de polímeros. Por meio de um ensaio de tração
é obtida a curva tensão-deformação, e é a partir desta curva que são obtidas propriedades mecânicas tais como o módulo de elasticidade, a tenacidade
e a resiliência do material, as quais podem ser utilizadas na identificação de
comportamentos mecânicos equivalentes em materiais poliméricos, seja para
a diferenciação de resíduos plásticos para a reciclagem ou para a classificação
de um material plástico reciclado quanto ao teor de um determinado polímero
em sua composição. Porém, a obtenção das propriedades mecânicas a partir da curva tensão-deformação envolve cálculos e ajustes nos intervalos da
curva em que essas propriedades são determinadas, tornando a obtenção das
propriedades mecânicas um processo complexo sem a utilização de programas
computacionais especializados. A partir da compreensão do padrão de comportamento da curva tensão-deformação de um material, algoritmos de aprendizagem de máquina (AM) podem ser ferramentas eficientes para automatizar
a classificação de diferentes tipos de materiais poliméricos. Com o objetivo
de verificar a acurácia de um algoritmo de AM na classificação de três tipos
de polímeros, foram realizados ensaios de tração em corpos de prova de polietileno de alta densidade (PEAD), polipropileno (PP) e policloreto de vinila
(PVC). O conjunto de dados obtido a partir das curvas tensão-deformação foi
utilizado no treinamento de uma rede neural artificial perceptron de múltiplas
camadas (PMC). Com uma acurácia de 0,9261 para o conjunto de teste, o
modelo obtido a partir da rede PMC foi capaz de classificar os polímeros com
base nos dados da curva tensão-deformação, indicando a possibilidade do uso
de modelos de AM para automatizar a classificação de materiais poliméricos a
partir de dados de ensaios de tração. / [en] The tensile test is the most widely applied mechanical test for obtaining the mechanical properties of polymers, which can be used to classify polymeric materials. A tensile test yields the stress-strain curve, from which mechanical properties such as the modulus of elasticity, toughness and resilience of the material are obtained. These properties can be used to identify equivalent mechanical behaviors in polymeric materials, whether to distinguish plastic waste for recycling or to classify a recycled plastic material according to the content of a given polymer in its composition. However, obtaining mechanical properties from the stress-strain curve involves calculations and adjustments in the intervals of the curve in which these properties are determined, which makes it a complex process without specialized software. By learning the behavior pattern of a material's stress-strain curve, machine learning (ML) algorithms can be efficient tools to automate the classification of different types of polymeric materials. To verify the accuracy of an ML algorithm in classifying three types of polymers, tensile tests were performed on specimens made of high-density polyethylene (HDPE), polypropylene (PP) and polyvinyl chloride (PVC). The dataset obtained from the stress-strain curves was used to train a multilayer perceptron (MLP) neural network. With an accuracy of 0.9261 on the test set, the model obtained from the MLP neural network was able to classify the polymers based on the stress-strain curve data, indicating that ML models can be used to automate the classification of polymeric materials from tensile test data.
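To make concrete the kind of manual calculation on the stress-strain curve that this abstract describes, the sketch below estimates the modulus of elasticity as the least-squares slope of the initial linear region and the toughness as the area under the curve by the trapezoidal rule. The data points, the `elastic_limit` cutoff and the function names are invented for illustration and are not taken from the thesis; choosing the linear interval is exactly the kind of adjustment the abstract calls complex.

```python
# Illustrative extraction of two properties from a stress-strain curve.
# ASSUMPTION: strain is dimensionless, stress is in MPa, and the elastic
# region is every point with strain <= elastic_limit (a hand-picked cutoff).

def modulus_of_elasticity(strain, stress, elastic_limit):
    """Least-squares slope of stress vs. strain over the elastic region."""
    pts = [(e, s) for e, s in zip(strain, stress) if e <= elastic_limit]
    n = len(pts)
    mean_e = sum(e for e, _ in pts) / n
    mean_s = sum(s for _, s in pts) / n
    num = sum((e - mean_e) * (s - mean_s) for e, s in pts)
    den = sum((e - mean_e) ** 2 for e, _ in pts)
    return num / den

def toughness(strain, stress):
    """Area under the whole curve (trapezoidal rule), in MPa = MJ/m^3."""
    return sum(
        0.5 * (stress[i] + stress[i + 1]) * (strain[i + 1] - strain[i])
        for i in range(len(strain) - 1)
    )

# Made-up curve, not measured data.
strain = [0.00, 0.01, 0.02, 0.04, 0.06]
stress = [0.0, 10.0, 20.0, 24.0, 25.0]

E = modulus_of_elasticity(strain, stress, elastic_limit=0.02)
U = toughness(strain, stress)
```

A classifier trained directly on the curve data, as in the thesis, avoids having to pick cutoffs such as `elastic_limit` by hand.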
|
49 |
Comparative study of neural networks and design of experiments to the classification of HIV status / Wilbert Sibanda
Sibanda, Wilbert, January 2013
This research addresses the novel application of design of experiments, artificial neural networks and logistic regression to study the effect of demographic characteristics on the risk of acquiring HIV infection among antenatal clinic attendees in South Africa. The annual antenatal HIV survey is the only major national indicator of HIV prevalence in South Africa and is vital for understanding changes in the HIV epidemic over time. The annual antenatal clinic data contain the following demographic characteristics for each pregnant woman: age (herein called mother's age), partner's age (herein father's age), population group (race), level of education, gravidity (number of pregnancies), parity (number of children born), and HIV and syphilis status. This project applied a screening design of experiments technique to rank the effects of individual demographic characteristics on the risk of acquiring an HIV infection. Various screening designs exist, such as fractional or full factorial and Plackett-Burman designs. In this work, a two-level fractional factorial design was selected for screening. In addition to screening designs, this project employed response surface methodologies (RSM) to estimate interaction and quadratic effects of demographic characteristics, using a central composite face-centered design and a Box-Behnken design. Furthermore, this research presents the novel application of multilayer perceptron (MLP) neural networks to model the demographic characteristics of antenatal clinic attendees. A review report was produced on the application of neural networks to modelling HIV/AIDS around the world; it is important for understanding the extent to which neural networks have been applied to study the HIV/AIDS pandemic. Finally, a binary logistic regression technique was employed to benchmark the results obtained by the design of experiments and neural network methodologies.
The two-level fractional factorial design demonstrated that HIV prevalence was highly sensitive to changes in the mother's age (15-55 years) and her level of education (Grades 0-13). The central composite face-centered and Box-Behnken designs, employed to study the individual and interaction effects of demographic characteristics on the spread of HIV in South Africa, demonstrated that the HIV status of an antenatal clinic attendee was highly sensitive to changes in the pregnant mother's age and her educational level. In addition, the interaction of the mother's age with other demographic characteristics was found to be an important determinant of the risk of acquiring an HIV infection. Furthermore, the central composite face-centered and Box-Behnken designs illustrated that, individually, the pregnant mother's parity and her partner's age had no marked effect on her HIV status. However, they did show marked effects on her HIV status in two-way interactions with other demographic characteristics. The MLP sensitivity test also showed that the age of the pregnant woman had the greatest effect on the risk of acquiring an HIV infection, while her gravidity and syphilis status had the lowest effects. The MLP modelling thus produced the same results as the screening and response surface methodologies. The binary logistic regression technique was compared with a Box-Behnken design to further elucidate the differential effects of demographic characteristics on the risk of acquiring HIV among pregnant women. Both methodologies indicated that the age of the pregnant woman and her level of education had the most profound effects on her risk of acquiring an HIV infection. To facilitate comparison of the performance of the classifiers used in this study, a receiver operating characteristic (ROC) curve was applied.
Theoretically, ROC analysis provides tools to select optimal models and to discard suboptimal ones independently of the cost context or the class distribution. SAS Enterprise Miner™ was employed to develop the required ROC curves. To validate the results obtained by the above classification methodologies, a credit scoring add-on in SAS Enterprise Miner™ was used to build binary target scorecards from HIV-positive and HIV-negative datasets for probability determination. The process involved grouping variables using weights of evidence (WOE) prior to performing a logistic regression to produce predicted probabilities. Creating bins for the scorecard enables the study of the inherent relationship between demographic characteristics and an individual's HIV status. This technique increases the understanding of the risk-ranking ability of the scorecard method, while offering the added advantage of being predictive.
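The weights-of-evidence grouping mentioned in this abstract can be illustrated with a small calculation. The sketch below uses one common convention, WOE = ln(share of negatives in the bin / share of positives in the bin); sign conventions differ between tools, and the convention used by the SAS add-on in the thesis is not stated in the abstract. The bin labels and counts are made up for illustration and are not survey data.

```python
import math

# Weight of evidence per bin, under the ASSUMED convention
#   WOE(bin) = ln( (neg_bin / neg_total) / (pos_bin / pos_total) ).
# Bins with a zero count would need smoothing, which is omitted here.

def weights_of_evidence(bins):
    """bins: {label: (negative_count, positive_count)} -> {label: WOE}."""
    total_neg = sum(neg for neg, _ in bins.values())
    total_pos = sum(pos for _, pos in bins.values())
    return {
        label: math.log((neg / total_neg) / (pos / total_pos))
        for label, (neg, pos) in bins.items()
    }

# Hypothetical age bins with invented (HIV-negative, HIV-positive) counts.
age_bins = {"15-24": (80, 20), "25-34": (60, 40), "35+": (60, 40)}
woe = weights_of_evidence(age_bins)
```

Under this convention a positive WOE marks a bin where negatives are over-represented relative to the whole sample, and a negative WOE marks the opposite; the WOE values then replace the raw variable as inputs to the logistic regression.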
|