141

Técnicas de análise de dados distribuídos em áreas / Analysis techniques of distributed data over areas

Bertolla, Jane Maiara [UNESP] 20 February 2015 (has links) (PDF)
The goal of this work is to study spatial analysis techniques in order to understand the patterns associated with area data, to test whether the observed pattern is random or whether the events cluster, to obtain maps smoother than the observed map, and to seek better estimates of adjacent structures. A data set of 1656 positive dengue cases recorded in the city of Rio Claro - SP during the first half of 2011 was used. With this data set, and with the aid of the software Terra View 4.2.2, kernel estimates were constructed for two estimation functions: the normal kernel and the quartic kernel. For the normal kernel estimation function the influence radii 100 m, 150 m, 200 m and 500 m were used; for the quartic kernel estimation function the radii were 250 m, 375 m, 500 m and 625 m. In the same software, maps were constructed for an exploratory analysis under three different criteria: equal intervals, quantile intervals and standard-deviation intervals. The corresponding smoothed maps were then constructed using the spatial moving average. Depending on the criterion used (quantiles, equal intervals or standard deviations), differences were observed in the mapped dengue occurrences, both for the original data and for the data transformed by the moving average. The behaviour of the quartic kernel was found to be similar to that of the normal kernel, but with different influence radii. This result corroborates the observation of Bailey and Gatrell (1995) that the choice of estimation function is not of great importance, since control can be exercised through the influence radius used for the estimate at each point. A random permutation test indicated spatial dependence among the observed values, with the I statistic equal to 0.389828 and a p-value of 0.01. Kawamoto (2012) applied the kernel estimate to the same data set, but considering the ...
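A minimal sketch of the kernel intensity estimation described above, assuming planar event coordinates in metres. The quartic form follows Bailey and Gatrell's formulation; the coordinates, grid, and bandwidths below are illustrative rather than taken from the Rio Claro data, and the actual analysis in the thesis was carried out in Terra View.

```python
import numpy as np

def kernel_intensity(events, grid, bandwidth, kind="quartic"):
    """Kernel intensity estimate of a point pattern over a grid of locations.

    events    : (n, 2) array of event coordinates (e.g. geocoded dengue cases)
    grid      : (m, 2) array of evaluation points
    bandwidth : influence radius tau, in the same units as the coordinates
    kind      : "quartic" or "normal"
    """
    # pairwise distances between grid points and events
    d = np.linalg.norm(grid[:, None, :] - events[None, :, :], axis=2)
    if kind == "quartic":
        u = d / bandwidth
        # quartic (biweight) kernel: 3 / (pi * tau^2) * (1 - u^2)^2 for u <= 1
        k = np.where(u <= 1.0, 3.0 / (np.pi * bandwidth**2) * (1.0 - u**2) ** 2, 0.0)
    else:
        # bivariate Gaussian ("normal") kernel
        k = np.exp(-0.5 * (d / bandwidth) ** 2) / (2.0 * np.pi * bandwidth**2)
    # each grid point accumulates the contribution of every event
    return k.sum(axis=1)

# illustrative usage with simulated coordinates (not the Rio Claro data)
rng = np.random.default_rng(0)
cases = rng.uniform(0.0, 2000.0, size=(300, 2))              # hypothetical metres
xs, ys = np.meshgrid(np.linspace(0, 2000, 60), np.linspace(0, 2000, 60))
grid = np.column_stack([xs.ravel(), ys.ravel()])
quartic_surface = kernel_intensity(cases, grid, bandwidth=250.0, kind="quartic")
normal_surface = kernel_intensity(cases, grid, bandwidth=100.0, kind="normal")
```

Comparing surfaces produced with the two kernels at different bandwidths reproduces, in miniature, the comparison reported in the abstract: the choice of kernel matters far less than the influence radius.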
142

Reconhecimento de fragmentos de impressões digitais baseado em cristas e poros / Fingerprint fragment recognition based on ridges and pores

Angeloni, Marcus de Assis. January 2013 (has links)
Advisor: Aparecido Nilceu Marana / Committee member: Aura Conci / Committee member: José Remo Ferreira Brega / Abstract: Among the various biometric traits that can be used to identify people, the fingerprint is the most widely used. Current automated fingerprint identification systems are based on ridge patterns and minutiae, classified as first- and second-level features, respectively. However, with the evolution of fingerprint sensors and the growing demand for more secure systems, it becomes possible and necessary to use an additional set of discriminative features present within the ridges, known as third-level features, which include the sweat pores. Recent research has focused on fingerprint recognition applications in which techniques based on first- and second-level features usually present low correct-recognition rates, such as the recognition of fingerprint fragments. This Master's dissertation aimed to propose, implement and evaluate the use of pores in the ridge-based matching method that uses the Hough Transform, in order to mitigate the false-positive cases that are common in this type of problem. Automatic pore-extraction methods based on isotropic and adaptive filters were evaluated, as well as the use of pores to assist in the image registration and comparison steps. Experimental results on the public PolyU HRF database of fingerprint fragments showed a reduction of approximately 5% in EER and of 15% in FAR100 and FAR1000 relative to the original ridge-based method. / Master's
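The dissertation's ridge-and-pore matcher is not reproduced here; the sketch below only illustrates the basic Hough-voting idea behind such registration, reduced to a translation-only alignment between two illustrative point sets (e.g., extracted pores) followed by a simple proximity-based match score. The function names, bin size, and tolerance are assumptions.

```python
import numpy as np
from collections import Counter

def hough_translation(ref_pts, query_pts, bin_size=4.0):
    """Estimate the (dx, dy) translation aligning two point sets by Hough voting.

    Each (reference, query) point pair casts a vote for the translation that would
    map the query point onto the reference point; the most voted bin wins.
    """
    votes = Counter()
    for rx, ry in ref_pts:
        for qx, qy in query_pts:
            dx = round((rx - qx) / bin_size)
            dy = round((ry - qy) / bin_size)
            votes[(dx, dy)] += 1
    (bx, by), _ = votes.most_common(1)[0]
    return bx * bin_size, by * bin_size

def match_score(ref_pts, query_pts, tol=6.0):
    """Fraction of query points (e.g. pores) with a reference point within tol
    pixels after Hough alignment."""
    dx, dy = hough_translation(ref_pts, query_pts)
    shifted = np.asarray(query_pts, dtype=float) + np.array([dx, dy])
    ref = np.asarray(ref_pts, dtype=float)
    hits = sum(1 for p in shifted if np.min(np.linalg.norm(ref - p, axis=1)) <= tol)
    return hits / len(shifted)

# toy usage: a query fragment shifted by (12, -7) relative to the reference
rng = np.random.default_rng(0)
reference = rng.uniform(0, 200, size=(40, 2))
query = reference[:25] + np.array([12.0, -7.0]) + rng.normal(0, 0.5, size=(25, 2))
score = match_score(reference, query)
```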
143

Testes multinomiais otimizados: uma aplicação no equilíbrio genético de Hardy-Weinberg. / Optimized multinomial tests: an application to Hardy-Weinberg genetic equilibrium.

André Jalles Monteiro 15 May 2002 (has links)
The chi-square statistic, when applied to the Hardy-Weinberg genetic equilibrium test, has low efficiency, especially when the sample is small. Some alternative procedures with excellent statistical properties have been presented: a homogeneous significance level and unbiasedness. These procedures have a major practical disadvantage: many points in the rejection region are randomized. In the present work a new property is presented, the maximum volume of the power function. In the search for a test with this property, a way of constructing the rejection region is suggested that contains the largest number of points without randomization. This procedure arises as an adaptation of the construction of the rejection region with the homogeneous-significance-level property, without the drawback of many randomized points, and presents the largest number of genotypic combinations associated with genetic disequilibrium, whatever the pre-established significance level of the test. It is thus a practical alternative that makes desirable theoretical properties feasible for a hypothesis test.
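As a rough illustration of building a non-randomized rejection region for the Hardy-Weinberg test, the sketch below enumerates the exact conditional distribution of heterozygote counts (Levene's distribution) and adds the least probable outcomes while the test size stays at or below the nominal level. The thesis's specific optimality criterion (maximum volume of the power function) is not implemented here; the sample size and allele count are illustrative.

```python
from math import lgamma, exp, log

def log_factorial(k):
    return lgamma(k + 1)

def hwe_probability(n_het, n, n_a):
    """Levene's exact probability of n_het heterozygotes among n individuals
    carrying n_a copies of the minor allele, under Hardy-Weinberg equilibrium."""
    n_aa = (n_a - n_het) // 2            # minor-allele homozygotes
    n_bb = n - n_aa - n_het              # major-allele homozygotes
    n_b = 2 * n - n_a
    log_p = (log_factorial(n) - log_factorial(n_aa) - log_factorial(n_het)
             - log_factorial(n_bb) + n_het * log(2)
             + log_factorial(n_a) + log_factorial(n_b) - log_factorial(2 * n))
    return exp(log_p)

def rejection_region(n, n_a, alpha=0.05):
    """Non-randomized rejection region: include the least probable heterozygote
    counts while the total probability under equilibrium stays within alpha."""
    hets = [h for h in range(n_a % 2, n_a + 1, 2) if h <= 2 * n - n_a]
    probs = sorted((hwe_probability(h, n, n_a), h) for h in hets)
    region, size = [], 0.0
    for p, h in probs:
        if size + p > alpha:
            break
        region.append(h)
        size += p
    return sorted(region), size

# illustrative usage: 100 genotyped individuals, 30 copies of the minor allele
region, attained_size = rejection_region(100, 30, alpha=0.05)
```

Because no boundary point is randomized, the attained size is generally below the nominal level, which is precisely the trade-off the thesis addresses by maximizing the volume of the power function.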
144

Factors impacting the adoption of biometric authentication in the local banking sector

Pooe, Antonio 08 November 2011 (has links)
M.Tech. / This research is concerned with establishing the causes for the slow adoption of biometric authentication in the South African banking sector and constitutes exploratory research. It looks at the widely accepted means of authentication and delves deeper into why these modes may not be sufficient to protect sensitive data. The scope of the research is limited to the banking sector only. The first sections of the study establish what the biometric authentication norms are amongst international banking institutions. This is then followed by an environmental study of the South African approach to biometric authentication. Owing to the limited number of banks in South Africa compared to developed countries, the study is limited to the four major banking institutions in South Africa, namely ABSA, Standard Bank, Nedbank and First National Bank. An online survey was used to gather the required data for analysis. The general approach adopted to investigate the extent to which biometric authentication is used by the said four banks was to first measure the respondents’ knowledge of biometrics and to establish the level of exposure the respondents had to the said technology. The next step was then to establish the extent to which the participating banks had investigated the use of biometric authentication. This was followed by consideration of the current use of biometric authentication and lastly, the future use and user perceptions regarding various aspects of biometric authentication in the financial services sector. A matrix that identifies the factors perceived to be impacting the adoption of biometric authentication concludes the last chapter on user perception.
145

Biometriese enkelaantekening tot IT stelsels / Biometric single sign-on to IT systems

Tait, Bobby Laubscher 21 April 2009 (has links)
M.Comm.
146

Randomization in a two armed clinical trial: an overview of different randomization techniques

Batidzirai, Jesca Mercy January 2011 (has links)
Randomization is the key element of any sound clinical trial. It is the only way to be sure that patients have been allocated to the treatment groups without bias and that the treatment groups are comparable before the start of the trial. The randomization scheme used to allocate patients to the treatment groups plays a role in achieving this goal. This study uses SAS simulations and categorical data analysis to compare two main classes of randomization schemes in dental studies with small samples: unrestricted and restricted randomization, represented by simple randomization and the minimization method respectively. Results show that minimization produces almost equally sized treatment groups, whereas simple randomization is weak at balancing prognostic factors. Nevertheless, simple randomization can, by chance, also produce balanced groups even in small samples. Statistical power is also higher when minimization is used than under simple randomization, although larger samples might be needed to boost the power.
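A minimal sketch contrasting the two schemes compared above: unrestricted simple randomization and a Pocock-Simon-style minimization over categorical prognostic factors. The factor names, biased-coin probability, and imbalance score below are illustrative assumptions, not the thesis's exact settings.

```python
import random

def simple_randomization(n_patients, seed=1):
    """Allocate each patient to arm A or B with probability 1/2 (unrestricted)."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_patients)]

def minimization(patients, factors, p_best=0.8, seed=1):
    """Pocock-Simon-style minimization over categorical prognostic factors.

    patients : list of dicts, e.g. {"age_group": "old", "gender": "F"}
    factors  : factor names to balance on
    p_best   : probability of assigning the arm that minimizes imbalance
    """
    rng = random.Random(seed)
    counts = {arm: {f: {} for f in factors} for arm in "AB"}
    allocation = []
    for pat in patients:
        # imbalance score each arm would have after receiving this patient
        scores = {}
        for arm in "AB":
            other = "B" if arm == "A" else "A"
            score = 0
            for f in factors:
                level = pat[f]
                score += abs((counts[arm][f].get(level, 0) + 1)
                             - counts[other][f].get(level, 0))
            scores[arm] = score
        if scores["A"] == scores["B"]:
            arm = rng.choice("AB")
        else:
            best = "A" if scores["A"] < scores["B"] else "B"
            arm = best if rng.random() < p_best else ("B" if best == "A" else "A")
        allocation.append(arm)
        for f in factors:
            level = pat[f]
            counts[arm][f][level] = counts[arm][f].get(level, 0) + 1
    return allocation

# toy usage with two prognostic factors
pts = [{"age_group": g, "gender": s} for g in ("young", "old") for s in "MF"] * 5
arms = minimization(pts, factors=["age_group", "gender"])
```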
147

Statistical Methods for Constructing Heterogeneous Biomarker Networks

Xie, Shanghong January 2019 (has links)
The theme of this dissertation is the construction of heterogeneous biomarker networks using graphical models for understanding disease progression and prognosis. Biomarkers may organize into networks of connected regions. Substantial heterogeneity in networks between individuals and subgroups of individuals is observed. The strengths of network connections may vary across subjects depending on subject-specific covariates (e.g., genetic variants, age). In addition, the connectivities between biomarkers, as subject-specific network features, have been found to predict disease clinical outcomes. Thus, it is important to accurately identify the biomarker network structure and estimate the strength of connections. Graphical models have been extensively used to construct complex networks. However, the estimated networks are at the population level, not accounting for subjects' covariates. More flexible covariate-dependent graphical models are needed to capture the heterogeneity among subjects and to create new network features that improve prediction of disease clinical outcomes and stratify subjects into clinically meaningful groups. A large number of parameters are required in covariate-dependent graphical models, so regularization needs to be imposed to handle the high-dimensional parameter space. Furthermore, personalized clinical symptom networks can be constructed to investigate the co-occurrence of clinical symptoms. When there are multiple biomarker modalities, the estimation of a target biomarker network can be improved by incorporating prior network information from the external modality. This dissertation contains four parts to achieve these goals: (1) an efficient l0-norm feature selection method based on augmented and penalized minimization to tackle the high-dimensional parameter space involved in covariate-dependent graphical models; (2) a two-stage approach to identify disease-associated biomarker network features; (3) an application to construct personalized symptom networks; (4) a node-wise biomarker graphical model to leverage the shared mechanism between multi-modality data when external-modality data are available.

In the first part of the dissertation, we propose a two-stage procedure that approximates the l0-norm as closely as possible and solve it with a highly efficient and simple computational algorithm. Advances in high-throughput technologies in genomics and imaging yield unprecedentedly large numbers of prognostic biomarkers. To accommodate the scale of biomarkers and study their association with disease outcomes, penalized regression is often used to identify important biomarkers. The ideal variable selection procedure would search for the best subset of predictors, which is equivalent to imposing an l0-penalty on the regression coefficients. Since this optimization is a non-deterministic polynomial-time hard (NP-hard) problem that does not scale with the number of biomarkers, alternative methods mostly place smooth penalties on the regression parameters, which lead to computationally feasible optimization problems. However, empirical studies and theoretical analyses show that convex approximations of the l0-norm (e.g., l1) do not outperform their l0 counterpart. Progress on l0-norm feature selection has been relatively slow, with the main methods being greedy algorithms such as stepwise regression or orthogonal matching pursuit. Penalized regression based on regularizing the l0-norm remains much less explored in the literature.
In this work, inspired by the recently popular augmenting and data-splitting algorithms, including the alternating direction method of multipliers, we propose a two-stage procedure for l0-penalty variable selection, referred to as augmented penalized minimization-L0 (APM-L0). APM-L0 targets the l0-norm as closely as possible while keeping computation tractable, efficient, and simple, which is achieved by iterating between a convex regularized regression and a simple hard-thresholding estimation. The procedure can be viewed as arising from regularized optimization with a truncated l1 norm. Thus, we propose to treat the regularization parameter and the thresholding parameter as tuning parameters and select them by cross-validation. A one-step coordinate descent algorithm is used in the first stage to significantly improve computational efficiency. Through extensive simulation studies and a real-data application, we demonstrate superior performance of the proposed method in terms of selection accuracy and computational speed compared to existing methods. The proposed APM-L0 procedure is implemented in the R package APML0.

In the second part of the dissertation, we develop a two-stage method to estimate biomarker networks that account for heterogeneity among subjects and to evaluate the networks’ association with disease clinical outcomes. In the first stage, we propose a conditional Gaussian graphical model with mean and precision matrix depending on covariates to obtain subject- or subgroup-specific networks. In the second stage, we evaluate the clinical utility of the network measures (connection strengths) estimated in the first stage. The second-stage analysis provides the relative predictive power of between-region network measures on clinical impairment in the context of regional biomarkers and existing disease risk factors. We assess the performance of the proposed method by extensive simulation studies and an application to a Huntington’s disease (HD) study to investigate the effect of the HD causal gene on the rate of change in motor symptoms through its effect on brain subcortical and cortical grey matter atrophy connections. We show that cortical network connections and subcortical volumes, but not subcortical connections, are predictive of clinical motor function deterioration. We validate these findings in an independent HD study. Lastly, highly similar patterns seen in the grey matter connections and in a previous white matter connectivity study suggest a shared biological mechanism for HD and support the hypothesis that white matter loss is a direct result of neuronal loss, as opposed to the loss of myelin or dysmyelination.

In the third part of the dissertation, we apply the methodology to construct heterogeneous cross-sectional symptom networks. The co-occurrence of symptoms may result from direct interactions between these symptoms, and the symptoms can be treated as a system. In addition, subject-specific risk factors (e.g., genetic variants, age) can exert external influence on the system. In this work, we develop a covariate-dependent conditional Gaussian graphical model to obtain personalized symptom networks. The strengths of network connections are modeled as a function of covariates to capture the heterogeneity among individuals and subgroups of individuals.
We assess the performance of the proposed method by simulation studies and an application to a Huntington’s disease study to investigate the networks of symptoms in different domains (motor, cognitive, psychiatric) and to identify the important brain imaging biomarkers associated with the connections. We show that symptoms in the same domain interact more often with each other than across domains. We validate the findings using subjects’ measurements from follow-up visits.

In the fourth part of the dissertation, we propose an integrative learning approach to improve the estimation of subject-specific networks of the target modality when external-modality data are available. The biomarker networks measured by different modalities of data (e.g., structural magnetic resonance imaging (sMRI), diffusion tensor imaging (DTI)) may share the same true underlying biological mechanism. In this work, we propose a node-wise biomarker graphical model to leverage the shared mechanism between multi-modality data, providing a more reliable estimation of the target-modality network while accounting for the heterogeneity in networks due to differences between subjects and to the networks of the external modality. Latent variables are introduced to represent the shared unobserved biological network, and the information from the external modality is incorporated to model the distribution of the underlying biological network. An approximation approach is used to calculate the posterior expectations of the latent variables to reduce computation time. The performance of the proposed method is demonstrated by extensive simulation studies and an application to construct the grey matter brain atrophy network of Huntington’s disease using sMRI and DTI data. The estimated network measures are shown to be meaningful for predicting follow-up clinical outcomes in terms of patient stratification and prediction. Lastly, we conclude the dissertation with comments on limitations and extensions.
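A toy sketch of the convex-fit-plus-hard-thresholding idea behind the first part of the dissertation: a lasso fit, hard thresholding of the coefficients, and an unpenalized refit on the retained support. This is a single pass under assumed tuning values, not the iterated APM-L0 procedure or the APML0 package, which selects both tuning parameters by cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def lasso_then_threshold(X, y, lam=0.1, k=10):
    """Illustrative two-stage selection: a convex (lasso) fit followed by a
    hard-threshold step that keeps the k largest coefficients, then an
    unpenalized refit on the retained support."""
    beta = Lasso(alpha=lam, max_iter=10000).fit(X, y).coef_
    support = np.argsort(np.abs(beta))[::-1][:k]       # hard-threshold step
    support = support[np.abs(beta[support]) > 0]       # drop exact zeros
    refit = LinearRegression().fit(X[:, support], y)   # debiased refit
    coef = np.zeros(X.shape[1])
    coef[support] = refit.coef_
    return coef, support

# toy usage: a sparse signal with 5 true predictors among 200
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 200))
beta_true = np.zeros(200)
beta_true[:5] = [2.0, -1.5, 1.0, -2.0, 1.2]
y = X @ beta_true + rng.normal(size=150)
coef, support = lasso_then_threshold(X, y, lam=0.1, k=5)
```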
148

Causal Mediation Analysis for Effect Heterogeneity

Zhang, Jiaqing January 2021 (has links)
It is possible to quantify and understand how an exposure affects an outcome through an intermediate variable via causal mediation analysis. In many cases in practice, however, the effect of the exposure may vary across subgroups of the population. Combining these two ideas yields the related concepts of moderated mediation and mediated moderation. Addressing questions of why and how an exposure gives rise to an outcome differently for different subsets of the population provides a deeper understanding of the effect heterogeneity phenomenon and permits clinically and practically meaningful insights about what works for whom and through which intermediate(s).

This dissertation explores how to understand and explain these causal mechanisms by focusing on explaining effect heterogeneity via causal mediation analysis. Formal definitions and analytical formulas for direct and indirect effect heterogeneity measures are described from a counterfactual perspective. Various types of direct and indirect effect heterogeneity from two-way and three-way decompositions, such as natural direct and indirect effect heterogeneity and pure direct and indirect effect heterogeneity, are introduced and defined. However, simply decomposing the total effect heterogeneity into direct and indirect effect heterogeneity does not fully account for the complex mechanism of the two-way and three-way interactions involved in the effect heterogeneity phenomenon. Arising from this, in the context of a regression-based approach, this dissertation shows how direct and indirect effect heterogeneity can be further decomposed to account for possible multi-way interactions between exposure, mediator, and modifier. This is an essential way to account for different portions of interactions along the causal pathways of effect heterogeneity, and it provides further causal insight into the question of for whom and in what context the effect occurs. Identification assumptions that are sufficient for the estimation of the effect heterogeneity decompositions are also considered. Analytical expressions for effect heterogeneity decompositions on the additive and ratio scales are provided. Data from the National Longitudinal Study of Adolescent to Adult Health (Add Health) are used to illustrate the proposed methodologies.
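A small sketch of the regression-based mediation quantities that the decompositions above build on, under the standard counterfactual identification assumptions: natural direct and indirect effects from linear mediator and outcome models with an exposure-mediator interaction, contrasted across levels of a hypothetical binary moderator G as a crude measure of effect heterogeneity. The simulated data, variable names, and evaluation at the covariate mean are illustrative assumptions, not the dissertation's analysis.

```python
import numpy as np

def ols(y, X):
    """Ordinary least squares with an intercept, via numpy's least squares."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def natural_effects(A, M, Y, C, a0=0.0, a1=1.0):
    """Natural direct/indirect effects with an exposure-mediator interaction
    (continuous M and Y, one covariate C).

    Mediator model: M = b0 + b1*A + b2*C
    Outcome model:  Y = t0 + t1*A + t2*M + t3*A*M + t4*C
    """
    b0, b1, b2 = ols(M, np.column_stack([A, C]))
    t0, t1, t2, t3, t4 = ols(Y, np.column_stack([A, M, A * M, C]))
    c = C.mean()                                  # evaluate at the covariate mean
    nde = (t1 + t3 * (b0 + b1 * a0 + b2 * c)) * (a1 - a0)
    nie = (t2 * b1 + t3 * b1 * a1) * (a1 - a0)
    return nde, nie

# simulated data with a hypothetical binary moderator G
rng = np.random.default_rng(0)
n = 4000
C = rng.normal(size=n)
G = rng.integers(0, 2, size=n)
A = rng.integers(0, 2, size=n).astype(float)
M = 0.5 * A + 0.8 * A * G + 0.3 * C + rng.normal(size=n)   # mediation stronger when G = 1
Y = A + 0.8 * M + 0.4 * A * M + 0.2 * C + rng.normal(size=n)

# crude effect heterogeneity: contrast the decomposition across levels of G
nde1, nie1 = natural_effects(A[G == 1], M[G == 1], Y[G == 1], C[G == 1])
nde0, nie0 = natural_effects(A[G == 0], M[G == 0], Y[G == 0], C[G == 0])
direct_heterogeneity, indirect_heterogeneity = nde1 - nde0, nie1 - nie0
```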
149

Clustering Algorithm for Zero-Inflated Data

January 2020 (has links)
Zero-inflated data are common in biomedical research. In cluster analysis, the heuristic approach fails to provide inferential properties for the outcome, while the existing model-based approach only works for a mixture of multivariate normal distributions. In this dissertation, I developed two new model-based clustering algorithms: the multivariate zero-inflated log-normal and the multivariate zero-inflated Poisson clustering algorithms. I then applied these methods to the questionnaire data and compared the resulting clusters to those derived under a multivariate normal assumption. Associations between clustering results and clinical outcomes were also investigated.
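A hedged sketch of the multivariate zero-inflated Poisson clustering idea: an EM algorithm for a finite mixture in which, within each cluster, the coordinates are treated as independent zero-inflated Poisson variables. Initialization, iteration count, and the simulated counts are assumptions, and the dissertation's actual algorithms may differ.

```python
import numpy as np
from scipy.special import gammaln

def zip_logpdf(x, pi, lam):
    """Elementwise log-density of a zero-inflated Poisson."""
    zero = np.log(pi + (1.0 - pi) * np.exp(-lam))
    pos = np.log1p(-pi) - lam + x * np.log(lam) - gammaln(x + 1.0)
    return np.where(x == 0, zero, pos)

def zip_mixture_em(X, K, n_iter=100, seed=0):
    """EM for a mixture of multivariate zero-inflated Poissons, with
    dimensions treated as conditionally independent within each cluster."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.full(K, 1.0 / K)                      # cluster weights
    pi = rng.uniform(0.1, 0.5, size=(K, d))      # structural-zero probabilities
    lam = rng.uniform(0.5, 5.0, size=(K, d))     # Poisson rates
    for _ in range(n_iter):
        # E-step: cluster responsibilities
        logp = np.stack([zip_logpdf(X, pi[k], lam[k]).sum(axis=1) + np.log(w[k])
                         for k in range(K)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step, with an inner E-step for the structural-zero indicators
        for k in range(K):
            z = np.where(X == 0, pi[k] / (pi[k] + (1 - pi[k]) * np.exp(-lam[k])), 0.0)
            rk = r[:, k][:, None]
            pi[k] = np.clip((rk * z).sum(axis=0) / rk.sum(), 1e-6, 1 - 1e-6)
            active = rk * (1.0 - z)
            lam[k] = np.maximum((active * X).sum(axis=0)
                                / np.maximum(active.sum(axis=0), 1e-12), 1e-6)
        w = r.mean(axis=0)
    return w, pi, lam, r.argmax(axis=1)

# toy usage on simulated zero-inflated counts
rng = np.random.default_rng(1)
X = rng.poisson(2.0, size=(300, 5)) * rng.binomial(1, 0.7, size=(300, 5))
weights, pis, lams, labels = zip_mixture_em(X, K=2)
```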
150

Fitting of survival functions for grouped data on insurance policies

Louw, Elizabeth Magrietha 28 November 2005 (has links)
The aim of the research is the statistical modelling of parametric survival distributions for grouped survival data on long- and short-term policies in the insurance industry, by means of maximum likelihood estimation subject to constraints. This methodology leads to explicit expressions for the estimates of the parameters, as well as for approximated variances and covariances of the estimates, and yields exact maximum likelihood estimates of the parameters. This makes direct extension to more complex designs feasible. The statistical modelling offers parametric models for survival distributions, in contrast with the non-parametric models commonly used in the actuarial profession. When parametric models provide a good fit to the data, they tend to give more precise estimates of the quantities of interest, such as odds ratios, hazard ratios or median lifetimes. These estimates form the statistical foundation for scientific decision-making with respect to the actuarial design, maintenance and marketing of insurance policies. Although the methodology in this thesis is developed specifically for the insurance industry, it may be applied in the general context of research and scientific decision-making, including, for example, survival distributions in the medical, biological, engineering, econometric and sociological sciences. / Dissertation (PhD (Mathematical Statistics))--University of Pretoria, 2002. / Mathematics and Applied Mathematics / unrestricted
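A generic numerical sketch of fitting a parametric survival distribution to grouped data: a Weibull fitted by maximizing a multinomial likelihood over interval death counts plus a right-censored tail. The thesis derives explicit constrained maximum likelihood estimates; the code below simply optimizes numerically, and the interval boundaries and counts are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def weibull_survival(t, shape, scale):
    """Weibull survival function S(t)."""
    return np.exp(-(t / scale) ** shape)

def grouped_negloglik(params, boundaries, counts):
    """Multinomial negative log-likelihood for grouped survival data.

    boundaries : interval endpoints [0, t1, ..., tm]
    counts     : deaths in each interval, plus survivors beyond tm as the
                 final (right-censored) entry
    """
    shape, scale = np.exp(params)                   # keep parameters positive
    S = weibull_survival(np.asarray(boundaries, dtype=float), shape, scale)
    probs = np.append(S[:-1] - S[1:], S[-1])        # interval probabilities + tail
    return -np.sum(counts * np.log(np.maximum(probs, 1e-300)))

# toy grouped data: deaths per policy-year interval and survivors after year 5
boundaries = [0, 1, 2, 3, 4, 5]
counts = np.array([12, 18, 22, 19, 14, 915])        # last entry = survivors
fit = minimize(grouped_negloglik, x0=np.log([1.0, 10.0]),
               args=(boundaries, counts), method="Nelder-Mead")
shape_hat, scale_hat = np.exp(fit.x)
```

Working on the log scale of the parameters is a simple way to keep the optimizer inside the valid region; the thesis instead imposes the constraints explicitly and obtains closed-form estimates.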
