  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
351

"Identificação de espécies vegetais através da análise da forma interna de órgãos foliares" / Plant species identification based on venation system shape analysis

Rodrigo de Oliveira Plotze 29 October 2004
The diversity of species in the plant kingdom makes the identification of leaf organs a very complex task. Species biodiversity, combined with traditional taxonomy models, turns this task into a real challenge for researchers. This work presents a new approach to plant species identification based on internal characteristics of leaf organs. The data are collected with computer vision and image analysis techniques, which extract features related to the complexity (fractal dimension) and biometry of the leaf organs. The efficiency of the methodology was evaluated on real cases of species identification using two image sets: species from the Mata Atlântica (Atlantic Forest) and the Brazilian Cerrado, and wild passion fruit species of the genus Passiflora. The species were classified with two pattern recognition techniques: cluster analysis and artificial neural networks.
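The abstract above names fractal dimension as a complexity feature but does not say which estimator was used. As a minimal sketch, assuming a box-counting estimator and a venation image already reduced to a list of foreground pixel coordinates (both are assumptions for illustration, not the thesis's method), the idea can be written as:

```python
import math

def box_counting_dimension(points, box_sizes):
    """Estimate the box-counting dimension of a set of 2-D integer pixel coordinates.

    For each box size s, count how many s-by-s grid cells contain at least one
    point, then fit the slope of log(count) against log(1/s) by least squares.
    """
    xs, ys = [], []
    for size in box_sizes:
        occupied = {(x // size, y // size) for x, y in points}
        xs.append(math.log(1.0 / size))
        ys.append(math.log(len(occupied)))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx

# Hypothetical test shapes: a diagonal line and a filled square.
line = [(i, i) for i in range(256)]
square = [(x, y) for x in range(64) for y in range(64)]
sizes = [1, 2, 4, 8, 16]
line_dim = box_counting_dimension(line, sizes)
square_dim = box_counting_dimension(square, sizes)
```

A straight line should give a value close to 1 and a filled square a value close to 2; a branching venation pattern would be expected to fall in between.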
352

Identificação de espécies vegetais por meio da análise do contorno foliar - uma abordagem bio-inspirada / Plant species identification through leaf contour analysis: a bio-inspired approach

Maurício Falvo 12 August 2005
Under plant taxonomy standards, identifying a plant requires the analysis of its leaves, flowers, and fruits. The TreeVis project aims to assist in the identification of plant species with biometric methods, using only some attributes of a leaf. The initial contribution of this master's work to the TreeVis project is the generation of classifiers from contour signatures in the frequency domain, which makes it possible to compose several types of signatures and classifiers for the same species. Because classification methods such as minimum distance performed poorly, neural networks were chosen instead. This approach exposed two problems: the large number of possible signature compositions, which would require a great computational effort to obtain all the corresponding neural networks; and the small number of samples available, which would compromise the training and testing of a neural network. Two methods were developed to solve these problems. The first identifies and selects the signatures with the greatest potential to yield a neural network classifier, which avoids wasted computational effort. The second generates artificial leaf samples by combining the contour frequency spectra of real samples with the genetic operators of crossover and mutation. With both problems solved, several neural networks were obtained from the signatures indicated as having the best potential and were trained with artificial samples. Of the 31 classes, 7 were discarded because they had no signature with classification potential, as indicated by the method developed. For the remaining 24 species, classifiers were obtained for 18 species (75%), with average accuracy rates of 85%. Carrying out this work required the development of a framework to automate the generation, training, and testing of the neural networks.
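The sample-generation step described above (combining the contour frequency spectra of real leaves with crossover and mutation) can be sketched as follows. This is only an illustration under stated assumptions: each spectrum is taken to be an equal-length list of complex Fourier coefficients, crossover is single-point, and mutation is a small multiplicative Gaussian perturbation. The function names, rates, and coefficient values are hypothetical; the abstract does not specify these details.

```python
import random

def crossover(spec_a, spec_b, rng):
    """Single-point crossover: low-index coefficients from A, the rest from B."""
    cut = rng.randrange(1, len(spec_a))  # cut point in 1 .. len-1
    return spec_a[:cut] + spec_b[cut:]

def mutate(spec, rng, rate, scale):
    """With probability `rate`, rescale a coefficient by a factor near 1."""
    mutated = []
    for coeff in spec:
        if rng.random() < rate:
            coeff = coeff * (1.0 + rng.gauss(0.0, scale))
        mutated.append(coeff)
    return mutated

def synthetic_spectrum(spec_a, spec_b, rng, rate=0.1, scale=0.05):
    """Build one artificial contour spectrum from two real ones."""
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same length")
    return mutate(crossover(spec_a, spec_b, rng), rng, rate, scale)

# Hypothetical spectra standing in for two real leaves of the same species.
rng = random.Random(0)
parent_a = [complex(i, 0) for i in range(1, 9)]
parent_b = [complex(0, i) for i in range(1, 9)]
child = synthetic_spectrum(parent_a, parent_b, rng, rate=0.0)
```

With `rate=0.0` the child is a pure recombination of the two parents, which makes the crossover step easy to inspect; an inverse Fourier transform of the child spectrum would then give an artificial contour.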
353

Effect modification by socioeconomic conditions on the effects of prescription opioid supply on drug poisoning deaths in the United States

Fink, David S. January 2020
The rise in America’s drug poisoning rates has been described as a public health crisis and has long been attributed to the rapid rise in opioid supply due to increased volumes of medical prescribing in the United States that began in the mid-1990s and peaked in 2012. In 2016, the introduction of the “deaths of despair” hypothesis provided a more nuanced explanation for the rising rates of drug poisoning deaths: increasing income inequality and stagnation of middle-class worker wages, driven by long-term shifts in the labor market, reduced employment opportunities and overall life prospects for persons with a high school degree or less, driving increases in “deaths of despair” (i.e., deaths from suicide, cirrhosis of the liver, and drug poisonings). This focus on economic and social conditions as capable of shaping geospatial differences in drug demand and attendant drug-related harms (e.g., drug poisonings) provides a larger context to factors potentially underlying the heterogeneous distribution of prescription opioid supply across the United States. However, despite the likelihood that economic and social conditions may be important demand-side factors that also interact with supply-side factors to produce the rates of fatal drug poisonings, little information exists about the effect of area-level socioeconomic conditions on fatal drug poisoning rates, and no study has investigated whether socioeconomic conditions interact with prescription opioid supply to affect area-level rates of fatal drug poisonings. The overarching goal of this dissertation was to test the independent and joint effects of supply- and demand-side factors, operationalized as prescription opioid supply and socioeconomic conditions, on fatal drug poisoning in the U.S. First, a systematic review of the literature was conducted to critically evaluate the evidence on the ecological relationship of prescription opioid supply and socioeconomic conditions on rates of drug poisoning deaths. 
The systematic review provides robust evidence of the independent effects of both prescription opioid supply and socioeconomic conditions on rates of drug poisoning deaths. The gap in the literature on the joint effects of prescription opioid supply and socioeconomic conditions was clear, with no study examining the interaction between supply- and demand-side factors on rates of fatal drug poisonings. Moreover, although greater prescription opioid supply was associated with higher rates of fatal drug poisonings in most of the studies, two studies presented contradictory findings, with one study showing no effect of supply on drug poisoning deaths and the other showing that locations with higher levels of prescription opioid supply had fewer drug-related deaths. Three limitations were also identified in the reviewed studies that could partially explain the observed associations. First, although studies aggregated data on drug poisoning deaths to a range of administrative spatial levels, including census tract, 5-digit ZIP code, county, 3-digit ZIP code, and state, no study investigated the sensitivity of findings to the level of geographic aggregation. Second, spatial modeling requires the assessment of spatial autocorrelation in both the unadjusted and adjusted data, but few studies even assessed spatial autocorrelation in the data, and fewer still incorporated spatial dependencies in the model. This is important because when spatial autocorrelation is present, the independence assumption in standard statistical regression models is violated, potentially causing bias and loss of efficiency. Third, studies operationalized prescription opioid supply and socioeconomic conditions using a variety of different measures, and no study assessed the sensitivity of findings to the different measures of supply and socioeconomic conditions. In the second part of the dissertation, the ecological relationship between prescription opioid supply and fatal drug poisonings was examined. 
For this, pooled cross-sectional time series data from 3,109 U.S. counties in 49 states (2006-2016) were used in Bayesian Poisson conditional autoregressive models to estimate the effect of county prescription opioid supply on four types of drug poisoning deaths: any drug (drug-related death), any opioid (opioid-related death), any prescription opioid but not heroin (prescription opioid-related death), and heroin (heroin-related death), adjusting for compositional and contextual differences across counties. Comparisons were made by type of drug poisoning (any drug, any opioid, prescription opioids only, heroin), level of geographic aggregation (county versus state), and measure of prescription opioid supply (rate of opioid-prescribing per 100 persons and morphine milligram equivalents per-capita). Results indicated a positive association between prescription opioid supply and rates of fatal drug poisonings consistent across changes in type of drug poisoning, level of aggregation, and measure of prescription opioid supply. However, removing confounders from the model caused the direction of the effect estimate to reverse for drug poisoning deaths from any drug, any opioid, and heroin. These results suggested that differences in adjustment for confounding could explain most of the inconsistent findings in the literature. Finally, a rigorous test of the hypothesis that worse socioeconomic conditions increase risk of fatal drug poisonings at the county level, and interact with prescription opioid supply was conducted. This analysis used the same pooled cross-sectional time series data from 3,109 U.S. counties in 49 states (2006-2016). 
The analysis modeled the effect of five key socioeconomic variables, including three single socioeconomic variables (unemployment, poverty rate, income inequality) and two index variables (Rey index, American Human Development Index [HDI]) on four types of drug poisoning deaths: any drug (drug-related death), any opioid (opioid-related death), any prescription opioid but not heroin (prescription opioid-related death), and heroin (heroin-related death). Using a hierarchical Bayesian modeling approach to account for spatial dependence and the variability of fatal drug poisoning rates due to the small number of events, the independent effect of socioeconomic conditions on rates of drug poisoning deaths and their joint multiplicative and additive effects with prescription opioid supply were estimated. Results showed that rates of fatal drug poisonings were higher in more economically and socially disadvantaged counties; the five key indicator variables were differentially associated with drug poisoning rates; and the American Human Development Index (HDI) and income inequality were most strongly associated with fatal drug poisoning rates. Finally, the results indicated that both HDI and income inequality interact with county-level prescription opioid supply to affect drug poisoning rates. Specifically, the effect of higher prescription opioid supply on rates of fatal drug poisonings was greater in counties with higher HDI and more equal income distributions than in counties with lower HDI and less equal income distributions. Overall, this dissertation increased knowledge about the separate and conjoint roles of supply- and demand-side factors in the geospatial distribution of fatal drug poisonings in the U.S. The idea that area-level prescription opioid supply is a key driver of prescription drug use, misuse, and addiction and the attendant consequences, including nonfatal and fatal drug poisonings, has been in the literature for well over a decade. 
However, no study to date has shown that area-level socioeconomic conditions modify the effect of prescription opioid supply on fatal drug poisonings. By identifying important contextual factors capable of modifying the effect of prescription opioid supply reductions on mortality, high-risk geographic areas can be prioritized for interventions to counter any unintended effects of reducing the prescription opioid supply in an area. As federal and state policies continue to target the rising rates of fatal drug poisonings, these findings show that area-level socioeconomic conditions may represent an important target for policy intervention during the current drug poisoning crisis and a critical piece of information necessary for predicting any future drug-related crises.
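The abstract above refers to joint multiplicative and additive effects without naming the measures used. As a rough sketch, assuming rate ratios relative to a common reference group and using the relative excess risk due to interaction (RERI) as the additive measure (an assumption; the dissertation may use a different one), the two scales can be compared like this. The rate ratios are hypothetical and are not results from the dissertation.

```python
def interaction_measures(rr_supply_only, rr_ses_only, rr_both):
    """Compare the joint effect with the two single-exposure effects.

    All inputs are rate ratios relative to the doubly unexposed reference group.
    Returns (multiplicative_ratio, reri):
      multiplicative_ratio > 1 means more than multiplicative joint effect;
      reri > 0 means more than additive joint effect.
    """
    multiplicative_ratio = rr_both / (rr_supply_only * rr_ses_only)
    reri = rr_both - rr_supply_only - rr_ses_only + 1.0
    return multiplicative_ratio, reri

# Hypothetical rate ratios chosen only to illustrate the arithmetic.
ratio, reri = interaction_measures(rr_supply_only=2.0, rr_ses_only=3.0, rr_both=6.0)
```

In this made-up case the joint effect is exactly multiplicative (ratio of 1) yet more than additive (RERI of 2), which is why the scale on which interaction is assessed matters.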
354

Identifikace a verifikace osob pomocí záznamu EKG / ECG based human authentication and identification

Waloszek, Vojtěch January 2021
In recent years, the use of ECG for biometric verification and identification has been an active research topic, and this thesis investigates it. Recordings from the ECG-ID database on PhysioNet and our own ECG recordings made with an Apple Watch 4 are used to train and test the method. Many existing methods have shown that ECG can be used for biometry; however, they relied on clinical ECG devices. This thesis investigates recordings from wearable devices, specifically a smart watch. Sixteen features are extracted from the ECG recordings, and a random forest classifier is used for verification and identification. The features include time intervals between fiducial points, voltage differences between fiducial points, and the variability of PR intervals within a recording. The average performance of the verification model across 14 people is a true rejection rate (TRR) of 96.19% and a true acceptance rate (TAR) of 84.25%.
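For readers unfamiliar with the two reported metrics, the sketch below shows how they could be computed from a list of verification attempts, assuming TAR is the fraction of genuine attempts that are accepted and TRR is the fraction of impostor attempts that are rejected. The data are hypothetical and the function name is illustrative; this is not the thesis's evaluation code.

```python
def verification_rates(is_genuine, accepted):
    """Return (TAR, TRR) for paired lists of booleans.

    is_genuine[i] is True when attempt i comes from the claimed person;
    accepted[i] is True when the verifier accepted attempt i.
    """
    genuine = [acc for gen, acc in zip(is_genuine, accepted) if gen]
    impostor = [acc for gen, acc in zip(is_genuine, accepted) if not gen]
    tar = sum(genuine) / len(genuine)                        # genuine and accepted
    trr = sum(not acc for acc in impostor) / len(impostor)   # impostor and rejected
    return tar, trr

# Hypothetical attempts: 4 genuine, 5 impostor.
is_genuine = [True, True, True, True, False, False, False, False, False]
accepted = [True, True, True, False, False, False, False, False, True]
tar, trr = verification_rates(is_genuine, accepted)
```

Here three of four genuine attempts are accepted and four of five impostor attempts are rejected.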
355

Rozpoznávání mluvčího na mobilním telefonu / Speaker Recognition on Mobile Phone

Pešán, Jan January 2011
This thesis focuses on implementing a computer speaker recognition system in a mobile phone environment. It describes the principle, function, and implementation of the recognizer on the Nokia N900 mobile phone.
356

Statistical and Machine Learning Methods for Pattern Identification in Environmental Mixtures

Gibson, Elizabeth Atkeson January 2021
Background: Statistical and machine learning techniques are now being incorporated into high-dimensional mixture research to overcome issues with traditional methods. Though some methods perform well on specific tasks, no method consistently outperforms all others in complex mixture analyses, largely because different methods were developed to answer different research questions. The research presented here concentrates on answering a single mixtures question: Are there exposure patterns within a mixture corresponding with sources or behaviors that give rise to exposure? Objective: This dissertation details work to design, adapt, and apply pattern recognition methods to environmental mixtures and introduces two methods adapted to specific challenges of environmental health data: (1) Principal Component Pursuit (PCP) and (2) Bayesian non-parametric non-negative matrix factorization (BN²MF). We build on this work to characterize the relationship between identified patterns of in utero endocrine disrupting chemical (EDC) exposure and child neurodevelopment. Methods: PCP, a dimensionality reduction technique in computer vision, decomposes the exposure mixture into a low-rank matrix of consistent patterns and a sparse matrix of unique or extreme exposure events. We incorporated two existing PCP extensions that suit environmental data: (1) a non-convex rank penalty and (2) a formulation that removes the need for parameter tuning. We further adapted PCP to accommodate environmental mixtures by including (1) a non-negativity constraint, (2) a modified algorithm to allow for missing values, and (3) a separate penalty for measurements below the limit of detection (PCP-LOD). BN²MF decomposes the exposure mixture into three parts: (1) a matrix of chemical loadings on identified patterns, (2) a matrix of individual scores on identified patterns, and (3) a diagonal matrix of pattern weights. 
It places non-negative continuous priors on pattern loadings, weights, and individual scores and uses a non-parametric sparse prior on the pattern weights to estimate the optimal number of patterns. We extended BN²MF to explicitly account for uncertainty in identified patterns by estimating the full distribution of scores and loadings. To test both methods, we simulated data to represent environmental mixtures with various structures, altering the level of complexity in the patterns, the noise level, the number of patterns, the size of the mixture, and the sample size. We evaluated PCP-LOD's performance against principal component analysis (PCA), and we evaluated BN²MF's performance against PCA, factor analysis, and frequentist nonnegative matrix factorization (NMF). For all methods, we compared their solutions with true simulated values to measure performance. We further assessed BN²MF's coverage of true simulated scores. We applied PCP-LOD to an exposure mixture of 21 persistent organic pollutants (POPs) measured in 1,000 U.S. adults from the 2001-2002 National Health and Nutrition Examination Survey (NHANES). We applied BN²MF to an exposure mixture of 17 EDCs measured in 343 pregnant women in the Columbia Center for Children’s Environmental Health's Mothers and Newborns Cohort. Finally, we designed a two-stage Bayesian hierarchical model to estimate health effects of environmental exposure patterns while incorporating the uncertainty of pattern identification. In the first stage, we identified EDC exposure patterns using BN²MF. In the second stage, we included individual pattern scores and their distributions as exposures of interest in a hierarchical regression model, with child IQ as the outcome, adjusting for potential confounders. We present sex-specific results. Results: PCP-LOD recovered the true number of patterns through cross-validation for all simulations; based on an a priori specified criterion, PCA recovered the true number of patterns in 32% of simulations. 
PCP-LOD achieved lower relative predictive error than PCA for all simulated datasets with up to 50% of the data < LOD. When 75% of values were < LOD, PCP-LOD outperformed PCA only when noise was low. In the POP mixture, PCP-LOD identified a rank-three underlying structure. One pattern represented comprehensive exposure to all POPs. The other two patterns grouped chemicals based on known properties such as structure and toxicity. PCP-LOD also separated 6% of values as extreme events. Most participants had no extreme exposures (44%) or only extremely low exposures (18%). BN²MF estimated the true number of patterns for 99% of simulated datasets. BN²MF's variational confidence intervals achieved 95% coverage across all levels of structural complexity with up to 40% added noise. BN²MF performed comparably with frequentist methods in terms of overall prediction and estimation of underlying loadings and scores. We identified two patterns of EDC exposure in pregnant women, corresponding with diet and personal care product use as potentially separate sources or behaviors leading to exposure. The diet pattern expressed exposure to phthalates and BPA. One standard deviation increase in this pattern was associated with a decrease of 3.5 IQ points (95% credible interval: -6.7, -0.3), on average, in female children but not in males. The personal care product pattern represented exposure to phenols, including parabens, and diethyl phthalate. We found no associations between this pattern and child cognition. Conclusion: PCP-LOD and BN²MF address limitations of existing pattern recognition methods employed in this field such as user-specified pattern number, lack of interpretability of patterns in terms of human understanding, influence of outlying values, and lack of uncertainty quantification. Both methods identified patterns that grouped chemicals based on known sources (e.g., diet), behaviors (e.g., personal care product use), or properties (e.g., structure and toxicity). 
Phthalates and BPA found in food packaging and can linings formed a BN²MF-identified pattern of EDC exposure negatively associated with female child intelligence in the Mothers and Newborns cohort. Results may be used to inform interventions designed to target modifiable behavior or regulations to act on dietary exposure sources.
357

Modelling children under five mortality in South Africa using copula and frailty survival models

Mulaudzi, Tshilidzi Benedicta January 2022
Thesis (Ph.D. (Statistics)) -- University of Limpopo, 2022 / This thesis applies frailty and copula models to an under-five child mortality data set in South Africa. The main purpose of the study was to apply sample splitting techniques in a survival analysis setting and to compare clustered survival models, accounting for left truncation, on the under-five child mortality data set in South Africa. The major contributions of this thesis are the application of the shared frailty model and of a class of Archimedean copulas, in particular the Clayton-Oakes copula with a completely monotone generator, and the introduction of sample splitting techniques in a survival analysis setting. The findings based on the shared frailty model show that the clustering effect was significant for modelling the determinants of time to death of under-five children, and they reveal the importance of accounting for the clustering effect. The conclusion based on the Clayton-Oakes model showed an association between the survival times of children from the same mother. The parameter estimates for the shared frailty and the Clayton-Oakes models were found to be quite different, and the two models are not directly comparable. Gender, province, year, birth order, and whether a child is part of a twin pair were found to be significant factors affecting under-five child mortality in South Africa. / NRF-TDG Flemish Interuniversity Council Institutional corporation (VLIR-IUC) VLIR-IUC Programme of the University of Limpopo
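As background for the copula mentioned above, the bivariate Clayton copula has the closed form C(u, v) = (u^(-θ) + v^(-θ) - 1)^(-1/θ) for θ > 0, and its Kendall's tau is θ / (θ + 2). The sketch below evaluates both. Applying it to marginal survival probabilities of two siblings, as in a Clayton-Oakes model, is how such a copula is typically used; the numeric inputs are hypothetical and the thesis's fitted parameter values are not reproduced here.

```python
def clayton_copula(u, v, theta):
    """Bivariate Clayton copula C(u, v) for theta > 0 and u, v in (0, 1]."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    if not (0 < u <= 1 and 0 < v <= 1):
        raise ValueError("u and v must lie in (0, 1]")
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def kendalls_tau(theta):
    """Kendall's tau implied by the Clayton dependence parameter theta."""
    return theta / (theta + 2.0)

# Hypothetical marginal survival probabilities for two children of one mother.
joint_survival = clayton_copula(0.5, 0.5, theta=2.0)
tau = kendalls_tau(2.0)
```

Under independence the joint survival probability would be 0.25; with θ = 2 it is larger, reflecting positive association within the cluster.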
358

Survival analysis of time-to-first peritonitis among kidney patients who are on peritoneal analysis at Pietersburg Provincial Hospital, Limpopo Province, South Africa

Maja, Tshepo Frans January 2020
Thesis (M.Sc. (Statistics)) -- University of Limpopo, 2020 / Peritoneal Dialysis (PD) is a process that replaces kidney function by cleaning waste from the blood and removing extra fluid from the body. In most cases, the process of PD is slowed down by a peritoneal membrane infection called peritonitis. Despite recent advancements in treatment and prevention, peritonitis remains the leading complication, resulting in high morbidity and technique failure among PD patients. Using a prospective peritonitis dataset of 159 kidney patients who were on PD from 2008 to 2015 at Pietersburg Provincial Hospital, this study aimed to identify potential social, demographic, and biological risk factors that contribute to the first episode of peritonitis. Both semi-parametric (Cox PH) and parametric (Accelerated Failure Time: Weibull, exponential, log-logistic, and gamma) survival models were fitted to the peritonitis dataset. The Akaike Information Criterion (AIC) was applied to select the model that best fits the peritonitis data. Accordingly, the log-logistic Accelerated Failure Time (AFT) model was found to be the working model that best fits the data. A total of 96 (60.38%) peritonitis cases were recorded over the follow-up period, with the majority of peritonitis infections occurring among females (65.4%) and rural dwellers (65.7%), and with 62.6% of black Africans showing a higher risk of developing peritonitis. The multivariate log-logistic AFT model revealed that availability of water (p-value=0.018), electricity (p-value=0.018), dwelling (p-value=0.008), haemoglobin status (p-value=0.002), and duration on PD (p-value=0.001) are significant risk factors for the development of peritonitis. Therefore, patients with no water or electricity, from a rural background, with a low level of haemoglobin and a shorter duration on PD, are associated with a high risk or hazard of developing peritonitis for the first time.
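The model-selection step described above relies on the Akaike Information Criterion, AIC = 2k - 2 ln L, where k is the number of estimated parameters and L the maximized likelihood; the model with the lowest AIC is preferred. The following sketch shows the comparison. The log-likelihoods and parameter counts are placeholders invented for illustration and are not values from the thesis.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

def best_by_aic(fits):
    """fits maps model name -> (maximized log-likelihood, number of parameters)."""
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Placeholder fit summaries (NOT results from the thesis).
hypothetical_fits = {
    "exponential": (-250.0, 1),
    "weibull": (-245.0, 2),
    "log-logistic": (-242.5, 2),
}
best_model, aic_scores = best_by_aic(hypothetical_fits)
```

In a real analysis the parameter count would also include the regression coefficients of each fitted AFT model.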
359

Causal Inference for Health Effects of Time-varying Correlated Environmental Mixtures

Chai, Zilan January 2023
Exposure to environmental chemicals has been shown to affect health status throughout the life course. Quantifying the joint effect of environmental mixtures over time is crucial to determine optimal intervention timing. Establishing causal relationships from environmental mixture data can be challenging due to various factors, including multicollinearity, complex functional forms of exposure-response relationships, and residual unmeasured confounding. These issues can lead to biased estimates of treatment effects and pose significant obstacles to accurately identifying the true relationship between the pollutants and outcome variables. This dissertation explores the use of causal inference in longitudinal environmental mixture studies, with a particular emphasis on addressing three key challenges. First, there is currently no statistical approach that allows simultaneous consideration of time-varying confounding, flexible modeling, and variable selection when examining the effect of multiple, correlated, and time-varying exposures. Second, the violation of a critical assumption that underpins all causal inference methods, namely the absence of unmeasured confounding, poses a significant problem, as models that incorporate multiple environmental exposures may exacerbate the degree of bias depending on the nature of unmeasured confounding. Finally, there is a lack of computational resources that facilitate the application of newly developed causal inference methods for analyzing environmental mixtures. In Chapter 2, we introduce a causal inference method, g-BKMR, which enables estimation of nonlinear, non-additive effects of time-varying exposures and time-varying confounders, while also allowing for variable selection. 
An extensive simulation study shows that g-BKMR outperforms approaches that rely on correct model specification or do not account for time-dependent confounding, especially when correlation across time-varying exposures is high or the exposure-outcome relationship is nonlinear. We apply g-BKMR to quantify the contribution of metal mixtures to blood glucose in the Strong Heart Study, a prospective cohort study of American Indians. In Chapter 3, we address the issue of time-varying unmeasured confounding when estimating time-varying effects of exposure to environmental chemicals. We review the Bayesian g-formula under the assumption of no unmeasured confounding, and then introduce a Bayesian probabilistic sensitivity analysis approach that can account for multiple, potentially time-varying, unmeasured confounders and continuous exposures. Through a simulation study, we demonstrate that the proposed algorithm outperforms the naive method, which fails to consider the influence of confounding. Chapter 4 introduces causalbkmr, a novel R package that is currently available on GitHub. causalbkmr is designed to support the implementation of g-BKMR, BKMR Causal Mediation Analysis, and Multiple Imputation BKMR, thereby offering a user-friendly and effective platform for executing these state-of-the-art methods in practice in the context of complex mixtures analysis. While the package bkmr is available, the novel package causalbkmr expands upon bkmr by enabling its application specifically to environmental mixture data within a causal inference framework. The implementation of these novel methodologies within causalbkmr allows for the extraction of causal interpretations, thus enhancing the analytical capabilities provided by the package. Chapter 5 concludes with a discussion and outlines potential future directions for investigation.
360

Statistical Methods for High-dimensional Neuroimaging Data Analysis

Li, Ruiyang January 2025
Neuroimaging data, often high-dimensional and collected across multiple imaging modalities, is a valuable tool for studying how the structure and function of the human brain affect cognition. This dissertation aims to address the challenges of analyzing high-dimensional neuroimaging data, such as the missing data issue in multimodal fusion, the preservation of underlying hierarchical structure between mediators and exposure-by-mediator interactions in model selection with high-dimensional potential mediators, and the false discovery rate control for mediator selection from a high-dimensional candidate set. The first part of this dissertation aims to address the commonly occurring missing data issue during multimodal fusion. Recent advances in multimodal imaging acquisition techniques have allowed us to measure different aspects of brain structure and function. Multimodal fusion, such as linked independent component analysis (LICA), is a popular approach to integrate complementary information. However, these methods are severely limited by the common occurrence of missing data in brain imaging. In the first chapter, we propose a Full Information LICA algorithm (FI-LICA) to handle the missing data problem during multimodal fusion under the LICA framework. Built upon the principle of full information from complete cases, our method utilizes all available information to recover the missing latent information. Our simulation experiments show the ideal performance of FI-LICA compared to current practices. Further, when applied to multimodal data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, FI-LICA demonstrates better performance in classifying current diagnosis and in predicting the transition of participants with mild cognitive impairment (MCI) to AD, thereby highlighting the practical utility of our proposed method. 
The second part of this dissertation aims to preserve the underlying hierarchical structure between mediators and exposure-by-mediator interactions during model selection in the high-dimensional mediator settings. In mediation analysis, the exposure often influences the mediating effect, i.e., there is an interaction between exposure and mediator on the dependent variable. When the mediator is high-dimensional, it is necessary to identify non-zero mediators (M) and exposure-by-mediator (X-by-M) interactions. Although several high-dimensional mediation methods can naturally handle X-by-M interactions, research is scarce in preserving the underlying hierarchical structure between the main effects and the interactions. To fill the knowledge gap, in the second chapter, we develop the XMInt procedure to select M and X-by-M interactions in the high-dimensional mediators setting while preserving the hierarchical structure. Our proposed method employs a sequential regularization-based forward-selection approach to identify the mediators and their hierarchically preserved interaction with exposure. Our numerical experiments show promising selection results. Furthermore, we apply our method to ADNI morphological data and examine the role of cortical thickness and subcortical volumes on the effect of amyloid-beta accumulation on cognitive performance, which could be helpful in understanding the brain compensation mechanism. The third part of this dissertation aims to control the false discovery rate (FDR) when selecting mediators from a high-dimensional candidate set. Specifically, we formulate a multiple-hypothesis testing framework for mediator selection from a high-dimensional candidate set and propose a method, which extends the recent development in FDR-controlled variable selection with knockoff, to select mediators with FDR control. We show that the proposed method and algorithm achieve finite sample FDR control. 
We present extensive simulation results to demonstrate the power and finite sample performance compared with the existing method. Lastly, we demonstrate the method by analyzing data from the Adolescent Brain Cognitive Development (ABCD) study, in which the proposed method selects several resting-state functional magnetic resonance imaging connectivity markers as mediators for the relationship between adverse childhood events and the crystallized composite score in the NIH toolbox.
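The third part of the abstract above builds on knockoff-based variable selection with FDR control. As a hedged illustration of the generic selection step only (the knockoff+ threshold from the knockoff filter literature, not the dissertation's mediator-specific extension), the sketch below picks a data-dependent threshold from a list of feature statistics W, where large positive values favour the real variable over its knockoff copy. The W values are hypothetical.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t among the nonzero |W_j| such that
    (1 + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q.
    Returns infinity when no candidate threshold qualifies."""
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")

def select_variables(w, q):
    """Indices of variables whose statistic meets the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]

# Hypothetical statistics for ten candidate mediators, target FDR q = 0.2.
w_stats = [5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.4, -0.3, 2.0, 1.5]
threshold = knockoff_plus_threshold(w_stats, q=0.2)
selected = select_variables(w_stats, q=0.2)
```

The count of negative statistics below -t serves as an estimate of the number of false positives above t, which is what lets the threshold target a chosen false discovery rate.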
