
Towards label-efficient deep learning for medical image analysis

Deep learning methods have achieved state-of-the-art performance across many medical image analysis tasks. However, this success relies heavily on large quantities of labeled data, which are expensive and time-consuming to collect and not always available. This dissertation investigates the use of self-supervised and generative methods to improve the label efficiency of deep learning models for 3D medical image analysis. Unlike natural images, medical images contain consistent, domain-specific anatomical context that can be exploited as a self-supervision signal to pre-train models. Furthermore, generative methods can be used to synthesize additional samples, thereby increasing sample diversity.

In the first part of the dissertation, we introduce self-supervised learning frameworks that learn anatomy-aware and disease-related representations. To learn disease-related representations, we propose two domain-specific contrastive strategies that leverage anatomical similarity across patients to construct hard negative samples, which incentivize the model to learn fine-grained pathological features. To learn anatomy-aware representations, we develop a novel 3D convolutional layer whose kernels are conditionally parameterized on anatomical location. Extensive experiments on large-scale CT datasets show that our methods improve performance on a wide range of downstream tasks.
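
The sketch below is an illustrative toy example, not the dissertation's implementation: the layer name, routing network, kernel bank, and all shapes are assumptions made for exposition. It shows one common way to condition convolution kernels on a location code, by keeping a bank of expert kernels and mixing them per sample with weights produced from the patch's anatomical coordinates.

```python
# Illustrative sketch (assumptions throughout): a 3D convolution whose kernel
# is a location-dependent mixture of expert kernels. Names and shapes are
# placeholders, not the dissertation's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnatomyConditionedConv3d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, num_experts=4, loc_dim=3):
        super().__init__()
        # Bank of expert kernels; the effective kernel is their weighted sum.
        self.weight = nn.Parameter(
            0.01 * torch.randn(num_experts, out_ch, in_ch,
                               kernel_size, kernel_size, kernel_size)
        )
        # Small routing network mapping an anatomical-location code to mixture weights.
        self.router = nn.Sequential(
            nn.Linear(loc_dim, 32), nn.ReLU(), nn.Linear(32, num_experts)
        )
        self.padding = kernel_size // 2

    def forward(self, x, loc):
        # x:   (B, in_ch, D, H, W) image patch
        # loc: (B, loc_dim) normalized anatomical coordinates of the patch
        alpha = torch.softmax(self.router(loc), dim=-1)            # (B, E)
        # Per-sample kernel: weighted sum over the expert bank.
        w = torch.einsum('be,eoixyz->boixyz', alpha, self.weight)  # (B, O, I, k, k, k)
        B, O, I, k, _, _ = w.shape
        # Grouped-convolution trick to apply a different kernel to each sample.
        x = x.reshape(1, B * I, *x.shape[2:])
        w = w.reshape(B * O, I, k, k, k)
        out = F.conv3d(x, w, padding=self.padding, groups=B)
        return out.reshape(B, O, *out.shape[2:])

# Example: two 3D patches with their (z, y, x) location codes.
layer = AnatomyConditionedConv3d(in_ch=1, out_ch=8)
patches = torch.randn(2, 1, 16, 16, 16)
locations = torch.rand(2, 3)
features = layer(patches, locations)   # (2, 8, 16, 16, 16)
```

The design choice being illustrated is that spatially distant anatomy can receive different filters while parameters remain shared through a small expert bank, rather than learning a separate convolution per region.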

In the second part of the dissertation, we introduce generative models capable of synthesizing high-resolution, anatomy-guided 3D medical images. Current generative models are typically limited to low-resolution outputs because of memory constraints, even though clinicians rely on high-resolution detail for diagnosis. To overcome this, we present a hierarchical architecture that manages memory demands efficiently, enabling the generation of high-resolution images. In addition, although diffusion-based generative models are becoming more prevalent in medical imaging, existing state-of-the-art methods often under-utilize the rich information in radiology reports and anatomical structures. To address these limitations, we propose a text-guided 3D image diffusion model that preserves anatomical detail. Experiments on downstream tasks and blind evaluations by radiologists demonstrate the clinical value of the proposed methods.
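
As a rough illustration of text-guided diffusion training on 3D volumes (not the model proposed in the dissertation), the sketch below shows a single DDPM-style training step in which a toy denoiser is conditioned on a radiology-report embedding through FiLM-style modulation. Every module, shape, and hyperparameter here is a placeholder assumption.

```python
# Illustrative sketch (assumptions throughout): one training step of a
# text-conditioned denoising diffusion model over 3D volumes.
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

class TinyDenoiser(nn.Module):
    """Stand-in noise predictor conditioned on timestep and a report embedding."""
    def __init__(self, ch=1, text_dim=64):
        super().__init__()
        self.film = nn.Linear(text_dim + 1, 2 * ch)   # scale/shift from (text, t)
        self.net = nn.Sequential(
            nn.Conv3d(ch, 16, 3, padding=1), nn.SiLU(),
            nn.Conv3d(16, ch, 3, padding=1),
        )

    def forward(self, x_t, t, text_emb):
        cond = torch.cat([text_emb, t.float().unsqueeze(-1) / T], dim=-1)
        scale, shift = self.film(cond).chunk(2, dim=-1)
        x = x_t * (1 + scale[:, :, None, None, None]) + shift[:, :, None, None, None]
        return self.net(x)

denoiser = TinyDenoiser()
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

# One step: noise a clean volume at a random timestep, predict the noise.
x0 = torch.randn(2, 1, 32, 32, 32)    # clean 3D volumes (placeholder data)
text_emb = torch.randn(2, 64)         # embeddings of paired radiology reports
t = torch.randint(0, T, (2,))
a_bar = alphas_cumprod[t].view(-1, 1, 1, 1, 1)
noise = torch.randn_like(x0)
x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise
loss = ((denoiser(x_t, t, text_emb) - noise) ** 2).mean()
loss.backward(); opt.step(); opt.zero_grad()
```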

Identifier: oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/49251
Date: 11 September 2024
Creators: Sun, Li
Contributors: Batmanghelich, Kayhan
Source Sets: Boston University
Language: en_US
Detected Language: English
Type: Thesis/Dissertation
Rights: Attribution-NonCommercial 4.0 International, http://creativecommons.org/licenses/by-nc/4.0/