51 |
Clinical Assessment of Deep Learning-Based Uncertainty Maps in Lung Cancer Segmentation / Klinisk Bedömning av Djupinlärningsbaserade Osäkerhetskartor vid Segmentering av LungcancerMaruccio, Federica Carmen January 2023 (has links)
Prior to radiation therapy planning, tumours and organs at risk need to be delineated. In recent years, deep learning models have opened the possibility of automating the contouring process, speeding up the procedures and helping clinicians. However, deep learning models, trained using ground truth labels from different clinicians, inevitably incorporate the human-based inter-observer variability as well as other machine-based uncertainties and biases. Consequently, this affects the accuracy of segmentation, representing the primary source of error in contouring tasks. Therefore, clinicians still need to check and manually correct the segmentation and still do not have a measure of reliability. To tackle these issues, researchers have shifted their focus to the topic of probabilistic neural networks and uncertainties in deep learning models. Hence, the main research question of the project is whether a 3D U-Net neural network trained on CT lung cancer images can enhance clinical contouring practice by implementing a probabilistic auto-contouring system. The Monte Carlo dropout technique was employed to generate probabilistic and uncertainty maps. The model calibration was assessed using reliability diagrams, and subsequently, a clinical experiment with a radiation oncologist was conducted. To assess the clinical validity of the uncertainty maps two novel metrics were identified, namely mean uncertainty (MU) and relative uncertainty volume (RUV). The results of this study demonstrated that probability and uncertainty mapping effectively identify cases of under or over-contouring. Although the reliability analysis indicated that the model tends to be overconfident, the outcomes from the clinical experiment showed a strong correlation between the model results and the clinician’s opinion. The two metrics exhibited promising potential as indicators for clinicians to determine whether correction of the predictions is necessary. Hence, probabilistic models revealed to be valuable in clinical practice, supporting clinicians in their contouring and potentially reducing clinical errors.
|
52 |
Uncertainty Estimation in Volumetric Image SegmentationPark, Donggyun January 2023 (has links)
The performance of deep neural networks and estimations of their robustness has been rapidly developed. In contrast, despite the broad usage of deep convolutional neural networks (CNNs)[1] for medical image segmentation, research on their uncertainty estimations is being far less conducted. Deep learning tools in their nature do not capture the model uncertainty and in this sense, the output of deep neural networks needs to be critically analysed with quantitative measurements, especially for applications in the medical domain. In this work, epistemic uncertainty, which is one of the main types of uncertainties (epistemic and aleatoric) is analyzed and measured for volumetric medical image segmentation tasks (and possibly more diverse methods for 2D images) at pixel level and structure level. The deep neural network employed as a baseline is 3D U-Net architecture[2], which shares the essential structural concept with U-Net architecture[3], and various techniques are applied to quantify the uncertainty and obtain statistically meaningful results, including test-time data augmentation and deep ensembles. The distribution of the pixel-wise predictions is estimated by Monte Carlo simulations and the entropy is computed to quantify and visualize how uncertain (or certain) the predictions of each pixel are. During the estimation, given the increased network training time in volumetric image segmentation, training an ensemble of networks is extremely time-consuming and thus the focus is on data augmentation and test-time dropouts. The desired outcome is to reduce the computational costs of measuring the uncertainty of the model predictions while maintaining the same level of estimation performance and to increase the reliability of the uncertainty estimation map compared to the conventional methods. The proposed techniques are evaluated on publicly available volumetric image datasets, Combined Healthy Abdominal Organ Segmentation (CHAOS, a set of 3D in-vivo images) from Grand Challenge (https://chaos.grand-challenge.org/). Experiments with the liver segmentation task in 3D Computed Tomography (CT) show the relationship between the prediction accuracy and the uncertainty map obtained by the proposed techniques. / Prestandan hos djupa neurala nätverk och estimeringar av deras robusthet har utvecklats snabbt. Däremot, trots den breda användningen av djupa konvolutionella neurala nätverk (CNN) för medicinsk bildsegmentering, utförs mindre forskning om deras osäkerhetsuppskattningar. Verktyg för djupinlärning fångar inte modellosäkerheten och därför måste utdata från djupa neurala nätverk analyseras kritiskt med kvantitativa mätningar, särskilt för tillämpningar inom den medicinska domänen. I detta arbete analyseras och mäts epistemisk osäkerhet, som är en av huvudtyperna av osäkerheter (epistemisk och aleatorisk) för volymetriska medicinska bildsegmenteringsuppgifter (och möjligen fler olika metoder för 2D-bilder) på pixelnivå och strukturnivå. Det djupa neurala nätverket som används som referens är en 3D U-Net-arkitektur [2] och olika tekniker används för att kvantifiera osäkerheten och erhålla statistiskt meningsfulla resultat, inklusive testtidsdata-augmentering och djupa ensembler. Fördelningen av de pixelvisa förutsägelserna uppskattas av Monte Carlo-simuleringar och entropin beräknas för att kvantifiera och visualisera hur osäkra (eller säkra) förutsägelserna för varje pixel är. Under uppskattningen, med tanke på den ökade nätverksträningstiden i volymetrisk bildsegmentering, är träning av en ensemble av nätverk extremt tidskrävande och därför ligger fokus på dataaugmentering och test-time dropouts. Det önskade resultatet är att minska beräkningskostnaderna för att mäta osäkerheten i modellförutsägelserna samtidigt som man bibehåller samma nivå av estimeringsprestanda och ökar tillförlitligheten för kartan för osäkerhetsuppskattning jämfört med de konventionella metoderna. De föreslagna teknikerna kommer att utvärderas på allmänt tillgängliga volymetriska bilduppsättningar, Combined Healthy Abdominal Organ Segmentation (CHAOS, en uppsättning 3D in-vivo-bilder) från Grand Challenge (https://chaos.grand-challenge.org/). Experiment med segmenteringsuppgiften för lever i 3D Computed Tomography (CT) vissambandet mellan prediktionsnoggrannheten och osäkerhetskartan som erhålls med de föreslagna teknikerna.
|
53 |
Comparative Analysis of Transformer and CNN Based Models for 2D Brain Tumor SegmentationTräff, Henrik January 2023 (has links)
A brain tumor is an abnormal growth of cells within the brain, which can be categorized into primary and secondary tumor types. The most common type of primary tumors in adults are gliomas, which can be further classified into high-grade gliomas (HGGs) and low-grade gliomas (LGGs). Approximately 50% of patients diagnosed with HGG pass away within 1-2 years. Therefore, the early detection and prompt treatment of brain tumors are essential for effective management and improved patient outcomes. Brain tumor segmentation is a task in medical image analysis that entails distinguishing brain tumors from normal brain tissue in magnetic resonance imaging (MRI) scans. Computer vision algorithms and deep learning models capable of analyzing medical images can be leveraged for brain tumor segmentation. These algorithms and models have the potential to provide automated, reliable, and non-invasive screening for brain tumors, thereby enabling earlier and more effective treatment. For a considerable time, Convolutional Neural Networks (CNNs), including the U-Net, have served as the standard backbone architectures employed to address challenges in computer vision. In recent years, the Transformer architecture, which already has firmly established itself as the new state-of-the-art in the field of natural language processing (NLP), has been adapted to computer vision tasks. The Vision Transformer (ViT) and the Swin Transformer are two architectures derived from the original Transformer architecture that have been successfully employed for image analysis. The emergence of Transformer based architectures in the field of computer vision calls for an investigation whether CNNs can be rivaled as the de facto architecture in this field. This thesis compares the performance of four model architectures, namely the Swin Transformer, the Vision Transformer, the 2D U-Net, and the 2D U-Net which is implemented with the nnU-Net framework. These model architectures are trained using increasing amounts of brain tumor images from the BraTS 2020 dataset and subsequently evaluated on the task of brain tumor segmentation for both HGG and LGG together, as well as HGG and LGG individually. The model architectures are compared on total training time, segmentation time, GPU memory usage, and on the evaluation metrics Dice Coefficient, Jaccard Index, precision, and recall. The 2D U-Net implemented using the nnU-Net framework performs the best in correctly segmenting HGG and LGG, followed by the Swin Transformer, 2D U-Net, and Vision Transformer. The Transformer based architectures improve the least when going from 50% to 100% of training data. Furthermore, when data augmentation is applied during training, the nnU-Net outperforms the other model architectures, followed by the Swin Transformer, 2D U-Net, and Vision Transformer. The nnU-net benefited the least from employing data augmentation during training, while the Transformer based architectures benefited the most. In this thesis we were able to perform a successful comparative analysis effectively showcasing the distinct advantages of the four model architectures under discussion. Future comparisons could incorporate training the model architectures on a larger set of brain tumor images, such as the BraTS 2021 dataset. Additionally, it would be interesting to explore how Vision Transformers and Swin Transformers, pre-trained on either ImageNet- 21K or RadImageNet, compare to the model architectures of this thesis on brain tumor segmentation.
|
54 |
Multi-site Organ Detection in CT Images using Deep Learning / Regionsoberoende organdetektion i CT-bilder meddjupinlärningJacobzon, Gustaf January 2020 (has links)
When optimizing a controlled dose in radiotherapy, high resolution spatial information about healthy organs in close proximity to the malignant cells are necessary in order to mitigate dispersion into these organs-at-risk. This information can be provided by deep volumetric segmentation networks, such as 3D U-Net. However, due to limitations of memory in modern graphical processing units, it is not feasible to train a volumetric segmentation network on full image volumes and subsampling the volume gives a too coarse segmentation. An alternative is to sample a region of interest from the image volume and train an organ-specific network. This approach requires knowledge of which region in the image volume that should be sampled and can be provided by a 3D object detection network. Typically the detection network will also be region specific, although a larger region such as the thorax region, and requires human assistance in choosing the appropriate network for a certain region in the body. Instead, we propose a multi-site object detection network based onYOLOv3 trained on 43 different organs, which may operate on arbitrary chosen axial patches in the body. Our model identifies the organs present (whole or truncated) in the image volume and may automatically sample a region from the input and feed to the appropriate volumetric segmentation network. We train our model on four small (as low as 20 images) site-specific datasets in a weakly-supervised manner in order to handle the partially unlabeled nature of site-specific datasets. Our model is able to generate organ-specific regions of interests that enclose 92% of the organs present in the test set. / Vid optimering av en kontrollerad dos inom strålbehandling krävs det information om friska organ, så kallade riskorgan, i närheten av de maligna cellerna för att minimera strålningen i dessa organ. Denna information kan tillhandahållas av djupa volymetriskta segmenteringsnätverk, till exempel 3D U-Net. Begränsningar i minnesstorleken hos moderna grafikkort gör att det inte är möjligt att träna ett volymetriskt segmenteringsnätverk på hela bildvolymen utan att först nedsampla volymen. Detta leder dock till en lågupplöst segmentering av organen som inte är tillräckligt precis för att kunna användas vid optimeringen. Ett alternativ är att endast behandla en intresseregion som innesluter ett eller ett fåtal organ från bildvolymen och träna ett regionspecifikt nätverk på denna mindre volym. Detta tillvägagångssätt kräver dock information om vilket område i bildvolymen som ska skickas till det regionspecifika segmenteringsnätverket. Denna information kan tillhandahållas av ett 3Dobjektdetekteringsnätverk. I regel är även detta nätverk regionsspecifikt, till exempel thorax-regionen, och kräver mänsklig assistans för att välja rätt nätverk för en viss region i kroppen. Vi föreslår istället ett multiregions-detekteringsnätverk baserat påYOLOv3 som kan detektera 43 olika organ och fungerar på godtyckligt valda axiella fönster i kroppen. Vår modell identifierar närvarande organ (hela eller trunkerade) i bilden och kan automatiskt ge information om vilken region som ska behandlas av varje regionsspecifikt segmenteringsnätverk. Vi tränar vår modell på fyra små (så lågt som 20 bilder) platsspecifika datamängder med svag övervakning för att hantera den delvis icke-annoterade egenskapen hos datamängderna. Vår modell genererar en organ-specifik intresseregion för 92 % av organen som finns i testmängden.
|
55 |
Learning to Measure Invisible FishGustafsson, Stina January 2022 (has links)
In recent years, the EU has observed a decrease in the stocks of certain fish species due to unrestricted fishing. To combat the problem, many fisheries are investigating how to automatically estimate the catch size and composition using sensors onboard the vessels. Yet, measuring the size of fish in marine imagery is a difficult task. The images generally suffer from complex conditions caused by cluttered fish, motion blur and dirty sensors. In this thesis, we propose a novel method for automatic measurement of fish size that can enable measuring both visible and occluded fish. We use a Mask R-CNN to segment the visible regions of the fish, and then fill in the shape of the occluded fish using a U-Net. We train the U-Net to perform shape completion in a semi-supervised manner, by simulating occlusions on an open-source fish dataset. Different to previous shape completion work, we teach the U-Net when to fill in the shape and not by including a small portion of fully visible fish in the input training data. Our results show that our proposed method succeeds to fill in the shape of the synthetically occluded fish as well as of some of the cluttered fish in real marine imagery. We achieve an mIoU score of 93.9 % on 1 000 synthetic test images and present qualitative results on real images captured onboard a fishing vessel. The qualitative results show that the U-Net can fill in the shapes of lightly occluded fish, but struggles when the tail fin is hidden and only parts of the fish body is visible. This task is difficult even for a human, and the performance could perhaps be increased by including the fish appearance in the shape completion task. The simulation-to-reality gap could perhaps also be reduced by finetuning the U-Net on some real occlusions, which could increase the performance on the heavy occlusions in the real marine imagery.
|
56 |
Automatic Detection of Common Signal Quality Issues in MRI Data using Deep Neural NetworksAx, Erika, Djerf, Elin January 2023 (has links)
Magnetic resonance imaging (MRI) is a commonly used non-invasive imaging technique that provides high resolution images of soft tissue. One problem with MRI is that it is sensitive to signal quality issues. The issues can arise for various reasons, for example by metal located either inside or outside of the body. Another common signal quality issue is caused by the patient being partly placed outside field of view of the MRI scanner. This thesis aims to investigate the possibility to automatically detect these signal quality issues using deep neural networks. More specifically, two different 3D CNN network types were studied, a classification-based approach and a reconstruction-based approach. The datasets used consist of MRI volumes from UK Biobank which have been processed and manually annotated by operators at AMRA Medical. For the classification method four different network architectures were explored utilising supervised learning with multi-label classification. The classification method was evaluated using accuracy and label-based evaluation metrics, such as macro-precision, macro-recall and macro-F1. The reconstruction method was based on anomaly detection using an autoencoder which was trained to reconstruct volumes without any artefacts. A mean squared prediction error was calculated for the reconstructed volume and compared against a threshold in order to classify a volume with or without artefacts. The idea was that volumes containing artefacts should be more difficult to reconstruct and thus, result in a higher prediction error. The reconstruction method was evaluated using accuracy, precision, recall and F1-score. The results show that the classification method has overall higher performance than the reconstruction method. The achieved accuracy for the classification method was 98.0% for metal artefacts and 97.5% for outside field of view artefacts. The best architecture for the classification method proved to be DenseNet201. The reconstruction method worked for metal artefacts with an achieved accuracy of 75.7%. Furthermore, it was concluded that reconstruction method did not work for detection of outside field of view artefacts. The results from the classification method indicate that there is a possibility to automatically detect artefacts with deep neural networks. However, it is needed to further improve the method in order to completely replace a manual quality control step before using the volumes for calculation of biomarkers.
|
57 |
Segmentation of People and Vehicles in Dense Voxel Grids from Photon Counting LiDAR using 3D-UnetDanielsson, Fredrik January 2021 (has links)
In recent years, the usage of 3D deep learning techniques has seen a surge,mainly driven by advancements in autonomous driving and medical applications.This thesis investigates the applicability of existing state-of-the-art 3Ddeep learning network architectures to dense voxel grids from single photoncounting 3D LiDAR. This work also examine the choice of loss function asa means of dealing with extreme data imbalance, in order to segment peopleand vehicles in outdoor forest scenes. Due to data similarities with volumetricmedical data, such as computer tomography scans, this thesis investigates ifa model for 3D deep learning used for medical applications, the commonlyused 3D U-Net, can be used for photon counting data. The results showthat segmentation of people and vehicles is possible in this type of data butthat performance depends on the segmentation task, light conditions, and theloss function. For people segmentation the final models are able to predictall targets, but with a significant amount of false positives, something that islikely caused by similar LiDAR responses between people and tree trunks.For vehicle detection, the results are more inconsistent and varies greatlybetween different loss functions as well as the position and orientation of thevehicles. Overall, we consider the 3D U-Net model a successful proof-ofconceptregarding the applicability of 3D deep learning techniques to this kindof data. / Under de senaste åren har användningen för djupinlärningstekniker för 3Dsett en kraftig ökning, främst driven av framsteg inom autonoma fordon ochmedicinska tillämpningar. Denna avhandling undersöker befintliga modernadjupinlärningsnätverk för 3D i täta voxelgriddar från fotonräknande 3D LiDARför att segmentera människor och fordon i skogsscener. Vidare undersöksvalet av målfunktion som ett sätt att hantera extrem dataobalans. På grundav datalikheter med volymetriska medicinska data, såsom datortomografi,kommer denna avhandling att undersöka om en modell för 3D-djupinlärningsom används för medicinska applikationer, nämligen 3D U-Net, kan användasför fotonräknande data. Resultaten visar att segmentering av människor ochfordon är möjligt men att prestanda varier avsevärt med segmenteringsuppgiften,ljusförhållanden, och målfunktioner. För segmentering av människorkan de slutgiltiga modellerna segmentera alla mål men med en betydandemängd falska utslag, något som sannolikt orsakas av liknande LiDAR-svarmellan människor och trädstammar. För segmentering av fordon är resultatenmer oberäkneliga och varierar kraftigt mellan olika målfunktioner såväl somfordonens position och orientering. Sammantaget anser vi att 3D U-Netmodellenvisar på en framgångsrik konceptvalidering när det gäller tillämpningav djupinlärningstekniker för 3D på denna typ av data.
|
58 |
Land Use/Land Cover Classification From Satellite Remote Sensing Images Over Urban Areas in Sweden : An Investigative Multiclass, Multimodal and Spectral Transformation, Deep Learning Semantic Image Segmentation Study / Klassificering av markanvändning/marktäckning från satellit-fjärranalysbilder över urbana områden i Sverige : En undersökande multiklass, multimodal och spektral transformation, djupinlärningsstudie inom semantisk bildsegmenteringAidantausta, Oskar, Asman, Patrick January 2023 (has links)
Remote Sensing (RS) technology provides valuable information about Earth by enabling an overview of the planet from above, making it a much-needed resource for many applications. Given the abundance of RS data and continued urbanisation, there is a need for efficient approaches to leverage RS data and its unique characteristics for the assessment and management of urban areas. Consequently, employing Deep Learning (DL) for RS applications has attracted much attention over the past few years. In this thesis, novel datasets consisting of satellite RS images over urban areas in Sweden were compiled from Sentinel-2 multispectral, Sentinel-1 Synthetic Aperture Radar (SAR) and Urban Atlas 2018 Land Use/Land Cover (LULC) data. Then, DL was applied for multiband and multiclass semantic image segmentation of LULC. The contributions of complementary spectral, temporal and SAR data and spectral indices to LULC classification performance compared to using only Sentinel-2 data with red, green and blue spectral bands were investigated by implementing DL models based on the fully convolutional network-based architecture, U-Net, and performing data fusion. Promising results were achieved with 25 possible LULC classes. Furthermore, almost all DL models at an overall model level and all DL models at an individual class level for most LULC classes benefited from complementary satellite RS data with varying degrees of classification improvement. Additionally, practical knowledge and insights were gained from evaluating the results and are presented regarding satellite RS data characteristics and semantic segmentation of LULC in urban areas. The obtained results are helpful for practitioners and researchers applying or intending to apply DL for semantic segmentation of LULC in general and specifically in Swedish urban environments.
|
59 |
GAN-based Automatic Segmentation of Thoracic Aorta from Non-contrast-Enhanced CT Images / GAN-baserad automatisk segmentering avthoraxorta från icke-kontrastförstärkta CT-bilderXu, Libo January 2021 (has links)
The deep learning-based automatic segmentation methods have developed rapidly in recent years to give a promising performance in the medical image segmentation tasks, which provide clinical medicine with an accurate and fast computer-aided diagnosis method. Generative adversarial networks and their extended frameworks have achieved encouraging results on image-to-image translation problems. In this report, the proposed hybrid network combined cycle-consistent adversarial networks, which transformed contrast-enhanced images from computed tomography angiography to the conventional low-contrast CT scans, with the segmentation network and trained them simultaneously in an end-to-end manner. The trained segmentation network was tested on the non-contrast-enhanced CT images. The synthetic process and the segmentation process were also implemented in a two-stage manner. The two-stage process achieved a higher Dice similarity coefficient than the baseline U-Net did on test data, but the proposed hybrid network did not outperform the baseline due to the field of view difference between the two training data sets.
|
60 |
U - Net Based Crack Detection in Road and Railroad Tunnels Using Data Acquired by Mobile Device / U - Net - baserad sprickdetektering i väg - och järnvägstunnlar med hjälp av data som förvärvats av mobil enhetGao, Kepan January 2022 (has links)
Infrastructures like bridges and tunnels are significant for the economy and growth of countries, however, the risk of failure increases as they getting aged. Therefore, a systematic monitoring scheme is necessary to check the integrity regularly. Among all the defects, cracks are the most common ones that can be observed directly by camera or mapping system. Meanwhile, cracks are capable and reliable indicators. As a result, crack detection is one of the most broadly researched topic. As the limitation of computing resource vanishing, deep learning methods are developing rapidly and used widely. U-net is one of the latest deep learning methods for image classification and has shown overwhelming adaptability and performance in medical images. It is promising to be capable for crack detection. In this thesis project, a U-net approach is used to automatically detect road and tunnel cracks. An open-source crack detection dataset is used for training. The model is improved by new parameter settings and fine-tuning and transformed onto the data acquired by the mobile mapping system of TACK team. Image processing techniques such as class imbalance handling and center line are also used for improvement. At last, qualitative and quantitative statistics are used to illustrate superiority of the methods. This thesis project is a sub-project of project TACK, which is an ongoing research project carried out by KTH - Royal Institute of Technology, Sapienza University of Rome and WSP Sweden company under the InfraSweden2030 program funded by Vinnova. The main objective of TACK is developing a methodology for automatic detection and measurement of cracks on tunnel linings or other infrastructures.
|
Page generated in 0.0213 seconds