Global ETD Search

51	Comparative Analysis of Transformer and CNN Based Models for 2D Brain Tumor Segmentation Träff, Henrik January 2023 A brain tumor is an abnormal growth of cells within the brain, which can be categorized into primary and secondary tumor types. Gliomas are the most common type of primary tumor in adults and can be further classified into high-grade gliomas (HGGs) and low-grade gliomas (LGGs). Approximately 50% of patients diagnosed with HGG pass away within 1-2 years. Therefore, the early detection and prompt treatment of brain tumors are essential for effective management and improved patient outcomes. Brain tumor segmentation is a task in medical image analysis that entails distinguishing brain tumors from normal brain tissue in magnetic resonance imaging (MRI) scans. Computer vision algorithms and deep learning models capable of analyzing medical images can be leveraged for brain tumor segmentation. These algorithms and models have the potential to provide automated, reliable, and non-invasive screening for brain tumors, thereby enabling earlier and more effective treatment. For a considerable time, Convolutional Neural Networks (CNNs), including the U-Net, have served as the standard backbone architectures employed to address challenges in computer vision. In recent years, the Transformer architecture, which has already firmly established itself as the new state of the art in the field of natural language processing (NLP), has been adapted to computer vision tasks. The Vision Transformer (ViT) and the Swin Transformer are two architectures derived from the original Transformer architecture that have been successfully employed for image analysis. The emergence of Transformer-based architectures in the field of computer vision calls for an investigation into whether they can rival CNNs as the de facto architecture in this field. 
This thesis compares the performance of four model architectures, namely the Swin Transformer, the Vision Transformer, the 2D U-Net, and a 2D U-Net implemented with the nnU-Net framework. These model architectures are trained using increasing amounts of brain tumor images from the BraTS 2020 dataset and subsequently evaluated on the task of brain tumor segmentation for both HGG and LGG together, as well as HGG and LGG individually. The model architectures are compared on total training time, segmentation time, GPU memory usage, and on the evaluation metrics Dice Coefficient, Jaccard Index, precision, and recall. The 2D U-Net implemented using the nnU-Net framework performs the best in correctly segmenting HGG and LGG, followed by the Swin Transformer, 2D U-Net, and Vision Transformer. The Transformer-based architectures improve the least when going from 50% to 100% of training data. Furthermore, when data augmentation is applied during training, the nnU-Net outperforms the other model architectures, followed by the Swin Transformer, 2D U-Net, and Vision Transformer. The nnU-Net benefited the least from employing data augmentation during training, while the Transformer-based architectures benefited the most. In this thesis we were able to perform a successful comparative analysis effectively showcasing the distinct advantages of the four model architectures under discussion. Future comparisons could incorporate training the model architectures on a larger set of brain tumor images, such as the BraTS 2021 dataset. Additionally, it would be interesting to explore how Vision Transformers and Swin Transformers, pre-trained on either ImageNet-21K or RadImageNet, compare to the model architectures of this thesis on brain tumor segmentation. Machine Learning ML AI Computer vision Vision Transformer Swin Transformer U-Net nnU-Net Brain Tumor Segmentation Deep Learning Computer and Information Sciences Data- och informationsvetenskap
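The four evaluation metrics named in the abstract above can all be derived from the same pixel counts. The following is a minimal sketch, assuming flattened binary masks held in plain Python lists; the function name `overlap_metrics` and the convention of returning 1.0 for an empty denominator are illustrative assumptions, not taken from the thesis.

```python
def overlap_metrics(pred, truth):
    """Dice, Jaccard, precision and recall for two flat binary masks.

    pred and truth are equal-length sequences of 0/1 values
    (hypothetical input layout chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    def ratio(num, den):
        # Assumed convention: an empty denominator counts as a perfect score.
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


if __name__ == "__main__":
    print(overlap_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

Dice and Jaccard are monotonically related (Jaccard = Dice / (2 - Dice)), so they rank models identically on a single image; reporting both mainly helps comparison with other work.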
52	Multi-site Organ Detection in CT Images using Deep Learning / Regionsoberoende organdetektion i CT-bilder med djupinlärning Jacobzon, Gustaf January 2020 When optimizing a controlled dose in radiotherapy, high-resolution spatial information about healthy organs in close proximity to the malignant cells is necessary in order to mitigate dispersion into these organs-at-risk. This information can be provided by deep volumetric segmentation networks, such as 3D U-Net. However, due to memory limitations in modern graphics processing units, it is not feasible to train a volumetric segmentation network on full image volumes, and subsampling the volume gives too coarse a segmentation. An alternative is to sample a region of interest from the image volume and train an organ-specific network. This approach requires knowledge of which region in the image volume should be sampled, which can be provided by a 3D object detection network. Typically, the detection network is also region-specific, although for a larger region such as the thorax, and requires human assistance in choosing the appropriate network for a certain region of the body. Instead, we propose a multi-site object detection network based on YOLOv3 trained on 43 different organs, which may operate on arbitrarily chosen axial patches in the body. Our model identifies the organs present (whole or truncated) in the image volume and may automatically sample a region from the input and feed it to the appropriate volumetric segmentation network. We train our model on four small (as few as 20 images) site-specific datasets in a weakly-supervised manner in order to handle the partially unlabeled nature of site-specific datasets. Our model is able to generate organ-specific regions of interest that enclose 92% of the organs present in the test set. 
/ When optimizing a controlled dose in radiotherapy, information about healthy organs near the malignant cells, known as organs-at-risk, is required in order to minimize the radiation in these organs. This information can be provided by deep volumetric segmentation networks, for example 3D U-Net. Memory limitations in modern graphics cards make it infeasible to train a volumetric segmentation network on the full image volume without first downsampling the volume. However, this leads to a low-resolution segmentation of the organs that is not precise enough to be used in the optimization. An alternative is to process only a region of interest enclosing one or a few organs from the image volume and to train a region-specific network on this smaller volume. This approach, however, requires information about which region of the image volume should be sent to the region-specific segmentation network. This information can be provided by a 3D object detection network. As a rule, this network is also region-specific, for example for the thorax region, and requires human assistance to select the right network for a given region of the body. Instead, we propose a multi-region detection network based on YOLOv3 that can detect 43 different organs and operates on arbitrarily chosen axial windows in the body. Our model identifies the organs present (whole or truncated) in the image and can automatically provide information about which region should be processed by each region-specific segmentation network. We train our model on four small (as few as 20 images) site-specific datasets with weak supervision in order to handle the partially unannotated nature of the datasets. Our model generates an organ-specific region of interest for 92% of the organs present in the test set. Organ Detection Organs-at-risk 3D Object Detection Segmentation Deep Learning Machine Learning Weakly-supervised Learning YOLOv3 3D U-Net Elektroteknik och elektronik
53	Learning to Measure Invisible Fish Gustafsson, Stina January 2022 In recent years, the EU has observed a decrease in the stocks of certain fish species due to unrestricted fishing. To combat the problem, many fisheries are investigating how to automatically estimate the catch size and composition using sensors onboard the vessels. Yet, measuring the size of fish in marine imagery is a difficult task. The images generally suffer from complex conditions caused by cluttered fish, motion blur and dirty sensors. In this thesis, we propose a novel method for automatic measurement of fish size that can enable measuring both visible and occluded fish. We use a Mask R-CNN to segment the visible regions of the fish, and then fill in the shape of the occluded fish using a U-Net. We train the U-Net to perform shape completion in a semi-supervised manner, by simulating occlusions on an open-source fish dataset. Unlike previous shape completion work, we teach the U-Net when to fill in the shape and when not to, by including a small portion of fully visible fish in the input training data. Our results show that our proposed method succeeds in filling in the shape of the synthetically occluded fish as well as of some of the cluttered fish in real marine imagery. We achieve an mIoU score of 93.9% on 1,000 synthetic test images and present qualitative results on real images captured onboard a fishing vessel. The qualitative results show that the U-Net can fill in the shapes of lightly occluded fish, but struggles when the tail fin is hidden and only parts of the fish body are visible. This task is difficult even for a human, and the performance could perhaps be increased by including the fish appearance in the shape completion task. The simulation-to-reality gap could perhaps also be reduced by fine-tuning the U-Net on some real occlusions, which could increase the performance on the heavy occlusions in the real marine imagery. 
Instance Segmentation Shape Completion Automatic Size Measurement Fisheries Electronic Monitoring Mask R-CNN U-Net
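The occlusion simulation described in the abstract above can be pictured as blanking part of a complete fish mask and keeping the untouched mask as the training target. This is a minimal sketch under that assumption; the rectangular occluder and the function name `occlude` are illustrative choices and not the author's stated procedure.

```python
def occlude(mask, top, left, height, width):
    """Return a copy of a 2D binary mask with a rectangle set to 0.

    mask is a list of rows (lists of 0/1). The original mask is left
    unchanged so it can serve as the shape-completion target.
    """
    occluded = [row[:] for row in mask]  # copy each row before editing
    n_rows = len(mask)
    n_cols = len(mask[0]) if mask else 0
    for r in range(max(top, 0), min(top + height, n_rows)):
        for c in range(max(left, 0), min(left + width, n_cols)):
            occluded[r][c] = 0
    return occluded


if __name__ == "__main__":
    full = [[1, 1, 1],
            [1, 1, 1]]
    partial = occlude(full, top=0, left=1, height=2, width=1)
    # (partial, full) forms one hypothetical input/target training pair.
    print(partial, full)
```

Passing a zero-sized rectangle leaves the mask intact, which is one way to produce the fully visible examples the abstract mentions for teaching the network when not to fill in.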
54	Automatic Detection of Common Signal Quality Issues in MRI Data using Deep Neural Networks Ax, Erika, Djerf, Elin January 2023 Magnetic resonance imaging (MRI) is a commonly used non-invasive imaging technique that provides high-resolution images of soft tissue. One problem with MRI is that it is sensitive to signal quality issues. These issues can arise for various reasons, for example from metal located either inside or outside the body. Another common signal quality issue is caused by the patient being partly placed outside the field of view of the MRI scanner. This thesis investigates the possibility of automatically detecting these signal quality issues using deep neural networks. More specifically, two different 3D CNN approaches were studied: a classification-based approach and a reconstruction-based approach. The datasets used consist of MRI volumes from UK Biobank which have been processed and manually annotated by operators at AMRA Medical. For the classification method, four different network architectures were explored utilising supervised learning with multi-label classification. The classification method was evaluated using accuracy and label-based evaluation metrics, such as macro-precision, macro-recall and macro-F1. The reconstruction method was based on anomaly detection using an autoencoder which was trained to reconstruct volumes without any artefacts. A mean squared prediction error was calculated for the reconstructed volume and compared against a threshold in order to classify a volume as with or without artefacts. The idea was that volumes containing artefacts should be more difficult to reconstruct and thus result in a higher prediction error. The reconstruction method was evaluated using accuracy, precision, recall and F1-score. The results show that the classification method has overall higher performance than the reconstruction method. 
The achieved accuracy for the classification method was 98.0% for metal artefacts and 97.5% for outside field of view artefacts. The best architecture for the classification method proved to be DenseNet201. The reconstruction method worked for metal artefacts with an achieved accuracy of 75.7%. Furthermore, it was concluded that the reconstruction method did not work for detection of outside field of view artefacts. The results from the classification method indicate that it is possible to automatically detect artefacts with deep neural networks. However, the method needs further improvement before it can completely replace a manual quality control step before the volumes are used for calculation of biomarkers. mr magnetic resonance machine learning deep learning anomaly detection U-Net autoencoder 3D classification reconstruction artefacts Medical Engineering Medicinteknik Medical Image Processing Medicinsk bildbehandling
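The reconstruction-based decision rule described in this record (mean squared prediction error compared against a threshold) can be sketched as follows. The autoencoder itself is not modelled here; `reconstructed` stands in for its output, and the function names and the example threshold are hypothetical.

```python
def mean_squared_error(original, reconstructed):
    """Mean squared difference between two equal-length flat volumes."""
    if len(original) != len(reconstructed):
        raise ValueError("volumes must have the same number of voxels")
    if not original:
        raise ValueError("volumes must not be empty")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def has_artefact(original, reconstructed, threshold):
    """Flag a volume when its reconstruction error exceeds the threshold.

    The premise, as in the abstract, is that an autoencoder trained only
    on artefact-free volumes reconstructs artefacts poorly.
    """
    return mean_squared_error(original, reconstructed) > threshold


if __name__ == "__main__":
    volume = [0.0, 0.0, 1.0, 1.0]          # flattened toy volume
    reconstruction = [0.0, 0.0, 1.0, 0.0]  # assumed autoencoder output
    print(has_artefact(volume, reconstruction, threshold=0.1))
```

In practice the threshold would be chosen on held-out data; how the thesis selected it is not stated in the abstract.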
55	Segmentation of People and Vehicles in Dense Voxel Grids from Photon Counting LiDAR using 3D-Unet Danielsson, Fredrik January 2021 In recent years, the usage of 3D deep learning techniques has seen a surge, mainly driven by advancements in autonomous driving and medical applications. This thesis investigates the applicability of existing state-of-the-art 3D deep learning network architectures to dense voxel grids from single photon counting 3D LiDAR. This work also examines the choice of loss function as a means of dealing with extreme data imbalance, in order to segment people and vehicles in outdoor forest scenes. Due to data similarities with volumetric medical data, such as computed tomography scans, this thesis investigates whether a model for 3D deep learning used for medical applications, the commonly used 3D U-Net, can be used for photon counting data. The results show that segmentation of people and vehicles is possible in this type of data but that performance depends on the segmentation task, light conditions, and the loss function. For people segmentation, the final models are able to predict all targets, but with a significant number of false positives, something that is likely caused by similar LiDAR responses between people and tree trunks. For vehicle detection, the results are more inconsistent and vary greatly between different loss functions as well as with the position and orientation of the vehicles. Overall, we consider the 3D U-Net model a successful proof-of-concept regarding the applicability of 3D deep learning techniques to this kind of data. / In recent years, the use of deep learning techniques for 3D has increased sharply, driven mainly by advances in autonomous vehicles and medical applications. This thesis examines existing modern deep learning networks for 3D on dense voxel grids from photon counting 3D LiDAR in order to segment people and vehicles in forest scenes. 
Furthermore, the choice of objective function is examined as a way of handling extreme data imbalance. Owing to data similarities with volumetric medical data, such as computed tomography, this thesis examines whether a model for 3D deep learning used in medical applications, namely 3D U-Net, can be used for photon counting data. The results show that segmentation of people and vehicles is possible but that performance varies considerably with the segmentation task, light conditions, and objective functions. For segmentation of people, the final models can segment all targets but with a considerable number of false detections, something that is likely caused by similar LiDAR responses between people and tree trunks. For segmentation of vehicles, the results are more erratic and vary greatly between different objective functions as well as with the position and orientation of the vehicles. Overall, we consider that the 3D U-Net model demonstrates a successful proof of concept regarding the application of deep learning techniques for 3D to this type of data. Deep Learning LiDAR Segmentation Dense Voxel Grids Single Photon Counting Machine Learning 3D U-Net Convolutional Neural Network Elektroteknik och elektronik
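The abstract above treats the loss function as a lever against extreme class imbalance but does not name the losses that were compared. As an illustration only, the sketch below shows two losses commonly used for imbalanced segmentation, a positively weighted binary cross-entropy and a soft Dice loss; both choices, and the names `weighted_bce` and `soft_dice_loss`, are assumptions rather than the thesis's method.

```python
import math


def weighted_bce(probs, targets, pos_weight):
    """Binary cross-entropy where foreground voxels count pos_weight times.

    probs are predicted foreground probabilities, targets are 0/1 labels.
    """
    eps = 1e-7  # keeps log() finite for probabilities of exactly 0 or 1
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def soft_dice_loss(probs, targets, smooth=1.0):
    """One minus the soft Dice overlap; insensitive to the background count."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets) + smooth
    return 1.0 - (2.0 * intersection + smooth) / denominator


if __name__ == "__main__":
    # One foreground voxel among many background voxels, all predicted empty.
    targets = [1] + [0] * 99
    probs = [0.01] * 100
    print(weighted_bce(probs, targets, pos_weight=1.0))
    print(weighted_bce(probs, targets, pos_weight=50.0))
    print(soft_dice_loss(probs, targets))
```

With unweighted cross-entropy, predicting "all background" is cheap when foreground is rare; raising `pos_weight` or switching to an overlap-based loss makes the missed foreground voxel dominate the penalty.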
56	Land Use/Land Cover Classification From Satellite Remote Sensing Images Over Urban Areas in Sweden : An Investigative Multiclass, Multimodal and Spectral Transformation, Deep Learning Semantic Image Segmentation Study / Klassificering av markanvändning/marktäckning från satellit-fjärranalysbilder över urbana områden i Sverige : En undersökande multiklass, multimodal och spektral transformation, djupinlärningsstudie inom semantisk bildsegmentering Aidantausta, Oskar, Asman, Patrick January 2023 Remote Sensing (RS) technology provides valuable information about Earth by enabling an overview of the planet from above, making it a much-needed resource for many applications. Given the abundance of RS data and continued urbanisation, there is a need for efficient approaches to leverage RS data and its unique characteristics for the assessment and management of urban areas. Consequently, employing Deep Learning (DL) for RS applications has attracted much attention over the past few years. In this thesis, novel datasets consisting of satellite RS images over urban areas in Sweden were compiled from Sentinel-2 multispectral, Sentinel-1 Synthetic Aperture Radar (SAR) and Urban Atlas 2018 Land Use/Land Cover (LULC) data. Then, DL was applied for multiband and multiclass semantic image segmentation of LULC. The contributions of complementary spectral, temporal and SAR data and spectral indices to LULC classification performance compared to using only Sentinel-2 data with red, green and blue spectral bands were investigated by implementing DL models based on the fully convolutional network-based architecture, U-Net, and performing data fusion. Promising results were achieved with 25 possible LULC classes. Furthermore, almost all DL models at an overall model level and all DL models at an individual class level for most LULC classes benefited from complementary satellite RS data with varying degrees of classification improvement. 
Additionally, practical knowledge and insights were gained from evaluating the results and are presented regarding satellite RS data characteristics and semantic segmentation of LULC in urban areas. The obtained results are helpful for practitioners and researchers applying or intending to apply DL for semantic segmentation of LULC in general and specifically in Swedish urban environments. data fusion deep learning land use/land cover classification multiclass multimodal remote sensing semantic segmentation Sentinel satellite spectral index U-Net Urban Atlas Remote Sensing Fjärranalysteknik
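The abstract mentions spectral indices and data fusion without listing which indices were used. As a hedged illustration, the sketch below computes one widely used index, NDVI, from near-infrared and red reflectance (bands B8 and B4 on Sentinel-2) and stacks it with other bands as extra input channels. Using NDVI specifically, fusing by channel stacking, and the helper names are assumptions made for this example.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index per pixel: (NIR - Red) / (NIR + Red).

    nir and red are equal-length flat lists of reflectance values.
    Pixels where both bands are zero are mapped to 0.0 (assumed convention).
    """
    values = []
    for n, r in zip(nir, red):
        denominator = n + r
        values.append((n - r) / denominator if denominator else 0.0)
    return values


def stack_channels(*bands):
    """Fuse per-pixel bands into one feature vector per pixel."""
    return [list(pixel) for pixel in zip(*bands)]


if __name__ == "__main__":
    red = [0.10, 0.20]
    green = [0.12, 0.18]
    blue = [0.08, 0.15]
    nir = [0.50, 0.20]
    # RGB plus one derived index gives a 4-channel input per pixel.
    print(stack_channels(red, green, blue, ndvi(nir, red)))
```

A segmentation network consuming such input only needs its first layer widened to the new channel count, which is why band stacking is a common baseline for multimodal fusion.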
57	GAN-based Automatic Segmentation of Thoracic Aorta from Non-contrast-Enhanced CT Images / GAN-baserad automatisk segmentering av thoraxorta från icke-kontrastförstärkta CT-bilder Xu, Libo January 2021 Deep learning-based automatic segmentation methods have developed rapidly in recent years and show promising performance in medical image segmentation tasks, providing clinical medicine with an accurate and fast computer-aided diagnosis method. Generative adversarial networks and their extended frameworks have achieved encouraging results on image-to-image translation problems. In this report, the proposed hybrid network combined cycle-consistent adversarial networks, which transformed contrast-enhanced images from computed tomography angiography into conventional low-contrast CT scans, with a segmentation network, and trained them simultaneously in an end-to-end manner. The trained segmentation network was tested on non-contrast-enhanced CT images. The synthesis process and the segmentation process were also implemented in a two-stage manner. The two-stage process achieved a higher Dice similarity coefficient than the baseline U-Net on test data, but the proposed hybrid network did not outperform the baseline due to the field of view difference between the two training data sets. Non-contrast-enhanced Medical Image Segmentation Image-to-image Translation U-Net Generative Adversarial Network End-to-end Two-stage Medical Engineering Medicinteknik
58	U - Net Based Crack Detection in Road and Railroad Tunnels Using Data Acquired by Mobile Device / U - Net - baserad sprickdetektering i väg - och järnvägstunnlar med hjälp av data som förvärvats av mobil enhet Gao, Kepan January 2022 Infrastructure such as bridges and tunnels is significant for the economy and growth of countries; however, the risk of failure increases as it ages. Therefore, a systematic monitoring scheme is necessary to check its integrity regularly. Among all defects, cracks are the most common ones that can be observed directly by a camera or mapping system. Cracks are also effective and reliable indicators. As a result, crack detection is one of the most broadly researched topics. As limitations on computing resources diminish, deep learning methods are developing rapidly and are widely used. U-Net is a deep learning method for image segmentation that has shown great adaptability and performance on medical images. It is therefore a promising candidate for crack detection. In this thesis project, a U-Net approach is used to automatically detect road and tunnel cracks. An open-source crack detection dataset is used for training. The model is improved by new parameter settings and fine-tuning, and transferred to the data acquired by the mobile mapping system of the TACK team. Techniques such as class imbalance handling and center lines are also used for improvement. Finally, qualitative and quantitative statistics are used to illustrate the superiority of the methods. This thesis project is a sub-project of project TACK, an ongoing research project carried out by KTH - Royal Institute of Technology, Sapienza University of Rome and the company WSP Sweden under the InfraSweden2030 program funded by Vinnova. The main objective of TACK is to develop a methodology for automatic detection and measurement of cracks on tunnel linings or other infrastructure. 
U-Net deep learning tunnel monitoring structural health monitoring crack detection mobile mapping system Signal Processing Signalbehandling Infrastructure Engineering Infrastrukturteknik Communication Systems Kommunikationssystem
59	The Effect of Beautification Filters on Image Recognition : "Are filtered social media images viable Open Source Intelligence?" / Effekten av försköningsfilter vid bildigenkänning : "Är filtrerade bilder från sociala media lämpliga som fritt tillgänglig underrättelseinformation?" Skepetzis, Vasilios, Hedman, Pontus January 2021 In light of the emergence of social media, and its abundance of facial imagery, facial recognition is useful from an Open Source Intelligence standpoint. Images uploaded on social media are likely to be filtered, which can destroy or modify biometric features. This study looks at the effort of recognizing individuals based on their facial image after filters have been applied to the image. The social media image filters studied occlude parts of the nose and eyes, with a particular interest in filters occluding the eye region. Our proposed method uses a Residual Neural Network model to extract features from images, with individuals recognized by distance measures computed on the extracted features. Classification of individuals is also done using a Linear Support Vector Machine and an XGBoost classifier. In an attempt to increase the recognition performance for images completely occluded in the eye region, we present a method to reconstruct this information using a variation of a U-Net, and from the classification perspective, we also train the classifier on filtered images to increase recognition performance. Our experimental results showed good recognition of individuals when filters were not occluding important landmarks, especially around the eye region. 
Our proposed solution shows an ability to mitigate the occlusion caused by filters through either reconstruction or training on manipulated images, in some cases with an increase in the classifier's accuracy of approximately 17 percentage points with reconstruction only, 16 percentage points when the classifier was trained on filtered data, and 24 percentage points when both were used at the same time. When training on filtered images, we observe an average increase in performance, across all datasets, of 9.7 percentage points. face recognition OSINT machine learning deep learning convolutional neural networks social media filters u-net residual neural network Ansiktsigenkänning OSINT maskininlärning djupinlärning faltningsnätverk sociala media filter u-net residual neuronnät Computer and Information Sciences Data- och informationsvetenskap Signal Processing Signalbehandling Computer Systems Datorsystem
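Recognition by distance measures over extracted features, as described in this record, amounts to a nearest-neighbour lookup in embedding space. The sketch below assumes feature vectors are already available (for example from a residual network) and uses cosine distance; the abstract does not say which distance was used, so the metric, the names `cosine_distance` and `identify`, and the toy gallery are all illustrative.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def identify(probe, gallery):
    """Return the gallery identity whose embedding is closest to the probe.

    gallery maps an identity label to one reference embedding.
    """
    return min(gallery, key=lambda name: cosine_distance(probe, gallery[name]))


if __name__ == "__main__":
    # Hypothetical 2-D embeddings; real face embeddings have hundreds of dimensions.
    gallery = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
    filtered_probe = [0.9, 0.1]  # embedding of a filtered image (assumed)
    print(identify(filtered_probe, gallery))
```

A filter that occludes the eye region shifts the probe embedding; recognition fails once that shift moves it closer to another identity's reference, which is the effect the reconstruction step aims to reduce.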
60	TransRUnet: 2D Detection and Segmentation of Lymphoma Lesions in Full-Body PET-CT Images / TransRUnet: 2D-detektion och segmentering av lymfomlesioner i helkroppsundersökning med PET-CT Stahnke, Lasse January 2023 Identification and localization of FDG-avid lymphoma lesions in PET-CT image volumes is of high importance for the diagnosis and monitoring of treatment progress in lymphoma patients. This process is tedious, time-consuming, and error-prone, due to large image volumes and the heterogeneity of lesions. Thus, a fully automatic method for lymphoma detection is desirable. The AutoPET challenge dataset contains 145 full-body FDG-PET-CT images of lymphoma patients with pixel-level segmentation of lesions. The Retina U-Net utilizes semantic segmentation maps for object detection through simultaneous segmentation and detection. More recently, transformer-based methods have become increasingly popular due to their good performance. Here, TransRUnet is proposed, a 2D deep neural network capable of segmentation and object detection, combining the Retina U-Net with a Feature Pyramid Transformer. Firstly, a Retina U-Net was trained as a baseline on 2D axial slices of 116 patient volumes from the AutoPET dataset, achieving an mAP of 0.377 and a DSC of 0.737 on the 29 test patients. Secondly, the TransRUnet was trained on the same patients, achieving an mAP and DSC of 0.285 and 0.732, respectively. Performance comparison based on mAP and DSC did not show significant differences (p = 0.596 and p = 0.940, for mAP and DSC, respectively) between the Retina U-Net and the TransRUnet. Furthermore, a substantial difference in FROC between the two models could not be observed. The ground truth data should be preprocessed to reduce noise in the training data, or a 3D generalization of the TransRUnet should be used to improve the detection performance. 
/ Identifying and localizing lymphoma lesions with high FDG avidity in PET-CT image volumes is of great importance for diagnosis and for monitoring treatment effect in lymphoma patients. This process is cumbersome, time-consuming, and error-prone because of large image volumes and the heterogeneity of the lesions. A fully automatic method for lymphoma detection is therefore desirable. The AutoPET Challenge dataset contains 145 FDG-PET-CT images of lymphoma patients with pixel-level segmentation of lesions. Retina U-Net uses semantic segmentation maps for object detection through simultaneous segmentation and detection. Recently, transformer-based methods have become increasingly popular owing to their good performance. Here, TransRUnet is proposed, a deep 2D neural network that can segment and detect objects and that combines Retina U-Net with a Feature Pyramid Transformer. In the first step, a Retina U-Net was trained as a baseline on 2D axial slices of 116 patient volumes from the AutoPET dataset, achieving an mAP of 0.377 and a DSC of 0.737 on the 29 test patients. In the next step, TransRUnet was trained on the same patients, achieving an mAP and DSC of 0.285 and 0.732, respectively. A performance comparison based on mAP and DSC showed no significant differences (p = 0.596 and p = 0.940 for mAP and DSC, respectively) between Retina U-Net and TransRUnet. In addition, no substantial difference in FROC between the two models could be observed. The ground truth data should be preprocessed to reduce noise in the training data, or a 3D generalization of TransRUnet should be used to improve detection performance. Lymphoma PET-CT Deep Learning CNN Retina U-Net Feature Pyramid Transformer Detection Segmentation Lymfom PET-CT djupinlärning CNN Retina U-Net Feature Pyramid Transformer detektion segmentering Medical Engineering Medicinteknik Medical Image Processing Medicinsk bildbehandling
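Detection scores such as the mAP reported for this record depend on deciding when a predicted box counts as a hit on a ground-truth lesion, which is usually done with box intersection over union (IoU). The sketch below shows that overlap test for 2D axis-aligned boxes; the matching threshold of 0.5 is an assumed example value and the thesis's actual criterion is not stated in the abstract.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def box_iou(a, b):
    """Intersection over union of two axis-aligned 2D boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = box_area(a) + box_area(b) - intersection
    return intersection / union if union else 0.0


def is_hit(predicted, ground_truth, threshold=0.5):
    """Treat a prediction as a true positive when IoU reaches the threshold."""
    return box_iou(predicted, ground_truth) >= threshold


if __name__ == "__main__":
    lesion = (0, 0, 2, 2)
    print(box_iou((1, 1, 3, 3), lesion))
    print(is_hit((0, 0, 2, 1.5), lesion))
```

Small lesions make this test strict: a shift of a pixel or two can drop the IoU below the threshold, which is one reason detection and segmentation scores can diverge for the same model.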
