151 |
Translating Expressive Prose using CAT Tools : An investigation into discerning the effects of segmentation in student translations
von Rettig, Anna, January 2014
Computer Assisted Translation (CAT) tools are becoming ever more ubiquitous, but translation students do not necessarily receive much training in using them, and may therefore find translating with them very different from translating freehand. An experiment was conducted in which three Master’s students were each asked to translate two texts, one in a CAT tool and the other freehand. The resulting target texts were compared with freehand translations of the same text to determine whether they had been affected by the segmentation performed by the CAT tool and, if so, how. There were indications that in certain cases, such as very long sentences, the CAT tool may act as a visual aid, and also indications that certain students may be more prone than others to follow the segmentation provided by the CAT tool. However, the influence of personal translator style and the translator’s habitus cannot be disregarded, so the apparent differences cannot be attributed entirely to the CAT tool.
|
152 |
Teaching an AI to recycle by looking at scrap metal : Semantic segmentation through self-supervised learning with transformers / Lär en AI att källsortera genom att kolla på metallskrot
Forsberg, Edwin; Harris, Carl, January 2022
Stena Recycling is one of the leading recycling companies in Sweden. At its facility in Halmstad, 300 tonnes of refuse are handled every day, and aluminium is one of the most valuable materials sorted there. Today, most of the sorting process is automatic, but parts of the refuse are still not correctly sorted. Approximately 4% of the aluminium is currently not properly sorted and goes to waste. Earlier works have investigated using machine vision to help in the sorting process at Stena Recycling. However, a consistent problem in all of these works is gathering enough annotated data to train the machine learning models. This thesis investigates how machine vision could be used in the recycling process, and whether pre-training models using self-supervised learning can alleviate the problem of gathering annotated data and yield an improvement. The results show that machine vision models could viably be used in an information system to assist operators. This thesis also shows that pre-training models with self-supervised learning may yield a small increase in performance. Furthermore, we show that models pre-trained using self-supervised learning also appear to transfer the knowledge learned from images created in a lab environment to images taken at the recycling plant.
|
153 |
Deep Learning Semantic Segmentation of 3D Point Cloud Data from a Photon Counting LiDAR / Djupinlärning för semantisk segmentering av 3D punktmoln från en fotonräknande LiDAR
Süsskind, Caspian, January 2022
Deep learning has been shown to be successful at semantic segmentation of three-dimensional (3D) point clouds, which has many interesting use cases in areas such as autonomous driving and defense applications. A common type of sensor used for collecting 3D point cloud data is the Light Detection and Ranging (LiDAR) sensor. In this thesis, a time-correlated single-photon counting (TCSPC) LiDAR is used, which produces very accurate measurements over long distances of up to several kilometers. The dataset collected by the TCSPC LiDAR used in the thesis contains two classes, person and other, and it poses several challenges because it is limited in size and variation and is extremely class imbalanced. The thesis aims to identify, analyze, and evaluate state-of-the-art deep learning models for semantic segmentation of point clouds produced by the TCSPC sensor. This is achieved by investigating different loss functions, data variations, and data augmentation techniques for a selected state-of-the-art deep learning architecture. The results showed that loss functions tailored for extremely imbalanced datasets performed best with regard to the metric mean intersection over union (mIoU). Furthermore, an improvement in mIoU could be observed when some combinations of data augmentation techniques were employed. In general, the performance of the models varied heavily, with some achieving promising results and others achieving much worse results.
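The mIoU metric named in this abstract can be illustrated with a minimal sketch. The function name `mean_iou` and the toy labels are assumptions made for illustration and are not taken from the thesis; the example only shows why mIoU is a stricter metric than plain accuracy on a class-imbalanced two-class problem such as person versus other.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average the per-class intersection over union (IoU).

    Classes that appear in neither the labels nor the predictions are
    skipped so that they do not distort the average.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Hypothetical toy data: class 0 = "other", class 1 = "person".
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
preds = [0] * 10  # a model that never predicts "person"

accuracy = sum(t == p for t, p in zip(labels, preds)) / len(labels)
miou = mean_iou(labels, preds, num_classes=2)
print(accuracy, miou)
```

In this toy case the always-"other" predictor is right on 8 of 10 points, yet its IoU for the person class is zero, which pulls the mean down to half of the IoU for the other class.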
|
154 |
Deep Learning for Semantic Segmentation of 3D Point Clouds from an Airborne LiDAR / Semantisk segmentering av 3D punktmoln från en luftburen LiDAR med djupinlärning
Serra, Sabina, January 2020
Light Detection and Ranging (LiDAR) sensors have many different application areas, from revealing archaeological structures to aiding navigation of vehicles. However, it is challenging to interpret and fully use the vast amount of unstructured data that LiDARs collect. Automatic classification of LiDAR data would make the data easier to use, whether for examining structures or aiding vehicles. In recent years, there have been many advances in deep learning for semantic segmentation of automotive LiDAR data, but there is less research on aerial LiDAR data. This thesis investigates the current state-of-the-art deep learning architectures and how well they perform on LiDAR data acquired by an Unmanned Aerial Vehicle (UAV). It also investigates different training techniques for class-imbalanced and limited datasets, which are common challenges for semantic segmentation networks. Lastly, this thesis investigates whether pre-training can improve the performance of the models. The LiDAR scans were first projected to range images and then a fully convolutional semantic segmentation network was used. Three different training techniques were evaluated: weighted sampling, data augmentation, and grouping of classes. Weighted sampling gave no improvement, and grouping of classes had no substantial effect on performance either. Pre-training on the large public dataset SemanticKITTI resulted in a small performance improvement, but data augmentation seemed to have the largest positive impact. The mIoU of the best model, which was trained with data augmentation, was 63.7%, and it performed very well on the classes Ground, Vegetation, and Vehicle. The other classes in the UAV dataset, Person and Structure, had very little data and were challenging for most models to classify correctly. In general, the models trained on UAV data performed similarly to the state-of-the-art models trained on automotive data.
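The projection of LiDAR scans to range images mentioned in this abstract can be sketched as follows. The abstract does not specify the projection used, so this is an assumption: a spherical projection of the kind commonly applied to rotating LiDAR scans, with a hypothetical function name, image size, and vertical field of view.

```python
import math


def project_to_range_image(points, height=16, width=64,
                           fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points onto a 2D grid of ranges.

    All parameter values are illustrative assumptions. Empty pixels hold
    0.0; when several points fall into one pixel the closest one is kept.
    Points outside the vertical field of view are clamped to the border rows.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        pitch = math.asin(z / r)  # vertical angle
        col = int(0.5 * (1.0 - yaw / math.pi) * width)
        row = int((1.0 - (pitch - fov_down) / fov) * height)
        col = min(width - 1, max(0, col))
        row = min(height - 1, max(0, row))
        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image


# Two hypothetical points on the same ray; the nearer one wins the pixel.
img = project_to_range_image([(10.0, 0.0, 0.0), (5.0, 0.0, 0.0)])
print(img[8][32])
```

Once the scan is a regular 2D grid, an ordinary fully convolutional image segmentation network can be applied to it, which is the motivation for the projection step.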
|
155 |
TransRUnet: 2D Detection and Segmentation of Lymphoma Lesions in Full-Body PET-CT Images / TransRUnet: 2D-detektion och segmentering av lymfomlesioner i helkroppsundersökning med PET-CT
Stahnke, Lasse, January 2023
Identification and localization of FDG-avid lymphoma lesions in PET-CT image volumes is of high importance for the diagnosis and monitoring of treatment progress in lymphoma patients. This process is tedious, time-consuming, and error-prone, due to large image volumes and the heterogeneity of lesions. Thus, a fully automatic method for lymphoma detection is desirable. The AutoPET challenge dataset contains 145 full-body FDG-PET-CT images of lymphoma patients with pixel-level segmentation of lesions. The Retina U-Net utilizes semantic segmentation maps for object detection through simultaneous segmentation and detection. More recently, transformer-based methods have become increasingly popular due to their good performance. Here, TransRUnet is proposed, a 2D deep neural network capable of segmentation and object detection, combining the Retina U-Net with a Feature Pyramid Transformer. Firstly, a Retina U-Net was trained as a baseline on 2D axial slices of 116 patient volumes from the AutoPET dataset, achieving an mAP of 0.377 and a DSC of 0.737 on the 29 test patients. Secondly, the TransRUnet was trained on the same patients, achieving an mAP and DSC of 0.285 and 0.732, respectively. Performance comparison based on mAP and DSC did not show significant differences (p = 0.596 and p = 0.940, for mAP and DSC, respectively) between the Retina U-Net and the TransRUnet. Furthermore, a substantial difference in FROC between the two models could not be observed. The ground truth data should be preprocessed to reduce noise in the training data, or a 3D generalization of the TransRUnet should be used, to improve the detection performance.
/ Swedish abstract (translated): Identifying and localizing lymphoma lesions with high FDG avidity in PET-CT image volumes is of great importance for diagnosis and for monitoring treatment effect in lymphoma patients. This process is laborious, time-consuming, and error-prone because of large image volumes and the heterogeneity of the lesions. A fully automatic method for lymphoma detection is therefore desirable. The AutoPET Challenge dataset contains 145 FDG-PET-CT images of lymphoma patients with pixel-level segmentation of lesions. Retina U-Net uses semantic segmentation maps for object detection through simultaneous segmentation and detection. Recently, transformer-based methods have become increasingly popular because of their good performance. Here TransRUnet is proposed, a deep 2D neural network that can segment and detect objects and that combines Retina U-Net with a Feature Pyramid Transformer. In the first step, a Retina U-Net was trained as a baseline on 2D axial slices of 116 patient volumes from the AutoPET dataset, achieving an mAP of 0.377 and a DSC of 0.737 on the 29 test patients. In the next step, TransRUnet was trained on the same patients, achieving an mAP and DSC of 0.285 and 0.732, respectively. A performance comparison based on mAP and DSC showed no significant differences (p = 0.596 and p = 0.940 for mAP and DSC, respectively) between Retina U-Net and TransRUnet. Moreover, no substantial difference in FROC between the two models could be observed. The ground truth data should be preprocessed to reduce noise in the training data, or a 3D generalization of TransRUnet should be used, to improve detection performance.
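The DSC values reported in this abstract refer to the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), for a predicted mask A and a ground-truth mask B. The sketch below is a minimal illustration on flattened binary masks; the function name, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions, not details from the thesis.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient of two flattened binary masks.

    Returns 1.0 when both masks are empty (a convention assumed here).
    """
    inter = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    return 2.0 * inter / total if total else 1.0


# Hypothetical six-pixel masks: three foreground pixels each, two shared.
pred = [1, 1, 1, 0, 0, 0]
true = [0, 1, 1, 1, 0, 0]
dsc = dice_coefficient(pred, true)
print(round(dsc, 3))
```

Because the denominator counts only foreground pixels, the large background of a full-body scan does not inflate the score, which is one reason the metric is popular for small lesions.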
|
156 |
Screw Hole Detection in Industrial Products using Neural Network based Object Detection and Image Segmentation : A Study Providing Ideas for Future Industrial Applications / Skruvhålsdetektering på Industriella Produkter med hjälp av Neurala Nätverksbaserade Objektdetektering och Bildsegmentering : En Studie som Erbjuder Ideér för Framtida Industriella Applikationer
Melki, Jakob, January 2022
This project is about screw hole detection using neural networks for automated assembly and disassembly. Many industrial companies, such as Ericsson AB, have products such as radio units or filters that contain a large number of screw holes. Assembling and disassembling these products is therefore very time-consuming and demanding for a human. The problem statement in this project is to investigate the performance of neural networks within object detection and semantic segmentation for detecting screw holes in industrial products. Different industrial models were created and synthetic data was generated in Blender. Two types of experiments were done. The first compared an object detection algorithm (Faster R-CNN) with a semantic segmentation algorithm (SegNet) to see which area is most suitable for hole detection. The results showed that semantic segmentation outperforms object detection when it comes to detecting multiple small holes. The second experiment investigated semantic segmentation algorithms further by adding U-Net, PSPNet and LinkNet to the comparison. The networks U-Net and LinkNet were the most successful and achieved a Mean Intersection over Union (MIoU) of around 0.9, which shows that they have potential for further development. Thus, the conclusions drawn in this project are that segmentation algorithms are more suitable for hole detection than object detection algorithms. Furthermore, the results of U-Net and LinkNet show that there is potential in neural networks within semantic segmentation for detecting screw holes. Possible future work includes creating more advanced product models, investigating other segmentation networks, and hyperparameter tuning.
/ Swedish abstract (translated): This project is about screw hole detection using neural networks for automated assembly and disassembly. In many industrial companies, such as Ericsson AB, there are many products, such as radio units or filters, that have many screw holes. Assembling and disassembling these products is therefore very time-consuming and demanding for a human. The problem statement of this project is to investigate the performance of different neural networks within object detection and semantic segmentation for screw hole detection on industrial products. Different industrial models were created and synthetic data was generated in Blender. Two types of experiments were carried out. The first compared an object detection algorithm (Faster R-CNN) with a semantic segmentation algorithm to see which area is most suitable for hole detection. The results showed that semantic segmentation outperforms object detection when it comes to detecting multiple small holes. The second experiment further investigated semantic segmentation algorithms by adding U-Net, PSPNet and LinkNet to the comparison. The networks U-Net and PSPNet were the most successful and achieved a Mean Intersection over Union (MIoU) of around 0.9, which shows that they have potential for further development. The conclusions of this project are that semantic segmentation is more suitable for hole detection than object detection. Furthermore, the results of U-Net and LinkNet showed that there is potential in neural networks within semantic segmentation for detecting screw holes. Possible future work includes creating more advanced product models, investigating other segmentation networks, and hyperparameter tuning.
|
157 |
Deep Convolutional Neural Network for Effective Image Analysis : DESIGN AND IMPLEMENTATION OF A DEEP PIXEL-WISE SEGMENTATION ARCHITECTURE
Marti, Marco Ros, January 2017
This master's thesis presents the process of designing and implementing a CNN-based architecture for image recognition, as part of a larger project in the field of fashion recommendation with deep learning. Concretely, the presented network aims to perform localization and segmentation tasks. To that end, a careful analysis of the best-known state-of-the-art localization and segmentation networks was performed. Afterwards, a multi-task network performing RoI pixel-wise segmentation was created. This proposal addresses the weaknesses detected in pre-existing networks in the field of application, i.e. fashion recommendation. These weaknesses relate mainly to the lack of fine-grained segmentation quality and to problems with computational efficiency. To improve the details of the segmentation, this network works pixel-wise, i.e. it performs a classification task for each pixel of the image. The network is thus better suited to detecting all the details present in the analysed images. However, a pixel-wise task requires working at pixel resolution, which implies that the number of operations to perform is usually large. To reduce the total number of operations in the network and increase computational efficiency, the pixel-wise segmentation is done only in the meaningful regions of the image (Regions of Interest), which are also computed in the network (RoI masks). Then, after a study of the most recent deep learning libraries, the network was successfully implemented. Finally, to verify the correct operation of the design, a set of experiments was satisfactorily conducted. It must be noted that evaluating the results obtained during the testing phase against the best-known architectures is outside the scope of this thesis, as the experimental conditions, especially in terms of dataset, were not suitable for doing so. Nevertheless, the proposed network is fully prepared for this evaluation in the future, when the required experimental conditions are available.
/ Swedish abstract (translated): This thesis presents the process of designing and implementing a CNN-based architecture for image recognition that is part of a larger project in fashion recommendation with deep learning. Concretely, the presented network aims to perform localization and segmentation tasks. A careful analysis of the best-known state-of-the-art localization and segmentation networks was therefore carried out. After that, a multi-task network performing RoI pixel-wise segmentation was created. This proposal addresses the weaknesses detected in existing networks in the field of application, i.e. fashion recommendation. These weaknesses relate essentially to the lack of fine-grained segmentation quality and to problems with computational efficiency. To improve the details of the segmentation, this network proposes working pixel-wise, i.e. performing a classification task for each pixel of the image. The network is thus better suited to detecting all the details present in the analysed images. However, a pixel-wise task requires working at pixel resolution, which means that the number of operations to perform is usually large. To reduce the total number of operations in the network and increase computational efficiency, the pixel-wise segmentation is done only in the meaningful regions of the image (regions of interest), which are also computed in the network (RoI masks). Then, after a study of the most recent deep learning libraries, the network was successfully implemented. Finally, to verify the correct operation of the design, a set of experiments was satisfactorily carried out. In this respect it must be noted that evaluating the results obtained during the testing phase against the best-known architectures is outside the scope of this thesis, since the experimental conditions, especially regarding the dataset, were not suitable for doing so. Nevertheless, the proposed network is fully prepared for this evaluation in the future, when the necessary experimental conditions are available.
/ Catalan abstract (translated): This master's thesis presents the design and implementation of an architecture for image recognition using CNNs. This network is part of a larger project in the field of fashion recommendation. Specifically, the network presented in this document is responsible for the localization and segmentation tasks. After a thorough study of the best-known state-of-the-art networks, a multi-task network was designed that performs pixel-resolution segmentation of the regions of interest of the image, which have previously been computed and masked. This proposal addresses the shortcomings detected in existing networks with regard to the fashion recommendation task. These shortcomings consist of segmentation with insufficient detail and considerable computational complexity. Regarding segmentation quality, this thesis proposes working at pixel resolution, classifying every pixel of the image individually, in order to adapt to all the details that may appear in the analysed image. However, working pixel by pixel involves a large number of operations. To reduce them, we propose performing pixel-by-pixel segmentation only in the regions of interest of the image. Next, after a detailed study of the most prominent deep learning libraries, the design was implemented. Finally, a series of experiments was carried out to test the correct operation of the design. In this regard, it is important to note that this thesis does not aim to evaluate the design against other existing networks. The reason is that the experimental conditions, especially regarding the dataset, are not suitable for this task. Nevertheless, the network is fully prepared for this evaluation once the experimental conditions allow it.
|
158 |
Instance Segmentation of Multiclass Litter and Imbalanced Dataset Handling : A Deep Learning Model Comparison / Instanssegmentering av kategoriserat skräp samt hantering av obalanserat dataset
Sievert, Rolf, January 2021
Instance segmentation has great potential for improving the current state of littering by autonomously detecting and segmenting different categories of litter. With this information, litter could, for example, be geotagged to aid litter pickers or to give precise locational information to unmanned vehicles for autonomous litter collection. Land-based litter instance segmentation is a relatively unexplored field, and this study compares the instance segmentation models Mask R-CNN and DetectoRS using the multiclass litter dataset Trash Annotations in Context (TACO) in conjunction with the Common Objects in Context precision and recall scores. TACO is an imbalanced dataset, so imbalanced data handling is addressed by using a second-order relation iterative stratified split and, additionally, oversampling when training Mask R-CNN. Mask R-CNN without oversampling resulted in a segmentation mAP of 0.127, and with oversampling 0.163. DetectoRS achieved a segmentation mAP of 0.167 and most noticeably improves the segmentation mAP of small objects, by a factor of at least 2, which is important within the litter domain since small objects such as cigarettes are overrepresented. In contrast, oversampling with Mask R-CNN does not seem to improve the general precision of small and medium objects, but only improves the detection of large objects. It is concluded that DetectoRS improves results compared to Mask R-CNN, as does oversampling. However, using a dataset that cannot have an all-class representation for the train, validation, and test splits, together with an iterative stratification that does not guarantee all-class representation, makes it hard for future works to make exact comparisons to this study. Results are therefore approximate when all categories are considered, since 12 categories are missing from the test set, 4 of which were impossible to split into train, validation, and test sets. Further image collection and annotation to mitigate the imbalance would most noticeably improve results, since results depend on class-averaged values. Oversampling with DetectoRS would also help improve results. There is also the option to combine the two datasets TACO and MJU-Waste to enforce training of more categories.
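The abstract does not say how the oversampling was carried out, so the sketch below substitutes one well-known scheme, repeat-factor sampling, purely as an illustration. Each category c appearing in a fraction f(c) of the images gets a factor max(1, sqrt(t / f(c))), and an image is repeated according to its rarest category. The function name, the threshold t = 0.5, and the toy category lists are hypothetical.

```python
import math
from collections import Counter


def repeat_factors(image_categories, threshold=0.5):
    """Per-image repeat factors for oversampling images with rare categories.

    image_categories holds one list of category names per image.
    threshold is an assumed hyperparameter: categories present in at least
    this fraction of the images are not oversampled.
    """
    n = len(image_categories)
    image_freq = Counter()
    for cats in image_categories:
        image_freq.update(set(cats))  # count each category once per image
    cat_rep = {c: max(1.0, math.sqrt(threshold / (k / n)))
               for c, k in image_freq.items()}
    return [max((cat_rep[c] for c in set(cats)), default=1.0)
            for cats in image_categories]


# Hypothetical toy dataset: "bottle" appears in one image out of four.
images = [["cigarette"], ["cigarette"], ["cigarette", "bottle"], ["cigarette"]]
factors = repeat_factors(images)
print(factors)
```

A training loop would then draw each image in proportion to its factor, so the image containing the rare category is seen about 1.4 times as often as the others in this toy case.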
|