  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
371

Depth-Aware Deep Learning Networks for Object Detection and Image Segmentation

Dickens, James 01 September 2021 (has links)
The rise of convolutional neural networks (CNNs) in computer vision has occurred in tandem with advances in depth sensing technology. Depth cameras yield two-dimensional arrays that store, at each pixel, the distance from the sensor to objects and surfaces in the scene; aligned with a regular color image, these form so-called RGBD images. Inspired by prior models in the literature, this work develops a suite of RGBD CNN models to tackle the challenging tasks of object detection, instance segmentation, and semantic segmentation. Prominent architectures for object detection and image segmentation are modified to incorporate dual-backbone approaches that input RGB and depth images, combining features from both modalities through novel fusion modules. For each task, the models developed are competitive with state-of-the-art RGBD architectures. In particular, the proposed RGBD object detection approach achieves 53.5% mAP on the SUN RGBD 19-class object detection benchmark, while the proposed RGBD semantic segmentation architecture yields 69.4% accuracy on the SUN RGBD 37-class semantic segmentation benchmark. An original 13-class RGBD instance segmentation benchmark is introduced for the SUN RGBD dataset, on which the proposed model achieves 38.4% mAP. Additionally, an original depth-aware panoptic segmentation model is developed, trained, and tested on new benchmarks conceived for the NYUDv2 and SUN RGBD datasets. These benchmarks offer researchers a baseline for the task of RGBD panoptic segmentation on these datasets, where the novel depth-aware model outperforms a comparable RGB counterpart.
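The dual-backbone fusion described above can be illustrated with a minimal sketch. The concatenate-then-project scheme below is an assumption for illustration only (the thesis's fusion modules are more elaborate); `fuse_features` and all shapes are hypothetical:

```python
import numpy as np

def fuse_features(rgb_feat, depth_feat):
    """Concatenate RGB and depth feature maps along the channel axis,
    then mix them with a 1x1 projection back to the RGB channel count
    (a common late-fusion baseline)."""
    fused = np.concatenate([rgb_feat, depth_feat], axis=0)  # (C_rgb + C_d, H, W)
    c_out, c_in = rgb_feat.shape[0], fused.shape[0]
    rng = np.random.default_rng(0)
    w = rng.standard_normal((c_out, c_in)) / np.sqrt(c_in)  # 1x1 conv weights
    # A 1x1 convolution is just a matrix product over the channel axis.
    return np.einsum('oc,chw->ohw', w, fused)

rgb_feat = np.ones((64, 8, 8))    # toy backbone outputs
depth_feat = np.ones((64, 8, 8))
out = fuse_features(rgb_feat, depth_feat)
print(out.shape)  # (64, 8, 8)
```

The fused map keeps the RGB backbone's channel count, so it can be dropped into the rest of the detection head unchanged.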
372

Automated Gravel Road Condition Assessment : A Case Study of Assessing Loose Gravel using Audio Data

Saeed, Nausheen January 2021 (has links)
Gravel roads connect sparse populations and provide highways for agriculture and the transport of forest goods. Gravel roads are an economical choice where traffic volume is low. In Sweden, 21% of all public roads are state-owned gravel roads, covering over 20,200 km. In addition, there are some 74,000 km of gravel roads and 210,000 km of forest roads that are owned by the private sector. The Swedish Transport Administration (Trafikverket) rates the condition of gravel roads according to the severity of irregularities (e.g. corrugations and potholes), dust, loose gravel, and gravel cross-sections. This assessment is carried out during the summertime, when roads are free of snow. One of the essential parameters for gravel road assessment is loose gravel, which can cause a tire to slip, leading to a loss of driver control. Assessment of gravel roads is carried out subjectively by taking images of road sections and adding some textual notes. A cost-effective, intelligent, and objective method for road assessment is lacking. Expensive methods, such as laser profiler trucks, can offer road profiling with high accuracy, but they are not applied to gravel roads because of the need to maintain cost-efficiency. In this thesis, we explored the idea that, in addition to machine vision, we could also use machine hearing to classify the condition of gravel roads with respect to loose gravel. Several suitable classical supervised learning methods and convolutional neural networks (CNNs) were tested. When people drive on gravel roads, they can sense the road condition by listening to the gravel hitting the bottom of the car: the more gravel we hear, the more loose gravel there is, and therefore the worse the road condition is likely to be. Based on this idea, we hypothesized that machines could also undertake such a classification when trained with labeled sound data.
Machines can identify gravel and non-gravel sounds. In this thesis, we used traditional machine learning algorithms, such as support vector machines (SVM), decision trees, and ensemble classification methods. We also explored CNNs for classifying spectrograms of audio recorded on gravel roads. Both classical supervised learning and CNNs were used, and their results were compared. Among the classical algorithms, ensemble bagged tree (EBT) classifiers performed best for distinguishing gravel and non-gravel sounds; EBT is also useful in reducing the misclassification of non-gravel sounds. The CNN achieved a 97.91% accuracy rate. Using a CNN makes the classification process more intuitive because the network architecture takes responsibility for selecting the relevant training features. Furthermore, the classification results can be visualized on road maps, which can help road monitoring agencies assess road conditions and schedule maintenance activities for a particular road.
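The spectrogram representation that feeds such a CNN can be sketched in a few lines. This is an illustrative short-time-FFT implementation with assumed frame and hop sizes, not the preprocessing used in the thesis:

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a windowed short-time FFT, returned
    in decibels -- the 2-D 'image' a CNN would consume."""
    window = np.hanning(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    mag = np.array(frames).T                      # (freq_bins, time_frames)
    return 20 * np.log10(mag + 1e-10)

# Toy one-second clip at 16 kHz: engine hum plus broadband "gravel" noise.
rng = np.random.default_rng(1)
t = np.arange(16000) / 16000.0
clip = np.sin(2 * np.pi * 120 * t) + 0.5 * rng.standard_normal(16000)
spec = spectrogram(clip)
print(spec.shape)  # (129, 124)
```

Gravel impacts show up as short broadband vertical streaks in such a spectrogram, which is exactly the kind of local pattern a CNN picks up.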
373

Thermal Imaging-Based Instance Segmentation for Automated Health Monitoring of Steel Ladle Refractory Lining / Infraröd-baserad Instanssegmentering för Automatiserad Övervakning av Eldfast Murbruk i Stålskänk

Bråkenhielm, Emil, Drinas, Kastrati January 2022 (has links)
Equipment and machines can be exposed to very high temperatures in the steel mill industry. One particularly critical piece of equipment is the ladle used to hold and pour molten iron into mouldings. A refractory lining is used as an insulation layer between the outer steel shell and the molten iron to protect the ladle from the hot iron. Over time, or if the lining is not completely cured, the lining wears out or can potentially fail. Such a scenario can lead to a breakout of molten iron, which can cause damage to equipment and, in the worst case, workers. Previous work analyzed how critical areas can be identified proactively. Using thermal imaging, failing spots on the lining show as high-temperature areas on the outside steel shell; the idea is that the outside temperature corresponds to the thickness of the insulating lining. These spots are detected when temperatures over a given threshold are registered within the thermal camera's field of view. The images must then be manually analyzed over time to follow the progression of a detected spot. The existing solution is also prone to background noise from other hot objects. This thesis proposes an initial step toward automated monitoring of the health of refractory lining in steel ladles. The report investigates the use of instance segmentation to isolate the ladle from its background, thus reducing false alarms and background noise in an autonomous monitoring setup. Model training is based on Mask R-CNN applied to our own thermal images, with weights pre-trained on visual images. Detection is done on two classes: open or closed ladle. The model proved reasonably successful on a small dataset of 1000 thermal images. Different models were trained with and without augmentation, pre-trained weights, and multi-phase fine-tuning. The highest mAP of 87.5% was achieved by a pre-trained model with image augmentation and without fine-tuning.
Though it was not tested in production, temperature readings could then be extracted from the segmented ladle, decreasing the risk of false alarms from background noise.
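The closing idea — reading temperatures only inside the segmented ladle — reduces to a masking operation. The sketch below uses a toy mask and made-up temperatures; `masked_max_temp` is a hypothetical helper, not the thesis code:

```python
import numpy as np

def masked_max_temp(thermal_img, ladle_mask):
    """Return the hottest reading inside the segmented ladle only,
    ignoring hot background objects outside the mask."""
    return thermal_img[ladle_mask].max()

thermal = np.full((4, 4), 150.0)       # background shell at 150 C
thermal[0, 0] = 900.0                  # hot object *outside* the ladle
thermal[2, 2] = 420.0                  # hot spot on the ladle shell
mask = np.zeros((4, 4), dtype=bool)
mask[1:4, 1:4] = True                  # segmented ladle region
print(masked_max_temp(thermal, mask))  # 420.0 -- the 900 C background is ignored
```

This is exactly how segmentation suppresses the false alarms the existing threshold-only detector suffers from: the 900 C distractor never enters the reading.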
374

Transformer Based Object Detection and Semantic Segmentation for Autonomous Driving

Hardebro, Mikaela, Jirskog, Elin January 2022 (has links)
The development of autonomous driving systems has been one of the most popular research areas of the 21st century. One key component of such systems is the ability to perceive and comprehend the physical world. Two techniques that address this are object detection and semantic segmentation. During the last decade, CNN-based models have dominated these tasks. However, in 2021, transformer-based networks were able to outperform the existing CNN approaches, indicating a paradigm shift in the domain. This thesis explores the use of a vision transformer, particularly a Swin Transformer, in an object detection and semantic segmentation framework, and compares it to a classical CNN on road scenes. In addition, since real-time execution is crucial for autonomous driving systems, the possibility of reducing the parameter count of the transformer-based network is investigated. The results appear advantageous for the Swin Transformer compared to the convolution-based network, for both object detection and semantic segmentation. Furthermore, the analysis indicates that it is possible to reduce the computational complexity while retaining performance.
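The Swin Transformer's defining operation is partitioning a feature map into fixed-size windows within which self-attention is computed locally. A minimal partition sketch (shapes follow the general Swin design; this is not the thesis implementation):

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping (ws x ws)
    windows: returns (num_windows, ws*ws, C), the shape self-attention
    is applied over inside a Swin block."""
    h, w, c = x.shape
    x = x.reshape(h // ws, ws, w // ws, ws, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, c)

feat = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
wins = window_partition(feat, ws=4)
print(wins.shape)  # (4, 16, 3): four 4x4 windows of 3-channel tokens
```

Because attention cost is quadratic in token count, attending within 16-token windows instead of over all 64 positions is what keeps Swin's complexity linear in image size — the property that makes it a candidate for real-time driving workloads.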
375

Embedded Implementation of Lane Keeping Functionality Using CNN

Bark, Filip January 2018 (has links)
The interest in autonomous vehicles has recently increased, and as a consequence many companies and researchers have begun working on their own solutions to the many issues that ensue when a car has to handle complicated decisions on its own. This project looks into the possibility of relegating as many decisions as possible to a single sensor and engine control unit (ECU) — in this work, by letting a Raspberry Pi with an attached camera control a vehicle following a road. To solve this problem, image processing — more specifically, a convolutional neural network (CNN) — is used to steer the car by monitoring the path with a single camera. The proposed CNN is designed and implemented using Keras, a machine learning library for Python. The design of the network is based on the well-known LeNet, but has been downscaled to increase computation speed and reduce memory size while still maintaining sufficient accuracy. The network was run on the ECU, which in turn was fastened to an RC car together with the camera. For control purposes, wires were soldered to the remote controller and connected to the Raspberry Pi. For steering, a simple bang-bang controller was implemented. Glass-box testing was used to assess the effectiveness of the code and to guarantee a continuous evaluation of the results. To verify the network's accuracy and computation speed, larger experiments were performed. The final experiments showed that the network achieved sufficient accuracy and performance to steer the prototype car in real-time tasks, such as following model roads and stopping at the end of the path, as planned. This shows that despite being small, with moderate accuracy, this CNN can handle the task of lane keeping using only the data of one single camera.
Since the CNN could do this while running on a computer as small as the Raspberry Pi, using a CNN for a lane-keeping algorithm in an embedded system looks promising.
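The bang-bang steering described above is the simplest possible control law: full left, full right, or straight, with no proportional term. A sketch with an assumed lane-offset input and an arbitrary deadband (the thesis's actual thresholds are not stated here):

```python
def bang_bang_steer(lane_offset, deadband=0.05):
    """Hard-switching steering from the CNN's estimated lateral offset
    (positive = car is right of lane center). A small deadband avoids
    chattering between full-left and full-right near the center."""
    if lane_offset > deadband:
        return -1.0   # steer full left: car has drifted right
    if lane_offset < -deadband:
        return 1.0    # steer full right: car has drifted left
    return 0.0        # inside deadband: hold straight

print(bang_bang_steer(0.3))   # -1.0
print(bang_bang_steer(-0.2))  # 1.0
print(bang_bang_steer(0.01))  # 0.0
```

The appeal on a Raspberry Pi is that the controller costs essentially nothing, leaving the compute budget to the CNN inference itself.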
376

An Embedded System for Classification and Dirt Detection on Surgical Instruments

Hallgrímsson, Guðmundur January 2019 (has links)
The need for automation in healthcare has been rising steadily in recent years, both to increase efficiency and to free trained workers from repetitive, menial, or even dangerous tasks. This thesis investigates the implementation of two pre-determined and pre-trained convolutional neural networks on an FPGA for the classification and dirt detection of surgical instruments in a robotics application. A background on the inner workings and history of artificial neural networks is given and expanded on in the context of convolutional neural networks. The Winograd algorithm for computing convolutions is presented as a method for increasing the computational performance of convolutional neural networks. A development platform and toolchain are then selected. A high-level design of the overall system is explained before details of the high-level synthesis implementation of the dirt-detection convolutional neural network are shown. Measurements are then made of the performance of the high-level synthesis implementation of the various blocks needed for convolutional neural networks. The main convolutional kernel is implemented both with the Winograd algorithm and with the naive convolution algorithm, and the two are compared. Finally, measurements of the overall performance of the end-to-end system are made and conclusions are drawn. The final product of the project gives a good basis for further work on a complete system that handles this functionality in a manner that is both power efficient and low in latency. Such a system would tie together the strengths of general-purpose sequential processing and the parallelism of an FPGA in a single system.
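The Winograd algorithm benchmarked above can be shown in its smallest one-dimensional form, F(2,3), which produces two outputs of a 3-tap convolution with 4 multiplications instead of the naive 6. A sketch using the standard F(2,3) transform matrices:

```python
import numpy as np

def winograd_f23(d, g):
    """Winograd F(2,3): two outputs of a 3-tap FIR over a 4-sample
    tile using 4 elementwise multiplications (direct form needs 6)."""
    # Filter transform (done once per filter in a real CNN).
    G = np.array([[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]])
    # Data-tile transform.
    B_T = np.array([[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]])
    # Inverse transform back to the two outputs.
    A_T = np.array([[1, 1, 1, 0], [0, 1, -1, -1]])
    m = (G @ g) * (B_T @ d)          # the 4 multiplies
    return A_T @ m

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([1.0, 0.0, -1.0])       # 3-tap filter
naive = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                  d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
print(winograd_f23(d, g), naive)     # both [-2. -2.]
```

On an FPGA the payoff is that multipliers (DSP slices) are the scarce resource; the extra additions in the transforms are comparatively cheap, which is why the thesis compares this against the naive kernel.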
377

Spatio-temporal prediction of residential burglaries using convolutional LSTM neural networks

Holm, Noah, Plynning, Emil January 2018 (has links)
The low number of solved residential burglaries calls for new and innovative methods in the prevention and investigation of these cases. There were 22,600 reported residential burglaries in Sweden in 2017, but only four to five percent of these will ever be solved. There are many initiatives in both Sweden and abroad for decreasing the number of residential burglaries, and one approach being tested is the use of prediction methods to enable more efficient preventive actions. This thesis investigates a potential prediction method that uses neural networks to identify, on a daily basis, areas with a higher risk of burglary. The model uses reported burglaries to learn patterns in both space and time. The rationale for the existence of such patterns is based on near-repeat theories in criminology, which state that after a burglary, both the burgled victim and the area around that victim have an increased risk of additional burglaries. The work has been conducted in cooperation with the Swedish Police Authority. The machine learning is implemented with convolutional long short-term memory (LSTM) neural networks with max pooling in three dimensions that learn from ten years of residential burglary data (2007-2016) in a study area in Stockholm, Sweden. The model's accuracy is measured by performing daily predictions of burglaries during 2017. It classifies cells in a 36x36 grid of 600-meter square cells as areas with elevated risk or not. By classifying 4% of all grid cells during the year as risk areas, 43% of all burglaries are correctly predicted. The performance of the model could potentially be improved by further tuning of the neural network's parameters, along with the use of more data on factors correlated with burglaries, for instance weather. Consequently, further work in these areas could increase the accuracy.
The conclusion is that neural networks, or machine learning in general, could be a powerful and innovative tool for the Swedish Police Authority to predict, and moreover prevent, certain crimes. This thesis serves as a first prototype of how such a system could be implemented and used.
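The headline result — flagging 4% of cells and capturing 43% of burglaries — corresponds to a simple hit-rate metric over a daily risk grid. A sketch on toy data (the risk surface and burglary locations below are made up for illustration):

```python
import numpy as np

def hit_rate(risk, burglaries, frac=0.04):
    """Fraction of burglaries falling inside the top `frac` of grid
    cells when ranked by predicted risk."""
    k = max(1, int(frac * risk.size))
    flagged = np.argsort(risk.ravel())[-k:]       # indices of top-k cells
    return burglaries.ravel()[flagged].sum() / burglaries.sum()

# Toy 36x36 day: the model puts high risk on ten cells, and five of
# the ten burglaries actually occur there.
risk = np.linspace(0.0, 0.1, 36 * 36).reshape(36, 36)  # distinct baseline risks
risk[0, :10] = 1.0
burglaries = np.zeros((36, 36), dtype=int)
burglaries[0, :5] = 1      # burglaries inside the high-risk cells
burglaries[10, :5] = 1     # burglaries elsewhere
print(hit_rate(risk, burglaries))  # 0.5: 4% of cells capture half the crimes
```

Evaluating the model this way directly mirrors the operational question: if patrols can only cover a small fraction of the grid each day, what share of crimes falls inside it?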
378

Accelerating CNN on FPGA : An Implementation of MobileNet on FPGA

Shen, Yulan January 2019 (has links)
A convolutional neural network (CNN) is a deep learning algorithm that has had a revolutionary impact on computer vision, one of its applications being image classification. However, the algorithm involves a huge number of operations and parameters, which limits its use in time- and resource-constrained embedded applications. MobileNet, a neural network that uses depthwise separable convolutional layers instead of standard convolutional layers, greatly reduces computational cost compared to traditional CNN models. By implementing MobileNet on an FPGA, image classification can be greatly accelerated. In this thesis, we have designed an accelerator block for MobileNet and implemented a simplified MobileNet on a Xilinx UltraScale+ ZU104 FPGA board with 64 accelerators. We use the implemented MobileNet to solve a gesture classification problem. The implemented design runs at 100 MHz. It shows a 28.4x speedup over a CPU (Intel(R) Pentium(R) CPU G4560 @ 3.50GHz) and a 6.5x speedup over a GPU (NVIDIA GeForce 940MX 1.004GHz). It is also a power-efficient design, consuming 4.07 W. The accuracy reaches 43% in gesture classification.
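The saving from separable convolutions can be made concrete by counting multiplications for a standard layer versus a depthwise-plus-pointwise pair. The layer dimensions below are illustrative, not taken from the implemented network:

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a standard k x k convolution layer."""
    return h * w * c_in * c_out * k * k

def separable_mults(h, w, c_in, c_out, k):
    """Depthwise (one k x k filter per input channel) followed by a
    pointwise 1 x 1 convolution across channels."""
    return h * w * c_in * k * k + h * w * c_in * c_out

std = conv_mults(56, 56, 128, 128, 3)       # 462,422,016 multiplies
sep = separable_mults(56, 56, 128, 128, 3)  #  54,992,896 multiplies
print(round(std / sep, 1))  # 8.4x fewer multiplications
```

The ratio works out to 1 / (1/c_out + 1/k^2), so with 3x3 kernels the saving approaches 9x as the channel count grows — the arithmetic reason MobileNet fits in an FPGA budget that a standard CNN would exhaust.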
379

Strategies for the Characterization and Virtual Testing of SLM 316L Stainless Steel

Hendrickson, Michael Paul 02 August 2023 (has links)
The selective laser melting (SLM) process allows for the control of unique part form and function characteristics not achievable with conventional manufacturing methods, and has thus gained interest in several industries such as the aerospace and biomedical fields. The fabrication processing parameters selected to manufacture a given part influence the created material microstructure and the final mechanical performance of the part. Understanding the process-structure and structure-performance relationships is very important for the design and quality assurance of SLM parts. Image-based analysis methods are commonly used to characterize material microstructures but are very time-consuming, traditionally requiring manual segmentation of imaged features. Two Python-based image analysis tools are developed here to automate the instance segmentation of manufacturing defects and subgranular cell features commonly found in SLM 316L stainless steel (SS) for quantitative analysis. A custom-trained mask region-based convolutional neural network (Mask R-CNN) model is used to segment cell features from scanning electron microscopy (SEM) images with an instance segmentation accuracy nearly identical to that of a human researcher, but about four orders of magnitude faster. The defect segmentation tool uses techniques from the OpenCV Python library to identify and segment defect instances from optical images. A melt pool structure generation tool is also developed to create custom melt-pool geometries from a few user inputs, with the ability to create functionally graded structures for use in a virtual testing framework. This tool allows for the study of complex melt-pool geometries and graded structures commonly seen in SLM parts and is applied in three finite element analyses to investigate the effects of different melt-pool geometries on part stress concentrations.
/ Master of Science / Recent advancements in additive manufacturing (AM) processes like the selective laser melting (SLM) process are revolutionizing the way many products are manufactured. The geometric form and material microstructure of SLM parts can be controlled by manufacturing settings, referred to as fabrication processing parameters, in ways not previously possible via conventional manufacturing techniques such as machining and casting. The improved geometric control of SLM parts has enabled more complex part geometries as well as significant manufacturing cost savings for some parts. With improved control over the material microstructure, the mechanical performance of SLM parts can be finely tailored and optimized for a particular application. Complex functionally graded materials (FGM) can also easily be created with the SLM process by varying the fabrication processing parameters spatially within the manufactured part to improve mechanical performance for a desired application. The added control offered by the SLM process has created a need for understanding how changes in the fabrication processing parameters affect the material structure, and in turn, how the produced structure affects the mechanical properties of the part. This study presents three different tools developed for the automated characterization of SLM 316L stainless steel (SS) material structures and the generation of realistic material structures for numerical simulation of mechanical performance. A defect content tool is presented to automatically identify and create binary segmentations of defects in SLM parts, consisting of small air pockets within the volume of the parts, from digital optical images. A machine learning based instance segmentation tool is also trained on a custom data set and used to measure the size of nanoscale cell features unique to 316L (SS) and some other metal alloys processed with SLM from scanning electron microscopy (SEM) images. 
Both these tools automate the laborious process of segmenting individual objects of interest from hundreds or thousands of images and are shown to have an accuracy very close to that of manually produced results from a human. The results are also used to analyze three samples produced with different fabrication processing parameters, which showed process-structure relationships similar to those reported in other studies. The SLM structure generation tool is developed to create melt pool structures similar to those seen in SLM parts from the successive melting and solidification of material along the laser scanning path. This structural feature is unique to AM processes such as SLM, and the example test cases investigated in this study show that changes in the melt-pool geometry have a measurable effect (slightly above a 10% difference) on the stress and strain response of the material when a tensile load is applied. The melt pool structure generation tool can create complex geometries capable of varying spatially to create FGMs from a few user inputs and, when applied to existing simulation methods for SLM parts, offers improved estimates of the mechanical response of SLM parts.
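The defect segmentation step — thresholding an optical image and labeling each connected pore as a separate instance — can be sketched without OpenCV using a plain flood fill. The threshold and toy image below are assumptions for illustration, not the tool's actual pipeline:

```python
import numpy as np

def label_defects(img, thresh=0.5):
    """Binary-threshold an image and label 4-connected dark regions
    (pores) via iterative flood fill -- the core of defect counting."""
    binary = img < thresh                 # pores are darker than base metal
    labels = np.zeros(img.shape, dtype=int)
    current = 0
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            if binary[i, j] and labels[i, j] == 0:
                current += 1              # start a new defect instance
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if (0 <= y < img.shape[0] and 0 <= x < img.shape[1]
                            and binary[y, x] and labels[y, x] == 0):
                        labels[y, x] = current
                        stack += [(y+1, x), (y-1, x), (y, x+1), (y, x-1)]
    return labels, current

img = np.ones((6, 6))                     # bright base metal
img[1:3, 1:3] = 0.1                       # one 4-pixel pore
img[4, 4] = 0.2                           # a second, separate pore
labels, n = label_defects(img)
print(n)  # 2 defect instances
```

From the labeled map, per-defect statistics such as area and equivalent diameter follow directly (e.g. `(labels == 1).sum()` gives the first pore's pixel area), which is the quantitative output such a tool reports.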
380

Image-based Machine Learning Applications in Nitrate Sensor Quality Assessment and Inkjet Print Quality Stability

Qingyu Yang (6634961) 21 December 2022 (has links)
An on-line quality assessment system is essential in industry to prevent artifacts and guide manufacturing processes. Some well-developed systems can diagnose problems and help control output quality. However, some conventional methods are limited by time consumption and the cost of expensive human labor, so more efficient solutions are needed to guide future decisions and improve productivity. This thesis focuses on developing two image-based machine learning systems to accelerate manufacturing processes: one benefits nitrate sensor fabrication, and the other helps image quality control for inkjet printers.

In the first work, we propose a system for predicting a nitrate sensor's performance based on non-contact images. Nitrate sensors are commonly used to reflect nitrate levels in soil for agriculture. In a roll-to-roll system for manufacturing thin-film nitrate sensors, varying characteristics of the ion-selective membrane on screen-printed electrodes are inevitable and affect sensor performance. It is essential to monitor sensor performance in real time to guarantee sensor quality. We also develop a system for predicting sensor performance in on-line scenarios and making the neural networks adapt efficiently to new data.

Streaks are the number one image quality problem in inkjet printers. In the second work, we focus on developing an efficient method to model and predict missing jets, the main contributor to streaks. In inkjet printing, missing jets typically increase over printing time, and the print head needs to be purged frequently to recover missing jets and maintain print quality. We leverage machine learning techniques to develop spatio-temporal models that predict when and where missing jets are likely to occur. The prediction system helps inkjet printers make more intelligent decisions during customer jobs. In addition, we propose another system that automatically identifies missing-jet patterns from a large-scale database, which can be used in a diagnostic system to identify potential failures.
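The purge-scheduling idea can be sketched with a deliberately simple failure model: if each nozzle goes missing independently at some small per-page rate, the expected number of missing jets grows with pages printed, and a purge is due once it crosses a quality threshold. The model and all numbers below are illustrative assumptions, not the thesis's spatio-temporal model:

```python
def pages_until_purge(rate_per_page, n_nozzles, max_expected_missing=2.0):
    """With each nozzle failing independently at `rate_per_page`, the
    expected number of missing jets after t pages is
    n * (1 - (1 - r)**t). Return the first page count at which that
    expectation reaches the allowed maximum."""
    t = 0
    while True:
        t += 1
        expected = n_nozzles * (1 - (1 - rate_per_page) ** t)
        if expected >= max_expected_missing:
            return t

# 1000 nozzles, each missing with probability 1e-4 per page:
print(pages_until_purge(rate_per_page=1e-4, n_nozzles=1000))  # 21
```

A learned spatio-temporal model replaces the constant per-nozzle rate with position- and history-dependent probabilities, but the purge decision logic — purge when the predicted missing-jet count crosses a threshold — stays the same.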
