321.
Weed Detection in UAV Images of Cereal Crops with Instance Segmentation. Gromova, Arina. January 2021.
Modern weeding is predominantly carried out by spraying whole fields with toxic pesticides, a process that accomplishes the main goal of eliminating weeds, but at a cost to the local environment. Weed management systems based on AI solutions enable more targeted actions, such as site-specific spraying, which is essential in reducing the need for chemicals. To introduce sustainable weeding in Swedish farmlands, we propose implementing a state-of-the-art Deep Learning (DL) algorithm capable of instance segmentation for remote sensing of weeds, before coupling it to an automated sprayer vehicle. Cereals were chosen as the target crop in this study as they are among the most commonly cultivated plants in Northern Europe. We used Unmanned Aerial Vehicles (UAVs) to capture images from several fields and trained a Mask R-CNN computer vision framework to accurately recognize and localize unique instances of weeds among plants. Moreover, we evaluated three different backbones (ResNet-50, ResNet-101, ResNeXt-101) pre-trained on the MS COCO dataset and, through transfer learning, tuned the model towards our classification task. Some well-reported limitations in building an accurate model include occlusion among instances as well as the high similarity between weeds and crops. Our system handles these challenges fairly well. We achieved a precision of 0.82, recall of 0.61, and F1 score of 0.70. Still, improvements can be made in data preparation and pre-processing to further improve the recall rate. All in all, the main outcome of this study is the system pipeline which, together with post-processing using geographical field coordinates, could serve as a detector for half of the weeds in an end-to-end weed removal system. / Site-specific Weed Control in Swedish Agriculture
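As a rough illustration of the transfer learning step described above, the sketch below adapts a Mask R-CNN model with a COCO-pre-trained ResNet-50-FPN backbone to a new class set. The thesis does not state which framework was used; torchvision and a single "weed" foreground class are assumptions made here for illustration only.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Load Mask R-CNN with a ResNet-50-FPN backbone pre-trained on MS COCO.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

num_classes = 2  # background + "weed"; the actual label set is an assumption

# Replace the box classification head for the new task.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Replace the mask prediction head as well.
in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)
```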
322.
Single image scene-depth estimation based on self-supervised deep learning: For perception in autonomous heavy duty vehicles. Piven, Yegor. January 2021.
Depth information is a vital component for perceiving the 3D structure of a vehicle's surroundings in autonomous driving. The ubiquity and relatively low cost of camera equipment make image-based depth estimation very attractive compared to the employment of specialised sensors. Classical image-based depth estimation approaches typically rely on multi-view geometry, requiring alignment and calibration between multiple image sources, which is both cumbersome and error-prone. In contrast, single images lack both temporal information and multi-view correspondences. Furthermore, depth information is lost in the projection from the 3D world to a 2D image during image formation, making single image depth estimation an ill-posed problem. In recent years, Deep Learning-based approaches have been widely proposed for single image depth estimation. The problem is typically tackled in a supervised manner, requiring access to image data with pixel-wise depth information. Acquiring large amounts of such data that is both varied and accurate is a laborious and costly task. As an alternative, a number of self-supervised approaches show that it is possible to train models for single image depth estimation using synchronized stereo image pairs or sequences of monocular images instead of depth-labeled data. This thesis investigates the self-supervised approach that utilizes sequences of monocular images, by training and evaluating one of the state-of-the-art methods on both the popular public KITTI dataset and data from the host company, Scania. A number of extensions are implemented for the chosen method, namely the addition of weak supervision with velocity data, the employment of geometry consistency constraints, and the incorporation of a self-attention mechanism. The resulting models showed good depth estimation performance for major components of the scene, such as nearby roads and buildings, but struggled at longer ranges and with dynamic objects and thin structures. Minor qualitative and quantitative improvements in performance were observed with the introduction of the geometry consistency loss and mask, as well as the self-attention mechanism. Qualitative improvements included the models' enhanced ability to identify clearer object boundaries and better distinguish objects from their background. The geometry consistency loss also proved to be informative in low-texture regions of the image and resolved the artifacting behaviour observed when training models on Scania's data. Incorporating supervision of the predicted translations using velocity data proved effective at enforcing the metric scale of the depth network's predictions. However, a risk of overfitting to this supervision was observed when training on Scania's data. To resolve this issue, a velocity-supervised fine-tuning procedure is proposed as an alternative to velocity-supervised training from scratch, resolving the observed overfitting while still enabling the model to learn the metric scale. The proposed fine-tuning procedure was effective even when training models on the KITTI dataset, where no overfitting was observed, suggesting its general applicability.
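The velocity-based weak supervision mentioned above is not spelled out in the abstract; a common formulation, sketched below under the assumption that vehicle speed and the inter-frame interval are available, penalizes the difference between the norm of the predicted inter-frame translation and the distance actually travelled.

```python
import torch

def velocity_supervision_loss(pred_translation, speed_mps, dt):
    """Weakly supervise the pose network's translation magnitude with measured speed.

    pred_translation: (B, 3) translation predicted between consecutive frames.
    speed_mps: (B,) vehicle speed in metres per second (assumed available, e.g. from CAN).
    dt: time between the two frames in seconds.
    """
    travelled = speed_mps * dt                      # distance actually moved
    pred_dist = pred_translation.norm(dim=-1)       # magnitude of predicted motion
    return torch.abs(pred_dist - travelled).mean()  # L1 penalty enforces metric scale
```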
323.
Deep neural network for object classification and optimization algorithms for 3D positioning in Ultrasonic Sensor Array. Zhang, Hui. January 2021.
Ultrasonic sensors are commonly used in automobiles to assist driving maneuvers, e.g., parking, because of their cost-effectiveness and robustness. This thesis investigated the feasibility of using an ultrasonic sensor array to locate the 3D position of an object, and of using the measurements from the sensor array to train a Convolutional Neural Network (CNN) to classify the objects. A simulated ultrasonic sensor array was built in COMSOL Multiphysics. The ultrasound simulation used ray tracing to track the paths of the ultrasound rays. The readouts from the sensor array are used to formulate an optimization problem that addresses the 3D positioning of the object. We investigated the performance of two optimization methods in terms of prediction accuracy and solving efficiency. The average mean absolute error (MAE) and average mean squared error (MSE) of the Nelder-Mead method (without constraints) were 2.66 mm and 12.79 mm² respectively, and the average running time to predict one 3D position was 97.62 ms. The average MAE and average MSE of Powell's method (with constraints) were 2.84 mm and 23.66 mm² respectively, with an average running time of 84.68 ms per 3D position. The results of Powell's method (without constraints) were much worse than the above two: its average MAE and MSE were 24.93 mm and 7559.46 mm², and its average running time was 238.30 ms. The readouts from the sensor array were also used to build eight different datasets whose data structures are different combinations of the information in the readouts. Each of these eight datasets was used to train a CNN, and the classification accuracy of each CNN indicates how well the data structure represents the objects. The results showed that the CNN trained on the stacked 5×5×3 time array had the best classification accuracy among the eight datasets, with a classification accuracy of 85.05% on the test set.
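A minimal sketch of how such a 3D positioning problem can be posed and solved with the two methods named above is shown below. The sensor layout, the time-of-flight cost function and the use of SciPy are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np
from scipy.optimize import minimize

SPEED_OF_SOUND = 343e3  # mm/s

# Illustrative 5x5 sensor grid in the z = 0 plane (mm); the actual array layout is an assumption.
xs, ys = np.meshgrid(np.linspace(-40, 40, 5), np.linspace(-40, 40, 5))
sensors = np.stack([xs.ravel(), ys.ravel(), np.zeros(25)], axis=1)

def residual(pos, measured_tof):
    """Sum of squared differences between modelled and measured round-trip times."""
    dist = np.linalg.norm(sensors - pos, axis=1)
    modelled_tof = 2.0 * dist / SPEED_OF_SOUND
    return np.sum((modelled_tof - measured_tof) ** 2)

# Synthetic measurement for a known object position, just to exercise the solvers.
true_pos = np.array([10.0, -5.0, 120.0])
measured = 2.0 * np.linalg.norm(sensors - true_pos, axis=1) / SPEED_OF_SOUND

for method in ("Nelder-Mead", "Powell"):
    result = minimize(residual, x0=np.array([0.0, 0.0, 100.0]), args=(measured,), method=method)
    print(method, result.x)
```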
324.
Optimal Optimizer Hyper-Parameters for 2D to 3D Reconstruction. Teki, Sai Ajith. January 2021.
2D to 3D reconstruction is an ill-posed problem in the field of autonomous robot navigation. Many practitioners tend to utilize the enormous success of Deep Learning techniques such as CNNs and ANNs to solve tasks related to 2D to 3D reconstruction. Generally, every deep learning model involves the implementation of an optimizer suited to the task, to minimize the loss in its results, and the selection of hyper-parameter values for that optimizer when training the model on the required dataset. Selecting these optimizer hyper-parameters requires in-depth knowledge and trial and error, so proposing optimal hyper-parameter values avoids wasting computational resources and time, and a solution for the selected task can be found more easily. The main objective of this research is to propose optimal hyper-parameter values for various deep learning optimizers used in 2D to 3D reconstruction, and to propose the best optimizer among them in terms of computational time and resources. To achieve this goal, two research methods are used. The first is a Systematic Literature Review, whose main goal is to reveal the most widely used optimizers for 2D to 3D reconstruction models based on 3D Deep Learning techniques. The second is an experimental methodology, whose main goal is to propose optimal hyper-parameter values for the respective optimizers, namely Adam, SGD+Momentum, Adagrad, Adadelta and Adamax, as used in 3D reconstruction models. In terms of computational time, the Adamax optimizer outperformed all other optimizers, with a training time of 1970 min, testing time of 3360 min, evaluation-1 time of 16 min and evaluation-2 time of 14 min. In terms of average point cloud points, Adamax outperformed all other optimizers with a mean value of 28451.04. In terms of pred->GT and GT->pred values, the Adamax optimizer outperformed all other optimizers with mean values of 4.742 and 4.600 respectively. Point cloud images with the respective dense cloud points were obtained as results of the experiment. From these results, the Adamax optimizer proved to be the best in terms of visualization of point cloud images, with the following optimal hyper-parameter values: epochs: 1000, learning rate: 1e-2, chunk size: 32, batch size: 32. In this study, the Adamax optimizer, with these optimal hyper-parameter values and the best point cloud images, is proposed as the best optimizer for 2D to 3D reconstruction tasks that deal with point cloud images.
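For reference, the sketch below shows how the reported optimal Adamax configuration might look in PyTorch. The network itself is only a placeholder; the optimizer choice and the quoted hyper-parameter values are the only details taken from the abstract.

```python
import torch

# Placeholder network; the thesis's actual 2D-to-3D reconstruction model is not reproduced here.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 2048),
    torch.nn.ReLU(),
    torch.nn.Linear(2048, 3 * 2048),  # e.g. predicts 2048 (x, y, z) point coordinates
)

# Adamax configured with the hyper-parameter values reported as optimal in the abstract.
optimizer = torch.optim.Adamax(model.parameters(), lr=1e-2)
EPOCHS = 1000
BATCH_SIZE = 32
CHUNK_SIZE = 32
```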
325.
3D Instance Segmentation of Cluttered Scenes: A Comparative Study of 3D Data Representations. Konradsson, Albin; Bohman, Gustav. January 2021.
This thesis provides a comparison between instance segmentation methods using point clouds and depth images. Specifically, their performance on cluttered scenes of irregular objects in an industrial environment is investigated. Recent work by Wang et al. [1] has suggested potential benefits of a point cloud representation when performing deep learning on data from 3D cameras. However, little work has been done to enable quantifiable comparisons between methods based on different representations, particularly on industrial data. Generating synthetic data provides accurate grayscale, depth map, and point cloud representations for a large number of scenes and can thus be used to compare methods regardless of data type. The datasets in this work are created using a tool provided by SICK. They simulate postal packages on a conveyor belt scanned by a LiDAR, closely resembling a common industry application. Two datasets are generated. One dataset has low complexity, containing only boxes. The other has higher complexity, containing a combination of boxes and multiple types of irregularly shaped parcels. State-of-the-art instance segmentation methods are selected based on their performance on existing benchmarks. We chose PointGroup by Jiang et al. [2], which uses point clouds, and Mask R-CNN by He et al. [3], which uses images. The results support that there may be benefits of using a point cloud representation over depth images. PointGroup performs better in terms of the chosen metric on both datasets. On low complexity scenes, the inference times are similar between the two methods tested. However, on higher complexity scenes, Mask R-CNN is significantly faster.
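Since the comparison relies on the same synthetic scenes being available both as depth images and as point clouds, a standard back-projection links the two representations. The sketch below assumes a pinhole camera model with known intrinsics; the actual camera model of the SICK simulation tool is not described in the abstract.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres) to an N x 3 point cloud using pinhole intrinsics.

    fx, fy, cx, cy are assumed camera intrinsics, purely for illustration.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth reading
```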
326.
Improving Situational Awareness in Aviation: Robust Vision-Based Detection of Hazardous Objects. Levin, Alexandra; Vidimlic, Najda. January 2020.
Enhanced vision and object detection could be useful in the aviation domain in situations of bad weather or cluttered environments. In particular, enhanced vision and object detection could improve situational awareness and aid the pilot in interpreting the environment and detecting hazardous objects. The fundamental concept of object detection is to interpret which objects are present in an image with the aid of a prediction model or other feature extraction techniques. Constructing a comprehensive dataset that can describe the operational environment and is robust to weather and lighting conditions is vital if the object detector is to be utilised in the avionics domain. Evaluating the accuracy and robustness of the constructed dataset is crucial, since erroneous detection (the object detection algorithm failing to detect a potentially hazardous object, or falsely detecting an object) is a major safety issue. Bayesian uncertainty estimations are evaluated to examine whether they can be utilised to detect misclassifications, enabling the use of a Bayesian Neural Network with the object detector to identify erroneous detections. The object detector Faster R-CNN with ResNet-50-FPN was utilised via the development framework Detectron2; the accuracy of the object detection algorithm was evaluated based on the obtained MS-COCO metrics. The setup achieved a 50.327% AP@[IoU=.5:.95] score, with an 18.1% decrease when exposed to weather and lighting conditions. By applying artificial artefacts and augmentations of luminance, motion, and weather to the images of the training set, the AP@[IoU=.5:.95] score increased by 15.6%. The augmentation provided the robustness necessary to maintain the accuracy when exposed to variations in environmental conditions, resulting in just a 2.6% decrease from the initial accuracy. To fully conclude that the augmentations provide the necessary robustness for variations in environmental conditions, the model needs to be subjected to actual image representations of the operational environment with different weather and lighting phenomena. Bayesian uncertainty estimations show great promise in providing additional information for correctly interpreting objects in the operational environment. Further research is needed to conclude whether uncertainty estimations can provide the information necessary to detect erroneous predictions.
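A minimal Detectron2 setup for the model family named above (Faster R-CNN with ResNet-50-FPN, pre-trained on COCO) could look like the sketch below. The dataset names, class count and solver settings are placeholders, not values from the thesis.

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
# Faster R-CNN with a ResNet-50-FPN backbone, initialised from COCO weights.
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("aviation_train",)   # hypothetical registered dataset
cfg.DATASETS.TEST = ("aviation_val",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5        # number of hazardous-object classes is an assumption
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 20000

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```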
327.
Image-to-Image Translation for Improvement of Synthetic Thermal Infrared Training Data Using Generative Adversarial Networks. Hamrell, Hanna. January 2021.
Training data is an essential ingredient in supervised learning, yet it is time-consuming, expensive and for some applications impossible to retrieve. Thus it is of interest to use synthetic training data. However, the domain shift of synthetic data makes it challenging to obtain good results when it is used as training data for deep learning models. It is therefore of interest to refine synthetic data, e.g. using image-to-image translation, to improve results. The aim of this work is to compare different methods for image-to-image translation of synthetic thermal IR training images using GANs. Translation is done both using synthetic thermal IR-images alone, and with the inclusion of pixelwise depth and/or semantic information. For evaluation, a new measure based on the Fréchet Inception Distance, adapted to work for thermal IR-images, is proposed. The results show that the model trained using IR-images alone translates the generated images closest to the domain of authentic thermal IR-images. The training where IR-images are complemented by corresponding pixelwise depth data performs second best. However, given more training time, the inclusion of depth data has the potential to outperform training with IR-data alone. This gives valuable insight into how to best translate images from the domain of synthetic IR-images to that of authentic IR-images, which is vital for quick and low-cost generation of training data for deep learning models.
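The Fréchet-distance core of the proposed measure can be written down directly from the statistics of two feature sets, as sketched below. How the feature extractor is adapted to single-channel thermal IR-images is the thesis's contribution and is not reproduced here.

```python
import numpy as np
from scipy import linalg

def frechet_distance(feat_real, feat_gen):
    """Fréchet distance between two sets of feature vectors (rows = samples).

    feat_real, feat_gen: arrays of shape (N, D) from some feature extractor;
    which extractor suits thermal IR-images is left open here.
    """
    mu1, mu2 = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    sigma1 = np.cov(feat_real, rowvar=False)
    sigma2 = np.cov(feat_gen, rowvar=False)
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # discard tiny imaginary parts from numerical error
    diff = mu1 - mu2
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```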
328.
Development of machine learning models for object identification of parasite eggs using microscopy. Larsson, Joel; Hedberg, Rasmus. January 2020.
Over one billion people in developing countries are afflicted by parasitic infections caused by soil-transmitted helminths. These infections are treatable with cheap and safe medicine that is widely available. However, diagnosis of these infections has proven to be a bottleneck, as it is time-consuming and requires expensive equipment and trained personnel to be consistent and accurate. This study aimed to investigate the viability and performance of five machine learning models and a 'modular neural network' approach to localize and classify the following parasite eggs in microscopic images: Ascaris lumbricoides, Trichuris trichiura, Hookworm and Schistosoma mansoni. These models were implemented and evaluated on the Nvidia Jetson AGX Xavier to establish whether they fulfilled the specifications of 95% specificity and sensitivity, as well as a speed requirement of 40,000 images per 24 hours. The results show that R-FCN ResNet101 performed best on average and was the best model produced in this study. It did not fulfill the specifications entirely, but is still considered a success as it improves on the current implementation at Etteplan. Evaluation of the modular neural network approach would require further investigation to verify the performance of the system, but the results indicate that it could be a possible improvement over off-the-shelf machine learning models. To conclude, the study showed that the data and data infrastructure provided by Etteplan are a very powerful tool for training machine learning models to classify and localize parasite eggs in stool samples. However, expanding the data to reduce the class imbalance, and also to include more patient information, could improve the training and evaluation process of the models.
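To put the stated throughput requirement in concrete terms, the arithmetic below converts 40,000 images per 24 hours into a per-image time budget on the Jetson AGX Xavier; the numbers come directly from the abstract.

```python
# Throughput requirement: 40,000 images per 24 hours.
IMAGES_PER_DAY = 40_000
SECONDS_PER_DAY = 24 * 60 * 60

max_time_per_image = SECONDS_PER_DAY / IMAGES_PER_DAY
print(f"Maximum allowed time per image: {max_time_per_image:.2f} s")  # 2.16 s/image
```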
329.
Comparing CNN methods for detection and tracking of ships in satellite images / Jämförelse av CNN-baserad machine learning för detektion och spårning av fartyg i satellitbilder. Torén, Rickard. January 2020.
Knowing where ships are located is a key factor in supporting safe maritime transport and harbor management, as well as in preventing accidents and illegal activities at sea. International solutions for geopositioning in the maritime domain already exist, such as the Automatic Identification System (AIS). However, AIS requires the ships to constantly transmit their location. Real-time imagery based on geostationary satellites has recently been proposed to complement the existing AIS system, making locating and tracking more robust. This thesis investigated and compared two machine learning image analysis approaches – Faster R-CNN and SSD with FPN – for detection and tracking of ships in satellite images. Faster R-CNN is a two-stage model which first proposes regions of interest and then performs detection based on the proposals. SSD is a one-stage model which detects objects directly, with the additional FPN improving detection of objects covering few pixels. The MAritime SATellite Imagery dataset (MASATI) was used for training and evaluation of the candidate models, with 5600 images taken from a wide variety of locations. The TensorFlow Object Detection API was used for the implementation of the two models. The detection results show that Faster R-CNN achieved a 30.3% mean Average Precision (mAP), while SSD with FPN achieved only 0.0005% mAP on the unseen test part of the dataset. This study concluded that Faster R-CNN is a candidate for identifying and tracking ships in satellite images, while SSD with FPN seems less suitable for this task. It is also concluded that the amount of training and the choice of hyper-parameters impacted the results.
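The mAP figures above follow the COCO evaluation protocol; a sketch of how such numbers could be reproduced with pycocotools is shown below, assuming the detections have been exported to COCO-format JSON (the file names are hypothetical).

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO("masati_val_annotations.json")          # ground-truth ship boxes (hypothetical file)
dt = gt.loadRes("faster_rcnn_detections.json")    # model detections (hypothetical file)

evaluator = COCOeval(gt, dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP@[IoU=.50:.95], AP@.50, and related metrics
```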
330.
Path Planning and Path Following for an Autonomous GPR Survey Robot. Meedendorp, Maurice. January 2022.
Ground Penetrating Radar (GPR) is a tool for mapping the subsurface in a non-invasive way. GPR surveys are currently carried out manually; a time-consuming, tedious and sometimes dangerous task. This report presents the high-level software components for an autonomous unmanned ground vehicle that conducts GPR surveys. The hardware system is a four-wheel drive, skid-steering, battery-operated vehicle with integrated GPR equipment. Autonomous surveys are conducted using lidar-inertial odometry with robust path planning, path following and obstacle avoidance capabilities. Evaluation shows that the vehicle is able to autonomously execute a planned survey with high accuracy and stops before collisions occur. This system enables high-frequency surveys to monitor the evolution of an area over time, allows one operator to monitor multiple surveys at once, and facilitates future research into novel survey patterns that are difficult to follow manually.
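The abstract does not name a specific path-following algorithm; as a generic illustration of the path-following component, the sketch below implements a simple pure-pursuit steering law. The real vehicle is skid-steered, so this Ackermann-style formulation is only indicative of the idea, not the controller actually used.

```python
import math

def pure_pursuit_steering(pose, path, lookahead, wheelbase=1.0):
    """Return a steering angle toward a lookahead point on the planned survey path.

    pose: (x, y, heading) of the vehicle in the map frame.
    path: list of (x, y) waypoints along the planned survey line.
    lookahead, wheelbase: assumed tuning parameters for this illustration.
    """
    x, y, heading = pose
    # Pick the first waypoint at least one lookahead distance away (fall back to the last point).
    target = next((p for p in path if math.hypot(p[0] - x, p[1] - y) >= lookahead), path[-1])
    # Angle to the target expressed in the vehicle frame.
    alpha = math.atan2(target[1] - y, target[0] - x) - heading
    # Pure-pursuit curvature converted to a steering angle.
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)
```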