Global ETD Search

281	An Autonomous Intelligent Robotic Wheelchair to Assist People in Need: Standing-up, Turning-around and Sitting-down Papadakis Ktistakis, Iosif January 2018 (has links) No description available. Computer Engineering Computer Science Mechanical Engineering Robotics Assistive Robotics Wheelchairs Human Machine Interaction Speech Recognition Pose Estimation Active Participation System Integration Prototype Robotic Arm Decision Making
282	Robust Localization of Research Concept Vehicle (RCV) in Large Scale Environment / Robust lokalisering av Research Concept Vehicle (RCV) i storskalig miljö Raghuram, Anchit January 2018 (has links) Autonomous vehicles in the recent era are robust vehicles that can drive themselves without human involvement using sensors and Simultaneous Localization and Mapping (SLAM) algorithms. These help the vehicle gain an understanding of its environment while driving, with laser scanners (Velodyne), an IMU and GPS collecting data that lays the foundation for the vehicle to locate itself in an unknown environment. Various methods for increasing the efficiency of registration and optimization have been studied and tested over the years, but the implementation of the NDT library for mapping and localization has been found to be fast and more accurate than conventional methods. The objective of this thesis is to ascertain a robust method of estimating the vehicle's pose by combining data from the laser sensor with data from the IMU and GPS receiver on the vehicle. The initial position estimate is obtained by generating a 3D map using the Normal Distribution Transform and estimating the position using the NDT localization algorithm and the GPS data collected by driving the vehicle in an external environment. The results presented explain and verify the stated hypothesis and compare the implemented localization algorithm with the GPS receiver data available on the vehicle while driving. / Autonoma fordon har på senare tid utvecklats till robusta fordon som kan köra sig själva utan hjälp av en människa, detta har möjliggjorts genom användandet av sensorer och algoritmer som utför lokalisering och kartläggning samtidigt (SLAM). 
Dessa sensorer och algoritmer hjälper fordonet att förstå dess omgivning medan det kör och tillsammans med laser skanners (Velodyne), IMU'er och GPS läggs grunden för att kunna utföra lokalisering i en okänd miljö. Ett flertal metoder har studerats och testats för att förbättra effektiviteten av registrering och optimering under åren men implementationen av NDT biblioteket för kartläggning och lokalisering har visat sig att vara snabbt och mer exakt jämfört med konventionella metoder. Målet med detta examensarbete är att hitta en robust metod för uppskatta pose genom att kombinera data från laser sensorn, en uppskattning av den ursprungliga positionen som fås genom att generera en 3D karta med hjälp av normalfördelningstransformen och GPS data insamlad från körningar i en extern miljö. Resultaten som presenteras beskriver och verifierar den hypotes som läggs fram och visar jämförelsen av den implementerade lokaliseringsalgoritmen med GPS data tillgänglig på fordonet under körning. Localization Normal Distribution Transform Pose Fusion MCL Monte-Carlo Localization GPS Robust Localization SLAM Elektroteknik och elektronik
283	Unsupervised 3D Human Pose Estimation / Oövervakad mänsklig poseuppskattning i 3D Budaraju, Sri Datta January 2021 (has links) The thesis proposes an unsupervised representation learning method to predict 3D human pose from a 2D skeleton via a VAE-GAN (Variational Autoencoder Generative Adversarial Network) hybrid network. The method learns to lift poses from 2D to 3D using self-supervision and adversarial learning techniques. The method does not use images, heatmaps, 3D pose annotations, paired/unpaired 2D-to-3D skeletons, 3D priors, synthetic 2D skeletons, multi-view or temporal information in any shape or form. The 2D skeleton input is taken by a VAE that encodes it in a latent space and then decodes that latent representation to a 3D pose. The 3D pose is then reprojected to 2D for a constrained, self-supervised optimization using the input 2D pose. In parallel, the 3D pose is also randomly rotated and reprojected to 2D to generate a ‘novel’ 2D view for unconstrained adversarial optimization using a discriminator network. The combination of the optimizations of the original and the novel 2D views of the predicted 3D pose results in a ‘realistic’ 3D pose generation. The thesis shows that the encoding and decoding process of the VAE addresses the major challenge of erroneous and incomplete skeletons from 2D detection networks as inputs and that the variance of the VAE can be altered to get various plausible 3D poses for a given 2D input. Additionally, the latent representation could be used for cross-modal training and many downstream applications. The results on the Human3.6M dataset outperform previous unsupervised approaches with less model complexity while addressing more hurdles in scaling the task to the real world. / Uppsatsen föreslår en oövervakad metod för representationslärande för att förutsäga en 3Dpose från ett 2D skelett med hjälp av ett VAE GAN (Variationellt Autoenkodande Generativt Adversariellt Nätverk) hybrid neuralt nätverk. 
Metoden lär sig att utvidga poser från 2D till 3D genom att använda självövervakning och adversariella inlärningstekniker. Metoden använder sig vare sig av bilder, värmekartor, 3D poseannotationer, parade/oparade 2D till 3D skelett, a priori information i 3D, syntetiska 2Dskelett, flera vyer, eller tidsinformation. 2Dskelettindata tas från ett VAE som kodar det i en latent rymd och sedan avkodar den latenta representationen till en 3Dpose. 3D posen är sedan återprojicerad till 2D för att genomgå begränsad, självövervakad optimering med hjälp av den tvådimensionella posen. Parallellt roteras dessutom 3Dposen slumpmässigt och återprojiceras till 2D för att generera en ny 2D vy för obegränsad adversariell optimering med hjälp av ett diskriminatornätverk. Kombinationen av optimeringarna av den ursprungliga och den nya 2Dvyn av den förutsagda 3Dposen resulterar i en realistisk 3Dposegenerering. Resultaten i uppsatsen visar att kodningsoch avkodningsprocessen av VAE adresserar utmaningen med felaktiga och ofullständiga skelett från 2D detekteringsnätverk som indata och att variansen av VAE kan modifieras för att få flera troliga 3D poser för givna 2D indata. Dessutom kan den latenta representationen användas för crossmodal träning och flera nedströmsapplikationer. Resultaten på datamängder från Human3.6M är bättre än tidigare oövervakade metoder med mindre modellkomplexitet samtidigt som de adresserar flera hinder för att skala upp uppgiften till verkliga tillämpningar. Computer Vision Projective Geometry Deep Learning Unsupervised Learning 3D Human Pose Estimation GAN AutoEncoder Hybrid Generative Model Self Supervision Computer and Information Sciences Data- och informationsvetenskap
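The rotate-and-reproject step described in record 283 can be illustrated with a minimal sketch. It is not the thesis's implementation: the orthographic projection, the rotation about the vertical (y) axis, the uniform angle range, and all function names are assumptions made here for illustration, and the VAE and discriminator networks are omitted.

```python
import math
import random


def rotate_about_y(pose3d, angle_rad):
    """Rotate every (x, y, z) joint about the vertical y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in pose3d]


def project_orthographic(pose3d):
    """Drop the depth coordinate (assumed orthographic camera)."""
    return [(x, y) for x, y, _ in pose3d]


def novel_view(pose3d, rng=random):
    """Randomly rotate the 3D pose and reproject it to a new 2D view."""
    angle = rng.uniform(-math.pi, math.pi)  # assumed angle distribution
    return project_orthographic(rotate_about_y(pose3d, angle))


def reprojection_error(pose3d, pose2d):
    """Mean 2D distance between the reprojected pose and the input 2D pose."""
    projected = project_orthographic(pose3d)
    total = sum(
        math.hypot(px - qx, py - qy)
        for (px, py), (qx, qy) in zip(projected, pose2d)
    )
    return total / len(pose2d)
```

In this sketch, `reprojection_error` stands in for the constrained, self-supervised term on the original view, while the output of `novel_view` is what a discriminator would judge in the adversarial term.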
284	New object grasp synthesis with gripper selection: process development Legrand, Tanguy January 2022 (has links) A fundamental aspect to consider in factories is the transportation of items at different steps in the production process. Conveyor belts do a great job of bringing items from point A to point B, but loading an item onto a working station can demand a more precise and, in some cases, delicate approach. Nowadays this part is mostly handled by robotic arms. The issue encountered is that a robot arm's extremity, its gripper, cannot instinctively know how to grip an object. It is usually up to a technician to configure how and where the gripper goes to grip an item. The goal of this thesis is to analyse a problem given by a company, which is to find a way to automate the grasp pose synthesis of a new object with the adapted gripper. This automated process can be separated into two sub-problems. First, how to choose the adapted gripper for a new object. Second, how to find a grasp pose on the object with the previously chosen gripper. In the problem given by the company, the computer-aided design (CAD) 3D model of the concerned object is given. Also, the grasp shall always be done vertically, i.e., the gripper comes vertically to the object and does not rotate on the x and y axes. The gripper for a new object is selected between two kinds of grippers: a two-finger parallel-jaw gripper and a three-finger parallel-jaw gripper. No dataset of objects is provided. Object grasping is a well-researched subject, especially for two-finger grippers. However, little research has been done on grasp pose synthesis for three-finger grippers or on gripper comparison, which are key parts of the studied problem. To answer the sub-problems mentioned above, machine learning will be used for the gripper selection and a grasp synthesis method will be used for the grasp pose finding. 
However, due to the lack of gripper comparison in the related work, a new approach needs to be created, which will be inspired by the findings in the literature about grasp pose synthesis in general. This approach will consist of two parts. First, for each gripper and object combination, some grasp poses are generated, each associated with a corresponding score. The scores are used to get an idea of the best gripper for an object, the best score for each gripper indicating how good a grasp could be on the object with said gripper. Second, the objects with their associated best score for each gripper will be used as training data for a machine learning algorithm that will assist in the choice of the gripper. This approach leads to two research questions: “How to generate grasps of satisfying quality for an object with a certain gripper?” “Is it possible to determine the best gripper for a new object via machine learning?” The first question is answered by using mathematical operations on the point cloud representation of the objects and a cost function (that will be used to attribute a score), while the second question is answered using machine learning classification and regression to gain insight into how machine learning can learn to associate object properties with gripper efficiency. The results show that grasp generation with the chosen cost function gives grasp poses that are similar to the grasp poses a human operator would choose, but the machine learning models seem unable to assess grasp quality, either with regression or classification. grasps pose synthesis point cloud classification point cloud regression machine learning 3D model CAD model robotic grasping pick and place gripper selection Computer Sciences Datavetenskap (datalogi)
285	Dense Foot Pose Estimation From Images Sharif, Sharif January 2023 (has links) There is ongoing research into building dense correspondence between digital images of objects in the world and estimating the 3D pose of these objects. This is a difficult area to conduct research due to the lack of availability of annotated data. Annotating each pixel is too time-consuming. At the time of this writing, current research has managed to use neural networks to establish a dense pose estimation of human body parts (feet, chest, legs etc.). The aim of this thesis is to investigate if a model can be developed using neural networks to perform dense pose estimation on human feet. The data used in evaluating the model is generated using proprietary tools. Since this thesis is using a custom model and custom dataset, one model will be developed and tested with various experiments to gain an understanding of the different parameters that influence the model’s performance. Experiments showed that a model based on DeepLabV3 is able to achieve a dense pose estimation of feet with a mean error of 1.0cm. The limiting factor for a model’s ability to estimate a dense pose is based on the model’s ability to classify the pixels in an image accurately. It was also shown that discontinuous UV unwrapping greatly reduced the model’s dense pose estimation ability. The results from this thesis should be considered preliminary and need to be repeated multiple times to account for the stochastic nature of training neural networks. / Pågående forskning undersöker hur man kan skapa tät korrespondens mellan digitala bilder av objekt i världen och uppskatta de objektens 3D-pose. Detta är ett svårt område att forska inom på grund av bristen på tillgänglig annoterad data. Att annotera varje pixel är tidskrävande. Vid tiden för detta skrivande har aktuell forskning lyckats använda neurala nätverk för att etablera en tät pose-estimering av mänskliga kroppsdelar (fötter, bröst, ben osv.). 
Syftet med denna arbete är att undersöka om en modell kan utvecklas med hjälp av neurala nätverk för att utföra dense pose-estimering av mänskliga fötter. Data som används för att utvärdera modellen genereras med hjälp av proprietära verktyg. Eftersom denna arbete använder en anpassad modell och anpassad dataset kommer en modell att utvecklas och testas med olika experiment för att förstå de olika parametrarna som påverkar modellens prestanda. Experiment visade att en modell baserad på DeepLabV3 kan uppnå en dense pose-estimering av fötter med en medelfel på 1,0 cm. Den begränsande faktorn för en modells förmåga att uppskatta en dense pose baseras på modellens förmåga att klassificera pixlarna i en bild korrekt. Det visades också att oregelbunden UV-uppackning avsevärt minskade modellens förmåga att estimera dense pose. Resultaten från denna avhandling bör betraktas som preliminära och behöver upprepas flera gånger för att ta hänsyn till den stokastiska naturen hos träning av neurala nätverk. Dense Foot Pose Estimation Computer vision Deep Learning Dense Correspondence Uppskattning Av Tät Fotställning Datorseende Djupinlärning Tät Korrespondens Computer and Information Sciences Data- och informationsvetenskap
286	Deep Visual Inertial-Aided Feature Extraction Network for Visual Odometry : Deep Neural Network training scheme to fuse visual and inertial information for feature extraction / Deep Visual Inertial-stöttat Funktionsextraktionsnätverk för Visuell Odometri : Träningsalgoritm för djupa Neurala Nätverk som sammanför visuell- och tröghetsinformation för särdragsextraktion Serra, Franco January 2022 (has links) Feature extraction is an essential part of the Visual Odometry problem. In recent years, with the rise of Neural Networks, the problem has shifted from a more classical to a deep learning approach. This thesis presents a fine-tuned feature extraction network trained on pose estimation as a proxy task. The architecture aims at integrating inertial information coming from IMU sensor data in the deep local feature extraction paradigm. Specifically, visual features and inertial features are extracted using Neural Networks. These features are then fused together and further processed to regress the pose of a moving agent. The visual feature extraction network is effectively fine-tuned and is used stand-alone for inference. The approach is validated via a qualitative analysis of the extracted keypoints and also in a more quantitative way. Quantitatively, the feature extraction network is used to perform Visual Odometry on the KITTI dataset, where the ATE for various sequences is reported. As a comparison, the proposed method, the proposed method without IMU and the original pre-trained feature extraction network are used to extract features for the Visual Odometry task. Their ATE results and relative trajectories show that in sequences with great change in orientation the proposed system outperforms the original one, while on mostly straight sequences the original system performs slightly better. / Feature extraktion är en viktig del av visuell odometri (VO). 
Under de senaste åren har framväxten av neurala nätverk gjort att tillvägagångsättet skiftat från klassiska metoder till Deep Learning metoder. Denna rapport presenterar ett kalibrerat feature extraheringsnätverk som är tränat med posesuppskattning som en proxyuppgift. Arkitekturen syftar till att integrera tröghetsinformation som kommer från sensordata i feature extraheringsnätverket. Specifikt extraheras visuella features och tröghetsfeatures med hjälp av neurala nätverk. Dessa features slås ihop och bearbetas ytterligare för att estimera position och riktning av en rörlig kamera. Metoden har undersökts genom en kvalitativ analys av featurepunkternas läge men även på ett mer kvantitativt sätt där VO-estimering på olika bildsekvenser från KITTI-datasetet har jämförts. Resultaten visar att i sekvenser med stora riktningsförändringar överträffar det föreslagna systemet det ursprungliga, medan originalsystemet presterar något bättre på sekvenser som är mestadels raka. Feature extraction network Visual Odometry IMU Neural Network Pose estimation Feature extraction Visuell Odometri IMU Neuralt nätverk Poseuppskattning Computer Sciences Datavetenskap (datalogi)
287	Real-Time Visual Multi-Target Tracking in Realistic Tracking Environments White, Jacob Harley 01 May 2019 (has links) This thesis focuses on visual multiple-target tracking (MTT) from a UAV. Typical state-of-the-art multiple-target trackers rely on an object detector as the primary detection source. However, object detectors usually require a GPU to process images in real-time, which may not be feasible to carry on-board a UAV. Additionally, they often do not produce consistent detections for small objects typical of UAV imagery. In our method, we instead detect motion to identify objects of interest in the scene. We detect motion at corners in the image using optical flow. We also track points long-term to continue tracking stopped objects. Since our motion detection algorithm generates multiple detections at each time-step, we use a hybrid probabilistic data association filter combined with a single iteration of expectation maximization to improve tracking accuracy. We also present a motion detection algorithm that accounts for parallax in non-planar UAV imagery. We use the essential matrix to distinguish between true object motion and apparent object motion due to parallax. Instead of calculating the essential matrix directly, which can be time-consuming, we design a new algorithm that optimizes the rotation and translation between frames. This new algorithm requires only 4 ms instead of 47 ms per frame of the video sequence. We demonstrate the performance of these algorithms on video data. These algorithms are shown to improve tracking accuracy, reliability, and speed. All these contributions are capable of running in real-time without a GPU. unmanned aerial vehicle multiple target tracking motion detection stationary object tracking homography probabilistic data association relative pose estimation essential matrix parallax Electrical and Computer Engineering Engineering
288	Using pose estimation to support video annotation for linguistic use : Semi-automatic tooling to aid researchers / Användning av poseuppskattning för att stödja videoannoteringsprocessen inom lingvistik : Halvautomatiska verktyg för att underlätta för forskare Gerholm, Gustav January 2022 (has links) Video annotating is a lengthy manual process. A previous research project, MINT, produced a few thousand videos of child-parent interactions in a controlled environment in order to study children’s language development. These videos were filmed across multiple sessions, tracking the same children from the age of 3 months to 7 years. In order to study the gathered material, all these videos have to be annotated with multiple kinds of annotations including transcriptions, gaze of the children, physical distances between parent and child, etc. These annotations are currently far from complete, which is why this project aimed to be a stepping point for the development of semi-automatic tooling in order to aid the process. To do this, state-of-the-art pose estimators were used to process hundreds of videos, creating pseudo-anonymized pose estimations. The pose estimations were then used in order to gauge the distance between the child and parent, and annotate the corresponding frame of the videos. Everything was packaged as a CLI tool. The results of first applying the CLI and then correcting the automatic annotations manually (compared to manually annotating everything) showed a large decrease in overall time taken to complete the annotating of videos. The tool lends itself to further development for more advanced annotations since both the tool and its related libraries are open source. / Videoannotering är en lång manuell process. Ett tidigare forskningsprojekt, MINT, producerade några tusen videor av barn-förälder-interaktioner i en kontrollerad miljö för att studera barns språkutveckling. 
Dessa videor filmades under flera sessioner och spårade samma barn från 3 månaders ålder till 7 år. För att studera det insamlade materialet måste alla dessa videor annoteras med flera olika typer av taggar inklusive transkriptioner, barnens blick, fysiska avstånd mellan förälder och barn, m.m. Denna annoteringsprocess är för närvarande långt ifrån avslutad, vilket är anledningen till detta projekt syftade till att vara ett första steg för utvecklingen av halvautomatiska verktyg för att underlätta processen. Detta projekt syftade till att semi-automatiskt annotera om ett barn och en förälder, i varje videobild, var inom räckhåll eller utom räckhåll för varandra. För att göra detta användes toppmoderna pose-estimators för att bearbeta hundratals videor, vilket skapade pseudoanonymiserade poseuppskattningar. Poseuppskattningarna användes sedan för att gissa avståndet mellan barnet och föräldern och annotera resultat i motsvarande bildruta för videorna. Allt paketerades som ett CLI-verktyg. Resultaten av att först tillämpa CLI-verktyget och sedan korrigera de automatiska annoteringarna manuellt (jämfört med manuellt annotering av allt) visade en stor minskning av den totala tiden det tog att slutföra annoteringen av videor. Framför allt lämpar sig verktyget för vidareutveckling för mer avancerade taggar eftersom både verktyget och dess relaterade bibliotek är öppen källkod. MINT project pose estimation video annotating HyperPose 3D-distance approximation MINT-projektet poseuppskattning videoannotering HyperPose uppskattning av 3d-sträcka Computer Sciences Datavetenskap (datalogi)
289	Feasibility of Mobile Phone-Based 2D Human Pose Estimation for Golf : An analysis of the golf swing focusing on selected joint angles / Lämpligheten av mobiltelefonbaserad 2D mänsklig poseuppskattning i golf : En analys av golfsvingar med fokus på utvalda ledvinklar Perini, Elisa January 2023 (has links) Golf is a sport where the correct technical execution is important for performance and injury prevention. The existing feedback systems are often cumbersome and not readily available to recreational players. To address this issue, this thesis explores the potential of using 2D Human Pose Estimation as a mobile phone-based swing analysis tool. The developed system can identify three events in the swing movement (toe-up, top and impact) and measure specific angles during these events using an algorithmic approach. The system focuses on quantifying the knee flexion and primary spine angle during the address, and lateral bending at the top of the swing. By using only the wrist coordinates in the vertical direction, the developed system identified 37% of the investigated events, independently of whether the swing was filmed in the frontal or sagittal plane. Within five frames, 95% of the events were correctly identified. Using additional joint coordinates and the event data obtained by the above-mentioned event identification algorithm, the knee flexion at address was correctly assessed in 66% of the cases, with a mean absolute error of 3.7°. The mean absolute error of the primary spine angle measurement at address was 10.5°. The lateral bending angle was correctly identified in 87% of the videos. This system highlights the potential of using 2D Human Pose Estimation for swing analysis. This thesis primarily focused on exploring the feasibility of the approach, and further research is needed to expand the system and improve its accuracy. 
This work serves as a foundation, providing valuable insights for future advancements in the field of 2D Human Pose Estimation-based swing analysis. / Golf är en sport där korrekt tekniskt utförande är avgörande för prestation och skadeförebyggelse. Feedbacksystem som finns är ofta besvärliga och inte lättillgängliga för fritidsspelare. För att åtgärda detta problem undersöker detta examensarbete potentialen att använda 2D mänsklig poseuppskattning som mobiltelefonsbaserat svinganalysverktyg. Det utvecklade systemet gör det möjligt att identifiera tre händelser i svingen (toe-up, top och impact) och att mäta specifika vinklar under dessa händelser genom en algoritmisk metod. Systemet fokuserar på att kvantifiera knäböjningen och primära ryggradsvinkeln under uppställningen, och laterala böjningen vid svingtoppen. Genom att endast använda handledskoordinater i vertikalriktning identifierade det utvecklade systemet 37% av de undersökta händelserna oavsett om svingen filmades från frontal- eller medianplanet. Inom fem bildrutor identifierades 95% av händelserna korrekt. Genom att använda ytterligare ledkoordinater och händelsedata som erhållits genom den tidigare nämnda algoritmen för händelseidentifiering, bedömdes knäböjningen vid uppställningen vara korrekt i 66% av fallen med en medelabsolutfel på 3.7°. Medelabsolutfelet för mätningen av primär ryggradsvinkel vid uppställningen var 10.5°. Laterala böjningen identifierades korrekt i 87% av tillfällena. Detta system belyser potentialen i 2D mänsklig poseuppskattning för svinganalys. Detta examensarbete fokuserade främst på att utforska tillvägagångssättets genomförbarhet och ytterligare forskning behövs för att utveckla systemet och förbättra dess noggrannhet. Detta arbete är grundläggande och ger värdefulla insikter för framtida forskning inom området för svinganalys baserad på 2D mänsklig poseuppskattning. 
Golf Human Pose Estimation Sports Analytics Computer Vision Golf Mänsklig Poseuppskattning Sportanalys Datorseende Sport and Fitness Sciences Idrottsvetenskap Computer Systems Datorsystem Medical Image Processing Medicinsk bildbehandling
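As an illustration of how a joint angle such as the knee flexion in record 289 might be measured from 2D keypoints, the following is a minimal sketch. It is not the thesis's algorithm: the choice of hip, knee and ankle keypoints, the convention that a straight leg has 0° flexion, and the function names are assumptions made here.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle in degrees at keypoint b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints coincide; angle is undefined")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


def knee_flexion_deg(hip, knee, ankle):
    """Flexion as deviation from a straight leg (assumed convention)."""
    return 180.0 - joint_angle_deg(hip, knee, ankle)
```

Because the keypoints are 2D image coordinates, an angle computed this way is a projection of the true joint angle and depends on the camera viewpoint.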
290	Deep Image Processing with Spatial Adaptation and Boosted Efficiency & Supervision for Accurate Human Keypoint Detection and Movement Dynamics Tracking Chao Yang Dai (14709547) 31 May 2023 (has links) This thesis aims to design and develop the spatial adaptation approach through spatial transformers to improve the accuracy of human keypoint recognition models. We have studied different model types and design choices to gain an accuracy increase over models without spatial transformers and analyzed how spatial transformers increase the accuracy of predictions. A neural network called Widenet has been leveraged as a specialized network for providing the parameters for the spatial transformer. Further, we have evaluated methods to reduce the model parameters, as well as the strategy to enhance the learning supervision for further improving the performance of the model. Our experiments and results have shown that the proposed deep learning framework can effectively detect the human key points, compared with the baseline methods. Also, we have reduced the model size without significantly impacting the performance, and the enhanced supervision has improved the performance. This study is expected to greatly advance the deep learning of human key points and movement dynamics. Computer vision Deep learning computer vision method Artificial intelligence HUMAN POSE ESTIMATION human keypoint estimation Deep Learning (DL) spatial transformers Machine Learning (ML)
