  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
321

A Higher-Fidelity Approach to Bridging the Simulation-Reality Gap for 3-D Object Classification

Feydt, Austin Pack 26 August 2019
No description available.
322

SORTED : Serial manipulator with Object Recognition Trough Edge Detection

Bodén, Rikard, Pernow, Jonathan January 2019
Today, there is an increasing demand for smart robots that can make decisions on their own and cooperate with humans in changing environments. The application areas for robotic arms with camera vision are likely to grow as artificial intelligence advances and algorithms become more adaptable and intelligent. The purpose of this bachelor’s thesis is to develop a robotic arm that recognises arbitrarily placed objects with camera vision and can pick and place them when they appear in unpredictable positions. The robotic arm has three degrees of freedom, and its construction is modularised and 3D-printed for ease of maintenance and so that it can be adapted to new applications. The camera vision sensor is integrated in an external camera tripod with its field of view over the workspace. It recognises objects through colour filtering and uses an edge detection algorithm to return measurements of detected objects. The measurements are then used as input to the inverse kinematics, which calculates the rotation of each stepper motor. Moreover, three angular potentiometers are integrated in the axes to regulate the rotation of each stepper motor. The results show that the robotic arm is able to pick up to 90% of the detected objects when barrel distortion correction is used in the algorithm. The main finding is that barrel distortion, introduced by the camera lens, significantly impacts the precision of the robotic arm and thus the results. The method for barrel distortion correction is also affected by the geometry of detected objects and by differences in illumination over the workspace. Another conclusion is that correct illumination is needed for the vision sensor to differentiate objects with different hue and saturation.
/ Today, demand is increasing for smart robots that can make their own decisions and cooperate with humans in changing environments. The application areas for robots with camera sensors will probably grow in a future of artificial intelligence, with algorithms that become more intelligent and adaptable than before. The purpose of this bachelor's thesis is to develop a robotic arm that, with the help of a camera sensor, can pick up and sort arbitrary objects when they appear in unpredictable positions. The robotic arm has three degrees of freedom, and the entire construction is 3D-printed and modularised to be easy to maintain, but also adaptable to new application areas. The camera sensor is integrated in an external camera tripod with its field of view over the robotic arm's workspace. The camera sensor detects objects using a colour filtering algorithm and then returns the size, position and signature of the objects using an edge detection algorithm. The size of the objects is used to calibrate the camera and compensate for the radial distortion of the lens. The relative position of the objects is then used in the inverse kinematics to calculate how much each stepper motor should rotate to obtain the desired angle on each axis so that the gripper can reach the detected object. The robotic arm also has three potentiometers integrated in the axes to regulate the rotation of each stepper motor. The results in this report show that the robotic arm can detect and pick up to 90% of the objects when camera calibration is used in the algorithm. The conclusion of the report is that the distortion from the camera lens has the greatest impact on the robotic arm's precision and thus on the result. It can also be noted that the method used to correct the camera distortion is affected by the geometry and orientation of the objects to be detected, but above all by variations in illumination and shadows over the workspace.
Another conclusion is that the illumination over the workspace is decisive for whether the camera sensor can distinguish objects with different saturation and hue.
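The lens correction mentioned in this abstract can be illustrated with a one-parameter radial model. This is a minimal sketch under stated assumptions, not the thesis's implementation: the function name, the coefficient `k1` and the image centre `(cx, cy)` are hypothetical and introduced here only for illustration.

```python
def undistort_point(x_d, y_d, cx, cy, k1):
    """Correct one pixel with a one-parameter radial model (illustrative assumption).

    (x_d, y_d): distorted pixel coordinates
    (cx, cy):   assumed distortion centre, e.g. the image centre
    k1:         assumed radial coefficient; k1 > 0 pushes points outward,
                the direction needed to undo barrel distortion in this model
    """
    dx, dy = x_d - cx, y_d - cy
    r2 = dx * dx + dy * dy      # squared distance from the centre
    scale = 1.0 + k1 * r2       # the correction grows with the radius
    return cx + dx * scale, cy + dy * scale


# The centre pixel is unchanged; a pixel 100 px to the right moves outward.
print(undistort_point(320, 240, 320, 240, 1e-6))
print(undistort_point(420, 240, 320, 240, 1e-6))
```

In practice the coefficient would be estimated from objects of known size, which is consistent with the abstract's remark that object size is used for calibration.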
323

A Deep Learning Based Approach to Object Recognition from LiDAR Data Along Swedish Railroads / En djupinlärningsbaserad metod för objektigenkänning längs svensk järnväg

Morast, Egil January 2022
Malfunctions in the overhead contact line system are a common cause of disturbances in train traffic in Sweden. Because the current preventive methods are inefficient, the Swedish Transport Administration has stated the need to develop its railroad maintenance services and has identified Artificial Intelligence (AI) as an important tool for this undertaking. Light Detection and Ranging (LiDAR) is a remote sensing technology that has been gaining popularity in recent years due to its high ranging accuracy and decreasing data acquisition cost. LiDAR is commonly used within the railroad industry, and companies such as WSP collect large amounts of data through LiDAR measurements every year. There is currently no reliable, fully automatic method to process point cloud data. Several studies propose innovative methods based on traditional machine learning to extract railroad system components from point clouds and have done so with good results. However, these methods have limited applicability to real-world problems, as they build upon hand-crafted features based on previous knowledge of the data to which they are applied. Deep learning may be a better alternative for the task, as it does not require the same amount of human interaction for feature engineering or advance knowledge of the data. This thesis investigates whether contact line poles can be recognized from LiDAR data using the neural network architecture DGCNN. Data from two Swedish railroad lines, Saltsjöbanan and Roslagsbanan, provided by WSP, were used. Point labels were predicted through semantic segmentation, from which objects were distinguished using the clustering algorithm DBSCAN. The network was trained and validated on Saltsjöbanan using k-fold cross-validation and was later tested on Roslagsbanan to simulate the application of trained models to an unknown dataset.
At point level, the network achieved an estimated precision of 0.87 and a recall of 0.89 on the data from Saltsjöbanan, and an estimated precision of 0.92 and a recall of 0.83 on the data from Roslagsbanan. In the object recognition task, the approach achieved an average precision of 0.93 and a recall of 0.998 on the data from Saltsjöbanan, and an average precision of 0.96 and a recall of 1 on the data from Roslagsbanan, indicating that the method can be applied to railroad segments other than the one the network was trained on. Despite not being accurate or reliable enough at point level to be used for thorough inspection of the contact line system, this approach has various applications in object recognition along Swedish railroads. Future research should investigate how adding classes beyond contact line poles would affect the results and what changes can be made to the parameters to optimize performance. A side-by-side comparison with the current methods and with traditional machine learning-based methods would be valuable as well. / Faults in the overhead contact line system are a common cause of disruptions to train traffic in Sweden. Because today's methods for preventing these faults are inefficient, the Swedish Transport Administration (Trafikverket) has expressed the need to develop maintenance work on the Swedish railway and has identified artificial intelligence (AI) as an important tool for that purpose. Light Detection and Ranging (LiDAR) is a remote sensing technology that has become increasingly popular over the years thanks to its high measurement accuracy and ever cheaper data collection. LiDAR is used regularly within the railway industry, and companies such as WSP collect large amounts of data with this technology every year. At present, however, there is no sufficiently reliable automatic method for segmenting and classifying point clouds. Several studies propose solutions based on traditional machine learning for extracting railway components from point cloud data.
Because these methods rely on prior knowledge and carefully developed features to find patterns in the data, they are difficult to apply to real-world problems. Instead, deep learning, which does not require the same prior knowledge or careful mathematical modelling, can be applied. In this work, contact line poles were identified from LiDAR data using the neural network DGCNN. The data used were point cloud data from Saltsjöbanan and Roslagsbanan, supplied by WSP. First, points were classified through semantic segmentation, and from this classification objects could be identified by applying the clustering algorithm DBSCAN. The network was trained using cross-validation on data from Saltsjöbanan and was then tested on data from Roslagsbanan to investigate whether trained models can be applied to other railway lines. On the Saltsjöbanan data, the network achieved an estimated precision of 0.87 and a recall of 0.89 at point level. The corresponding values on the Roslagsbanan data were 0.92 and 0.83. The object recognition method achieved an average precision of 0.93 and a recall of 0.998 on the Saltsjöbanan data, and the corresponding values on the Roslagsbanan data were 0.96 and 1. The result indicates that the method can be applied to other railway lines without specific training for them. Although the method is not accurate enough at point level to be used for thorough inspection of the contact line system, it can be used for object recognition along Swedish railways. Future research should investigate how the result is affected if additional classes beyond contact line poles are used, and which changes should be made to the parameters to optimise the approach studied. A detailed comparison with current methods and with methods based on traditional machine learning would also be valuable.
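The point-level precision and recall figures quoted in this abstract follow the standard definitions, which can be sketched as below. This is an illustrative helper, not code from the thesis: the function name, the label encoding (1 for "contact line pole", 0 for everything else) and the toy labels are assumptions.

```python
def precision_recall(predicted, actual, positive=1):
    """Return (precision, recall) for one class from two equal-length label lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)  # true positives
    fp = sum(1 for p, a in pairs if p == positive and a != positive)  # false positives
    fn = sum(1 for p, a in pairs if p != positive and a == positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy per-point labels (hypothetical): 1 = pole, 0 = background.
predicted = [1, 1, 0, 1, 0]
actual = [1, 0, 0, 1, 1]
print(precision_recall(predicted, actual))
```

At object level the same formulas apply, with detected clusters matched against labelled poles in place of individual points.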
324

Fashion Object Detection and Pixel-Wise Semantic Segmentation : Crowdsourcing framework for image bounding box detection & Pixel-Wise Segmentation

Mallu, Mallu January 2018
Technology has revamped every aspect of our lives, and the fashion industry is one of those facets. Many deep learning architectures are taking shape to augment fashion experiences for everyone, and there are numerous possibilities for enhancing fashion technology with deep learning. One of the key ideas is to generate fashion styles and recommendations using artificial intelligence. Another significant task is to gather reliable information about fashion trends, which includes analysing existing fashion-related images and data. When dealing specifically with images, localisation and segmentation are well-known techniques for in-depth study of the pixels, objects and labels present in an image. In this master's thesis, a complete framework is presented to perform localisation and segmentation on fashionista images. This work is part of a research effort related to fashion style detection and recommendation. The developed solution aims to localise fashion items in an image by drawing bounding boxes and labelling them. It also provides pixel-wise semantic segmentation functionality, which extracts fashion item label-pixel data. The collected data can serve as ground truth as well as training data for the targeted deep learning architecture. A study related to localisation and segmentation of videos is also presented in this work. The developed system has been evaluated in terms of flexibility, output quality and reliability compared with similar platforms. It has proven to be a fully functional solution capable of providing essential localisation and segmentation services while keeping the core architecture simple and extensible. / Technology has renewed every aspect of our lives, and one of those facets is the fashion industry. Many deep learning architectures are taking shape to enhance fashion experiences for everyone. There are many opportunities to improve fashion technology with deep learning.
One of the most important ideas is to generate fashion styles and recommendations using artificial intelligence. Likewise, another important feature is gathering reliable information about fashion trends, which includes analysing existing fashion-related images and data. When dealing specifically with images, localisation and segmentation are well known for enabling in-depth study of the pixels, objects and labels present in the image. This master's project presents a complete framework for performing localisation and segmentation on fashionista images. This work is part of an interesting research effort related to fashion style detection and recommendation. The developed solution aims to make it possible to localise fashion items in an image by drawing bounding boxes and labelling them. In addition, it provides pixel-wise semantic segmentation functionality that extracts fashion item label-pixel data. The collected data can serve as ground truth as well as training data for the targeted deep learning architecture. A study related to localisation and segmentation of videos is also presented in this work. The developed system has been evaluated in terms of flexibility, output quality and reliability compared with similar platforms. It has proven to be a fully functional solution that can provide essential localisation and segmentation services while keeping the core architecture simple and extensible.
325

Multiple Ingredient Dietary Supplement and Protective Effects in Gamma Irradiated Mice

Monster, Kathleen 11 1900
Cognitive impairment, “Chemofog”, has been well established as a negative outcome of otherwise successful medical radiation treatments. Mitigation of this negative feature would dramatically increase quality of life for those recovering from cancer treatment. There is currently no known intervention to protect or restore cognitive function of patients undergoing radiation treatments. Development of a multiple ingredient dietary supplement (MDS) is meant to offer a non-invasive therapy to help mitigate risk and decrease damage to individuals. The MDS was originally designed to off-set 5 key mechanisms associated with aging including oxidative damage, inflammation, impaired glucose metabolism, mitochondrial dysfunction and membrane deterioration. Radiation damage shares many of the same deficiencies that develop with age and supplementation with MDS would impact many of the same pathways. Changes in cytokine profile (inflammation markers), and biomarkers of behavioural functions, sensory functions, and oxidative damage provide preliminary evidence of MDS impacts. / Thesis / Bachelor of Science (BSc)
326

Visual Infrastructure based Accurate Object Recognition and Localization

Yang, Fan 25 August 2017
No description available.
327

Learning Pose and State-Invariant Object Representations for Fine-Grained Recognition and Retrieval

Rohan Sarkar (19065215) 11 July 2024
Object Recognition and Retrieval is a fundamental problem in Computer Vision that involves recognizing objects and retrieving similar object images through visual queries. While deep metric learning is commonly employed to learn image embeddings for solving such problems, the representations learned using existing methods are not robust to changes in viewpoint, pose, and object state, especially for fine-grained recognition and retrieval tasks. To overcome these limitations, this dissertation aims to learn robust object representations that remain invariant to such transformations for fine-grained tasks. First, it focuses on learning dual pose-invariant embeddings to facilitate recognition and retrieval at both the category and finer object-identity levels by learning category and object-identity specific representations in separate embedding spaces simultaneously. For this, the PiRO framework is introduced that utilizes an attention-based dual encoder architecture and novel pose-invariant ranking losses for each embedding space to disentangle the category and object representations while learning pose-invariant features. Second, the dissertation introduces ranking losses that cluster multi-view images of an object together in both the embedding spaces while simultaneously pulling the embeddings of two objects from the same category closer in the category embedding space to learn fundamental category-specific attributes and pushing them apart in the object embedding space to learn discriminative features to distinguish between them. Third, the dissertation addresses state-invariance and introduces a novel ObjectsWithStateChange dataset to facilitate research in recognizing fine-grained objects with state changes involving structural transformations in addition to pose and viewpoint changes.
Fourth, it proposes a curriculum learning strategy to progressively sample object images that are harder to distinguish for training the model, enhancing its ability to capture discriminative features for fine-grained tasks amidst state changes and other transformations. Experimental evaluations demonstrate significant improvements in object recognition and retrieval performance compared to previous methods, validating the effectiveness of the proposed approaches across several challenging datasets under various transformations.
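The pull-together and push-apart behaviour described in this abstract is the general idea behind margin-based ranking losses. The sketch below shows a generic triplet margin loss as an illustration only; it is not the PiRO loss from the dissertation, and the function names, the margin value and the toy embeddings are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet loss: zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings: two views of one object and a different object.
view_a, view_b, other = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
print(triplet_margin_loss(view_a, view_b, other))  # well separated, so no penalty
print(triplet_margin_loss(view_a, other, view_b))  # wrong ordering is penalised
```

Which images serve as positives and negatives depends on the embedding space: in a category space two objects of the same category would be pulled together, while in an object-identity space they would be pushed apart.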
328

Slowness and sparseness for unsupervised learning of spatial and object codes from naturalistic data

Franzius, Mathias 27 June 2008
This doctoral thesis introduces a hierarchical model for unsupervised learning from quasi-natural video sequences. The model is based on the learning principles of slowness and sparseness, for which various approaches and implementations are presented. A variety of neuron types in the hippocampus of rodents and primates encode different aspects of an animal's spatial environment. These include place cells, head direction cells, spatial view cells and grid cells. The main results of this work are based on training the hierarchical model with video sequences from a virtual reality environment. The model reproduces the most important spatial codes from the hippocampus. The type of representations generated depends mainly on the movement statistics of the simulated animal. The model is also applied to the problem of invariant object recognition, using video sequences of simulated clusters of spheres or fish as stimuli. The resulting model representations allow object identity, position and rotation angle in space to be read out independently. / This thesis introduces a hierarchical model for unsupervised learning from naturalistic video sequences. The model is based on the principles of slowness and sparseness. Different approaches and implementations for these principles are discussed. A variety of neuron classes in the hippocampal formation of rodents and primates codes for different aspects of space surrounding the animal, including place cells, head direction cells, spatial view cells and grid cells. In the main part of this thesis, video sequences from a virtual reality environment are used for training the hierarchical model. The behavior of most known hippocampal neuron types coding for space is reproduced by this model.
The type of representations generated by the model is mostly determined by the movement statistics of the simulated animal. The model approach is not limited to spatial coding. An application of the model to invariant object recognition is described, where artificial clusters of spheres or rendered fish are presented to the model. The resulting representations allow a simple extraction of the identity of the object presented as well as of its position and viewing angle.
329

Laser-based detection and tracking of dynamic objects

Wang, Zeng January 2014
In this thesis, we present three main contributions to laser-based detection and tracking of dynamic objects, from both a model-based point of view and a model-free point of view, with an emphasis on applications to autonomous driving. A segmentation-based detector is first proposed to provide an end-to-end detection of the classes car, pedestrian and bicyclist in 3D laser data amongst significant background clutter. We postulate that, for the particular classes considered, solving a binary classification task outperforms approaches that tackle the multi-class problem directly. This is confirmed using custom and third-party datasets gathered of urban street scenes. The sliding window approach to object detection, while ubiquitous in the Computer Vision community, is largely neglected in laser-based object detectors, possibly due to its perceived computational inefficiency. We give a second thought to this opinion in this thesis, and demonstrate that, by fully exploiting the sparsity of the problem, exhaustive window searching in 3D can be made efficient. We prove the mathematical equivalence between sparse convolution and voting, and devise an efficient algorithm to compute exactly the detection scores at all window locations, processing a complete Velodyne scan containing 100K points in less than half a second. Its superior performance is demonstrated on the KITTI dataset, and compares commensurably with state of the art vision approaches. A new model-free approach to detection and tracking of moving objects with a 2D lidar is then proposed aiming at detecting dynamic objects of arbitrary shapes and classes. Objects are modelled by a set of rigidly attached sample points along their boundaries whose positions are initialised with and updated by raw laser measurements, allowing a flexible, nonparametric representation. Dealing with raw laser points poses a significant challenge to data association. 
We propose a hierarchical approach, and present a new variant of the well-known Joint Compatibility Branch and Bound algorithm to handle large numbers of measurements. The system is systematically calibrated on real world data containing 7.5K labelled object examples and validated on 6K test cases. Its performance is demonstrated over an existing industry standard targeted at the same problem domain as well as a classical approach to model-free tracking.
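The equivalence between sparse convolution and voting that this abstract refers to can be checked on a toy grid. The sketch below is an illustration under simplifying assumptions, not the thesis's algorithm: features are scalars rather than vectors, the grid is tiny, and the function names, weights and occupied cells are hypothetical.

```python
from itertools import product


def dense_scores(features, weights, grid_shape, window_shape):
    """Brute force: visit every window origin and sum weight * feature over its cells."""
    scores = {}
    origins = product(*(range(g - w + 1) for g, w in zip(grid_shape, window_shape)))
    for origin in origins:
        total = 0.0
        for offset in product(*(range(w) for w in window_shape)):
            cell = tuple(o + d for o, d in zip(origin, offset))
            total += weights[offset] * features.get(cell, 0.0)  # empty cells add 0
        scores[origin] = total
    return scores


def voting_scores(features, weights, grid_shape, window_shape):
    """Voting: each occupied cell adds weight * feature to every window that contains it."""
    scores = {}
    for cell, value in features.items():
        for offset, w in weights.items():
            origin = tuple(c - d for c, d in zip(cell, offset))
            inside = all(0 <= o <= g - s
                         for o, g, s in zip(origin, grid_shape, window_shape))
            if inside:
                scores[origin] = scores.get(origin, 0.0) + w * value
    return scores


grid, window = (4, 4, 4), (2, 2, 2)
weights = {off: float(1 + sum(off)) for off in product(range(2), repeat=3)}
features = {(0, 0, 0): 1.0, (2, 3, 1): 2.0}  # only two occupied cells
dense = dense_scores(features, weights, grid, window)
votes = voting_scores(features, weights, grid, window)
print(all(abs(dense[o] - votes.get(o, 0.0)) < 1e-9 for o in dense))
```

The voting version loops over occupied cells only, so its cost scales with the number of occupied cells times the window size, not with the number of window positions. This is the sparsity argument made in the abstract.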
330

A three-dimensional representation method for noisy point clouds based on growing self-organizing maps accelerated on GPUs

Orts-Escolano, Sergio 21 January 2014
The research described in this thesis was motivated by the need for a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered, as these sensors are capable of providing a 3D data stream in real time. This thesis proposes the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, it proposes the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models, without considering time constraints. A hardware implementation is proposed that leverages the computing power of modern GPUs and takes advantage of the paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in the area of computer vision, such as the recognition and localization of objects, visual surveillance and 3D reconstruction.
