21

Konstrukce 3D skeneru pro výukové účely / Design of a 3D scanner for educational purposes

Paprsek, Adam January 2018 (has links)
The aim of this semestral project is a survey of methods for 3D scanning of objects. The first chapter covers the uses of 3D scans and their types, grouping scanners into two basic categories: contact and contactless. The second chapter is dedicated to 3D model measurement principles, which are described in detail in four sub-chapters. The following chapter describes the design and complete realization of a 3D scanner, together with the implementation of an application for controlling it. The last part of the thesis concerns editing the 3D model and evaluating the method.
22

Indoor 3D Scene Understanding Using Depth Sensors

Lahoud, Jean 09 1900 (has links)
One of the main goals in computer vision is to achieve a human-like understanding of images. Nevertheless, image understanding has mainly been studied in the 2D image plane, and additional information is needed to relate it to the 3D world. With the emergence of 3D sensors (e.g. the Microsoft Kinect), which provide depth along with color information, the task of propagating 2D knowledge into 3D becomes more attainable and enables interaction between a machine (e.g. a robot) and its environment. This dissertation focuses on three aspects of indoor 3D scene understanding: (1) 2D-driven 3D object detection for single-frame scenes with inherent 2D information, (2) 3D object instance segmentation for 3D reconstructed scenes, and (3) using room and floor orientation for automatic labeling of indoor scenes, which could be used for self-supervised object segmentation. These methods make it possible to capture the physical extents of 3D objects, such as their sizes and actual locations within a scene.
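A rough sketch of the 2D-to-3D propagation described above (not the author's code; the depth frame, detection box, and Kinect-style intrinsics below are assumed for illustration): a 2D detection is lifted into a 3D point set by back-projecting its pixels through the pinhole camera model, from which physical extents follow directly.

```python
import numpy as np

def box_to_points(depth, box, fx, fy, cx, cy):
    """Back-project pixels inside a 2D box (u0, v0, u1, v1) into 3D camera
    coordinates, using a metric depth map and pinhole intrinsics."""
    u0, v0, u1, v1 = box
    us, vs = np.meshgrid(np.arange(u0, u1), np.arange(v0, v1))
    z = depth[vs, us]
    valid = z > 0                       # Kinect-style sensors report 0 where depth is missing
    z = z[valid]
    x = (us[valid] - cx) * z / fx       # pinhole model: X = (u - cx) * Z / fx
    y = (vs[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)  # N x 3 points inside the detection

# Hypothetical usage: a synthetic 640x480 depth frame and a detector box.
depth = np.random.uniform(0.5, 4.0, (480, 640)).astype(np.float32)
pts = box_to_points(depth, (200, 150, 320, 300), fx=525.0, fy=525.0, cx=319.5, cy=239.5)
print(pts.min(axis=0), pts.max(axis=0))  # physical extent of the detected object
```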
23

Hand gesture recognition using sEMG and deep learning

Nasri, Nadia 17 June 2021 (has links)
24

Contributions to 3D object recognition and 3D hand pose estimation using deep learning techniques

Gomez-Donoso, Francisco 18 September 2020 (has links)
In this thesis, a study of two blooming fields within artificial intelligence is carried out. The first part of the document concerns 3D object recognition methods. Object recognition, in general, is about giving an intelligent system the ability to understand which objects appear in its input data. Any robot, from industrial to social robots, could benefit from such a capability to improve its performance and carry out high-level tasks. This topic has been studied extensively, and some state-of-the-art object recognition methods outperform humans in terms of accuracy. Nonetheless, these methods are image-based, that is, they focus on recognizing visual features. This can be a problem in some contexts, since there exist objects that look like other, different objects: for instance, a social robot that recognizes a face in a picture, or an intelligent car that recognizes a pedestrian on a billboard. A potential solution is to involve three-dimensional data, so that systems focus not on visual features but on topological features. Thus, this thesis carries out a study of 3D object recognition methods. The approaches proposed in this document, which take advantage of deep learning, take point clouds as input and are able to output the correct category. We evaluated the proposals on a range of public challenges, datasets, and real-life data with high success. The second part of the thesis concerns hand pose estimation, which aims to recover the hand's kinematics. A range of systems, from human-computer interaction and virtual reality to social robots, could benefit from such a capability: for instance, to control a computer with seamless hand gestures, or to interact with a social robot that understands human non-verbal communication. Accordingly, this document proposes hand pose estimation approaches that take color images as input and provide 2D and 3D hand poses in the image plane and in Euclidean coordinate frames. Specifically, each hand pose is encoded as a collection of points representing the hand's joints, from which the full hand pose can easily be reconstructed. The methods are evaluated on custom and public datasets and integrated into a robotic hand teleoperation application with great success.
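As a sketch of the kind of point-cloud classifier this abstract alludes to, here is a minimal PointNet-style network in PyTorch; the layer sizes and class count are illustrative, not the architecture actually proposed in the thesis. The key idea is a shared per-point MLP followed by a symmetric max-pool, making the prediction invariant to point order.

```python
import torch
import torch.nn as nn

class PointCloudClassifier(nn.Module):
    """Minimal PointNet-style classifier: per-point features, then a
    symmetric max-pool so the result is invariant to point ordering."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.point_mlp = nn.Sequential(          # 1x1 convs act as a shared per-point MLP
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points):                   # points: (batch, 3, num_points)
        feats = self.point_mlp(points)           # (batch, 1024, num_points)
        global_feat = feats.max(dim=2).values    # order-invariant global descriptor
        return self.head(global_feat)            # class logits

model = PointCloudClassifier(num_classes=10)
cloud = torch.randn(2, 3, 2048)                  # two clouds of 2048 xyz points
print(model(cloud).shape)                        # torch.Size([2, 10])
```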
25

Optimising 3D object destruction tools for improved performance and designer efficiency in video game development

Forslund, Elliot January 2023 (has links)
Background. In video game development, efficient destruction tools and workflows were crucial for creating engaging gaming environments. This study delved into the fundamental principles of 3D object properties and interactions, reviewed existing destruction techniques, and offered insights into their practical application, with a specific focus on Embark Studios' destruction tool. Objectives. This study focused on the optimisation of an existing destruction tool to enhance efficiency and integration within a gaming company's pipeline. The key objectives included reducing execution time and improving designer workflow. The study utilised performance counters and Unreal Insights profiling to identify and optimise hotspots in the destruction tool. Additionally, the performance of the optimised tool was measured and compared to the existing one to quantify efficiency improvements. An expert evaluation with designers at Embark Studios was conducted to assess the impact of the optimised tool on their workflow. Methods. The existing destruction tool was optimised primarily through parallelisation. The efficiency of the optimised tool was evaluated both empirically, by measuring the execution time, and subjectively, through an expert evaluation involving three professional level designers. Results. The optimisation significantly reduced the execution time of the destruction tool. Feedback from the expert evaluation indicated that the optimised tool could enhance designer efficiency, particularly in rebuilding the destruction graphs. However, the performance of the optimised tool was found to be hardware-dependent, with varying execution times observed across different hardware configurations. Conclusions. This study presented an optimised destruction tool which demonstrated improved performance and efficiency, validating its suitability for integration into the game development pipeline. It was proposed that future work could further optimise the tool and explore its performance across diverse hardware configurations.
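The tool itself is proprietary, so only the optimisation strategy the abstract names, parallelisation of independent work items, can be sketched here. In the Python sketch below, `fracture_mesh_chunk` is a hypothetical stand-in for one independent destruction-graph rebuild step, and, as the thesis observes, the measured speed-up will vary with the hardware.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def fracture_mesh_chunk(chunk_id: int) -> int:
    """Hypothetical stand-in for one expensive, independent rebuild step."""
    sum(i * i for i in range(200_000))   # simulated CPU-bound work
    return chunk_id

if __name__ == "__main__":               # required for process pools on some platforms
    chunks = list(range(32))

    start = time.perf_counter()
    [fracture_mesh_chunk(c) for c in chunks]              # serial baseline
    serial = time.perf_counter() - start

    start = time.perf_counter()
    with ProcessPoolExecutor() as pool:                   # fan out across CPU cores
        list(pool.map(fracture_mesh_chunk, chunks))
    parallel = time.perf_counter() - start

    print(f"serial: {serial:.2f}s  parallel: {parallel:.2f}s")
```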
26

Object registration in semi-cluttered and partial-occluded scenes for augmented reality

Gao, Q.H., Wan, Tao Ruan, Tang, W., Chen, L. 26 November 2018 (has links)
This paper proposes a stable and accurate object registration pipeline for markerless augmented reality applications. We present two novel algorithms for object recognition and matching that improve registration accuracy from model-to-scene transformation via point cloud fusion. Whilst the first algorithm effectively deals with simple scenes with few object occlusions, the second handles cluttered scenes with partial occlusions for robust real-time object recognition and matching. The computational framework includes a locally supported Gaussian weight function to enable repeatable detection of 3D descriptors. We apply bilateral filtering and outlier removal to preserve point cloud edges and remove interference points, in order to increase matching accuracy. Extensive experiments have been carried out to compare the proposed algorithms with the four most widely used methods. Results show improved performance of the algorithms in terms of computational speed, camera tracking, and object matching errors in semi-cluttered and partially occluded scenes. / Funded by the Shanxi Natural Science and Technology Foundation of China (grant numbers 2016JZ026 and 2016KW-043).
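The paper's own algorithms (including the locally supported Gaussian weight function) are not reproduced here; the sketch below shows only the generic filter-then-register pipeline they refine, using Open3D's statistical outlier removal and point-to-point ICP on synthetic data.

```python
import numpy as np
import open3d as o3d

# Synthetic setup: a model cloud and a slightly translated copy as the "scene".
model = o3d.geometry.PointCloud()
model.points = o3d.utility.Vector3dVector(np.random.rand(2000, 3))
T_true = np.eye(4)
T_true[:3, 3] = [0.02, -0.01, 0.03]      # ground-truth model-to-scene transform
scene = o3d.geometry.PointCloud(model).transform(T_true)

# Outlier removal before matching, in the spirit of the paper's preprocessing.
scene, _ = scene.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

# Point-to-point ICP estimates the model-to-scene transformation.
result = o3d.pipelines.registration.registration_icp(
    model, scene, max_correspondence_distance=0.1, init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())
print(result.transformation)             # should roughly recover T_true
```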
27

3D Position Estimation using Deep Learning

Pedrazzini, Filippo January 2018 (has links)
Estimating the 3D position of an object is one of the most important topics in the computer vision field. Since the final aim is to create automated solutions that can localize and detect objects from images, new high-performing models and algorithms are needed. Due to the lack of relevant information in single 2D images, approximating the 3D position is a complex problem. This thesis describes a method based on two deep learning models, the image net and the temporal net, that together tackle this task. The former is a deep convolutional neural network intended to extract meaningful features from the images, while the latter exploits temporal information to reach a more robust prediction. This solution achieves a lower mean absolute error than existing computer vision methods under different conditions and configurations. A new data-driven pipeline has been created to process 2D videos and extract the 3D information of an object. The same architecture can be generalized to different domains and applications.
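A minimal sketch of the two-model design described above, assuming a per-frame CNN ("image net") feeding an LSTM ("temporal net"); every layer size here is invented for illustration, and the thesis's actual networks are certainly different.

```python
import torch
import torch.nn as nn

class ImageNetFeatures(nn.Module):
    """Toy stand-in for the 'image net': a small CNN mapping a frame to a vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, x):                        # x: (batch, 3, H, W)
        return self.fc(self.conv(x).flatten(1))

class TemporalNet(nn.Module):
    """Toy 'temporal net': an LSTM over per-frame features, regressing (x, y, z)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.image_net = ImageNetFeatures(feat_dim)
        self.lstm = nn.LSTM(feat_dim, 64, batch_first=True)
        self.out = nn.Linear(64, 3)

    def forward(self, frames):                   # frames: (batch, time, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.image_net(frames.flatten(0, 1)).view(b, t, -1)
        hidden, _ = self.lstm(feats)
        return self.out(hidden[:, -1])           # 3D position from the last time step

video = torch.randn(2, 8, 3, 64, 64)             # two clips of 8 frames each
print(TemporalNet()(video).shape)                # torch.Size([2, 3])
```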
28

Deep-learning Approaches to Object Recognition from 3D Data

Chen, Zhiang 30 August 2017 (has links)
No description available.
29

Wavelet-enhanced 2D and 3D Lightweight Perception Systems for autonomous driving

Alaba, Simegnew Yihunie 10 May 2024 (has links)
Autonomous driving requires lightweight and robust perception systems that can rapidly and accurately interpret the complex driving environment. This dissertation investigates the transformative capacity of the discrete wavelet transform (DWT), inverse DWT, CNNs, and transformers as foundational elements for developing lightweight perception architectures for autonomous vehicles. The inherent properties of the DWT, including its invertibility, sparsity, time-frequency localization, and ability to capture multi-scale information, provide a useful inductive bias, while transformers capture long-range dependencies between features. By harnessing these attributes, novel wavelet-enhanced deep learning architectures are introduced. The first contribution is a lightweight backbone network that can be employed for real-time processing. This network balances processing speed and accuracy, outperforming established models like ResNet-50 and VGG16 in terms of accuracy while remaining computationally efficient. Moreover, a multiresolution attention mechanism is introduced for CNNs to enhance feature extraction, directing the network's focus toward crucial features while suppressing less significant ones. Likewise, a transformer model is proposed that combines the properties of the DWT with vision transformers. The proposed wavelet-based transformer applies the convolution theorem in the frequency domain to mitigate the computational burden that multi-head self-attention places on vision transformers. Furthermore, a proposed wavelet-multiresolution-analysis-based 3D object detection model exploits the DWT's invertibility, ensuring comprehensive capture of environmental information. Lastly, a multimodal fusion model is presented to use information from multiple sensors. Every sensor has limitations, and no single sensor fits all applications; multimodal fusion is therefore proposed to exploit the strengths of each. Using a transformer to capture long-range feature dependencies, this model effectively fuses the depth cues from LiDAR with the rich texture derived from cameras, integrating backbone networks and transformers to achieve lightweight and competitive results for 3D object detection. Moreover, the proposed model utilizes various network optimization methods, including pruning, quantization, and quantization-aware training, to minimize the computational load while maintaining optimal performance. Experimental results across various datasets for classification networks, attention mechanisms, 3D object detection, and multimodal fusion indicate a promising direction for developing a lightweight and robust perception system for robotics, particularly autonomous driving.
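Two of the DWT properties the dissertation leans on, invertibility and multi-scale downsampling, can be demonstrated with PyWavelets; this is a generic illustration, not the dissertation's models.

```python
import numpy as np
import pywt

# A 2D DWT splits an image into a low-frequency approximation plus three
# detail bands, each at quarter resolution -- one source of "lightweight"
# processing, since later layers see far fewer pixels per band.
image = np.random.rand(256, 256).astype(np.float32)
cA, (cH, cV, cD) = pywt.dwt2(image, 'haar')      # approximation + H/V/D details
print(cA.shape)                                  # (128, 128)

# Invertibility (what the 3D detection model exploits): the original image
# is recovered exactly from the four sub-bands.
recon = pywt.idwt2((cA, (cH, cV, cD)), 'haar')
print(np.allclose(image, recon, atol=1e-5))      # True
```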
30

Tecnologia para o reconhecimento do formato de objetos tri-dimensionais. / Three dimensional shape recognition technology.

Gonzaga, Adilson 05 July 1991 (has links)
We present in this work a new method for three-dimensional shape recognition. Traditional computer vision systems use two-dimensional TV camera images, rich in the detail needed by human vision. In most industrial robotics applications this excess of detail is needless, and traditional classification algorithms spend a lot of time processing the excess information. For the present work we developed a dedicated recognition system, which deflects a laser beam over an object and digitizes the reflected beam point by point over the surface. The intensity of the reflected beam is proportional to the distance from the observer, which makes it possible to establish features for classifying various objects. These features are the slope of the polyhedral surfaces, the boundary type, and the inner edges. For each object the features are labeled, and the classification algorithm searches a "knowledge data base" for the object description. The recognition system used a He-Ne laser, and the reflected signal was captured by a photo-transistor. The object to be recognized is placed on a rotating table, which can be turned to supply a new view for classification. A microcomputer controls the system operation, and the object is recognized in real time. The recognized objects were simple regular polyhedra: a triangular-base prism, a cube, a triangular-base pyramid, and a rectangular-base pyramid. To verify the proposed technology, we used a dedicated mathematical approach, which can be extended to other surfaces, such as curved ones, in future work.
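As a modern re-creation of the slope feature the thesis extracts (hypothetical; the original 1991 system measured reflected laser intensity rather than a dense range image), planar faces and inner edges can be read off the gradients of a depth map:

```python
import numpy as np

def face_slopes(depth):
    """Per-pixel surface slope of a range image: planar polyhedron faces show
    up as regions of constant gradient magnitude, and inner edges as abrupt
    changes in it -- the features the thesis labels for classification."""
    gy, gx = np.gradient(depth)             # depth change per pixel, row/column-wise
    slope = np.hypot(gx, gy)                # gradient magnitude = steepness of a face
    edges = np.hypot(*np.gradient(slope))   # second derivative peaks at inner edges
    return slope, edges

# Synthetic range image of a wedge: two planar faces meeting at a ridge.
u = np.linspace(-1.0, 1.0, 100)
depth = 2.0 - np.abs(u)[None, :] * np.ones((100, 1))
slope, edges = face_slopes(depth)
print(np.unique(np.round(slope, 3)))        # near-constant slope on each face
print(edges.max() > 10 * edges.mean())      # the ridge stands out as an inner edge
```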
