Global ETD Search

1	Towards Efficient Convolutional Neural Architecture Design Richter, Mats L. 10 May 2022 The design and adjustment of convolutional neural network architectures is an opaque and mostly trial-and-error-driven process. The main reasons for this are the lack of proper paradigms, beyond general conventions, for the development of neural network architectures and the lack of effective insights into the models that can be propagated back to design decisions. In order for the task-specific design of deep learning solutions to become more efficient and goal-oriented, novel design strategies need to be developed that are founded on an understanding of convolutional neural network models. This work develops tools for the analysis of the inference process in trained neural network models. Based on these tools, characteristics of convolutional neural network models are identified that can be linked to inefficiencies in predictive and computational performance. Based on these insights, this work presents methods for effectively diagnosing these design faults before and during training with little computational overhead. These findings are empirically tested and demonstrated on architectures with sequential and multi-pathway structures, covering all the common types of convolutional neural network architectures used for classification. Furthermore, this work proposes simple optimization strategies that allow for goal-oriented and informed adjustment of the neural architecture, opening the potential for a less trial-and-error-driven design process. 54.72 - Künstliche Intelligenz 54.74 - Maschinelles Sehen I.2.10 - Vision and Scene Understanding I.5.2 - Design Methodology ddc:004
2	Context-aware anchoring, semantic mapping and active perception for mobile robots Günther, Martin 30 November 2021 An autonomous robot that acts in a goal-directed fashion requires a world model of the elements that are relevant to the robot's task. In real-world, dynamic environments, the world model has to be created and continually updated from uncertain sensor data. The symbols used in plan-based robot control have to be anchored to detected objects. Furthermore, robot perception is not only a bottom-up and passive process: Knowledge about the composition of compound objects can be used to recognize larger-scale structures from their parts. Knowledge about the spatial context of an object and about common relations to other objects can be exploited to improve the quality of the world model and can inform an active search for objects that are missing from the world model. This thesis makes several contributions to address these challenges: First, a model-based semantic mapping system is presented that recognizes larger-scale structures like furniture based on semantic descriptions in an ontology. Second, a context-aware anchoring process is presented that creates and maintains the links between object symbols and the sensor data corresponding to those objects while exploiting the geometric context of objects. Third, an active perception system is presented that actively searches for a required object while being guided by the robot's knowledge about the environment. Anchoring Semantic Mapping Active Perception Robotics Artificial Intelligence Context Object Search Robotik Künstliche Intelligenz 54.72 - Künstliche Intelligenz 54.74 - Maschinelles Sehen I.2.9 - Robotics I.2.10 - Vision and Scene Understanding ddc:004
3	Time-Dependent Data: Classification and Visualization Tanisaro, Pattreeya 14 November 2019 Analyzing immense volumes of data in space and time is a challenging task. For this thesis, time-dependent data has been explored in various directions. The studies focused on data visualization, feature extraction, and data classification. The data used in the studies comes from various well-recognized archives and has been the basis of much prior research. The data characteristics ranged from univariate to multivariate time series, and from hand gestures to unconstrained views of general human movements. The experiments covered more than one hundred datasets. In addition, we discussed the applications of visual analytics to video data. Two approaches were proposed to create a feature vector for time-dependent data classification. One is designed especially for a bio-inspired model for human motion recognition and the other is a subspace-based approach for arbitrary data characteristics. The feature vectors extracted by the proposed approaches can be easily visualized in two-dimensional space. For the classification, we experimented with various known models and offered a simple model using data in subspaces for lightweight computation. Furthermore, this method allows a data analyst to inspect feature vectors and, at the same time, detect anomalies in a large collection of data. Various classification techniques were compared and the findings were summarized. Hence, the studies can assist a researcher in picking an appropriate technique when setting up a model for a given characteristic of temporal data, and offer a new perspective for analyzing time series data. This thesis comprises two parts. The first part gives an overview of time-dependent data and of this thesis with its focus on classification; the second part covers the collection of seven publications. Time-Dependent Data Time Series Human Motion Recognitions Classification Visualization 54.72 - Künstliche Intelligenz 54.74 - Maschinelles Sehen I.5.0 - General I.2.10 - Vision and Scene Understanding 42.30.Sy - Pattern recognition ddc:004
4	Interactive 3D Reconstruction / Interaktive 3D-Rekonstruktion Schöning, Julius 23 May 2018 Applicable image-based reconstruction of three-dimensional (3D) objects offers many interesting industrial as well as private use cases, such as augmented reality, reverse engineering, 3D printing and simulation tasks. Unfortunately, image-based 3D reconstruction is not yet applicable to these quite complex tasks, since the resulting 3D models are single, monolithic objects without any division into logical or functional subparts. This thesis aims at making image-based 3D reconstruction feasible such that captures from standard cameras can be used for creating functional 3D models. The research presented here does not focus on the fine-tuning of algorithms to achieve minor improvements, but evaluates the entire processing pipeline of image-based 3D reconstruction and tries to contribute at four critical points, where significant improvement can be achieved by advanced human-computer interaction: (i) As the starting point of any 3D reconstruction process, the object of interest (OOI) that should be reconstructed needs to be annotated. For this task, novel pixel-accurate OOI annotation as an interactive process is presented, and an appropriate software solution is released. (ii) To improve the interactive annotation process, traditional interface devices, like mouse and keyboard, are supplemented with human sensory data to achieve closer user interaction. (iii) In practice, a major obstacle is the lack, so far, of a standard file format for annotation, which leads to numerous proprietary solutions. Therefore, a uniform standard file format is implemented and used for prototyping the first gaze-improved computer vision algorithms. As a sideline of this research, analogies between the close interaction of humans and computer vision systems and 3D perception are identified and evaluated. (iv) Finally, to reduce the processing time of the underlying algorithms used for 3D reconstruction, the ability of artificial neural networks to reconstruct 3D models of unknown OOIs is investigated. In summary, the improvements gained show that applicable image-based 3D reconstruction is within reach but is currently feasible only with supporting human-computer interaction. Two software solutions, one for visual video analytics and one for spare part reconstruction, are implemented. In the future, automated 3D reconstruction that produces functional 3D models can be reached only when algorithms become capable of acquiring semantic knowledge. Until then, the world knowledge provided to the 3D reconstruction pipeline by human-computer interaction is indispensable. 3D reconstruction object annotation human-machine-interaction user in the loop computer vision CAD-ready 3D-Rekonstruktion Maschinelles Sehen 54.74 - Maschinelles Sehen 54.72 - Künstliche Intelligenz I.4.5 - Reconstruction I.4.6 - Segmentation I.2.10 - Vision and Scene Understanding 07.05.Wr - Computer interfaces ddc:004 ddc:620
