141

[en] FAST MOTION-ADAPTIVE ESTIMATION ALGORITHM APPLIED TO THE H.264/AVC STANDARD CODER / [pt] ALGORITMO RÁPIDO DE ESTIMAÇÃO ADAPTATIVO AO MOVIMENTO APLICADO AO CODIFICADOR PADRÃO H.264/AVC

GUILHERME MACHADO GOEHRINGER 31 March 2008 (has links)
The motion estimation techniques used in video compression standards allow transmission and storage resources to be used more efficiently, by reducing the number of bits required to represent a video signal while preserving the quality of the content being processed. The objective of this Master's dissertation is to propose a new algorithm capable of reducing the large computational complexity involved in these techniques while keeping the quality of the reconstructed signal. To this end, we present AUMHS (Adaptive Unsymmetrical-cross Multi-Hexagon-grid Search), an algorithm whose main modification relative to UMHS (Unsymmetrical-cross Multi-Hexagon-grid Search) is the implementation of a motion measure that classifies the scenes of a video sequence according to the detected motion, so that the motion estimation parameters and other coder parameters can subsequently be adapted. The result is a significant gain in processing speed, and a consequent reduction in computational cost, while preserving the quality achieved by the leading algorithms in the literature. The algorithm was implemented in the H.264/AVC standard coder, where comparative performance analyses against the UMHS and FSA algorithms were carried out by measuring parameters such as PSNR (Peak Signal-to-Noise Ratio), encoder processing time, motion estimation module processing time, bit rate, and an informal subjective evaluation.
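The abstract names the algorithm family but not its mechanics. For orientation, here is a minimal sketch of exhaustive block matching with a SAD cost, i.e. the FSA baseline mentioned above, assuming 8-bit grayscale frames as NumPy arrays; the block size and search radius are arbitrary example values, not parameters from the thesis.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum()

def full_search(ref, cur, x, y, block=16, radius=7):
    """Exhaustive block matching (FSA): find the motion vector for the block
    of `cur` whose top-left corner is (x, y) by scanning every candidate in a
    (2*radius+1)^2 window of the reference frame `ref`."""
    target = cur[y:y + block, x:x + block]
    best_cost, best_mv = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + block > ref.shape[0] or rx + block > ref.shape[1]:
                continue  # candidate block falls outside the frame
            cost = sad(target, ref[ry:ry + block, rx:rx + block])
            if cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost
```

Fast algorithms such as UMHS cut the candidate count from (2r+1)² to a few dozen by sampling cross- and hexagon-shaped patterns; the adaptive variant proposed in the thesis additionally tunes the search using a per-scene motion measure.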
142

Sensordatenfusion zur robusten Bewegungsschätzung eines autonomen Flugroboters / Sensor data fusion for robust motion estimation of an autonomous flying robot

Wunschel, Daniel 15 March 2012 (has links) (PDF)
A prerequisite for realizing a flight controller for flying robots is the perception of the robot's motion. This thesis describes an approach for estimating the motion of an autonomous flying robot using relatively simple, lightweight, and inexpensive sensors. Using an extended Kalman filter, accelerometers, gyroscopes, an ultrasonic sensor, and an optical-flow sensor are combined into a robust motion estimate. The individual sensors were experimentally examined with respect to the properties relevant to the subsequent design of the filter. Finally, the results of the filter are compared with the results of a simulation and of an external tracking system.
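As a rough illustration of the fusion scheme (not the thesis's actual filter, whose state layout and models are not given in the abstract), here is a minimal extended Kalman filter skeleton; the process and measurement models f, h and their Jacobians F, H are supplied by the caller and are hypothetical here.

```python
import numpy as np

class EKF:
    """Minimal extended Kalman filter: nonlinear process f and measurement h
    with caller-supplied Jacobians F and H (illustrative models only)."""
    def __init__(self, x0, P0, Q, R):
        self.x, self.P, self.Q, self.R = x0, P0, Q, R

    def predict(self, f, F, u):
        self.x = f(self.x, u)                 # propagate state, e.g. integrate IMU rates
        self.P = F @ self.P @ F.T + self.Q    # propagate covariance

    def update(self, z, h, H):
        y = z - h(self.x)                     # innovation, e.g. optical-flow velocity residual
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P
```

In a setup like the one described above, the accelerometers and gyroscopes would drive `predict`, while the ultrasonic altimeter and the optical-flow sensor would each contribute an `update` at their own measurement rate.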
143

Ανάπτυξη αποδοτικών παραμετρικών τεχνικών αντιστοίχισης εικόνων με εφαρμογή στην υπολογιστική όραση / Development of efficient parametric image registration techniques with application to computer vision

Ευαγγελίδης, Γεώργιος 12 January 2009 (has links)
Computer vision has recently been one of the most active research areas in the computing community. Many modern computer vision applications require the solution of the well-known image registration problem, which consists of finding correspondences between projections of the same scene. The majority of registration algorithms adopt a specific parametric transformation model which, applied to one image, provides an approximation of the other. Towards the solution of the stereo correspondence problem, where the goal is the construction of the disparity map, a local differential algorithm is proposed which involves a new similarity criterion, the Enhanced Correlation Coefficient (ECC). This criterion is invariant to linear photometric distortions and results from the incorporation of a single-parameter translation model into the classical correlation coefficient, thus defining a continuous objective function. Although the objective function is nonlinear in the translation parameter, its maximization admits a closed-form solution, saving much computational burden. The proposed algorithm provides accurate results even under nonlinear photometric distortions, and its performance is superior to well-known conventional stereo correspondence techniques. In addition, the proposed technique does not seem to suffer from the pixel-locking effect and outperforms even stereo techniques dedicated to the cancellation of this effect. For the image alignment problem, the maximization of a generalized version of the ECC function that incorporates any 2D warp transformation is proposed. Although this function is highly nonlinear in the warp parameters, an efficient iterative scheme for its maximization is developed. In each iteration of the new scheme, an efficient approximation of the nonlinear objective function is used, leading to a closed-form solution of low computational complexity. Two different iterative schemes are proposed: the Forwards Additive ECC (FA-ECC) and the Inverse Compositional ECC (IC-ECC) algorithm. The proposed schemes are compared with the corresponding schemes (FA-LK and SIC) of the leading Lucas-Kanade algorithm through a series of experiments. The FA-ECC algorithm uses the known additive parameter-update rule, and its computational cost is similar to that of the most widely used FA-LK algorithm. The proposed scheme exhibits increased learning ability, since it converges faster and with higher probability. This superiority is retained even in the presence of additive noise and photometric distortion, as well as in cases of over-modelling the geometric distortion of the images. The IC-ECC algorithm, on the other hand, uses inverse logic, swapping the roles of the two images, and adopts the transformation-composition update rule. As a consequence of these two choices, the complexity per iteration is drastically reduced, making it the most computationally efficient of the four schemes mentioned above; empirical learning curves and probability-of-convergence scores indicate that it performs similarly to SIC. Although FA-ECC seems to be clearly the most robust of these alignment algorithms in realistic conditions, the choice between the two proposed schemes comes down to a trade-off between accuracy and speed.
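As an aside, the ECC criterion developed in this line of work is the basis of the findTransformECC routine that later landed in OpenCV, so the alignment scheme can be exercised in a few lines. A hedged usage sketch; the file names are placeholders, and the iteration count and epsilon are arbitrary example values.

```python
import cv2
import numpy as np

# Align `moving` to `template` (float32 grayscale) with an affine warp.
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
moving = cv2.imread("moving.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

warp = np.eye(2, 3, dtype=np.float32)  # initial guess: identity affine warp
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 100, 1e-6)

# Iteratively maximizes the enhanced correlation coefficient.
ecc_value, warp = cv2.findTransformECC(template, moving, warp,
                                       cv2.MOTION_AFFINE, criteria)
aligned = cv2.warpAffine(moving, warp, template.shape[::-1],
                         flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
```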
144

Estimation du mouvement de la paroi carotidienne en imagerie ultrasonore par une approche de marquage ultrasonore / Motion estimation of the carotid wall in ultrasound imaging using transverse oscillations

Salles, Sébastien 02 October 2015 (has links)
This work focuses on biomedical image processing. The aim of our study is to estimate the mechanical properties of the carotid artery in vivo using ultrasound imaging, in order to detect cardiovascular diseases at an early stage. Over the last decade, researchers have shown interest in studying artery wall motion, especially the motion of the carotid intima-media complex, in order to demonstrate its significance as a marker of atherosclerosis. However, despite recent progress, motion estimation of the carotid wall is still difficult, particularly in the longitudinal direction (the direction parallel to the probe). The development of an innovative method for studying the movement of the carotid artery wall is the main motivation of this thesis.
The three main contributions proposed in this work are i) the development, validation, and clinical evaluation of a novel method for 2D motion estimation of the carotid wall, ii) the development, simulation, and experimental validation of the 3D extension of the proposed estimation method, and iii) the experimental evaluation of the proposed 2D method in ultrafast imaging, for the estimation of the local pulse wave velocity. We propose a motion estimation method combining tagging of the ultrasound images with a motion estimator based on the phase of the ultrasound images. The ultrasonic tagging is produced by means of transverse oscillations. We present two different approaches to introduce these transverse oscillations: a classic approach using a specific apodization function, and a new approach based on filtering that allows their formation to be controlled optimally. The proposed motion estimator uses the 2D analytic phases of the RF images, extracted with the Hahn approach. This thesis shows that, compared with conventional methods, the proposed approach provides more accurate motion estimation in the longitudinal direction, and more generally in directions perpendicular to the beam axis. In addition, the experimental evaluation of our method on ultrafast image sequences from carotid phantoms allowed the local pulse wave velocity to be estimated, the propagation of a longitudinal movement to be demonstrated, and the Young's modulus of the vessel walls to be estimated.
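The abstract does not spell out the phase-based estimator, so the following is only a generic 1D illustration of the underlying idea, estimating a sub-sample delay from the phase of analytic signals; it is not the thesis's 2D transverse-oscillation estimator, and the pulse parameters are invented for the toy check.

```python
import numpy as np
from scipy.signal import hilbert

def phase_delay(sig_a, sig_b, f0, fs):
    """Estimate the sub-sample delay of sig_b relative to sig_a from the phase
    of their analytic (Hilbert) signals; a toy 1D analogue of phase-based
    motion estimation for narrowband signals of known center frequency f0."""
    cross = hilbert(sig_a) * np.conj(hilbert(sig_b))
    dphi = np.angle(cross.mean())        # amplitude-weighted mean phase difference
    return dphi * fs / (2 * np.pi * f0)  # delay in samples

# Toy check: a 5 MHz Gaussian pulse sampled at 50 MHz, delayed by 0.3 samples.
fs, f0 = 50e6, 5e6
t = np.arange(256) / fs
pulse = lambda tt: np.exp(-(((tt - 2.5e-6) / 0.5e-6) ** 2)) * np.cos(2 * np.pi * f0 * tt)
print(phase_delay(pulse(t), pulse(t - 0.3 / fs), f0, fs))  # ~0.3
```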
145

Analýza vlastností stereokamery ZED ve venkovním prostředí / Analysis of ZED stereocamera in outdoor environment

Svoboda, Ondřej January 2019 (has links)
This Master's thesis focuses on analyzing the ZED stereo camera in an outdoor environment. ZEDfu visual odometry is compared with commonly used methods such as GPS and wheel odometry. Moreover, the thesis also analyzes SLAM in a changing outdoor environment. Simultaneous localization and mapping in RTAB-Map was run separately with SIFT and BRISK descriptors. The aim of this thesis is to analyze the behaviour of the ZED camera in an outdoor environment for future implementation in mobile robotics.
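For context on the SIFT/BRISK comparison mentioned above, extracting and matching the two descriptor types side by side takes only a few lines in OpenCV; a sketch with placeholder file names (SIFT ships in the main cv2 module in recent builds, and BRISK's binary descriptors are matched with the Hamming norm).

```python
import cv2

img1 = cv2.imread("frame_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_b.png", cv2.IMREAD_GRAYSCALE)

for name, detector, norm in [("SIFT", cv2.SIFT_create(), cv2.NORM_L2),
                             ("BRISK", cv2.BRISK_create(), cv2.NORM_HAMMING)]:
    kp1, des1 = detector.detectAndCompute(img1, None)
    kp2, des2 = detector.detectAndCompute(img2, None)
    # Cross-checked brute-force matching keeps only mutual best matches.
    matches = cv2.BFMatcher(norm, crossCheck=True).match(des1, des2)
    print(f"{name}: {len(kp1)} keypoints, {len(matches)} matches")
```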
146

Sensordatenfusion zur robusten Bewegungsschätzung eines autonomen Flugroboters / Sensor data fusion for robust motion estimation of an autonomous flying robot

Wunschel, Daniel 24 October 2011 (has links)
A prerequisite for realizing a flight controller for flying robots is the perception of the robot's motion. This thesis describes an approach for estimating the motion of an autonomous flying robot using relatively simple, lightweight, and inexpensive sensors. Using an extended Kalman filter, accelerometers, gyroscopes, an ultrasonic sensor, and an optical-flow sensor are combined into a robust motion estimate. The individual sensors were experimentally examined with respect to the properties relevant to the subsequent design of the filter. Finally, the results of the filter are compared with the results of a simulation and of an external tracking system.
147

Entwicklung und Validierung methodischer Konzepte einer kamerabasierten Durchfahrtshöhenerkennung für Nutzfahrzeuge / Development and validation of methodological concepts for camera-based clearance height detection for commercial vehicles

Hänert, Stephan 03 July 2020 (has links)
The present work deals with the conception and development of a novel advanced driver assistance system for commercial vehicles, which estimates the clearance height of obstacles in front of the vehicle and determines passability by comparison with the adjustable vehicle height.
The image sequences captured by a mono camera are used to create a 3D representation of the driving environment using indirect and direct reconstruction methods. The 3D representation is scaled, and a prediction of the longitudinal and lateral movement of the vehicle is determined, with the aid of a wheel-odometry-based estimate of the vehicle's ego-motion. Based on the vertical elevation profile of the road surface, which is modelled by joining several planes together, the 3D space is classified into driving surface, structure, and potential obstacles. The obstacles within the predicted driving corridor are evaluated with regard to their distance and height. A warning concept derived from this visually and acoustically signals the obstacle in the vehicle's instrument cluster. If the driver does not respond accordingly, emergency braking is applied at critical obstacle heights. The estimated ego-motion and the calculated obstacle parameters are evaluated with the aid of reference sensors: a dGPS-supported inertial measurement unit and a terrestrial as well as a mobile laser scanner. Within the scope of the work, different environmental situations and obstacle types in urban and rural areas are investigated, and statements on the accuracy and reliability of the implemented function are made. A major factor influencing the density and accuracy of the 3D reconstruction is uniform ambient lighting during the image sequence; in this context, the use of an automotive-grade camera is mandatory. The ego-motion determined by wheel odometry is suitable for scaling the 3D point cloud in the low-speed range. The 3D representation itself should be created by a combination of indirect and direct point reconstruction methods: the indirect part supports the initialization phase of the function and enables a robust camera estimate, while the direct method reconstructs a large number of 3D points on the obstacle outlines, which usually contain the lower edge. The lower edge can be detected and tracked at distances of up to 20 m. The biggest factor influencing the accuracy of the clearance height calculation is the modelling of the driving surface. To reduce outliers in the height calculation, the method can be stabilized by using calculations from earlier time steps. As a further stabilization measure, it is also recommended to support the obstacle output to the driver and the automatic emergency brake assistant by means of a hysteresis. The system presented here is suitable for parking and maneuvering operations and is interesting as a cost-effective driver assistance system for cars with superstructures and for light commercial vehicles.
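The hysteresis recommendation at the end of the abstract amounts to a small piece of state logic; a hedged sketch with invented thresholds (the margin and release band are illustrative values, not numbers from the thesis).

```python
def make_passability_monitor(vehicle_height_m, margin_m=0.2, release_m=0.1):
    """Return a stateful check that flags obstacles lower than the vehicle
    height plus a safety margin, with hysteresis so that a clearance estimate
    jittering around the threshold does not toggle the warning on and off."""
    warning_active = False

    def check(clearance_height_m):
        nonlocal warning_active
        if not warning_active and clearance_height_m < vehicle_height_m + margin_m:
            warning_active = True   # trip the warning at the tight threshold
        elif warning_active and clearance_height_m > vehicle_height_m + margin_m + release_m:
            warning_active = False  # release only once clearly clear again
        return warning_active

    return check

monitor = make_passability_monitor(vehicle_height_m=3.5)
for h in [4.0, 3.65, 3.68, 3.72, 3.85]:  # noisy clearance estimates in metres
    print(h, monitor(h))                 # stays active until 3.85
```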
148

Robust Subspace Estimation Using Low-rank Optimization. Theory And Applications In Scene Reconstruction, Video Denoising, And Activity Recognition.

Oreifej, Omar 01 January 2013 (has links)
In this dissertation, we discuss the problem of robust linear subspace estimation using low-rank optimization and propose three formulations of it. We demonstrate how these formulations can be used to solve fundamental computer vision problems, and provide superior performance in terms of accuracy and running time. Consider a set of observations extracted from images (such as pixel gray values, local features, trajectories, etc.). If the assumption that these observations are drawn from a linear subspace (or can be linearly approximated) is valid, then the goal is to represent each observation as a linear combination of a compact basis while maintaining a minimal reconstruction error. One of the earliest, yet most popular, approaches to achieve that is Principal Component Analysis (PCA). However, PCA can only handle Gaussian noise, and thus suffers when the observations are contaminated with gross and sparse outliers. To this end, in this dissertation we focus on estimating the subspace robustly using low-rank optimization, where the sparse outliers are detected and separated through the ℓ1 norm. The robust estimation has a two-fold advantage: first, the obtained basis better represents the actual subspace, because it does not include contributions from the outliers; second, the detected outliers are often of specific interest in many applications, as we show throughout this thesis. We demonstrate four different formulations and applications of low-rank optimization.
First, we consider the problem of reconstructing an underwater sequence by removing the turbulence caused by the water waves. The main drawback of most previous attempts to tackle this problem is that they depend heavily on modelling the waves, which is in fact ill-posed, since the actual behavior of the waves, along with the imaging process, is complicated and includes several noise components; therefore, their results are not satisfactory. In contrast, we propose a novel approach which outperforms the state of the art. The intuition behind our method is that in a sequence where the water is static, the frames would be linearly correlated. Therefore, in the presence of water waves, we may consider the frames as noisy observations drawn from the subspace of linearly correlated frames. However, the noise introduced by the water waves is not sparse, and thus cannot be detected directly using low-rank optimization. Therefore, we propose a data-driven two-stage approach, where the first stage "sparsifies" the noise and the second stage detects it. The first stage leverages the temporal mean of the sequence to overcome the structured turbulence of the waves through an iterative registration algorithm. The result of the first stage is a high-quality mean and a better-structured sequence; however, the sequence still contains unstructured sparse noise. Thus, we employ a second stage in which we extract the sparse errors from the sequence through rank minimization. Our method converges faster and drastically outperforms the state of the art on all testing sequences.
Second, we consider a closely related situation where an independently moving object is also present in the turbulent video. More precisely, we consider video sequences acquired in desert battlefields, where atmospheric turbulence is typically present in addition to independently moving targets. Typical approaches for turbulence mitigation follow averaging or de-warping techniques. Although these methods can reduce the turbulence, they distort the independently moving objects, which can often be of great interest. Therefore, we address the problem of simultaneous turbulence mitigation and moving object detection. We propose a novel three-term low-rank matrix decomposition approach in which we decompose the turbulent sequence into three components: the background, the turbulence, and the object. We simplify this extremely difficult problem into a minimization of the nuclear norm, the Frobenius norm, and the ℓ1 norm. Our method is based on two observations: first, the turbulence causes dense, Gaussian-like noise and therefore can be captured by the Frobenius norm, while the moving objects are sparse and thus can be captured by the ℓ1 norm; second, since the object's motion is linear and intrinsically different from the Gaussian-like turbulence, a Gaussian-based turbulence model can be employed to enforce an additional constraint on the search space of the minimization. We demonstrate the robustness of our approach on challenging sequences which are significantly distorted by atmospheric turbulence and include extremely tiny moving objects.
In addition to robustly detecting the subspace of the frames of a sequence, we consider using trajectories as observations in the low-rank optimization framework. In particular, in videos acquired by moving cameras, we track all the pixels in the video and use that to estimate the camera-motion subspace. This is particularly useful in activity recognition, which typically requires standard preprocessing steps such as motion compensation, moving object detection, and object tracking. The errors from the motion compensation step propagate to the object detection stage, resulting in missed detections, which further complicates the tracking stage, resulting in cluttered and incorrect tracks. In contrast, we propose a novel approach which does not follow the standard steps and accordingly avoids the aforementioned difficulties. Our approach is based on Lagrangian particle trajectories, which are a set of dense trajectories obtained by advecting optical flow over time, thus capturing the ensemble motions of a scene. This is done on the frames of unaligned video, and no object detection is required. In order to handle the moving camera, we decompose the trajectories into their camera-induced and object-induced components. Having obtained the relevant object-motion trajectories, we compute a compact set of chaotic invariant features which captures the characteristics of the trajectories. Consequently, an SVM is employed to learn and recognize human actions using the computed motion features. We performed intensive experiments on multiple benchmark datasets and obtained promising results.
Finally, we consider a more challenging problem, referred to as complex event recognition, where the activities of interest are complex and unconstrained. This problem typically poses significant challenges because it involves videos of highly variable content, noise, length, frame size, etc. In this extremely challenging task, high-level features have recently shown a promising direction, as in [53, 129], where core low-level events referred to as concepts are annotated and modelled using a portion of the training data, and each event is then described using its content of these concepts. However, because of the complex nature of the videos, both the concept models and the corresponding high-level features are significantly noisy. In order to address this problem, we propose a novel low-rank formulation which combines the precisely annotated videos used to train the concepts with the rich high-level features. Our approach finds a new representation for each event which is not only low-rank, but also constrained to adhere to the concept annotation, thus suppressing the noise and maintaining a consistent occurrence of the concepts in each event. Extensive experiments on the large-scale real-world TRECVID Multimedia Event Detection 2011 and 2012 datasets demonstrate that our approach consistently improves the discriminativity of the high-level features by a significant margin.
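The core machinery described above, separating a low-rank component from sparse ℓ1 outliers, is robust PCA via principal component pursuit. A minimal sketch of the standard inexact augmented-Lagrangian iteration with singular-value thresholding (the generic textbook form, not the dissertation's three-term decomposition; the default lam and mu follow common heuristics).

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(D, lam=None, mu=None, iters=100):
    """Split D into low-rank L plus sparse S (principal component pursuit):
    minimize ||L||_* + lam * ||S||_1 subject to L + S = D, via a basic
    inexact augmented-Lagrangian loop with singular-value thresholding."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else m * n / (4.0 * np.abs(D).sum())
    L, S, Y = (np.zeros_like(D) for _ in range(3))
    for _ in range(iters):
        # L-step: shrink the singular values of (D - S + Y/mu)
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * shrink(sig, 1.0 / mu)) @ Vt
        # S-step: elementwise shrinkage isolates the sparse outliers
        S = shrink(D - L + Y / mu, lam / mu)
        # Dual ascent on the equality constraint L + S = D
        Y = Y + mu * (D - L - S)
    return L, S
```

In the video applications above, the columns of D would be vectorized frames: L then recovers the linearly correlated background, and S the sparsified errors or moving objects.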
149

The Stixel World

Pfeiffer, David 31 August 2012 (has links)
The Stixel World is a novel and versatile medium-level representation that efficiently bridges the gap between pixel-based processing and high-level vision. Modern stereo matching schemes make it possible to obtain a depth measurement for almost every pixel of an image in real time, allowing the application of new and powerful algorithms. However, this also results in a large amount of measurement data that has to be processed and evaluated. With respect to vision-based driver assistance, these algorithms are executed on highly integrated low-power processing units that leave no room for computationally intensive algorithms. At the same time, the growing number of independently executed vision tasks calls for new concepts to manage the resulting system complexity. These challenges are tackled by introducing a pre-processing step that extracts all required information in advance. Each Stixel approximates a part of an object along with its distance and height, thus segmenting the environment into free space and objects. The Stixel World is computed in a single unified optimization scheme: strong use is made of physically motivated a priori knowledge about our man-made three-dimensional environment, and relying on dynamic programming guarantees the globally optimal segmentation of the entire scenario in real time. Kalman filtering techniques are used to precisely estimate the motion state of all tracked objects. Particular emphasis is put on a thorough performance evaluation. Different comparative strategies are followed, including LIDAR, RADAR, and IMU reference sensors, manually created ground-truth data, and real-world tests.
Altogether, the Stixel World is ideally suited to serve as the basic building block for today's increasingly complex vision systems. It is an extremely compact abstraction of the actual world that gives access to the most essential information about the current scenario. As a result of this thesis, the efficiency of subsequently executed vision algorithms and applications has improved significantly.
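To make the representation concrete, here is a deliberately crude toy: a column-wise extraction from a disparity map that yields at most one obstacle segment per column above a known flat-ground disparity profile. The thesis's actual formulation solves the segmentation jointly per image with dynamic programming, which this sketch does not attempt; all parameters are illustrative.

```python
import numpy as np

def simple_stixels(disparity, ground_disp, obj_tol=1.0, min_height=5):
    """Toy stixel extraction: for each image column, pixels whose disparity
    matches the expected per-row ground profile `ground_disp` count as free
    space; the lowest compact run of pixels clearly above the ground profile
    becomes one stixel (top row, bottom row, mean disparity), or None."""
    rows, cols = disparity.shape
    stixels = []
    for u in range(cols):
        col = disparity[:, u]
        # A pixel belongs to an obstacle if it sticks out of the ground model.
        on_object = col > ground_disp + obj_tol
        stixel = None
        v = rows - 1                           # scan upward from the bottom row
        while v >= 0:
            if on_object[v]:
                top = v
                while top >= 0 and on_object[top]:
                    top -= 1
                if v - top >= min_height:      # ignore tiny spurious runs
                    stixel = (top + 1, v, float(col[top + 1:v + 1].mean()))
                    break
                v = top
            else:
                v -= 1
        stixels.append(stixel)
    return stixels
```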
