21 |
Reconnaissance des actions humaines à partir d'une séquence vidéo
Touati, Redha 12 1900 (has links)
The work done in this master's thesis presents a new system for the
recognition of human actions from a video sequence. The system uses,
as input, a video sequence taken by a static camera. A binary
segmentation of the video sequence is first performed by a
learning algorithm, in order to detect and extract the different people
from the background. To recognize an action, the system then exploits
a set of prototypes generated by an MDS-based dimensionality
reduction technique applied, from two different viewpoints, to the video
sequence. This dimensionality reduction, carried out for each of the two
viewpoints, allows us to model each human action of the
training set with a set of prototypes (assumed to be similar within
each class) represented in a low-dimensional non-linear space. The
prototypes extracted from the two viewpoints are fed to a
$K$-NN classifier, which identifies the human action that
takes place in the video sequence. Experiments conducted on the
Weizmann human action dataset show that our model provides interesting
results compared to other state-of-the-art (and often more
complicated) methods. These experiments show, first, the
sensitivity of our model to each viewpoint and its effectiveness in
recognizing the different actions, with a variable but satisfactory
recognition rate, and, second, that fusing the two viewpoints
yields a high recognition rate. / Le travail mené dans le cadre de ce projet de maîtrise vise à
présenter un nouveau système de reconnaissance d’actions humaines à
partir d'une séquence d'images vidéo. Le système utilise en entrée une
séquence vidéo prise par une caméra statique. Une méthode de
segmentation binaire est d'abord effectuée, grâce à un algorithme
d’apprentissage, afin de détecter les différentes personnes de
l'arrière-plan. Afin de reconnaitre une action, le système exploite
ensuite un ensemble de prototypes générés, par une technique de
réduction de dimensionnalité MDS, à partir de deux points de vue
différents dans la séquence d'images. Cette étape de réduction de
dimensionnalité, selon deux points de vue différents, permet de
modéliser chaque action de la base d'apprentissage par un ensemble de
prototypes (censé être relativement similaire pour chaque classe)
représentés dans un espace de faible dimension non linéaire. Les
prototypes extraits selon les deux points de vue sont amenés à un
classifieur K-ppv qui permet de reconnaitre l'action qui se déroule
dans la séquence vidéo. Les expérimentations de ce système sur la
base d’actions humaines de Weizmann procurent des résultats assez
intéressants comparés à d’autres méthodes plus complexes. Ces
expériences montrent d'une part, la sensibilité du système pour chaque
point de vue et son efficacité à reconnaitre les différentes actions,
avec un taux de reconnaissance variable mais satisfaisant, ainsi que
les résultats obtenus par la fusion de ces deux points de vue, qui
permet d'obtenir un taux de reconnaissance très élevé.
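A minimal sketch of the prototype pipeline described in the abstract above, assuming scikit-learn's MDS and K-NN implementations; the features, distance measure, and class counts are illustrative placeholders rather than the thesis's actual silhouette descriptors.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.manifold import MDS
from sklearn.neighbors import KNeighborsClassifier

# Placeholder inputs: one descriptor per video sequence (e.g. a flattened
# silhouette-based feature) and its action label.
rng = np.random.default_rng(0)
features = rng.random((60, 128))        # 60 sequences, 128-D descriptors (illustrative)
labels = rng.integers(0, 6, size=60)    # 6 action classes (illustrative)
train_idx, test_idx = np.arange(48), np.arange(48, 60)

# 1) Pairwise dissimilarities between all sequences (Euclidean is a stand-in for
#    the thesis's distance between silhouette sequences).
dissim = cdist(features, features)

# 2) MDS embeds every sequence in a low-dimensional space; the embedded training
#    points play the role of the per-class action prototypes.
embedding = MDS(n_components=3, dissimilarity="precomputed",
                random_state=0).fit_transform(dissim)

# 3) A K-NN classifier over the prototypes labels the remaining sequences.
knn = KNeighborsClassifier(n_neighbors=3).fit(embedding[train_idx], labels[train_idx])
print("predicted actions:", knn.predict(embedding[test_idx]))
```

The two-viewpoint fusion described in the abstract would amount to running steps 1-3 once per viewpoint and combining the decisions of the two resulting classifiers.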
|
23 |
The application of advanced inventory techniques in urban inventory data development to earthquake risk modeling and mitigation in mid-America
Muthukumar, Subrahmanyam 27 October 2008 (has links)
The process of modeling earthquake hazard risk and vulnerability is a prime component of mitigation planning, but is rife with epistemic, aleatory and factual uncertainty. Reducing uncertainty in such models yields significant benefits, both in terms of extending knowledge and increasing the efficiency and effectiveness of mitigation planning. An accurate description of the built environment as an input into loss estimation would reduce factual uncertainty in the modeling process.
Building attributes for earthquake loss estimation and risk assessment modeling were identified. Three modules for developing the building attributes were proposed, namely structure classification, building footprint recognition, and building valuation. Data from primary sources and field surveys were collected from Shelby County, Tennessee, for calibration and validation of the structure type models and for estimation of various components of building value. Building footprint libraries were generated for implementation of algorithms to programmatically recognize two-dimensional building configurations. The modules were implemented to produce a building inventory for Shelby County, Tennessee, that may be used effectively in loss estimation modeling.
Validation of the building inventory demonstrates that advanced technologies and methods may be effectively and innovatively applied to combinations of primary and derived data, and replicated, to produce a bottom-up, reliable, accurate, and cost-effective building inventory.
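An illustrative sketch, not taken from the dissertation, of how the three proposed modules could be composed into a single inventory record per parcel; every class, function, and rule below is a hypothetical placeholder.

```python
from dataclasses import dataclass

@dataclass
class BuildingRecord:
    """One inventory entry per parcel, usable as input to loss-estimation models."""
    parcel_id: str
    structure_type: str       # output of the structure-classification module
    footprint_area_m2: float  # output of the footprint-recognition module
    value_usd: float          # output of the valuation module

# Hypothetical module interfaces; in the dissertation each module is calibrated
# and validated against Shelby County primary data and field surveys.
def classify_structure(parcel):
    return "W1" if parcel.get("stories", 1) <= 2 else "C2"    # placeholder rule

def measure_footprint(parcel):
    return float(parcel.get("footprint_area_m2", 0.0))        # placeholder lookup

def estimate_value(parcel, rate_per_m2=900.0):
    return measure_footprint(parcel) * rate_per_m2            # placeholder valuation

def build_inventory(parcels):
    return [BuildingRecord(p["parcel_id"], classify_structure(p),
                           measure_footprint(p), estimate_value(p)) for p in parcels]

print(build_inventory([{"parcel_id": "SC-001", "stories": 1, "footprint_area_m2": 180.0}]))
```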
|
24 |
Neural Architectures For Active Contour Modelling And For Pulse-Encoded Shape Recognition
Rishikesh, N 06 1900 (has links) (PDF)
An innate desire of many vision researchers is to unravel the mystery of human
visual perception. Such an endeavor, even if it were not wholly successful, is expected to yield byproducts of considerable significance to industrial applications.
Based on the current understanding of the neurophysiological and computational
processes in the human brain, it is believed that visual perception can be decomposed into distinct modules, of which feature/contour extraction and recognition/classification of the features corresponding to the objects play an important role. A remarkable characteristic of human visual expertise is its invariance to rotation, shift, and scaling of objects in a scene.
Researchers concur on the relevance of imitating as many properties of the human
vision system as we have knowledge of, in order to devise simple solutions to
the problems in computational vision. The inference is that this can be more
efficiently achieved by invoking neural architectures with specific characteristics
(similar to those of the modules in the human brain), and conforming to the rules of
an appropriate mathematical basis. As a first step towards the development of
such a framework, we make explicit (i) the nature of the images to be analyzed,
(ii) the features to be extracted, (iii) the relationship among features, contours,
and shape, and (iv) the exact nature of the problems. To this end, we formulate
explicitly the problems considered in this thesis as follows:
Problem 1
Given an image, localize and extract the boundary (contours) of the object of
interest in it.
Problem 2
Recognize the shape of the object characterized by that contour, employing a
suitable coder-recognizer, such that the recognition is unaffected by rotation, scaling, and
translation of the objects.
Problem 3
Given a stereo pair of images, (i) extract the salient contours from the images,
(ii) establish correspondence between the points in them, and (iii) estimate the depth associated with the points.
We present a few algorithms as practical solutions to the above problems. The main contributions of the thesis are:
• A new algorithm for the extraction of contours from images; and
• A novel method for invariantly coding shapes as pulses to facilitate their recognition.
The first contribution refers to a new active contour model, which is a neural network designed to extract the nearest salient contour in a given image by deforming itself to match the boundary of the object. The novelty of the model consists in the exploitation of the principles of spatial isomorphism and self-organization in order to create flexible contours characterizing shapes in images. It turns out that the theoretical basis for the proposed model can be traced to the extensive literature on:
• Gestalt perception, in which the principle of psycho-physical isomorphism plays a role; and
• Early processing in the human visual system, derived from neuro-anatomical and neuro-physiological properties.
The initially chosen contour is made to undergo deformation by a locally co-operative, globally competitive scheme, in order to enable it to cling to the nearest salient contour in the test image. We illustrate the utility and versatility of the model by applying it to the problems of boundary extraction, stereo vision, and bio-medical image analysis (including digital libraries).
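The thesis's contour model is neural and self-organizing; as a plainly swapped-in stand-in, the sketch below shows the general deform-to-the-nearest-salient-edge idea with a classic greedy active contour (snake). The energy weights and toy image are illustrative assumptions, not the thesis's design.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def greedy_snake(image, points, alpha=1.0, beta=1.0, gamma=1.5, iters=50):
    """Classic greedy snake: each control point moves, in turn, to the position in
    its 3x3 neighbourhood minimising continuity + curvature - edge strength."""
    img = gaussian_filter(image.astype(float), sigma=2.0)
    grad = np.hypot(sobel(img, axis=0), sobel(img, axis=1))
    grad = grad / (grad.max() + 1e-9)
    pts = points.astype(float).copy()            # (N, 2) row/col points on a closed contour
    n = len(pts)
    for _ in range(iters):
        d_mean = np.mean(np.linalg.norm(pts - np.roll(pts, 1, axis=0), axis=1))
        for i in range(n):
            prev_p, next_p = pts[i - 1], pts[(i + 1) % n]
            best, best_e = pts[i].copy(), np.inf
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    cand = pts[i] + (dr, dc)
                    r, c = int(round(cand[0])), int(round(cand[1]))
                    if not (0 <= r < img.shape[0] and 0 <= c < img.shape[1]):
                        continue
                    cont = (d_mean - np.linalg.norm(cand - prev_p)) ** 2
                    curv = np.sum((prev_p - 2.0 * cand + next_p) ** 2)
                    e = alpha * cont + beta * curv - gamma * grad[r, c]
                    if e < best_e:
                        best_e, best = e, cand
            pts[i] = best
    return pts

# Toy usage: a circle of control points placed around a bright square.
img = np.zeros((100, 100)); img[30:70, 30:70] = 1.0
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
init = np.stack([50 + 45 * np.sin(t), 50 + 45 * np.cos(t)], axis=1)
print(greedy_snake(img, init)[:3])
```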
The second contribution of the thesis is relevant to the design and development of a machine vision system in which the required contours are first to be extracted from a given set of images. Then follows the stage of recognizing the shape of the object characterized by that contour. It should, however, be noted that the latter problem is to be resolved in such a way that the system is unaffected by translation, rotation, and scaling of the images of the objects under consideration. To this end, we develop some novel schemes:
• A pulse-coding scheme for an invariant representation of shapes; and
• A neural architecture for recognizing the encoded shapes.
The first (pulse-encoding) scheme is motivated by the versatility of the human visual system, and utilizes the properties of the complex logarithmic mapping (CLM), which transforms rotation and scaling (in its domain) to shifts (in its range). In order to handle this shift, the encoder converts the CLM output to a sequence of
pulses. These pulses are then fed to a novel multi-layered neural recognizer which
(i) invokes template matching with a distinctly implemented architecture, and (ii)
achieves robustness (to noise and shape deformation) by virtue of its overlapping
strategy for code classification. The proposed encoder-recognizer system (a) is
hardware-implementable by a high-speed electronic switching circuit, and (b) can
add new patterns on-line to the existing ones. Examples are given to illustrate
the proposed schemes.
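A small assumed implementation (not the thesis's code) of the complex logarithmic mapping step: the image is resampled onto a log-polar grid centred on the image centre, so rotations become shifts along the angular axis and scalings become shifts along the log-radius axis.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def complex_log_map(image, out_shape=(64, 64)):
    """Resample an image onto a log-polar (complex-logarithmic) grid."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    n_rho, n_theta = out_shape
    max_radius = min(cy, cx)
    # Log-spaced radii (from ~1 pixel to the largest inscribed radius), uniform angles.
    log_r = np.linspace(0.0, np.log(max_radius), n_rho)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    r = np.exp(log_r)[:, None]                # (n_rho, 1)
    rows = cy + r * np.sin(theta)[None, :]    # (n_rho, n_theta) sample coordinates
    cols = cx + r * np.cos(theta)[None, :]
    return map_coordinates(image.astype(float), [rows, cols], order=1, mode="constant")

# A rotated or rescaled copy of a shape yields a (cyclically) shifted log-polar map,
# which the thesis's encoder then converts into a pulse sequence; here we only run
# the mapping itself on a toy image.
img = np.zeros((128, 128))
img[40:90, 50:80] = 1.0
print(complex_log_map(img).shape)   # (64, 64)
```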
The thesis is organized as follows:
Chapter 2 deals with the problem of extraction of salient contours from a
given gray-level image, using a neural network-based active contour model.
It explains the need for the use of active contour models, along with a brief
survey of the existing models, followed by two possible psycho-physiological
theories to support the proposed model. After presenting the essential characteristics
of the model, the advantages and applications of the proposed
approach are demonstrated by some experimental results.
Chapter 3 is concerned with the problem of coding shapes and recognizing
them. To this end, we describe a pulse coder for generating pulses invariant
to rotation, scaling, and shift. The code thus generated is then fed to a
recognizer which classifies shapes based on the pulse code fed to it. The
recognizer can also add new shapes to its 'knowledge-base' on-line. The
recognizer's properties are then discussed, thereby bringing out its advantages
with respect to various related architectures found in the literature.
Experimental results are then presented to illustrate some prominent characteristics
of the approach.
Chapter 4 concludes the thesis, summarizing its overall contribution and describing possible future directions.
|
25 |
Détection et classification de cibles multispectrales dans l'infrarouge / Detection and classification of multispectral infrared targets
Maire, Florian 14 February 2014 (has links)
Les dispositifs de protection de sites sensibles doivent permettre de détecter des menaces potentielles suffisamment à l’avance pour pouvoir mettre en place une stratégie de défense. Dans cette optique, les méthodes de détection et de reconnaissance d’aéronefs se basant sur des images infrarouge multispectrales doivent être adaptées à des images faiblement résolues et être robustes à la variabilité spectrale et spatiale des cibles. Nous mettons au point dans cette thèse des méthodes statistiques de détection et de reconnaissance d’aéronefs satisfaisant ces contraintes. Tout d’abord, nous spécifions une méthode de détection d’anomalies pour des images multispectrales, combinant un calcul de vraisemblance spectrale avec une étude sur les ensembles de niveaux de la transformée de Mahalanobis de l’image. Cette méthode ne nécessite aucune information a priori sur les aéronefs et nous permet d’identifier les images contenant des cibles. Ces images sont ensuite considérées comme des réalisations d’un modèle statistique d’observations fluctuant spectralement et spatialement autour de formes caractéristiques inconnues. L’estimation des paramètres de ce modèle est réalisée par une nouvelle méthodologie d’apprentissage séquentiel non supervisé pour des modèles à données manquantes que nous avons développée. La mise au point de ce modèle nous permet in fine de proposer une méthode de reconnaissance de cibles basée sur l’estimateur du maximum de vraisemblance a posteriori. Les résultats encourageants, tant en détection qu’en classification, justifient l’intérêt du développement de dispositifs permettant l’acquisition d’images multispectrales. Ces méthodes nous ont également permis d’identifier les regroupements de bandes spectrales optimales pour la détection et la reconnaissance d’aéronefs faiblement résolus en infrarouge. / Surveillance systems should be able to detect potential threats far enough ahead to put forward a defence strategy. In this context, detection and recognition methods making use of multispectral infrared images should cope with low-resolution signals and handle both the spectral and spatial variability of the targets. We introduce in this PhD thesis a novel statistical methodology for aircraft detection and classification which takes these constraints into account. We first propose an anomaly detection method designed for multispectral images, which combines a spectral likelihood measure with a level-set study of the image's Mahalanobis transform. This technique allows us to identify images which feature an anomaly without any prior knowledge of the target. These images are then used as realizations of a statistical model in which the observations are described as random spectral and spatial deformations of prototype shapes. The model inference, and in particular the prototype shape estimation, is achieved through a novel unsupervised sequential learning algorithm designed for missing-data models. This model allows us to propose a classification algorithm based on the maximum a posteriori probability. Promising results, in detection as well as in classification, justify the growing interest surrounding the development of multispectral imaging devices. These methods have also allowed us to identify the optimal groupings of infrared spectral bands for the detection and classification of low-resolution aircraft infrared signatures (IRS).
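A minimal, assumed sketch of the Mahalanobis-transform idea behind this kind of multispectral anomaly detection (an RX-style detector, not the thesis's exact algorithm): each pixel's spectrum is scored by its Mahalanobis distance to the background statistics, and high-scoring pixels are flagged. The toy cube and threshold below are placeholders.

```python
import numpy as np

def mahalanobis_anomaly_map(cube):
    """cube: (H, W, B) multispectral image. Returns an (H, W) map of Mahalanobis
    distances of each pixel spectrum to the global background statistics."""
    h, w, b = cube.shape
    x = cube.reshape(-1, b).astype(float)
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(b)   # regularized covariance
    cov_inv = np.linalg.inv(cov)
    d = x - mu
    scores = np.einsum("ij,jk,ik->i", d, cov_inv, d)   # squared Mahalanobis distance
    return np.sqrt(scores).reshape(h, w)

# Toy example: a flat background with one brighter multispectral "target" pixel.
rng = np.random.default_rng(0)
cube = rng.normal(0.0, 1.0, size=(32, 32, 6))
cube[10, 20] += 8.0
amap = mahalanobis_anomaly_map(cube)
mask = amap > np.percentile(amap, 99.5)   # level-set style thresholding of the map
print("flagged pixels:", np.argwhere(mask))
```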
|
26 |
Meta-Pseudo Labelled Multi-View 3D Shape Recognition / Meta-pseudomärking med Bilder från Flera Kameravinklar för 3D Objektigenkänning
Uçkun, Fehmi Ayberk January 2023 (has links)
The field of computer vision has long pursued the challenge of understanding the three-dimensional world. This endeavour is further fuelled by the increasing demand for technologies that rely on accurate perception of the 3D environment, such as autonomous driving and augmented reality. However, the labelled data scarcity in the 3D domain continues to be a hindrance to extensive research and development. Semi-supervised learning is a valuable tool to overcome data scarcity, yet most of the state-of-the-art methods are primarily developed and tested for two-dimensional vision problems. To address this challenge, there is a need to explore innovative approaches that can bridge the gap between the 2D and 3D domains. In this work, we propose a technique that both leverages the existing abundance of two-dimensional data and makes state-of-the-art semi-supervised learning methods directly applicable to 3D tasks. Multi-View Meta Pseudo Labelling (MV-MPL) combines one of the best-performing architectures in 3D shape recognition, Multi-View Convolutional Neural Networks, with the state-of-the-art semi-supervised method, Meta Pseudo Labelling. To evaluate the performance of MV-MPL, comprehensive experiments are conducted on the widely used shape recognition benchmarks ModelNet40, ShapeNetCore-v1, and ShapeNetCore-v2, as well as Objaverse-LVIS. The results demonstrate that MV-MPL achieves competitive accuracy compared to fully supervised models, even when only \(10\%\) of the labels are available. Furthermore, the study reveals that the object descriptors extracted from the MV-MPL model exhibit strong performance on shape retrieval tasks, indicating the effectiveness of the approach beyond classification objectives. Further analysis includes the evaluation of MV-MPL under more constrained scenarios, enhancements to the view aggregation and pseudo-labelling processes, and the exploration of the potential of employing multi-views as augmentations for semi-supervised learning. / Forskningsområdet för datorseende har länge strävat efter utmaningen att förstå den tredimensionella världen. Denna strävan drivs ytterligare av den ökande efterfrågan på teknologier som är beroende av en korrekt uppfattning av den tredimensionella miljön, såsom autonom körning och förstärkt verklighet. Dock fortsätter bristen på märkt data inom det tredimensionella området att vara ett hinder för omfattande forskning och utveckling. Halv-vägledd lärning (semi-supervised learning) framträder som ett värdefullt verktyg för att övervinna bristen på data, ändå är de flesta av de mest avancerade semisupervised-metoderna primärt utvecklade och testade för tvådimensionella problem inom datorseende. För att möta denna utmaning krävs det att utforska innovativa tillvägagångssätt som kan överbrygga klyftan mellan 2D- och 3D-domänerna. I detta arbete föreslår vi en teknik som både utnyttjar den befintliga överflöd av tvådimensionella data och gör det möjligt att direkt tillämpa de mest avancerade semisupervised-lärandemetoderna på 3D-uppgifter. Multi-View Meta Pseudo Labelling (MV-MPL) kombinerar en av de bästa arkitekturerna för 3D-formigenkänning, Multi-View Convolutional Neural Networks, tillsammans med den mest avancerade semisupervised-metoden, Meta Pseudo Labelling. För att utvärdera prestandan hos MV-MPL genomförs omfattande experiment på välanvända utvärderingar för formigenkänning: ModelNet40, ShapeNetCore-v1 och ShapeNetCore-v2. Resultaten visar att MV-MPL uppnår konkurrenskraftig noggrannhet jämfört med helt vägledda modeller, även när endast \(10\%\) av etiketterna är tillgängliga. Dessutom visar studien att objektbeskrivningarna som extraherats från MV-MPL-modellen uppvisar en stark prestanda i formåterhämtningsuppgifter, vilket indikerar effektiviteten hos tillvägagångssättet bortom klassificeringsmål. Vidare analys inkluderar utvärderingen av MV-MPL under mer begränsade scenarier, förbättringar av vyaggregerings- och pseudomärkningsprocesserna samt utforskning av potentialen att använda bilder från flera vinklar som en metod att få mer data för halv-vägledd lärande.
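An illustrative PyTorch sketch of the multi-view aggregation that MVCNN-style backbones use: a shared CNN encodes each rendered view, the view features are max-pooled into one object descriptor, and a linear head classifies it. The tiny backbone, class count, and shapes are assumptions, not the thesis's architecture; in MV-MPL a teacher network would supply pseudo-labels for the unlabelled objects fed through such a student.

```python
import torch
import torch.nn as nn

class MultiViewClassifier(nn.Module):
    """Shared per-view CNN + max-pooling across views (MVCNN-style)."""
    def __init__(self, num_classes=40, feat_dim=128):
        super().__init__()
        self.view_encoder = nn.Sequential(          # tiny CNN stand-in for a real backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, views):                        # views: (B, V, 3, H, W)
        b, v, c, h, w = views.shape
        feats = self.view_encoder(views.reshape(b * v, c, h, w))   # (B*V, feat_dim)
        pooled = feats.reshape(b, v, -1).max(dim=1).values         # view max-pooling
        return self.head(pooled), pooled             # logits and object descriptor

# Toy usage: 2 objects, 12 rendered views each.
model = MultiViewClassifier()
logits, descriptor = model(torch.randn(2, 12, 3, 64, 64))
print(logits.shape, descriptor.shape)   # torch.Size([2, 40]) torch.Size([2, 128])
```

The pooled descriptor is also what the retrieval experiments in the abstract would operate on.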
|
27 |
Computational analysis of wide-angle light scattering from single cells
Pilarski, Patrick Michael 11 1900 (has links)
The analysis of wide-angle cellular light scattering patterns is a challenging problem. Small changes to the organization, orientation, shape, and optical properties of scatterers and scattering populations can significantly alter their complex two-dimensional scattering signatures. Because of this, it is difficult to find methods that can identify medically relevant cellular properties while remaining robust to experimental noise and sample-to-sample differences. It is an important problem. Recent work has shown that changes to the internal structure of cells---specifically, the distribution and aggregation of organelles---can indicate the progression of a number of common disorders, ranging from cancer to neurodegenerative disease, and can also predict a patient's response to treatments like chemotherapy. However, there is no direct analytical solution to the inverse wide-angle cellular light scattering problem, and available simulation and interpretation methods either rely on restrictive cell models, or are too computationally demanding for routine use.
This dissertation addresses these challenges from a computational vantage point. First, it explores the theoretical limits and optical basis for wide-angle scattering pattern analysis. The result is a rapid new simulation method to generate realistic organelle scattering patterns without the need for computationally challenging or restrictive routines. Pattern analysis, image segmentation, machine learning, and iterative pattern classification methods are then used to identify novel relationships between wide-angle scattering patterns and the distribution of organelles (in this case mitochondria) within a cell. Importantly, this work shows that by parameterizing a scattering image it is possible to extract vital information about cell structure while remaining robust to changes in organelle concentration, effective size, and random placement. The result is a powerful collection of methods to simulate and interpret experimental light scattering signatures. This gives new insight into the theoretical basis for wide-angle cellular light scattering, and facilitates advances in real-time patient care, cell structure prediction, and cell morphology research.
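A small, hypothetical illustration of the kind of parameterization described above: a two-dimensional scattering pattern is reduced to a compact feature vector (here, a radial intensity profile) that can feed a classifier relating patterns to organelle distributions. The feature choice, classifier, and data are placeholders, not the author's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def radial_profile(pattern, n_bins=32):
    """Parameterize a 2D scattering pattern as the mean intensity per radial bin."""
    h, w = pattern.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - (h - 1) / 2.0, xx - (w - 1) / 2.0)
    bins = np.linspace(0.0, r.max() + 1e-9, n_bins + 1)
    idx = np.digitize(r.ravel(), bins) - 1
    return np.array([pattern.ravel()[idx == k].mean() for k in range(n_bins)])

# Placeholder data: 40 simulated wide-angle patterns with binary labels standing in
# for two different mitochondrial distributions.
rng = np.random.default_rng(1)
patterns = rng.random((40, 64, 64))
labels = rng.integers(0, 2, size=40)

X = np.array([radial_profile(p) for p in patterns])       # compact, noise-robust features
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[:30], labels[:30])
print("held-out predictions:", clf.predict(X[30:]))
```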
|
28 |
Computational analysis of wide-angle light scattering from single cells
Pilarski, Patrick Michael Unknown Date
No description available.
|