71

From robotics to healthcare: toward clinically-relevant 3-D human pose tracking for lower limb mobility assessments

Mitjans i Coma, Marc 11 September 2024 (has links)
With increasing age comes an increased risk of frailty and mobility decline, which can lead to dangerous falls and even be a cause of mortality. Despite these serious consequences, healthcare systems remain reactive, highlighting the need for technologies that predict functional mobility decline. In this thesis, we present an end-to-end autonomous functional mobility assessment system that seeks to bridge the gap between robotics research and clinical rehabilitation practice. Unlike many fully integrated black-box models, our approach emphasizes a system that is both reliable and transparent, to facilitate its endorsement and adoption by healthcare professionals and patients. Our proposed system fuses multimodal sensor data using an optimization framework known as factor graphs. This method, widely used in robotics, enables us to obtain visually interpretable 3-D estimates of the human body in recorded footage. These representations are then used to implement autonomous versions of standardized assessments employed by physical therapists to measure lower-limb mobility, using a combination of custom neural networks and explainable models. To improve the accuracy of the estimates, we investigate the Koopman operator framework for learning linear representations of human dynamics; we leverage these outputs as prior information to enhance temporal consistency across entire movement sequences. Furthermore, inspired by the inherent stability of natural human movement, we propose ways to impose stability constraints on the dynamics during the training of linear Koopman models. To this end, we propose a sufficient condition for the stability of discrete-time linear systems that can be represented as a set of convex constraints, and we demonstrate how it can be seamlessly integrated into larger-scale gradient descent optimization. Lastly, we report the performance of our human pose detection and autonomous mobility assessment systems by evaluating them on mobility outcome datasets collected both in controlled laboratory settings and in unconstrained real-life home environments. While further research is still needed, the results indicate that the system shows promising performance in assessing mobility in home environments. These findings underscore the significant potential of this and similar technologies to transform physical therapy practice.
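A minimal sketch of the stability-constrained linear-dynamics idea described above, under simplifying assumptions: it fits the operator by least squares and uses a spectral-norm bound as a simple sufficient stability condition, rather than the thesis's specific convex-constraint formulation and gradient-descent training.

```python
import numpy as np

# Illustrative sketch (not the thesis's exact formulation): fit a linear
# Koopman-style operator A from consecutive feature snapshots, then enforce
# a simple sufficient stability condition by clipping the singular values
# of A so its spectral norm is at most 1; then the discrete-time system
# x_{k+1} = A x_k is non-expansive and rolled-out predictions cannot diverge.

def fit_stable_linear_operator(X, sigma_max=1.0):
    """X: (d, T) array of feature snapshots; returns A with ||A||_2 <= sigma_max."""
    X0, X1 = X[:, :-1], X[:, 1:]                # consecutive snapshot pairs
    A = X1 @ np.linalg.pinv(X0)                 # least-squares fit of x_{k+1} ~ A x_k
    U, s, Vt = np.linalg.svd(A)                 # project onto the spectral-norm ball
    return U @ np.diag(np.minimum(s, sigma_max)) @ Vt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.cumsum(rng.normal(size=(6, 200)), axis=1)  # synthetic 6-D trajectory
    A = fit_stable_linear_operator(X)
    print("spectral norm of A:", np.linalg.norm(A, 2))
```

Clipping the singular values keeps the learned operator non-expansive, which is the same motivation the abstract gives for constraining the trained Koopman model, albeit via a much cruder mechanism.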
72

6DOF MAGNETIC TRACKING AND ITS APPLICATION TO HUMAN GAIT ANALYSIS

Ravi Abhishek Shankar (18855049) 28 June 2024 (has links)
<p dir="ltr">There is growing research in analyzing human gait in the context of various applications. This has been aided by the improvement in sensing technologies and computation power. A complex motor skill that it is, gait has found its use in medicine for diagnosing different neurological ailments and injuries. In sports, gait can be used to provide feedback to the player/athlete to improve his/her skill and to prevent injuries. In biometrics, gait can be used to identify and authenticate individuals. This can be easier to scale to perform biometrics of individuals in large crowds compared to conventional biometric methods. In the field of Human Computer Interaction (HCI), gait can be an additional input that could be provided to be used in applications such as video games. Gait analysis has also been used for Human Activity Recognition (HAR) for purposes such as personal fitness, elderly care and rehabilitation. </p><p dir="ltr">The current state-of-the-art methods for gait analysis involves non-wearable technology due to its superior performance. The sophistication afforded in non-wearable technologies, such as cameras, is better able to capture gait information as compared to wearables. However, non-wearable systems are expensive, not scalable and typically, inaccessible to the general public. These systems sometimes need to be set up in specialized clinical facilities by experts. On the other hand, wearables offer scalability and convenience but are not able to match the performance of non-wearables. So the current work is a step in the direction to bridge the gap between the performance of non-wearable systems and the convenience of wearables. </p><p dir="ltr">A magnetic tracking system is developed to be applied for gait analysis. The system performs position and orientation tracking, i.e. 6 degrees of freedom or 6DoF tracking. One or more tracker modules, called Rx modules, is tracked with respect to a module called the Tx module. The Tx module mainly consists of a magnetic field generating coil, Inertial Measurement Unit (IMU) and magnetometer. The Rx module mainly consists of a tri-axis sensing coil, IMU and magnetometer. The system is minimally intrusive, works with Non-Line-of-Sight (NLoS) condition, low power consuming, compact and light weight. </p><p dir="ltr">The magnetic tracking system has been applied to the task of Human Activity Recognition (HAR) in this work as a proof-of-concept. The tracking system was worn by participants, and 4 activities - walking, walking with weight, marching and jogging - were performed. The Tx module was worn on the waist and the Rx modules were placed on the feet. To compare magnetic tracking with the most commonly used wearable sensors - IMUs + magnetometer - the same system was used to provide IMU and magnetometer data for the same 4 activities. The gait data was processed by 2 commonly used deep learning models - Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM). The magnetic tracking system shows an overall accuracy of 92\% compared to 86.69\% of the IMU + magnetometer system. Moreover, an accuracy improvement of 8\% is seen with the magnetic tracking system in differentiating between the walking and walking with weight activities, which are very similar in nature. This goes to show the improvement in gait information that 6DoF tracking brings, that manifests as increased classification accuracy. This increase in gait information will have a profound impact in other applications of gait analysis as well.</p>
73

Gappy POD and Temporal Correspondence for Lizard Motion Estimation

Kurdila, Hannah Robertshaw 20 June 2018 (has links)
With the maturity of conventional industrial robots, there has been increasing interest in designing robots that emulate realistic animal motions. This discipline requires careful and systematic investigation of a wide range of animal motions, from biped to quadruped and even to the serpentine motion of centipedes, millipedes, and snakes. Collecting optical motion capture data of such complex animal motions can be complicated for several reasons. Often many high-quality cameras are needed for detailed subject tracking, and self-occlusion, loss of focus, and contrast variations challenge any imaging experiment. The problem of self-occlusion is especially pronounced for animals. In this thesis, we walk through the process of collecting motion capture data of a running lizard. In our raw video footage, it is difficult to make temporal correspondences using interpolation methods because of prolonged blurriness, occlusion, or the limited field of view of our cameras. To work around this, we first make a model data set by making our best guess of the points' locations through these corruptions. Then, we randomly eclipse the data, use Gappy POD to repair it, and see how closely the result resembles the initial set, culminating in a test case where we simulate the actual corruptions we see in the raw video footage. / Master of Science / There has been increasing interest over the past few years in designing robots that emulate realistic animal motions. Making these designs as accurate as possible requires thorough analysis of animal motion. This is done by recording video and then converting it into numerical data, which can be analyzed in a rigorous way. But this conversion cannot be made when the raw video footage is ambiguous, for instance when the footage is blurry, the shot is too dark or too light, or the subject (or parts of the subject) is out of view of the camera. In this thesis, we walk through the process of collecting video footage of a running lizard and converting it into data. Ambiguities in the video footage result in an incomplete translation into numerical data, and we use a mathematical technique called the Gappy Proper Orthogonal Decomposition to fill in this incompleteness in an intelligible way. In the process, we shed light on the fundamental drivers of the animal's motion.
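A minimal Gappy POD sketch of the repair step described above, under simplifying assumptions: a POD basis is built from a complete synthetic "model" data set, and a randomly eclipsed snapshot is repaired by least-squares fitting of the POD coefficients on its observed entries only.

```python
import numpy as np

# Gappy POD repair: build a POD basis from complete data, then fill the
# gaps of a partially observed snapshot by fitting the basis coefficients
# on the observed entries and re-expanding with the full basis.

def gappy_pod_repair(model_data, snapshot, mask, rank=3):
    """model_data: (d, T) complete snapshots; snapshot: (d,) with gaps;
    mask: (d,) boolean, True where the snapshot is observed."""
    mean = model_data.mean(axis=1)
    U, _, _ = np.linalg.svd(model_data - mean[:, None], full_matrices=False)
    Phi = U[:, :rank]                                   # POD basis
    coeffs, *_ = np.linalg.lstsq(Phi[mask], snapshot[mask] - mean[mask], rcond=None)
    return mean + Phi @ coeffs                          # filled-in snapshot

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 200)
modes = np.vstack([np.sin(t), np.cos(2 * t), np.sin(3 * t)])  # temporal modes
data = rng.normal(size=(30, 3)) @ modes                        # rank-3 "model" data set
snap = data[:, 100].copy()
mask = rng.random(30) > 0.4                                    # randomly "eclipse" ~40% of the points
repaired = gappy_pod_repair(data, snap, mask, rank=3)
print(np.linalg.norm(repaired - data[:, 100]))                 # close to zero
```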
74

Infrared Light-Based Data Association and Pose Estimation for Aircraft Landing in Urban Environments

Akagi, David 10 June 2024 (has links) (PDF)
In this thesis we explore an infrared light-based approach to the problem of pose estimation during aircraft landing in urban environments where GPS is unreliable or unavailable. We introduce a novel fiducial constellation composed of sparse infrared lights that incorporates projective invariant properties in its design to allow for robust recognition and association from arbitrary camera perspectives. We propose a pose estimation pipeline capable of producing high accuracy pose measurements at real-time rates from monocular infrared camera views of the fiducial constellation, and present as part of that pipeline a data association method that is able to robustly identify and associate individual constellation points in the presence of clutter and occlusions. We demonstrate the accuracy and efficiency of this pose estimation approach on real-world data obtained from multiple flight tests, and show that we can obtain decimeter level accuracy from distances of over 100 m from the constellation. To achieve greater robustness to the potentially large number of outlier infrared detections that can arise in urban environments, we also explore learning-based approaches to the outlier rejection and data association problems. By formulating the problem of camera image data association as a 2D point cloud analysis, we can apply deep learning methods designed for 3D point cloud segmentation to achieve robust, high-accuracy associations at constant real-time speeds on infrared images with high outlier-to-inlier ratios. We again demonstrate the efficiency of our learning-based approach on both synthetic and real-world data, and compare the results and limitations of this method to our first-principles-based data association approach.
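One classical example of the projective-invariant idea behind the constellation design, offered here as a generic illustration rather than the thesis's specific invariants: the cross-ratio of four collinear points is unchanged by any projective transformation, so it can serve as a view-independent signature when matching detected points against a known pattern.

```python
import numpy as np

# The cross-ratio of four collinear points is preserved under projective
# transformations, which is why invariants of this kind survive arbitrary
# camera perspectives.

def cross_ratio(p1, p2, p3, p4):
    """Cross-ratio of four collinear 2-D points given as (x, y) arrays."""
    d = lambda a, b: np.linalg.norm(np.asarray(a) - np.asarray(b))
    return (d(p1, p3) * d(p2, p4)) / (d(p1, p4) * d(p2, p3))

# Four collinear points and their images under an arbitrary projective map H
pts = [np.array([x, 2.0 * x + 1.0]) for x in (0.0, 1.0, 3.0, 7.0)]
H = np.array([[1.2, 0.1, 3.0], [0.3, 0.9, -1.0], [0.001, 0.002, 1.0]])
proj = [(H @ np.append(p, 1.0))[:2] / (H @ np.append(p, 1.0))[2] for p in pts]
print(cross_ratio(*pts), cross_ratio(*proj))        # nearly identical values
```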
75

Repousser les limites de l'identification faciale en contexte de vidéo-surveillance / Breaking the limits of facial identification in video-surveillance context.

Fiche, Cécile 31 January 2012 (has links)
Person identification systems based on face recognition are becoming increasingly widespread and are used in very diverse applications, particularly in the field of video surveillance. In this context, the performance of facial recognition algorithms largely depends on the image acquisition conditions, especially because the pose can vary, but also because the acquisition methods themselves can introduce artifacts. The main issues are focus imprecision, which can lead to blurred images, and errors related to compression, which can introduce block artifacts. The work done during this thesis therefore focuses on facial recognition in images taken by video surveillance cameras, in cases where the images contain blur or block artifacts or show varying poses. First, we propose a new approach that significantly improves facial recognition in images with high blur levels or with strong block artifacts. The method, which makes use of specific no-reference metrics, starts by evaluating the quality of the input image and then adapts the training database of the recognition algorithms accordingly. Second, we focus on facial pose estimation. It is generally very difficult to recognize a face when it is not frontal, and most facial identification algorithms considered robust to pose variation need to know the pose in order to achieve a satisfying recognition rate in a relatively short time. We have therefore developed a pose estimation method based on recent recognition techniques in order to obtain a fast and sufficient estimate of this parameter.
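A hedged sketch of the quality-adaptive idea described above. The Laplacian-variance score and the thresholds are generic illustrations, not the thesis's specific no-reference metrics: the probe image's sharpness is estimated and a correspondingly degraded training gallery is selected.

```python
import cv2
import numpy as np

# Variance of the Laplacian as a common no-reference sharpness measure;
# a system in the spirit described above could use such a score to pick a
# training set degraded to a comparable blur level before recognition.

def blur_score(gray):
    """No-reference sharpness score: higher means sharper."""
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def pick_training_set(gray, thresholds=(50.0, 150.0)):
    """Map the probe's sharpness to one of three gallery variants (illustrative thresholds)."""
    s = blur_score(gray)
    return "gallery_strong_blur" if s < thresholds[0] else \
           "gallery_mild_blur" if s < thresholds[1] else "gallery_sharp"

probe = np.full((64, 64), 128, dtype=np.uint8)
cv2.circle(probe, (32, 32), 12, 255, -1)                 # synthetic stand-in image
blurred = cv2.GaussianBlur(probe, (11, 11), 5)
print(blur_score(probe) > blur_score(blurred))           # True: blurring lowers the score
print(pick_training_set(blurred))
```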
76

Real-Time Head Pose Estimation in Low-Resolution Football Footage / Realtidsestimering av huvudets vridning i lågupplösta videosekvenser från fotbollsmatcher

Launila, Andreas January 2009 (has links)
This report examines the problem of real-time head pose estimation in low-resolution football footage. A method is presented for inferring the head pose using a combination of footage and knowledge of the locations of the football and players. An ensemble of randomized ferns is compared with a support vector machine for processing the footage, while a support vector machine performs pattern recognition on the location data. Combining the two sources of information outperforms either in isolation. The location of the football turns out to be an important piece of information. / Capturing and Visualizing Large scale Human Action (ACTVIS)
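A toy sketch of the information-fusion idea described above, with a random forest standing in for the randomized-fern ensemble; all feature dimensions and the synthetic data are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# One classifier scores appearance features, an SVM scores ball/player
# location features, and their class probabilities are averaged, mirroring
# the idea that combining both sources beats either in isolation.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, 300)                       # 4 coarse head-pose bins
appearance = rng.normal(size=(300, 64)) + y[:, None] * 0.2
locations = rng.normal(size=(300, 6)) + y[:, None] * 0.3

rf = RandomForestClassifier(n_estimators=50).fit(appearance, y)
svm = SVC(probability=True).fit(locations, y)
fused = 0.5 * rf.predict_proba(appearance) + 0.5 * svm.predict_proba(locations)
print("fused accuracy:", (fused.argmax(1) == y).mean())
```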
77

Vision-Based Techniques for Cognitive and Motor Skill Assessments

Floyd, Beatrice K. 24 August 2012 (has links)
No description available.
78

Advancing human pose and gesture recognition

Pfister, Tomas January 2015 (has links)
This thesis presents new methods in two closely related areas of computer vision: human pose estimation, and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses are correct or not. We further extend this pose estimator to new domains (with a transfer learning approach), and enhance its predictions by predicting the joint positions sequentially (rather than independently) in an image, and using temporal information in the videos (rather than predicting the poses from a single frame). Finally, we go beyond random forests, and show that convolutional neural networks can be used to estimate human pose even more accurately and efficiently. We propose two new convolutional neural network architectures, and show how optical flow can be employed in convolutional nets to further improve the predictions. In gesture recognition, we explore the idea of using weak supervision to learn gestures. We show that we can learn sign language automatically from signed TV broadcasts with subtitles by letting algorithms 'watch' the TV broadcasts and 'match' the signs with the subtitles. We further show that if even a small amount of strong supervision is available (as there is for sign language, in the form of sign language video dictionaries), this strong supervision can be combined with weak supervision to learn even better models.
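A small sketch of the optical-flow idea mentioned above, under simplifying assumptions: synthetic frames, a single joint heatmap, and OpenCV's Farneback flow rather than the networks used in the thesis. Dense flow from the current frame to a neighbouring frame pulls that frame's prediction into the current frame's coordinates so predictions can be combined over time.

```python
import cv2
import numpy as np

# Warp a neighbouring frame's joint heatmap into the current frame using
# dense optical flow, so per-frame pose predictions can be averaged.

def warp_with_flow(heatmap, flow):
    """Warp `heatmap` using per-pixel flow from the target frame to the source frame."""
    h, w = heatmap.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(heatmap, map_x, map_y, interpolation=cv2.INTER_LINEAR)

prev_gray = cv2.GaussianBlur(
    np.random.randint(0, 255, (120, 160), dtype=np.uint8), (7, 7), 2)
curr_gray = np.roll(prev_gray, 3, axis=1)                      # synthetic 3-pixel motion
flow = cv2.calcOpticalFlowFarneback(curr_gray, prev_gray, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)  # current -> previous
prev_heatmap = np.zeros((120, 160), np.float32)
prev_heatmap[60, 80] = 1.0
prev_heatmap = cv2.GaussianBlur(prev_heatmap, (15, 15), 3)     # blob-shaped joint heatmap
aligned = warp_with_flow(prev_heatmap, flow)                   # previous prediction in current-frame coordinates
```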
79

[en] AN EVALUATION OF AUTOMATIC FACE RECOGNITION METHODS FOR SURVEILLANCE / [pt] ESTUDO DE MÉTODOS AUTOMÁTICOS DE RECONHECIMENTO FACIAL PARA VÍDEO MONITORAMENTO

VICTOR HUGO AYMA QUIRITA 26 March 2015 (has links)
This dissertation aimed to compare the performance of state-of-the-art face recognition algorithms on facial images captured from multiple video sequences. Three specific objectives were pursued: to develop a method for determining when a face is in a frontal position with respect to the camera (frontal face detector); to evaluate the accuracy of recognition algorithms on the facial images obtained with the help of the frontal face detector; and finally, to identify the algorithm with the best performance when applied to verification and identification tasks in video surveillance systems. The comparison of the recognition methods was performed with the following approach: first, a frontal face detector, which allowed the capture of facial images, was created; second, the algorithms were trained and tested with the help of facereclib, a library developed by the Biometrics Group at the IDIAP Research Institute; third, ROC and CMC curves were used as metrics to compare the recognition algorithms; and finally, the results were analyzed and the conclusions reported in this manuscript. Experiments conducted on the video datasets MOBIO, ChokePOINT, VidTIMIT, HONDA, and four fragments of several films indicate that the Inter-Session Variability Modelling and Gaussian Mixture Model algorithms provide the best classification accuracy in both verification and identification tasks, which indicates that they are viable automatic recognition techniques for video surveillance applications.
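A short sketch of the CMC (cumulative match characteristic) metric used in the comparison above, the identification-mode counterpart of the ROC curve. The similarity scores here are synthetic and the implementation is a generic illustration, not facereclib's.

```python
import numpy as np

# CMC curve: for each probe, rank the gallery by similarity score; CMC(k)
# is the fraction of probes whose true identity appears within the top k ranks.

def cmc_curve(scores, gallery_ids, probe_ids):
    """scores: (n_probes, n_gallery) similarity matrix; higher = more similar."""
    order = np.argsort(-scores, axis=1)                  # best match first
    ranked_ids = np.asarray(gallery_ids)[order]
    hit_rank = (ranked_ids == np.asarray(probe_ids)[:, None]).argmax(axis=1)
    return np.array([(hit_rank < k).mean() for k in range(1, scores.shape[1] + 1)])

rng = np.random.default_rng(0)
gallery_ids = np.arange(10)
probe_ids = rng.integers(0, 10, 50)
scores = rng.normal(size=(50, 10))
scores[np.arange(50), probe_ids] += 1.5                  # make true matches score higher
print(cmc_curve(scores, gallery_ids, probe_ids)[:5])     # rank-1..5 identification rates
```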
80

Avancements dans l'estimation de pose et la reconstruction 3D de scènes à 2 et 3 vues / Advances on Pose Estimation and 3D Reconstruction of 2 and 3-View Scenes

Fernandez Julia, Laura 13 December 2018 (has links)
The study of cameras and images has been a prominent subject since the beginning of computer vision, with pose estimation and 3D reconstruction among its main focuses. The goal of this thesis is to tackle and study some specific problems and methods of the structure-from-motion pipeline in order to improve accuracy, to provide broad studies of the advantages and disadvantages of state-of-the-art models, and to release useful implementations to the public. More specifically, we focus on stereo pairs and triplets of images and discuss some of the methods and models able to provide pose estimation and 3D reconstruction of the scene.

First, we address the depth estimation task for stereo pairs using block matching. This approach implicitly assumes that all pixels in the patch have the same depth, producing the common artifact known as the "foreground fattening effect". In order to find a more appropriate support, Yoon and Kweon introduced weights based on color similarity and spatial distance, analogous to those used in the bilateral filter. We present the theory of this method and the implementation we have developed with some improvements, discuss some variants of the method, and analyze its parameters and performance.

Secondly, we consider the addition of a third view and study the trifocal tensor, which describes the geometric constraints linking the three views. We explore the advantages offered by this operator in the pose estimation of a triplet of cameras, as opposed to computing the relative poses pair by pair using the fundamental matrix. In addition, we present a study and implementation of several parameterizations of the tensor. We show that the initial improvement in accuracy given by the trifocal tensor is not enough to have a remarkable impact on the pose estimation after bundle adjustment, and that using the fundamental matrix with image triplets remains relevant.

Finally, we propose using a projection model other than the pinhole camera for the pose estimation of perspective cameras. We present a method based on the matrix factorization of Tomasi and Kanade, which relies on the orthographic projection. This method can be used in configurations where other methods fail, in particular when using cameras with long focal length lenses. The performance of our implementation of this method is compared to that of perspective-based methods; we consider that the accuracy achieved and its robustness make it worth considering in any SfM procedure.
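A compact sketch of the orthographic factorization idea underlying the last contribution (the Tomasi-Kanade rank-3 decomposition of the centred measurement matrix). The metric upgrade that removes the remaining affine ambiguity is omitted, and the tracked points are synthetic.

```python
import numpy as np

# Tomasi-Kanade-style affine factorization: the centred 2F x P measurement
# matrix of P points tracked over F frames has rank 3 under orthographic
# projection, so a truncated SVD splits it into motion (camera) and shape
# factors, up to an affine ambiguity resolved later by a metric upgrade.

def factorize(W):
    """W: (2F, P) stacked x/y image coordinates of P points over F frames."""
    W_centred = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_centred, full_matrices=False)
    M = U[:, :3] * np.sqrt(s[:3])             # motion factor, 2F x 3
    S = np.sqrt(s[:3])[:, None] * Vt[:3]      # shape factor, 3 x P
    return M, S

rng = np.random.default_rng(0)
shape3d = rng.normal(size=(3, 40))                        # 40 random 3-D points
cams = [np.linalg.qr(rng.normal(size=(3, 3)))[0][:2] for _ in range(6)]
W = np.vstack([R @ shape3d for R in cams])                # orthographic projections, 12 x 40
M, S = factorize(W)
print(np.linalg.norm(W - W.mean(axis=1, keepdims=True) - M @ S))  # ~0: exact rank-3 fit
```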
