Global ETD Search

51	Classifying Pairwise Object Interactions: A Trajectory Analytics Approach Janmohammadi, Siamak 05 1900 We have a huge amount of video data from widely available surveillance cameras, along with steadily improving technology for recording the motion of a moving object as trajectory data. With the proliferation of location-enabled devices, ongoing growth in smartphone penetration, and advances in image processing techniques, tracking moving objects has become far more achievable. In this work, we explore some domain-independent qualitative and quantitative features in raw trajectory (spatio-temporal) data in videos captured by a single fixed wide-angle camera sensor in outdoor areas. We study the efficacy of those features in classifying four basic high-level actions by employing two supervised learning algorithms and show how each of the features affects the learning algorithms’ overall accuracy, either as a single factor or confounded with others. action recognition machine learning trajectory analysis supervised classification methods activity recognition Human activity recognition. Pattern recognition systems. Machine learning. Electronic surveillance.
52	Reconnaissance d’activités humaines à partir de séquences vidéo / Human activity recognition from video sequences Selmi, Mouna 12 December 2014 Cette thèse s’inscrit dans le contexte de la reconnaissance des activités à partir de séquences vidéo qui est une des préoccupations majeures dans le domaine de la vision par ordinateur. Les domaines d'application pour ces systèmes de vision sont nombreux notamment la vidéo surveillance, la recherche et l'indexation automatique de vidéos ou encore l'assistance aux personnes âgées. Cette tâche reste problématique étant donné les grandes variations dans la manière de réaliser les activités, l'apparence de la personne et les variations des conditions d'acquisition des activités. L'objectif principal de ce travail de thèse est de proposer une méthode de reconnaissance efficace par rapport aux différents facteurs de variabilité. Les représentations basées sur les points d'intérêt ont montré leur efficacité dans les travaux de l'état de l'art; elles ont été généralement couplées avec des méthodes de classification globales vu que ces primitives sont temporellement et spatialement désordonnées. Les travaux les plus récents atteignent des performances élevées en modélisant le contexte spatio-temporel des points d'intérêt; par exemple, certains travaux encodent le voisinage des points d'intérêt à plusieurs échelles. Nous proposons une méthode de reconnaissance des activités qui modélise explicitement l'aspect séquentiel des activités tout en exploitant la robustesse des points d'intérêt dans les conditions réelles. Nous commençons par l'extraction des points d'intérêt, dont nous avons montré la robustesse par rapport à l'identité de la personne par une étude tensorielle. 
Ces primitives sont ensuite représentées en tant qu'une séquence de sac de mots (BOW) locaux: la séquence vidéo est segmentée temporellement en utilisant la technique de fenêtre glissante et chacun des segments ainsi obtenu est représenté par BOW des points d'intérêt lui appartenant. Le premier niveau de notre système de classification séquentiel hybride consiste à appliquer les séparateurs à vaste marge (SVM) en tant que classifieur de bas niveau afin de convertir les BOWs locaux en des vecteurs de probabilités des classes d'activité. Les séquences de vecteurs de probabilité ainsi obtenues sot utilisées comme l'entrées de classifieur séquentiel conditionnel champ aléatoire caché (HCRF). Ce dernier permet de classifier d'une manière discriminante les séries temporelles tout en modélisant leurs structures internes via les états cachés. Nous avons évalué notre approche sur des bases publiques ayant des caractéristiques diverses. Les résultats atteints semblent être intéressant par rapport à celles des travaux de l'état de l'art. De plus, nous avons montré que l'utilisation de classifieur de bas niveau permet d'améliorer la performance de système de reconnaissance vue que le classifieur séquentiel HCRF traite directement des informations sémantiques des BOWs locaux, à savoir la probabilité de chacune des activités relativement au segment en question. De plus, les vecteurs de probabilités ont une dimension faible ce qui contribue à éviter le problème de sur apprentissage qui peut intervenir si la dimension de vecteur de caractéristique est plus importante que le nombre des données; ce qui le cas lorsqu'on utilise les BOWs qui sont généralement de dimension élevée. L'estimation les paramètres du HCRF dans un espace de dimension réduite permet aussi de réduire le temps d'entrainement / Human activity recognition (HAR) from video sequences is one of the major active research areas of computer vision. 
There are numerous applications of HAR systems, including video surveillance, automatic search and indexing of videos, and assistance for the frail elderly. This task remains a challenge because of the huge variations in the way activities are performed, in the appearance of the person, and in the acquisition conditions. The main objective of this thesis is to develop an efficient HAR method that is robust to different sources of variability. Approaches based on interest points have shown excellent state-of-the-art performance over the past years. They are generally coupled with global classification methods, as these primitives are temporally and spatially disordered. More recent studies have achieved high performance by modeling the spatial and temporal context of interest points, for instance by encoding the neighborhood of the interest points over several scales. In this thesis, we propose a method of activity recognition based on a hybrid Support Vector Machine - Hidden Conditional Random Field (SVM-HCRF) model that captures the sequential aspect of activities while exploiting the robustness of interest points in real conditions. We first extract the interest points and show their robustness with respect to the person's identity by a multilinear tensor analysis. These primitives are then represented as a sequence of local "Bags of Words" (BOW): the video is temporally segmented using the sliding window technique, and each of the segments thus obtained is represented by the BOW of the interest points belonging to it. The first layer of our hybrid sequential classification system is a Support Vector Machine that converts each local BOW extracted from the video sequence into a vector of activity class probabilities. The sequence of probability vectors thus obtained is used as input to the HCRF. The latter permits a discriminative classification of time series while modeling their internal structures via the hidden states. 
We have evaluated our approach on various human activity datasets. The results achieved are competitive with those of the current state of the art. We have demonstrated that the use of a low-level classifier (SVM) improves the performance of the recognition system, since the sequential classifier HCRF directly exploits the semantic information from local BOWs, namely the probability of each activity relative to the current local segment, rather than mere raw information from interest points. Furthermore, the probability vectors have a low dimension, which significantly reduces the risk of overfitting that can occur if the feature vector dimension is high relative to the training data size; this is precisely the case when using BOWs, which generally have a very high dimension. Estimating the HCRF parameters in a low-dimensional space also significantly reduces the duration of the HCRF training phase. Reconnaissance des activités Points d’intérêt Points denses Analyse tensorielle multilinéaire Séparateurs à vaste marge Champs aléatoires conditionnels cachés Human activity recognition Interest points Dense points Multilinear tensor analysis Classification of sequential data Support vector machines Hidden conditional random fields
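The sliding-window step described in record 52 above (one local bag of words per temporal segment) can be illustrated with a minimal sketch. The abstract does not give the window length, stride, or vocabulary size, so `window`, `stride`, `vocab_size`, and the function name `local_bows` are hypothetical choices made only for illustration; the sketch also assumes the interest points have already been quantized into visual-word indices, and it stops before the SVM and HCRF stages.

```python
from collections import Counter

def local_bows(points, num_frames, window, stride, vocab_size):
    """Build one normalized bag-of-words histogram per sliding window.

    points: list of (frame_index, word_id) pairs, one per quantized interest point.
    window, stride, vocab_size: illustrative parameters (not from the thesis).
    """
    histograms = []
    last_start = max(num_frames - window, 0)
    for start in range(0, last_start + 1, stride):
        end = start + window
        # Count the visual words whose frame falls inside this window.
        counts = Counter(word for frame, word in points if start <= frame < end)
        total = sum(counts.values())
        hist = [counts.get(w, 0) / total if total else 0.0 for w in range(vocab_size)]
        histograms.append(hist)
    return histograms

# Toy video: 8 frames, vocabulary of 3 visual words.
toy_points = [(0, 1), (1, 1), (2, 0), (5, 2), (6, 2), (7, 2)]
bows = local_bows(toy_points, num_frames=8, window=4, stride=4, vocab_size=3)
```

In the pipeline the abstract describes, each histogram in `bows` would then be passed to the low-level classifier, and the resulting sequence of class-probability vectors would form the input to the sequential model.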
53	Deep Learning Models for Human Activity Recognition Albert Florea, George, Weilid, Filip January 2019 AMI Meeting Corpus (AMI) -databasen används för att undersöka igenkännande av gruppaktivitet. AMI Meeting Corpus (AMI) -databasen ger forskare fjärrstyrda möten och naturliga möten i en kontorsmiljö; mötescenario i ett fyra personers stort kontorsrum. För att uppnå gruppaktivitetsigenkänning användes bildsekvenser från videos och 2-dimensionella audiospektrogram från AMI-databasen. Bildsekvenserna är RGB-färgade bilder och ljudspektrogram har en färgkanal. Bildsekvenserna producerades i batcher så att temporala funktioner kunde utvärderas tillsammans med ljudspektrogrammen. Det har visats att inkludering av temporala funktioner både under modellträning och sedan förutsäga beteende hos en aktivitet ökar valideringsnoggrannheten jämfört med modeller som endast använder rumsfunktioner [1]. Deep learning arkitekturer har implementerats för att känna igen olika mänskliga aktiviteter i AMI-kontorsmiljön med hjälp av extraherade data från the AMI-databas. Neurala nätverks modellerna byggdes med hjälp av Keras API tillsammans med TensorFlow biblioteket. Det finns olika typer av neurala nätverksarkitekturer. Arkitekturerna som undersöktes i detta projektet var Residual Neural Network, Visual Geometry Group 16, Inception V3 och RCNN (LSTM). ImageNet-vikter har använts för att initialisera vikterna för Neurala nätverk basmodeller. ImageNet-vikterna tillhandahålls av Keras API och är optimerade för varje basmodell [2]. Basmodellerna använder ImageNet-vikter när de extraherar funktioner från inmatningsdata. Funktionsextraktionen med hjälp av ImageNet-vikter eller slumpmässiga vikter tillsammans med basmodellerna visade lovande resultat. Både Deep Learning användningen av täta skikt och LSTM spatio-temporala sekvens predikering implementerades framgångsrikt. 
/ The Augmented Multi-party Interaction (AMI) Meeting Corpus database is used to investigate group activity recognition in an office environment. The AMI Meeting Corpus database provides researchers with remote-controlled meetings and natural meetings in an office environment; the meeting scenario takes place in a four-person office room. To achieve group activity recognition, video frames and 2-dimensional audio spectrograms were extracted from the AMI database. The video frames were RGB color images and the audio spectrograms had one color channel. The video frames were produced in batches so that temporal features could be evaluated together with the audio spectrograms. It has been shown that including temporal features both during model training and when predicting the behavior of an activity increases the validation accuracy compared to models that only use spatial features [1]. Deep learning architectures have been implemented to recognize different human activities in the AMI office environment using the extracted data from the AMI database. The neural network models were built using the Keras API together with the TensorFlow library. There are different types of neural network architectures. The architecture types investigated in this project were Residual Neural Network, Visual Geometry Group 16, Inception V3 and RCNN (Recurrent Neural Network). ImageNet weights have been used to initialize the weights for the neural network base models. The ImageNet weights were provided by the Keras API and were optimized for each base model [2]. The base models use ImageNet weights when extracting features from the input data. The feature extraction using ImageNet weights or random weights together with the base models showed promising results. Both the deep learning approach using dense layers and the LSTM spatio-temporal sequence prediction were implemented successfully. 
ANN Deep learning DL human activity recognition ResNet VGG16 Inception V3 transfer learning ImageNet Keras AMI Augmented Multi-party Interaction LSTM RCNN CNN RGB colored images audio spectrograms Neural Network Engineering and Technology Teknik och teknologier
54	Exploration and Evaluation of RNN Models on Low-Resource Embedded Devices for Human Activity Recognition / Undersökning och utvärdering av RNN-modeller på resurssvaga inbyggda system för mänsklig aktivitetsigenkänning Björnsson, Helgi Hrafn, Kaldal, Jón January 2023 Human activity data is typically represented as time series data, and RNNs, often with LSTM cells, are commonly used for recognition in this field. However, RNNs and LSTM-RNNs are often too resource-intensive for real-time applications on resource-constrained devices. This thesis project was carried out at Wrlds AB, Stockholm. At Wrlds, all machine learning is run in the cloud, but the company has been attempting to run its AI algorithms on its embedded devices. The main task of this project was to investigate alternative network structures that minimize the size of the networks used on human activity data. This thesis investigates the use of FastGRNN, a deep learning algorithm developed by Microsoft researchers, to classify human activity on resource-constrained devices. The FastGRNN algorithm was compared to state-of-the-art RNNs (LSTM, GRU, and Simple RNN) in terms of accuracy, classification time, memory usage, and energy consumption. This research is limited to implementing the FastGRNN algorithm on Nordic SoCs using their SDK and TensorFlow Lite Micro. The results of this thesis show that the proposed network performs similarly to LSTM networks in terms of accuracy while being both considerably smaller and faster, making it a promising solution for human activity recognition on embedded devices with limited computational resources that merits further investigation. / Rörelse igenkännings analys är oftast representerat av tidsseriedata där ett RNN modell med en LSTM arkitektur är oftast den självklara vägen att ta. Dock så är denna arkitektur väldigt resurskrävande för applikationer i realtid och gör att det uppstår problem med resursbegränsad hårdvara. 
Detta examensarbete är utfört i samarbete med Wrlds Technologies AB. På Wrlds så körs deras maskin inlärningsmodeller på molnet och lokalt på mobiltelefoner. Wrlds har nu påbörjat en resa för att kunna köra modeller direkt på små inbyggda system. Examensarbete kommer att utvärdera en FastGRNN som är en NN-arkitektur utvecklad av Microsoft i syfte att användas på resurs begränsad hårdvara. FastGRNN algoritmen jämfördes med andra högkvalitativa arkitekturer som RNNs, LSTM, GRU och en simpel RNN. Träffsäkerhet, klassifikationstid, minnesanvändning samt energikonsumtion användes för att jämföra dom olika varianterna. Detta arbete kommer bara att utvärdera en FastGRNN algoritm på en Nordic SoCs och kommer att användas deras SDK samt Tensorflow Lite Micro. Resultatet från detta examensarbete visar att det utvärderade nätverket har liknande prestanda som ett LSTM nätverk men också att nätverket är betydligt mindre i storlek och därmed snabbare. Detta betyder att ett FastGRNN visar lovande resultat för användningen av rörelseigenkänning på inbyggda system med begränsad prestanda kapacitet. Recurrent Neural Networks Long Short-Term Memory Networks Embedded Systems Human Activity Recognition Edge AI TensorFlow Lite Micro Recurrent Neural Networks Long Short-Term Memory Networks Innbyggda systyem Mänsklig aktivitetsigenkänning Edge AI TensorFlow Lite Micro Mechanical Engineering Maskinteknik
55	Non-Bayesian Out-of-Distribution Detection Applied to CNN Architectures for Human Activity Recognition Socolovschi, Serghei January 2022 The Human Activity Recognition (HAR) field studies the application of artificial intelligence methods to the identification of activities performed by people. Many applications of HAR in healthcare and sports require safety-critical performance from the predictive models. The predictions produced by these models should be not only correct but also trustworthy. However, in recent years it has been shown that modern neural networks sometimes produce wrong and overconfident predictions when processing unusual inputs. This issue puts the credibility of predictions at risk and calls for solutions that help estimate the uncertainty of the model’s predictions. In this work, we begin investigating the applicability of Non-Bayesian uncertainty estimation methods to deep learning classification models in HAR. We trained a Convolutional Neural Network (CNN) model with public datasets, such as UCI HAR and WISDM, which contain sensor-based time-series data about activities of daily life. Through a series of four experiments, we evaluated the performance of two Non-Bayesian uncertainty estimation methods, ODIN and Deep Ensemble, on out-of-distribution detection. We found that the ODIN method is able to separate out-of-distribution samples from the in-distribution data. However, we also observed unexpected behavior when the out-of-distribution data contained exclusively dynamic activities. The Deep Ensemble method did not provide satisfactory results for our research question. / Inom området Human Activity Recognition (HAR) studeras tillämpningen av metoder för artificiell intelligens för identifiering av aktiviteter som utförs av människor. Många av tillämpningarna av HAR inom hälso och sjukvård och idrott kräver att de prediktiva modellerna har en säkerhetskritisk prestanda. 
De förutsägelser som dessa modeller ger upphov till ska inte bara vara korrekta utan också trovärdiga. Under de senaste åren har det dock visat sig att moderna neurala nätverk tenderar att ibland ge felaktiga och överdrivet säkra förutsägelser när de behandlar ovanliga indata. Detta problem äventyrar förutsägelsernas trovärdighet och kräver lösningar som kan hjälpa till att uppskatta osäkerheten i modellens förutsägelser. I följande arbete inledde vi undersökningen av tillämpligheten av icke-Bayesianska metoder för uppskattning av osäkerheten på Deep Learning-klassificeringsmodellerna i HAR. Vi tränade en CNN-modell med offentliga dataset, såsom UCI HAR och WISDM, som samlar in sensorbaserade tidsseriedata om aktiviteter i det dagliga livet. Genom en serie av fyra experiment utvärderade vi prestandan hos två icke-Bayesianska metoder för osäkerhetsuppskattning, ODIN och Deep Ensemble, för upptäckt av out-of-distribution. Vi upptäckte att ODIN-metoden kan skilja utdelade prover från data som är i distribution. Vi fick dock också ett oväntat beteende när uppgifterna om out-of-fdistribution uteslutande innehöll dynamiska aktiviteter. Deep Ensemble-metoden gav inga tillfredsställande resultat för vår forskningsfråga. Human Activity Recognition Deep Learning Time Series Uncertainty Estimation Outofdistribution Detection Convolutional Neural Network Human Activity Recognition Deep Learning Tidsserie Uppskattning av Osäkerheten Outofdistribution Detection Convolutional Neural Network Computer and Information Sciences Data- och informationsvetenskap
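Record 55 above evaluates ODIN for out-of-distribution detection. As a rough illustration of the idea, and not of the thesis's implementation, the sketch below computes a temperature-scaled maximum softmax score and flags an input when that score falls below a threshold. The published ODIN method additionally applies a small gradient-based perturbation to the input, which is omitted here because it needs a differentiable model; the function names, the default temperature, and the threshold are assumptions for illustration only.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def max_softmax_score(logits, temperature=1000.0):
    """Confidence score: largest temperature-scaled softmax probability.

    The default temperature is an illustrative assumption, not a value
    taken from the thesis.
    """
    return max(softmax(logits, temperature))

def is_out_of_distribution(logits, threshold, temperature=1000.0):
    """Flag the input as out-of-distribution when the score is below threshold.

    threshold is hypothetical; in practice it would be tuned on held-out data.
    """
    return max_softmax_score(logits, temperature) < threshold
```

For example, `is_out_of_distribution([0.0, 0.0, 0.0, 0.0], threshold=0.5, temperature=1.0)` flags a four-class prediction whose logits carry no preference for any class.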
56	[en] USING BODY SENSOR NETWORKS AND HUMAN ACTIVITY RECOGNITION CLASSIFIERS TO ENHANCE THE ASSESSMENT OF FORM AND EXECUTION QUALITY IN FUNCTIONAL TRAINING / [pt] UTILIZANDO REDES DE SENSORES CORPORAIS E CLASSIFICADORES DE RECONHECIMENTO DE ATIVIDADE HUMANA PARA APRIMORAR A AVALIAÇÃO DE QUALIDADE DE FORMA E EXECUÇÃO EM TREINAMENTOS FUNCIONAIS RAFAEL DE PINHO ANDRE 14 December 2020 [pt] Dores no pé e joelho estão relacionadas com patologias ortopédicas e lesões nos membros inferiores. Desde a corrida de rua até o treinamento funcional CrossFit, estas dores e lesões estão correlacionadas com a distribuição irregular da pressão plantar e o posicionamento inadequado do joelho durante a prática física de longo prazo, e podem levar a lesões ortopédicas graves se o padrão de movimento não for corrigido. Portanto, o monitoramento da distribuição da pressão plantar do pé e das características espaciais e temporais das irregularidades no posicionamento dos pés e joelhos são de extrema importância para a prevenção de lesões. Este trabalho propõe uma plataforma, composta de uma rede de sensores vestíveis e um classificador de Reconhecimento de Atividade Humana (HAR), para fornecer feedback em tempo real de exercícios funcionais, visando auxiliar educadores físicos a reduzir a probabilidade de lesões durante o treinamento. Realizamos um experimento com 12 voluntários diversos para construir um classificador HAR com aproximadamente 87 por cento de precisão geral na classificação, e um segundo experimento para validar nosso modelo de avaliação física. Por fim, realizamos uma entrevista semiestruturada para avaliar questões de usabilidade e experiência do usuário da plataforma proposta. Visando uma pesquisa replicável, fornecemos informações completas sobre o hardware e o código fonte do sistema, e disponibilizamos o conjunto de dados do experimento. / [en] Foot and knee pain have been associated with numerous orthopedic pathologies and injuries of the lower limbs. 
From street running to CrossFit™ functional training, these common pains and injuries correlate highly with unevenly distributed plantar pressure and improper knee positioning during long-term physical practice and can lead to severe orthopedic injuries if the movement pattern is not corrected. Therefore, monitoring foot plantar pressure distribution and the spatial and temporal characteristics of foot and knee positioning abnormalities is of utmost importance for injury prevention. This work proposes a platform, composed of an IoT wearable body sensor network and a Human Activity Recognition (HAR) classifier, to provide real-time feedback on functional exercises, aiming to enhance physical educators' capability to reduce the probability of injuries during training. We conducted an experiment with 12 diverse volunteers to build a HAR classifier that achieved about 87 percent overall classification accuracy, and a second experiment to validate our physical evaluation model. Finally, we performed a semi-structured interview to evaluate usability and user experience issues regarding the proposed platform. Aiming at replicable research, we provide full hardware information, system source code and a public domain dataset. [pt] EDUCACAO FISICA [pt] APLICACOES DE SENSORES MOVEIS [pt] COMPUTACAO VESTIVEL E SENSORIAMENTO [pt] DISPOSITIVOS IOT [pt] SAUDE E ERGONOMIA [pt] RECONHECIMENTO DE ATIVIDADE HUMANA [en] PHYSICAL EDUCATION [en] MOBILE SENSING APPLICATIONS [en] WEARABLE COMPUTING AND SENSING [en] IOT DEVICES [en] HEALTH AND ERGONOMICS [en] HUMAN ACTIVITY RECOGNITION
57	Mobility anomaly detection with intelligent video surveillance Ebrahimi, Fatemeh 06 1900 Dans ce mémoire, nous présentons une étude visant à améliorer les soins aux personnes âgées grâce à la mise en œuvre d'un système de vidéosurveillance intelligent avancé. Ce système est conçu pour exploiter la puissance des algorithmes d’apprentissage profond pour détecter les anomalies de mobilité, avec un accent particulier sur l’identification des quasi-chutes. L’importance d’identifier les quasi-chutes réside dans le fait que les personnes qui subissent de tels événements au cours de leurs activités quotidiennes courent un risque accru de subir des chutes à l’avenir pouvant mener à des blessures graves et une hospitalisation. L’une des principales réalisations de notre étude est le développement d’un auto-encodeur capable de détecter les anomalies de mobilité, en particulier les quasi-chutes, en identifiant des erreurs de reconstruction élevées sur cinq images consécutives. Pour extraire avec précision une structure squelettique de la personne, nous avons utilisé MoveNet et affiné ce modèle sur sept points clés. Par la suite, nous avons utilisé un ensemble complet de 20 caractéristiques, englobant les positions des articulations, les vitesses, les accélérations, les angles et les accélérations angulaires, pour entraîner l’auto-encodeur. Afin d'évaluer l'efficacité de notre modèle, nous avons effectué des tests rigoureux à l'aide de 100 vidéos d'activités quotidiennes simulées enregistrées dans un laboratoire d'appartement, la moitié des vidéos contenant des cas de quasi-chutes. Un autre ensemble de 50 vidéos a été utilisé pour l’entrainement. Les résultats de notre phase de test sont très prometteurs, car ils indiquent que notre modèle est capable de détecter efficacement les quasi-chutes avec une sensibilité, une spécificité et une précision impressionnantes de 90 %. 
Ces résultats soulignent le potentiel de notre modèle à améliorer considérablement les soins aux personnes âgées dans leur environnement de vie. / In this thesis, we present a comprehensive study aimed at enhancing elderly care through the implementation of an advanced intelligent video surveillance system. This system is designed to leverage the power of deep learning algorithms to detect mobility anomalies, with a specific focus on identifying near-falls. The significance of identifying near-falls lies in the fact that individuals who experience such events during their daily activities are at an increased risk of experiencing falls in the future that can lead to serious injury and hospitalization. A key achievement of our study is the successful development of an autoencoder capable of detecting mobility anomalies, particularly near-falls, by pinpointing high reconstruction errors across five consecutive frames. To precisely extract a person's skeletal structure, we utilized MoveNet and focused on seven key points. Subsequently, we employed a comprehensive set of 20 features, encompassing joint positions, velocities, accelerations, angles, and angular accelerations, to train the model. In order to assess the efficacy of our model, we conducted rigorous testing using 100 videos of simulated daily activities recorded in an apartment laboratory, with half of the videos containing instances of near-falls. Another set of 50 videos was used for training. The results from our testing phase are highly promising, as they indicate that our model is able to effectively detect near-falls with an impressive 90% sensitivity, specificity, and accuracy. These results underscore the potential of our model to significantly enhance elderly care within their living environments. 
Vidéosurveillance Quasi-chute Détection d'anomalies MoveNet Extraction de squelette Estimation de pose Reconnaissance d'activité humaine Video surveillance Near-fall Anomaly detection Autoencoder Skeleton extraction Pose estimation Human activity recognition Auto-encodeur
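The detection rule summarized in record 57 above (high reconstruction error across five consecutive frames) can be sketched as a simple run-length check over per-frame errors. The abstract does not say how the error threshold is chosen, so `threshold` and the function name `flag_anomalies` are hypothetical, and the per-frame errors are assumed to have been computed already by the autoencoder.

```python
def flag_anomalies(errors, threshold, run_length=5):
    """Return the start index of every run of high reconstruction error.

    errors: per-frame reconstruction errors from an autoencoder (assumed given).
    threshold: hypothetical cut-off above which a frame counts as anomalous.
    run_length: number of consecutive anomalous frames required (five in the abstract).
    Each run is reported once, at the frame where it began.
    """
    starts = []
    run = 0
    for index, error in enumerate(errors):
        run = run + 1 if error > threshold else 0
        if run == run_length:
            starts.append(index - run_length + 1)
    return starts

# Toy sequence: one run of six high-error frames, then a run of only two.
toy_errors = [0.1, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.9, 0.9]
near_fall_starts = flag_anomalies(toy_errors, threshold=0.5)
```

Requiring several consecutive frames, rather than a single one, keeps an isolated noisy frame from triggering a detection.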
58	SIRAH : sistema de reconhecimento de atividades humanas e avaliação do equilibrio postural / SIRAH: human activity recognition and postural balance assessment system Durango, Melisa de Jesus Barrera January 2017 Advisor: Alexandre César Rodrigues da Silva / Abstract: Human activity recognition encompasses various classification techniques that make it possible to identify specific patterns of human behavior at the moment they occur. Identification is performed by analyzing data generated by various body-worn sensors, among which the accelerometer stands out because it responds to both the frequency and the intensity of movements. Activity identification is a well-explored area. However, challenges remain to be overcome, among them the need for lightweight systems that are easy to use, are accepted by users, and meet requirements for energy consumption and for processing large amounts of data. This work presents the development of the Human Activity Recognition and Postural Balance Assessment System, called SIRAH. The system is based on an accelerometer placed at the user's waist. The two phases of activity recognition are presented: the offline phase and the online phase. The offline phase covers the training of a three-layer perceptron artificial neural network. During training, three case studies with different feature sets were evaluated in order to measure the classifier's performance in distinguishing 3 postures and 4 activities. In the first case, training was carried out with 15 features generated in the time domain, with which the artificial neural network achieved an accuracy of 94.40%. The second case generated 34 ... (Full abstract: see electronic access below) / Doctorate Inteligência ambiental Ambientes inteligentes Internet das coisas. Reconhecimento de atividades humanas Detectores. 
Sensores corporais Acelerômetro Classificação em tempo real Redes neurais artificiais Oscilação corporal Baropodômetro Ambient intelligence Smart environments Internet of things. Human activity recognition Sensors Wearable sensor Accelerometer Online classification Real-time classification Artificial neural network Principal component analysis Body sway Baropodometer
59	Spatio-Temporal Networks for Human Activity Recognition based on Optical Flow in Omnidirectional Image Scenes Seidel, Roman 29 February 2024 The ability of human beings to perceive the environment around them with their visual system is called motion perception. This means that the attention of our visual system is primarily focused on those objects that are moving. The property of human motion perception is used in this dissertation to infer human activity from data using artificial neural networks. One of the main aims of this thesis is to discover which modalities, namely RGB images, optical flow and human keypoints, are best suited for HAR in omnidirectional data. Since these modalities are not yet available for omnidirectional cameras, they are synthetically generated and captured with an omnidirectional camera. During data generation, a distinction is made between synthetically generated omnidirectional data and a real omnidirectional dataset that was recorded in a Living Lab at Chemnitz University of Technology and subsequently annotated by hand. The synthetically generated dataset, called OmniFlow, consists of RGB images, optical flow in forward and backward directions, segmentation masks, bounding boxes for the class people, as well as human keypoints. The real-world dataset, OmniLab, contains RGB images from two top-view scenes as well as manually annotated human keypoints and estimated forward optical flow. In this thesis, the generation of the synthetic and real-world datasets is explained. The OmniFlow dataset is generated using the 3D rendering engine Blender, in which a fully configurable 3D indoor environment is created with artificially textured rooms, human activities, objects and different lighting scenarios. A randomly placed virtual camera following the omnidirectional camera model renders the RGB images, all other modalities and 15 predefined activities. The result of modelling the 3D indoor environment is the OmniFlow dataset. 
Due to the lack of omnidirectional optical flow data, the OmniFlow dataset is validated using Test-Time Augmentation (TTA). Compared to the baseline, which contains Recurrent All-Pairs Field Transforms (RAFT) trained on the FlyingChairs and FlyingThings3D datasets, it was found that only about 1000 images need to be used for fine-tuning to obtain a very low End-point Error (EE). Furthermore, it was shown that the influence of TTA on the test dataset of OmniFlow affects EE by about a factor of three. As a basis for generating artificial keypoints on OmniFlow with action labels, the Carnegie Mellon University motion capture database is used with a large number of sports and household activities as skeletal data defined in the BVH format. From the BVH-skeletal data, the skeletal points of the people performing the activities can be directly derived or extrapolated by projecting these points from the 3D world into an omnidirectional 2D image. The real-world dataset, OmniLab, was recorded in two rooms of the Living Lab with five different people mimicking the 15 actions of OmniFlow. Human keypoint annotations were added manually in two iterations to reduce the error rate of incorrect annotations. The activity-level evaluation was investigated using a TSN and a PoseC3D network. The TSN consists of two CNNs, a spatial component trained on RGB images and a temporal component trained on the dense optical flow fields of OmniFlow. The PoseC3D network, an approach to skeleton-based activity recognition, uses a heatmap stack of keypoints in combination with 3D convolution, making the network more effective at learning spatio-temporal features than methods based on 2D convolution. In the first step, the networks were trained and validated on the synthetically generated dataset OmniFlow. In the second step, the training was performed on OmniFlow and the validation on the real-world dataset OmniLab. 
For both networks, TSN and PoseC3D, three hyperparameters were varied and the top-1, top-5 and mean accuracy are reported. First, the learning rate of the stochastic gradient descent (SGD) optimizer was varied. Secondly, the clip length, which indicates the number of consecutive frames used to train the network, was varied, and thirdly, the spatial resolution of the input data was varied. For the spatial resolution variation, five different image sizes were generated by cropping the original OmniFlow and OmniLab data. It was found that keypoint-based HAR with PoseC3D performed best compared to human activity classification based on optical flow and RGB images. Its top-1 accuracy was 0.3636, the top-5 accuracy was 0.7273 and the mean accuracy was 0.3750, showing that the most appropriate output resolution is 128px × 128px and the clip length is at least 24 consecutive frames. The best results were achieved with a PoseC3D learning rate of 10⁻³. In addition, confusion matrices indicating the class-wise accuracy of the 15 activity classes are given for the modalities RGB images, optical flow and human keypoints. The confusion matrix for the modality RGB images shows the best classification result of the TSN for the action walk with an accuracy of 1.00, but almost all other actions are also classified as walking in real-world data. The classification of human actions based on optical flow works best on the action sit in chair and stand up with an accuracy of 1.00 and walk with 0.50. Furthermore, it is noticeable that almost all actions are classified as sit in chair and stand up, which indicates that the intra-class variance is low, so that the TSN is not able to distinguish between the selected action classes. For the keypoint modality, validated on real-world data, the actions rugpull (1.00) and cleaning windows (0.75) perform best. 
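The top-1, top-5 and mean accuracy figures quoted above can be computed from per-class scores as sketched below. This is an illustration only: it assumes that "mean accuracy" means accuracy averaged over classes, which the abstract does not state, and the function names and toy scores are hypothetical.

```python
from collections import defaultdict


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scored classes."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


def mean_class_accuracy(scores, labels):
    """Top-1 accuracy computed per class, then averaged over the classes present."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=lambda c: row[c])
        total[label] += 1
        correct[label] += int(predicted == label)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy scores for four clips and three classes (not data from the thesis).
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
labels = [1, 1, 2, 0]
top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
mean_acc = mean_class_accuracy(scores, labels)
```

Averaging per class keeps a frequent class such as walk from dominating the score, which matters when most clips collapse onto a single predicted action.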
Therefore, the PoseC3D network on a time-series of human keypoints is less sensitive to variations in the image angle between the synthetic and real-world data than is the case for the modalities RGB images and optical flow. The pipeline for the generation of synthetic data needs to be investigated in future work with regard to a more uniform distribution of the motion magnitudes. Random placement of the person and other objects is not sufficient for complete coverage of all movement magnitudes. An additional improvement of the synthetic data could be the rotation of the person around their own axis, so that the person moves in a different direction while performing the activity and thus the movement magnitudes contain more variance. Furthermore, the domain transition between synthetic and real-world data should be considered further in terms of viewpoint invariance and augmentation methods. It may be necessary to generate a new synthetic dataset with only top-view data and re-train the TSN and PoseC3D. As an augmentation method, for example, Fourier Domain Adaptation (FDA) could reduce the domain gap between the synthetically generated and the real-world dataset. Contents: 1 Introduction; 2 Theoretical Background; 3 Related Work; 4 Omnidirectional Synthetic Human Optical Flow; 5 Human Keypoints for Pose in Omnidirectional Images; 6 Human Activity Recognition in Indoor Scenarios; 7 Conclusion and Future Work; A Chapter 4: Flow Dataset Statistics; B Chapter 5: 3D Rotation Matrices; C Chapter 6: Network Training Parameters. ddc:000 ddc:620
60	Enhancing human activity recognition via analysis of hexoskin sensor data and deep learning techniques Saini, Anuj 05 1900 (has links) Wearable technologies are revolutionizing the healthcare field by providing vital data that assists in the prevention and treatment of diseases. Health wearable devices (HWDs), like the Hexoskin biometric garment, are at the forefront of this innovation by offering detailed physiological data and significantly impacting fields such as gait analysis and activity monitoring. The aim of this study is to develop accurate machine learning and deep learning models capable of predicting human activities using data from Hexoskin wearable technologies. This involves analyzing sensor data such as heart rate and torso movements to accurately predict activities such as walking, running, and sleeping. This study involved collecting sensor data from 52 participants over a two-week period using Hexoskin wearable technologies. Several advanced feature engineering techniques were applied to extract critical features such as X, Y, and Z accelerations. Multiple machine learning algorithms, such as Balanced Random Forest (BRF), XGradient Boosting, and LSTM (without feature engineering), were used for data analysis. 
The models were trained and tested on data from Hexoskin to evaluate their performance based on accuracy, recall, precision, and F1 score. This study demonstrates that Hexoskin wearable technologies, coupled with sophisticated machine learning models, can predict human activities with high accuracy. The research validates the effectiveness of Hexoskin wearable technologies in human activity recognition, highlighting their potential use in healthcare, gait analysis, and activity monitoring. This study significantly contributes to improving medical care standards and opens new perspectives for the diagnosis and treatment of gait-related conditions. The integration of Hexoskin technologies with machine learning algorithms represents a significant step forward in the continuous and real-time monitoring of chronic diseases, positioning Hexoskin as a reliable tool for a multitude of applications in the healthcare field. Wearable Technology Human Activity Recognition Hexoskin Garment Machine Learning Deep Learning Feature Engineering Health Wearable Technology Gait Analysis Balanced Random Forest XGradient Boosting LSTM Sensor Data Analysis
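The abstract above evaluates the models by accuracy, recall, precision and F1 score. For a single activity class, the last three follow from the counts of true positives, false positives and false negatives. The sketch below uses the standard definitions; the function name and the toy labels are illustrative and are not data from the study.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 score for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy labels for five time windows (hypothetical values).
y_true = ["walk", "walk", "run", "sleep", "run"]
y_pred = ["walk", "run", "run", "sleep", "walk"]
walk_scores = precision_recall_f1(y_true, y_pred, "walk")
```

Reporting these per class alongside overall accuracy is useful when activities are unevenly represented, since accuracy alone can look high while a rare activity is mostly missed.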
