1

Radar detection and identification of human signatures using moving platforms

Gürbüz, Sevgi Zübeyde 17 August 2009 (has links)
Radar offers unique advantages over other sensors for the detection of humans, such as remote operation during virtually all weather and lighting conditions, increased range, and better coverage. Many current radar-based human detection systems employ some type of Fourier analysis, such as Doppler processing. However, in many environments the signal-to-noise ratio (SNR) of human returns is quite low. Furthermore, Fourier-based techniques assume a linear variation in target phase over the aperture, whereas human targets have a highly nonlinear phase history. The resulting phase mismatch causes significant SNR loss in the detector itself. In this work, human target modeling is used to derive a more accurate non-linear approximation to the true target phase history. Two algorithms are proposed: a parameter estimation-based optimized non-linear phase (ONLP) detector, and a dictionary search-based enhanced optimized non-linear phase (EnONLP) detector. The ONLP algorithm optimizes the likelihood ratio over the unknown model parameters to derive a more accurate approximation to the expected human return. The EnONLP algorithm stores expected target signatures generated for each possible combination of model parameters in a dictionary, then applies Orthogonal Matching Pursuit (OMP) to determine the optimal linear combination of dictionary entries that best represents the measured radar data. Thus, unlike ONLP, the EnONLP algorithm can also detect the presence of multiple human targets. Cramér-Rao bounds (CRB) on parameter estimates and receiver operating characteristic (ROC) curves are used to compare analytically the performance of both proposed methods with that of conventional, fully adaptive space-time adaptive processing (STAP). Finally, the application of EnONLP to target characterization is illustrated.
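As an illustration of the dictionary-search step described above, the following is a minimal sketch of Orthogonal Matching Pursuit over a dictionary of simulated target signatures. It is not the thesis's implementation; the function and variable names are illustrative, and the dictionary D is assumed to be precomputed with unit-norm columns.

```python
import numpy as np

def omp(D, y, n_atoms):
    """Orthogonal Matching Pursuit: greedily pick dictionary columns
    that best explain the measurement y (a sketch of the EnONLP search).

    D       -- (n_samples, n_signatures) dictionary, unit-norm columns
    y       -- (n_samples,) measured radar return
    n_atoms -- number of signatures (hypothesized targets) to select
    """
    residual = y.copy()
    support = []
    coeffs = None
    for _ in range(n_atoms):
        # Correlate the residual with every stored signature.
        correlations = np.abs(D.conj().T @ residual)
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected signatures jointly by least squares.
        Ds = D[:, support]
        coeffs, *_ = np.linalg.lstsq(Ds, y, rcond=None)
        residual = y - Ds @ coeffs
    return support, coeffs, residual
```

A detection decision could then compare how much the residual energy drops against a threshold; each selected atom corresponds to one hypothesized human signature, which is how multiple targets can be detected.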
2

Human detection and action recognition using depth information by Kinect

Xia, Lu, active 21st century 10 July 2012 (has links)
Traditional computer vision algorithms depend on information captured by visible-light cameras, but this data source has inherent limitations: it is sensitive to illumination changes, occlusions, and background clutter. Range sensors provide 3D structural information about the scene and are robust to changes in color and illumination. In this thesis, we present a series of approaches that use depth information from the Kinect to address human detection and action recognition. Given the depth information, the basic problem we consider is detecting humans in the scene. We propose a model-based approach comprising a 2D head contour detector and a 3D head surface detector. We propose a segmentation scheme that separates the human from the surroundings based on the detection point and extracts the subject's whole body. We also explore a tracking algorithm based on our detection result. The methods are tested on a dataset we collected and show superior results over existing algorithms. Given the detection result, we further study recognizing the subjects' actions. We present a novel approach for human action recognition with histograms of 3D joint locations (HOJ3D) as a compact representation of postures. We extract the 3D skeletal joint locations from Kinect depth maps using Shotton et al.'s method. The HOJ3D computed from the action depth sequences are reprojected using LDA and then clustered into k posture visual words, which represent the prototypical poses of actions. The temporal evolutions of those visual words are modeled by discrete hidden Markov models (HMMs). In addition, due to the design of our spherical coordinate system and the robust 3D skeleton estimation from Kinect, our method demonstrates significant view invariance on our 3D action dataset. The dataset is composed of 200 3D sequences of 10 indoor activities performed by 10 individuals from varied views. Our method runs in real time and achieves superior results on this challenging 3D action dataset. We also tested our algorithm on the MSR Action3D dataset, where it outperforms existing algorithms in most cases.
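To make the posture descriptor concrete, here is a hedged sketch of an HOJ3D-style histogram: joint positions expressed in spherical coordinates about the hip center and binned by azimuth and elevation. The bin counts and reference frame are illustrative assumptions; the thesis defines its own body-aligned spherical coordinate system.

```python
import numpy as np

def hoj3d_like(joints, hip_idx=0, n_az=12, n_el=6):
    """Histogram of 3D joint locations (HOJ3D-style sketch).

    joints -- (n_joints, 3) array of joint positions from a depth map
    Returns a flattened (n_az * n_el) histogram of joint directions
    in a hip-centred spherical coordinate system.
    """
    rel = joints - joints[hip_idx]          # hip-centred coordinates
    rel = np.delete(rel, hip_idx, axis=0)   # drop the reference joint
    az = np.arctan2(rel[:, 1], rel[:, 0])   # azimuth in (-pi, pi]
    el = np.arctan2(rel[:, 2], np.hypot(rel[:, 0], rel[:, 1]))  # elevation
    hist, _, _ = np.histogram2d(
        az, el, bins=[n_az, n_el],
        range=[[-np.pi, np.pi], [-np.pi / 2, np.pi / 2]])
    return hist.ravel() / max(hist.sum(), 1.0)  # normalized posture descriptor
```

Descriptors from a sequence could then be projected with LDA, quantized into k posture words, and fed to one discrete HMM per action class, as the abstract describes.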
3

Human Detection Using Ultra Wideband Radar and Continuous Wave Radar

Ahmed, Atheeq January 2017 (has links)
A radar works by radiating electromagnetic energy and detecting the signal reflected back from the target. The nature of the reflected signal provides information about the target's distance and speed. In this thesis, we use a UWB radar and a CW radar to detect the presence and approximate location of trapped survivors through their motion. Range is estimated with the UWB radar using clutter removal based on singular value decomposition (SVD), and with the dual-frequency CW radar using the short-time Fourier transform (STFT) and median filtering. We analyze how the algorithm parameters affect performance, and evaluate the implemented algorithms with regard to small-motion detection, distance estimation, and penetration capability. Both systems prove capable of human detection and tracking.
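The SVD clutter-removal step can be sketched as follows, under the common assumption (not taken verbatim from the thesis) that the dominant singular components of the slow-time/fast-time matrix represent static clutter.

```python
import numpy as np

def svd_clutter_removal(X, n_clutter=1):
    """Remove static clutter from a UWB radargram.

    X         -- (n_slow, n_fast) matrix: one received pulse per row
    n_clutter -- number of dominant singular components treated as clutter
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_clean = s.copy()
    s_clean[:n_clutter] = 0.0      # null the dominant (clutter) components
    return (U * s_clean) @ Vt      # reconstruct the target-plus-noise part
```

Range could then be estimated from the fast-time index of the strongest envelope peak in the cleaned radargram.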
4

Real-time Human Detection using Convolutional Neural Networks with FMCW RADAR RGB data

Phan, Anna, Medina, Rogelio January 2022 (has links)
Machine learning has been employed in the automotive industry together with cameras to detect objects in surround-sensing technology. You Only Look Once is a state-of-the-art object detection algorithm especially suitable for real-time applications due to its speed and relatively high accuracy compared to competing methods. Recent studies have investigated whether radar data can be used as an alternative to camera data with You Only Look Once, since radars are more robust to changing environments such as varying weather and lighting conditions. These studies have used 3D radar data consisting of range, angle, and velocity, transformed into a 2D image representation in either the Range-Angle or the Range-Doppler domain. Furthermore, the processed radar image can be rendered in either a Cartesian or a polar coordinate system. This study combines previous work, using You Only Look Once with Range-Angle radar images, and examines which of the Cartesian and polar coordinate systems is optimal. Localization and classification performance is evaluated using a combination of concepts and evaluation metrics. Ultimately, the conclusion is that the Cartesian coordinate system prevails, with a significant improvement over polar.
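As a sketch of the Cartesian rendering the study compares against polar, the following resamples a range-angle map onto an x/y grid by inverse mapping. The grid size and field of view are illustrative assumptions, not values from the thesis.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def ra_polar_to_cartesian(ra_map, r_max, fov=np.pi / 2, out_size=256):
    """Resample a range-angle map (rows = range bins, cols = angle bins)
    onto a Cartesian x/y grid via inverse mapping."""
    n_r, n_a = ra_map.shape
    # Cartesian target grid: x is cross-range, y is down-range.
    x = np.linspace(-r_max * np.sin(fov / 2), r_max * np.sin(fov / 2), out_size)
    y = np.linspace(0.0, r_max, out_size)
    xx, yy = np.meshgrid(x, y)
    r = np.hypot(xx, yy)             # radius of each output pixel
    theta = np.arctan2(xx, yy)       # angle from boresight
    # Fractional source coordinates into the polar map.
    r_idx = r / r_max * (n_r - 1)
    a_idx = (theta + fov / 2) / fov * (n_a - 1)
    return map_coordinates(ra_map, [r_idx, a_idx], order=1, cval=0.0)
```

Pixels outside the radar's field of view map outside the source grid and are filled with zeros, which is why the Cartesian rendering shows the familiar wedge shape.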
5

A study on detection of risk factors of a toddler's fall injuries using visual dynamic motion cues

Na, Hana January 2009 (has links)
The research in this thesis is intended to aid caregivers' supervision of toddlers to prevent accidental injuries, especially injuries due to falls in the home environment. Despite the fact that young children are particularly vulnerable to home accidents and caregivers cannot provide continuous supervision, there have been very few attempts to develop an automatic system to address such accidents. Vision-based analysis methods have been developed to recognise toddlers' fall risk factors related to changes in their behaviour or environment. First, suggestions for preventing fall events of young children at home were collected from well-known organisations for child safety. A large number of fall records of toddlers who had sought treatment at a hospital were analysed to identify a toddler's fall risk factors. The factors include clutter posing a tripping or slipping hazard on the floor, and a toddler moving around or climbing on furniture or room structures. The major technical problem in detecting the risk factors is classifying foreground objects as human or non-human, and novel approaches have been proposed for this classification. Unlike most existing studies, which rely on human appearance cues such as skin colour for human detection, the approaches in this thesis use cues related to dynamic motion. The first cue is based on the fact that there is relative motion between human body parts, whereas typical indoor clutter has no such parts with diverse motions. Further motion cues are employed to differentiate a human from a pet, since a pet also moves its parts diversely: the angle changes of an ellipse fitted to each object, and the history of the object's actual height, which capture varied posture changes and the different body sizes of pets. The methods work well as long as foreground regions are correctly segmented.
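A minimal sketch of the ellipse-orientation cue, using OpenCV's ellipse fitting on a foreground mask; the background-subtraction step that produces the mask is assumed to exist and is not shown.

```python
import cv2

def ellipse_angle(fg_mask):
    """Fit an ellipse to the largest foreground blob in a binary mask
    and return its orientation angle in degrees (None if too small)."""
    contours, _ = cv2.findContours(
        fg_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    if len(blob) < 5:                  # fitEllipse needs >= 5 points
        return None
    (_, _), (_, _), angle = cv2.fitEllipse(blob)
    return angle
```

Tracking this angle frame to frame, together with the history of the blob's height, gives the posture-change cues used to tell a toddler from a pet.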
6

Finding People in Images and Videos

Dalal, Navneet 17 July 2006 (has links) (PDF)
This thesis proposes a solution for detecting people and object classes in images and videos. The main goal is to develop robust, discriminative representations of visual shapes that make it possible to decide whether an object of the class appears in a region of the image. Decisions are based on high-dimensional vectors of visual descriptors extracted from the regions. To compare different descriptor sets objectively, we learn a decision rule for each set with a linear support vector machine algorithm. Driven entirely by the data, our approach relies on low-level appearance and motion descriptors without using an explicit model of the object to be detected. In most cases we concentrate on the detection of people, a difficult and frequent class that is particularly interesting in applications such as film and video analysis, pedestrian detection for driver assistance, and surveillance. However, our method makes no strong assumption about the class to be recognized, and it also gives satisfactory results for other classes such as cars, motorcycles, cows, and sheep. We make four main contributions to the field of visual recognition. First, we present visual descriptors for object detection in static images: grids of Histograms of Oriented Gradients (HOG). The histograms are evaluated over a grid of spatial blocks, with strong local normalization. This structure ensures both a good characterization of the object's local visual shape and robustness to small variations in position, spatial orientation, local illumination, and color. We show that combining lightly smoothed gradients, fine orientation quantization with relatively coarse spatial quantization, strong intensity normalization, and an advanced method of retraining on hard examples reduces the false-positive rate by one to two orders of magnitude compared to previous methods. Second, to detect people in videos, we propose several motion descriptors based on optical flow, which are incorporated into the preceding approach. Analogous to the static HOG, they replace static image gradients with spatial differences of dense optical flow. Using differences minimizes the influence of camera and background motion on the detections. We evaluate several variations of this approach, which encode either motion boundaries or the relative motions of pairs of adjacent regions. Incorporating motion reduces the false-positive rate by an order of magnitude relative to the static approach. Third, we propose a general method for fusing multiple detections, based on the mean shift algorithm for kernel density mode estimation. The approach takes into account the number, confidence, and relative scale of the detections. Finally, we present work in progress on building a person detector from several part detectors, in this case the face, head, torso, and legs.
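For reference, the HOG configuration argued for above (fine orientation binning, coarse spatial cells, strong block normalization) can be reproduced with scikit-image; this is a sketch using the standard 64x128 pedestrian window, not the author's original code.

```python
from skimage.feature import hog

def hog_descriptor(window):
    """Compute a HOG feature vector for a 64x128 grayscale window.

    9 orientation bins, 8x8-pixel cells, 2x2-cell blocks with
    L2-Hys normalization: the configuration popularized by
    Dalal and Triggs for pedestrian detection.
    """
    assert window.shape == (128, 64), "expects a 128-row x 64-col window"
    return hog(window,
               orientations=9,
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               block_norm='L2-Hys',
               feature_vector=True)
```

The resulting 3780-dimensional vector is scored by a linear SVM, and hard negatives from a first detection pass are added to the training set before retraining.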
7

Recognizing human activity using RGBD data

Xia, Lu, active 21st century 03 July 2014 (has links)
Traditional computer vision algorithms try to understand the world using visible-light cameras, but this type of data source has inherent limitations. First, visible-light images are sensitive to illumination changes and background clutter. Second, the 3D structure of the scene is lost when the 3D world is projected onto 2D images, and recovering 3D information from 2D images is a challenging problem. Range sensors, which capture the 3D characteristics of the scene, have existed for over thirty years, but earlier range sensors were either too expensive, difficult to use in human environments, slow at acquiring data, or poor at estimating distance. Recently, easy access to RGBD data at real-time frame rates has led to a revolution in perception and inspired much new research. I propose algorithms to detect persons and understand activities using RGBD data, and demonstrate that solutions to many computer vision problems can be improved by the added depth channel. The 3D structural information can yield real-time, view-invariant algorithms in a faster and easier fashion. When both data sources are available, features extracted from the depth channel can be combined with traditional features computed from the RGB channels to build more robust systems with enhanced recognition abilities that can handle more challenging scenarios. As a starting point, the first problem is to find persons of various poses in the scene, whether moving or static. Localizing humans in RGB images is limited by lighting conditions and background clutter; depth images give alternative ways to find humans in the scene. In the past, detection of humans from range data was usually achieved by tracking, which does not work for indoor person detection. In this thesis, I propose a model-based approach to detect persons using the structural information embedded in the depth image: a 2D head contour model and a 3D head surface model to look for the head-shoulder part of the person. A segmentation scheme is then proposed to segment the full human body from the background and extract its contour, and a tracking algorithm is built on the detection result. I then investigate recognizing human actions and activities, and propose two features for this task. The first feature is drawn from the skeletal joint locations estimated from a depth image. It is a compact representation of the human posture called histograms of 3D joint locations (HOJ3D). This representation is view-invariant, and the whole algorithm runs in real time, so it may benefit many applications that need a fast estimate of the posture and action of a human subject. The second feature is a spatio-temporal feature for depth video called the Depth Cuboid Similarity Feature (DCSF). Interest points are extracted using an algorithm that effectively suppresses noise and finds salient human motions, and a DCSF is extracted centered on each interest point to describe the video contents. This descriptor recognizes activities with no dependence on skeleton information or pre-processing steps such as motion segmentation, tracking, or even image de-noising or hole-filling, making it flexible and widely applicable.
Finally, all the features developed herein are combined to solve a novel problem: first-person human activity recognition using RGBD data. Traditional activity recognition algorithms focus on recognizing activities from a third-person perspective; I propose to recognize activities from a first-person perspective with RGBD data. This task is novel and extremely challenging due to the large amount of camera motion caused either by self-exploration or by the response to interaction. I extract 3D optical flow features as motion descriptors, 3D skeletal joint features as posture descriptors, and spatio-temporal features as local appearance descriptors to describe the first-person videos. To address the ego-motion of the camera, I propose an attention mask that guides the recognition procedure and separates features in the ego-motion region from those in the independent-motion region. The 3D features are very useful for summarizing the discriminative information of the activities, and combining them with existing 2D features yields more robust recognition results, making the algorithm capable of dealing with more challenging cases.
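The thesis's DCSF interest-point extractor is its own noise-suppressing algorithm; as a generic stand-in, here is a sketch of the classic spatio-temporal cuboid response (spatial Gaussian smoothing plus a temporal quadrature pair of Gabor filters) applied to a depth video. All parameter values are illustrative.

```python
import numpy as np
from scipy.ndimage import convolve1d, gaussian_filter

def cuboid_response(video, sigma=2.0, tau=1.5):
    """Spatio-temporal interest response for a (T, H, W) depth video.

    Spatial Gaussian smoothing followed by a temporal quadrature pair
    of 1D Gabor filters; strong local maxima of the response mark
    salient motion (a generic stand-in for the thesis's detector).
    """
    smoothed = gaussian_filter(video.astype(np.float64),
                               sigma=(0, sigma, sigma))  # spatial axes only
    half = int(np.ceil(3 * tau))
    t = np.arange(-half, half + 1, dtype=np.float64)
    omega = 4.0 / tau                      # temporal frequency
    env = np.exp(-t**2 / tau**2)           # Gaussian envelope
    h_even = -np.cos(2 * np.pi * omega * t) * env
    h_odd = -np.sin(2 * np.pi * omega * t) * env
    even = convolve1d(smoothed, h_even, axis=0, mode='nearest')
    odd = convolve1d(smoothed, h_odd, axis=0, mode='nearest')
    return even**2 + odd**2                # high where salient motion occurs
```

A DCSF-style descriptor would then be computed on a cuboid of depth pixels around each strong local maximum of this response.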
8

IR Image Machine Learning for Smart Homes

Nerborg, Amanda, Josse, Elias January 2020 (has links)
Sweden expects an aging population and a shortage of healthcare professionals in the near future, which creates both economic and practical problems in providing a safe and dignified life for the elderly. Technical solutions that contribute to safety, comfort, and quick help when needed are essential for this future. Many current solutions include a camera, which is effective but invasive of personal privacy. Griddy, a hardware solution built around a Panasonic Grid-EYE, an infrared thermopile array sensor, offers more privacy for the user. Griddy was developed by students in a previous project and was used for this project's data collection. With Griddy mounted over a bed and additional software to determine whether the user is in the bed, a system could offer monitoring with little human interaction. The purpose was to determine whether this system could predict human presence with high accuracy and what limitations it might have. Two data sets, a main set and a variational set, were captured with Griddy. The main data set consisted of 240 images labeled "person" and 240 images labeled "no person". The machine learning algorithms used were Support Vector Machine (SVM), k-Nearest Neighbors (kNN), and Neural Network (NN). With 10-fold cross-validation, the highest accuracy (0.99) was achieved by both SVM and kNN, which was verified by both algorithms' accuracy (1.0) on the test set. The results for the variational data set showed lower reliability when the system faced variations not present in the training data, such as elevated room temperature or a duvet covering the person. The main data set needs to be expanded with more variation so that the system can be trained to meet these greater challenges.
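A hedged sketch of the evaluation described above, assuming the 8x8 Grid-EYE frames are flattened into 64-value feature vectors; the hyperparameters are illustrative, not the report's exact settings.

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate(frames, labels):
    """10-fold cross-validation of SVM and kNN on thermal frames.

    frames -- (n_samples, 8, 8) Grid-EYE temperature images
    labels -- (n_samples,) 1 = person in bed, 0 = empty bed
    """
    X = frames.reshape(len(frames), -1)    # flatten to 64 features
    models = {
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(5)),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, labels, cv=10)
        print(f"{name}: mean accuracy {scores.mean():.2f}")
```

The NN model would be evaluated the same way; standardizing the features first matters for both the SVM and kNN distance computations.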
9

From Body Parts Responses to Underwater Human Detection: A Deep Learning Approach

Zhan, Wenjie, Zheng, Maowei January 2020 (has links)
Context. Underwater human detection is an important problem in computer vision. Body part-based models achieve good performance in on-land human detection in scenarios with occlusion. This thesis explores the feasibility of human body part detection in underwater environments. Objectives. This thesis aims to build DNN-based underwater human body part detectors. Three body part detectors implemented with different DNN-based models (Faster R-CNN, SSD, and YOLO) are built and compared on the underwater human body part detection task. Methods. Experiments are used as the research method. The three DNN-based models, regarded as the independent variables, are trained, tested, and evaluated, and the detection results of the detectors based on the three models are the dependent variables. Finally, the detection performance of each detector is compared. Results. The underwater body part detector based on Faster R-CNN provides the best detection performance in terms of mAP, while YOLOv2 achieves the fastest detection speed but the lowest mAP. The SSD model offers both decent detection performance and decent detection speed. Conclusions. Underwater body part detectors based on Faster R-CNN, SSD, and YOLO can perform well on the underwater human body part detection task; building an underwater body part detector via deep learning is feasible.
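When scoring the three detectors by mAP, each prediction is matched to ground truth by intersection-over-union. A sketch of that matching test follows; the 0.5 threshold is the common convention and an assumption here, not necessarily the thesis's exact protocol.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(pred_box, gt_boxes, threshold=0.5):
    """A prediction counts as a hit if it overlaps some ground-truth
    body-part box with IoU at or above the threshold."""
    return any(iou(pred_box, gt) >= threshold for gt in gt_boxes)
```

Sweeping the detector's confidence threshold over such matches yields the precision-recall curve whose area gives the per-class AP, averaged into mAP.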
10

Edge Machine Learning for Wildlife Conservation : Detection of Poachers Using Camera Traps

Arnesson, Pontus, Forslund, Johan January 2021 (has links)
This thesis presents how deep learning can be utilized for detecting humans in a wildlife setting using image classification. Two different solutions have been implemented, both using a camera-equipped microprocessor to capture the images. In one solution, the deep learning model runs on the microprocessor itself, which requires the model to be as small as possible. The other solution sends images from the microprocessor to a more powerful computer where a larger object detection model is run. Both solutions are evaluated using standard image classification metrics and compared against each other. To adapt the models to the wildlife environment, transfer learning is used with training data from a similar setting that has been manually collected and annotated. The thesis describes a complete system's implementation and results, including data transfer, parallel computing, and hardware setup. One contribution of this thesis is an algorithm that improves classification performance on images where a human is far away from the camera. The algorithm detects motion in the images and extracts only the area where there is movement. This is especially important on the microprocessor, where the classification model is too simple to handle those cases; applying the classification model only to this area makes the task simpler and improves performance. In conclusion, when this algorithm is integrated, a model running on the microprocessor gives sufficient results to run as a camera trap for humans. However, test results show that this implementation still underperforms compared to a model run on a more powerful computer.
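A hedged sketch of the motion-cropping idea: difference consecutive frames, find the moving region, and hand only that crop to the small on-device classifier. Thresholds and padding are illustrative assumptions.

```python
import cv2

def motion_crop(prev_gray, curr_gray, min_area=50, pad=16):
    """Return the bounding crop of the moving region between two
    grayscale uint8 frames, or None if nothing moved enough."""
    diff = cv2.absdiff(prev_gray, curr_gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)   # close small gaps
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    moving = [c for c in contours if cv2.contourArea(c) >= min_area]
    if not moving:
        return None
    # One box around all moving contours, padded for context.
    xs, ys, ws, hs = zip(*(cv2.boundingRect(c) for c in moving))
    x1 = max(min(xs) - pad, 0)
    y1 = max(min(ys) - pad, 0)
    x2 = min(max(x + w for x, w in zip(xs, ws)) + pad, curr_gray.shape[1])
    y2 = min(max(y + h for y, h in zip(ys, hs)) + pad, curr_gray.shape[0])
    return curr_gray[y1:y2, x1:x2]
```

The crop would then be resized to the classifier's input size before scoring, which keeps a distant human large enough in the input for the small model to classify.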
