161

What, When, and Where Exactly? Human Activity Detection in Untrimmed Videos Using Deep Learning

Rahman, Md Atiqur 06 December 2023 (has links)
Over the past decade, there has been an explosion in the volume of video data, including internet videos and surveillance camera footage. These videos often feature extended durations with unedited content, predominantly filled with background clutter, while the relevant activities of interest occupy only a small portion of the footage. Consequently, there is a compelling need for advanced processing techniques to automatically analyze this vast reservoir of video data, specifically with the goal of identifying the segments that contain the events of interest. Given that humans are the primary subjects in these videos, comprehending human activities plays a pivotal role in automated video analysis. This thesis seeks to tackle the challenge of detecting human activities from untrimmed videos, aiming to classify and pinpoint these activities both in their spatial and temporal dimensions. To achieve this, we propose a modular approach. We begin by developing a temporal activity detection framework, and then progressively extend the framework to support activity detection in the spatio-temporal dimension. To perform temporal activity detection, we introduce an end-to-end trainable deep learning model leveraging 3D convolutions. Additionally, we propose a novel and adaptable fusion strategy to combine both the appearance and motion information extracted from a video, using RGB and optical flow frames. Importantly, we incorporate the learning of this fusion strategy into the activity detection framework. Building upon the temporal activity detection framework, we extend it by incorporating a spatial localization module to enable activity detection both in space and time in a holistic end-to-end manner. To accomplish this, we leverage shared spatio-temporal feature maps to jointly optimize both spatial and temporal localization of activities, thus making the entire pipeline more effective and efficient. 
Finally, we introduce several novel techniques for modeling actor motion, specifically designed for efficient activity recognition. This is achieved by harnessing 2D pose information extracted from video frames and then representing human motion through bone movement, bone orientation, and body joint positions. Our experimental evaluations, conducted on benchmark datasets, demonstrate the effectiveness of the proposed temporal and spatio-temporal activity detection methods compared to the current state of the art. Moreover, the proposed motion representations excel in both performance and computational efficiency. Ultimately, this research paves the way toward imbuing computers with social visual intelligence, enabling them to comprehend human activities at any time and place, and opening up exciting possibilities for the future.
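The learned two-stream fusion idea described above can be sketched in a deliberately minimal form. This is an illustration only, not the thesis's actual fusion module: it assumes a single trainable logit per stream (here called `fusion_logits`), whereas the thesis learns its fusion strategy jointly with the full detection network.

```python
import math

def softmax(logits):
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_scores(rgb_scores, flow_scores, fusion_logits):
    # Learned scalar weights for the appearance (RGB) and motion (optical
    # flow) streams; softmax keeps them positive and summing to one.
    w_rgb, w_flow = softmax(fusion_logits)
    return [w_rgb * r + w_flow * f
            for r, f in zip(rgb_scores, flow_scores)]

# With equal logits the fusion reduces to a plain average of the streams.
fused = fuse_scores([0.8, 0.2], [0.4, 0.6], [0.0, 0.0])
```

Because the logits sit inside the network, gradient descent can shift the balance between appearance and motion per dataset, rather than fixing it by hand.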
162

Controllable Visual Synthesis

AlBahar, Badour A. Sh A. 08 June 2023 (has links)
Computer graphics has become an integral part of various industries such as entertainment (i.e., films and content creation), fashion (i.e., virtual try-on), and video games. Computer graphics has evolved tremendously over the past years. It has shown remarkable image generation improvement from low-quality, pixelated images with limited details to highly realistic images with fine details that can often be mistaken for real images. However, the traditional pipeline of rendering an image in computer graphics is complex and time-consuming. The whole process of creating the geometry, material, and textures requires not only time but also significant expertise. In this work, we aim to replace this complex traditional computer graphics pipeline with a simple machine learning model. This machine learning model can synthesize realistic images without requiring expertise or significant time and effort. Specifically, we address the problem of controllable image synthesis. We propose several approaches that allow the user to synthesize realistic content and manipulate images to achieve their desired goals with ease and flexibility. / Doctor of Philosophy / Computer graphics has become an integral part of various industries such as entertainment (i.e., films and content creation), fashion (i.e., virtual try-on), and video games. Computer graphics has evolved tremendously over the past years. It has shown remarkable image generation improvement from low-quality, pixelated images with limited details to highly realistic images with fine details that can often be mistaken for real images. However, the traditional process of generating an image in computer graphics is complex and time-consuming. You need to set up a camera and light, and create objects with all sorts of details. This requires not only time but also significant expertise. In this work, we aim to replace this complex traditional computer graphics pipeline with a simple machine learning model.
This machine learning model can generate realistic images without requiring expertise or significant time and effort. Specifically, we address the problem of controllable image synthesis. We propose several approaches that allow the user to synthesize realistic content and manipulate images to achieve their desired goals with ease and flexibility.
163

TO KILL AND TO BE KILLED: THE TRANSFERENCE, TRANSFORMATION AND USE OF THE SMITING POSE IN EGYPT AND THE AEGEAN DURING THE BRONZE AGE

Kellenbarger, Tenninger 08 1900 (has links)
The smiting pose is a motif used by the Egyptians, Minoans, and Mycenaeans during the Bronze Age (ca. 3000–1200 BCE). Although the smiting pose has been identified as an emblem of the pharaonic office, the pose has never been investigated in the field of Aegean prehistory. Instead, the motif is incorporated as evidence in larger discussions, such as those of warriors and warfare in the Aegean during the Late Bronze Age; in these arguments, iconography bearing the pose is cited as evidence for the presence of martial Minoans and is rarely examined beyond that role. This dissertation investigates the smiting scenes from Egypt, Crete, and the Greek Mainland and examines them to answer two questions: how did people create and express power in the Eastern Mediterranean, and how did trade networks influence this? The first part of this approach considers the different trade routes explored by Crete and the Mainland, as well as the role the Aegean peoples played in international trade networks. The second part focuses on the smiting motif in its regional contexts to explore how each culture constructed and represented power through violence to fit its concepts of ruling and kingship. / Art History
164

Automated Implementation of the Edinburgh Visual Gait Score (EVGS)

Ramesh, Shri Harini 14 July 2023 (has links)
Analyzing a person's gait is important in determining their physical and neurological health. However, typical motion analysis laboratories exist only in urban specialty care facilities and can be expensive due to the specialized personnel and technology these examinations require. Many patients, especially those who reside in underdeveloped or isolated locations, find it impractical to travel to such facilities. With the help of recent developments in high-performance computing and artificial intelligence models, it is now feasible to evaluate human movement using digital video. Over the past 20 years, various visual gait analysis tools and scales have been developed. A study of the literature and discussions with physicians who are domain experts revealed that the Edinburgh Visual Gait Score (EVGS) is one of the most effective scales currently available. Clinical implementations of EVGS currently rely on human scoring of videos. In this thesis, an algorithmic implementation of EVGS scoring based on hand-held smartphone video was developed. Walking gait was recorded using a handheld smartphone at 60 Hz as participants walked along a hallway. Body keypoints representing joints and limb segments were then identified using the OpenPose Body-25 pose estimation model. A new algorithm was developed to identify foot events and strides from the keypoints and to determine EVGS parameters at the relevant strides. The stride identification results were compared with ground-truth foot events manually labeled through direct observation, and the EVGS results were compared with evaluations by human scorers. Stride detection was accurate to within 2 to 5 frames. The level of agreement between the scorers and the algorithmic EVGS score was strong for 14 of the 17 parameters, and the algorithmic EVGS results were highly correlated with the scorers' scores (r > 0.80) for eight of the 17 parameters.
Smartphone-based remote motion analysis with automated implementation of the EVGS could be employed in a patient's own community, eliminating the need to travel. These results demonstrate the viability of automated EVGS for remote human motion analysis.
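To make the foot-event step concrete, here is a hedged sketch, not the thesis's actual algorithm: one simple way to flag foot contacts from a pose-estimation output is to find well-separated local extrema of the ankle keypoint's vertical coordinate over the frame sequence, then treat the intervals between consecutive contacts as strides.

```python
import math

def detect_foot_contacts(ankle_y, min_separation=20):
    """Flag frames where the ankle's vertical image coordinate is a local
    maximum (image y grows downward, so this is the ankle's lowest point),
    a simple proxy for a foot-contact event. min_separation (in frames)
    suppresses duplicate detections within a single stride."""
    contacts = []
    for i in range(1, len(ankle_y) - 1):
        if ankle_y[i] >= ankle_y[i - 1] and ankle_y[i] > ankle_y[i + 1]:
            if not contacts or i - contacts[-1] >= min_separation:
                contacts.append(i)
    return contacts

# Synthetic 60 Hz ankle trajectory: one stride per second for 3 seconds.
trajectory = [math.sin(2 * math.pi * t / 60) for t in range(180)]
contacts = detect_foot_contacts(trajectory)
strides = list(zip(contacts, contacts[1:]))  # (start, end) frame pairs
```

Real keypoint streams are noisier than this sinusoid, so a practical version would first smooth the trajectory and handle missed detections; the thesis's algorithm operates on full Body-25 keypoint sets rather than a single coordinate.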
165

Reconstructing 3D Humans From Visual Data

Zheng, Ce 01 January 2023 (has links) (PDF)
Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose and shape of the entire human body. HPE and HMR have gained significant attention due to their applications in areas such as digital human avatar modeling, AI coaching, and virtual reality [135]. However, HPE and HMR come with notable challenges, including intricate body articulation, occlusion, depth ambiguity, and the limited availability of annotated 3D data. Despite the progress made so far, the research community continues to strive for robust, accurate, and efficient solutions in HPE and HMR, advancing us closer to the ultimate goals in the field. This dissertation tackles several of these challenges. The initial focus is on video-based HPE, where we propose a transformer architecture named PoseFormer [136] that captures the spatial relationships between body joints and the temporal correlations across frames. This approach effectively harnesses the comprehensive connectivity and expressive power of transformers, leading to improved pose estimation accuracy in video sequences. Building upon this, the dissertation addresses the heavy computational and memory burden associated with image-based HMR. Our proposed Feature Map-based Transformer method (FeatER [133]) and Pooling Attention Transformer method (POTTER [130]) demonstrate superior performance while significantly reducing computational and memory requirements compared to existing state-of-the-art techniques. Furthermore, a diffusion-based framework (DiffMesh [134]) is proposed for reconstructing high-quality human mesh outputs from input video sequences.
These achievements provide practical and efficient solutions that cater to the demands of real-world applications in HPE and HMR. In this dissertation, our contributions advance the fields of HPE and HMR, bringing us closer to accurate and efficient solutions for understanding humans in visual content.
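The spatial-then-temporal factorization mentioned for the video transformer can be illustrated with a toy sketch. This is not PoseFormer itself: it uses single-head attention with identity query/key/value projections and no learned weights, purely to show how attention is applied first across joints within a frame and then across frames for each joint.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(tokens):
    # Single-head dot-product self-attention with identity projections:
    # each token becomes an attention-weighted mix of all tokens.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(d)])
    return out

def spatial_then_temporal(clip):
    # clip: frames x joints x features. First attend across joints within
    # each frame (spatial), then across frames for each joint (temporal).
    spatial = [self_attention(frame) for frame in clip]
    n_joints = len(clip[0])
    temporal = [self_attention([frame[j] for frame in spatial])
                for j in range(n_joints)]
    # Re-assemble to frames x joints x features.
    return [[temporal[j][t] for j in range(n_joints)]
            for t in range(len(clip))]

# Two frames, three joints, two feature channels per joint.
clip = [[[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]],
        [[0.2, 0.2], [0.4, 0.1], [0.1, 0.5]]]
refined = spatial_then_temporal(clip)
```

Factorizing attention this way keeps the cost at (joints² + frames²) per block instead of (joints × frames)², which is one reason such designs scale to long video sequences.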
166

STUDENT ATTENTIVENESS CLASSIFICATION USING GEOMETRIC MOMENTS AIDED POSTURE ESTIMATION

Gowri Kurthkoti Sridhara Rao (14191886) 30 November 2022 (has links)
Body posture provides rich information about a person's current state of mind. This idea is used to implement a system that provides feedback to lecturers on how engaging a class has been by identifying students' attentiveness levels. Posture information is extracted with the help of MediaPipe, and a novel method of extracting features from the keypoints returned by MediaPipe is proposed. Classification using geometric-moment-aided features performs better than classification using general distance and angle features. To extend single-person pose classification to multi-person pose classification, object detection is implemented. Feedback covering the entire lecture is generated as the output of the system.
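As a hedged illustration of what geometric-moment features over pose keypoints can look like (the abstract does not specify which moments are used, so the second-order set below is an assumption), central moments of the 2D keypoint cloud give translation-invariant descriptors of a posture's spread and tilt:

```python
def central_moment(points, p, q):
    # Central geometric moment mu_pq of a set of 2D keypoints; subtracting
    # the centroid makes the feature invariant to where the person sits
    # in the frame.
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum((x - cx) ** p * (y - cy) ** q for x, y in points)

def posture_features(points):
    # Second-order moments describe the spread and tilt of the posture.
    return [central_moment(points, 2, 0),   # horizontal spread
            central_moment(points, 0, 2),   # vertical spread
            central_moment(points, 1, 1)]   # covariance / tilt

# Four keypoints at the corners of a square: symmetric, so zero tilt.
features = posture_features([(0, 0), (2, 0), (0, 2), (2, 2)])
```

Unlike raw pairwise distances and angles, such moments summarize the whole keypoint configuration in a few numbers, which may explain their advantage as classifier inputs.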
167

Analysis and control of an eight degree-of-freedom manipulator

Nyzen, Robert J. January 1999 (has links)
No description available.
168

Rigorous Model of Panoramic Cameras

Shin, Sung Woong 31 March 2003 (has links)
No description available.
169

Using a Leadership and Civic Engagement Course to Address the Retention of African American Males

Cunningham, Patricia Frances Rene 20 October 2011 (has links)
No description available.
170

Deep Learning for estimation of fingertip location in 3-dimensional point clouds: An investigation of deep learning models for estimating fingertips in a 3D point cloud and its predictive uncertainty

Hölscher, Phillip January 2021 (has links)
Sensor technology is rapidly developing and, consequently, the generation of point cloud data is constantly increasing. Since the recent release of PointNet, it is possible to process this unordered 3-dimensional data directly in a neural network. The company TLT Screen AB, which develops cutting-edge tracking technology, seeks to optimize the localization of the fingertips of a hand in a point cloud. To do so, the identification of relevant 3D neural network models for modeling hands and detecting fingertips in various hand orientations is essential. Hand PointNet processes point clouds of hands directly and estimates fixed points (joints) of the hand, including the fingertips. This model was therefore selected to optimize the localization of fingertips for TLT Screen AB and forms the subject of this research. The model has advantages over conventional convolutional neural networks (CNNs). First, in contrast to a 2D CNN, Hand PointNet can use the full 3-dimensional spatial information. Compared to a 3D CNN, moreover, it avoids unnecessarily voluminous data and enables more efficient learning. The model was trained and evaluated on the public MSRA Hand dataset. In contrast to previously published work, the main object of this investigation is the estimation of only 5 joints, one per fingertip. The behavior of the model under a reduction from the usual 21 joints to 11 and then to only 5 joints is examined. It is found that reducing the number of joints increases the mean error of the estimated joints, and that the distribution of the residuals of the fingertip estimates becomes less dense. MC dropout, used to study the prediction uncertainty for the fingertips, shows that the uncertainty increases when the number of joints is decreased. Finally, the results show that the uncertainty is greatest for the prediction of the thumb tip; starting from the tip of the thumb, the uncertainty of the estimates decreases with each additional fingertip.
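The MC dropout procedure used above for uncertainty estimation can be sketched on a toy model. This is a minimal illustration of the general technique, not the thesis's Hand PointNet setup: dropout is kept active at test time, several stochastic forward passes are run, and the spread of the outputs serves as the predictive-uncertainty estimate.

```python
import random
import statistics

def mc_dropout_predict(weights, x, p_drop=0.2, n_samples=200, seed=0):
    """Monte Carlo dropout on a toy linear regressor: each forward pass
    randomly zeroes weights with probability p_drop (rescaling survivors
    to preserve the expectation), and the mean/std over passes give the
    prediction and its uncertainty."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        y = sum(0.0 if rng.random() < p_drop else (w / (1 - p_drop)) * xi
                for w, xi in zip(weights, x))
        outputs.append(y)
    return statistics.mean(outputs), statistics.stdev(outputs)

# Deterministic prediction would be 0.5*1 - 0.3*2 + 0.8*3 = 2.3.
mean, std = mc_dropout_predict([0.5, -0.3, 0.8], [1.0, 2.0, 3.0])
```

In the thesis's setting the same idea applies per predicted joint, which is what allows the per-fingertip uncertainties (largest at the thumb tip) to be compared.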
