1 |
Shoulder Keypoint-Detection from Object Detection
Kapoor, Prince, 22 August 2018
This thesis presents a detailed study of the Convolutional Neural Network (CNN)
architectures that have helped computer vision researchers achieve state-of-the-art
performance on classification, detection, segmentation, and many other image
analysis challenges. Since the advent of deep learning, CNNs have been used in
almost all computer vision applications, so there is a pressing need to understand
these feature extractors in detail and to weigh the pros and cons of each one
carefully. For our experiments, we chose an object detection task and a model
architecture that strikes a balance between computational cost and accuracy: the
LSTM-Decoder. We evaluated the model with different CNN feature extractors and
identified their pros and cons in various scenarios. The results obtained on
different datasets show that the CNN plays a major role in achieving higher
accuracy, and we also achieved accuracy comparable to the state of the art on the
Pedestrian Detection Dataset.
Extending the object detection work, we also implemented two model architectures
that find shoulder keypoints. The first strategy uses the annotation detected by
the object detector to generate a small cropped image, which is fed into a small
cascade network trained to detect shoulder keypoints. The second strategy uses the
same object detection model and fine-tunes its weights to predict shoulder
keypoints. So far, we have generated results for shoulder keypoint detection only.
However, the idea could be extended to full-body pose estimation by modifying the
cascaded network for that purpose, which is an important topic for future work on
this thesis.
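The crop-then-cascade strategy described above depends on one piece of bookkeeping: a keypoint predicted inside the crop must be shifted back into full-image coordinates. The following is a minimal sketch of that step only, not the thesis implementation. The names `crop` and `to_image_coords`, the `(x_min, y_min, x_max, y_max)` box layout, and the nested-list image are all assumptions made for illustration, and the keypoint network itself is left out.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels, max exclusive.
Box = Tuple[int, int, int, int]


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Return the sub-image covered by box, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


def to_image_coords(keypoint: Tuple[int, int], box: Box) -> Tuple[int, int]:
    """Map an (x, y) keypoint from crop coordinates back to the full image."""
    # Use the same clamped origin that crop() used.
    x0, y0 = max(0, box[0]), max(0, box[1])
    return (keypoint[0] + x0, keypoint[1] + y0)


# Toy 6x8 "image" whose pixel value encodes its own position.
image = [[y * 8 + x for x in range(8)] for y in range(6)]
detection = (2, 1, 5, 4)            # stand-in for a detector output
patch = crop(image, detection)      # this patch would go to the cascade network
shoulder_in_patch = (1, 2)          # stand-in for a keypoint prediction
shoulder = to_image_coords(shoulder_in_patch, detection)
print(shoulder)
```

Clamping the box in both functions keeps the mapping consistent when a detection extends past the image border.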
|
2 |
Single Camera Autonomous Navigation for Micro Aerial Vehicles
Bowen, Jacob, 15 December 2012
Micro Aerial Vehicles (MAVs) provide a highly capable, agile platform, ideally suited for intelligence/surveillance/reconnaissance missions, urban search and rescue, and scientific exploration. Critical to the success of these tasks is a system which moves autonomously through an unknown, obstacle-strewn, GPS-denied environment. Classical simultaneous localization and mapping (SLAM) approaches rely on large, heavy sensors to generate 3-D information about a MAV’s surroundings, severely limiting its abilities. This motivates a study of Parallel Tracking and Mapping (PTAM), an algorithm requiring only a single camera to provide 3-D data to an autonomous navigation system. Metric properties of 3-D MAV pose estimates are compared with physical measurements to explore tracking accuracy. Additionally, a discrete wavelet transform-based keypoint detector is implemented for a feasibility study on improving map density in low-visual-detail environments. Finally, a system is presented that integrates PTAM, autonomous MAV control, and a human interface for manual control and data logging.
|
3 |
Cluster-Based Salient Object Detection Using K-Means Merging and Keypoint Separation with Rectangular Centers
Buck, Robert, 01 May 2016
The explosion of internet traffic, the advent of social media sites such as Facebook and Twitter, and the increased availability of digital cameras have saturated life with images and videos. Never before has it been so important to sift quickly through large amounts of digital information. Salient Object Detection (SOD) is a computer vision topic concerned with methods for locating important objects in pictures. SOD has proven helpful in numerous applications such as image forgery detection and traffic sign recognition. In this thesis, I outline a novel SOD technique to automatically isolate important objects from the background in images.
|
4 |
Vyhledávání fotografií podle obsahu / Content Based Photo Search
Dvořák, Pavel, January 2014
This thesis covers the design and practical realization of a tool for quick similarity-based search in large image databases containing tens to hundreds of thousands of photos. The proposed technique uses various methods of descriptor extraction, the creation of Bag of Words dictionaries, and methods of storing image data in a PostgreSQL database. Experiments with the implemented software were then carried out to evaluate the search-time efficiency and the scaling possibilities of the designed solution.
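As a rough illustration of the Bag of Words step mentioned above, each local descriptor can be assigned to its nearest dictionary entry and the image summarized as a normalized histogram of those assignments. This is a generic sketch under stated assumptions, not the tool built in the thesis: the names `nearest_word` and `bow_histogram` are hypothetical, the dictionary is given rather than learned, and squared Euclidean distance is an assumed metric.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, vocabulary: Sequence[Vector]) -> int:
    """Index of the dictionary entry closest to descriptor (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(vocabulary):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors: Sequence[Vector],
                  vocabulary: Sequence[Vector]) -> List[float]:
    """Normalized histogram of visual-word assignments for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # An image with no descriptors gets an all-zero histogram.
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]


# Toy two-word dictionary and four 2-D descriptors (illustrative values only).
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (1.0, 1.0)]
print(bow_histogram(descriptors, vocabulary))
```

Because every image is reduced to a fixed-length vector, two images can be compared by comparing their histograms, regardless of how many descriptors each one produced.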
|
5 |
Direction estimation using visual odometry / Uppskattning av riktning med visuell odometri
Masson, Clément, January 2015
This Master's thesis tackles the problem of measuring objects’ directions from a motionless observation point. A new method is proposed, based on a single rotating camera and requiring knowledge of the directions of only two (or more) landmarks. In a first phase, multi-view geometry is used to estimate camera rotations and key elements’ directions from a set of overlapping images. Then, in a second phase, the direction of any object can be estimated by resectioning the camera associated with a picture showing this object. A detailed description of the algorithmic chain is given, along with test results on both synthetic data and real images taken with an infrared camera.
|
6 |
Deep Image Processing with Spatial Adaptation and Boosted Efficiency & Supervision for Accurate Human Keypoint Detection and Movement Dynamics Tracking
Dai, Chao Yang, 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / This thesis aims to design and develop the spatial adaptation approach through spatial transformers to improve the accuracy of human keypoint recognition models. We have studied different model types and design choices to gain an accuracy increase over models without spatial transformers and analyzed how spatial transformers increase the accuracy of predictions. A neural network called Widenet has been leveraged as a specialized network for providing the parameters for the spatial transformer. Further, we have evaluated methods to reduce the model parameters, as well as the strategy to enhance the learning supervision for further improving the performance of the model. Our experiments and results have shown that the proposed deep learning framework can effectively detect the human key points, compared with the baseline methods. Also, we have reduced the model size without significantly impacting the performance, and the enhanced supervision has improved the performance. This study is expected to greatly advance the deep learning of human key points and movement dynamics.
|
7 |
COMPUTER VISION SYSTEMS FOR PRACTICAL APPLICATIONS IN PRECISION LIVESTOCK FARMING
Prajwal Rao (19194526), 23 July 2024
The use of advanced imaging technology and algorithms for managing and monitoring livestock improves many aspects of livestock farming, such as health monitoring, behavioral analysis, early disease detection, feed management, and overall farming efficiency. Applying computer vision techniques such as keypoint detection and depth estimation to these problems helps automate repeatable tasks, which in turn improves farming efficiency. In this thesis, we delve into two main aspects, early disease detection and feed management:
- Phenotyping Ducks using Keypoint Detection: A platform to measure duck phenotypes such as wingspan, back length, and hip width, packaged in an online user interface for ease of use.
- Real-Time Cattle Intake Monitoring Using Computer Vision: A complete end-to-end real-time monitoring system to measure cattle feed intake using stereo cameras.
Furthermore, considering the above implementations and their drawbacks, we propose a cost-effective simulation environment for feed estimation to conduct extensive experiments prior to real-world implementation. This approach allows us to test and refine the computer vision systems under controlled conditions, identify potential issues, and optimize performance without the high costs and risks associated with direct deployment on farms. By simulating various scenarios and conditions, we can gather valuable data, improve algorithm accuracy, and ensure the system's robustness. Ultimately, this preparatory step will facilitate a smoother transition to real-world applications, enhancing the reliability and effectiveness of computer vision in precision livestock farming.
|
8 |
Detecção de objetos por reconhecimento de grafos-chave / Object detection by keygraph recognition
Hashimoto, Marcelo, 27 April 2012
Object detection is a classic problem in computer vision, present in applications such as automated surveillance, medical image analysis and information retrieval. Among the existing approaches in the literature to solve this problem, we can highlight methods based on keypoint recognition that can be interpreted as different implementations of the same framework. The objective of this PhD thesis is to develop and evaluate a generalized version of this framework, in which keypoint recognition is replaced by keygraph recognition. The potential of the research lies in the richness of information that a graph can present before and after being recognized. The difficulty of the research lies in the problems that this richness can cause, such as the curse of dimensionality and computational complexity.
Three contributions are included in the thesis: a detailed description of a keygraph-based framework for object detection, faithful implementations that demonstrate its feasibility, and experimental results that demonstrate its performance.
|
10 |
Interpretable Fine-Grained Visual Categorization
Guo, Pei, 16 June 2021
Not all categories are created equal in object recognition. Fine-grained visual categorization (FGVC) is a branch of visual object recognition that aims to distinguish subordinate categories within a basic-level category. Examples include classifying an image of a bird into specific species like "Western Gull" or "California Gull". Such subordinate categories exhibit characteristics like small inter-category variation and large intra-class variation, making distinguishing them extremely difficult. To address such challenges, an algorithm should be able to focus on object parts and be invariant to object pose. Like many other computer vision tasks, FGVC has witnessed phenomenal advancement following the resurgence of deep neural networks. However, the proposed deep models are usually treated as black boxes. Network interpretation and understanding aims to unveil the features learned by neural networks and explain the reason behind network decisions. It is not only a necessary component for building trust between humans and algorithms, but also an essential step towards continuous improvement in this field. This dissertation is a collection of papers that contribute to FGVC and neural network interpretation and understanding. Our first contribution is an algorithm named Pose and Appearance Integration for Recognizing Subcategories (PAIRS) which performs pose estimation and generates a unified object representation as the concatenation of pose-aligned region features. As the second contribution, we propose the task of semantic network interpretation. For filter interpretation, we represent the concepts a filter detects using an attribute probability density function. We propose the task of semantic attribution using textual summarization that generates an explanatory sentence consisting of the most important visual attributes for decision-making, as found by a general Bayesian inference algorithm. 
Pooling has been a key component in convolutional neural networks and is of special interest in FGVC. Our third contribution is an empirical and experimental study towards a thorough yet intuitive understanding and extensive benchmark of popular pooling approaches. Our fourth contribution is a novel LMPNet for weakly-supervised keypoint discovery. A novel leaky max pooling layer is proposed to explicitly encourage sparse feature maps to be learned. A learnable clustering layer is proposed to group the keypoint proposals into final keypoint predictions. 2020 marks the 10th year since the beginning of fine-grained visual categorization. It is of great importance to summarize the representative works in this domain. Our last contribution is a comprehensive survey of FGVC containing nearly 200 relevant papers that cover 7 common themes.
|