Global ETD Search

291	VIP: Finding Important People in Images Mathialagan, Clint Solomon 25 June 2015 People preserve memories of events such as birthdays, weddings, or vacations by capturing photos, often depicting groups of people. Invariably, some individuals in the image are more important than others given the context of the event. This work analyzes the concept of the importance of individuals in group photographs. We address two specific questions: given an image, who are the most important individuals in it, and given multiple images of a person, which image depicts the person in the most important role? We introduce a measure of importance of people in images and investigate the correlation between importance and visual saliency. We find that not only can we automatically predict the importance of people from purely visual cues, but incorporating this predicted importance also results in significant improvement in applications such as im2text (generating sentences that describe images of groups of people). / Master of Science Computer Vision Machine Learning Importance
292	Object Proposals in Computer Vision Chavali, Neelima 09 September 2015 Object recognition is a central problem in computer vision which deals with both localizing and identifying objects in images. Object proposals have recently become an important part of the object recognition process. Object proposals are algorithms used for localizing objects in images. This thesis is a study of object proposals and is composed of three parts. First, we present a new data-driven approach for generating object proposals. Second, we release a MATLAB library which can be used to generate object proposals using all the existing algorithms. The library can also be used for evaluating object proposals using the three most commonly used metrics. Finally, we identify previously unnoticed bias in the existing protocol for evaluating object proposals and propose ways to alleviate this bias. / Master of Science Object proposals evaluation computer vision
293	Data-Efficient Learning in Image Synthesis and Instance Segmentation Robb, Esther Anne 18 August 2021 Modern deep learning methods have achieved remarkable performance on a variety of computer vision tasks, but frequently require large, well-balanced training datasets to achieve high-quality results. Data-efficient performance is critical for downstream tasks such as automated driving or facial recognition. We propose two methods of data-efficient learning for the tasks of image synthesis and instance segmentation. We first propose a method of high-quality and diverse image generation by finetuning on only 5-100 images. Our method factors a pretrained model into a small but highly expressive weight space for finetuning, which discourages overfitting on a small training set. We validate our method in a challenging few-shot setting of 5-100 images in the target domain. We show that our method has significant visual quality gains compared with existing GAN adaptation methods. Next, we introduce a simple adaptive instance segmentation loss which achieves state-of-the-art results on the LVIS dataset. We demonstrate that the rare categories are heavily suppressed by correct background predictions, which reduce the probability for all foreground categories with equal weight. Due to the relative infrequency of rare categories, this leads to an imbalance that biases towards predicting more frequent categories. Based on this insight, we develop DropLoss -- a novel adaptive loss to compensate for this imbalance without a trade-off between rare and frequent categories. / Master of Science / Many of the impressive results seen in modern computer vision rely on learning patterns from huge datasets of images, but these datasets may be expensive or difficult to collect. Many applications of computer vision need to learn from a very small number of examples, such as learning to recognize an unusual traffic event and behave safely in a self-driving car.
In this thesis we propose two methods of learning from only a few examples. Our first method generates novel, high-quality and diverse images using a model fine-tuned on only 5-100 images. We start with an image generation model that was trained on a much larger image set (70K images) and adapt it to a smaller image set (5-100 images). We selectively train only part of the network to encourage diversity and prevent memorization. Our second method focuses on the instance segmentation setting, where the model predicts (1) what objects occur in an image and (2) their exact outline in the image. This setting commonly suffers from long-tail distributions, where some of the known objects occur frequently (e.g. "human" may occur 1000+ times) but most only occur a few times (e.g. "cake" or "parrot" may only occur 10 times). We observed that the "background" label has a disproportionate effect of suppressing the rare object labels. We use this observation to develop a method to balance suppression from background classes during training. Computer vision data-efficient learning
294	Target Tracking from a UAV based on Computer Vision Zhang, Yuhan 13 June 2018 This thesis presents the design and construction of a tracking system that lets a quadrotor chase a moving target using computer vision in a GPS-denied environment. The camera is mounted at the bottom of the quadrotor and captures the image below it. The image information is transmitted to a computer via a video transmitter and receiver module. The target is detected by a color- and contour-based detection algorithm. The desired pitch and roll angles are calculated by the position controller from the relative position and velocity between the moving target and the quadrotor. The interface between the PC and the quadrotor is built by controlling the PWM signals of the transmitter for command transmission. Three types of position controllers, namely a PD controller, a fuzzy controller, and a self-tuning PD controller based on fuzzy logic, are designed and tested in the tracking tests. Results on the corresponding tracking performances are presented. Ways to improve the tracking performance, including the use of an optical sensor for velocity measurement and a high-resolution camera for higher image quality, are discussed as future work. / Master of Science / In this thesis, an automatic tracking system for a quadrotor based on computer vision in a GPS-denied environment is studied and developed. A camera mounted on the quadrotor is used to “see” the moving target. The relative position and velocity between the quadrotor and the target can be obtained by a visual detection and tracking algorithm. Through the position controller, the desired pitch and roll angles are calculated to determine how much acceleration the quadrotor requires to chase the moving target and keep the target within the detection range. Three position controllers are designed and tested, and their corresponding performances are compared and discussed. Drone aircraft Control computer vision
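As a rough illustration of the PD position controller mentioned in entry 294, the sketch below maps the relative position and velocity along one horizontal axis to a desired tilt angle. The function name, gains, and angle limit are hypothetical values chosen for illustration; the thesis's actual controller design and tuning are not given in this listing.

```python
def pd_tilt_command(rel_pos_m, rel_vel_mps, kp=0.08, kd=0.05, max_angle_rad=0.35):
    """Return a desired tilt angle (rad) for one horizontal axis.

    rel_pos_m   -- target position minus quadrotor position (metres)
    rel_vel_mps -- target velocity minus quadrotor velocity (metres/second)
    kp, kd      -- proportional and derivative gains (assumed values)
    """
    # PD law: proportional term on position error, derivative term on velocity error.
    raw = kp * rel_pos_m + kd * rel_vel_mps
    # Saturate so the commanded tilt stays within an assumed safe range.
    return max(-max_angle_rad, min(max_angle_rad, raw))


if __name__ == "__main__":
    # One call per axis: forward/backward error drives pitch, left/right drives roll.
    pitch = pd_tilt_command(rel_pos_m=2.0, rel_vel_mps=0.5)
    roll = pd_tilt_command(rel_pos_m=-1.0, rel_vel_mps=0.0)
    print(f"pitch={pitch:.3f} rad, roll={roll:.3f} rad")
```

The saturation step is a design choice of this sketch: it keeps a large position error from commanding an aggressive tilt that would move the target out of the camera's view.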
295	REPRODUCIBLE DEEP LEARNING SOFTWARE FOR EFFICIENT COMPUTER VISION Nikita Ravi (18398481) 19 April 2024 Computer vision (CV) using deep learning can equip machines with the ability to understand visual information. CV has seen widespread adoption across numerous industries, from autonomous vehicles to facial recognition on smartphones. However, alongside these advancements, there have been increasing concerns about reproducing the results. The difficulty of reproducibility may arise for multiple reasons, such as differences in execution environments, missing or incompatible software libraries, proprietary data, and the stochastic nature of some software. A study conducted by the journal Nature reveals that more than 70% of researchers failed to reproduce other researchers' experiments, and over 50% failed to reproduce their own. Given the critical role that computer vision plays in many applications, for example in edge devices like mobile phones and drones, irreproducibility poses significant challenges for researchers and practitioners. To address these concerns, this thesis presents a systematic approach to analyzing and improving the reproducibility of computer vision models through case studies. This approach combines rigorous documentation standards, a standardized software environment, and a comprehensive guide of best practices. By implementing these strategies, we aim to bridge the gap between research and practice, ensuring that innovations in computer vision can be effectively reproduced and deployed. Computer vision Image processing Machine Learning Deep Learning Computer Vision Low-Powered Computer Vision Reproducibility
296	New toolsets to understand environmental sensation and variability in the aging process Zhan, Mei 07 January 2016 Aging is a complex process by which a combination of environmental, genetic and stochastic factors generate whole-system changes that modify organ and tissue function and alter physiological processes. Over the last few decades, many genetic and environmental modulators of aging have been found to be highly conserved between humans and a diverse group of model organisms. Yet, an integrative understanding of how these environmental and genetic variables interact over time in a whole organism to modulate the systemic changes involved in aging is lacking. The goal of this thesis project is to advance a systems perspective of aging by providing the experimental tools and conceptual framework for dissecting the regulatory connection between environmental inputs, molecular outputs and long-term aging phenotypes in Caenorhabditis elegans, an experimentally tractable multi-cellular model for aging. Specifically, this work advances the quantitative imaging toolsets available to biologists by developing and refining microfluidic, hardware, computer vision, and software integration tools for high-throughput, high-content imaging of C. elegans. As a result of these technological advances, new roles for the TGF-beta and serotonin signaling pathways in encoding environmental food signals to influence longevity were uncovered and quantitatively characterized. Moreover, this work develops and integrates new microfluidic technologies with off-chip support systems to establish a platform for long-term tracking of the health and longevity trajectories of large numbers of individual C. elegans.
The capabilities of this platform have the potential to address many important questions in aging, including the environmental determinants of aging, the sources of inter-individual variability, the time course of aging-related declines, and the effects of interventional strategies to improve health outcomes. Together, the toolsets for quantitative imaging and the long-term culture platform permit the large-scale investigation of both the internal state and long-term behavioral and health outputs of an important multicellular model organism for aging. Microfluidics Aging Computer vision C. Elegans
297	Foreground detection of video through the integration of novel multiple detection algorithims Nawaz, Muhammad January 2013 The main outcome of this research is the design of a foreground detection algorithm that is more accurate and less time-consuming than existing algorithms. By accuracy we mean an exact mask of the foreground object(s), one that satisfies the respective ground truth value. Motion detection, the first component of the foreground detection process, can be achieved via pixel-based and block-based methods, both of which have their own merits and disadvantages. Pixel-based methods are accurate but time-consuming, so they cannot be recommended for real-time applications. Block-based motion estimation, on the other hand, is less accurate but consumes less time and is thus ideal for real-time applications. In the first proposed algorithm, a block-based motion estimation technique is chosen for timely execution. To overcome the issue of accuracy, a morphological technique called opening-and-closing by reconstruction was adopted; it is a pixel-based operation, so it produces higher accuracy, and it requires less execution time. The morphological operation opening-and-closing by reconstruction finds the maxima and minima inside the foreground object(s). Thus this novel simultaneous process compensates for the lower accuracy of block-based motion estimation. To verify the efficiency of this algorithm, a complex video consisting of multiple colours and of fast and slow motions at various places was selected. Based on 11 different performance measures, the proposed algorithm achieved, on average, more than 24.73% higher accuracy than four well-established algorithms. Background subtraction, the most cited algorithm for foreground detection, faces the major problem of choosing a proper threshold value at run time.
To obtain an effective threshold value at run time in the background subtraction algorithm, the next proposed algorithm uses motion, the primary component of the foreground detection process. For this purpose, the peaks and valley of the smooth histogram of the motion were analyzed; these reflect the fast- and slow-motion areas of the moving object(s) in the given frame, and the threshold value is generated at run time from the values of the peaks and valley. This proposed algorithm was tested using four recommended video sequences, including indoor and outdoor shoots, and was compared with five highly ranked algorithms. Based on the values of standard performance measures, the proposed algorithm achieved, on average, more than 12.30% higher accuracy. 006.3
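The idea in entry 297 of deriving a run-time threshold from histogram peaks and a valley can be sketched as follows. This is only an illustration under stated assumptions: the moving-average smoothing, the choice of the two highest peaks, and the function names are hypothetical, and the listing does not say how the thesis actually combines the peak and valley values.

```python
def smooth(hist, radius=1):
    """Moving-average smoothing of a histogram (list of bin counts)."""
    n = len(hist)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def valley_threshold(hist, radius=1):
    """Return the bin index of the valley between the two highest peaks.

    Returns None when the smoothed histogram has fewer than two interior peaks.
    """
    s = smooth(hist, radius)
    # Interior local maxima of the smoothed histogram.
    peaks = [i for i in range(1, len(s) - 1) if s[i] > s[i - 1] and s[i] >= s[i + 1]]
    if len(peaks) < 2:
        return None
    # Keep the two highest peaks, ordered left to right.
    left, right = sorted(sorted(peaks, key=lambda i: s[i], reverse=True)[:2])
    # The valley is the lowest smoothed bin between those peaks.
    return min(range(left, right + 1), key=lambda i: s[i])


if __name__ == "__main__":
    # Hypothetical motion-magnitude histogram with a slow-motion and a fast-motion mode.
    motion_hist = [0, 5, 9, 5, 1, 0, 1, 4, 8, 4, 0]
    print(valley_threshold(motion_hist))
```

Pixels whose motion magnitude falls in bins above the returned index would then be treated as fast-moving; whether the thesis uses the valley itself or some function of the peak and valley values is not stated here.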
298	Fast computation of moments with applications to transforms Liu, Jianguo, 劉建國 January 1996 published_or_final_version / Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy Moments method (Statistics) Computer vision Algorithms
299	A methodology for resolving multiple vehicle occlusion in visual traffic surveillance Pang, Chun-cheong., 彭俊昌. January 2005 published_or_final_version / abstract / Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy Traffic congestion. Image processing. Computer vision.
300	Combining silhouette and shading cues for model reconstruction Li, Shuda, 李書達 January 2007 published_or_final_version / abstract / Computer Science / Master / Master of Philosophy Computer vision. Image processing - Data processing.
