561.
SMARTGUIDE: Revolutionizing the Depth and Dependability of Vision-Impaired Navigation. Gandham, Rishith. 03 January 2025.
Globally, over 2.2 billion people face vision impairment, necessitating innovative solutions for safe, independent navigation. Traditional aids like canes, guide dogs, and GPS offer basic support but lack the sophistication to provide contextual understanding, precise navigation, or real-time hazard alerts. This project presents SmartGuide, a mobile app designed to enhance the independence of visually impaired users through AI-driven features. SmartGuide offers three main functions: (1) Smart Vision, using the GPT-4 Vision API to deliver spoken feedback about surroundings; (2) Navigation, combining QR code detection via YOLO with ZoeDepth for depth estimation, guiding users to destinations through the shortest path calculated by Dijkstra's algorithm; and (3) Obstacle Detection and Alerts, where YOLO identifies obstacles, and ZoeDepth estimates their distance to inform users of potential hazards. By adapting its responses based on user feedback, SmartGuide provides personalized, reliable guidance that empowers visually impaired individuals to navigate with confidence and safety, advancing the field of accessible technology. / Master of Science / Navigating unfamiliar or crowded spaces is a major challenge for visually impaired individuals, who often rely on canes, guide dogs, or GPS tools. While useful, these aids offer only limited guidance and do not provide detailed information about surroundings, directions, or nearby obstacles. SmartGuide is a mobile application designed to address these gaps, enabling visually impaired users to navigate more independently and safely. SmartGuide includes three main features: Smart Vision, which gives audio feedback about the user's surroundings; Navigation, which uses QR codes and depth estimation to guide users to destinations along the shortest path; and Obstacle Alerts that detects obstacles, warning users of potential hazards. Using advanced AI technologies and feedback from visually impaired users, SmartGuide delivers clear, actionable guidance that supports confident, safe movement in both familiar and new environments. This research-driven tool demonstrates how technology can enhance accessibility, making navigation easier and safer for those with vision impairment.
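The navigation feature's shortest-path step is classical Dijkstra over a graph of QR-code waypoints. A minimal Python sketch of that step follows; the waypoint names and edge distances are illustrative, not taken from the thesis.

```python
import heapq

def dijkstra(graph, start, goal):
    """Shortest path over a waypoint graph; graph[node] = [(neighbor, meters), ...]."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if node in done:
            continue
        done.add(node)
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    # Walk back from the goal to recover the waypoint sequence.
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

# Hypothetical indoor waypoint graph built from detected QR codes:
graph = {"entrance": [("hallway", 5.0)],
         "hallway": [("room_101", 3.0), ("room_102", 4.0)],
         "room_101": [], "room_102": []}
print(dijkstra(graph, "entrance", "room_101"))  # ['entrance', 'hallway', 'room_101']
```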
562.
A Framework for Human Body Tracking Using an Agent-based Architecture. Fang, Bing. 12 August 2011.
The purpose of this dissertation is to present our agent-based human tracking framework and to evaluate our results in light of previous research in the field.
Our agent-based approach departs from the process-centric model, in which agents are bound to specific processes, and introduces a novel model in which agents are bound to the objects or sub-objects being recognized or tracked. This hierarchical agent-based model allows the system to handle a variety of cases, such as single or multiple people in front of single or stereo cameras. We employ the job-market model for communication among agents. In this dissertation, we present several detailed experiments that demonstrate the effectiveness of the agent-based tracking system.
In our design, the agents are autonomous, self-aware entities capable of communicating with other agents to perform tracking within agent coalitions. Each agent, carrying high-level abstracted knowledge, seeks evidence for its existence from low-level features (e.g., motion vector fields, color blobs) and from its peers (other agents representing body parts with which it is compatible). The power of the agent-based approach lies in its flexibility: domain information can be encoded within each agent to produce an overall tracking solution. / Ph. D.
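The job-market communication model can be pictured as low-level features being offered as jobs on which body-part agents bid. The sketch below is schematic only; the class, the bid rule, and all numbers are invented for illustration and are not the dissertation's implementation.

```python
class BodyPartAgent:
    """A body-part agent that bids on low-level features as evidence of its existence."""
    def __init__(self, name, region):
        self.name = name
        self.region = region  # (x, y, w, h): where this part expects to appear

    def bid(self, feature):
        # Bid the feature's strength if its centroid falls in the expected region.
        fx, fy = feature["centroid"]
        x, y, w, h = self.region
        return feature["strength"] if (x <= fx <= x + w and y <= fy <= y + h) else 0.0

def job_market(agents, features):
    """Award each feature 'job' to the highest positive bidder."""
    awards = {}
    for i, feat in enumerate(features):
        best_bid, winner = max((agent.bid(feat), agent.name) for agent in agents)
        if best_bid > 0:
            awards[i] = winner
    return awards

agents = [BodyPartAgent("head", (40, 0, 20, 20)), BodyPartAgent("torso", (30, 20, 40, 50))]
features = [{"centroid": (50, 10), "strength": 0.9},   # a color blob near the head
            {"centroid": (45, 40), "strength": 0.7}]   # a motion blob on the torso
print(job_market(agents, features))  # {0: 'head', 1: 'torso'}
```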
563.
Advanced Control Design of an Autonomous Line Painting Robot. Cao, Mincan. 30 May 2017.
Painting still plays a fundamental role in communication today. Road surface markings, for example, keep traffic orderly and the modern traffic system efficient. With the development of the Autonomous Ground Vehicle (AGV), the idea of a line Painting Robot emerged. In this thesis, a Painting Robot was designed as a standalone system based on the AGV platform.
This study discusses the mechanical and electronic design of the Painting Robot, with the overall design driven by the requirements of line painting. Because a camera was selected as the robot's primary sensor, computer vision techniques were applied throughout, and advanced control theory was introduced as well. Three controllers were developed. A Proportional-Integral (PI) controller with an anti-windup feature was designed to overcome the drawbacks of the traditional PI controller. Model Reference Adaptive Control (MRAC) was introduced to handle the uncertainties of the system. Finally, a hybrid PI-MRAC controller was implemented to retain the advantages of both approaches. Experiments were conducted to evaluate the performance of the entire system, indicating a successful design of the Painting Robot. / Master of Science / Painting still plays a fundamental role in communication today. With the development of the Autonomous Ground Vehicle (AGV), the idea of a line Painting Robot emerged. In this thesis, a Painting Robot was designed as a standalone system based on the AGV platform.
In this study, a Painting Robot with a two-camera system was designed. Computer vision techniques and advanced control theory were applied. Three controllers were developed: a Proportional-Integral (PI) controller with an anti-windup feature, Model Reference Adaptive Control (MRAC), and a hybrid PI-MRAC controller. Experiments were conducted to evaluate the performance of the entire system, indicating a successful design of the Painting Robot.
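The anti-windup feature mentioned above is commonly realized by halting integration while the actuator is saturated. Below is a minimal sketch of one such scheme (conditional integration); the gains and limits are illustrative, and this is not claimed to be the thesis's exact design.

```python
class PIAntiWindup:
    """PI controller that stops integrating while the output is saturated."""
    def __init__(self, kp, ki, u_min, u_max):
        self.kp, self.ki = kp, ki
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0

    def update(self, error, dt):
        candidate = self.integral + error * dt
        u = self.kp * error + self.ki * candidate
        u_sat = max(self.u_min, min(self.u_max, u))
        if u == u_sat:
            # Not saturated: accept the integral update. Otherwise the integral
            # is frozen, preventing wind-up during large transients.
            self.integral = candidate
        return u_sat

pi = PIAntiWindup(kp=2.0, ki=0.5, u_min=-1.0, u_max=1.0)
print(pi.update(error=0.3, dt=0.01))  # small error: behaves like a plain PI
```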
564.
Computer Vision Tracking of sUAS From a Pan/Tilt Platform. Ogorzalek, Jeremy Patrick. 24 June 2019.
The ability to quickly, accurately, and autonomously identify and track objects in digital images in real time has been an area of investigation for quite some time. Research in this area falls under the broader category of computer vision. Only in recent decades, with advances in computing power and commercial optical hardware, has this capability become possible. There are many different methods of identifying and tracking objects of interest, and best practices are still being developed, varying by application. This thesis examines background subtraction methods as they apply to the tracking of small unmanned aerial systems (sUAS). A system combining commercial off-the-shelf (COTS) cameras and a pan-tilt unit (PTU), along with custom-developed code, is developed for the purpose of continuously pointing at and tracking the motion of an sUAS in flight. Mixtures of Gaussians Background Modeling (MOGBM) is used to track the motion of the sUAS in frame and determine when to command the PTU. When the camera is moving, background subtraction methods are unusable, so additional methods are explored to fill this performance gap. The stereo vision capabilities of the system, enabled by the use of two cameras simultaneously, allow for estimation of the three-dimensional position and trajectory of the sUAS. This system can be used as a supplement or replacement to traditional tracking methods such as GPS and RADAR as part of a larger unmanned aerial systems traffic control (UTC) infrastructure. / Master of Science / The ability to quickly, accurately, and automatically identify and track targets in digital images has been of interest for some time now. Research in this area falls under the broader category of computer vision. Only in recent decades, with advances in computing power and commercial optical hardware, has this ability become possible. There are many different methods of identifying and tracking targets of interest, and best practices are still being developed, varying by application. This thesis examines background subtraction methods as they apply to the tracking of small unmanned aerial systems (sUAS), commonly referred to as drones. A system combining cameras and a moving platform, along with custom-developed code, is developed for the purpose of continuously pointing at and tracking the motion of an sUAS in flight. The system is able to map out the three-dimensional position of a flying sUAS over time.
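The Gaussian-mixture background subtraction step has a standard implementation in OpenCV (MOG2), which can stand in for the thesis's MOGBM stage. A minimal sketch of the detect-then-point loop follows; the video source, thresholds, and pointing deadband are illustrative, and the actual PTU command is left as a stub.

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
cap = cv2.VideoCapture(0)  # or a video file of an sUAS flight

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)  # foreground = pixels that moved
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)  # suppress speckle noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        target = max(contours, key=cv2.contourArea)  # assume the largest mover is the sUAS
        x, y, w, h = cv2.boundingRect(target)
        err_x = x + w // 2 - frame.shape[1] // 2
        err_y = y + h // 2 - frame.shape[0] // 2
        # Command the pan-tilt unit only when the target leaves a central deadband;
        # while the camera slews, background subtraction must be suspended.
        if abs(err_x) > 50 or abs(err_y) > 50:
            pass  # send a pan/tilt command proportional to (err_x, err_y)

cap.release()
```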
565.
Understanding Representations and Reducing their Redundancy in Deep Networks. Cogswell, Michael Andrew. 15 March 2016.
Neural networks in their modern deep learning incarnation have achieved state-of-the-art performance on a wide variety of tasks and domains. A core intuition behind these methods is that they learn layers of features which interpolate between two domains (the input and the output) through a series of related parts.
The first part of this thesis introduces the building blocks of neural networks for computer vision. It starts with linear models, then proceeds to deep multilayer perceptrons and convolutional neural networks, presenting the core details of each. The introduction also builds intuition by visualizing concrete examples of the parts of a modern network.
The second part of this thesis investigates regularization of neural networks. Methods like dropout have been proposed to favor certain (empirically better) solutions over others. However, big deep neural networks still overfit very easily. This section proposes a new regularizer called DeCov, which leads to significantly reduced overfitting (the gap between training and validation performance) and greater generalization, sometimes better than dropout and other times not. The regularizer is based on the cross-covariance of hidden representations and builds on the intuition that different features should try to represent different things, an intuition others have explored with similar losses. Experiments across a range of datasets and network architectures demonstrate reduced overfitting due to DeCov while almost always maintaining or increasing generalization performance, often improving over dropout. / Master of Science
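The cross-covariance idea translates directly into a loss on a batch of activations: penalize the off-diagonal entries of the features' covariance matrix. A minimal PyTorch sketch following the published DeCov formulation; the batch size, feature width, and usage are illustrative.

```python
import torch

def decov_loss(h):
    """DeCov penalty for a batch of hidden activations h with shape (N, D):
    0.5 * (||C||_F^2 - ||diag(C)||_2^2), where C is the covariance of the
    features over the batch. Only off-diagonal (cross-feature) terms are
    penalized, nudging different features to represent different things."""
    h = h - h.mean(dim=0, keepdim=True)   # center each feature over the batch
    cov = h.t() @ h / h.size(0)           # (D, D) covariance matrix
    return 0.5 * ((cov ** 2).sum() - (torch.diagonal(cov) ** 2).sum())

h = torch.randn(32, 128)                  # e.g., one hidden layer's activations
print(decov_loss(h))                      # added to the task loss with a small weight
```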
566.
Collaborative Path Planning and Control for Ground Agents Via Photography Collected by Unmanned Aerial Vehicles. Wood, Sami Warren. 24 June 2022.
Natural disasters damage infrastructure and create significant obstacles to humanitarian aid efforts. Roads may become unusable, hindering or halting efforts to provide food, water, shelter, and life-saving emergency care. Finding a safe route during a disaster is especially difficult because as the disaster unfolds, the usability of roads and other infrastructure can change quickly, rendering most navigation services useless.
With the proliferation of cheap cameras and unmanned aerial vehicles [UAVs], the rapid collection of aerial data after a natural disaster has become increasingly common. This data can be used to quickly appraise the damage to critical infrastructure, which can help solve navigational and logistical problems that may arise after the disaster. This work focuses on a framework in which a UAV is paired with an unmanned ground vehicle [UGV]. The UAV follows the UGV with a downward-facing camera and helps the ground vehicle navigate the flooded environment.
This work makes several contributions: a simulation environment is created to allow for automated data collection in hypothetical disaster scenarios. The simulation environment uses real-world satellite and elevation data to emulate natural disasters such as floods. The environment partially simulates the dynamics of the UAV and UGV, allowing agents to explore during hypothetical disasters. Several semantic image segmentation models are tested for efficacy in identifying obstacles and creating cost maps for navigation within the environment, as seen by the UAV. A deep homography model incorporates temporal relations across video frames to stitch cost maps together. A weighted version of a navigation algorithm is presented to plan a path through the environment. The synthesis of these modules leads to a novel framework wherein a UAV may guide a UGV safely through a disaster area. / Master of Science / Damage to infrastructure after a natural disaster can make navigation a major challenge.
Imagine a hurricane has hit someone's house; they are hurt and need to go to the hospital.
A traditional GPS navigation system, or even their own memory, may fail them, as many roads could be impassable. However, if the GPS could be quickly updated as to which roads were not flooded, it could still be used to navigate and avoid hazards. While the system presented is designed to work with a self-driving vehicle, it could easily be extended to give directions to a human.
The goal of this work is to provide a system, based on aerial photography, that could be used as a replacement for GPS navigation. The advantage of this system is that flooded or damaged infrastructure can be identified and avoided in real time. The system could even identify other possible routes by using photography, such as driving across a field to reach higher ground. Like a GPS, the system works automatically, tracking a user's position and suggesting turns, aiding navigation.
A contribution of this work is a simulation of the environment designed in a video game engine. The game engine creates a video game world that can be flooded and used to test the new navigation system. The video game environment is used to train an artificial intelligence computer model to identify hazards and create routes that would avoid them. The system could be used in a real-world disaster following training in a video game world.
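Cost-map stitching needs a homography between consecutive frames. The thesis uses a learned (deep) homography model, which is not reproduced here; as a classical stand-in, the sketch below estimates frame-to-frame homographies from ORB feature matches with OpenCV, and the chaining comment shows how per-frame cost maps would be warped into a common frame.

```python
import cv2
import numpy as np

def frame_homography(prev_gray, cur_gray):
    """Estimate the homography mapping the current frame into the previous one
    from ORB feature matches (a classical stand-in for a deep homography model)."""
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(cur_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des2, des1), key=lambda m: m.distance)[:200]
    src = np.float32([kp2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)  # robust to outlier matches
    return H

# To stitch cost maps, chain the homographies back to frame 0:
#   H_0k = H_01 @ H_12 @ ... @ H_(k-1)k
# then warp each per-frame cost map with cv2.warpPerspective(cost_k, H_0k, mosaic_size)
# and fuse the overlaps (e.g., keep the maximum cost where maps disagree).
```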
567.
Agricultural Crop Monitoring with Computer Vision. Burns, James Ian. 25 September 2014.
Precision agriculture allows farmers to efficiently use their resources with site-specific applications. The current work looks to computer vision for the data collection method necessary for such a smart field, including cameras sensitive to visual (430-650 nm), near infrared (NIR, 750-900 nm), shortwave infrared (SWIR, 950-1700 nm), and longwave infrared (LWIR, 7500-16000 nm) light. Three areas are considered in the study: image segmentation, multispectral image registration, and the feature tracking of a stressed plant.
The accuracies of several image segmentation methods are compared. Basic thresholding on pixel intensities and vegetation indices results in accuracies below 75%. Neural networks (NNs) and support vector machines (SVMs) label correctly at 89% and 79%, respectively, when given only visual information, and reach final accuracies of 97% when the near infrared is added.
The point matching methods of Scale Invariant Feature Transform (SIFT) and Edge Oriented Histogram (EOH) are compared for accuracy. EOH improves the matching accuracy, but ultimately not enough for the current work.
In order to track the image features of a stressed plant, a set of basil and catmint seedlings is grown and placed under drought and hypoxia conditions. Trends are shown in the average pixel values over the lives of the plants and in the vegetation indices, especially those of Marchant and NIR. Lastly, trends are seen in the image textures of the plants through the use of textons. / Master of Science
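Vegetation-index thresholding of the kind benchmarked above reduces to simple per-pixel arithmetic on the spectral bands. A minimal sketch using NDVI as a representative index (the thesis also evaluates others, such as Marchant's); the band values and the 0.3 cutoff are illustrative.

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).
    Healthy vegetation reflects strongly in NIR and absorbs red light."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

nir_band = np.array([[200, 60], [180, 50]], dtype=np.uint8)
red_band = np.array([[40, 55], [50, 45]], dtype=np.uint8)
plant_mask = ndvi(nir_band, red_band) > 0.3   # True where a pixel looks like a plant
print(plant_mask)  # [[ True False] [ True False]]
```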
568.
Design and Development of an Autonomous Line Painting System. Nagi, Navneet Singh. 08 February 2019.
With vast improvements in computing power over the last two decades, significant engineering resources have been invested in automating labor-intensive or dangerous tasks. One particularly dangerous and labor-intensive task is painting lines on roads to facilitate urban mobility. This thesis proposes an approach to automating the process of painting lines on the ground using an autonomous ground vehicle (AGV) fitted with a stabilized painting mechanism. The AGV accepts Global Positioning System (GPS) coordinates for waypoint navigation. A computer vision algorithm is developed to provide vision feedback to stabilize the painting mechanism. The system is demonstrated to follow a desired input trajectory and cancel high-frequency vibrations caused by the uneven terrain the vehicle is traversing. The stabilizing system is also able to eliminate long-term drift (due to inaccurate GPS waypoint navigation) using the complementary vision system. / MS / There is a need for an automated system capable of painting lines on the ground with minimal human intervention, as the current methods are inefficient, labor intensive, and dangerous. The human input to such a system is limited to the determination of the desired trajectory of the line to be drawn. This thesis presents the design and development of an autonomous line painting system that includes an autonomous ground vehicle (capable of following GPS waypoints) integrated with an automatic line painting mechanism. As the vehicle traverses the ground, it experiences disturbances due to the interaction between the wheels and the ground, as well as long-term drift due to inaccurate tracking of the input GPS coordinates. To compensate for these disturbances, a vision system provides feedback to a stabilizing arm. The automated system demonstrates the capability to follow a square trajectory defined by GPS coordinates while compensating for the disturbances.
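The vision-feedback idea can be reduced to a simple loop: the camera measures the pixel offset between the painted line (or a reference feature) and the desired trajectory, and the stabilizing arm is displaced to cancel it. The sketch below is purely illustrative; the gain, pixel scale, and function name are invented, not taken from the thesis.

```python
def arm_correction(pixel_offset, meters_per_pixel=0.002, gain=0.8):
    """Lateral arm displacement (in meters) that counters the offset measured
    in the camera image. Negative sign: move opposite to the measured error."""
    return -gain * pixel_offset * meters_per_pixel

# e.g., the line is detected 25 px right of the reference in the camera frame:
print(arm_correction(25))  # -0.04 -> shift the paint nozzle 4 cm left
```

Run at the camera's frame rate, this proportional correction rejects slow drift (such as GPS waypoint error), while the mechanical stabilization handles high-frequency terrain vibration.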
569.
Handling Invalid Pixels in Convolutional Neural Networks. Messou, Ehounoud Joseph Christopher. 29 May 2020.
Most neural networks use a normal convolutional layer that assumes all input pixels are valid. However, pixels added to the input through padding introduce extra information that was not initially present, and this extra information can be considered invalid. Invalid pixels can also occur inside the image, where they are referred to as holes in completion tasks like image inpainting. In this work, we look for a method that can handle both types of invalid pixels. We compare on the same test bench two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs best in image classification, while Gated convolution has the advantage in semantic segmentation. As for hotel recognition with masked regions, none of the methods seems appropriate for generating embeddings that leverage the masked regions. / Master of Science / A module at the heart of deep neural networks built for Artificial Intelligence is the convolutional layer. When multiple convolutional layers are used together with other modules, a Convolutional Neural Network (CNN) is obtained. These CNNs can be used for tasks such as image classification, where they tell whether the object in an image is a chair or a car, for example. Most CNNs use a normal convolutional layer that assumes all parts of the image fed to the network are valid. However, most models zero-pad the image at the beginning to maintain a certain output shape. Zero padding is equivalent to adding a black frame around the image, and these added pixels introduce information that was not initially present, so this extra information can be considered invalid. Invalid pixels can also occur inside the image, where they are referred to as holes in completion tasks like image inpainting, in which the network is asked to fill these holes and produce a realistic image. In this work, we look for a method that can handle both types of invalid pixels. We compare on the same test bench two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs best in image classification, while Gated convolution has the advantage in semantic segmentation. As for hotel recognition with masked regions, none of the methods seems appropriate for generating embeddings that leverage the masked regions.
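Of the three layers compared, Partial convolution has the most compact formulation: convolve only the valid pixels, renormalize each window by its fraction of valid inputs, and propagate an updated mask. A minimal single-layer PyTorch sketch following the published formulation; the tensor shapes and the hole location are illustrative.

```python
import torch
import torch.nn.functional as F

def partial_conv(x, mask, weight, bias):
    """Partial convolution: x is (N, C, H, W), mask is (N, 1, H, W) with
    1 = valid and 0 = invalid. Output windows with no valid input are zeroed,
    and the returned mask marks windows that saw at least one valid pixel."""
    kh, kw = weight.shape[2:]
    pad = (kh // 2, kw // 2)
    out = F.conv2d(x * mask, weight, bias=None, padding=pad)
    ones = torch.ones(1, 1, kh, kw, dtype=x.dtype)
    valid = F.conv2d(mask, ones, padding=pad)      # valid-pixel count per window
    ratio = (kh * kw) / valid.clamp(min=1e-8)      # renormalize by valid fraction
    out = (out * ratio + bias.view(1, -1, 1, 1)) * (valid > 0)
    return out, (valid > 0).to(x.dtype)

x = torch.randn(1, 3, 8, 8)
mask = torch.ones(1, 1, 8, 8)
mask[:, :, 2:5, 2:5] = 0                           # a 3x3 hole inside the image
weight, bias = torch.randn(16, 3, 3, 3), torch.zeros(16)
y, new_mask = partial_conv(x, mask, weight, bias)
print(y.shape, int(new_mask.sum()))                # torch.Size([1, 16, 8, 8]) 63
```

Note that zero padding enters the same machinery automatically: padded border pixels count as invalid in the mask convolution, which is exactly how Partial convolution addresses the padding problem described above.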
570.
Development of a Peripheral-Central Vision System to Detect and Characterize Airborne Threats. Kang, Chang Koo. 29 October 2020.
With the rapid proliferation of small unmanned aircraft systems (UAS), the risk of mid-air collisions is growing, as is the risk associated with the malicious use of these systems. The airborne detect-and-avoid (ABDAA) problem and the counter-UAS problem have similar sensing requirements for detecting and tracking airborne threats. In this dissertation, two image-based sensing methods are merged to mimic human vision in support of counter-UAS applications. In the proposed sensing system architecture, a "peripheral vision" camera (with a fisheye lens) provides a large field-of-view while a "central vision" camera (with a perspective lens) provides high-resolution imagery of a specific object. This pair forms a heterogeneous stereo vision system that can support range resolution. A novel peripheral-central vision (PCV) system to detect, localize, and classify an airborne threat is first introduced. To improve the developed PCV system's capability, three novel algorithms are devised: a model-based path prediction algorithm for fixed-wing unmanned aircraft, a multiple-threat scheduling algorithm considering not only the risk of threats but also the time required for observation, and the heterogeneous stereo-vision optimal placement (HSOP) algorithm, which provides optimal locations for multiple PCV systems to minimize the localization error for threat aircraft. The performance of the algorithms is assessed using an experimental data set and simulations. / Doctor of Philosophy / With the rapid proliferation of small unmanned aircraft systems (UAS), the risk of mid-air collisions is growing, as is the risk associated with the malicious use of these systems. Sensing technologies for detecting and tracking airborne threats have been developed to solve these UAS-related problems. In this dissertation, two image-based sensing methods are merged to mimic human vision in support of counter-UAS applications. In the proposed sensing system architecture, a "peripheral vision" camera (with a fisheye lens) provides a large field-of-view while a "central vision" camera (with a perspective lens) provides high-resolution imagery of a specific object. This pair enables estimation of an object's location using the different viewpoints of the different cameras (denoted "heterogeneous stereo vision"). A novel peripheral-central vision (PCV) system to detect an airborne threat, estimate its location, and determine its class (e.g., aircraft, bird) is first introduced. To improve the developed PCV system's capability, three novel algorithms are devised: an algorithm to predict the future path of a fixed-wing unmanned aircraft, an algorithm to decide an efficient observation schedule for multiple threats, and an algorithm that provides optimal locations for multiple PCV systems to better estimate threat positions. The performance of the algorithms is assessed using an experimental data set and simulations.
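The range resolution offered by the heterogeneous stereo pair comes down to triangulating two bearing rays from cameras with known poses. A minimal numpy sketch of the geometric core (the midpoint method); the camera positions and target are illustrative, and lens models and calibration are omitted.

```python
import numpy as np

def triangulate(p1, d1, p2, d2):
    """Each camera i observes the target along a ray p_i + t_i * d_i.
    Return the midpoint of the shortest segment connecting the two rays,
    i.e. the least-squares estimate of the target position."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    # The connecting segment is perpendicular to both rays; solve for t1, t2.
    a = np.array([[d1 @ d1, -d1 @ d2],
                  [d1 @ d2, -d2 @ d2]])
    b = np.array([(p2 - p1) @ d1, (p2 - p1) @ d2])
    t1, t2 = np.linalg.solve(a, b)
    return 0.5 * ((p1 + t1 * d1) + (p2 + t2 * d2))

# Two camera units 10 m apart, both sighting a target near (5, 20, 30):
p1, p2 = np.array([0.0, 0.0, 0.0]), np.array([10.0, 0.0, 0.0])
target = np.array([5.0, 20.0, 30.0])
print(triangulate(p1, target - p1, p2, target - p2))  # ~[ 5. 20. 30.]
```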