181 |
Artificial-Intelligence-Enabled Robotic Navigation Using Crop Row Detection Based Multi-Sensory Plant Monitoring System Deployment / Alshanbari, Reem 07 1900
The ability to detect crop rows and release sensors over large areas with homogeneous coverage is crucial for monitoring crop rows and increasing their yield. Aerial robotics in agriculture helps to reduce soil compaction. We report a release mechanics system based on image processing for crop row detection, which is essential for machine-vision-based field navigation since most plants grow in rows. The release mechanics system is fully automated using embedded hardware and is operated from a UAV. Once a crop row is detected, the release mechanics system releases lightweight, flexible multi-sensory devices on top of each plant to monitor the humidity and temperature conditions. The capability to monitor the local environmental conditions of plants can have a high impact on enhancing plant health and increasing agricultural output. The proposed algorithm has three steps: image acquisition, image processing, and line detection. First, we select the Region of Interest (ROI) from the frame, transform it to grayscale, remove noise, and then skeletonize and remove the background. Next, we apply a Hough transform to detect crop rows and filter the lines. Finally, we use a Kalman filter to predict the crop row line in the next frame to improve performance. This work’s main contribution is the release mechanism integrated with embedded hardware and a high-performance crop row detection algorithm for field navigation. The experimental results show that the algorithm achieved an accuracy of 90% on images with a resolution of 900x470, at a speed of 2 frames per second (FPS).
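The line detection and prediction steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function `hough_strongest_line`, the class `ScalarKalman`, the random-walk motion model, and all parameter values are assumptions chosen for illustration. The sketch votes in (rho, theta) space for a set of skeleton pixels and then smooths the detected row offset with a one-dimensional Kalman filter.

```python
import math
from collections import Counter


def hough_strongest_line(points, n_theta=180, rho_step=1.0):
    """Vote in (rho, theta) space and return the strongest line.

    Each point (x, y) votes for every line
    rho = x*cos(theta) + y*sin(theta) that passes through it.
    Returns (rho, theta_in_degrees, votes).
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += 1
    (rho_bin, t), count = votes.most_common(1)[0]
    return rho_bin * rho_step, 180.0 * t / n_theta, count


class ScalarKalman:
    """1-D Kalman filter with a random-walk model (an assumed model)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=1.0):
        self.x = x0  # state estimate, e.g. the row offset rho
        self.p = p0  # variance of the estimate
        self.q = q   # process noise variance
        self.r = r   # measurement noise variance

    def predict(self):
        # The state is assumed unchanged between frames; only the
        # uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Blend the prediction with the measured value z.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x
```

In a per-frame loop, one would call `predict()` to obtain the expected row position, run the Hough step on the new frame, and pass the detected `rho` to `update()`. Tracking only `rho` is a simplification; a fuller tracker would likely also track the angle.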
|
182 |
Camera-independent learning and image quality assessment for super-resolution / Bégin, Isabelle. January 2007
No description available.
|
183 |
Zhiwen_Dissertation.pdf / Zhiwen Cao (15347242) 29 April 2023
In this work, we presented a novel approach to the mathematical representation of facial pose, followed by the design of a neural network (NN) capable of leveraging these representations to solve the task of facial pose estimation. Our core contribution lay in the development of advanced mathematical representations for face orientation, which included: 1) a three-column-vector-based representation, 2) an Anisotropic Spherical Gaussian (ASG)-based Label Distribution Learning (LDL) representation, and 3) an SO(3) Hopf coordinate-based LDL representation. These representations provided continuous and unique descriptions of facial orientation and avoided the gimbal lock issue of Euler angles and the antipodal issue of quaternions. Building upon these mathematical representations, we designed neural network architectures specifically to exploit them. Key components of our NN design included: 1) an orthogonal loss function for the column-vector-based representation, which encouraged the orthogonality of the predicted vectors, and 2) dynamic distribution parameter learning for the ASG- and SO(3)-based LDL representations, which allowed the NN to adjust the contributions of adjacent labels adaptively. Our proposed mathematical representations of rotations, combined with our NN architectures, provided a powerful framework for robust and accurate facial pose estimation.
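To make the idea of an orthogonal loss concrete, the following is one plausible form, offered as a sketch only. The abstract does not give the loss formula, so the name `orthogonality_loss` and the choice of summing squared pairwise dot products are assumptions. The penalty is zero exactly when the three predicted vectors are mutually orthogonal.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthogonality_loss(v1, v2, v3):
    """Sum of squared pairwise dot products (an assumed loss form).

    Zero when the three vectors are mutually orthogonal, and larger
    the more any pair of vectors overlaps.
    """
    return dot(v1, v2) ** 2 + dot(v1, v3) ** 2 + dot(v2, v3) ** 2
```

A term like this would typically be added to the main regression loss with a weighting factor, so that the network is nudged toward outputs that form a valid rotation matrix.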
|
184 |
SwinFSR: Stereo Image Super-Resolution using SwinIR and Frequency Domain Knowledge / CHEN, KE January 2023
Stereo Image Super-Resolution (StereoSR) has attracted significant attention in recent
years due to the extensive deployment of dual cameras in mobile phones, autonomous
vehicles and robots. In this work, we propose a new StereoSR method, named SwinFSR,
based on an extension of SwinIR, originally designed for single image restoration, and the
frequency domain knowledge obtained by the Fast Fourier Convolution (FFC). Specifically, to effectively gather global information, we modify the Residual Swin Transformer
blocks (RSTBs) in SwinIR by explicitly incorporating the frequency domain knowledge
using the FFC and employing the resulting residual Swin Fourier Transformer blocks
(RSFTBlocks) for feature extraction. In addition, for the efficient and accurate fusion of
stereo views, we propose a new cross-attention module referred to as RCAM, which
achieves highly competitive performance while requiring less computational cost than
the state-of-the-art cross-attention modules. Extensive experimental results and ablation studies demonstrate the effectiveness and efficiency of our proposed SwinFSR.
/ Thesis / Master of Applied Science (MASc)
|
185 |
A stereo vision approach to automatic stereo matching in photogrammetry / Greenfeld, Joshua S. January 1987
No description available.
|
186 |
Fast Screening Algorithm for Template Matching / Liu, Bolin January 2017
This paper presents a generic pre-processor for expediting
conventional template matching techniques. Instead of locating the
best matched patch in the reference image to a query template via
exhaustive search, the proposed algorithm rules out regions with no
possible matches at minimal computational effort. While working
on simple patch features, such as mean, variance and gradient, the
fast pre-screening is highly discriminative. Its computational
efficiency is gained by using a novel octagonal-star-shaped template
and the inclusion-exclusion principle to extract and compare patch
features. Moreover, it can handle arbitrary rotation and scaling of
reference images effectively, and is also robust to uniform
illumination changes. A GPU-aided implementation shows that the
algorithm design is well suited to parallel computing, and extensive
experiments demonstrate that the proposed algorithm greatly reduces
the search space while never missing the best match. / Thesis / Master of Applied Science (MASc)
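The use of the inclusion-exclusion principle to extract patch features can be illustrated with a summed-area table. This sketch is a simplification under stated assumptions: it uses rectangular patches rather than the octagonal-star-shaped template of the thesis, and the function names and the tolerance-based `screen` rule are hypothetical. Unlike the thesis's screening, this simple rule carries no guarantee that the best match survives.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def patch_sum(sat, top, left, height, width):
    """Sum over a rectangle from four table lookups (inclusion-exclusion)."""
    bottom, right = top + height, left + width
    return (sat[bottom][right] - sat[top][right]
            - sat[bottom][left] + sat[top][left])


def screen(img, template_mean, height, width, tol):
    """Keep only positions whose patch mean is within tol of the template's.

    Hypothetical screening rule for illustration only.
    """
    sat = integral_image(img)
    area = height * width
    keep = []
    for top in range(len(img) - height + 1):
        for left in range(len(img[0]) - width + 1):
            mean = patch_sum(sat, top, left, height, width) / area
            if abs(mean - template_mean) <= tol:
                keep.append((top, left))
    return keep
```

After the table is built once, each patch mean costs four lookups regardless of patch size, which is what makes a pre-screening pass cheap compared with a full template comparison at every position.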
|
187 |
Adaptive Lighting for Computer Vision / Cabrera, Mario 01 1900
A system capable of adjusting a computer vision system to unpredictable ambient lighting has been designed and attached to a silhouette robot vision system. Its principle of operation is based on generating and analysing the distribution of light in one T.V. frame. Because the system is intended for robot vision applications, data are processed at high speed so that a histogram of grey levels is generated in one frame time. An addressable RAM technique for this purpose is explained. The system obtains two threshold values from the histogram of grey levels and places them into a threshold logic unit. A silhouette is obtained from the grey level picture as the result of this process. The system adapts by using different integration times in the readout of the visual transducer. The implementation is based on a video rate histogram generator, a sensitivity control unit, a DMA circuit, an 86/12A microcomputer and a solid state T.V. camera. A graphics printer is used to print out results and a CRT terminal to communicate with the microcomputer. The custom hardware and software implementations for the system are described in detail. / Thesis / Master of Engineering (ME)
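A software analogue of the histogram and threshold stages might look like the sketch below. The thesis describes custom hardware, so this is only an illustration: treating each grey level as a memory address is an assumed reading of the addressable RAM technique, the function names are hypothetical, and the abstract states neither how the two thresholds are derived nor whether the silhouette lies inside or outside the threshold band.

```python
def grey_histogram(pixels, levels=256):
    """Count grey levels; each pixel value acts as the address to increment."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


def silhouette(pixels, low, high):
    """Binary image: 1 where low <= pixel <= high, else 0.

    Assumes the object lies inside the band; the thesis may use the
    opposite convention.
    """
    return [1 if low <= p <= high else 0 for p in pixels]
```

Here `low` and `high` are supplied by the caller. In the system described, they would come from analysing the histogram of the current frame.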
|
188 |
Representing junctions through asymmetric tensor diffusion / Arseneau, Shawn January 2006
No description available.
|
189 |
The automated synchronisation of independently moving cameras. / Pooley, Daniel William January 2008
Computer vision is concerned with the recovery of useful scene or camera information from a set of images. One classical problem is the estimation of the 3D scene structure depicted in multiple photographs. Such estimation fundamentally requires determining how the cameras are related in space. For a dynamic event recorded by multiple video cameras, finding the temporal relationship between cameras has a similar importance. Estimating such synchrony is key to a further analysis of the dynamic scene components. Existing approaches to synchronisation involve using visual cues common to both videos, and consider a discrete uniform range of synchronisation hypotheses. These prior methods exploit known constraints which hold in the presence of synchrony, from which both a temporal relationship, and an unchanging spatial relationship between the cameras can be recovered. This thesis presents methods that synchronise a pair of independently moving cameras. The spatial configuration of cameras is assumed to be known, and a cost function is developed to measure the quality of synchrony even for accuracies within a fraction of a frame. A Histogram method is developed which changes the approach from a consideration of multiple synchronisation hypotheses, to searching for seemingly synchronous frame pairs independently. Such a strategy has increased efficiency in the case of unknown frame rates. Further savings can be achieved by reducing the sampling rate of the search, by only testing for synchrony across a small subset of frames. Two robust algorithms are devised, using Bayesian inference to adaptively seek the sampling rate that minimises total execution time. These algorithms have a general underlying premise, and should be applicable to a wider class of robust estimation problems. A method is also devised to robustly synchronise two moving cameras when their spatial relationship is unknown. 
It is assumed that the motion of each camera has been estimated independently, so that these motion estimates are unregistered. The algorithm recovers both a synchronisation estimate, and a 3D transformation that spatially registers the two cameras. / Thesis (Ph.D.) - University of Adelaide, School of Computer Science, 2008
|
190 |
Toward computer vision for understanding American football in video / Hess, Robin W. 14 June 2012
In this work, I examine the problem of understanding American football in video. In particular,
I present several mid-level computer vision algorithms that each accomplish a different sub-task
within a larger system for annotating, interpreting, and analyzing collections of American football
video. The analysis of football video is useful in its own right, as teams at all levels from
high school to professional football currently spend thousands of dollars and countless human
work hours processing video of their own play and the play of their opponents with the aim
of developing strategy and improving performance. However, because football is an extremely
challenging visual domain, with difficulties ranging from the chaotic motion and identical appearance
of the players to the visual clutter on the field in the form of logos and other markings,
computer vision algorithms developed towards the end goal of understanding American football
are broadly applicable across a variety of visual problems.
I address four specific football-related problems in this thesis. First, I describe an approach
for registering video with a static model (i.e. the football field in the American football domain)
using a novel concept of locally distinctive invariant image feature matches. I also introduce a
novel empirical registration transform stability test, which we use to initialize our registration
procedure.
Second, I outline a novel method for constructing mosaics from collections of video. This
method takes a greedy utility maximization approach to build mosaics that achieve user-definable
mosaic quality objectives. While broadly applicable, our mosaicing approach accomplishes several
tasks specifically relevant to the analysis of football video, including automatically constructing
reference image sets for our video registration procedure and for computing background
models for initial formation recognition and player tracking algorithms.
Third, I present an approach for recognizing initial player formations. This approach, called
the Mixture-of-Parts Pictorial Structure (MoPPS) model, extends classical pictorial structures
to recognize multi-part objects whose parts can vary in both type and location and for which
an object part's location can depend on its type. While this model is effective in the American
football domain, it is also broadly applicable.
Finally, I address the problem of tracking football players through video using a novel particle
filtering formulation and an associated discriminative training procedure that directly maximizes
filter performance based on observed errors during tracking. This particle filtering framework
and training procedure are also broadly applicable.
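For readers unfamiliar with particle filtering, a generic bootstrap filter step is sketched below. It is not the formulation or the discriminative training procedure of this thesis: the one-dimensional state, the Gaussian motion and measurement models, and the name `particle_filter_step` are all assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, measurement, rng,
                         motion_std=1.0, meas_std=2.0):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    particles: list of 1-D position hypotheses.
    rng: a random.Random instance, passed in for reproducibility.
    Returns (new_particles, position_estimate).
    """
    # Predict: diffuse each hypothesis with Gaussian motion noise.
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement for each hypothesis.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
               for p in moved]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(moved)
    # Resample: draw hypotheses in proportion to their weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    estimate = sum(resampled) / len(resampled)
    return resampled, estimate
```

A tracker would call this once per frame with the observed player position. In a real football tracker the state would be at least two-dimensional, and the likelihood would come from image evidence rather than a direct position measurement.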
For each of these algorithms, I also present a series of detailed experiments demonstrating
the method's effectiveness in the American football domain. As a further contribution, I have
made the data sets from most of these experiments publicly available. / Graduation date: 2013
|