1 |
Foreground detection in specific outdoor scenes : A review of recognized techniques and proposed improvements for a real-time GPU-based implementation in C++
Sandström, Gustav, January 2016
Correct insertion of computer graphics into live-action broadcasts of outdoor sports requires precise knowledge of the foreground, i.e. the players present in the scene. This thesis proposes a foreground detection and segmentation framework with a focus on real-time performance at 1080p resolution. A dataset of four scenes (single-segment, multi-segment and transcending foreground, plus a light-switch scene), all with dynamic backgrounds, was constructed together with 26 ground truths. Results show that the framework should run internally at 288p using GPU acceleration with geometrical nearest-neighbour interpolation to attain real-time capability. To maximize accuracy, the framework runs two instances of OpenCV MOG2 in parallel on differently downsampled frames, and their outputs are bitwise-joined to increase robustness. A set of morphological operations provides post-processing for spatial coherence, and a specific turf consideration gives accurate contours. Thanks to additional camera-operator input, a crude distance estimate lets foreground segments fade into the background at a predetermined depth. The framework suffers from inaccurate segmentation during rapid light switches, but recovers in a matter of seconds, like the 'vanilla' MOG algorithm. For the specific scenes the framework provides excellent performance, especially in the light-switch scene when compared with the MOG algorithm. For the non-specific scenes of 'BMC 2012', performance does not exceed the current state of the art.
/ Correct placement of computer graphics in video for TV production requires good knowledge of the current foreground. This thesis proposes a foreground detection and segmentation framework with a focus on real-time processing of full-HD outdoor sports footage. For evaluation, a dataset was created consisting of four scenes (single-segment, multi-segment and receding foreground, plus a light-change scene) together with 26 reference foregrounds. To achieve real-time processing, the framework should internally use 288p resolution with GPU acceleration and geometric nearest-neighbour interpolation. The results showed that maximum accuracy and increased robustness were obtained with two instances of OpenCV MOG2 working in parallel on differently downscaled images, which are then joined pixel by pixel. To obtain coherent foreground segments, morphological operations were applied to the combined binary foreground, which together with a specific turf-edge correction gives precise contours. Thanks to given camera parameters, the depth of the foreground elements can be estimated, so that they pass into the background at a certain depth. The framework suffers from imprecise segmentation during rapid light changes, but recovers once the background model has been updated to the new lighting conditions. For the specific scenes mentioned above the framework performs excellently, especially with respect to the light change, where performance is several times better than that of the 'MOG' method alone. For general scenes from the 'BMC 2012' dataset, however, our method does not perform better than the state of the art.
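The mask-joining step described above can be illustrated with a minimal pure-Python sketch. It assumes two binary foreground masks are already available at different resolutions (in the thesis they come from two OpenCV MOG2 instances, which are not reproduced here). The function names `resize_nearest` and `join_masks` are hypothetical, and since the abstract does not say which bitwise operator is used, the AND default is an assumption and OR can be passed instead.

```python
from operator import and_, or_


def resize_nearest(mask, out_h, out_w):
    """Resize a 2-D list-of-lists mask with nearest-neighbour interpolation."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def join_masks(mask_a, mask_b, out_h, out_w, op=and_):
    """Bring both masks to a common size and combine them pixel by pixel.

    op=and_ keeps a pixel only if both detectors agree (assumed default);
    op=or_ keeps a pixel if either detector flags it.
    """
    a = resize_nearest(mask_a, out_h, out_w)
    b = resize_nearest(mask_b, out_h, out_w)
    return [
        [op(pa, pb) for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Toy masks standing in for the outputs of the two background subtractors.
coarse = [[1, 0],
          [1, 1]]
fine = [[1, 1, 1, 0],
        [0, 1, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 1, 1]]

joined = join_masks(coarse, fine, 4, 4)
```

With AND, isolated detections that appear in only one mask are suppressed; with OR, gaps in one mask are filled by the other. Which trade-off the thesis chose is not stated in the abstract.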
|
2 |
The research of background removal applied to fashion data : The necessity analysis of background removal for fashion data / Forskningen av bakgrundsborttagning tillämpas på modedata : Nödvändighetsanalysen av bakgrundsborttagning för modedata
Liang, Junhui, January 2022
Fashion understanding is a hot topic in computer vision, with many applications of great business value. It remains a difficult challenge because of the immense diversity of garments and the wide range of scenes and backgrounds. In this work, we try to remove the background of fashion images to boost data quality and ultimately increase model performance. Because fashion images typically show clearly visible people wearing fully visible garments, we can use Salient Object Detection (SOD) to remove the background of fashion data as expected. We refer to a fashion image with its background removed as a “rembg” image, in contrast to the original image in the fashion dataset. We conduct comparative experiments between these two types of images on multiple aspects of model training, including model architectures, model initialization, compatibility with other training tricks and data augmentations, and target task types. Our experiments suggest that background removal works well for fashion data in simple, shallow networks that are not susceptible to overfitting. It can improve model accuracy by up to 5% in the classification of FashionStyle14 when training models from scratch. However, background removal does not perform well in deep networks because of its incompatibility with other regularization techniques such as batch normalization, pre-trained initialization, and data augmentations that introduce randomness. The loss of background pixels invalidates many existing training tricks, adding to the risk of overfitting for deep models.
/ Fashion understanding is a hot topic in computer vision, with many applications of great business value on the market. It remains a difficult challenge for computer vision because of the enormous diversity of garments and the wide range of scenes and backgrounds. In this work we try to remove the background from fashion images to raise data quality and ultimately model performance. Because fashion images show clearly visible people in fully visible garments, we can use salient object detection to remove the background of fashion data as expected. A fashion image with its background removed is called the “rembg” image, in contrast to the original in the fashion dataset. We carry out comparative experiments between these two types of images across several aspects of model training, including model architectures, model initialization, compatibility with other training tricks and data augmentations, and target task types. Our experiments suggested that background removal can work well for fashion data in simple, shallow networks that are not susceptible to overfitting. It can improve model accuracy by up to 5% in the classification of FashionStyle14 when models are trained from scratch. However, background removal does not work well in deep networks because of its incompatibility with other regularization techniques such as batch normalization, pre-trained initialization, and data augmentations that introduce randomness. The loss of background pixels invalidates many existing training tricks in model training, adding to the risk of overfitting for deep models.
|
3 |
Fast Registration of Tabular Document Images Using the Fourier-Mellin Transform
Hutchison, Luke Alexander Daysh, 24 March 2004
Image registration, the process of finding the transformation that best maps one image to another, is an important tool in document image processing. Having properly-aligned microfilm images can help in manual and automated content extraction, zoning, and batch compression of images. An image registration algorithm is presented that quickly identifies the global affine transformation (rotation, scale, translation and/or shear) that maps one tabular document image to another, using the Fourier-Mellin Transform. Each component of the affine transform is recovered independently from the others, dramatically reducing the parameter space of the problem, and improving upon standard Fourier-Mellin Image Registration (FMIR), which only directly separates translation from the other components. FMIR is also extended to handle shear, as well as different scale factors for each document axis. This registration method deals with all transform components in a uniform way, by working in the frequency domain. Registration is limited to foreground pixels (the document form and printed text) through the introduction of a novel, locally adaptive foreground-background segmentation algorithm, based on the median filter. The background removal algorithm is also demonstrated as a useful tool to remove ambient signal noise during correlation. Common problems with FMIR are eliminated by background removal, meaning that apodization (tapering down to zero at the edge of the image) is not needed for accurate recovery of the rotation parameter, allowing the entire image to be used for registration. An effective new optimization to the median filter is presented. Rotation and scale parameter detection is less susceptible to problems arising from the non-commutativity of rotation and "tiling" (periodicity) than for standard FMIR, because only the regions of the frequency domain directly corresponding to tabular features are used in registration.
An original method is also presented for automatically obtaining blank document templates from a set of registered document images, by computing the "pointwise median" of a set of registered documents. Finally, registration is demonstrated as an effective tool for predictive image compression. The presented registration algorithm is reliable and robust, and handles a wider range of transformation types than most document image registration systems (which typically only perform deskewing).
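The "pointwise median" idea mentioned above can be sketched in a few lines of Python. This is an illustration only, assuming the images are already registered, have identical dimensions, and are stored as nested lists of grey levels. The function name `pointwise_median` and the toy pixel values are hypothetical and not taken from the thesis.

```python
from statistics import median


def pointwise_median(images):
    """Return the per-pixel median of equally sized, already registered images."""
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must share the same dimensions")
    return [
        [median(img[y][x] for img in images) for x in range(width)]
        for y in range(height)
    ]


# Three registered scans of the same form (0 = ink, 255 = paper).
# Column 0 is a printed form line present in every scan; the other
# columns carry handwriting that differs from scan to scan.
scans = [
    [[0, 255, 255]],
    [[0, 0, 255]],
    [[0, 255, 40]],
]

template = pointwise_median(scans)
```

Because handwritten entries rarely land on the same pixel in most scans, the median tends to keep the printed form and discard the filled-in content, which is the intuition behind using it to recover a blank template.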
|
4 |
Non-Destructive Evaluation of the Condition of Subsurface Drainage in Pavement Using Ground Penetrating RADAR (GPR)
Hao Bai (5929478), 14 December 2020
Pavement drainage systems are one of the key drivers of pavement function and longevity, and effective drain maintenance can significantly extend a pavement's service life. Maintenance of these drains, however, is often hampered by the challenge of locating the drains. Ground Penetrating Radar (GPR) typically offers a rapid and effective method to detect these underground targets. However, typical detection schemes that rely upon the observation of the hyperbolic return from a GPR scan of a buried conduit still tend to miss many of the older drains beneath pavements, as they may be partially or fully filled with sediment and/or may be fabricated from clay or other earthen materials, yielding a return signal that is convolved with significant background noise.

To manage this challenge, this work puts forward an improved background noise and clutter reduction method to enhance the target signals in what amounts to a constructed environment that tends to have more consistent subsurface properties than one might encounter in a general setting. Within this technique, two major algorithms are employed. Algorithm 1 is the core of this method, and plays the role of reducing background noise and clutter. Algorithm 2 is supplementary, and helps eliminate anomalous discontinuous returns generated by the equipment itself, which could otherwise lead to false detection indications in the output of Algorithm 1. Instead of traditional 2-D GPR images, the result of the proposed algorithms is a 1-D plot along the survey line, highlighting a set of “points of interest” that could indicate buried drain locations identified at any given GPR operating frequency. Subsurface exploration using two different operating frequencies, 900 MHz and 400 MHz herein, is then employed to further enhance detection confidence. Points of interest are ultimately coded to define the confidence of the detection. Compared with the original GPR images, the final result of the proposed algorithms shows significantly improved detection, and the approach could potentially be applied to similar problems in other contexts.

Besides the background reduction methods, a group of simulations performed using GPRMAX2D software are examined to explore the influence of road cross-section designs on sub-pavement drainage conduit GPR signatures, and to evaluate the effectiveness of alternate GPR antennae configurations in locating these buried conduits in different ground conditions. Two different models were explored to simulate conduit detection. In addition, different pipe and soil conditions were modeled, such as pipe size, pipe material, soil moisture level, and soil type. Four different quantitative measurements are used to analyze GPR performance based on different key factors. The four measurements are 1) signal to background ratio (SBR) in dB; 2) signal to receiver noise ratio (SNR) in dB; 3) signal energy in Volts; and 4) average signal band power in Watts.

The water and clay content of subsurface soil can significantly influence the detection results obtained from ground penetrating radar (GPR). Due to the variation of the material properties underground, the center frequency of transmitted GPR signals shifts to a lower range as wave attenuation increases. Examination of wave propagation in the subsurface employing an attenuation filter based on a linear system model shows that received GPR signals will be shifted to lower frequencies than those originally transmitted. The amount of the shift is controlled by a wave attenuation factor, which is determined by the dielectric constant, electric conductivity, and magnetic susceptibility of the medium through which the signal travels. This work introduces a receiver-transmitter-receiver dual-frequency configuration for GPR that employs two operational frequencies for a given test, one higher and one slightly lower, to take advantage of this phenomenon to improve subpavement drain detection results. In this configuration, the original signal is transmitted from the higher frequency transmitter. After traveling through underground materials, the signal is received by two receivers with different frequencies. One of the receivers has the same higher center frequency as the transmitter, and the other receiver has a lower center frequency. This configuration can be expressed as Rx(low-frequency)-Tx(high-frequency)-Rx(high-frequency) and was applied in both laboratory experiments and field tests. Results are analyzed in the frequency domain to evaluate and compare the properties of the signal obtained by both receivers. The laboratory experiment used the configuration of Rx(400MHz)-Tx(900MHz)-Rx(900MHz). The field tests, in addition to the configuration used in the lab tests, employed another configuration of Rx(270MHz)-Tx(400MHz)-Rx(400MHz) to obtain more information about this phenomenon. Both lab and field test results illustrate the frequency-shift phenomenon described by theoretical calculations. Based on the power spectrum for each signal, the lower frequency antenna typically received more energy (higher density values) at its peak frequency than the higher frequency antenna.
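The dependence of attenuation on frequency and material properties described above can be made concrete with a short Python sketch. It uses the standard textbook expression for the attenuation constant of a plane wave in a homogeneous lossy medium, which is an assumption here: the abstract does not give the thesis's own attenuation filter, and it may differ. The function name `attenuation_constant` and the example soil values are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m
MU0 = 4e-7 * math.pi      # vacuum permeability in H/m (classical value)


def attenuation_constant(freq_hz, eps_r, sigma, chi_m=0.0):
    """Attenuation constant in Np/m of a plane wave in a lossy medium.

    freq_hz: frequency in Hz
    eps_r:   relative permittivity (dielectric constant)
    sigma:   electric conductivity in S/m
    chi_m:   magnetic susceptibility (relative permeability = 1 + chi_m)
    """
    omega = 2.0 * math.pi * freq_hz
    eps = eps_r * EPS0
    mu = (1.0 + chi_m) * MU0
    loss_tangent = sigma / (omega * eps)
    return omega * math.sqrt(
        mu * eps / 2.0 * (math.sqrt(1.0 + loss_tangent ** 2) - 1.0)
    )


# Illustrative (not measured) soil: eps_r = 9, sigma = 0.05 S/m.
alpha_400 = attenuation_constant(400e6, 9.0, 0.05)
alpha_900 = attenuation_constant(900e6, 9.0, 0.05)
```

Under this model the 900 MHz component is attenuated more strongly than the 400 MHz component, so the high-frequency part of a broadband pulse decays faster and the received spectrum is weighted toward lower frequencies, which is consistent with the downward shift the abstract reports.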
|