• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • Tagged with
  • 3
  • 3
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

AI-Driven Image Manipulation : Image Outpainting Applied on Fashion Images

Mennborg, Alexander January 2021 (has links)
The e-commerce industry frequently has to deal with displaying product images in a website where the images are provided by the selling partners. The images in question can have drastically different aspect ratios and resolutions which makes it harder to present them while maintaining a coherent user experience. Manipulating images by cropping can sometimes result in parts of the foreground (i.e. product or person within the image) to be cut off. Image outpainting is a technique that allows images to be extended past its boundaries and can be used to alter the aspect ratio of images. Together with object detection for locating the foreground makes it possible to manipulate images without sacrificing parts of the foreground. For image outpainting a deep learning model was trained on product images that can extend images by at least 25%. The model achieves 8.29 FID score, 44.29 PSNR score and 39.95 BRISQUE score. For testing this solution in practice a simple image manipulation pipeline was created which uses image outpainting when needed and it shows promising results. Images can be manipulated in under a second running on ZOTAC GeForce RTX 3060 (12GB) GPU and a few seconds running on a Intel Core i7-8700K (16GB) CPU. There is also a special case of images where the background has been digitally replaced with a solid color and they can be outpainted even faster without deep learning.
2

CONTENT UNDERSTANDING FOR IMAGING SYSTEMS: PAGE CLASSIFICATION, FADING DETECTION, EMOTION RECOGNITION, AND SALIENCY BASED IMAGE QUALITY ASSESSMENT AND CROPPING

Shaoyuan Xu (9116033) 12 October 2021 (has links)
<div>This thesis consists of four sections which are related with four research projects.</div><div><br></div><div>The first section is about Page Classification. In this section, we extend our previous approach which could classify 3 classes of pages: Text, Picture and Mixed, to 5 classes which are: Text, Picture, Mixed, Receipt and Highlight. We first design new features to define those two new classes and then use DAG-SVM to classify those 5 classes of images. Based on the results, our algorithm performs well and is able to classify 5 types of pages.</div><div><br></div><div>The second section is about Fading Detection. In this section, we develop an algorithm that can automatically detect fading for both text and non-text region. For text region, we first do global alignment and then perform local alignment. After that, we create a 3D color node system, assign each connected component to a color node and get the color difference between raster page connected component and scanned page connected. For non-text region, after global alignment, we divide the page into "super pixels" and get the color difference between raster super pixels and testing super pixels. Compared with the traditional method that uses a diagnostic page, our method is more efficient and effective.</div><div><br></div><div>The third section is about CNN Based Emotion Recognition. In this section, we build our own emotion recognition classification and regression system from scratch. It includes data set collection, data preprocessing, model training and testing. We extend the model to real-time video application and it performs accurately and smoothly. We also try another approach of solving the emotion recognition problem using Facial Action Unit detection. By extracting Facial Land Mark features and adopting SVM training framework, the Facial Action Unit approach achieves comparable accuracy to the CNN based approach.</div><div><br></div><div>The forth section is about Saliency Based Image Quality Assessment and Cropping. In this section, we propose a method of doing image quality assessment and recomposition with the help of image saliency information. Saliency is the remarkable region of an image that attracts people's attention easily and naturally. By showing everyday examples as well as our experimental results, we demonstrate the fact that, utilizing the saliency information will be beneficial for both tasks.</div>
3

Algoritmy pro automatický ořez sférické fotografie a videa / Algorithms for Automatic Spherical Image and Video Cropping

Ivančo, Martin January 2020 (has links)
Cieľom tejto práce je priniesť detailný pohľad na doterajší prieskum v oblasti sférických videí. Konkrétne sa táto práca zameriava na problém tvorby videa s normálnym zorným poľom zo sférického videa pre potreby zobrazovania. Prináša tiež implementáciu niektorých dostupných metód. Doteraz boli predstavené tri metódy v štyroch článkoch, ktoré riešia tento problém. Všetky priniesli zaujímavé výsledky a táto práca sa dvomi z nich zaoberá hlbšie. Táto práca tiež prináša základnú metódu využívajúcu overené metódy automat- ického orezu obrazu. Táto metóda je využitá na porovnanie so skúmanými metódami, u ktorých zvýrazní ich vylepšenia ale aj nedostatky. Na základe porovnania metód pomocou užívateľského experimentu táto práca usudzuje, že najlepšou zo skúmaných metód pre túto úlohu je upravená varianta metódy od Pavel et al. [14], predstavená v tejto práci.

Page generated in 0.0839 seconds