1 |
Implicit image annotation by using gaze analysis
Hajimirza, S. Navid, January 2012
Thanks to advances in technology, people are storing massive amounts of visual information in online databases. Today it is normal for a person to take a photo of an event with a smartphone and effortlessly upload it to a host domain. For quick access later, this enormous amount of data needs to be indexed by providing metadata for its content. The challenge is to provide suitable captions for the semantics of the visual content. This thesis investigates the possibility of extracting and using the valuable information contained in human eye movements during interaction with digital visual content in order to provide information for image annotation implicitly. A non-intrusive framework is developed that interprets gaze movements to classify the images visited by a user into two classes while the user is searching for a Target Concept (TC) in the images. The first class, TC+, consists of the images that contain the TC; the second class, TC-, consists of the images that do not. By analysing eye movements only, the developed framework was able to identify over 65% of the images that the subjects were searching for, with an accuracy of over 75%. This thesis shows that the information in gaze patterns can be employed to improve the machine’s judgement of image content by assessing human attention to the objects inside virtual environments.
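The two figures quoted above can be read as recall and precision for the TC+ class. That reading is an assumption: the abstract does not define "identify" or "accuracy". The minimal sketch below shows how the two quantities would be computed under that reading; the function name and the toy labels are hypothetical and are not taken from the thesis.

```python
def precision_recall(true_labels, predicted_labels, positive="TC+"):
    """Return (precision, recall) for the positive class.

    Assumed reading of the abstract: recall is the share of TC+ images
    that were found; precision is the share of images labelled TC+
    that really contain the target concept.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (hypothetical): three TC+ images and three TC- images.
truth = ["TC+", "TC+", "TC+", "TC-", "TC-", "TC-"]
guess = ["TC+", "TC+", "TC-", "TC+", "TC-", "TC-"]
precision, recall = precision_recall(truth, guess)
```

With these toy labels, two of the three TC+ images are found and two of the three images labelled TC+ are correct, so both values equal 2/3.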
|
2 |
Design and Analysis of Low Complexity Video Coding for Realtime Communications
Park, Insu, 02 1900
Video coding standards have been designed to support many applications, such as broadcasting systems, the movie industry and media storage. All video coding standards try to reduce the data in video sequences as much as possible by exploiting spatial and temporal redundancies. Although these standards are suitable for a wide variety of applications, some applications require low encoder complexity, specifically for real-time video encoding. Most of the computational complexity of a video encoder can be attributed to the motion estimation function.

Motion estimation using multiple reference frames is widely used as the basis for recent video coding standards (e.g., H.264/AVC) to achieve increased coding efficiency. However, this increases the complexity of the encoding process. In this thesis, new techniques for efficient motion estimation are proposed. A combination of multiple reference frame selection and image residue-based mode selection is used to reduce motion estimation time. By dynamically selecting an initial reference frame in advance, the number of reference frames to be considered is reduced. In addition, variable block size mode decisions are made by examining the residue between the current block and reconstructed blocks in preceding frames. Modified initial motion vector estimation and early stop condition detection are also adopted to speed up the motion estimation procedure. Experimental results compare the performance of the proposed algorithm with state-of-the-art motion estimation algorithms and demonstrate significantly reduced motion estimation time while maintaining PSNR performance.

In addition, a new side information generation algorithm using dynamic motion estimation and post-processing is proposed for improved distributed video coding. Multiple reference frames are employed for motion estimation at the side information frame generation block of the decoder. After motion estimation and compensation, post-processing is applied to improve the hole and overlapped areas in the reconstructed side information frame. Median filtering and a residual-based block selection algorithm are used to deal with hole and overlapped areas, respectively. The proposed side information method contributes to improving the quality of reconstructed frames at the distributed video decoder. The average encoding time of the distributed video coding is shown to be around 15% of that of H.264 inter coding and 40% of that of H.264 intra coding. The proposed side information generation algorithm is implemented in a frequency domain distributed system and tested on various test sequences. The proposed side information based distributed video coding demonstrates improved performance compared with H.264 intra coding.

Experimental implementations of the proposed algorithms are demonstrated using a set of video test sequences that are widely used and freely available. / Thesis / Doctor of Philosophy (PhD)
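To make the idea of early stop condition detection in motion estimation concrete, here is a minimal sketch of generic full-search block matching with a threshold-based early exit. It is not the algorithm proposed in the thesis: the cost function (sum of absolute differences), the stop rule, and all names (`sad`, `search_block`, `stop_threshold`) are assumptions chosen for illustration, and a single reference frame is used.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def search_block(cur, ref, bx, by, size, radius, stop_threshold):
    """Search displacements within +/-radius and return (motion_vector, cost).

    The search ends as soon as a candidate's cost is at or below
    `stop_threshold` (an assumed early stop rule).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    if best_cost <= stop_threshold:
        return best_mv, best_cost
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose block would fall outside the frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
                if best_cost <= stop_threshold:
                    return best_mv, best_cost
    return best_mv, best_cost


# Toy frames (hypothetical): `cur` is `ref` shifted by 2 pixels in x and 1 in y.
ref = [[x * x + 3 * y * y + x * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
mv, cost = search_block(cur, ref, bx=2, by=2, size=2, radius=2, stop_threshold=0)
```

On these toy frames the search returns the displacement `(2, 1)` with cost `0` and stops there without visiting the remaining candidates. A larger `stop_threshold` trades matching accuracy for fewer cost evaluations.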
|