1

Intelligent image cropping and scaling

Deigmoeller, Joerg January 2011 (has links)
Nowadays, a huge number of end devices with different screen properties exist for watching television content, which is either broadcast or transmitted over the internet. To allow the best viewing conditions on each of these devices, different image formats have to be provided by the broadcaster. Producing content for every single format is, however, not feasible for the broadcaster, as it is much too laborious and costly. The most obvious solution for providing multiple image formats is to produce one high-resolution format and derive formats of lower resolution from it. One possibility is to simply scale video images to the resolution of the target image format. Two significant drawbacks are the loss of image detail through downscaling and possibly unused image areas due to letter- or pillarboxes. A preferable solution is to first find the contextually most important region in the high-resolution format and then crop this area with the aspect ratio of the target image format. On the other hand, defining the contextually most important region manually is very time-consuming, and applying this to live productions would be nearly impossible. Therefore, some approaches exist that define cropping areas automatically. To do so, they extract visual features, like moving areas in a video, and define regions of interest (ROIs) based on them. The ROIs are finally used to define an enclosing cropping area. The extraction of features is done without any knowledge about the type of content; hence, these approaches are not able to distinguish between features that might be important in a given context and those that are not. The work presented in this thesis tackles the problem of extracting visual features based on prior knowledge about the content. Such knowledge is fed into the system in the form of metadata that is available from TV production environments. Based on the extracted features, ROIs are then defined and filtered depending on the analysed content. As a proof of concept, the application adapts SDTV (Standard Definition Television) sports productions automatically to image formats of lower resolution through intelligent cropping and scaling. If no content information is available, the system can still be applied to any type of content through a default mode. The presented approach is based on the principle of a plug-in system. Each plug-in represents a method for analysing video content information, either on a low level by extracting image features or on a higher level by processing extracted ROIs. The combination of plug-ins is determined by the incoming descriptive production metadata and hence can be adapted to each type of sport individually. The application has been comprehensively evaluated by comparing the system's results against alternative cropping methods: videos manually cropped by a professional video editor, statically cropped videos, and simply scaled, non-cropped videos. In addition to purely subjective evaluations, the gaze positions of subjects watching sports videos have been measured and compared to the region-of-interest positions extracted by the system.
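
A minimal sketch of the final cropping-and-scaling step the abstract describes, assuming the ROIs have already been extracted by the analysis plug-ins; all function and parameter names here are illustrative, not the thesis' actual implementation:

```python
import cv2
import numpy as np

def crop_to_rois(frame, rois, target_w, target_h):
    """frame: HxWx3 image; rois: list of (x, y, w, h) boxes in pixels."""
    H, W = frame.shape[:2]
    xs = [x for x, _, w, _ in rois] + [x + w for x, _, w, _ in rois]
    ys = [y for _, y, _, h in rois] + [y + h for _, y, _, h in rois]
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    # Grow the enclosing box to the aspect ratio of the target format.
    aspect = target_w / target_h
    bw, bh = x1 - x0, y1 - y0
    if bw / bh < aspect:
        bw = bh * aspect              # box too narrow: widen it
    else:
        bh = bw / aspect              # box too flat: heighten it
    bw, bh = min(bw, W), min(bh, H)   # clamp to the frame bounds
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    left = int(np.clip(cx - bw / 2, 0, W - bw))
    top = int(np.clip(cy - bh / 2, 0, H - bh))
    crop = frame[top:top + int(bh), left:left + int(bw)]
    return cv2.resize(crop, (target_w, target_h), interpolation=cv2.INTER_AREA)
```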
2

Visual odometry from omnidirectional camera

Diviš, Jiří January 2013 (has links)
We present a system that estimates the motion of a robot relying solely on images from an onboard omnidirectional camera (visual odometry). Compared to other visual odometry hardware, ours is unusual in utilizing a high-resolution, low frame-rate (1 to 3 Hz) omnidirectional camera mounted on a robot that is propelled by continuous tracks. We focus on high-precision estimates in scenes where objects are far away from the camera. This is achieved by utilizing an omnidirectional camera that is able to stabilize motion estimates between camera frames that are known to be ill-conditioned for narrow field-of-view cameras. We employ a feature-based approach to estimating camera motion. Given our hardware, possibly large amounts of camera rotation can occur between frames. Thus we use techniques of feature matching rather than feature tracking.
3

Visual odometry from omnidirectional camera

Diviš, Jiří January 2012 (has links)
We present a system that estimates the motion of a robot relying solely on images from an onboard omnidirectional camera (visual odometry). Compared to other visual odometry hardware, ours is unusual in utilizing a high-resolution, low frame-rate (1 to 3 Hz) omnidirectional camera mounted on a robot that is propelled by continuous tracks. We focus on high-precision estimates in scenes where objects are far away from the camera. This is achieved by utilizing an omnidirectional camera that is able to stabilize motion estimates between camera frames that are known to be ill-conditioned for narrow field-of-view cameras, and by the fact that the low frame rate of the imaging system allows us to focus computational resources on exploiting the high-resolution images. We employ a feature-based approach to estimating camera motion. Given our hardware, possibly large amounts of camera rotation can occur between frames. Thus we use techniques of feature matching rather than feature tracking.
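
A hedged sketch of the two-frame, feature-matching egomotion estimation this abstract outlines. For brevity it uses OpenCV's pinhole-camera essential-matrix tools; the thesis' omnidirectional geometry is not reproduced here:

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Estimate rotation and unit-scale translation between two grayscale frames."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    # Match descriptors between frames (matching, not tracking);
    # cross-checking discards asymmetric correspondences.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # A RANSAC-estimated essential matrix copes with large inter-frame rotation.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t
```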
4

Key Technologies in Low-cost Integrated Vehicle Navigation Systems

Zhao, Yueming January 2013 (has links)
Vehicle navigation systems incorporate on-board sensors/signal receivers and provide the necessary positioning and guidance information for land, marine, airborne and space vehicles. Among the different navigation solutions, the Global Positioning System (GPS) and the Inertial Navigation System (INS) are two basic navigation systems. Due to their complementary characteristics in many aspects, GPS/INS integrated navigation has been a hot research topic in recent decades. Both advantages and disadvantages of each individual system and of their combination are analysed in this thesis. Micro Electrical Mechanical Sensors (MEMS) successfully solved the problems of price, size and weight associated with traditional INS, and hence are widely applied in GPS/INS integrated systems. The main problem of MEMS is the large sensor errors, which degrade the navigation performance at an exponential rate. By means of different methods, such as the autoregressive model, the Gauss-Markov process, Power Spectral Density and Allan Variance, we analyse the stochastic errors within the MEMS sensors. The test results show that the different methods give similar estimates of the stochastic error sources. An equivalent model of the coloured noise components (random walk, bias instability and ramp noise) is given.

Three levels of GPS/IMU integration structures, i.e. loose, tight and ultra-tight GPS/IMU navigation, are introduced with a brief analysis of each. The loose integration principles are presented with detailed equations, as are the INS navigation principles. The Extended Kalman Filter (EKF) is introduced as the data fusion algorithm, which is the core of the whole navigation system. Based on the system model, we show the propagation of position standard errors with the tight integration structure under different scenarios. Even fewer than four observable GNSS satellites can contribute to the integrated system, especially for the orientation errors. A real test with loose integration is carried out, and the EKF performance is analysed in detail.

Since GPS receivers normally work together with a digital map, the map matching principle and its link-choosing problem are briefly introduced. We propose to solve this problem by lane detection from real-time images, and present the procedures for lane detection based on image processing. Tests on highways, city streets and pathways were successfully carried out, and analyses with possible solutions are given for some special failure situations. To counter the large error drift of the IMU, we propose to support the IMU orientation with camera motion estimation from image pairs. First, estimation theory and computer vision principles are briefly introduced; then both point and line matching algorithms are given. Finally, the L1-norm estimator with balanced adjustment is proposed to deal with possible mismatches (outliers). Tests and comparisons with the RANSAC algorithm are also presented.

For the latest trend of MEMS chip sensors, their industry and market are introduced. To evaluate the MEMS navigation performance, we augment the EKF with an equivalent coloured noise model, and a basic observability analysis is given. A realistic simulated navigation test is carried out with single and multiple MEMS sensors, and a sensor array of 5-10 sensors is recommended according to the test results and analysis. Finally, some suggestions for future research are proposed.
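
A minimal sketch of the loosely coupled fusion step the abstract describes: the INS propagates a state that a Kalman-style update corrects with each GPS position fix. The full error-state model (attitude, sensor biases, coloured noise) is omitted, and the numeric values are illustrative assumptions:

```python
import numpy as np

def kf_update(x, P, z_gps, H, R):
    """Standard Kalman measurement update with a GPS position fix z_gps."""
    y = z_gps - H @ x                      # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# State: [px, py, pz, vx, vy, vz]; GPS observes position only.
x = np.zeros(6)                                # INS-propagated state (placeholder)
P = np.eye(6) * 10.0                           # state covariance
H = np.hstack([np.eye(3), np.zeros((3, 3))])   # position measurement model
R = np.eye(3) * 2.0**2                         # ~2 m GPS position noise (assumed)
x, P = kf_update(x, P, np.array([1.0, 2.0, 0.5]), H, R)
```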
5

A rhetorical analysis of Elizabeth Barret's Stranger with a Camera

McCann, Elisabeth L. S. January 2002 (has links)
This study explores how the context of an event can be reconstructed in order to change an event's meaning and how the recontextualization can influence perceptions of a community. The artifact examined is a documentary film produced by Appalshop, Stranger with a Camera, directed by Elizabeth Barret. Chapter One includes an introduction to Stranger with a Camera and work by scholars related to the study of documentary film. The research focus guiding the analysis is an examination of how Barret reconstructs the context of a murder in Jeremiah, Kentucky in order to alter the event's significance and meaning, and how her reconstruction may influence dominant social perceptions of a community. Chapter Two describes the method to be used in the analysis, cluster analysis as developed by Kenneth Burke. The process of cluster analysis entails: 1) identifying the key terms in the rhetoric, 2) charting the terms that cluster around the key terms, 3) discovering emergent patterns in the clusters, and 4) naming the motive, or situation, based on the meanings of the key terms. Chapter Three is a cluster analysis of Stranger with a Camera. Key terms found in this analysis are "picture," "camera," "shooting," "media," "poverty," and "social action." Chapter Four contains conclusions pertaining to the analysis of the rhetorical artifact, conclusions for cluster analysis as a rhetorical methodology, and future considerations for academic scholarship. / Department of Communication Studies
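
Purely as an illustration of step 2 of the cluster-analysis procedure enumerated in the abstract, a hypothetical sketch that charts the terms co-occurring near each key term in a transcript; the key terms follow the abstract, while the window size and function names are assumptions:

```python
import re
from collections import Counter

KEY_TERMS = ["picture", "camera", "shooting", "media", "poverty"]

def cluster_chart(text, window=10):
    """For each key term, count words appearing within `window` words of it."""
    words = re.findall(r"[a-z']+", text.lower())
    chart = {}
    for key in KEY_TERMS:
        neighbours = Counter()
        for i, w in enumerate(words):
            if w == key:
                # collect words on either side of the key term occurrence
                neighbours.update(words[max(0, i - window):i])
                neighbours.update(words[i + 1:i + 1 + window])
        chart[key] = neighbours.most_common(10)
    return chart
```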
6

Line Based Estimation of Object Space Geometry and Camera Motion

Srestasathiern, Panu 31 August 2012 (has links)
No description available.
7

Určení parametrů pohybu ze snímků kamery / Determination of Motion Parameters in Machine Vision

Dušek, Stanislav January 2009 (has links)
This thesis deals with the determination of camera motion parameters in the plane. First, the basics of motion tracking are introduced, with a focus on finding the displacement between two input images. The GoodFeaturesToTrack algorithm is then described, which finds the most significant points in the first image. These points are chosen to be easy to track in the next image, which reduces the data volume and prepares the input (an array of significant points) for the Lucas-Kanade optical flow algorithm. The second part deals with processing and utilizing the optical flow estimates. Median filtering is applied, and the computation of a homogeneous transformation is then described, which can express any affine transformation in affine space. The result is a pair of coordinates describing the shift between the two input images as X-axis and Y-axis values. The project uses the Open Computer Vision library (OpenCV).
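
Since the abstract names OpenCV's GoodFeaturesToTrack and Lucas-Kanade optical flow directly, a condensed sketch of that pipeline might look as follows; the RANSAC affine fit stands in for the thesis' median-filtering step and is an assumption:

```python
import cv2
import numpy as np

def frame_shift(prev_gray, next_gray):
    """Estimate the X/Y shift between two grayscale frames."""
    # Pick corners that are easy to track in the next frame.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=500,
                                  qualityLevel=0.01, minDistance=7)
    # Follow them with pyramidal Lucas-Kanade optical flow.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    good_old = pts[status.ravel() == 1]
    good_new = nxt[status.ravel() == 1]
    # Fit a similarity/affine transform robustly (RANSAC) to the point pairs.
    A, _ = cv2.estimateAffinePartial2D(good_old, good_new)
    dx, dy = A[0, 2], A[1, 2]   # translation components of the transform
    return dx, dy
```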
8

Video anatomy: spatial-temporal video profile

Cai, Hongyuan 31 July 2014 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / A massive number of videos are uploaded to video websites, so smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations for expanding the field of view, emphasizing events, and expressing cinematic effect. To digest the heterogeneous videos in video websites and databases, video clips are profiled into a 2D image scroll containing both spatial and temporal information for video preview. The video profile is visually continuous, compact, scalable, and indexes each frame. This work analyzes the camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as their combinations. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation and video segmentation for smooth camera operations, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm has been designed to extract the major flow direction and convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space less influenced by the camera's ego-motion. The motion blur technique has also been used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
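
As a hedged illustration of extracting a dominant flow direction between frames, a sketch using dense Farneback flow as a stand-in; the thesis' condensed-image algorithm and convergence-factor computation are not reproduced here:

```python
import cv2
import numpy as np

def major_flow_direction(prev_gray, next_gray):
    """Return the dominant flow angle (degrees) and its magnitude."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mean = flow.reshape(-1, 2).mean(axis=0)           # average (dx, dy)
    angle = np.degrees(np.arctan2(mean[1], mean[0]))  # dominant direction
    magnitude = float(np.linalg.norm(mean))           # ego-motion strength
    return angle, magnitude
```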
9

Robust Subspace Estimation Using Low-rank Optimization: Theory and Applications in Scene Reconstruction, Video Denoising, and Activity Recognition

Oreifej, Omar 01 January 2013 (has links)
In this dissertation, we discuss the problem of robust linear subspace estimation using low-rank optimization and propose three formulations of it. We demonstrate how these formulations can be used to solve fundamental computer vision problems, and provide superior performance in terms of accuracy and running time. Consider a set of observations extracted from images (such as pixel gray values, local features, trajectories, etc.). If the assumption that these observations are drawn from a linear subspace (or can be linearly approximated) is valid, then the goal is to represent each observation as a linear combination of a compact basis, while maintaining a minimal reconstruction error. One of the earliest, yet most popular, approaches to achieve this is Principal Component Analysis (PCA). However, PCA can only handle Gaussian noise, and thus suffers when the observations are contaminated with gross and sparse outliers. To this end, in this dissertation, we focus on estimating the subspace robustly using low-rank optimization, where the sparse outliers are detected and separated through the ℓ1 norm. The robust estimation has a two-fold advantage: first, the obtained basis better represents the actual subspace because it does not include contributions from the outliers; second, the detected outliers are often of specific interest in many applications, as we show throughout this thesis.

We demonstrate four different formulations and applications of low-rank optimization. First, we consider the problem of reconstructing an underwater sequence by removing the turbulence caused by water waves. The main drawback of most previous attempts to tackle this problem is that they depend heavily on modelling the waves, which is in fact ill-posed, since the actual behavior of the waves along with the imaging process is complicated and includes several noise components; therefore, their results are not satisfactory. In contrast, we propose a novel approach which outperforms the state of the art. The intuition behind our method is that in a sequence where the water is static, the frames would be linearly correlated. Therefore, in the presence of water waves, we may consider the frames as noisy observations drawn from the subspace of linearly correlated frames. However, the noise introduced by the water waves is not sparse, and thus cannot be detected directly using low-rank optimization. Therefore, we propose a data-driven two-stage approach, where the first stage "sparsifies" the noise and the second stage detects it. The first stage leverages the temporal mean of the sequence to overcome the structured turbulence of the waves through an iterative registration algorithm. The result of the first stage is a high-quality mean and a better-structured sequence; however, the sequence still contains unstructured sparse noise. Thus, we employ a second stage, in which we extract the sparse errors from the sequence through rank minimization. Our method converges faster and drastically outperforms the state of the art on all testing sequences.

Secondly, we consider a closely related situation where an independently moving object is also present in the turbulent video. More precisely, we consider video sequences acquired in desert battlefields, where atmospheric turbulence is typically present in addition to independently moving targets. Typical approaches to turbulence mitigation follow averaging or de-warping techniques. Although these methods can reduce the turbulence, they distort the independently moving objects, which can often be of great interest. Therefore, we address the problem of simultaneous turbulence mitigation and moving object detection. We propose a novel three-term low-rank matrix decomposition approach in which we decompose the turbulent sequence into three components: the background, the turbulence, and the object. We simplify this extremely difficult problem into a minimization of the nuclear norm, the Frobenius norm, and the ℓ1 norm. Our method is based on two observations: first, the turbulence causes dense, Gaussian noise, and therefore can be captured by the Frobenius norm, while the moving objects are sparse and thus can be captured by the ℓ1 norm; second, since the object's motion is linear and intrinsically different from the Gaussian-like turbulence, a Gaussian-based turbulence model can be employed to enforce an additional constraint on the search space of the minimization. We demonstrate the robustness of our approach on challenging sequences which are significantly distorted by atmospheric turbulence and include extremely tiny moving objects.

In addition to robustly detecting the subspace of the frames of a sequence, we consider using trajectories as observations in the low-rank optimization framework. In particular, in videos acquired by moving cameras, we track all the pixels in the video and use them to estimate the camera motion subspace. This is particularly useful in activity recognition, which typically requires standard preprocessing steps such as motion compensation, moving object detection, and object tracking. The errors from the motion compensation step propagate to the object detection stage, resulting in missed detections, which further complicates the tracking stage, resulting in cluttered and incorrect tracks. In contrast, we propose a novel approach which does not follow the standard steps, and accordingly avoids the aforementioned difficulties. Our approach is based on Lagrangian particle trajectories, which are a set of dense trajectories obtained by advecting optical flow over time, thus capturing the ensemble motions of a scene. This is done in frames of unaligned video, and no object detection is required. In order to handle the moving camera, we decompose the trajectories into their camera-induced and object-induced components. Having obtained the relevant object motion trajectories, we compute a compact set of chaotic invariant features, which capture the characteristics of the trajectories. Consequently, an SVM is employed to learn and recognize the human actions using the computed motion features. We performed intensive experiments on multiple benchmark datasets and obtained promising results.

Finally, we consider a more challenging problem referred to as complex event recognition, where the activities of interest are complex and unconstrained. This problem typically poses significant challenges because it involves videos of highly variable content, noise, length, frame size, etc. In this extremely challenging task, high-level features have recently shown a promising direction, as in [53, 129], where core low-level events referred to as concepts are annotated and modelled using a portion of the training data, and each event is then described by its content of these concepts. However, because of the complex nature of the videos, both the concept models and the corresponding high-level features are significantly noisy. In order to address this problem, we propose a novel low-rank formulation, which combines the precisely annotated videos used to train the concepts with the rich high-level features. Our approach finds a new representation for each event which is not only low-rank, but also constrained to adhere to the concept annotation, thus suppressing the noise and maintaining a consistent occurrence of the concepts in each event. Extensive experiments on the large-scale real-world datasets TRECVID Multimedia Event Detection 2011 and 2012 demonstrate that our approach consistently improves the discriminativity of the high-level features by a significant margin.
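
A minimal sketch of the core low-rank-plus-sparse decomposition (principal component pursuit) that underlies these formulations, solved here by simple alternating shrinkage; the parameter defaults follow the common RPCA convention, and this loop is illustrative rather than the dissertation's actual solver:

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_shrink(X, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(D, n_iters=100):
    """Decompose D into a low-rank part L and sparse outliers S, D ~ L + S."""
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n))           # standard sparsity weight
    mu = 0.25 * m * n / np.abs(D).sum()      # augmented Lagrangian parameter
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)
    for _ in range(n_iters):
        L = svd_shrink(D - S + Y / mu, 1.0 / mu)  # nuclear-norm step
        S = shrink(D - L + Y / mu, lam / mu)      # l1-norm step
        Y += mu * (D - L - S)                     # dual ascent on the residual
    return L, S
```

Stacking each video frame as a column of D and running rpca would recover the linearly correlated background in L and the sparse errors in S, the basic operation behind the applications above.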
