Light Field Video Processing and Streaming Using Applied AI

As a new form of volumetric media, a Light Field (LF) can provide users with a true 6 Degrees-of-Freedom (DoF) immersive experience, because an LF captures the scene with photo-realism, including aperture-limited changes in viewpoint. However, the larger size and higher dimensionality of LF data make processing and transmission more challenging. The main focus of this study is the application of applied Artificial Intelligence (AI) methods to the transmission and processing of LF data, thereby alleviating the performance bottlenecks of existing methods.
Uncompressed LF data are too large for network transmission, which is why LF compression has become an important research topic. This work proposes a new LF compression algorithm based on a Graph Neural Network (GNN). It uses a graph network model to fit the similarity between LF viewpoints, so that only the data of a few essential anchor viewpoints need to be transmitted after compression, and the complete LF matrix can be reconstructed from the graph model at the decoding end. Through a two-layer compression structure, the method also addresses the weak generalization of LF reconstruction algorithms on high-frequency components. Compared with existing compression methods, the algorithm achieves a higher compression ratio and better quality.
Furthermore, to better meet the real-time requirements of different LF applications and the robustness requirements of unreliable network environments, an adaptive LF video transmission scheme based on Multiple Description Coding (MDC) is proposed. It divides the LF matrix into LF descriptions at different downsampling ratios and optimizes the scheduling of the description transmission queue. It can also adaptively adjust the design of the basic GNN unit, so that the proposed method adapts more flexibly to real-time changes in user viewpoint requests, avoids unnecessary viewpoint transmission overhead as far as possible, and minimizes the adverse impact of packet loss and fluctuating network conditions on LF transmission services.
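The idea of dividing the LF matrix into descriptions can be illustrated with a minimal sketch. It is not the scheme from the thesis: it assumes a generic polyphase split of the angular viewpoint grid with a single uniform downsampling step, whereas the thesis uses several downsampling levels and an optimized transmission queue. The function names, the 9 x 9 grid, and the step of 3 are all hypothetical.

```python
def split_into_descriptions(rows, cols, step):
    """Partition the (row, col) viewpoint indices of a rows x cols LF matrix
    into step * step descriptions by polyphase subsampling (an assumption,
    not the thesis's actual partitioning rule)."""
    descriptions = {}
    for r in range(rows):
        for c in range(cols):
            # Viewpoints sharing the same offset modulo `step` form one description.
            key = (r % step, c % step)
            descriptions.setdefault(key, []).append((r, c))
    return descriptions


def received_viewpoints(descriptions, received_keys):
    """Return the sorted viewpoints available when only some descriptions arrive."""
    views = []
    for key in received_keys:
        views.extend(descriptions[key])
    return sorted(views)


# Hypothetical 9 x 9 LF matrix split with a downsampling step of 3.
descs = split_into_descriptions(9, 9, 3)
partial = received_viewpoints(descs, [(0, 0), (1, 1)])
```

Each description on its own is a coarse but evenly spread sampling of the viewpoint grid, so losing some descriptions lowers angular density rather than removing a whole region of viewpoints.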
For LF processing, depth estimation has been an active research topic in recent years. To balance performance on both narrow- and wide-baseline LF data, a novel optical-flow-based LF depth estimation scheme is proposed, which uses a convolutional neural network (CNN) to predict the patch matrix after the optical-flow offset. After the optical-flow-assisted offset, the disparity between patches is brought into a unified numerical range, which effectively addresses the overfitting of the LF depth estimation network caused by the uneven distribution of baseline ranges across LF samples. Experimental results show that the proposed uniform-patch-based estimation mechanism generalizes well to LF data of different baselines and is compatible with various existing narrow-baseline LF depth estimation algorithms.
Finally, since LF processing places high demands on both the computing and caching capabilities of the infrastructure, this thesis proposes a framework that combines Multi-access Edge Computing (MEC) technology with LF applications. The problem is transformed using Lyapunov optimization, and an optimized search algorithm based on the Markov approximation method is designed, which adaptively schedules and adjusts the task offloading strategy and resource allocation scheme so as to provide users with the best service experience in the LF viewpoint interpolation task. Numerical results demonstrate that this edge-based framework achieves a dynamic balance between energy and caching consumption while meeting the low-latency requirements of LF applications.
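The Lyapunov step mentioned above can be illustrated with the standard drift-plus-penalty rule. This is a generic sketch under assumed conditions, not the formulation from the thesis: a single task queue, a fixed arrival rate, two candidate actions with made-up service and energy values, and a trade-off weight `v`. The Markov approximation search is not shown.

```python
def update_queue(backlog, arrivals, service):
    """Standard queue recursion: Q(t+1) = max(Q(t) - b(t), 0) + a(t)."""
    return max(backlog - service, 0.0) + arrivals


def choose_action(backlog, arrivals, actions, v):
    """Pick the action minimising v * energy + backlog * (arrivals - service).

    `actions` is a list of (name, service, energy) tuples; all values here
    are hypothetical and only illustrate the trade-off.
    """
    return min(actions, key=lambda a: v * a[2] + backlog * (arrivals - a[1]))


# Hypothetical actions: local processing is cheap but slow, offloading is fast but costly.
ACTIONS = [("local", 1.0, 1.0), ("offload", 3.0, 4.0)]

idle_choice = choose_action(0.0, 2.0, ACTIONS, v=1.0)
busy_choice = choose_action(10.0, 2.0, ACTIONS, v=1.0)
```

With an empty queue the energy term dominates and the cheaper action wins; as the backlog grows, the queue term dominates and the faster action is chosen. A larger `v` shifts the balance toward saving energy at the cost of longer queues.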

Light Field

Light Field Compression

Light Field Streaming

Depth Estimation

Edge Computing

Identifier	oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/44268
Date	16 November 2022
Creators	Hu, Xinjue
Contributors	Shirmohammadi, Shervin
Publisher	Université d'Ottawa / University of Ottawa
Source Sets	Université d’Ottawa
Language	English
Detected Language	English
Type	Thesis
Format	application/pdf
