1 |
On Causal Video Coding with Possible Loss of the First Encoded Frame
Eslamifar, Mahshad, January 2013
Multiple Description Coding (MDC) was first formulated by A. Gersho and H. Witsenhausen as a way to improve the robustness of telephony links to outages, and it has been studied extensively since. Another application of MDC is the transmission of an image in different descriptions: if any one of the descriptions is lost because of a link outage during transmission, the image can still be reconstructed with some quality at the decoder side. In video coding, inter prediction is a way to reduce temporal redundancy.
From an information-theoretic point of view, one can model inter prediction with Causal Video Coding (CVC). If an I-frame is lost because of a link outage, how can we reconstruct the corresponding P- or B-frames at the decoder? This thesis addresses that question. We call this scenario causal video coding with possible loss of the first encoded frame and denote it by CVC-PL, where PL stands for possible loss.
In this thesis, CVC-PL is investigated for the first time. Although, due to lack of time, we mostly study two-frame CVC-PL, we extend the problem to M-frame CVC-PL as well.
To provide more insight into two-frame CVC-PL, we derive an outer bound on the achievable rate-distortion region to show that the CVC-PL region is a subset of the region combining CVC and peer-to-peer coding. In addition, we propose and prove a new achievable region to highlight the fact that two-frame CVC-PL can be viewed as MDC followed by CVC. Afterwards, we present the main theorem of this thesis, which gives the minimum total rate of CVC-PL with two jointly Gaussian sources X_1 and X_2 with normalized correlation coefficient r, for different distortion profiles (D_1, D_2, D_3). Defining D_r = r^2(D_1 - 1) + 1, we show that for small D_3, i.e. D_3 < D_r + D_2 - 1, CVC-PL can be treated as CVC with two jointly Gaussian sources; for large D_3, i.e. D_3 > D_r D_2/(D_r + D_2 - D_r D_2), CVC-PL can be treated as two parallel peer-to-peer networks with distortion constraints D_1 and D_2; and for all other values of D_3, the minimum total rate is
    \frac{1}{2} \log \frac{(1+\lambda)(D_3+\lambda)}{(D_r+\lambda)(D_2+\lambda)} + \frac{1}{2} \log \frac{D_r}{D_1 D_3},
where
    \lambda = \frac{D_3 - D_r D_2 + r \sqrt{(1-D_1)(1-D_2)(D_3-D_r)(D_3-D_2)}}{D_r + D_2 - (D_3+1)}.
We also determine the optimal coding scheme which achieves the minimum total rate.
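The three cases of the theorem can be made concrete with a small sketch. It assumes unit-variance sources, squared-error distortion, and base-2 logarithms, none of which the abstract states explicitly. The function names are hypothetical, and the two limiting rates are the standard Gaussian expressions that the abstract's description suggests, not formulas quoted from the thesis. The intermediate case is only classified, not evaluated.

```python
import math

def d_r(r, d1):
    # D_r = r^2 (D1 - 1) + 1, as defined in the abstract
    return r * r * (d1 - 1.0) + 1.0

def regime(r, d1, d2, d3):
    """Classify a distortion profile (D1, D2, D3) into one of the three cases."""
    dr = d_r(r, d1)
    low = dr + d2 - 1.0                    # D3 below this: behaves like CVC
    high = dr * d2 / (dr + d2 - dr * d2)   # D3 above this: two parallel links
    if d3 < low:
        return "cvc"
    if d3 > high:
        return "parallel"
    return "intermediate"

def limiting_rate(r, d1, d2, d3):
    """Total rate in bits for the two limiting cases (assumed formulas)."""
    kind = regime(r, d1, d2, d3)
    if kind == "cvc":
        # assumed: Gaussian CVC with distortions D1 and D3
        return 0.5 * math.log2(d_r(r, d1) / (d1 * d3))
    if kind == "parallel":
        # assumed: two independent Gaussian rate-distortion functions
        return 0.5 * math.log2(1.0 / (d1 * d2))
    raise ValueError("intermediate case: needs the closed-form expression")
```

For example, with r = 0.5 and D1 = D2 = 0.5 the thresholds are 0.375 and about 0.467, so D3 = 0.3 falls in the CVC case and D3 = 0.5 in the parallel case.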
We conclude the thesis by comparing two-frame CVC-PL with a coding scheme in which both sources are available at the encoders, i.e. distributed source coding versus centralized source coding. We show that for small D_2 or large D_3, distributed source coding can perform as well as centralized source coding. Finally, we discuss future work and formulate the extension of the problem to M sources.
|
2 |
Rate Distortion Theory for Causal Video Coding: Characterization, Computation Algorithm, Comparison, and Code Design
Zheng, Lin, January 2012
Due to the sheer volume of data involved, video coding is an important application of lossy source coding, and has received wide industrial interest and support as evidenced by the development and success of a series of video coding standards. All MPEG-series and H-series video coding standards proposed so far are based upon a video coding paradigm called predictive video coding, where video source frames Xᵢ, i = 1, 2, ..., N, are encoded frame by frame, and the encoder and decoder for each frame Xᵢ enlist help only from all previous encoded frames S_j, j = 1, 2, ..., i-1.
In this thesis, we look beyond all existing and proposed video coding standards and introduce a new coding paradigm called causal video coding, in which the encoder for each frame Xᵢ can use all previous original frames X_j, j = 1, 2, ..., i-1, and all previous encoded frames S_j, while the corresponding decoder can use only the previous encoded frames. All studies, comparisons, and designs on causal video coding are considered from an information-theoretic point of view.
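The difference between the two paradigms is only in what the encoder may look at. The following sketch lists the side information for frame i under each paradigm; the function name and the string labels are hypothetical and serve only to illustrate the definitions above.

```python
def side_information(i, paradigm):
    """Frames available when coding frame i (1-based) under a given paradigm."""
    prev_encoded = [f"S{j}" for j in range(1, i)]    # S_1, ..., S_{i-1}
    prev_original = [f"X{j}" for j in range(1, i)]   # X_1, ..., X_{i-1}
    if paradigm == "predictive":
        # encoder sees the current frame and previous encoded frames only
        encoder = [f"X{i}"] + prev_encoded
    elif paradigm == "causal":
        # encoder additionally sees all previous original frames
        encoder = [f"X{i}"] + prev_original + prev_encoded
    else:
        raise ValueError("paradigm must be 'predictive' or 'causal'")
    # in both paradigms the decoder sees previous encoded frames only
    return {"encoder": encoder, "decoder": prev_encoded}
```

The decoder side is identical in both cases, which is why the two paradigms can be compared at the same distortion levels.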
Let R*c(D₁,...,D_N) (respectively R*p(D₁,...,D_N)) denote the minimum total rate required to achieve a given distortion level D₁,...,D_N > 0 in causal video coding (respectively predictive video coding). A novel computation approach is proposed to analytically characterize, numerically compute, and compare the minimum total rate of causal video coding R*c(D₁,...,D_N) required to achieve a given distortion (quality) level D₁,...,D_N > 0. Specifically, we first show that for jointly stationary and ergodic sources X₁, ..., X_N, R*c(D₁,...,D_N) is equal to the infimum of the n-th order total rate distortion function R_{c,n}(D₁,...,D_N) over all n, where R_{c,n}(D₁,...,D_N) itself is given by the minimum of an information quantity over a set of auxiliary random variables. We then present an iterative algorithm for computing R_{c,n}(D₁,...,D_N) and demonstrate the convergence of the algorithm to the global minimum. The global convergence of the algorithm further enables us not only to establish a single-letter characterization of R*c(D₁,...,D_N) in a novel way when the N sources are an independent and identically distributed (IID) vector source, but also to demonstrate a somewhat surprising result (dubbed the more and less coding theorem): under some conditions on source frames and distortion, the more frames need to be encoded and transmitted, the less data after encoding has to be actually sent. With the help of the algorithm, it is also shown by example that R*c(D₁,...,D_N) is in general much smaller than the total rate offered by the traditional greedy coding method, by which each frame is encoded in a locally optimal manner based on all information available to the encoder of the frame. As a by-product, an extended Markov lemma is established for correlated ergodic sources.
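The abstract does not spell out the iterative algorithm for R_{c,n}. As a point of reference only, the sketch below shows the classical Blahut-Arimoto iteration, which computes a point on the ordinary rate distortion curve of a single discrete memoryless source by alternating minimization. It is a swapped-in, simpler technique and not the thesis's algorithm; the function name, the slope parameter `beta`, and the iteration count are assumptions made for illustration.

```python
import math

def blahut_arimoto(p_x, dist, beta, iters=200):
    """Return (D, R in bits) on the single-source R(D) curve at slope beta."""
    n_x, n_y = len(p_x), len(dist[0])
    q = [1.0 / n_y] * n_y  # reproduction marginal, initialised uniform
    for _ in range(iters):
        # test channel Q(y|x) proportional to q(y) * exp(-beta * d(x, y))
        Q = []
        for x in range(n_x):
            w = [q[y] * math.exp(-beta * dist[x][y]) for y in range(n_y)]
            z = sum(w)
            Q.append([v / z for v in w])
        # reproduction marginal induced by the current test channel
        q = [sum(p_x[x] * Q[x][y] for x in range(n_x)) for y in range(n_y)]
    # expected distortion and mutual information of the final test channel
    D = sum(p_x[x] * Q[x][y] * dist[x][y]
            for x in range(n_x) for y in range(n_y))
    R = sum(p_x[x] * Q[x][y] * math.log2(Q[x][y] / q[y])
            for x in range(n_x) for y in range(n_y) if Q[x][y] > 0)
    return D, R
```

For a uniform binary source with Hamming distortion, the returned pair satisfies the known closed form R = 1 - h(D), where h is the binary entropy function, which makes a convenient sanity check.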
From an information-theoretic point of view, it is interesting to compare causal video coding and predictive video coding, which all existing video coding standards proposed so far are based upon. In this thesis, by fixing N = 3, we first derive a single-letter characterization of R*p(D₁,D₂,D₃) for an IID vector source (X₁,X₂,X₃) where X₁ and X₂ are independent, and then demonstrate the existence of such X₁,X₂,X₃ for which R*p(D₁,D₂,D₃) > R*c(D₁,D₂,D₃) under some conditions on source frames and distortion. This result makes causal video coding an attractive framework for future video coding systems and standards.
The design of causal video coding is also considered in the thesis from an information-theoretic perspective by modeling each frame as a stationary information source. We first put forth a concept called causal scalar quantization, and then propose an algorithm for designing optimum fixed-rate causal scalar quantizers for causal video coding to minimize the total distortion among all sources. Simulation results show that, in comparison with fixed-rate predictive scalar quantization, fixed-rate causal scalar quantization offers up to 16% quality improvement (distortion reduction).
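The causal design algorithm itself is not described in the abstract. For orientation, the sketch below shows the classical Lloyd iteration for a fixed-rate scalar quantizer of a single source, trained on samples. It is a swapped-in baseline technique, not the thesis's causal quantizer; the function name, the evenly spaced initial codebook, and the iteration count are assumptions.

```python
def lloyd_quantizer(samples, levels, iters=50):
    """Design a fixed-rate scalar quantizer; return (codebook, mean squared error)."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / levels
    # initial codebook: evenly spaced points across the sample range
    codebook = [lo + (k + 0.5) * step for k in range(levels)]
    for _ in range(iters):
        # nearest-neighbour step: assign each sample to its closest codeword
        cells = [[] for _ in codebook]
        for x in samples:
            k = min(range(levels), key=lambda j: (x - codebook[j]) ** 2)
            cells[k].append(x)
        # centroid step: move each codeword to the mean of its cell
        codebook = [sum(c) / len(c) if c else codebook[k]
                    for k, c in enumerate(cells)]
    mse = sum(min((x - c) ** 2 for c in codebook) for x in samples) / len(samples)
    return codebook, mse
```

With `levels = 2 ** b` the quantizer spends a fixed b bits per sample, which is the fixed-rate setting the abstract refers to.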
|