Global ETD Search

1	Coding Performance Enhancement: Motion Estimation and Video Transcoding Wu, Ming-te 05 June 2009 With the rapid growth of multimedia information, video coding standards have become crucial for transmitting large amounts of video data. Motion estimation is key to high performance in video coding because it removes the temporal redundancy of video data for storage and transmission. Video transcoding has also become an important technique for adapting video to different bandwidths. Because both are fundamental, motion estimation and video transcoding have been researched extensively. In this thesis, an overview of video compression techniques is presented with emphasis on motion estimation. Then, a survey of the most representative motion estimation search algorithms is given and the proposed motion estimation algorithms are introduced. These algorithms are evaluated and analyzed in a number of experiments on several well-known test video sequences. In addition, an efficient video transcoding method is proposed that uses a visual attention model with Lagrange optimization to minimize the rate-distortion cost. Finally, future trends in video coding are discussed. With the proposed motion estimation algorithms, the computational complexity can be significantly reduced, although the objective quality of motion-compensated images is slightly degraded. Moreover, with the proposed video transcoding method, the bit rate can be reduced to fit the bandwidth requirement. video transcoding motion estimation
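For readers unfamiliar with the baseline that fast motion estimation algorithms are usually measured against, the following is a minimal sketch of exhaustive (full-search) block matching with a sum-of-absolute-differences cost. It is not the algorithm proposed in the thesis above; the function names, the 4x4 block size, and the search range are illustrative assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=4, search=2):
    """Return (dx, dy, cost) minimizing SAD over a +/-`search` pixel window.

    Frames are lists of rows of integer luma samples (hypothetical layout).
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block falls outside the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The cost of this search grows with the square of the search range, which is the complexity that fast search algorithms try to reduce in exchange for a small quality loss.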
2	An Efficient Motion Estimation Method for H.264-Based Video Transcoding with Arbitrary Spatial Resolution Conversion Wang, Jiao January 2007 As wireless and wired network connectivity rapidly expands and the number of network users steadily increases, it has become increasingly important to support universal access to multimedia content over the whole network. A big challenge, however, is the great diversity of network devices, from full-screen computers to small smartphones. This has led to research on transcoding, which involves efficiently reformatting compressed data from its original high resolution to a desired spatial resolution supported by the display device. In particular, there is great momentum in the multimedia industry for H.264-based transcoding, as H.264 has been widely adopted as a mandatory player feature in applications ranging from television broadcast to video for mobile devices. While H.264 contains many new features for effective video coding with excellent rate-distortion (RD) performance, a major issue in transcoding H.264 compressed video from one spatial resolution to another is computational complexity, specifically that of motion-compensated prediction (MCP). MCP is the main contributor to the excellent RD performance of H.264 video compression, yet it is very time-consuming. In general, a brute-force search is used to find the best motion vectors for MCP. In transcoding, however, an immediate idea for improving MCP efficiency in the re-encoding procedure is to reuse the motion vectors in the original compressed stream. Intuitively, motion in the high-resolution scene is highly related to that in the downscaled scene. In this thesis, we study homogeneous video transcoding from H.264 to H.264. 
Specifically, for video transcoding with arbitrary spatial resolution conversion, we propose a motion vector estimation algorithm based on a multiple linear regression model, which systematically utilizes the motion information in the original scenes. We also propose a practical solution for efficiently determining a reference frame, to take advantage of the new multiple-reference feature in H.264. The performance of the algorithm was assessed in an H.264 transcoder. Experimental results show that, compared with a benchmark solution, the proposed method significantly reduces transcoding complexity without much degrading the video quality. H.264 video transcoding Electrical and Computer Engineering
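The idea of reusing motion vectors from the original stream can be illustrated with a simple baseline: average the motion vectors of the source blocks that a downscaled block covers, then scale by the resolution ratio. This is a sketch of that common baseline only, not the multiple linear regression model proposed in the thesis; the function name and the data layout are assumptions.

```python
def scale_and_average(source_mvs, src_size, dst_size):
    """Estimate one downscaled-block motion vector from source-block vectors.

    source_mvs: list of (mvx, mvy) in source-resolution pixels (assumed layout).
    src_size, dst_size: (width, height) of the source and target frames.
    """
    if not source_mvs:
        return (0.0, 0.0)
    # Horizontal and vertical ratios may differ for arbitrary conversions.
    scale_x = dst_size[0] / src_size[0]
    scale_y = dst_size[1] / src_size[1]
    mean_x = sum(mv[0] for mv in source_mvs) / len(source_mvs)
    mean_y = sum(mv[1] for mv in source_mvs) / len(source_mvs)
    return (mean_x * scale_x, mean_y * scale_y)
```

A transcoder would typically use such an estimate as the starting point for a small refinement search rather than as the final vector.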
4	Coding Modes Probability Modeling for H.264/AVC to SVC Video Transcoding Wu, Shih-Tse 06 September 2011 Scalable video coding (SVC) supports full scalability by extracting a partial bitstream to adapt to transmission and display requirements in multimedia applications. Most conventional video content is stored in a non-scalable format, e.g., H.264/AVC, necessitating the development of efficient video transcoding from a conventional format to a scalable one. This work describes a fast video transcoding architecture that overcomes the complexity of the different coding structures of H.264/AVC and SVC. The proposed algorithm simplifies the mode decision process in SVC, which is computationally heavy. The current mode in SVC is selected as the one with the highest conditional probability given the H.264/AVC mode. The point at which a prediction error occurs is then detected using Bayes' theorem, followed by refinement using a Markov model. Experimental results indicate that the proposed algorithm saves on average 75.28% of coding time with a 0.13 dB PSNR loss compared with a cascaded pixel-domain transcoder. Video transcoding Bayesian theorem SVC Markov model H.264/AVC
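The selection rule described above, picking the SVC mode with the highest conditional probability given the H.264/AVC mode, can be sketched with a frequency table. This is only an illustration of that one step under assumed inputs: the mode labels, the training pairs, and the fallback mode are hypothetical, and the Bayesian error detection and Markov refinement stages are not shown.

```python
from collections import Counter, defaultdict


def fit_mode_table(pairs):
    """Count how often each SVC mode co-occurs with each H.264/AVC mode."""
    counts = defaultdict(Counter)
    for avc_mode, svc_mode in pairs:
        counts[avc_mode][svc_mode] += 1
    return counts


def predict_svc_mode(counts, avc_mode, default="SKIP"):
    """Return (mode, probability) maximizing P(svc_mode | avc_mode).

    Falls back to `default` with probability 0.0 for unseen AVC modes.
    """
    if avc_mode not in counts:
        return default, 0.0
    svc_mode, hits = counts[avc_mode].most_common(1)[0]
    return svc_mode, hits / sum(counts[avc_mode].values())
```

When the returned probability is low, an encoder could fall back to the full mode search instead of trusting the prediction.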
5	Machine learning mode decision for complexity reduction and scaling in video applications Grellert, Mateus January 2018 Recent innovations in Machine Learning techniques have led to the wide use of intelligent models to solve complex problems that are especially hard to compute with traditional data structures and algorithms. In particular, current research on Image and Video Processing shows that it is possible to design Machine Learning models that perform object recognition and even action recognition with high confidence levels. In addition, the latest progress on training algorithms for Deep Learning Neural Networks was an important milestone in Machine Learning, leading to prominent discoveries in Computer Vision and other applications. Recent studies have also shown that it is possible to design intelligent models capable of drastically reducing the optimization space of mode decision in video encoders with minor losses in coding efficiency. All these facts indicate that Machine Learning for complexity reduction in video applications is a very promising field of study. The goal of this thesis is to investigate learning-based techniques to reduce the complexity of HEVC encoding decisions, focusing on fast video encoding and transcoding applications. A complexity profile of HEVC is first presented to identify the tasks that must be prioritized to accomplish this objective. Several variables and metrics are then extracted during the encoding and decoding processes to assess their correlation with the encoding decisions associated with these tasks. Next, Machine Learning techniques are employed to construct classifiers that use this information to accurately predict the outcome of these decisions, eliminating the time-consuming operations required to compute them. The fast encoding and transcoding solutions were developed separately, as the source of information is different in each case, but the same methodology was followed in both cases. In addition, mechanisms for complexity scalability were developed to provide the best rate-distortion performance for a given target complexity reduction. Experimental results demonstrated that the designed fast encoding solutions achieve time savings of 37% to 78% on average, with Bjontegaard Delta Bitrate (BD-BR) increments between 0.04% and 4.8%. In the transcoding results, a complexity reduction ranging from 43% to 67% was observed, with average BD-BR increments from 0.34% to 1.7%. Comparisons with the state of the art confirm the efficacy of the designed methods, as they outperform the results achieved by related solutions. Digital video Video coding Video transcoding Complexity reduction Complexity scaling Machine Learning HEVC
6	Live Video Streaming from Android-Enabled Devices to Web Browsers Bailey, Justin M. 01 January 2011 The widespread adoption of camera-equipped mobile devices, along with ubiquitous connectivity via WiFi or cellular networks, enables people to visually report live events. Current solutions limit the configurability of such services by allowing video streaming only to fixed servers. In addition, the business models of the companies that provide such (free) services insert visual ads into the streamed videos, leading to unnecessary resource consumption. This thesis proposes an architecture for a real-time video streaming service from an Android mobile device to a server of the user's choice. The real-time video can then be viewed in a web browser. The project builds on open-source code and open protocols to implement a set of software components that successfully stream live video. Experimental evaluations show practical resource consumption and good quality of the streamed video. Furthermore, the architecture is scalable and can support a large number of simultaneous streams given additional hardware resources. Android application h263 rtp rtsp video transcoding American Studies Arts and Humanities Computer Sciences
7	Cubic-Panorama Image Dataset Analysis for Storage and Transmission Salehi Doolabi, Saeed 23 April 2013 This thesis involves systems for virtual presence in remote locations, a field referred to as telepresence. Recent image-based representations such as Google Maps Street View provide a familiar example. Several areas of research are open: such image-based representations are huge in size, so efficient compression of the data for storage is a necessity. In addition, users are usually located in remote areas, and thus efficient transmission of the visual information is another issue of great importance. In this work, real-world images are used in preference to computer graphics representations, mainly because of the photorealism they provide and to avoid the high computational cost of simulating large-scale environments. The cubic format is selected for panoramas in this thesis. A major feature of the captured cubic-panoramic image datasets in this work is the assumption of static scenes, and the major issues for the system are compression efficiency and random access for storage, as well as computational complexity for transmission upon remote users' requests. First, in order to enable smooth navigation across different viewpoints, a method for aligning cubic-panorama image datasets by using the geometry of the scene is proposed and tested. Feature detection and camera calibration are incorporated and, unlike the existing method, which is limited to a pair of panoramas, our approach is applicable to datasets with a large number of panoramic images, with no need for extra numerical estimation. Second, the problem of cubic-panorama image dataset compression is addressed in a number of ways. Two state-of-the-art approaches, namely the standardized scheme of H.264 and a wavelet-based codec named Dirac, are used and compared for the application of virtual navigation in image-based representations of real-world environments. 
Different frame prediction structures and group-of-pictures lengths are investigated and compared for this new type of visual data. At this stage, based on the obtained results, an efficient prediction structure and bitstream syntax are proposed that use features of the data and satisfy the major requirements of the system. Third, we propose novel methods to address the important issue of disparity estimation. A client-server scheme is assumed, in which a remote user seeks information at each navigation step. For the compression stage, a fast method that uses our previous work on the geometry of the scene, the proposed prediction structure, and the cubic format of the panoramas is used to estimate disparity vectors efficiently. For the transmission stage, a new transcoding scheme is introduced and a number of different frame-format conversion scenarios are addressed towards the goal of free navigation. Different types of navigation scenarios, including forward or backward navigation as well as user pan, tilt, and zoom, are addressed. In all the aforementioned cases, results are compared both visually, through error images and videos, and using objective measures. Altogether, our work facilitates free navigation within the captured panoramic image datasets and can be incorporated into state-of-the-art and emerging cubic-panorama image dataset compression/transmission schemes. Telepresence Virtual Navigation Imaged Based Rendering Cubic Panorama Multiview Video Video Transcoding Disparity Estimation
10	Cubic-Panorama Image Dataset Analysis for Storage and Transmission Salehi Doolabi, Saeed January 2013 This thesis involves systems for virtual presence in remote locations, a field referred to as telepresence. Recent image-based representations such as Google Maps Street View provide a familiar example. Several areas of research are open: such image-based representations are huge in size, so efficient compression of the data for storage is a necessity. In addition, users are usually located in remote areas, and thus efficient transmission of the visual information is another issue of great importance. In this work, real-world images are used in preference to computer graphics representations, mainly because of the photorealism they provide and to avoid the high computational cost of simulating large-scale environments. The cubic format is selected for panoramas in this thesis. A major feature of the captured cubic-panoramic image datasets in this work is the assumption of static scenes, and the major issues for the system are compression efficiency and random access for storage, as well as computational complexity for transmission upon remote users' requests. First, in order to enable smooth navigation across different viewpoints, a method for aligning cubic-panorama image datasets by using the geometry of the scene is proposed and tested. Feature detection and camera calibration are incorporated and, unlike the existing method, which is limited to a pair of panoramas, our approach is applicable to datasets with a large number of panoramic images, with no need for extra numerical estimation. Second, the problem of cubic-panorama image dataset compression is addressed in a number of ways. Two state-of-the-art approaches, namely the standardized scheme of H.264 and a wavelet-based codec named Dirac, are used and compared for the application of virtual navigation in image-based representations of real-world environments. 
Different frame prediction structures and group-of-pictures lengths are investigated and compared for this new type of visual data. At this stage, based on the obtained results, an efficient prediction structure and bitstream syntax are proposed that use features of the data and satisfy the major requirements of the system. Third, we propose novel methods to address the important issue of disparity estimation. A client-server scheme is assumed, in which a remote user seeks information at each navigation step. For the compression stage, a fast method that uses our previous work on the geometry of the scene, the proposed prediction structure, and the cubic format of the panoramas is used to estimate disparity vectors efficiently. For the transmission stage, a new transcoding scheme is introduced and a number of different frame-format conversion scenarios are addressed towards the goal of free navigation. Different types of navigation scenarios, including forward or backward navigation as well as user pan, tilt, and zoom, are addressed. In all the aforementioned cases, results are compared both visually, through error images and videos, and using objective measures. Altogether, our work facilitates free navigation within the captured panoramic image datasets and can be incorporated into state-of-the-art and emerging cubic-panorama image dataset compression/transmission schemes. Telepresence Virtual Navigation Imaged Based Rendering Cubic Panorama Multiview Video Video Transcoding Disparity Estimation
