Global ETD Search

1	An Architecture for 3D Multi-view video Transmission based on Dynamic Adaptive Streaming over HTTP (DASH) Su, Tianyu January 2015 Recent advances in cameras and image processing technology have generated a paradigm shift from traditional 2D and 3D video to Multi-view Video (MVV) technology, while at the same time improving video quality and compression through standards such as High Efficiency Video Coding (HEVC). In multi-view video, cameras are placed in predetermined positions to capture the scene from various views. Delivering such views with high quality over the Internet is challenging: MVV consists of multiple video sequences, each captured from a different angle, so its traffic is several times larger than that of traditional single-view video and requires correspondingly more bandwidth. The Internet is also prone to packet loss, delay, and bandwidth variation, which adversely affect MVV transmission. Another challenge is that end users' devices differ in computing power, display, and access link capacity, requiring MVV to be adapted to each user's context. In this paper, we propose an HEVC Multi-View system using Dynamic Adaptive Streaming over HTTP (DASH) to overcome the above-mentioned challenges. Our system uses an adaptive mechanism to adjust the video bitrate to bandwidth variations in best-effort networks. We also propose a novel way of making Multi-view Video and Depth (MVD) content for 3D video scalable in terms of the number of transmitted views. Our objective measurements show that our method of transmitting MVV content can maximize the perceptual quality of virtual views after rendering and hence increase the user's quality of experience. 3D video QoE DASH
2	End-to-end 3D video communication over heterogeneous networks Mohib, Hamdullah January 2014 Three-dimensional technology, more commonly referred to as 3D technology, has revolutionised many fields, including entertainment, medicine, and communications. In addition to 3D films, games, and sports channels, 3D perception has made tele-medicine a reality. Consumer electronics manufacturers predict that by 2015, 30% of all HD panels at home will be 3D enabled. Stereoscopic cameras, a mature technology compared to other 3D systems, are now being used by ordinary citizens to produce 3D content and share it at the click of a button, just as they do with 2D content via sites like YouTube. But technical challenges still exist, including with autostereoscopic multiview displays. Because of its increased amount of data, 3D content raises many complex considerations for transmission or storage, including how to represent it and which compression format is best. Any decision must be taken in the light of the available bandwidth or storage capacity, quality, and user expectations. Free viewpoint navigation also remains partly unsolved. The most pressing obstacle to widespread uptake of consumer 3D systems is delivering 3D content to heterogeneous consumer displays over heterogeneous networks. Optimising 3D video communication solutions requires considering the entire pipeline, from optimisation at the video source through transmission optimisation to the end display. Multi-view offers the most compelling solution for 3D video, providing motion parallax and freedom from wearing headgear for 3D perception. Optimising multi-view video for delivery and display could increase the demand for true 3D in the consumer market. 
This thesis focuses on end-to-end quality optimisation in 3D video communication and transmission, offering solutions for optimisation at the compression, transmission, and decoder levels. 006.6
3	Low complexity multiview video coding Khattak, Shadan January 2014 3D video is a technology that has received tremendous attention in recent years. Multiview Video Coding (MVC) is an extension of the popular H.264 video coding standard and is commonly used to compress 3D videos. It offers an improvement of 20% to 50% in compression efficiency over simulcast encoding of multiview videos using the conventional H.264 video coding standard. However, there are two important problems associated with it: (i) its superior compression performance comes at the cost of significantly higher computational complexity, which hampers real-world deployment of the MVC encoder in applications such as 3D live broadcasting and interactive Free Viewpoint Television (FTV), and (ii) compressed 3D videos can suffer from packet loss during transmission, which can degrade the viewing quality of the 3D video at the decoder. This thesis aims to solve these problems by presenting techniques to reduce the computational complexity of the MVC encoder and by proposing a consistent error concealment technique for frame losses in 3D video transmission. The thesis first analyses the complexity of the MVC encoder. It then proposes two novel techniques to reduce the complexity of motion and disparity estimation. The first method reduces the complexity of the disparity estimation process by exploiting the relationship between temporal levels, macroblock types and search ranges, while the second does so by exploiting the geometrical relationship between motion and disparity vectors in stereo frames. These two methods are then combined with other state-of-the-art methods in a single framework in which the gains add up. Experimental results show that the proposed low-complexity framework can reduce the encoding time of the standard MVC encoder by over 93% while maintaining similar compression efficiency. 
The addition of new View Synthesis Prediction (VSP) modes to the MVC encoding framework improves the compression efficiency of MVC. However, testing additional modes comes at the cost of increased encoding complexity. To reduce this complexity, the thesis next proposes a Bayesian early mode decision technique for a VSP-enhanced MVC coder. It exploits the statistical similarities between the RD costs of the VSP SKIP mode in neighbouring views to terminate the mode decision process early. Results indicate that the proposed technique can reduce the encoding time of the enhanced MVC coder by over 33% at similar compression efficiency levels. Finally, compressed 3D videos usually have to be broadcast to a large number of users, and transmission errors can lead to frame losses that degrade the video quality at the decoder. A simple reconstruction of the lost frames can lead to an inconsistent reconstruction of the 3D scene, which may negatively affect the user's viewing experience. To solve this problem, the thesis finally proposes a consistency model for recovering frames lost during transmission. The proposed consistency model is used to evaluate inter-view and temporal consistency while selecting candidate blocks for concealment. Experimental results show that the proposed technique is able to recover the lost frames with high consistency and better quality than two standard error concealment methods and a baseline technique based on the boundary matching algorithm. 600
4	Pokročilé metody postprodukce a distribuce videa s využitím IT / Advanced video post-production and distribution methods using IT Krist, Antonín January 2010 This thesis deals with advanced methods of digital video post-production and distribution using broadcasting technologies and the Internet Protocol. It describes and compares distribution methods that use information technology and discusses their current problems. It describes digitization methods and methods that can save bandwidth during distribution. It addresses the possible practical implementation of three-dimensional video distribution under upcoming standards and analyzes the possibilities for their future development. It also discusses the overall problems of transmission standardization and advanced video coding. In conclusion, based on a comparison of the methods and the author's practical experience, the thesis recommends certain procedures for inclusion in the standard and specifies the direction of the technological solutions.
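5	3D Video Capture of a Moving Object in a Wide Area Using Active Cameras / 能動カメラ群を用いた広域移動対象の3次元ビデオ撮影 Yamaguchi, Tatsuhisa 24 September 2013 Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Informatics / Kō No. 17919 / Jōhaku No. 501 / 新制||情||89 (University Library) / 30739 / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Prof. Takashi Matsuyama, Prof. Michihiko Minoh, Prof. Yuichi Nakamura / Pursuant to Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM 3D Video active camera multi-view video object tracking 007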
6	Automated Identification and Tracking of Motile Oligodendrocyte Precursor Cells (OPCs) from Time-lapse 3D Microscopic Imaging Data of Cell Clusters in vivo Wang, Yinxue 02 June 2021 Advances in time-lapse 3D in vivo fluorescence microscopic imaging techniques enable the observation and investigation of the migration of oligodendrocyte precursor cells (OPCs) and its role in the central nervous system. However, current practice in image-based OPC motility analysis relies heavily on manual labeling and tracking on 2D max projections of the 3D data, which suffers from heavy manual labor, subjective bias, weak reproducibility and, especially, information loss and distortion. In addition, due to the lack of an OPC-specific genetically encoded indicator, OPCs can only be distinguished from other oligodendrocyte lineage cells by their observed motion patterns. Automated analytical tools are needed for the identification and tracking of OPCs. In this dissertation, we proposed an analytical framework, MicTracker (Migrating Cell Tracker), for the integrated task of identifying, segmenting and tracking migrating cells (OPCs) in in vivo time-lapse fluorescence imaging data of high-density cell clusters composed of cells with different modes of motion. As a component of the framework, we presented a novel strategy for cell segmentation that enforces global temporal consistency, tackling the challenges caused by a highly clustered cell population and by boundaries between touching cells that are blurred inconsistently over time. We also designed a data association algorithm to address the violation of the usual assumption of small displacements. Recognizing that the violation arises in the mixed cell population, which is composed of two cell groups, while the assumption holds within each group, we proposed to solve this seemingly impossible task by de-mixing the two groups of cell motion modes without known labels. 
We demonstrated the effectiveness of MicTracker in solving our problem on real in vivo data. / Doctor of Philosophy / Oligodendrocyte precursor cells (OPCs) are a type of motile cell in the central nervous system (CNS). OPC migration plays a critical role in the repair and redistribution of myelin sheaths, structures that help to accelerate the transmission of electrical signals from neuron to neuron. But the mechanism behind the motility of OPCs is largely unclear. In recent years, advances in genetic fluorescence indicators and time-lapse optical microscopic imaging techniques, especially 3D in vivo imaging, have enabled neuroscientists to investigate this puzzle. However, current practice in OPC motility analysis relies heavily on compressing the 3D data into 2D and then manually tracking the OPCs, which suffers not only from heavy manual labor, subjective bias and weak reproducibility but especially from information loss and distortion. Automated analytical tools are needed. Due to the limitations of current techniques for fluorescent labeling of cells in live animals, OPCs cannot be distinctively labeled. Instead, the field of view also contains other, irrelevant cells that cannot migrate but vibrate locally. Therefore, the human analyst or the analytical software has to detect OPCs in a cluster of touching cells of multiple types by their motion patterns alone. In this dissertation, we presented a fully automatic machine-learning-based algorithm, MicTracker (Migrating Cell Tracker), to identify and track migrating OPCs. The task cannot be solved straightforwardly by existing general-purpose cell tracking tools because of several special challenges. To tackle these challenges, we also proposed novel methods for the two major modules of MicTracker, segmentation and linking. We demonstrated the effectiveness of MicTracker and its components on real data and compared it with related existing work. 
The results of experiments showed notable superiority of MicTracker in solving our problem, compared with existing methods. Motile cells 3D video analysis Cell tracking in vivo
7	Segmentation Based Depth Extraction for Stereo Image and Video Sequence Zhang, Yu 24 August 2012 3D representation has attracted more public attention than ever before. One of the most important techniques in this field is depth extraction. In this thesis, we first introduce a well-known stereo matching method using color segmentation and belief propagation, and implement this framework. Color-segmentation-based stereo matching performs well because it keeps object boundaries accurate, which is very important for the depth map. Building on the implemented segmentation-based stereo matching framework, we propose a color-segmentation-based 2D-to-3D video conversion method that uses high-quality motion information. In our proposed scheme, the original depth map is generated from motion parallax by optical flow calculation. We then employ color segmentation and plane estimation to optimize the original depth map, obtaining an improved depth map with sharp object boundaries. We also make some adjustments to the optical flow calculation to improve its efficiency and accuracy. When the motion vectors extracted from the compressed video are used as initial values for the optical flow calculation, the calculated motion vectors are more accurate and are obtained in a shorter time than with the same process without initial values. The experimental results show that our proposed method gives much more accurate depth maps with high-quality edge information. Optical flow with initial values provides a good original depth map, and color segmentation with plane estimation further improves the depth map by sharpening its boundaries. Depth extraction Stereo matching Color segmentation 2D-to-3D video conversion
8	RevGlyph - codificação e reversão esteroscópica anaglífica / RevGlyph - stereoscopic coding and reversing of anaglyphs Zingarelli, Matheus Ricardo Uihara 27 September 2013 Attention towards 3D content production is currently high, mostly because of public acceptance of and interest in this kind of technology. 
This is reflected in greater investment by the film, television and gaming industries, which aim to bring 3D to their content and devices and to offer different modes of user interaction. As a result, new capture techniques, coding and playback modes for 3D video, particularly stereoscopic video, have been emerging or improving, with a focus on integrating this new technology with the available infrastructure. However, advances in coding have diverged, because each stereoscopic visualization method uses a different coding technique. That leads to incompatibility between those methods. One proposal is to develop a generic technique, that is, one that is independent of the visualization method. Such a technique, with suitable parameters, encodes the stereo video with no significant loss of quality or depth perception, which is the distinguishing feature of this kind of content. The proposed technique, named RevGlyph, transforms a stereo pair of videos into a single, specially coded anaglyph stream. This stream is not only compatible with the anaglyph visualization method but also reversible to an approximation of the original stereo pair, ensuring independence from the visualization method. 3D video coding Anaglyph coding Information coding Compression Stereoscopy
9	A floating polygon soup representation for 3D video Colleu, Thomas 06 December 2010 This thesis presents a new representation, called a deformable polygon soup, for applications such as 3DTV and FTV (Free Viewpoint TV). The polygon soup addresses the problems of compactness, compression efficiency, and view synthesis. The polygons are defined in 2D with depth values at each corner. They are not necessarily connected to each other and can deform depending on the viewpoint and the time instant in the video sequence. Starting from multi-view plus depth (MVD) data, construction takes two steps: quadtree decomposition and reduction of inter-view redundancies. A compact set of polygons is obtained in place of the depth maps, while depth discontinuities and geometric details are preserved. Next, compression efficiency and view synthesis quality are evaluated. Classical methods such as inpainting and post-processing are implemented and adapted to the polygon soup. A new compression method is proposed that exploits the quadtree structure and spatial prediction. The results are compared with an MVD compression scheme using the MPEG H.264/MVC standard. Slightly higher PSNR values are obtained at medium and high bitrates, and ghosting artifacts are greatly reduced. Finally, the polygon soup is deformed according to the desired viewpoint. This view-dependent geometry is guided by motion estimation between the synthesized and original views. This reduces the remaining artifacts and improves image quality. 3D video data representation compression rendering
