21 |
Gaining Depth: Time-of-Flight Sensor Fusion for Three-Dimensional Video Content Creation
Schwarz, Sebastian, January 2014
The successful revival of three-dimensional (3D) cinema has generated a great deal of interest in 3D video. However, contemporary eyewear-assisted display technologies are not well suited for the less restricted scenarios outside movie theaters. The next generation of 3D displays, autostereoscopic multiview displays, overcomes the restrictions of traditional stereoscopic 3D and can provide an important boost for 3D television (3DTV). Then again, such displays require scene depth information in order to reduce the amount of necessary input data. Acquiring this information is quite complex and challenging, thus restricting content creators and limiting the amount of available 3D video content. Nonetheless, without broad and innovative 3D television programs, even next-generation 3DTV will lack customer appeal. Therefore, simplified 3D video content generation is essential for the medium's success. This dissertation surveys the advantages and limitations of contemporary 3D video acquisition. Based on these findings, a combination of dedicated depth sensors, so-called Time-of-Flight (ToF) cameras, and video cameras is investigated with the aim of simplifying 3D video content generation. The concept of Time-of-Flight sensor fusion is analyzed in order to identify suitable courses of action for high-quality 3D video acquisition. In order to overcome the main drawbacks of current Time-of-Flight technology, namely high sensor noise and low spatial resolution, a weighted optimization approach for Time-of-Flight super-resolution is proposed. This approach incorporates video texture, measurement noise and temporal information for high-quality 3D video acquisition from a single video plus Time-of-Flight camera combination. Objective evaluations show benefits with respect to state-of-the-art depth upsampling solutions. Subjective visual quality assessment confirms the objective results, with a significant increase in viewer preference by a factor of four.
Furthermore, the presented super-resolution approach can be applied to other applications, such as depth video compression, providing bit rate savings of approximately 10 percent compared to competing depth upsampling solutions. The work presented in this dissertation has been published in two scientific journals and five peer-reviewed conference proceedings. In conclusion, Time-of-Flight sensor fusion can help to simplify 3D video content generation, consequently supporting a larger variety of available content. Thus, this dissertation provides important inputs towards broad and innovative 3D video content, hopefully contributing to the future success of next-generation 3DTV.
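To make the idea of texture-guided depth upsampling concrete, the sketch below shows joint bilateral upsampling on a single image row. This is a common baseline from the depth upsampling literature, not the weighted optimization approach proposed in the dissertation, and it ignores noise and temporal information. The function name, the parameters `sigma_s` and `sigma_r`, and the assumption that each low-resolution depth sample sits at high-resolution position `i * scale` are all illustrative.

```python
import math

def joint_bilateral_upsample_1d(low_depth, guide, scale, sigma_s=2.0, sigma_r=0.1):
    """Upsample a 1-D low-resolution depth row using a high-resolution guide row.

    Assumption: low_depth[i] was measured at high-resolution position i * scale.
    """
    out = []
    for p, g_p in enumerate(guide):
        num = 0.0
        den = 0.0
        for i, d in enumerate(low_depth):
            q = i * scale
            # Spatial weight: nearby depth samples count more.
            w_spatial = math.exp(-((p - q) ** 2) / (2 * sigma_s ** 2))
            # Range weight: samples whose texture matches pixel p count more.
            w_range = math.exp(-((g_p - guide[q]) ** 2) / (2 * sigma_r ** 2))
            w = w_spatial * w_range
            num += w * d
            den += w
        out.append(num / den if den > 0 else 0.0)
    return out

# Texture row with an edge between positions 4 and 5; depth sampled every 2 pixels.
guide = [0, 0, 0, 0, 0, 1, 1, 1]
low_depth = [1.0, 1.0, 1.0, 5.0]
high_depth = joint_bilateral_upsample_1d(low_depth, guide, scale=2)
```

Because the range weight suppresses depth samples taken on the other side of the texture edge, the upsampled depth edge follows the edge in the video texture instead of being blurred across it.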
|
22 |
Indoor Navigation For The Blind And Visually Impaired: Validation And Training Methodology Using Virtual Reality
Wang, Sili, 24 March 2017
In this thesis we propose a navigation instruction validation tool and a user training tool for the PERCEPT system.
The validation tool evaluates the navigation instructions using a virtual reality environment by ensuring that each path in the virtual environment can be traversed by following the navigation instructions. This validation tool will serve as a first automatic validation of navigation instructions prior to testing them with blind and visually impaired users.
The user training tool enables the blind user to explore and become familiar with the real environment by using the virtual environment generated in a Unity3d-based game. The user interacts with the game using the PERCEPT smartphone client, just as the user would interact in the real environment. Motion in the game is emulated using the keyboard. Motion directions follow the navigation instructions obtained through the smartphone. This user training tool will improve the user's experience in the real environment by enabling them to explore and learn the environment prior to their arrival in the physical space.
|
23 |
Adapting Single-View View Synthesis with Multiplane Images for 3D Video Chat
Uppuluri, Anurag Venkata, 01 December 2021
Activities like one-on-one video chatting and video conferencing with multiple participants are more prevalent than ever as we continue to tackle the pandemic. Bringing a 3D feel to video chat has long been a hot topic in the Vision and Graphics communities. In this thesis, we employ novel view synthesis in an attempt to turn one-on-one video chatting into 3D. We have tuned the learning pipeline of Tucker and Snavely's single-view view synthesis paper, by retraining it on the MannequinChallenge dataset, to better predict a layered representation of the scene viewed by either video chat participant at any given time. This intermediate representation of the local light field, called a Multiplane Image (MPI), may then be used to rerender the scene at an arbitrary viewpoint which, in our case, would match the head pose of the watcher in the opposite, concurrent video frame. We argue that our pipeline, when implemented in real time, would allow both video chat participants to reveal occluded scene content and "peer into" each other's dynamic video scenes to a certain extent. It would enable full parallax up to the baselines of small head rotations and/or translations. It would be similar to a VR headset's ability to determine the position and orientation of the wearer's head in 3D space and render any scene in alignment with this estimated head pose. We have attempted to improve the performance of the retrained model by extending MannequinChallenge with the much larger RealEstate10K dataset. We present a quantitative and qualitative comparison of the model variants and describe our impactful dataset curation process, among other aspects.
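As a rough illustration of how a layered MPI becomes an image, the sketch below blends RGBA planes from back to front with the standard "over" operator. It covers only the final compositing step: rendering a new viewpoint would first warp each plane according to the target head pose, which is omitted here. The function name and the nested-list image layout are assumptions made for this example, not the thesis code.

```python
def composite_mpi(layers):
    """Blend MPI layers ordered back to front with the 'over' operator.

    Each layer is (rgb, alpha): rgb is a list of rows of (r, g, b) tuples,
    alpha is a list of rows of floats in [0, 1].
    """
    height = len(layers[0][0])
    width = len(layers[0][0][0])
    out = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for rgb, alpha in layers:
        for y in range(height):
            for x in range(width):
                a = alpha[y][x]
                # New layer covers a fraction `a` of whatever lies behind it.
                out[y][x] = tuple(
                    a * c + (1.0 - a) * o for c, o in zip(rgb[y][x], out[y][x])
                )
    return out

# One-pixel example: opaque red plane behind a half-transparent blue plane.
back = ([[(1.0, 0.0, 0.0)]], [[1.0]])
front = ([[(0.0, 0.0, 1.0)]], [[0.5]])
image = composite_mpi([back, front])
```

In this toy case the front plane lets half of the red background through, so the pixel ends up as an even mix of red and blue.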
|
24 |
Facial-based Analysis Tools: Engagement Measurements and Forensics Applications
Bonomi, Mattia, 27 July 2020
Recent advances in technology have made it easy to acquire and spread multi-dimensional multimedia content, e.g. videos, which in many cases depict human faces. From such videos, valuable information describing the intrinsic characteristics of the recorded user can be retrieved: the features extracted from the facial patch are relevant descriptors that allow for the measurement of the subject's emotional status or the identification of synthetic characters.
One of the emerging challenges is the development of contactless approaches based on face analysis that aim to measure the emotional status of the subject without placing sensors that limit or bias the subject's experience. This is of even greater interest in the context of Quality of Experience (QoE) measurement, i.e. the measurement of the user's emotional status when exposed to multimedia content, since it allows for retrieving the overall acceptability of the content as perceived by the end user. Measuring the impact of given content on the user can have many implications from both the content producer and the end-user perspectives.
For this reason, we pursue the QoE assessment of a user watching multimedia stimuli, i.e. 3D movies, through the analysis of the user's facial features acquired by means of contactless approaches. More specifically, the user's Heart Rate (HR) was retrieved by using computer vision techniques applied to the facial recording of the subject and then analysed in order to compute the level of engagement. We show that the proposed framework is effective for long video sequences, being robust to facial movements and illumination changes. We validate it on a dataset of 64 sequences where users observe 3D movies selected to induce variations in users' emotional status.
On the one hand, understanding the interaction between the user's perception of the content and the user's cognitive-emotional aspects opens many opportunities for content producers, who may influence people's emotional status according to needs driven by political, social, or business interests. On the other hand, the end user must be aware of the authenticity of the content being watched: advances in computer rendering have allowed fake subjects to spread in videos.
Because of this, as a second challenge we target the identification of CG characters in videos by applying two different approaches. We first exploit the idea that fake characters do not present any pulse rate signal, while a human's pulse rate is expressed by a sinusoidal signal. The application of computer vision techniques to a facial video allows for the contactless estimation of the subject's HR, thus leading to the identification of signals that lack strong sinusoidality, which indicate virtual humans. The proposed pipeline allows for a fully automated discrimination, validated on a dataset consisting of 104 videos. Secondly, we make use of facial spatio-temporal texture dynamics that reveal the artefacts introduced by computer rendering techniques when creating a manipulation, e.g. face swapping, on videos depicting human faces. To do so, we consider multiple temporal video segments on which we estimate multi-dimensional (spatial and temporal) texture features. A binary decision on the joint analysis of such features is applied to strengthen the classification accuracy. This is achieved through the use of Local Derivative Patterns on Three Orthogonal Planes (LDP-TOP). Experimental analyses on state-of-the-art datasets of manipulated videos show the discriminative power of such descriptors in separating real and manipulated sequences and identifying the creation method used.
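The abstract does not say how sinusoidality is scored. One simple proxy, shown here purely as an assumed illustration, is the share of spectral power that falls in the single strongest frequency bin of the extracted signal: a clean periodic pulse concentrates its power in one bin, while a signal without a pulse spreads it out. The function name and the test signals are hypothetical.

```python
import cmath
import math

def peak_power_ratio(signal):
    """Fraction of non-DC spectral power held by the strongest frequency bin."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    powers = []
    # Plain DFT over the positive frequencies (bin 0, the mean, is skipped).
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(centered)
        )
        powers.append(abs(coeff) ** 2)
    total = sum(powers)
    return max(powers) / total if total > 0 else 0.0

n = 64
pulse_like = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]  # periodic
impulse = [1.0] + [0.0] * (n - 1)  # flat spectrum, no dominant frequency

score_pulse = peak_power_ratio(pulse_like)
score_impulse = peak_power_ratio(impulse)
```

A classifier built on such a score would still need a threshold chosen on labelled data, and a real pipeline would restrict the search to plausible heart-rate frequencies.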
The main finding of this thesis is the relevance of facial features in describing intrinsic characteristics of humans. These can be used to retrieve significant information such as the physiological response to multimedia stimuli or the authenticity of the human being itself. Applying the proposed approaches to benchmark datasets also returned good results, thus demonstrating real advancements in this research field. In addition, these methods can be extended to different practical applications, from autonomous driving safety checks to the identification of spoofing attacks, and from medical check-ups during sports to the measurement of users' engagement when watching advertising. Because of this, we encourage further investigations in this direction, in order to improve the robustness of the methods, thus allowing for their application to increasingly challenging scenarios.
|
25 |
Emergency visualized: exploring visual technology for paramedic-physician collaboration in emergency care
Maurin Söderholm, Hanna, January 2013
This thesis explores the potential of visual information and communication technologies (ICTs) for collaboration in emergency care. The thesis consists of four studies exploring a future technology, 3D telepresence technology for medical consultation (3DMC), from several different methodological and analytical perspectives. Together the studies provide a broad view of the potential benefits, risks and implications of using visual technologies for collaboration in emergency care. The results show that paramedic-physician collaboration via 3DMC might have some benefits for patient care, both in the immediate patient care situation and beyond, for example, when coordinating transport and resources; improving understanding between different actors; and in developing paramedic competence and confidence in their skills. However, collaboration is heavily impacted by physicians' and paramedics' respective work practices, which are situated in very different physical, professional and organizational contexts. Adding a visual dimension to this collaboration presents unique challenges for the overall design, development, implementation, and appropriation process. Thus, the thesis emphasizes the importance of understanding both the individual users and the complex overall picture which, although often neglected or ignored, is crucial to understand when developing and introducing new technology that is successful and justified in the overall context while also being useful and meaningful for the individual users.
Academic dissertation for the Degree of Doctor of Philosophy in Library and Information Science at the University of Gothenburg and the University of Borås, to be publicly defended on Thursday 19 September 2013 at 13:15 in the auditorium at Simonsland, University of Borås, Skaraborgsvägen 3, Borås.
|
26 |
Energy-efficient memory hierarchy for motion and disparity estimation in multiview video coding
Sampaio, Felipe Martin, January 2013
This master's thesis proposes a memory hierarchy for Motion and Disparity Estimation (ME/DE) centered on the encoding references, a strategy called Reference-Centered Data Reuse (RCDR), focusing on energy reduction in Multiview Video Coding (MVC) encoders. In MVC encoders, ME/DE represents more than 98% of the overall energy consumption. Moreover, up to 90% of the ME/DE energy is related to memory, and only 10% to actual computation. Two components account for the memory share: (1) off-chip memory communication to fetch the reference samples (45%) and (2) on-chip memory that stores the search window samples and sends them to the ME/DE processing core (45%). The main goal of this work is to jointly minimize the on-chip and off-chip energy consumption in order to reduce the overall energy related to ME/DE in MVC. The memory hierarchy is composed of an on-chip video memory (which stores the entire search window), a dynamic power-gating control for the on-chip memory, and a partial results compressor.
A search control unit is also proposed to exploit the search behavior and achieve further energy reduction. This work also adds a low-complexity reference frame compressor to the memory hierarchy. The experimental results show that the proposed system accomplishes the goal of jointly minimizing the on-chip and off-chip energies. RCDR provides off-chip energy savings of up to 68% compared to the state-of-the-art, traditional MB-centered approach. The partial results compressor reduces by 52% the off-chip memory communication needed to store these partial results (the RCDR penalty). Compared to techniques that do not access the entire search window, the proposed RCDR also achieves the best off-chip energy consumption (30% less), because its regular access pattern allows many DDR burst reads. In addition, the reference frame compressor improves the off-chip memory communication savings by 2.6x, with negligible losses in MVC encoding performance. The on-chip video memory required by RCDR is up to 74% smaller than that of the MB-centered Level C approaches. On top of that, the power-gating control saves 82% of leakage energy, the best result among the related works. The dynamic energy is handled by the candidate merging technique, with savings of more than 65%. Thanks to the joint off-chip communication and on-chip storage energy savings, the proposed memory hierarchy system is able to meet the MVC constraints for ME/DE processing.
|
27 |
Depth-based 3D videos: quality measurement and synthesized view enhancement
Solh, Mashhour M., 13 December 2011
Three-dimensional television (3DTV) is believed to be the future of television broadcasting that will replace current 2D HDTV technology. In the future, 3DTV will bring a more life-like and visually immersive home entertainment experience, in which users will have the freedom to navigate through the scene to choose a different viewpoint. A desired view can be synthesized at the receiver side using depth image-based rendering (DIBR). While this approach has many advantages, one of the key challenges in DIBR is generating high-quality synthesized views. This work presents novel methods to measure and enhance the quality of 3D videos generated through DIBR. For quality measurements we describe a novel method to characterize and measure distortions introduced by multiple cameras used to capture stereoscopic images. In addition, we present an objective quality measure for DIBR-based 3D videos by evaluating the elements of visual discomfort in stereoscopic 3D videos. We also introduce a new concept called the ideal depth estimate, and define the tools to estimate that depth. Full-reference and no-reference profiles for calculating the proposed measures are also presented. Moreover, we introduce two innovative approaches to improve the quality of the synthesized views generated by DIBR. The first approach is based on hierarchical blending of the background and foreground information around the disocclusion areas, which produces a natural-looking synthesized view with seamless hole-filling. This approach yields virtual images that are free of any geometric distortions, unlike other algorithms that preprocess the depth map. In contrast to other hole-filling approaches, our approach is not sensitive to depth maps with a high percentage of bad pixels from stereo matching.
The second approach further enhances the results through a depth-adaptive preprocessing of the colored images. Finally, we propose an enhancement to a depth estimation algorithm using monocular depth cues from luminance and chrominance. The estimated depth will be evaluated using our quality measure, and the hole-filling algorithm will be used to generate synthesized views. This application will demonstrate how our quality measures and enhancement algorithms could help in the development of high-quality stereoscopic depth-based synthesized videos.
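To show where the disocclusion holes in DIBR come from, here is a minimal sketch of forward warping a single scanline into a virtual view. It assumes a rectified, horizontally shifted camera, the usual pinhole relation disparity = focal length x baseline / depth, and a leftward shift for the virtual view; the function name and sign convention are choices made for this example, and none of the hole-filling methods from the thesis is implemented.

```python
def warp_scanline(colors, depths, focal_px, baseline):
    """Forward-warp one image row to a horizontally shifted virtual camera.

    Returns the warped row; positions that no source pixel maps to stay None.
    """
    width = len(colors)
    out = [None] * width
    out_depth = [float("inf")] * width
    for x, (color, z) in enumerate(zip(colors, depths)):
        disparity = focal_px * baseline / z  # nearer pixels shift further
        xt = x - int(round(disparity))
        # Z-buffer test: when two pixels land on one target, keep the nearer.
        if 0 <= xt < width and z < out_depth[xt]:
            out[xt] = color
            out_depth[xt] = z
    return out

# Background at depth 10 with a foreground object (depth 5) at positions 2-3.
colors = ["a", "b", "c", "d", "e", "f"]
depths = [10.0, 10.0, 5.0, 5.0, 10.0, 10.0]
virtual_row = warp_scanline(colors, depths, focal_px=10.0, baseline=1.0)
holes = [i for i, c in enumerate(virtual_row) if c is None]
```

The foreground object shifts further than the background, so it overwrites a background pixel on one side and uncovers a gap on the other. The gap at index 2 is a disocclusion of the kind hole-filling has to repair, while the one at index 5 is simply the image border.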
|
30 |
Evaluación de la QoE en un sistema de streaming adaptativo de vídeo 3D basado en DASH
Guzmán Castillo, Paola Fernanda, 06 September 2022
The distribution of multimedia content, and in particular video streaming, currently dominates global Internet traffic and will become even more important in the future. Thousands of titles are added monthly to major service providers such as Netflix, YouTube and Amazon. Alongside the consumption of high-definition content, which has become the main trend, a renewed increase in the consumption of 3D content can be observed. As a result, topics related to content production, encoding, transmission, Quality of Service (QoS) and Quality of Experience (QoE) as perceived by users of 3D video distribution systems have become a research area with numerous contributions in recent years.
This thesis addresses the problem of providing 3D video streaming services under variable-bandwidth network conditions. To that end, it presents the results of evaluating the QoE perceived by users of 3D video systems, focusing on the impact of the effects introduced in two elements of the 3D video processing chain: the encoding stage and the transmission process.
To analyze the effects of encoding on the quality of 3D video, the first stage covers the objective and subjective evaluation of video quality, comparing the performance of different encoding standards and methods in order to identify those that achieve the best trade-off between quality, bit rate and encoding time. In addition, in the context of transmission in a simulcast environment, the effectiveness of asymmetric coding for 3D video transmission is evaluated as an alternative for reducing bandwidth while maintaining overall quality.
Secondly, to study the impact and performance of the transmission process, the work builds on a Dynamic Adaptive Streaming over HTTP (DASH) system, in the context of both 2D and 3D video transmission, using different bandwidth variation scenarios. The aim has been to develop a framework for evaluating QoE in adaptive 3D video streaming scenarios, which makes it possible to analyze the impact of different bandwidth variation patterns on the user's QoE, as well as the performance of the adaptation algorithm under these scenarios. The work focuses on identifying the impact on the user's Quality of Experience of aspects such as the frequency, type, extent and temporal location of bandwidth variation events.
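As a rough illustration of how a bandwidth variation scenario can be replayed against an adaptation algorithm, the following Python sketch models the available bandwidth as a step function and applies a simple throughput-based rule per segment. Everything here is an assumption made for illustration: the bitrate ladder, the `BandwidthEvent` type, the 0.8 safety margin and the 2-second segment length are hypothetical, and the rule is a generic rate-based heuristic, not the adaptation algorithm evaluated in the thesis.

```python
from dataclasses import dataclass

# Hypothetical bitrate ladder in kbit/s (not taken from the thesis).
LADDER_KBPS = [500, 1500, 3000, 6000]


@dataclass
class BandwidthEvent:
    time_s: float  # temporal location of the bandwidth change
    kbps: float    # bandwidth available from that moment on


def bandwidth_at(events, t):
    """Return the bandwidth in effect at time t (events sorted by time)."""
    current = events[0].kbps
    for ev in events:
        if ev.time_s <= t:
            current = ev.kbps
        else:
            break
    return current


def select_bitrate(available_kbps, safety=0.8):
    """Pick the highest ladder rung that fits within a safety margin."""
    budget = available_kbps * safety
    fitting = [b for b in LADDER_KBPS if b <= budget]
    return fitting[-1] if fitting else LADDER_KBPS[0]


def simulate(events, duration_s, segment_s=2.0):
    """Choose one bitrate per segment and count the quality switches."""
    choices = []
    t = 0.0
    while t < duration_s:
        choices.append(select_bitrate(bandwidth_at(events, t)))
        t += segment_s
    switches = sum(1 for a, b in zip(choices, choices[1:]) if a != b)
    return choices, switches


# One abrupt drop at t = 4 s followed by a recovery at t = 8 s.
scenario = [
    BandwidthEvent(0.0, 8000),
    BandwidthEvent(4.0, 2000),
    BandwidthEvent(8.0, 8000),
]
choices, switches = simulate(scenario, duration_s=12.0)
print(choices, switches)
```

Changing the number of events, their size or their position in `scenario` corresponds to varying the frequency, extent and temporal location of the bandwidth changes; a real player would also account for buffer level, which this sketch ignores.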
The proposed system enables automated, systematic performance measurements for the evaluation of DASH systems in 2D and 3D video distribution services. Puppeteer, the Node.js library developed by Google, which provides a high-level API over the Chrome DevTools Protocol, was used to automate actions such as starting playback, triggering bandwidth changes and saving the results of the quality-change processes, timestamps, stalls, etc. These data are then processed to reconstruct the video as displayed, extract quality metrics and evaluate the users' QoE using the ITU-T P.1203 recommendation. / Guzmán Castillo, PF. (2022). Evaluación de la QoE en un sistema de streaming adaptativo de vídeo 3D basado en DASH [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/186354
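The post-processing step mentioned in the abstract above, where logged playback events are turned into quality metrics, might look roughly like the Python sketch below. The log format (tuples of timestamp, event type and value) and the `summarize` function are hypothetical and invented for this example. The sketch computes only basic session statistics; it does not implement the ITU-T P.1203 model.

```python
def summarize(log, session_end_s):
    """Derive simple playback statistics from a time-ordered event log.

    Each entry is (timestamp_s, kind, value), where kind is one of
    "quality" (value = bitrate in kbit/s now being played),
    "stall_start" or "stall_end" (value unused). Hypothetical format.
    """
    stalls = 0
    stall_time_s = 0.0
    switches = 0
    bitrate = None      # bitrate currently being played, if any
    stalled = False
    last_t = 0.0
    played_s = 0.0
    kbit_played = 0.0

    # Append a sentinel so the final interval is accounted for.
    for t, kind, value in log + [(session_end_s, "end", None)]:
        elapsed = t - last_t
        if stalled:
            stall_time_s += elapsed
        elif bitrate is not None:
            played_s += elapsed
            kbit_played += elapsed * bitrate
        last_t = t

        if kind == "quality":
            if bitrate is not None and value != bitrate:
                switches += 1
            bitrate = value
        elif kind == "stall_start":
            stalled = True
            stalls += 1
        elif kind == "stall_end":
            stalled = False

    mean_kbps = kbit_played / played_s if played_s else 0.0
    return {
        "stalls": stalls,
        "stall_time_s": stall_time_s,
        "switches": switches,
        "played_s": played_s,
        "mean_kbps": mean_kbps,
    }


# Illustrative session: one downswitch, one 2-second stall, one upswitch.
log = [
    (0.0, "quality", 3000),
    (10.0, "quality", 1500),
    (14.0, "stall_start", None),
    (16.0, "stall_end", None),
    (20.0, "quality", 3000),
]
stats = summarize(log, session_end_s=30.0)
print(stats)
```

Statistics of this kind (stall count and duration, number of switches, time-weighted bitrate) are typical inputs for a QoE model, which is why they are extracted before any scoring step.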
|