
The effects of elevation and rotation on descriptors and similarity measures for a single class of image objects

06 June 2008 (has links)
“A picture is worth a thousand words.” Every person interprets the content of an image or photo differently, owing to the semantics these images contain. Content-based image retrieval has become a vast area of research aimed at successfully describing and retrieving images according to their content. In military applications, intelligence images such as those obtained by the defence intelligence group are taken (mostly on film), developed and then manually annotated. These photos are then stored in a filing system according to attributes such as location and content. Retrieving these images at a later stage might take days or even weeks. Thus, the need for a digital annotation system has arisen. The military images contain various vehicles and buildings that need to be detected, described and stored in a database. For our research we look at the effects that the rotation and elevation angle of an object in an image have on retrieval performance. We chose model cars in order to control the environment the photos were taken in, such as the background, lighting, and the distance between the objects and the camera. Such models are also available in a wide variety of shapes and colours. We examine the MPEG-7 descriptor schemes recommended by the MPEG group for video and image retrieval and implement three of them. The military may also require that images taken by the defence intelligence group in the field be transmitted directly to headquarters via satellite. We have therefore included the JPEG2000 standard, which offers roughly a 20% compression gain over the original JPEG standard and supports wireless and secure transmission of images.
In addition to the MPEG-7 descriptors, we have also implemented the fuzzy histogram and colour correlogram descriptors. We conducted a series of experiments to determine the effects that rotation and elevation have on the retrieval of our model vehicle images. Observations are made both when each vehicle is considered separately and when all vehicles are described and combined into a single database. Finally, we examine the descriptors and determine which adjustments could improve their retrieval performance. / Dr. W.A. Clarke
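To make the descriptor-and-similarity pipeline concrete, here is a minimal sketch in the spirit of the colour descriptors discussed; the quantisation, bin count, and the histogram-intersection measure are illustrative choices, not the thesis implementation:

```python
# Illustrative sketch (not the thesis code): a coarse joint colour-histogram
# descriptor and a histogram-intersection similarity measure, the kind of
# building blocks a retrieval experiment compares.

def colour_histogram(pixels, bins_per_channel=4):
    """Quantise (r, g, b) pixels into a normalised joint histogram."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 means identical histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two identical 2-pixel "images" yield maximal similarity.
img = [(255, 0, 0), (0, 128, 0)]
print(histogram_intersection(colour_histogram(img), colour_histogram(img)))  # → 1.0
```

A query image would be described once and compared against every database entry with the similarity measure, ranking results by score.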

Energy Efficient and Programmable Architecture for Wireless Vision Sensor Node

Imran, Muhammad January 2013 (has links)
Wireless Vision Sensor Networks (WVSNs) are an emerging field that has attracted a number of potential applications because of small per-node cost, ease of deployment, scalability and low-power stand-alone solutions. WVSNs consist of a number of wireless Vision Sensor Nodes (VSNs). A VSN has limited resources, such as its embedded processing platform, power supply, wireless radio and memory. Within these limits, a VSN is expected to perform complex vision tasks for a long duration without battery replacement or recharging. Currently, reducing processing and communication energy consumption is a major challenge for battery-operated VSNs. Another challenge is to propose generic solutions for a VSN so as to make them suitable for a number of applications. To meet these challenges, this thesis focuses on an energy-efficient and programmable VSN architecture for machine vision systems that classify objects based on binary data. To facilitate generic solutions, a taxonomy has been developed together with a complexity model which can be used for system classification and comparison without the need for actual implementation. The proposed VSN architecture is based on task partitioning between a VSN and a server, as well as task partitioning locally on the node between software and hardware platforms. The effects of this partitioning on processing and communication energy consumption, design complexity and lifetime have been investigated. The investigation shows that the strategy in which front-end tasks up to segmentation, followed by bi-level coding, are implemented on a Field Programmable Gate Array (FPGA) with small sleep power offers a generalized, low-complexity and energy-efficient VSN architecture. Implementing the data-intensive front-end tasks on a hardware-reconfigurable platform reduces processing energy; however, there is still scope for reducing the communication energy associated with output data.
This thesis also explores data reduction techniques, including image coding, region-of-interest coding and change coding, which reduce output data significantly. As proof of concept, the VSN architecture, together with task partitioning, bi-level video coding, duty cycling and a low-complexity background subtraction technique, has been implemented on real hardware, and its functionality has been verified for four applications: particle detection, remote meter reading, bird detection and people counting. Results based on measured energy values show that, depending on the application, energy consumption can be reduced by a factor of approximately 1.5 up to 376 compared to currently published VSNs. Measured energy values also show that, for a sample period of 5 minutes, the VSN can achieve a lifetime of 3.2 years with a battery of 37.44 kJ. In addition, the proposed VSN offers a generic architecture with smaller design complexity on a hardware-reconfigurable platform and is easily adapted to a number of applications compared to published systems.
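As a rough sanity check on lifetime figures like these, a duty-cycled node's lifetime follows from battery energy divided by average power. A minimal sketch, where the per-cycle active energy and sleep power are hypothetical values chosen only to land near the reported numbers:

```python
# Back-of-the-envelope model (assumed form, not the thesis model): the node
# wakes once per sample period, spends `active_j` joules, and otherwise
# draws only sleep power.

def vsn_lifetime_years(battery_j, active_j, sleep_w, period_s):
    """Lifetime in years for one active burst per period plus sleep power."""
    avg_power_w = active_j / period_s + sleep_w
    return battery_j / avg_power_w / (365 * 24 * 3600)

# Hypothetical: a 37.44 kJ battery, ~0.11 J of active energy every 5 minutes,
# and ~5 µW sleep power reproduce the vicinity of the reported 3.2 years.
print(round(vsn_lifetime_years(37_440, 0.11, 5e-6, 300), 1))  # → 3.2
```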

RevGlyph - stereoscopic coding and reversing of anaglyphs

Zingarelli, Matheus Ricardo Uihara 27 September 2013 (has links)
Attention towards 3D content production is currently high, mostly because of public acceptance of and interest in this kind of technology. That is reflected in greater investment from the film, television and gaming industries, which aim to bring 3D to their content and devices and to offer different modes of user interaction. Accordingly, new capture techniques, coding schemes and playback modes for 3D video, particularly stereoscopic video, have been emerging or being enhanced, with a focus on improving this new technology and integrating it with the available infrastructure. However, regarding advances in coding, each stereoscopic visualization method uses a different coding technique, which leads to incompatibility between those methods. One proposal is to develop a generic technique, that is, one that is appropriate regardless of the visualization method. Such a technique, given suitable parameters, encodes the stereo video with no significant loss of quality or depth perception, the distinguishing feature of this kind of content. The proposed technique, named RevGlyph, transforms a stereo pair of videos into a single, specially coded anaglyph stream. This stream is not only compatible with the anaglyph visualization method but also reversible to an approximation of the original stereo pair, ensuring independence from the visualization method.
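The basic (non-reversible) anaglyph construction that RevGlyph builds on can be sketched as follows; the channel assignment shown (red from the left view, green and blue from the right) is the common red-cyan convention, and the side information RevGlyph adds to make the stream reversible is omitted here:

```python
# Illustrative sketch, not the RevGlyph codec: merge a stereo pair into one
# anaglyph frame by taking the red channel from the left view and the
# green/blue channels from the right view. Frames are nested lists of
# (r, g, b) tuples to keep the example dependency-free.

def make_anaglyph(left_rgb, right_rgb):
    """Per pixel: red from the left view, green and blue from the right."""
    return [
        [(lp[0], rp[1], rp[2]) for lp, rp in zip(lrow, rrow)]
        for lrow, rrow in zip(left_rgb, right_rgb)
    ]

left = [[(200, 0, 0), (200, 0, 0)]]    # 1x2 red-ish left view
right = [[(0, 0, 100), (0, 0, 100)]]   # 1x2 blue-ish right view
ana = make_anaglyph(left, right)
print(ana[0][0])  # → (200, 0, 100)
```

The discarded channels (green/blue of the left view, red of the right) are exactly what a reversible scheme must carry as extra information to approximate the original pair.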

Anaglyphic reversion in stereoscopic videos

Rodrigues, Felipe Maciel 24 May 2016 (has links)
Attention towards 3D content production is currently high, mostly because of public acceptance of and interest in this kind of technology. New capture techniques, coding schemes and playback modes for 3D video, particularly stereoscopic video, have been emerging or being enhanced, with a focus on improving this new technology and integrating it with the available infrastructure. However, regarding advances in coding, no single technique is compatible with more than one stereoscopic visualization method: each method uses its own coding technique, which prevents users from choosing how they wish to view the content. An approach to tackle this problem is to develop a generic technique, that is, one that is appropriate regardless of the visualization method and that, given suitable parameters, produces a stereoscopic video with no significant loss of quality or depth perception, the distinguishing feature of this kind of content. The method proposed in this work, named HaaRGlyph, transforms a stereoscopic video into a single, specially coded anaglyph stream. This stream is not only compatible with the anaglyph visualization method but also reversible to an approximation of the original stereo pair, enabling visualization independence. Moreover, HaaRGlyph achieves higher compression rates than the related work.
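The name HaaRGlyph suggests a Haar transform is involved; purely as an illustration of the lossless reversibility such a scheme relies on, and not as the thesis algorithm, here is a one-level 1-D Haar analysis and synthesis:

```python
# Illustrative only (assumption: the "Haa" in HaaRGlyph refers to the Haar
# transform). A one-level Haar split into averages and differences, and its
# exact inverse -- the reversibility property a reversible anaglyph needs.

def haar_forward(x):
    """Split an even-length signal into (averages, differences)."""
    avg = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    diff = [(a - b) / 2 for a, b in zip(x[0::2], x[1::2])]
    return avg, diff

def haar_inverse(avg, diff):
    """Rebuild the signal exactly from averages and differences."""
    out = []
    for s, d in zip(avg, diff):
        out += [s + d, s - d]
    return out

signal = [10, 12, 8, 4]
a, d = haar_forward(signal)
assert haar_inverse(a, d) == signal  # perfect reconstruction
```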

Scalable video coding for fixed reception in the Brazilian digital TV system.

Nunes, Rogério Pernas 29 May 2009 (has links)
In December 2007, starting from the city of São Paulo, free-to-air digital terrestrial television transmissions were launched in Brazil. A significant advance of the Brazilian Digital TV System (SBTVD) was the adoption of the H.264/AVC standard and the 1080i video format for high-definition video coding. The large-scale adoption of high-definition technology has been observed in many markets around the world, and new formats beyond 1080i are already being discussed and proposed. With an eye on the next generation of television, research centres such as that of the Japanese broadcaster NHK are investigating the human factors that should drive the specification of what may be the last step in 2D television technology. Named UHDTV, this system is expected to support a resolution of 7680 horizontal by 4320 vertical pixels, among other features still under study. The work presented here discusses the scalability tools of multimedia coding as a way of gradually evolving broadcast video formats. Specifically, it systematizes the scalability tools of the H.264/AVC standard with a view to their application in the SBTVD. The possibilities for evolving the system through scalability are discussed, and experimental surveys of the current spectrum occupation in the city of São Paulo are presented, showing that there is enough spare bit rate available for future expansion. Initial results on SVC coding are also presented, objectively showing the advantages of scalability over simulcast and indicating that this technique can be used in the SBTVD to provide new video formats while remaining compatible with current receivers that support the 1080i format.

The work presents theoretical and experimental contributions towards the adoption of scalability in the SBTVD, and points out possible future work that, if carried out, could confirm the transmission of higher video formats in the SBTVD in the coming years.
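The advantage of scalability over simulcast reduces to simple rate arithmetic: simulcast transmits every format in full, while SVC transmits a base layer plus smaller enhancement layers at some layering overhead. A sketch with hypothetical bit rates, not the thesis measurements:

```python
# Illustrative arithmetic (assumed rates and overhead, not measured values):
# compare the channel rate needed to serve 1080i receivers and a higher
# format via simulcast versus via a scalable (layered) stream.

def simulcast_rate(rates_mbps):
    """Simulcast sends each format as an independent full stream."""
    return sum(rates_mbps)

def svc_rate(base_mbps, enhancement_mbps, overhead=0.10):
    """SVC sends base + enhancement layers, with some layering overhead."""
    return (base_mbps + sum(enhancement_mbps)) * (1 + overhead)

base_1080i = 12.0      # Mbit/s, hypothetical 1080i stream
full_uhd = 30.0        # Mbit/s, hypothetical stand-alone higher-format stream
extra_layer = 18.0     # Mbit/s, hypothetical enhancement layer on top of 1080i
print(simulcast_rate([base_1080i, full_uhd]), round(svc_rate(base_1080i, [extra_layer]), 1))  # → 42.0 33.0
```

Legacy receivers decode only the base layer, which is what preserves compatibility with 1080i sets.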

Non-expansive symmetrically extended wavelet transform for arbitrarily shaped video object plane.

January 1998 (has links)
by Lai Chun Kit. Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. Includes bibliographical references (leaves 68-70). Abstract also in Chinese.

Contents:
Chapter 1: Traditional Image and Video Coding
Chapter 2: Discrete Wavelet Transform (DWT) and Subband Coding
Chapter 3: Non-expansive Symmetric Extension
Chapter 4: Content-based Video Coding in the Proposed MPEG-4 Standard
Chapter 5: Shape Adaptive Wavelet Transformation Coding Scheme (SAWT)
Chapter 6: Simulation
Chapter 7: Conclusion
Appendix A: Image Segmentation

Media Scaling for Power Optimization on Wireless Video Sensors

Lu, Rui 23 August 2007 (has links)
"Video-based sensor networks can be used to improve environment surveillance, health care and emergency response. Many sensor network scenarios require multiple high quality video streams that share limited wireless bandwidth. At the same time, the lifetime of wireless video sensors are constrained by the capacity of their batteries. Media scaling may extend battery life by reducing the video data rate while still maintaining visual quality, but comes at the expense of additional compression time. This thesis studies the effects of media scaling on video sensor energy consumption by: measuring the energy consumption on the different components of the video sensor; building a energy consumption model with several adjustable parameters to analyze the performance of a video sensor; exploring the trade-offs between the video quality and the energy consumption for a video sensor; and, finally, building a working video sensor to validate the accuracy of the model. The results show that the model is an accurate representation of the power usage of an actual video sensor. In addition, media scaling is often an effective way to reduce energy consumption in a video sensor."

Machine learning mode decision for complexity reduction and scaling in video applications

Grellert, Mateus January 2018 (has links)
The recent innovations in Machine Learning techniques have led to wide use of intelligent models to solve complex problems that are especially hard to handle with traditional data structures and algorithms. In particular, current research on Image and Video Processing shows that it is possible to design Machine Learning models that perform object recognition, and even action recognition, with high confidence. In addition, the latest progress in training algorithms for Deep Neural Networks has been an important milestone in Machine Learning, leading to prominent discoveries in Computer Vision and other applications. Recent studies have also shown that it is possible to design intelligent models capable of drastically reducing the mode-decision optimization space of video encoders with minor losses in coding efficiency. All these facts indicate that Machine Learning for complexity reduction in video applications is a very promising field of study.

The goal of this thesis is to investigate learning-based techniques to reduce the complexity of HEVC encoding decisions, focusing on fast video encoding and transcoding applications. A complexity profiling of HEVC is first presented to identify the tasks that must be prioritized to accomplish this objective. Several variables and metrics are then extracted during the encoding and decoding processes to assess their correlation with the encoding decisions associated with these tasks. Next, Machine Learning techniques are employed to construct classifiers that use this information to accurately predict the outcome of these decisions, eliminating the time-consuming operations required to compute them. The fast encoding and transcoding solutions were developed separately, as the source of information is different in each case, but the same methodology was followed in both. In addition, mechanisms for complexity scalability were developed to provide the best rate-distortion performance for a given target complexity reduction. Experimental results show that the fast encoding solutions achieve time savings of 37% up to 78% on average, with Bjontegaard Delta Bitrate (BD-BR) increments between 0.04% and 4.8%. For transcoding, a complexity reduction ranging from 43% to 67% was observed, with average BD-BR increments from 0.34% up to 1.7%. Comparisons with the state of the art confirm the efficacy of the designed methods, as they outperform the results achieved by related solutions.
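The core idea, predicting an encoder decision from cheap features instead of computing the full rate-distortion search, can be sketched with a single-threshold classifier; the feature (block variance), the training data, and the stump itself are illustrative stand-ins for the trained models such a thesis would build:

```python
# Illustrative stand-in for a learned mode-decision classifier: predict
# whether a coding unit should be split from a cheap feature (block variance)
# instead of running the exhaustive rate-distortion search.

def train_stump(samples):
    """samples: (variance, did_split) pairs. Choose the threshold that
    maximises training accuracy."""
    best_threshold, best_correct = 0.0, 0
    for t in sorted({v for v, _ in samples}):
        correct = sum((v >= t) == bool(y) for v, y in samples)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold

def predict_split(variance, threshold):
    return variance >= threshold

# Hypothetical training data: high-variance blocks tended to be split.
train = [(5.0, 0), (8.0, 0), (40.0, 1), (55.0, 1), (12.0, 0), (35.0, 1)]
threshold = train_stump(train)
print(predict_split(50.0, threshold), predict_split(6.0, threshold))  # → True False
```

In practice many features and deeper models are used, but the payoff is the same: a prediction costing a few comparisons replaces an expensive encoder search.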

Object-based scalable wavelet image and video coding

January 2008 (has links)
The objective of this thesis is to develop an object-based coding framework built upon a family of wavelet coding techniques for a variety of arbitrarily shaped visual object scalable coding applications. Two kinds of arbitrarily shaped visual object scalable coding techniques are investigated: object-based scalable wavelet still image coding and object-based scalable wavelet video coding.

The first part of this thesis studies advanced wavelet transform techniques for scalable still image object coding. In order to adapt to the content of a given signal and obtain a more flexible adaptive representation, two advanced wavelet transform techniques, the wavelet packet transform and the directional wavelet transform, are developed for object-based image coding. Extensive experiments demonstrate that the new wavelet image coding systems perform comparably to or better than the state of the art in image compression while possessing attractive features such as object-based coding functionality and high coding scalability.

The second part of this thesis investigates various components of object-based scalable wavelet video coding. A generalized 3-D object-based directional threading, which unifies the concepts of temporal motion threading and spatial directional threading, is seamlessly incorporated into a 3-D shape-adaptive directional wavelet transform to exploit the spatio-temporal correlation inside the 3-D video object. To improve the computational efficiency of multi-resolution motion estimation (MRME) in the shift-invariant wavelet domain, two fast MRME algorithms are proposed for wavelet-based scalable video coding. As demonstrated in the experiments, the proposed 3-D object-based wavelet video coding techniques consistently outperform MPEG-4 and other wavelet-based schemes for coding arbitrarily shaped video objects, while providing full spatio-temporal-quality scalability with non-redundant 3-D subband decomposition.

Liu, Yu. Adviser: King Ngi Ngan. Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. Includes bibliographical references (leaves 166-173). Abstracts in English and Chinese. Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: 3693.
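The shape-adaptive idea, transforming only the pixels inside an object's mask so that no coefficients are spent on the background, can be sketched as follows; this toy per-row Haar version is a simplification for illustration, not the shape-adaptive directional transform developed in the thesis:

```python
# Toy illustration of shape-adaptive transformation: per image row, gather the
# pixels inside the object mask and apply a one-level Haar split to just that
# segment. Background pixels never enter the transform.

def shape_adaptive_rows(image, mask):
    """image, mask: lists of rows (mask entries truthy inside the object).
    Returns per-row (averages, differences, leftover-sample) triples."""
    coeffs = []
    for row, mrow in zip(image, mask):
        seg = [p for p, m in zip(row, mrow) if m]
        if len(seg) % 2:                      # odd-length segment:
            seg, tail = seg[:-1], seg[-1:]    # keep the last sample untouched
        else:
            tail = []
        avg = [(a + b) / 2 for a, b in zip(seg[0::2], seg[1::2])]
        diff = [(a - b) / 2 for a, b in zip(seg[0::2], seg[1::2])]
        coeffs.append((avg, diff, tail))
    return coeffs

image = [[9, 7, 3, 1], [2, 4, 6, 8]]
mask = [[1, 1, 1, 1], [0, 1, 1, 0]]   # second row: object covers the middle only
print(shape_adaptive_rows(image, mask))
```

Real shape-adaptive schemes additionally use symmetric extension at the object boundary so segment lengths need not be even, but the coefficient count still equals the number of object pixels.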

End-to-end Multi-Objective Optimisation of H.264 and HEVC CODECs

Al Barwani, Maryam Mohsin Salim January 2018 (has links)
All multimedia devices now incorporate video CODECs that comply with international video coding standards such as H.264/MPEG4-AVC and the new High Efficiency Video Coding standard (HEVC), otherwise known as H.265. Although the standard CODECs have been designed to include algorithms with optimal efficiency, a large number of coding parameters can be used to fine-tune their operation within known constraints such as available computational power, bandwidth and consumer QoS requirements. With so many parameters involved, determining which of them play a significant role in providing optimal quality of service within given constraints is one challenge to be met; how to select the values of those significant parameters so that the CODEC performs optimally under the given constraints is another. This thesis proposes a framework that uses machine learning algorithms to model the performance of a video CODEC based on the significant coding parameters. Means of modelling both Encoder and Decoder performance are proposed. We define objective functions that model the performance-related properties of a CODEC, i.e., video quality, bit-rate and CPU time, and show that these objective functions can be practically utilised in video Encoder/Decoder designs, in particular in their performance optimisation within given operational and practical constraints. A Multi-objective Optimisation framework based on Genetic Algorithms is thus proposed to optimise the performance of a video CODEC. The framework is designed to jointly minimise the CPU time and bit-rate and to maximise the quality of the compressed video stream. The thesis presents the use of this framework in the performance modelling and multi-objective optimisation of the most widely used video coding standard in practice at present, H.264, and the latest video coding standard, H.265/HEVC.
When a communication network is used to transmit video, performance-related parameters of the communication channel impact the end-to-end performance of the video CODEC. Network delays and packet loss degrade the quality of the video received at the decoder, i.e., even if a video CODEC is optimally configured, network conditions can make the experience sub-optimal. Given the above, the thesis proposes the design, integration and testing of a novel approach to simulating a wired network using the UDP protocol for the transmission of video data. This network is subsequently used to simulate the impact of packet loss and network delays on optimally coded video, based on the framework previously proposed for the modelling and optimisation of video CODECs. The quality of received video under different levels of packet loss and network delay is simulated, and conclusions are drawn on the impact on transmitted video according to its content and features.
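The optimisation setup can be sketched as a search over coding parameters against a combined objective; the analytic stand-ins below for quality, bit-rate and CPU time are invented for illustration, and an exhaustive scan over a single parameter stands in for the genetic algorithm, which becomes necessary once many parameters interact:

```python
# Hedged sketch (assumed curves and weights, not the thesis framework):
# a weighted-sum fitness over quality (maximise), bit-rate and CPU time
# (both minimise), evaluated over a quantisation-parameter-like knob.

def fitness(qp, weights=(1.0, 0.5, 0.2)):
    quality = 100 - 1.5 * qp     # toy: quality falls as QP rises
    bitrate = 5000 / (qp + 1)    # toy: rate drops quickly with QP
    cpu = 200 - 2 * qp           # toy: higher QP means less encoder work
    w_q, w_r, w_c = weights
    return w_q * quality - w_r * bitrate - w_c * cpu

# With one parameter an exhaustive scan suffices; a GA takes over when the
# search space spans many interacting CODEC parameters.
best = max(range(0, 52), key=fitness)
print(best)  # the QP balancing the three objectives under these weights
```

A true multi-objective formulation would keep the objectives separate and return a Pareto front rather than collapsing them into one weighted score; the weighted sum is the simplest entry point.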
