Estimation of LRD present in H.264 video traces using wavelet analysis and proving the paramount of H.264 using OPF technique in wi-fi environment. Jayaseelan, John. January 2012.
While there has always been tremendous demand for streaming video over wireless networks, the nature of the application still presents challenging issues. Applications that transmit coded video sequences over best-effort networks such as the Internet must cope with changing network behaviour; in particular, the source encoder rate should be controlled based on feedback from a channel estimator that probes the network intermittently. The arrival of powerful video compression techniques such as H.264, together with advances in networking and telecommunications, opened up a whole new frontier for multimedia communications. The aim of this research is to transmit H.264-coded video frames over wireless networks with maximum reliability and efficiency. When H.264-encoded video sequences are transmitted through a wireless network, they face major difficulties in reaching the destination. The characteristics of H.264-coded sequences are studied in full, their suitability for transmission over wireless networks is examined, a new approach called Optimal Packet Fragmentation (OPF) is framed, and the H.264-coded sequences are tested in a simulated wireless environment. This research involves three major studies. The first part studies Long Range Dependence (LRD) and the ways in which self-similarity can be estimated. After comparing several estimators, the wavelet-based estimator was selected because wavelets capture both time and frequency features in the data and regularly provide a richer picture than classical Fourier analysis. The wavelet estimator measures self-similarity through the Hurst parameter, which indicates how the data will behave inside the network; this parameter must be calculated for reliable transmission in the wireless network.
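The thesis estimates the Hurst parameter with a wavelet-based method. As a rough, self-contained illustration of the idea, the simpler aggregated-variance estimator below recovers H from how the variance of block-averaged traffic scales with the block size; the function name, scales and test signal are our own choices, not taken from the thesis:

```python
import numpy as np

def hurst_aggregated_variance(x, scales=(1, 2, 4, 8, 16, 32)):
    # For a self-similar process, Var(X^(m)) ~ m^(2H - 2), where X^(m)
    # is the series averaged over non-overlapping blocks of size m.
    # Fitting log-variance against log-scale therefore yields H.
    log_m, log_v = [], []
    for m in scales:
        n = len(x) // m
        agg = x[:n * m].reshape(n, m).mean(axis=1)
        log_m.append(np.log(m))
        log_v.append(np.log(agg.var()))
    slope = np.polyfit(log_m, log_v, 1)[0]
    return 1.0 + slope / 2.0

# Uncorrelated traffic has no LRD, so the estimate should sit near H = 0.5;
# values approaching 1 would indicate strong long-range dependence.
rng = np.random.default_rng(0)
h = hurst_aggregated_variance(rng.normal(size=4096))
```

A wavelet estimator follows the same logic but fits the slope of wavelet-coefficient energy across scales, which is less biased by trends in the trace.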
The second part of the research compares the MPEG-4 and H.264 encoders to establish which is superior and which can provide better Quality of Service (QoS) and reliability. With the help of the Hurst parameter, this study shows that H.264 is superior to MPEG-4. The third part is the core of the research: the H.264 video frames are segmented into optimally sized packets in the MAC layer for efficient and more reliable transfer over the wireless network. Finally, the H.264-encoded video frames, combined with Optimal Packet Fragmentation, are tested in an NS-2 simulated wireless network. The research demonstrates the superiority of the H.264 video encoder and the effectiveness of OPF.
Chirurgie robotique : de l'apprentissage à l'application / Telesurgery: from training to implementation. Perez, Manuela. 14 September 2012.
The rapid expansion of robotic surgery raises the question of training. This new technology tends to supplant laparoscopy in complex procedures and requires adaptation from the surgeon, who must master both the telemanipulator and the surgical procedures, which are not simple transpositions of laparoscopic gestures. We first review the historical development of minimally invasive surgery, from laparoscopy to robotics, and the evolution of surgical training. We then turn to robotic training itself. Virtual simulators are widely used for laparoscopy training and have recently appeared on the market for robotics. We assessed the validity of the dV-Trainer simulator for robotic surgery training and demonstrated its usefulness for acquiring the basic gestures and automatisms specific to the robot. We also studied the impact of prior micro-surgical training on robotic performance, since a preliminary study had shown that micro-surgeons performed better on the robotic simulator than surgeons without micro-surgical experience. Finally, we examined long-distance telesurgery, which is constrained by transmission latency and by the volume of data to be transmitted. A first study measured the impact of transmission delay on surgeons' performance. A second study had surgeons subjectively evaluate the quality of compressed robotic-surgery videos in order to determine a maximum acceptable compression threshold, since reducing the data volume through lossy compression with the well-known MPEG-2 and H.264 standards helps decrease latency.
Video transcoding using machine learning. Unknown date.
The field of video transcoding has been evolving throughout the past ten years. The need for transcoding of video files has greatly increased because new standards are incompatible with old ones. This thesis takes the approach of using machine learning for video transcoding mode decisions and discusses ways to improve the process of generating the algorithm for implementation in different video transcoders. The transcoding methods used decrease the complexity of the mode decision inside the video encoder. Methods which automate and improve results are also discussed and implemented in two transcoders: H.263 to VP6, and MPEG-2 to H.264. Both transcoders show a complexity reduction of almost 50%. Video transcoding is important because the number of video standards keeps increasing while devices can usually decode only one specific codec. / by Christopher Holder. Thesis (M.S.C.S.), Florida Atlantic University, 2008.
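The thesis does not spell out its trained model here, but a machine-learned mode decision in a transcoder typically reduces to a small decision tree over cheap features already available from decoding the input stream. The sketch below is purely illustrative; the feature names and thresholds are assumptions, not the author's classifier:

```python
# Hypothetical macroblock features, as might be gathered while decoding
# the incoming stream: mean absolute residual (mar) and residual variance.
def mode_decision(mar, var, t_skip=2.0, t_intra=40.0):
    # A tree like this would normally be learned offline from encoder
    # traces; the thresholds here are invented for illustration.
    if mar < t_skip:
        return "SKIP"    # residual negligible: reuse incoming motion data
    if var > t_intra:
        return "INTRA"   # temporal prediction fails: code from scratch
    return "INTER"       # otherwise refine the incoming motion vectors

modes = [mode_decision(0.5, 10.0), mode_decision(5.0, 100.0),
         mode_decision(5.0, 10.0)]
```

Replacing the encoder's exhaustive mode search with a shortcut of this kind is the sort of change behind the complexity reduction reported above.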
Codeur vidéo scalable haute-fidélité SHVC modulable et parallèle / Modular and parallel scalable high-efficiency SHVC video encoder. Parois, Ronan. 27 February 2018.
After entering the digital era, video consumption evolved and defined new trends. Video content is now accessible on many platforms (television, computer, tablet, smartphone) and through many channels: mobile networks, satellite, terrestrial broadcast, the Internet, or local storage such as Blu-ray discs. In the meantime, the user experience has improved thanks to new formats such as Ultra High Definition (UHD), High Dynamic Range (HDR) and High Frame Rate (HFR), which respectively increase resolution, colour dynamic range and frame rate. These new consumption patterns and formats impose new constraints that current and future video encoders must meet. In this context, we propose a video coding solution that addresses multi-format and multi-destination coding while remaining fast and efficient in terms of compression. The solution relies on the scalable extension of the High Efficiency Video Coding (HEVC) standard, called SHVC and finalized at the end of 2014. SHVC produces a single bitstream from several layers built from the same video at different resolutions, frame rates, quality levels, bit depths or colour spaces, and improves on HEVC coding efficiency through inter-layer prediction, which reuses coding information from the lower layers. The proposed solution builds on a professional HEVC encoder developed by Ateme, whose pipelined architecture exploits several levels of parallelism (inter-frame, intra-frame, inter-block and inter-operation). Two parallel instances of this encoder are synchronized through an inter-pipeline offset to perform inter-layer prediction. Trade-offs between complexity and coding efficiency are made within this prediction, at the level of both picture types and prediction tools. In a broadcast configuration, for example, inter-layer texture prediction is applied to every other frame; at constant quality this saves 18.5% of the bitrate for a loss of only 2% in coding speed compared to an HEVC encoding. The architecture supports every type of scalability defined by the SHVC extension. In addition, for spatial scalability, we propose a downsampling filter, applied to the base layer, that optimizes the overall bitrate. We also propose quality modes combining multiple levels of parallelism with low-level optimizations that enable real-time encoding of UHD formats. The proposed solution was integrated into a real-time video broadcast chain and demonstrated at several trade shows, conferences and ATSC 3.0 meetings.
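As a toy illustration of why inter-layer texture prediction saves bitrate in spatial scalability, the sketch below compares the residual energy of an enhancement frame predicted from an upsampled base-layer reconstruction against a crude baseline without inter-layer help. The filters and signals are stand-ins, not Ateme's implementation or the SHVC resampling filters:

```python
import numpy as np

def downsample2x(img):
    # 2x2 block averaging stands in for the optimized downsampling
    # filter applied when building the base layer.
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(img):
    # Nearest-neighbour upsampling; SHVC specifies proper resampling
    # filters, omitted here for brevity.
    return img.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(2)
hd = rng.normal(loc=128.0, scale=10.0, size=(8, 8))  # toy high-res frame
base_rec = downsample2x(hd)            # stand-in for the decoded base layer

ilp_residual = hd - upsample2x(base_rec)  # what the enhancement layer codes
dc_residual = hd - hd.mean()              # crude baseline without ILP
# The ILP residual carries less energy, hence fewer bits after transform
# coding; that gap is the source of the bitrate saving described above.
```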
Multi-View Video Transmission over the Internet. Abdullah Jan, Mirza; Ahsan, Mahmododfateh. January 2010.
3D television using multiple-view rendering is receiving increasing interest. In this technology a number of video sequences are transmitted simultaneously to provide a larger view of the scene or a stereoscopic viewing experience. With two views, stereoscopic rendition is possible. 3D displays are now available that can show several views simultaneously, and the user sees different views by moving his head.

The thesis work aims at implementing a demonstration system with a number of simultaneous views. The system includes two cameras, computers at both the transmitting and receiving ends, and a multi-view display. Besides setting up the hardware, the main task is to implement software so that the transmission can be done over an IP network.

This thesis report includes an overview of similar published systems and experiences with them, the implementation of real-time video capture, its compression, encoding and transmission over the Internet with the help of socket programming, and finally the multi-view display in 3D format. The report also describes the design considerations regarding video coding and network protocols.
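As a minimal sketch of the socket-programming side, the fragment below packetizes one frame with a small (view, frame, sequence) header so the receiver can demultiplex simultaneous views and restore packet order, then round-trips it over a loopback UDP socket. The header layout and MTU value are our own assumptions, not the thesis design:

```python
import socket
import struct

MTU = 1400  # assumed payload bytes per datagram

def packetize(frame_bytes, view_id, frame_no):
    # Each chunk is prefixed with (view, frame, seq) so the receiver can
    # tell the simultaneous camera views apart and reorder datagrams.
    chunks = [frame_bytes[i:i + MTU] for i in range(0, len(frame_bytes), MTU)]
    return [struct.pack("!BHH", view_id, frame_no, seq) + c
            for seq, c in enumerate(chunks)]

def reassemble(packets):
    packets = sorted(packets, key=lambda p: struct.unpack("!BHH", p[:5])[2])
    return b"".join(p[5:] for p in packets)

# Loopback demo: send one dummy frame of view 0 and rebuild it.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
frame = bytes(range(256)) * 20                    # 5120 placeholder bytes
pkts = packetize(frame, view_id=0, frame_no=1)
for p in pkts:
    tx.sendto(p, rx.getsockname())
received = [rx.recv(2048) for _ in pkts]
tx.close()
rx.close()
```

A real deployment would add timestamps and loss handling (e.g. RTP over UDP) rather than this bare header.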
Very Low Bitrate Video Communication: A Principal Component Analysis Approach. Söderström, Ulrik. January 2008.
A large amount of the information in conversations comes from non-verbal cues such as facial expressions and body gestures. These cues are lost when we do not communicate face-to-face. But face-to-face communication does not have to happen in person: with video communication we can at least deliver information about facial mimic and some gestures. This thesis is about video communication over distance; communication that can be made available over networks with low capacity, since the bitrate needed is low. A visual image needs high quality and resolution to be semantically meaningful for communication, and delivering such video over networks requires compression. The standard way to compress video, used by H.264 and MPEG-4, is to divide each image into blocks and represent each block with mathematical waveforms, usually frequency features. These waveforms are quite good at representing any kind of video precisely because they do not resemble anything in particular. But since they are completely generic they cannot compress video enough for use over networks with very limited capacity, such as GSM and GPRS. Another issue is that such codecs have high complexity, because redundancy is removed through positional shifts of the blocks. High complexity and bitrate mean that a device consumes a large amount of energy for encoding, decoding and transmission, and energy is a critical factor for battery-driven devices. These drawbacks mean that video compressed with standard codecs cannot be delivered anywhere and anytime. To resolve these issues we have developed an entirely new type of video coding: instead of mathematical waveforms, we use faces to represent faces. This makes compression much more efficient, even though the representation is person-dependent.
By building a model of the changes in the face, the facial mimic, images can be encoded against this model. The model consists of representative facial images extracted with a powerful mathematical tool, principal component analysis (PCA). This coding has very low complexity, since encoding and decoding consist only of multiplication operations. Faces are treated as single encoding entities and all operations are performed on full images; no block processing is needed. As a result, PCA coding can deliver high-quality video at very low bitrates with low encoding and decoding complexity. With asymmetrical PCA (aPCA) it is possible to use only semantically important areas for encoding while decoding full frames or a different part of the frames. We show that a codec based on PCA can compress facial video to a bitrate below 5 kbps, which a GSM network can carry, and still provide high quality. We also show how PCA coding can be extended to high-definition video.
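A minimal sketch of the principle, with random vectors standing in for face images (the dimensions and model size are arbitrary choices, not the thesis settings): the mimic model is the set of top principal components of the training frames, and encoding a frame is a single matrix multiplication.

```python
import numpy as np

# Toy training set: 50 'facial' frames, each flattened to a 256-vector.
rng = np.random.default_rng(1)
frames = rng.normal(size=(50, 16 * 16))
mean = frames.mean(axis=0)

# The personal mimic model: top principal components of the training set.
_, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
basis = vt[:10]                      # keep 10 eigen-images

def encode(frame):
    # One projection: 10 coefficients instead of 256 pixel values.
    return basis @ (frame - mean)

def decode(coeffs):
    # One back-projection plus the mean face.
    return mean + basis.T @ coeffs

coeffs = encode(frames[0])
recon = decode(coeffs)
```

Only the handful of coefficients per frame need to be transmitted once the model is shared, which is how bitrates below 5 kbps become feasible; aPCA would build `basis` from a facial sub-region while still decoding full frames.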
Wireless video sensor network and its applications in digital zoo. Karlsson, Johannes. January 2010.
Most computing and communicating devices have been personal computers connected to the Internet through a fixed network connection. Future communication devices are not expected to be of this type; instead, intelligence and communication capability will move into the various objects that surround us. This is often referred to as the "Internet of Things" or the "Wireless Embedded Internet". This thesis deals with video processing and communication in such systems. One application scenario is real-time video transmission over wireless ad-hoc networks, where a set of devices automatically forms a network and starts to communicate without any pre-existing infrastructure. The devices act as both hosts and routers and can build large networks in which they forward information for each other. We have identified two major problems when sending real-time video over wireless ad-hoc networks. The first is the reactive design used by most ad-hoc routing protocols: when nodes move, links on the communication path between sender and receiver may disappear, and reactive protocols wait until a link on the path breaks before searching for a new path. This leads to long interruptions in packet delivery and does not work well for real-time video transmission. We instead propose a proactive approach that detects when a route is about to break and searches for new routes before this happens. The second problem is that video codecs are very sensitive to packet losses while the wireless ad-hoc network is very error-prone. The most common way to handle lost packets in video codecs is to periodically insert frames that are not predictively coded, which corrects errors at fixed intervals whether an error has occurred or not.
The method we propose instead inserts a non-predictively coded frame directly after a packet has been lost, and only if a packet has been lost. Another area dealt with in this thesis is video sensor networks: small devices with communication and computational capacity, equipped with an image sensor so that they can capture video. Since these devices generally have very limited energy, computation, communication and memory resources, they place heavy demands on the video compression algorithms used. In standard video compression the encoder has high complexity while the decoder has low complexity and is passively controlled by the encoder. We propose video compression algorithms for wireless video sensor networks in which encoder complexity is reduced by moving some of the image analysis to the decoder side, and we have implemented the approach on actual low-power sensor nodes to test the developed algorithms. Finally, we have built a "Digital Zoo", a complete system including a large-scale outdoor video sensor network. The goal is to use the collected data to create new experiences for physical visitors at the zoo, and for "cyber" visitors at home. Several topics related to practical deployment of sensor networks are addressed.
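The loss-triggered refresh proposed above can be sketched in a few lines. The feedback mechanism is abstracted here into a set of frame indices reported lost, which is a simplification of the real signalling:

```python
def choose_frame_types(n_frames, lost_frames):
    # Insert an intra (I) frame only directly after a reported loss,
    # instead of on a fixed period; every other frame stays predictive (P).
    types = []
    for i in range(n_frames):
        if i == 0 or (i - 1) in lost_frames:
            types.append("I")   # stop error propagation immediately
        else:
            types.append("P")
    return types

# Losses reported for frames 3 and 7 trigger intra frames at 4 and 8 only.
types = choose_frame_types(10, {3, 7})
# -> ['I', 'P', 'P', 'P', 'I', 'P', 'P', 'P', 'I', 'P']
```

When no losses occur, the stream stays fully predictive, which is exactly the bit saving over periodic refresh.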
SSIM-Inspired Quality Assessment, Compression, and Processing for Visual Communications. Rehman, Abdul. January 2013.
Objective Image and Video Quality Assessment (I/VQA) measures predict image/video quality as perceived by human beings, the ultimate consumers of visual data. Existing research in the area is mainly limited to benchmarking and monitoring of visual data. The use of I/VQA measures in the design and optimization of image/video processing algorithms and systems is more desirable, challenging and fruitful, but has not been well explored. Among the recently proposed objective I/VQA approaches, the structural similarity (SSIM) index and its variants have emerged as promising measures that show superior performance compared to the widely used mean squared error (MSE) while remaining computationally simple compared with other state-of-the-art perceptual quality measures. In addition, SSIM has a number of desirable mathematical properties for optimization tasks. The goal of this research is to break the tradition of using MSE as the optimization criterion for image and video processing algorithms. We tackle several important problems in visual communication applications by exploiting SSIM-inspired design and optimization to achieve significantly better performance.
Firstly, the original SSIM is a Full-Reference IQA (FR-IQA) measure that requires access to the original reference image, making it impractical in many visual communication applications. We propose a general purpose Reduced-Reference IQA (RR-IQA) method that can estimate SSIM with high accuracy with the help of a small number of RR features extracted from the original image. Furthermore, we introduce and demonstrate the novel idea of partially repairing an image using RR features. Secondly, image processing algorithms such as image de-noising and image super-resolution are required at various stages of visual communication systems, starting from image acquisition to image display at the receiver. We incorporate SSIM into the framework of sparse signal representation and non-local means methods and demonstrate improved performance in image de-noising and super-resolution. Thirdly, we incorporate SSIM into the framework of perceptual video compression. We propose an SSIM-based rate-distortion optimization scheme and an SSIM-inspired divisive optimization method that transforms the DCT domain frame residuals to a perceptually uniform space. Both approaches demonstrate the potential to largely improve the rate-distortion performance of state-of-the-art video codecs. Finally, in real-world visual communications, it is a common experience that end-users receive video with significantly time-varying quality due to the variations in video content/complexity, codec configuration, and network conditions. How human visual quality of experience (QoE) changes with such time-varying video quality is not yet well-understood. We propose a quality adaptation model that is asymmetrically tuned to increasing and decreasing quality. The model improves upon the direct SSIM approach in predicting subjective perceptual experience of time-varying video quality.
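For reference, the SSIM index combines luminance, contrast and structure comparisons. The single-window version below captures the formula; the standard index additionally averages it over local (typically Gaussian-weighted) windows, which is omitted here for brevity:

```python
import numpy as np

def ssim_global(x, y, L=255, k1=0.01, k2=0.03):
    # SSIM(x, y) = (2*mu_x*mu_y + c1)(2*sigma_xy + c2) /
    #              ((mu_x^2 + mu_y^2 + c1)(sigma_x^2 + sigma_y^2 + c2))
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

img = np.tile(np.arange(64, dtype=float), (64, 1))
s_same = ssim_global(img, img)          # identical images score 1
s_shift = ssim_global(img, img + 25.0)  # same structure, shifted luminance
```

Unlike MSE, the structure term rewards preserved local patterns, and the index is differentiable, which is what makes it usable inside rate-distortion optimization loops.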
Scalable video communications: bitstream extraction algorithms for streaming, conferencing and 3DTV. Palaniappan, Ramanathan. 19 August 2011.
This research investigates scalable video communications and its applications to video streaming, conferencing and 3DTV. Scalable video coding (SVC) is a layer-based encoding scheme that provides spatial, temporal and quality scalability. The heterogeneity of the Internet and of clients' operating environments necessitates adapting media content to ensure a satisfactory multimedia experience. SVC's layer structure allows the extraction of partial bitstreams at reduced spatial, quality and temporal resolutions, adjusting the media bitrate at fine granularity to changes in network state. The main focus of this research is developing such extraction algorithms for SVC. Based on a combination of metadata computations and prediction mechanisms, these algorithms evaluate the quality contribution of each layer in the SVC bitstream and make extraction decisions aimed at maximizing video quality while operating within the available bandwidth. These techniques are applied to two-way interaction and one-way streaming of 2D and 3D content, with rate-distortion optimized extraction algorithms proposed according to each application's delay tolerance. For conferencing, extraction decisions are made over single frames and frame pairs because of tight end-to-end delay constraints. The proposed extraction algorithms for 3D content streaming maximize the overall perceived 3D quality based on human stereoscopic perception. Compared to current extraction methods, the new algorithms offer better video quality at a given bitrate while performing fewer metadata computations in the post-encoding phase. The solutions proposed for each application achieve the recurring goal of maintaining the best possible end-user quality of multimedia experience in spite of network impairments.
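A bitstream extraction pass of this kind can be caricatured as a budgeted greedy selection over per-layer metadata. The layer sizes, quality gains and the assumed independence of the enhancement layers below are invented for illustration; real SVC layers carry decoding dependencies that constrain which layers may be dropped:

```python
def extract_layers(layers, budget_kbps):
    # Keep the mandatory base layer, then add enhancement layers in order
    # of quality gain per bit until the bandwidth budget is exhausted.
    base, *enh = layers
    kept, used = [base["id"]], base["kbps"]
    for layer in sorted(enh, key=lambda l: l["gain"] / l["kbps"], reverse=True):
        if used + layer["kbps"] <= budget_kbps:
            kept.append(layer["id"])
            used += layer["kbps"]
    return kept, used

layers = [
    {"id": "base",  "kbps": 200, "gain": 30.0},  # mandatory
    {"id": "temp1", "kbps": 100, "gain": 2.0},   # temporal enhancement
    {"id": "qual1", "kbps": 150, "gain": 4.0},   # quality enhancement
    {"id": "spat1", "kbps": 400, "gain": 6.0},   # spatial enhancement
]
kept, used = extract_layers(layers, budget_kbps=500)
```

The "gain" entries stand in for the quality-contribution metadata the thesis computes; a rate-distortion optimized extractor would also account for inter-layer dependencies and drift.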
Vector flow model in video estimation and effects of network congestion in low bit-rate compression standards / by Balaji Ramadoss. January 2003.
Thesis (M.S.E.E.), University of South Florida, 2003. Includes bibliographical references. The use of digitized information is rapidly gaining acceptance in bio-medical applications. Video compression plays an important role in the archiving and transmission of different digital diagnostic modalities. The present scheme of video compression for low bit-rate networks is not suitable for medical video sequences; the instability is the result of block artifacts caused by block-based DCT coefficient quantization. The possibility of applying deformable motion estimation techniques to make the H.263 video compression standard more adaptable for bio-medical applications was studied in detail, and a study of network characteristics and the behaviour of various congestion control mechanisms was used to analyze the complete characteristics of existing low bit-rate video compression algorithms. The study was conducted in three phases. The first phase involved the implementation and study of the present H.263 compression standard and its limitations. The second phase dealt with the analysis of an external force for active contours, used to obtain estimates for deformable objects. This external force, termed Gradient Vector Flow (GVF), was computed as a diffusion of the gradient vectors associated with a gray-level or binary edge map derived from the image. The mathematical aspects of a multi-scale framework, based on a medial representation for the segmentation and shape characterization of anatomical objects in medical imagery, were derived in detail. The medial representations were based on a hierarchy of linked figural models such as protrusions, indentations, neighboring figures and included figures, which represented solid regions and their boundaries.
The third phase dealt with the parameters vital for effective video streaming over the Internet, in particular the bottleneck bandwidth, which gives the upper limit on the speed of data delivery from one end point to the other in a network. If a codec attempts to send data beyond this limit, all packets above the limit will be lost; sending under the limit, on the other hand, clearly results in suboptimal video quality. During this phase the packet-drop-rate (PDR) performance of TCP(1/2) was investigated in conjunction with a few representative TCP-friendly congestion control protocols (CCPs): TCP(1/256), SQRT(1/256) and TFRC(256), with and without self-clocking. The CCPs were studied when subjected to an abrupt reduction in the available bandwidth. Additionally, the investigation studied the effect on the drop rates of TCP-compatible algorithms of changing the queuing scheme from Random Early Detection (RED) to DropTail.
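The TCP-friendly protocols studied here pace the sender to roughly the rate a conformant TCP flow would achieve. TFRC does this with the TCP response function of Padhye et al.; a direct transcription (using the common t_RTO = 4*RTT simplification) shows how sharply the permitted rate falls as the loss event rate grows:

```python
import math

def tcp_friendly_rate(s, rtt, p, t_rto=None, b=1):
    # X = s / (R*sqrt(2bp/3) + t_RTO * 3*sqrt(3bp/8) * p * (1 + 32p^2))
    # s: packet size (bytes), rtt: round-trip time (s), p: loss event rate,
    # b: packets acknowledged per ACK. Returns a rate in bytes/second.
    if t_rto is None:
        t_rto = 4 * rtt
    denom = (rtt * math.sqrt(2 * b * p / 3)
             + t_rto * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p * p))
    return s / denom

# 1000-byte packets at 100 ms RTT: a fivefold rise in the loss rate cuts
# the permitted sending rate by well over half.
r_low = tcp_friendly_rate(1000, 0.1, 0.01)
r_high = tcp_friendly_rate(1000, 0.1, 0.05)
```

A video codec driven by such a controller must therefore adapt its bitrate to the loss rate, which is exactly the interaction the abrupt-bandwidth-drop experiments above probe.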