About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Transform Coefficient Thresholding and Lagrangian Optimization for H.264 Video Coding / Transformkoefficient-tröskling och Lagrangeoptimering för H.264 Videokodning

Carlsson, Pontus January 2004 (has links)
H.264, also known as MPEG-4 Part 10: Advanced Video Coding, is the latest MPEG standard for video coding. It provides approximately 50% bit rate savings for equivalent perceptual quality compared to any previous standard. As in previous MPEG standards, only the bitstream syntax and the decoder are specified. Hence, coding performance is determined not only by the standard itself but also by the implementation of the encoder. In this report we propose two methods for improving the coding performance while remaining fully compliant with the standard. After transformation and quantization, the transform coefficients are usually entropy coded and embedded in the bitstream. However, some of them may be beneficial to discard if the number of bits saved is sufficiently large. This is usually referred to as coefficient thresholding and is investigated in the scope of H.264 in this report. Lagrangian optimization for video compression has proven to yield substantial improvements in perceived quality, and the H.264 Reference Software has been designed around this concept. When performing Lagrangian optimization, lambda is a crucial parameter that determines the tradeoff between rate and distortion. We propose a new method to select lambda and the quantization parameter for non-reference frames in H.264. The two methods are shown to achieve significant improvements. When combined, they reduce the bit rate by around 12% while preserving the video quality in terms of average PSNR. To aid development of H.264, a software tool has been created to visualize the coding process and present statistics. This tool is capable of displaying information such as bit distribution, motion vectors, predicted pictures and motion-compensated block sizes.
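To make the rate-distortion tradeoff concrete, here is a minimal sketch of how a Lagrangian cost J = D + λ·R can drive a coefficient-thresholding decision: a quantized coefficient is zeroed whenever the bits saved outweigh the distortion added. The cost model and the bit-count function below are illustrative assumptions, not the thesis's actual rate model.

```python
# Hedged sketch: Lagrangian coefficient thresholding, J = D + lambda * R.
# The rate estimate below is a stand-in; a real encoder would use the
# entropy coder's actual bit counts.

def bits_for(coeff: int) -> int:
    """Crude illustrative rate model: larger coefficients cost more bits."""
    return 0 if coeff == 0 else 2 + abs(coeff).bit_length()

def threshold_coefficients(coeffs, lam, step):
    """Zero out each quantized coefficient whose removal lowers J = D + lam*R.

    coeffs: quantized transform coefficients (list of ints)
    lam:    Lagrange multiplier trading rate against distortion
    step:   quantizer step size, expressing distortion in pixel units
    """
    out = []
    for c in coeffs:
        # Squared reconstruction error added by forcing the coefficient to zero.
        extra_distortion = (c * step) ** 2
        saved_bits = bits_for(c) - bits_for(0)
        # Drop the coefficient if the rate saving outweighs the distortion.
        out.append(0 if extra_distortion < lam * saved_bits else c)
    return out

if __name__ == "__main__":
    block = [12, -3, 1, 0, 1, -1, 0, 0]
    print(threshold_coefficients(block, lam=85.0, step=2.5))
```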
82

Fast Mode Selection Algorithm for H.264 Video Coding

Hållmarker, Ola, Linderoth, Martin January 2005 (has links)
ITU-T and the Moving Picture Experts Group (MPEG) have jointly, under the name of the Joint Video Team (JVT), developed a new video coding standard. The standard is called H.264 and is also known as Advanced Video Coding (AVC) or MPEG-4 Part 10. Comparisons show that H.264 greatly outperforms MPEG-2, currently used in DVD and digital TV: H.264 halves the bit rate at equal image quality. This rate-distortion performance comes, however, at the cost of high computational complexity, especially on the encoder side. Handling of audio and video, e.g. compressing and filtering, is quite complex and requires high-performance hardware and software. A video encoder consists of a number of modules that find the best coding parameters. For each macroblock, several modes are evaluated in order to achieve optimal coding. The reference implementation of H.264 uses a brute-force search for this mode selection, which is extremely computationally demanding. In order to perform video encoding with satisfactory speed there is an obvious need to reduce the number of modes that are evaluated. This thesis proposes an algorithm which reduces the number of modes and reference frames that are evaluated. The algorithm can be regulated in order to meet a given quality-versus-speed demand. Six times faster encoding can be obtained without losing perceptual image quality; by allowing some quality degradation, encoding becomes up to 20 times faster.
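As an illustration of the general idea behind regulated fast mode selection (not the specific algorithm of this thesis), the sketch below evaluates candidate modes in order of prior likelihood and stops early once costs start rising past a tunable margin; the mode names and cost values are toy assumptions.

```python
# Hedged sketch of regulated fast mode selection: try candidate macroblock
# modes from most to least likely and stop once the measured cost exceeds
# the best one seen by more than a tunable margin. A larger margin means
# faster encoding at the risk of missing the true optimum.

def select_mode(macroblock, modes, evaluate, early_stop_margin):
    """Return (best_mode, best_cost).

    modes:    candidate modes sorted from most to least likely
    evaluate: callable(mode, macroblock) -> Lagrangian RD cost
    """
    best_mode, best_cost = None, float("inf")
    for mode in modes:
        cost = evaluate(mode, macroblock)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        elif cost > best_cost * (1.0 + early_stop_margin):
            break  # costs are rising; assume later modes will not win
    return best_mode, best_cost

if __name__ == "__main__":
    # Toy per-mode costs for demonstration only.
    toy_costs = {"SKIP": 120.0, "16x16": 100.0, "16x8": 104.0,
                 "8x16": 130.0, "8x8": 140.0, "I4x4": 200.0}
    mode, cost = select_mode(
        macroblock=None,
        modes=list(toy_costs),
        evaluate=lambda m, mb: toy_costs[m],
        early_stop_margin=0.1,
    )
    print(mode, cost)  # stops before trying 8x8 and I4x4
```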
83

2-d Mesh-based Motion Estimation And Video Object Manipulation

Kaval, Huseyin 01 September 2007 (has links) (PDF)
Motion estimation and compensation play an important role in video processing applications. Two-dimensional block-based and mesh-based models are widely used in this area. A 2-D mesh-based model provides a better representation of complex real-world motion than a block-based model. Mesh-based motion estimation algorithms are employed in both frame-based and object-based video compression and coding. A hierarchical mesh-based algorithm is applied to improve the motion field generated by a single-layer algorithm. 2-D mesh-based models also enable the manipulation of video objects, which is included in the MPEG-4 standard: a video object in a video clip can be replaced by another object by the use of a dynamic mesh structure. In this thesis, a comparative analysis of 2-D block-based and mesh-based motion estimation algorithms in both frame-based and object-based video representations is performed. The experimental results indicate that a mesh-based algorithm produces better motion compensation results than a block-based algorithm. Moreover, a two-layer mesh-based algorithm shows improvement over a one-layer mesh-based algorithm. The application of mesh-based motion estimation and compensation to video object replacement and animation is also performed.
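The core operation that distinguishes a mesh-based model from a block-based one is the affine warp each deformed triangle induces on its interior pixels. The following is a minimal sketch of that mapping via barycentric coordinates; the triangle layout and node motion are illustrative, not taken from the thesis.

```python
import numpy as np

# Hedged sketch: affine warping of one mesh triangle, the core of 2-D
# mesh-based motion compensation. Node motion vectors deform the triangle,
# and every interior point is mapped by the affine transform that the
# three node correspondences define (via barycentric coordinates).

def barycentric(p, a, b, c):
    """Barycentric coordinates of point p in triangle (a, b, c)."""
    m = np.array([[b[0] - a[0], c[0] - a[0]],
                  [b[1] - a[1], c[1] - a[1]]], dtype=float)
    u, v = np.linalg.solve(m, np.asarray(p, float) - np.asarray(a, float))
    return 1.0 - u - v, u, v

def warp_point(p, tri_cur, tri_ref):
    """Map p from the current-frame triangle to the reference triangle."""
    w0, w1, w2 = barycentric(p, *tri_cur)
    a, b, c = (np.asarray(v, float) for v in tri_ref)
    return w0 * a + w1 * b + w2 * c  # same barycentric mix in the ref frame

if __name__ == "__main__":
    tri_cur = [(0, 0), (8, 0), (0, 8)]          # triangle in current frame
    tri_ref = [(1, 1), (9, 2), (0, 9)]          # nodes after motion vectors
    print(warp_point((2, 2), tri_cur, tri_ref))  # reference-frame location
```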
84

MPEG-4 Facial Feature Point Editor / Editor för MPEG-4 "feature points"

Lundberg, Jonas January 2002 (has links)
The use of computer-animated interactive faces in film, TV and games is ever growing, with new application areas emerging on the Internet and in mobile environments. Morph targets are one of the most popular methods to animate the face. Up until now, 3D artists had to design each morph target defined by the MPEG-4 standard by hand, a very monotonous and tedious task. With the newly developed method of Facial Motion Cloning [11], the heavy work is lifted from the artists: from an already animated face model, the morph targets can now be copied onto a new static face model.

For the Facial Motion Cloning process, a subset of the feature points specified by the MPEG-4 standard must be defined. The purpose of this is to correlate the facial features of the two faces. The goal of this project is to develop a graphical editor in which the artists can define the feature points for a face model. The feature points are saved in a file format that can be used by Facial Motion Cloning software.
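As a sketch of what such an editor might persist, the snippet below stores picked feature points, which MPEG-4 labels group.index (e.g. the 3.x points lie on the eyes), against mesh vertex indices. The plain-text format and field names are assumptions for illustration; the actual file format consumed by the cloning software may differ.

```python
# Hedged sketch: storing MPEG-4 facial feature points for later use by a
# Facial Motion Cloning tool. The simple text format below is an assumed
# illustration, not the editor's real output format.

from dataclasses import dataclass

@dataclass
class FeaturePoint:
    label: str          # MPEG-4 feature point label, e.g. "3.5"
    vertex_index: int   # index of the picked vertex in the face mesh

def save_feature_points(points, path):
    with open(path, "w", encoding="ascii") as f:
        for fp in points:
            f.write(f"{fp.label} {fp.vertex_index}\n")

if __name__ == "__main__":
    picked = [FeaturePoint("3.5", 1021), FeaturePoint("9.3", 87)]
    save_feature_points(picked, "feature_points.txt")
```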
85

Virtual human modelling and animation for real-time sign language visualisation

van Wyk, Desmond Eustin January 2008 (has links)
No description available.
86

MPEG-4 AVC stream watermarking

Hasnaoui, Marwen 28 March 2014 (has links) (PDF)
The present thesis addresses MPEG-4 AVC stream watermarking and considers two theoretical and applicative challenges, namely ownership protection and content integrity verification. From the theoretical point of view, the thesis's main challenge is to develop a unitary watermarking framework (insertion/detection) able to serve the two above-mentioned applications in the compressed domain. From the methodological point of view, the challenge is to instantiate this theoretical framework to serve the targeted applications. The thesis's first main contribution consists in building the theoretical framework for multi-symbol watermarking based on quantization index modulation (m-QIM). The insertion rule is analytically designed by extending the binary QIM rule. The detection rule is optimized so as to ensure a minimal probability of error under attacks distributed as additive white Gaussian noise. It is thus demonstrated that the data payload can be increased by a factor of log2(m), for prescribed transparency and additive Gaussian noise power. A data payload of 150 bits per minute, i.e. about 20 times larger than the limit imposed by the DCI standard, is obtained. The thesis's second main theoretical contribution consists in specifying a preprocessing MPEG-4 AVC shaping operation which can eliminate the intra-frame drift effect. The drift represents the distortion spread in the compressed stream related to the MPEG encoding paradigm. In this respect, the drift distortion propagation problem in MPEG-4 AVC is algebraically expressed and the corresponding system of equations is solved under drift-free constraints. The drift-free shaping results in a transparency gain of 2 dB in PSNR.
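A minimal sketch of generic m-ary QIM (the thesis's exact insertion and detection rules are not reproduced): each of the m symbols selects a quantizer lattice shifted by a fraction of the step Δ, and the detector returns the symbol whose lattice lies nearest the received sample.

```python
import numpy as np

# Hedged sketch of m-ary quantization index modulation (m-QIM).
# Symbol k in {0..m-1} selects a quantizer lattice offset by k*delta/m;
# the detector returns the symbol whose lattice lies nearest the sample.
# This is textbook QIM for illustration, not the thesis's insertion rule.

def qim_embed(x, symbol, m, delta):
    offset = symbol * delta / m
    return np.round((x - offset) / delta) * delta + offset

def qim_detect(y, m, delta):
    # Re-quantize with each candidate lattice; pick the closest.
    distances = [abs(y - qim_embed(y, k, m, delta)) for k in range(m)]
    return int(np.argmin(distances))

if __name__ == "__main__":
    delta, m = 8.0, 4
    x = 13.7                       # host value (e.g. a transform coefficient)
    y = qim_embed(x, 2, m, delta)  # embed symbol 2
    y_noisy = y + 0.9              # attack: mild additive noise
    print(qim_detect(y_noisy, m, delta))  # -> 2 while |noise| < delta/(2m)
```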
87

Error Detection for DMB Video Streams

Irani, Ramin January 2013 (has links)
The purpose of this thesis is to detect errors in a Digital Multimedia Broadcasting (DMB) transport stream. DMB uses the MPEG-4 standard for encapsulating Packetized Elementary Streams (PES) and the MPEG-2 standard for assembling them into transport stream packets. Much recent research has addressed video stream error detection, mostly by focusing on decoding parameters related to individual frames; processing complexity is a disadvantage of those methods. In this thesis, we investigated syntax errors that occur due to corruption in the header of the video transport stream. The main focus of the study is video streams that cannot be decoded. The proposed model is implemented by filtering video and audio packets in order to find the errors. The filters inspect several sources of corruption that can affect video stream playback. The output of this method determines the type, location and duration of the errors. The simplicity of the structure is one of the advantages of this model: it can be implemented with three simple filters for detecting errors and a calculation unit for computing the duration of an error. Fast processing is another benefit of the proposed model.
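Header-level checks of this kind rest on the fixed MPEG-2 transport stream layout: 188-byte packets, a 0x47 sync byte, a 13-bit PID and a 4-bit per-PID continuity counter. The sketch below flags sync loss and continuity gaps; the three filters actually used in the thesis are not reproduced.

```python
# Hedged sketch: scanning MPEG-2 transport stream packets for two simple
# header-level errors -- sync-byte loss and continuity-counter gaps.
# Byte 0 is the 0x47 sync byte, the PID spans 13 bits of bytes 1-2, and
# the low 4 bits of byte 3 hold the continuity counter. For brevity this
# ignores adaptation-field subtleties (the counter only advances when a
# payload is present, and one duplicate packet is permitted by the spec).

TS_PACKET_SIZE = 188

def scan_ts(data: bytes):
    errors = []
    last_cc = {}  # PID -> last continuity counter seen
    for i in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[i:i + TS_PACKET_SIZE]
        if pkt[0] != 0x47:
            errors.append((i, "sync byte lost"))
            continue
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        cc = pkt[3] & 0x0F
        if pid in last_cc and (last_cc[pid] + 1) % 16 != cc:
            errors.append((i, f"continuity error on PID {pid:#06x}"))
        last_cc[pid] = cc
    return errors
```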
88

Model-Based Eye Detection and Animation

Trejo Guerrero, Sandra January 2006 (has links)
In this thesis we present a system to extract the eye motion from a video stream containing a human face and apply this eye motion to a virtual character. By eye motion estimation, we mean the information which describes the location of the eyes in each frame of the video stream. By applying this eye motion estimation to a virtual character, we make the virtual face move its eyes in the same way as the human face, synthesizing eye motion in a virtual character. In this study, a system capable of face tracking, eye detection and extraction, and finally iris position extraction from a video stream containing a human face has been developed. Once an image containing a human face is extracted from the current frame of the video stream, detection and extraction of the eyes is applied, based on edge detection. The iris center is then determined by applying image preprocessing and region segmentation, using edge features of the extracted eye picture. Once we have extracted the eye motion, it is translated into the MPEG-4 Facial Animation Parameters (FAPs). Thus we can improve the quality and quantity of the facial expressions that we can synthesize for a virtual character.
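As a deliberately simplified illustration of the final localization step, the iris center on an extracted grayscale eye patch can be estimated by segmenting the darkest pixels and taking their centroid; the thesis's edge-based preprocessing and region segmentation are richer than this sketch.

```python
import numpy as np

# Hedged sketch: estimating an iris center on an extracted grayscale eye
# patch by segmenting dark pixels and taking their centroid. This is a
# simplification of the edge-based processing the thesis describes.

def iris_center(eye_patch):
    """Estimate (row, col) of the iris in a 2-D grayscale eye patch."""
    lo, hi = int(eye_patch.min()), int(eye_patch.max())
    threshold = lo + 0.3 * (hi - lo)     # keep only the darkest pixels
    rows, cols = np.nonzero(eye_patch <= threshold)
    if rows.size == 0:
        return None
    return float(rows.mean()), float(cols.mean())

if __name__ == "__main__":
    patch = np.full((40, 60), 200, dtype=np.uint8)
    patch[15:25, 25:35] = 30      # synthetic dark iris blob
    print(iris_center(patch))     # ~ (19.5, 29.5)
```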
89

Image coding with H.264 I-frames / Stillbildskodning med H.264 I-frames

Eklund, Anders January 2007 (has links)
In this thesis work, a part of the video coding standard H.264 has been implemented. The part of the video coder that is used to code the I-frames has been implemented to see how well suited it is for regular image coding. The main difference compared with other image coding standards, such as JPEG and JPEG2000, is that this video coder uses both a predictor and a transform to compress the I-frames, while JPEG and JPEG2000 use only a transform. Since the prediction error is sent instead of the actual pixel values, many of the values are zero or close to zero before the transformation and quantization. The method thus works much like a video encoder, with the difference that blocks of an image are predicted instead of frames in a video sequence.
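A toy illustration of the predict-then-transform idea: H.264 intra coding offers several directional prediction modes, of which only DC prediction of a 4x4 block is sketched here, together with the residual that would go on to the transform. The sample values are made up.

```python
import numpy as np

# Hedged sketch of the predict-then-transform idea in H.264 intra coding.
# Only DC prediction of a 4x4 block is shown; the real standard offers
# several directional intra modes and a specific integer transform.

def dc_predict(left: np.ndarray, top: np.ndarray) -> np.ndarray:
    """Predict a 4x4 block as the mean of its left and top neighbours."""
    dc = int(round((left.sum() + top.sum()) / 8.0))
    return np.full((4, 4), dc, dtype=np.int32)

if __name__ == "__main__":
    block = np.array([[52, 55, 61, 66],
                      [63, 59, 55, 90],
                      [62, 59, 68, 113],
                      [63, 58, 71, 122]], dtype=np.int32)
    left = np.array([60, 61, 60, 62])  # reconstructed column to the left
    top = np.array([58, 57, 60, 64])   # reconstructed row above
    pred = dc_predict(left, top)
    residual = block - pred            # this, not the pixels, is transformed
    print(residual)                    # many residual values are near zero
```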
90

Multi-View Video Transmission over the Internet

Abdullah Jan, Mirza, Ahsan, Mahmododfateh January 2010 (has links)
3D television using multiple-view rendering is receiving increasing interest. In this technology, a number of video sequences are transmitted simultaneously, providing a larger view of the scene or a stereoscopic viewing experience. With two views, stereoscopic rendition is possible. Nowadays, 3D displays are available that are capable of displaying several views simultaneously, and the user is able to see different views by moving their head. The thesis work aims at implementing a demonstration system with a number of simultaneous views. The system includes two cameras, computers at both the transmitting and receiving ends, and a multi-view display. Besides setting up the hardware, the main task is to implement software so that the transmission can be done over an IP network. This thesis report includes an overview of and experiences with similar published systems, the implementation of real-time video capture, compression, encoding, and transmission over the Internet with the help of socket programming, and finally the multi-view display in 3D format. The report also describes the design considerations in more detail regarding video coding and network protocols.
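The transmission part rests on ordinary socket programming. Below is a minimal sketch of length-prefixed frame transport over TCP with a one-byte view identifier; the header layout is an assumption for illustration and does not reproduce the demonstration system's actual protocol.

```python
import socket
import struct

# Hedged sketch: length-prefixed frame transport over TCP, the kind of
# socket plumbing a multi-view streaming demo rests on. Encoding, view
# multiplexing, and the real system's protocol are not reproduced here.

def send_frame(sock: socket.socket, view_id: int, payload: bytes) -> None:
    # Header: 1-byte view id + 4-byte big-endian payload length.
    sock.sendall(struct.pack(">BI", view_id, len(payload)) + payload)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf

def recv_frame(sock: socket.socket):
    view_id, length = struct.unpack(">BI", recv_exact(sock, 5))
    return view_id, recv_exact(sock, length)
```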
