  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Improving the Visual Experience When Coding Computer Generated Content Using the H.264 Standard

Berthelsen, Nicolai January 2011 (has links)
The purpose of this Master's thesis was to improve the visual experience when coding computer generated content (CGC) using the H.264 standard. As H.264 is designed primarily to code natural video, it exhibits weaknesses when coding CGC at low bit rates. The thesis has focused on identifying and modifying the components in the H.264 algorithm responsible for the occurrence of unwanted noise artifacts. The research method was based on performing quantitative research to confirm or refute the hypothesis that the H.264 algorithm performs sub-optimally when coding CGC. Experiments were conducted using coders written specifically for the thesis. The results from these experiments were then analyzed, and conclusions were drawn based on empirical observations. An implementation of H.264 was used to identify the noise artifacts resulting from coding CGC at low rates. The results indicated that H.264 indeed performs sub-optimally when coding CGC. We learned that the reason for this was that the characteristics of CGC led to the signal being more compactly represented in the spatial domain than in the transform domain. We therefore proposed to omit the component transform and quantize the residual signal directly. This method, called residual scalar quantization (RSQ), was shown to outperform traditional H.264 coding for certain CGC in terms of quantified visual quality and bit rate. Moreover, even in the cases where it was outperformed, the RSQ coder did not exhibit any of the noise artifacts present when coding with the traditional coder. We also introduced Rate-Distortion optimization, which allowed the coder to adaptively choose between traditional and RSQ coding, ensuring that each block is coded optimally, independent of the source content. This scheme was shown to outperform both stand-alone coders for all sample content.
A quantizer with representation levels tailored specifically for the characteristics of CGC was also presented, and experiments showed that it outperformed uniform quantization when coding CGC. The results in this thesis were produced by simplified versions of the actual coders, and may not be completely accurate. However, the accumulated results indicate that RSQ may indeed outperform traditional H.264 coding for CGC. To confirm the theories that have been presented, the proposed techniques should be implemented in a full-scale implementation of H.264 and the experiments repeated.
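The adaptive mode decision described above can be pictured with a small sketch. This is not the thesis's actual coder, only a minimal illustration of Lagrangian rate-distortion selection between two candidate codings of a residual block; the quantizer, rates and lambda values are invented for the example:

```python
def uniform_quantize(block, step):
    # Midtread uniform quantizer: reconstruct each sample as the
    # nearest multiple of the step size.
    return [round(x / step) * step for x in block]

def rd_cost(orig, recon, rate_bits, lam):
    # Lagrangian cost J = D + lambda * R, with D the sum of squared errors.
    dist = sum((o - r) ** 2 for o, r in zip(orig, recon))
    return dist + lam * rate_bits

def choose_mode(residual, candidates, lam):
    # candidates: list of (mode_name, reconstruction, rate_bits) tuples,
    # e.g. one entry for the transform path and one for direct RSQ.
    # The mode with the smallest Lagrangian cost wins.
    return min(candidates, key=lambda c: rd_cost(residual, c[1], c[2], lam))[0]
```

With a large lambda the cheaper-rate candidate tends to win; with a small lambda the lower-distortion candidate wins, mirroring how the adaptive scheme picks the better of traditional and RSQ coding per block.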
12

Numerisk modellering av spredning fra kuler og sylindere anvendt i romakustikk / Numerical Modelling of Scattering from Spheres and Cylinders applied to Room Acoustics

Waagø, Per January 2010 (has links)
This thesis concerns the scattering of sound from spheres and cylinders. The diffusing effect of a constellation of scattering cylinders in a room is investigated through numerical finite element simulations carried out in COMSOL Multiphysics. The room model is a simplified two-dimensional model inspired by St. Olavs domkirke in Trondheim, where several clusters of spherical glass lamps hang from the ceiling and probably help make the sound field more diffuse. The scattering patterns from a sphere and a cylinder are studied and compared analytically. The analytical expression for scattering from a cylinder is also used to evaluate the numerical solution, and to evaluate different choices of a totally absorbing boundary condition in the model. The scattering from the collection of cylinders is compared with the scattering from a single cylinder, a square prism and a plane reflector, and with the scattering from a collection of spheres, a single sphere, a cube and a square reflector. The comparisons are made using diffusion coefficients computed from the numerical simulations.
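The diffusion coefficients mentioned above are typically computed from the scattered polar response. As a hedged sketch, the free-field definition used in AES-4id-2001 / ISO 17497-2 can be written as follows (the level values in the example are invented; the thesis's own post-processing may differ):

```python
def diffusion_coefficient(levels_db):
    # Free-field diffusion coefficient from sound pressure levels L_i
    # sampled on a measurement arc around the scatterer:
    #   d = ((sum E_i)^2 - sum E_i^2) / ((n - 1) * sum E_i^2)
    # where E_i = 10^(L_i / 10) are the corresponding energies.
    # d = 1 for a perfectly uniform polar response, d -> 0 for a
    # response concentrated in a single direction.
    energies = [10 ** (l / 10.0) for l in levels_db]
    n = len(energies)
    s = sum(energies)
    s2 = sum(e * e for e in energies)
    return (s * s - s2) / ((n - 1) * s2)
```

A perfectly even polar response gives a coefficient of 1, which is why strong specular lobes from a plane reflector drive the coefficient down compared with a cluster of cylinders.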
13

Automatisk sporing av Dopplerspektrum / Tracing of cardiac Doppler spectrums

Vartdal, Gaute January 2011 (has links)
A number of quantities in the heart that are normally found through invasive catheter examination can be computed with high accuracy from velocities obtained with Doppler ultrasound. For example, the pressure in the heart chambers and blood vessels is important information when examining a patient's cardiac function. By studying the contours of the blood velocity at given points in the heart, these quantities can be computed using the simplified Bernoulli equation, without penetrating the patient with a catheter. The maximum positive and negative rates of pressure rise (dP/dt) in the ventricle are examples of such quantities, and are two of the most widely used indicators of ventricular function. They can be computed from the velocity of leakage from the ventricle to the atrium, also called mitral regurgitation. The ability to measure these and other quantities with Doppler ultrasound is an enormous advantage over catheterization.

Tracings of a Doppler spectrum must usually be made manually, a process that is time-consuming and difficult. This Master's thesis proposes an algorithm that automatically traces the contours of Doppler measurements. The algorithm is tuned for Doppler spectra of mitral regurgitation, but works in general for all types of spectra. It also attempts to handle partially weak or missing edges in the spectrum. The results are compared with manually traced edges, and show that the algorithm can compute quantities such as dP/dt and maximum velocity with high accuracy. Maximum and minimum dP/dt can be computed with an average difference from the manual tracing of less than 100 mmHg/s, and the velocity peak with a difference of less than 0.05 m/s. The results show that as long as the quality of the Doppler measurements is acceptable, the algorithm traces the contour accurately and effectively removes noise and artifacts along it.

The difference between automatically and manually obtained maximum dP/dt has a standard deviation as low as 79 mmHg/s when the spectra are of good quality. Even in spectra where parts of the signal are weak, quantities such as dP/dt can be predicted by the algorithm, and when less than 60% of a spectrum has to be predicted, maximum and minimum dP/dt can still be found very accurately.
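The pressure computation rests on the simplified Bernoulli equation, ΔP ≈ 4v² (ΔP in mmHg, v in m/s). A minimal sketch of how a maximum dP/dt could be read off a traced velocity contour (the sampling interval and velocities are invented, and this is not the thesis's tracing algorithm):

```python
def pressure_from_velocity(v):
    # Simplified Bernoulli equation: delta_P [mmHg] = 4 * v^2, v in m/s.
    return 4.0 * v * v

def max_dp_dt(velocities, dt):
    # Convert a traced velocity contour into a pressure curve, then take
    # the steepest positive slope as the maximum dP/dt [mmHg/s].
    p = [pressure_from_velocity(v) for v in velocities]
    return max((p[i + 1] - p[i]) / dt for i in range(len(p) - 1))
```

In practice the contour would first be smoothed and cleaned of artifacts, which is exactly what the automatic tracing provides.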
14

Real-Time Simulation of Reduced Frequency Selectivity and Loudness Recruitment Using Level Dependent Gammachirp Filters

Bertheussen, Gaute January 2012 (has links)
A real-time system for simulating reduced frequency selectivity and loudness recruitment was implemented in the C programming language. The finished system is an executable program in which a user can input a sound file and a list of hearing losses. As the program runs, a processed version of the input signal is played back. The processed signal includes the effects of either one or both of the hearing impairments. The system, called a hearing loss simulator, is based on the dynamic compressive gammachirp filter bank. Each channel in the filter bank is signal dependent, meaning the filter characteristics change according to an estimate of the signal level. Reduced frequency selectivity was simulated by letting a hearing loss value, in addition to the signal level, influence the filter characteristics. This produced masking effects and reduced the detail of spectral envelopes. Loudness recruitment was simulated by scaling each sample based on the signal level. This technique accounted for the abnormal growth of loudness level and for elevated absolute thresholds. It made soft sounds disappear while leaving loud sounds closer to their original level.
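The recruitment scaling can be pictured as dynamic-range expansion below the listener's uncomfortable level. The sketch below is a strong simplification and not the actual simulator: the threshold and uncomfortable-level values are illustrative, and the real system derives its level estimate per channel from the gammachirp filter bank rather than assuming it is known:

```python
def recruitment_gain_db(level_db, threshold_db, uncomfortable_db=100.0):
    # Map input levels between the (elevated) hearing threshold and the
    # uncomfortable level linearly onto the full range 0 dB .. UCL, so
    # soft sounds drop below audibility while loud sounds stay close to
    # their original level. Returns the gain (usually negative) in dB.
    if level_db >= uncomfortable_db:
        return 0.0  # no change at or above the uncomfortable level
    expanded = (level_db - threshold_db) * uncomfortable_db / (
        uncomfortable_db - threshold_db)
    return expanded - level_db

def apply_gain(sample, gain_db):
    # Scale one sample by a gain given in dB.
    return sample * 10 ** (gain_db / 20.0)
```

A sample sitting at the elevated threshold is pushed to 0 dB (inaudible), while a sample at the uncomfortable level passes through unchanged, which is the hallmark of recruitment.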
15

MCTF and JPEG 2000 Based Wavelet Video Coding Compared to the Future HEVC Standard

Erlid, Frøy Brede Tureson January 2012 (has links)
Video and multimedia content has over the years become an important part of our everyday life. At the same time, the technology available to consumers has become more and more advanced. These technologies, such as streaming services and advanced displays, have enabled us to watch video content on a large variety of devices, from small, battery-powered mobile phones to large TV sets.

Streaming of video over the Internet is becoming increasingly popular. As bandwidth is a limited resource, efficient compression techniques are clearly needed. The wide variety of devices capable of streaming and displaying video also suggests a need for scalable video coders, as different devices may support different sets of resolutions and frame rates.

As a response to the demand for efficient coding standards, VCEG and MPEG are jointly developing an emerging video compression standard called High Efficiency Video Coding (HEVC). The goal of this standard is to improve coding efficiency compared to H.264 without affecting image quality. A scalable video coding extension to HEVC is also planned.

HEVC is based on the classic hybrid coding approach. This, however, is not the only way to compress video, and wavelet coders have received attention in the literature. JPEG 2000 is a wavelet image coder that offers spatial and quality scalability. Combining JPEG 2000 with Motion Compensated Temporal Filtering (MCTF) gives a wavelet video coder that offers temporal, spatial and quality scalability without the need for complex extensions.

In this thesis, a wavelet video coder based on the combination of MCTF and JPEG 2000 was implemented. This coder was compared to HEVC through objective and subjective assessments, with the use case being streaming of video over a typical consumer broadband connection. The objective assessment showed that HEVC was the superior system in terms of both PSNR and SSIM.

The subjective assessment revealed that observers preferred the distortion produced by HEVC over that of the proposed system. However, the results also indicated that improvements could be made to the proposed system that might enhance its objective and subjective quality. In addition, the results suggest that a use case operating at higher bit rates would suit the proposed system better.
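MCTF decomposes each frame pair into a temporal low-pass and high-pass band before the spatial wavelet coding. A hedged sketch of one Haar lifting step, with motion compensation omitted for brevity (the real coder motion-compensates frame_a toward frame_b before the predict step):

```python
def haar_mctf_analysis(frame_a, frame_b):
    # One temporal Haar lifting step of MCTF (no motion compensation):
    # predict frame_b from frame_a, then update frame_a.
    high = [b - a for a, b in zip(frame_a, frame_b)]    # predict step
    low = [a + h / 2.0 for a, h in zip(frame_a, high)]  # update step
    return low, high

def haar_mctf_synthesis(low, high):
    # Invert the lifting steps to recover the original frame pair exactly.
    frame_a = [l - h / 2.0 for l, h in zip(low, high)]
    frame_b = [h + a for h, a in zip(high, frame_a)]
    return frame_a, frame_b
```

Because lifting is exactly invertible, the temporal decomposition itself is lossless; all rate reduction then comes from coding the low- and high-pass bands with JPEG 2000.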
16

Subjective and Objective Crosstalk Assessment Methodologies for Auto-stereoscopic Displays

Skildheim, Kim Daniel January 2012 (has links)
Stereoscopic perception is achieved when each of the observer's eyes sees a scene from a slightly different angle. Auto-stereoscopic displays use several separate views to achieve this without any special glasses. Crosstalk is an undesired effect of separating views, and one of the most annoying artefacts occurring in an auto-stereoscopic display. This experiment has two parts. The first part proposes a subjective assessment methodology for characterizing crosstalk in an auto-stereoscopic display without restricting the subjects' viewing behaviour. The intention was to create an inexpensive method. The measurement was performed using a Kinect sensor as a head-tracking system, combined with subjective score evaluation, to obtain a data plot of the perceived crosstalk. The crosstalk varies with image content, disparity and viewing position. The result is a data plot that approaches a periodic pattern, consistent with the characteristics of an auto-stereoscopic display. The result is not perfect, as there are several sources of error. These could be reduced with better head tracking, an improved movement system, post-processing of the data, more data, and removal of outliers.

The second part proposes methods for extracting subjective values from interpolated plots and for creating objective crosstalk-influenced pictures that correlate with the subjective data. The best extraction method was to combine an adapted sine regression curve with a linear interpolation. This interpolation followed the subjective values in a parallel slice plot at 3.592 m from the screen, and was adapted to fit a derived model as well as possible to achieve a good correlation. Objective crosstalk pictures were created, where the amount of crosstalk was determined by the neighbouring view that influenced the current view the most.

The correlation was based on the relationship between the SSIM value of the created crosstalk picture and the extracted subjective value. The overall correlation across all pictures was 0.8249, and the picture with the highest correlation reached 0.9561. The method works well for pictures with a maximum disparity below 38 pixels. The overall result is good, and it also serves as a quality measure for the subjective test. It could be improved by taking more views into account when creating the objective crosstalk pictures, or by trying another method of creating crosstalk. Improved extraction of the subjective values would also improve the correlation further.
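The objective crosstalk pictures can be thought of as leaking a fraction of the dominant neighbouring view into the current view. A minimal sketch (the leakage fraction alpha is invented for illustration; the crosstalk picture would then be compared against the clean view with SSIM, as in the correlation analysis above):

```python
def add_crosstalk(view, neighbour, alpha):
    # Blend a fraction alpha of the dominant neighbouring view into the
    # current view, pixel by pixel, to simulate inter-view crosstalk.
    # view and neighbour are flat sequences of pixel intensities.
    return [(1 - alpha) * v + alpha * n for v, n in zip(view, neighbour)]
```

With alpha = 0 the picture is unchanged, and increasing alpha moves the SSIM score away from 1, which is what lets the SSIM values track the perceived crosstalk.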
17

iVector Based Language Recognition

Tokheim, Åsmund Einar Haugland January 2012 (has links)
The focus of this thesis is a fairly new approach to phonotactic language recognition, i.e. identifying a language from the sounds in a spoken utterance, known as iVector subspace modeling. The goal of the iVector is to compactly represent the discriminative information in an utterance so that further processing of the utterance is less computationally intensive. This may enable the system to be trained with more data, and thereby reach higher performance. We present both the theory behind iVectors and experiments to better fit the iVector space to our development data. The final system achieved results comparable to our baseline PRLM system on the NIST LRE03 30-second evaluation set.
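One common lightweight back-end for iVector systems scores a test utterance's iVector against per-language mean iVectors with cosine similarity. The sketch below is illustrative only, with made-up two-dimensional iVectors, and is not necessarily the back-end used in the thesis:

```python
import math

def cosine_score(w_test, w_lang):
    # Cosine similarity between a test iVector and a language's
    # mean iVector; higher means a better match.
    dot = sum(a * b for a, b in zip(w_test, w_lang))
    na = math.sqrt(sum(a * a for a in w_test))
    nb = math.sqrt(sum(b * b for b in w_lang))
    return dot / (na * nb)

def recognize(w_test, language_models):
    # language_models: dict mapping language name -> mean iVector.
    # Return the language whose model scores highest.
    return max(language_models,
               key=lambda lang: cosine_score(w_test, language_models[lang]))
```

Because scoring reduces to a few dot products per language, the expensive work is pushed into extracting the iVector once per utterance, which is the efficiency argument made above.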
18

Residents' perceived impact of noise reducing measures implemented on habitations located nearby heavy traffic roads

Sardinoux, Frederik Strand January 2012 (has links)
Road traffic has increased steadily over recent decades. The noise it generates has grown more or less linearly with the traffic and has become a major environmental problem. It affects the human body in negative ways, causing sleep disorders, stress and even cardiovascular disease. The Norwegian government has introduced regulations to reduce the number of people exposed to high noise levels, and many people have now had their homes fitted with noise-reducing measures. There is, however, a lack of research on how these noise-control measures affect indoor noise, and especially on how residents experience the modifications. In this study, a telephone survey was conducted among 76 households selected along the stretch of the new E6 in Norway from Gardermoen to Biri. Across these dwellings, the average outdoor and indoor noise levels prior to the installation of any noise-reducing measures were 61 dB and 34 dB respectively. The results, analysed statistically with the software SPSS, show the degree of annoyance experienced by the residents at different outdoor and indoor noise levels. The subjective improvement in the noise situation reported after the measures differs markedly between outdoors and indoors, the improvement generally being larger inside the dwelling.

Indeed, while nearly 90% of the respondents felt annoyed to extremely annoyed outdoors, only 50% reported the same annoyance levels after the noise-control measures were installed. For the indoor situation, 40% of the survey participants felt annoyed to very annoyed before the measures, while only 5% felt the same degree of annoyance afterwards. Furthermore, while 42% were sleep-disturbed, 50% experienced stress and 66% felt a reduction in their well-being before the measures, only 13%, 20% and 18% respectively reported the same health issues after the measures were installed.

Around half of the interviewees said they were satisfied with both Sweco and Statens Vegvesen, the key organisations in the planning and construction of the measures along the chosen stretch. Finally, taking all of this into account, 40% of the 76 selected residents are satisfied with the noise-reducing measures, 10% are dissatisfied, and the rest are neutral.
19

Dialektgjenkjenning / Automatic Dialect Recognition

Sandberg, Susanne Barkhald January 2012 (has links)
This work was carried out with the goal of developing a system for automatic recognition of dialects based on acoustic modelling. The system was built using the Hidden Markov Model Toolkit (HTK), a collection of ready-made language-modelling tools developed by the Cambridge University Engineering Department in 1996. Training and evaluation were carried out for two dialect variants within each of the three languages Spanish, English and Mandarin.
