  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Audiovizuální istalace / Audiovisual Installation

Kachtík, Petr January 2013
The thesis is based on an installation that links audio, video and lights manipulated in real time. Interaction between the different parts of the installation arises from handling its components. The aim is to comprehensively fill a darkened room by creating a specific environment.
2

Firemní nasazení audio-video techniky / Deployment of corporate audio-video technology

Nikl, Martin January 2016
This thesis deals with the possibilities of connecting, automating and controlling audio/video hardware. The theoretical part covers input and output A/V devices, A/V interfaces, communication interfaces, the options for connecting automation elements, and room control. It also explains terms relating to audio/video. The practical part proposes a complete design and implementation of a showroom presenting the portfolio of the Colsys company. The showroom demonstrates functions of the Crestron control system, such as controlling projectors, video matrices, sound equipment and building automation (lights, air conditioning, blinds, etc.). The first part of this section defines the requirements for individual devices. The next step was the selection of suitable components that met the set criteria and the budget. After that, a compatibility check was performed and the room layout was designed, comprising an electrical layout and a room blueprint. The final step was designing and implementing the user interface in order to create the control unit program. Based on the theoretical part, the author wrote a chapter titled Recommendation, which is verified in the practical part by the creation of a fully functional showroom for the Colsys company that also serves as a meeting room.
3

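Qualitätsgerechte Übertragung komprimierter Audio-/ Videoströme in IP- Netzen im Vergleich verschiedener Kompressionsverfahren / Quality-Oriented Transmission of Compressed Audio/Video Streams in IP Networks: A Comparison of Compression Methods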

Sonne, Dirk 17 November 2017
The aim of this work is the practical trial and evaluation of the transmission of compressed audio/video streams in IP networks. For this purpose, a video editing workstation was set up with audio/video compression hardware from Optibase. In particular, the behaviour of the video card under MPEG-1 and MPEG-2 compression was examined. More than 160 films were recorded with different parameters for the bitrate and GOP structure of the encoded video stream, and more than 2400 frames were extracted from them. These frames were categorised and subjected to a qualitative assessment.
4

Development and Application of Digital Audio/Video Learning Material for Traffic Pattern Training in Airforce Academy

Guo, Shun-sian 05 August 2005
Flying is a three-dimensional exercise combined with time. It involves the application of theory, the handling of real space and the management of time. Developing digital audio/video teaching material for flying skills is a new challenge: the material should let pilots experience the handling, tempo, movement in space, aircraft attitude and radio communication procedures of flight, compensate for the weak visual effects of the simulator, and help student pilots understand flight handling. This research first statistically analyzed the training completion and drop-out data of 18 Basic Flying Training classes at the Chinese Air Force Academy, from academic years 88 to 92. It found that the traffic pattern phase accounted for a drop-out rate of 21.8%, out of an overall training drop-out rate of 35.2%. Furthermore, 16 out of 18 classes had their highest drop-out rate during the traffic pattern phase. The purpose of this research is therefore to offer digital audio/video teaching material that helps student pilots understand and learn the flying skills for the traffic pattern, in the hope of assisting them with handling and concepts and of providing an effective learning method and technical instruction for both student pilots and the training unit, so as to improve flying training performance. The research questions cover: 1. Novel flying training and teaching assistance equipment. 2. Raising the completion rate of the traffic pattern training phase. 3. Raising performance in the traffic pattern training phase. 4. Investigating potential problems associated with flying training. The researcher is currently an instructor pilot at the Chinese Air Force Academy.
The researcher therefore adopts action research, based on the researcher's own practice, as the main research approach. In this research, besides collecting and organizing documents, the researcher develops digital audio/video teaching material and selects two flying training classes of the current academic year, in coordination with the basic flying training schedule of the Chinese Air Force Academy. After the traffic pattern digital audio/video material has been applied, surveys, interviews and case studies are conducted, and the research data are analyzed with analysis tools. Lastly, the research procedure and data analysis are expected to show that action research has important value in the field of flight training, and to identify critical influence factors in four aspects (teaching media, unit administration, actual flying training practice, and the flying students), in order to provide training advice and future research directions. Keywords: Traffic Pattern, Training Performance, Critical Influence Factor, Action Research, Digital Audio/Video Learning Material
5

Učící se analyzátor audio-vizuálních záznamů / Continously Learning Analyser of Audio-Visual Recordings

Košarko, Ondřej January 2016
This thesis introduces a tool for the analysis of audiovisual recordings. The tool uses the audio and closed captions supplied by the user to prepare a text annotation. The annotation contains a transcript of the show, which is based on the closed captions. In addition, speaker diarization is performed to mark who spoke when. The diarization is performed by a third-party library, which is evaluated on data from the DIALOG corpus and whose inner workings are described. Kaldi, a speech recognition toolkit, is used to assign the right portions of the text to the right sections of the recording. Furthermore, the thesis contains an overview of how closed captions are created, an overview of speech corpora creation, and a brief review of the literature on recording analysis.
6

Graduate Students' Perceptions of the Effectiveness of a Two-Way Audio/Video Distance Learning Session and of Its Effects on Graduate Students' Comfort Level

Bangpipob, Savanee 12 1900
The purposes of this study were to (a) determine graduate students' perceptions of the effectiveness of the delivery system and their level of comfort with the delivery system, (b) determine graduate students' perceptions of the effectiveness of the delivery system and their level of comfort with the teacher, (c) determine graduate students' level of comfort with the delivery system and their level of comfort with the teacher, (d) determine differences in graduate students' ratings of the effectiveness of the delivery system before a distance education session and after a distance education session, and (e) determine differences in graduate students' level of comfort with the teacher before a distance education session and after a distance education session.
7

Modèles acoustiques à structure temporelle renforcée pour la vérification du locuteur embarquée / Reinforced temporal structure of acoustic models for speaker recognition

Larcher, Anthony 24 September 2009
Automatic speaker verification is a classification task that aims to confirm or refute a person's identity by examining the specific characteristics of their voice. Integrating speaker verification systems into embedded devices means respecting two types of constraints tied to this environment: hardware constraints, which severely limit the available storage memory and computing power; and ergonomic constraints, which limit the duration and number of training sessions as well as the duration of test sessions. In speaker recognition, state-of-the-art approaches do not exploit the temporal structure of the speech signal. We propose to use this information, through personal passwords, to compensate for the lack of training and test data. A first study allowed us to evaluate the influence of text dependency on the state-of-the-art GMM/UBM (Gaussian Mixture Model / Universal Background Model) approach. We showed that a lexical constraint imposed on this approach, which is generally used for text-independent speaker recognition, reduces the error rate by nearly 30% (relative) when impostors do not know the clients' password. In this document, we present a specific acoustic architecture that exploits, at low cost, the temporal structure of the passwords chosen by clients. This three-level hierarchical architecture allows a progressive specialization of the acoustic models. A generic model represents the whole acoustic space. Each speaker is represented by a Gaussian mixture derived from the generic world model of the first level.
The third level of our architecture consists of semi-continuous hidden Markov models (SCHMM), which model the temporal structure of the passwords while integrating the speaker-specific information modelled by the second-level GMM. Each state of a password's SCHMM is estimated, relative to that speaker's text-independent model, by adapting the weight parameters of the Gaussian distributions of this GMM. Taking the temporal structure of passwords into account reduces by 60% the equal error rate obtained when impostors utter something other than the clients' password. To strengthen the modelling of the temporal structure of passwords, we propose to integrate information from an external process into our hierarchical acoustic architecture. Strong synchronisation points, extracted from the speech signal, are used to constrain the training of the password models during the enrolment phase. The synchronisation points obtained during the test phase, by the same procedure, constrain the Viterbi decoding so that the structure of the sequence matches that of the model under test. This approach was evaluated on the MyIdea audio-video database using information from a phonetic alignment. We showed that adding a synchronisation constraint to our acoustic approach degrades impostor scores and thereby lowers the equal error rate by 20% (relative) when impostors do not know the clients' password, while ensuring performance equivalent to that of state-of-the-art approaches when impostors know the passwords. Using the video modality seems to us difficult to reconcile with the resource limits imposed by the embedded context.
We proposed a simple processing of the video stream that respects these constraints, but it did not allow us to extract relevant information. An additional modality would nevertheless make it possible to use the different kinds of structural information to thwart possible playback impostures. This work thus opens many perspectives on the use of structural information in speaker verification and on speaker recognition approaches assisted by the video modality. / Speaker verification aims to validate or invalidate the identity of a person by using his/her speech characteristics. Integrating an automatic speaker verification engine on embedded devices has to respect two types of constraint, namely limited material resources, such as memory and computational power, and limited speech, in both training and test sequences. Current state-of-the-art systems do not take advantage of the temporal structure of speech. We propose to use this information through a user-customised framework, in order to compensate for the short-duration speech signals that are common in the given scenario. A preliminary study allows us to evaluate the influence of text dependency on the state-of-the-art GMM/UBM (Gaussian Mixture Model / Universal Background Model) approach. By constraining this approach, usually dedicated to text-independent speaker recognition, we show that a lexical constraint allows a relative reduction of 30% in error rate when impostors do not know the client password. We introduce a specific acoustic architecture which takes advantage of the temporal structure of speech through a low-cost user-customised password framework. This three-stage hierarchical architecture allows a layered specialization of the acoustic models. The upper layer, which is a classical UBM, aims to model the general acoustic space. The middle layer contains the text-independent specific characteristics of each speaker.
These text-independent speaker models are obtained by a classical GMM/UBM adaptation. The text-independent speaker model is then used to obtain a left-right Semi-Continuous Hidden Markov Model (SCHMM), with the goal of harnessing the Temporal Structure Information (TSI) of the utterance chosen by the given speaker. This TSI is shown to reduce the error rate by 60% when impostors do not know the client password. In order to reinforce the temporal structure of speech, we propose a new approach for speaker verification. The speech modality is reinforced by additional temporal information. Synchronisation points extracted from an additional process are used to constrain the acoustic decoding. Such an additional modality could be used to add different structural information and to thwart impostor attacks such as playback. Thanks to the specific aspects of our system, this aided decoding shows an acceptable level of complexity. In order to reinforce the relaxed synchronisation between states and frames due to the SCHMM structure of the TSI modelling, we propose to embed external information during the audio decoding by adding further time constraints. This information is labelled external here to reflect that it is intended to come from an independent process. Experiments were performed on the BIOMET part of the MyIdea database using external information gathered from an automatic phonetic alignment. We show that adding a synchronisation constraint to our acoustic approach reduces impostor scores and decreases the error rate by 20% (relative) when impostors do not know the client password. In other conditions, when impostors know the passwords, the performance remains similar to the original baseline. Extracting the synchronisation constraint from a video stream seems difficult to reconcile with limited embedded resources. We proposed a first exploration of the use of a video stream to constrain the acoustic process.
This simple video processing did not allow us to extract any pertinent information.
8

Multimedia unter Linux / Multimedia under Linux

Heik, Andreas 21 March 2000
As Linux spreads as a desktop system, users' demands on its multimedia capabilities are also growing, for example listening to a digitized piece of music, integrating a digital camera into image processing, using a radio/TV card, or even editing a short video film.
9

Technological Acceptance of an Avatar Based Interview Training Application : The development and technological acceptance study of the AvBIT application.

Dalli, Kevin Charles January 2021
This thesis expands on previous research and designs of avatar-based child interview training software. The goal of the thesis was to identify requirements, identify technologies and evaluate the likelihood of acceptance of distribution-ready software that would enhance the role-play training exercises commonly used for child interview training. After the requirements for this type of application were identified, the technologies needed to meet them were identified, and one prototype and two production-ready applications were developed. The production-ready versions were distributed in an official capacity through AvBIT Labs AB. Each version was evaluated using the technological acceptance model (TAM) in order to determine the likelihood of acceptance in relevant industries. The TAM survey, the USE survey and correspondence with experts were used to evaluate missing requirements and the likelihood of software acceptance. The research conducted in this thesis directly contributed to the founding of AvBIT Labs AB and to the distribution of the AvBIT application to both governmental and non-governmental organizations throughout Europe seeking to enhance their child interview training.
10

Knihovna pro efektivní záznam videa v 3D aplikaci / Library for Efficient Video Capture in 3D Application

Pospíšil, Petr January 2012
This thesis deals with a library for recording video in the background of a 3D application. The library is designed to work under the Microsoft Windows and Linux operating systems. It records both image and sound. Image recording is supported in OpenGL, Direct3D9, Direct3D10 and Direct3D11. To reduce the size of the video data, the library supports image compression using the MJPG codec. Audio is recorded through WaveForm audio, Windows Core Audio or ALSA. Sound is recorded for the whole operating system. The library can record up to two audio streams to accommodate possible microphone input, and it can mix the audio data together if needed. The output data are then written into an AVI file. Custom text can be written as an overlay that is rendered as part of the application's screen output.
