1 |
Speech recognition availability / Tillgängligheten i taligenkänning
Eriksson, Mattias, January 2004
This project investigates the importance of availability in the scope of dictation programs. Dictation by speech recognition has not reached the general public, and that may well be a result of poor availability in today’s technical solutions. I constructed a persona, Johanna, who personifies the target user. I also developed a solution that streams audio to a speech recognition server and sends back the interpreted text. Evaluated against the Johanna persona, the solution was successful in theory. I then recruited test users who tried out the solution in practice. Half of them report that their usage has increased, and will continue to increase, thanks to the new level of availability.
|
3 |
An investigation of protocol command translation as a means to enable interoperability between networked audio devices
Igumbor, Osedum Peter, January 2014
Digital audio networks allow multiple channels of audio to be streamed between devices. This eliminates the need for many different cables to route audio between devices. An added advantage of digital audio networks is the ability to configure and control the networked devices from a common control point. Common control of networked devices enables a sound engineer to establish and destroy audio stream connections between networked devices that are some distance apart. On a digital audio network, an audio transport technology enables the exchange of data streams. Typically, an audio transport technology is capable of transporting both control messages and audio data streams. A number of audio transport technologies exist. Some of these technologies implement data transport by exchanging OSI/ISO layer 2 data frames, while others transport data within OSI/ISO layer 3 packets. There are some approaches to achieving interoperability between devices that utilize different audio transport technologies. A digital audio device typically implements an audio control protocol, which enables it to process configuration and control messages from a remote controller. An audio control protocol also defines the structure of the messages that are exchanged between compliant devices. There is currently a wide range of audio control protocols. Some audio control protocols utilize layer 3 audio transport technology, while others utilize layer 2 audio transport technology. An audio device can only communicate with other devices that implement the same control protocol, even when a common transport technology connects the devices. The existence of different audio control protocols among devices on a network results in a situation where the devices are unable to communicate with each other. Furthermore, a single control application is unable to establish or destroy audio stream connections between the networked devices, since they implement different control protocols. 
When an audio engineer is designing an audio network installation, this interoperability challenge restricts the choice of devices that can be included. Even when audio transport interoperability has been achieved, common control of the devices remains a challenge. This research investigates protocol command translation as a means to enable interoperability between networked audio devices that implement different audio control protocols. It proposes the use of a command translator that is capable of receiving messages conforming to one protocol from any of the networked devices, translating the received message to conform to a different control protocol, then transmitting the translated message to the intended target which understands the translated protocol message. In so doing, the command translator enables common control of the networked devices, since a control application is able to configure and control devices that conform to different protocols by utilizing the command translator to perform appropriate protocol translation.
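The translation step described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the protocol names `proto_a` and `proto_b`, the commands `connect` and `make_connection`, the device identifiers, and the `CommandTranslator` class itself. None of it is taken from the thesis or from any real audio control protocol.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Message:
    """A control message tagged with the (hypothetical) protocol it conforms to."""
    protocol: str
    command: str
    params: dict = field(default_factory=dict)


# A rule converts one message into its equivalent in another protocol.
Rule = Callable[[Message], Message]


class CommandTranslator:
    """Receives a message in one protocol and re-expresses it for the target device."""

    def __init__(self) -> None:
        self._device_protocol: Dict[str, str] = {}
        self._rules: Dict[Tuple[str, str, str], Rule] = {}

    def register_device(self, device_id: str, protocol: str) -> None:
        # Record which control protocol each networked device understands.
        self._device_protocol[device_id] = protocol

    def register_rule(self, source: str, target: str, command: str, rule: Rule) -> None:
        # One rule per (source protocol, target protocol, command) triple.
        self._rules[(source, target, command)] = rule

    def translate_for(self, device_id: str, message: Message) -> Message:
        target = self._device_protocol[device_id]
        if message.protocol == target:
            return message  # the device already understands this message
        key = (message.protocol, target, message.command)
        if key not in self._rules:
            raise LookupError(f"no translation rule for {key}")
        return self._rules[key](message)


def connect_a_to_b(msg: Message) -> Message:
    # Hypothetical mapping: proto_a "connect" becomes proto_b "make_connection".
    return Message("proto_b", "make_connection",
                   {"source": msg.params["src"], "sink": msg.params["dst"]})


translator = CommandTranslator()
translator.register_device("amp-1", "proto_b")
translator.register_rule("proto_a", "proto_b", "connect", connect_a_to_b)

request = Message("proto_a", "connect", {"src": "mic-1", "dst": "amp-1"})
translated = translator.translate_for("amp-1", request)
print(translated)
```

A controller that speaks only `proto_a` could send every command through `translate_for` and forward the result, which is the sense in which a translator gives a single control application reach over devices that implement different protocols.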
|
4 |
An investigation of the XMOS XS1 architecture as a platform for development of audio control standards
Dibley, James, January 2014
This thesis investigates the feasibility of using a new microcontroller architecture, the XMOS XS1, in the research and development of control standards for audio distribution networks. This investigation is conducted in the context of an emerging audio distribution network standard, Ethernet Audio/Video Bridging ('Ethernet AVB'), and an emerging audio control standard, AES-64. The thesis describes these emerging standards, the XMOS XS1 architecture (including its associated programming language, XC), and the open-source implementation of an Ethernet AVB streaming audio device based on the XMOS XS1 architecture. It is shown how the XMOS XS1 architecture and its associated features, focusing on the XC language's mechanisms for concurrency, event-driven programming, and integration of C software modules, enable a powerful implementation of the AES-64 control standard. Feasibility is demonstrated by the implementation of an AES-64 protocol stack and its integration into an XMOS XS1-based Ethernet AVB streaming audio device, providing control of Ethernet AVB features and audio hardware, as well as implementations of advanced AES-64 control mechanisms. It is demonstrated that the XMOS XS1 architecture is a compelling platform for the development of audio control standards, and has enabled the implementation of AES-64 connection management and control over standards-compliant Ethernet AVB streaming audio devices where no such implementation previously existed. The research additionally describes a linear design method for applications based on the XMOS XS1 architecture, and provides a baseline implementation reference for the AES-64 control standard where none previously existed.
|
5 |
An investigation into the control of audio streaming across networks having diverse quality of service mechanisms
Foulkes, Philip James, January 2012
The transmission of realtime audio data across digital networks is subject to strict quality of service requirements. These networks need to be able to guarantee network resources (e.g., bandwidth), ensure timely and deterministic data delivery, and provide time synchronisation mechanisms to ensure successful transmission of this data. Two open standards-based networking technologies, namely IEEE 1394 and the recently standardised Ethernet AVB, provide distinct methods for achieving these goals. Audio devices that are compatible with IEEE 1394 networks exist, and audio devices that are compatible with Ethernet AVB networks are starting to come onto the market. There is a need for mechanisms to provide compatibility between the audio devices that reside on these disparate networks such that existing IEEE 1394 audio devices are able to communicate with Ethernet AVB audio devices, and vice versa. The audio devices that reside on these networks may be remotely controlled by a diverse set of incompatible command and control protocols. It is desirable to have a common network-neutral method of control over the various parameters of the devices that reside on these networks. As part of this study, two Ethernet AVB systems were developed. One system acts as an Ethernet AVB audio endpoint device and another system acts as an audio gateway between IEEE 1394 and Ethernet AVB networks. These systems, along with existing IEEE 1394 audio devices, were used to demonstrate the ability to transfer audio data between the networking technologies. Each of the devices is remotely controllable via a network neutral command and control protocol, XFN. The IEEE 1394 and Ethernet AVB devices are used to demonstrate the use of the XFN protocol to allow for network neutral connection management to take place between IEEE 1394 and Ethernet AVB networks. 
User control over these diverse devices is achieved via the use of a graphical patchbay application, which aims to provide a consistent user interface to a diverse range of devices.
|
6 |
串流式音訊分類於智慧家庭之應用 / Streaming audio classification for smart home environments
溫景堯 (Wen, Jing Yao), date unknown
Humans receive sounds such as language and music through audition, and audition and vision are regarded as the two most important aspects of human perception. Computational auditory scene analysis (CASA) defines a possible direction for closing the gap between computerized audition and human perception, using the correlation between the characteristics of the ear and mental perception described in the psychology of hearing. In this research, we develop and integrate methods for real-time streaming audio classification in smart home environments, based on the principles of the psychology of hearing and on techniques from image processing and pattern recognition.
The research has three major parts. The first is audio processing, which converts environmental sounds into signals that a computer can process and enhance. The second uses the principles of CASA to design a framework for audio signal description and event detection by means of computer vision and image processing techniques. The third defines a distance between image feature vectors and uses a K-Nearest Neighbor (KNN) classifier to recognize and classify audio events in real time. Experimental results show that the proposed approach is quite effective, achieving an overall recognition rate of 80-90% for eight types of sound common in home environments. In the presence of noise or other interfering sounds, the recognition rate remains at around 70%.
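The third part of the abstract, classifying a feature vector by its nearest labelled neighbours, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which distance the thesis defines, so Euclidean distance is used as a stand-in, and the two-dimensional feature vectors and the labels `doorbell` and `running_water` are invented for the example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    # Stand-in distance; the thesis defines its own distance on image features.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query: Sequence[float], training: List[Sample], k: int = 3) -> str:
    # Take the k training samples closest to the query and return the majority label.
    neighbours = sorted(training, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training set with made-up feature values and labels.
training_set: List[Sample] = [
    ((1.0, 0.0), "doorbell"),
    ((0.9, 0.2), "doorbell"),
    ((0.8, 0.1), "doorbell"),
    ((0.1, 0.9), "running_water"),
    ((0.2, 1.0), "running_water"),
    ((0.0, 0.8), "running_water"),
]

print(knn_classify((0.9, 0.1), training_set, k=3))
```

KNN needs no training phase beyond storing labelled examples, which is one reason it suits a setting where new household sounds may be added over time; the cost is that each query is compared against every stored sample.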
|
7 |
Bluetooth audio and video streaming on the J2ME platform
Sahd, Curtis Lee, 09 September 2010
With the increase in bandwidth, more widespread distribution of media, and increased capability of mobile devices, multimedia streaming has become not only feasible but also more economical in terms of the space occupied by the media file and the costs involved in attaining it. Although much attention has been paid to peer-to-peer media streaming over the Internet using HTTP and RTSP, little research has focussed on the use of the Bluetooth protocol for streaming audio and video between mobile devices. This project investigates the feasibility of Bluetooth as a protocol for audio and video streaming between mobile phones using the J2ME platform, through the analysis of Bluetooth protocols, media formats, optimum packet sizes, and the effects of distance on transfer speed. A comparison was made between RFCOMM and L2CAP to determine which protocol could support the fastest transfer speed between two mobile devices. The L2CAP protocol proved to be the more suitable, providing average transfer rates of 136.17 KBps. Using this protocol, a second experiment was undertaken to determine the most suitable media format for streaming in terms of file size, bandwidth usage, quality, and ease of implementation. Of the eight media formats investigated, the MP3 format provided the smallest file size, smallest bandwidth usage, best quality and highest ease of implementation. Another experiment was conducted to determine the optimum packet size for transfer between devices. A tradeoff was found between packet size and the quality of the sound file, with the highest transfer rates being recorded with the MTU size of 668 bytes (136.58 KBps). The class of Bluetooth transmitter typically used in mobile devices (class 2) is considered to produce a weak signal that is adversely affected by distance. As such, the final investigation was aimed at determining the effects of distance on audio streaming and playback. 
As can be expected, when devices were situated close to each other, the transfer speeds obtained were higher than when devices were far apart. Readings were taken at varying distances (1-15 metres), with erratic transfer speeds observed from 7 metres onwards. This research showed that audio streaming on the J2ME platform is feasible; however, with the currently available class of Bluetooth transmitter, video streaming is not feasible. Video files were only playable once the entire media file had been transferred.
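To give a feel for the figures quoted in this abstract, the following sketch estimates how many packets and how much time a transfer would take at the reported L2CAP rate and MTU. It rests on assumptions the abstract does not confirm: that KBps means kilobytes per second with 1 KB = 1024 bytes, that every packet carries a full MTU-sized payload, and that protocol overhead can be ignored. The function names are invented for this example.

```python
import math

L2CAP_RATE_KBPS = 136.17  # average L2CAP transfer rate reported in the abstract
MTU_BYTES = 668           # MTU size reported as giving the highest transfer rate


def packets_needed(file_size_bytes: int, mtu_bytes: int = MTU_BYTES) -> int:
    # Assumes each packet carries a full MTU-sized payload except possibly the last.
    return math.ceil(file_size_bytes / mtu_bytes)


def transfer_seconds(file_size_bytes: float,
                     rate_kbps: float = L2CAP_RATE_KBPS,
                     bytes_per_kb: int = 1024) -> float:
    # Assumes KBps means kilobytes per second; overhead and retransmission are ignored.
    return file_size_bytes / (rate_kbps * bytes_per_kb)


size = 3_000_000  # a hypothetical 3 MB audio file
print(packets_needed(size), round(transfer_seconds(size), 1))
```

Under these assumptions a file of a few megabytes transfers in tens of seconds, which is consistent with the abstract's conclusion that audio streaming is feasible at this rate.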
|
8 |
Symbiotic Audio Communication on Interactive Transport
Olaleye, Olufunke I., 01 May 2007
No description available.
|
9 |
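Kaufst du noch oder streamst du schon? Der Einfluss von Musik Streaming Diensten auf den Kauf von Musikdateien und Musikdatenträgern [Still buying or already streaming? The influence of music streaming services on the purchase of music files and physical music media]
Liese, Christin, 18 May 2015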
The days of the record collection are over, cassettes and CDs have given way to the MP3 file, and music is now exclusively streamed. This future scenario has not yet come to pass, but will it happen at all? Will the purchase of physical music media and digital music files cease completely as streaming activity keeps rising? Or can both forms coexist? To get to the bottom of these questions, a survey of 1,661 students at the Technische Universität Dresden was conducted for this thesis. The results shed light on how frequently free and paid streaming providers, CDs and vinyl records, and MP3 music files are used. They also show that the students' willingness to pay is low. Students already rarely invest more than €5 in music, and since they began using streaming services they report spending even less on music than before. Set against this negative trend is the finding that the respondents have illegally downloaded less music since they began using streaming services. Although the majority buy less music, about half of all respondents consider it very important to own music, especially in physical form. In addition, the motives for using each way of listening to music were recorded in order to show their strengths and weaknesses. The results make clear that the free variant of streaming is used frequently, yet traditional music media and music files still enjoy great popularity. A complete displacement of music purchasing therefore cannot be assumed.
Contents:
1. Introduction and statement of relevance
2. Music streaming services
2.1 Definition of terms
2.2 Technological aspects
2.3 Legal aspects
2.4 Economic aspects
3. Music consumption in upheaval
3.1 Music consumption in transition
3.1.1 Advancing digitalisation
3.1.2 Current music usage
3.2 The German music industry: usage, unit sales and revenue
3.2.1 Current unit sales and revenue figures
3.2.2 Two future scenarios
4. Music streaming services in the focus of research
4.1 Current studies on music streaming
4.2 Digital natives as a target group
5. The research project
5.1 Derivation of the research questions and hypotheses
5.2 Survey method
5.3 Target group definition and population
5.4 The online survey
5.4.1 Structure and execution
5.4.2 Description of the sample
6. Presentation and evaluation
6.1 The use of music as a stream, physical medium and digital medium
6.2 Influences of music streaming services on purchasing behaviour
6.3 Willingness to pay for music
6.4 Motives for using the four options for listening to music
7. Discussion
7.1 Critique and interpretation of the results
7.2 An outlook on the future
8. References
9. Appendix
A. Questionnaire
B. Email cover letter to all TU Dresden students
C. Further tables
|