Digitizing Sound Archives at Royal Library of Belgium: Challenges and difficulties encountered during a major digitization project
Lemmers, Frédéric, 03 December 2019
Music in general, and recorded music in particular, is rarely a priority in libraries’ digitization policies, even though wax cylinders and 78 rpm discs urgently need to be digitized for both preservation and accessibility. Respecting the original recording technique during digitization ensures the scientific and artistic credibility of the digitized sources. In 2016 the Royal Library of Belgium began digitizing its entire collection of 78 rpm discs. Carried out by a subcontractor, the project will produce about 4,000 hours of recordings and will constitute a substantial corpus of sources for the emerging needs of digital musicology.
Convolutional neural networks applied to the emulation of the psychoacoustic model for MPEG-1, Layer I audio signal encoders
Sanchez Huapaya, Alonso Sebastián; Serpa Pinillos, Sergio André, 26 August 2020
Manuscript submission request for a scientific article. / This work proposes four alternative encoders inspired by the MPEG-1 Layer I encoder described in the ISO/IEC 11172-3 standard. The problem addressed is the need to define a psychoacoustic model explicitly in order to encode audio; the proposals replace that model with neural networks. All of them are based on multiscale convolutional neural networks (MCNN) that emulate psychoacoustic model 1 of that encoder. Each network has 32 inputs, corresponding to the sound pressure level (SPL) of the 32 subbands, and a single output, corresponding to one of the 32 subbands of either the signal-to-mask ratio (SMR) or the bit allocation vector. An encoder is therefore composed of a set of 32 neural networks. Validation used the first 10 seconds of 15 randomly chosen songs from 10 different musical genres. The audio quality of each proposed encoder was compared with that of the MPEG-1 Layer I encoder using the ODG (objective difference grade) metric. The encoder whose input is the SPL and whose output is the SMR, proposed by Guillermo Kemper, gave the best results at 96 kbps and 192 kbps. The encoder named “SBU1” gave the best results at 128 kbps. / Thesis
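The input/output layout described in this abstract (32 SPL values in, one value out, 32 networks per encoder) can be sketched as follows. This is only a structural illustration: `make_stand_in_network` is a hypothetical placeholder that uses a random linear combination, not the multiscale CNN of the thesis, and the SPL values are made up.

```python
import random

NUM_SUBBANDS = 32  # MPEG-1 Layer I splits the signal into 32 subbands


def make_stand_in_network(seed):
    """Return a hypothetical stand-in for one trained network.

    It maps a 32-value SPL vector to a single number with a fixed
    random linear combination; the thesis uses a multiscale CNN here.
    """
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(NUM_SUBBANDS)]

    def network(spl):
        if len(spl) != NUM_SUBBANDS:
            raise ValueError("expected one SPL value per subband")
        return sum(w * x for w, x in zip(weights, spl))

    return network


# One network per output subband; together they form one encoder model.
networks = [make_stand_in_network(seed) for seed in range(NUM_SUBBANDS)]


def predict_smr(spl):
    """Assemble the 32-value SMR vector, one network call per subband."""
    return [net(spl) for net in networks]


spl_frame = [60.0] * NUM_SUBBANDS  # made-up SPL values in dB
smr_frame = predict_smr(spl_frame)
```

Keeping one small model per output subband means each model can be trained and inspected independently, at the cost of 32 forward passes per frame.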
Visual and spatial audio mismatching in virtual environments
Garris, Zachary Lawrence, 08 August 2023
This paper explores how vision affects spatial audio perception in virtual reality. We created four virtual environments with different reverb and room sizes, and recorded binaural clicks in each one. We conducted two experiments: one where participants judged the audio-visual match, and another where they pointed to the click direction. We found that vision influences spatial audio perception and that congruent audio-visual cues improve accuracy. We suggest some implications for virtual reality design and evaluation.
Investigating the impact of physical layer transmission for Bluetooth LE Audio
Arponen, Kevin; Björkman, Axel, January 2023
Bluetooth Low Energy (BLE) is a widely used low-energy version of Bluetooth’s wireless protocol. To meet the increasing requirements of modern wireless audio devices, Bluetooth LE Audio was released with its new Low Complexity Communications Codec (LC3), which is much more data efficient than its predecessor, Low Complexity SubBand Coding. Because of this increased data efficiency, LC3 opens the door to exploring various physical layer configurations, especially those with lower data rates. The difference in performance when streaming audio with the uncoded LE 2M and 1M configurations, compared with the LE coded S=2 and S=8 configurations (which have a lower throughput), points to a research gap that this thesis aims to fill. To gather the data necessary to fill this gap, multiple iterations of both software and hardware artefacts were made. The artefacts were designed to run the same Bluetooth version (LE Audio) and to switch between the physical layer configurations. Throughput and current consumption at varied ranges were measured using the artefacts. The results of the experiments show that, for energy optimization, an adaptive scheme would not be beneficial over using only LE 2M. However, an adaptive scheme for the physical layer can be used for LE Audio to improve range and stability. This does, however, come at the cost of increased energy consumption.
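To make the throughput difference between the four physical layer configurations concrete, here is a minimal sketch using their nominal data rates (2 Mbit/s, 1 Mbit/s, 500 kbit/s and 125 kbit/s). The helper names and the `max_duty` budget are hypothetical, and packet overhead (preamble, header, CRC, inter-frame spacing) is deliberately ignored, so the numbers are lower bounds rather than measured values from the thesis.

```python
# Nominal data rates of the four Bluetooth LE PHY configurations, in bit/s.
PHY_RATE_BPS = {
    "LE 2M": 2_000_000,
    "LE 1M": 1_000_000,
    "LE Coded S=2": 500_000,
    "LE Coded S=8": 125_000,
}


def payload_airtime_us(payload_bytes, phy):
    """Microseconds needed to send the payload bits at the nominal rate.

    Simplification: preamble, header, CRC and inter-frame spacing are ignored.
    """
    bits = payload_bytes * 8
    return bits * 1_000_000 / PHY_RATE_BPS[phy]


def audio_fits(audio_bitrate_bps, phy, max_duty=0.5):
    """True if the stream uses at most max_duty of the nominal PHY rate.

    max_duty is a hypothetical budget, not a value from the thesis.
    """
    return audio_bitrate_bps <= max_duty * PHY_RATE_BPS[phy]


for name in PHY_RATE_BPS:
    print(name, payload_airtime_us(100, name), "us for 100 bytes")
```

The longer airtime on the coded configurations is one plausible reason why more range comes with higher energy consumption: the radio stays on longer for the same payload.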
Audio quality degradation can have many causes. In musical applications, such fragmentation may lead to highly unpleasant experiences. Restoration algorithms may be employed to reconstruct missing parts of the audio, much as in image reconstruction, in an approach called audio inpainting. Current state-of-the-art methods for audio inpainting cover limited scenarios, with well-defined gap windows and little variety of musical genres. In this work, we propose a deep-learning-based (DL-based) method for audio inpainting, accompanied by a dataset with random fragmentation conditions that approximate real impairment situations. The dataset was collected using tracks from different musical genres to provide good signal variability. Our best model improved the quality of all musical genres, obtaining an average PSNR of 13.1 dB, although it worked better for genres in which acoustic instruments are predominant.
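For readers unfamiliar with the PSNR figure quoted in this abstract, the following is a minimal sketch of how peak signal-to-noise ratio can be computed between a reference signal and a restored one. The function name and the assumption that samples are floats normalized to [-1, 1] (so `peak=1.0`) are illustrative; the abstract does not say how the authors normalized their audio.

```python
import math


def psnr_db(reference, restored, peak=1.0):
    """PSNR in dB between two equal-length sample sequences.

    peak is the largest possible sample magnitude (1.0 for float audio
    normalized to [-1, 1]); this normalization is an assumption.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("signals must be non-empty and equally long")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# A constant error of 0.1 on a full-scale signal gives roughly 20 dB.
print(psnr_db([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]))
```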
Automatic Maximum Sound Pressure Level (SPL) Measurements Inside Cars
Dong, Luyao, January 2023
With growing consumer interest in technical specifications, there is a need for accessible measurement tools that let individuals evaluate the performance of their equipment, including common speakers and car audio systems, beyond what the manufacturer provides. Existing measurement systems, however, are often geared towards professionals. This thesis aims to address this gap by designing and developing a user-friendly measurement tool that lets individuals easily measure and evaluate the performance of their devices. The work started by identifying the key technical specifications that users are interested in, and three parameters were selected for estimation: the maximum sound pressure level the system can provide, the corresponding multi-tone distortion, and the total harmonic distortion. The measurement method differs for each parameter, particularly in the choice of test stimuli and data processing. The methods in this thesis were chosen after comparing existing standards for acoustical-output-based measurement. Some problems of measurement capability and accuracy that arise when implementing measurements within the defined application scenarios are also discussed. Ideally, the tool gives users detailed insight into the chosen technical specifications, allowing them to know their audio systems better and make informed decisions. The automatic control of playback and recording, as well as the subsequent processing, was implemented in Python with the help of existing packages. A graphical user interface based on PyQt was also developed to make the measurement easier to operate. The intended functionality of the tool is thus initially fulfilled, although its accuracy needs further verification and improvement, and the scope of the tool can be extended.
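Two of the quantities named in this abstract have simple textbook definitions, sketched below in plain Python. This is not the thesis's implementation: the function names are hypothetical, the SPL helper assumes a calibrated RMS pressure in pascals is already available, and the THD helper assumes the recording contains a whole number of periods of the test tone.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # standard reference pressure for SPL in air


def spl_db(p_rms_pa):
    """Sound pressure level in dB re 20 uPa from an RMS pressure in pascals."""
    return 20.0 * math.log10(p_rms_pa / REFERENCE_PRESSURE_PA)


def tone_amplitude(samples, freq_hz, sample_rate):
    """Amplitude of one frequency component via a single-bin DFT."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def thd(samples, fundamental_hz, sample_rate, num_harmonics=5):
    """Ratio of the RMS sum of harmonics 2..num_harmonics+1 to the fundamental."""
    fund = tone_amplitude(samples, fundamental_hz, sample_rate)
    harmonics = [tone_amplitude(samples, k * fundamental_hz, sample_rate)
                 for k in range(2, num_harmonics + 2)]
    return math.sqrt(sum(h * h for h in harmonics)) / fund


# Synthetic check: a 100 Hz tone with a second harmonic at 10 % amplitude.
RATE = 8000
signal = [math.sin(2 * math.pi * 100 * i / RATE)
          + 0.1 * math.sin(2 * math.pi * 200 * i / RATE)
          for i in range(800)]
measured_thd = thd(signal, 100, RATE)
```

In a real in-car measurement the harmonics must stay below half the sample rate, and windowing or synchronous sampling would be needed when the tone does not fit the recording length exactly.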
A NEW APPROACH AND GUIDELINE FOR LOUDNESS IN GAME AUDIO: Developing Specific Loudness Standards for Each Section of Game Audio
Wang, Leshan, January 2023
Audio plays a crucial role in the immersive and emotional experience of playing video games, and loudness is a key aspect of game audio that can greatly affect player engagement and immersion. Existing loudness standards include EBU R128, which is most commonly used by social media platforms such as YouTube and Spotify. In addition, Sony has developed the Sony Computer Entertainment America (SCEA) Loudness Standard for maintaining consistent and balanced loudness levels in game audio. However, there is a need for more section-specific standards that consider the unique requirements of different genres and elements of game audio. This thesis proposes a new approach to loudness standardization for game audio by analyzing existing standards, identifying their limitations, and evaluating the impact of loudness on the gaming experience. The results have implications for game developers and audio designers, potentially enhancing player immersion and engagement.
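As a small illustration of what normalizing to a loudness standard involves, the sketch below computes the gain needed to move a measured integrated loudness to a target. The -23 LUFS default is the programme loudness target of EBU R 128; the per-section targets and measured values in the example are invented purely for illustration and are not figures proposed by the thesis. Measuring loudness itself (K-weighting and gating) is out of scope here.

```python
EBU_R128_TARGET_LUFS = -23.0  # programme loudness target in EBU R 128


def normalization_gain_db(measured_lufs, target_lufs=EBU_R128_TARGET_LUFS):
    """Gain in dB that moves the measured loudness onto the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def section_gains_db(measured_by_section, target_by_section):
    """Per-section gains for hypothetical section-specific targets."""
    return {name: normalization_gain_db(measured_by_section[name],
                                        target_by_section[name])
            for name in measured_by_section}


# Made-up numbers, for illustration only.
targets = {"dialogue": -23.0, "music": -26.0}
measured = {"dialogue": -20.0, "music": -18.0}
gains = section_gains_db(measured, targets)
```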
Virtual Reality (VR) applications with low-latency head tracking require high-quality spatial audio effects. However, the classic VR/game sound approach (interpolating the left and right channels to build a stereo image) cannot properly simulate the acoustics of the real world. Audio research is therefore moving towards 3D spatial audio for a more realistic simulation. In 3D spatial audio, the listener has the sensation that sound comes from a particular direction in 3D space. In other words, the listener can localize a source from audio alone, which gives a more coherent and immersive experience when paired with visual simulation. In this new context, game engines should provide sound designers with a set of specialized 3D spatial audio tools in addition to general-purpose ones. The following common effects are desirable in this type of toolbox: reverberations and reflections, which can be employed in the creation of caverns or churches (places with many echoes); modulation, which can increase the perceived variety of a recurring sound by slightly varying its pitch (as in the sounds of footsteps and gunshots); and mixing and fading of volumes, which can create dramatic moments in storytelling and music reproduction. In this work, we propose a real-time audio engine to spatialize point sound sources in virtual environments. The engine has a documented, open-source architecture that provides a basic set of audio effects and an efficient way to combine them. We implement 3D audio spatialization using databases of recorded head-related impulse responses (HRIRs), and we produce sound effects with digital signal processing (DSP) techniques. Although powerful commercial audio SDKs for Virtual Reality are currently available (e.g. Oculus), our audio engine prototype may be a flexible alternative when adaptation, simplification, testing, and parameter tuning are necessary.
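The core of HRIR-based spatialization mentioned in this abstract is a convolution of the mono source with one impulse response per ear. The sketch below shows that step with a direct time-domain convolution in plain Python; the function names and the toy three-tap HRIR pair are hypothetical, and a real-time engine would more likely use measured HRIRs and a faster FFT-based or partitioned convolution.

```python
def convolve(signal, impulse_response):
    """Direct (time-domain) linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Return (left, right) channels for a mono point source."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIR pair (made up): the right ear hears the source one sample
# later and at half the level, as if the source were on the left.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.5, 0.0]
mono = [1.0, 2.0]
left, right = spatialize(mono, hrir_left, hrir_right)
```

The inter-ear delay and level difference encoded in the two impulse responses are what give the listener the directional cue.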
A Study of the Use of Motion Pictures and Film Strips in the Coaching of Athletics in a Selected Group of Texas High Schools
Workman, Mayfield, 06 1900
The purpose of this study was to make an investigation of the use of motion pictures and film strips in the coaching of athletics in a selected group of high schools in Texas. It was proposed, also, to formulate conclusions as to the practices utilized in connection with these visual aids in the coaching technique and to make recommendations for their use by coaches as outgrowths of the study.
A Study of the Types of Audio-Visual Aids and the Extent of their Use in the Industrial Arts Program in the Junior High Schools of Texas to Formulate a Program for Audio-Visual Aids Based upon Skills, Training and Attitudes of Teachers in Service and Professional Literature
Erwin, William R., Jr., 08 1900
The purpose of this study is to determine the types of audio-visual aids that are being used in the junior high schools of Texas, the extent of their use, the evaluation by teachers in service of various audio-visual aids, the needs recognized by the teachers in service for additional audio-visual aids in the respective schools, and the needs of teachers in service for aid and instructions in the use of such aids. It is hoped that, with this information and additional information from professional literature, conclusions can be drawn and recommendations made that will aid teachers in the Texas junior high schools to use available audio-visual aids to a much greater advantage than is now being evidenced.
