1 |
Automatic Pitch Detection and Shifting of Musical Tones in Real Time
Kim, Jinho, January 2013
Thesis advisor: Sergio Alvarez / Musical notes are acoustic stimuli with specific properties that trigger a psychological perception of pitch. Pitch is directly associated with the fundamental frequency of a sound wave, which is typically the lowest frequency of a periodic waveform. Shifting the perceived pitch of a sound wave is most easily done by changing the playback speed, but this method warps some of the sound's characteristics and changes the time scale. This thesis aims to accurately shift the pitch of musical notes while preserving their other characteristics, and it implements this in real time on an Android device. There are various methods of detecting and shifting pitch, but in the interests of simplicity, accuracy, and speed, a three-step process is used. First, the fundamental pitch of a stable periodic section of the signal is found using the Yin pitch detection algorithm. Second, pitch marks that represent local peaks of energy are found, each spaced roughly one period (the inverse of the fundamental frequency) apart. Lastly, these marks are used in the Pitch Synchronous Overlap and Add (PSOLA) algorithm to generate a new signal with the desired fundamental frequency and acoustical characteristics similar to those of the original signal. / Thesis (BS), Boston College, 2013. / Submitted to: Boston College. College of Arts and Sciences. / Discipline: Computer Science Honors Program. / Discipline: Computer Science.
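The first step described above, Yin-style pitch detection, can be sketched roughly as follows. This is a simplified illustration and not the thesis's implementation: it omits the parabolic interpolation of the published algorithm, and the function name `yin_pitch`, the search range (`f_min`, `f_max`), and the `threshold` value are assumptions chosen for the example.

```python
import math


def yin_pitch(signal, sample_rate, f_min=80.0, f_max=1000.0, threshold=0.1):
    """Estimate the fundamental frequency (Hz) with a simplified YIN-style search.

    All parameter defaults are illustrative assumptions.
    """
    tau_min = max(1, int(sample_rate / f_max))
    tau_max = int(sample_rate / f_min)
    window = len(signal) - tau_max  # samples compared at every lag
    if window <= 0:
        raise ValueError("signal too short for the requested f_min")

    # Difference function: d(tau) = sum_j (x[j] - x[j + tau])^2
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((signal[j] - signal[j + tau]) ** 2 for j in range(window))

    # Cumulative mean normalized difference: d(tau) divided by the mean of d(1..tau)
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Take the first lag that dips below the threshold, then descend to its local minimum
    for tau in range(tau_min, tau_max):
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau

    # No dip found: fall back to the global minimum in the search range
    best = min(range(tau_min, tau_max + 1), key=lambda t: cmnd[t])
    return sample_rate / best
```

For a clean 200 Hz sine sampled at 8000 Hz, the period is 40 samples, so the search settles on a lag of 40. The pitch-mark and PSOLA steps would consume this period estimate and are not shown here.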
|
2 |
Multiple Fundamental Frequency Pitch Detection for Real Time MIDI Applications
Hilbish, Nathan, 18 July 2012
This study aimed to develop a real-time multiple fundamental frequency detection algorithm for pitch-to-MIDI conversion applications. The algorithm described here uses neural network classifiers to define a chord pattern (a combination of multiple fundamental frequencies). The first classification uses a binary decision tree that determines the root note (first note) in a combination of notes; this is achieved through a neural network binary classifier. For each leaf of the binary tree, each classifier determines the frequency group of the root note (low or high frequency) until only two frequencies are left to choose from. The second classifier determines the amount of polyphony, or number of notes played. This classifier is designed in the same fashion as the first, using a binary tree made up of neural network classifiers. The third classifier classifies the chord pattern that has been played. The chord classifier is chosen based on the root note and amount of polyphony: the first two classifiers constrain the third classifier to chords containing only a specific root note and a set polyphony. This allows the classifier to be more focused and more accurate. To further increase accuracy, an error correction scheme was devised based on repetitive coding, a technique that holds out multiple frames and compares them in order to detect and correct errors. Repetitive coding significantly increases the classifier's accuracy. Holding out three frames was found to be suitable for real-time operation in terms of throughput; holding out more frames further increases accuracy but was not suitable for real-time operation. The algorithm was tested on a common embedded platform, and benchmarking showed that it was well suited for real-time operation.
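The error-correction idea, holding out several frames and comparing them, can be illustrated with a simple majority vote. This is a hedged sketch: the abstract does not say how the held frames are compared, so the voting rule, the function names, and the chord labels below are all assumptions.

```python
from collections import Counter


def majority_vote(frame_labels):
    """Return the label held by a strict majority of frames, or None if there is none."""
    label, count = Counter(frame_labels).most_common(1)[0]
    return label if count > len(frame_labels) / 2 else None


def correct_stream(labels, hold=3):
    """Split a stream of per-frame labels into blocks of `hold` frames and vote on each.

    hold=3 mirrors the three held-out frames mentioned in the abstract; the
    non-overlapping block layout is an assumption made for this example.
    """
    return [
        majority_vote(labels[i:i + hold])
        for i in range(0, len(labels) - hold + 1, hold)
    ]
```

With `hold=3`, a single misclassified frame in a block is outvoted by the other two, at the cost of two extra frames of latency per decision. This matches the throughput trade-off the abstract describes.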
|
3 |
Packet Loss Concealment in Voice Over Internet
Gokhale, Rishikesh S, 31 July 2003
Traditional telephony networks, with their cumbersome and costly infrastructures, are being replaced by voice transmitted over the Internet. The Internet is a very commonly used technology that was traditionally used to transmit data. With the availability of large bandwidth and high data rates, the transmission of data, voice and video over the Internet is gaining popularity. Voice is a real-time application, and the biggest problem it faces is the loss of packets due to network congestion. The Internet implements protocols to detect and retransmit lost packets. For a real-time application, however, a retransmitted intermediate packet arrives too late to be useful, so the lost packet must be reconstructed instead. Good reconstruction techniques are therefore being researched. This thesis reports a new concealment algorithm to reconstruct lost voice packets. The algorithm is receiver-based, and it relies on time-scale modification of speech and the autocorrelation of the speech signal. The new technique is named the Modified Waveform Similarity Overlap-Add (WSOLA) technique. All simulations were performed in MATLAB.
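The waveform-similarity idea behind receiver-based concealment can be sketched as follows. This is a generic illustration and not the thesis's Modified WSOLA algorithm: it only searches the received history for the segment most similar to the last samples before the gap and copies what followed it. The overlap-add crossfade and the time-scale modification are omitted, and all function and parameter names are hypothetical.

```python
import math


def best_match_offset(history, template, first, last):
    """Offset in [first, last] whose segment best matches `template`
    by normalized cross-correlation."""
    n = len(template)
    t_energy = sum(v * v for v in template)
    best_offset, best_score = first, -float("inf")
    for off in range(first, last + 1):
        seg = history[off:off + n]
        dot = sum(a * b for a, b in zip(seg, template))
        denom = math.sqrt(sum(a * a for a in seg) * t_energy)
        score = dot / denom if denom > 0 else 0.0
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset


def conceal_gap(history, lost_len, template_len, min_lag, max_lag):
    """Fill a lost packet of `lost_len` samples from the received `history`."""
    n = len(history)
    template = history[n - template_len:]      # last samples before the gap
    first = max(0, n - template_len - max_lag)
    last = n - template_len - min_lag          # min_lag avoids the trivial self-match
    if last < first:
        raise ValueError("history too short for the requested lags")
    offset = best_match_offset(history, template, first, last)
    source = history[offset + template_len:]   # samples that followed the match
    # Repeat the source cyclically if the gap is longer than what followed the match
    return [source[i % len(source)] for i in range(lost_len)]
```

For a strictly periodic signal the copied samples continue the waveform exactly. On real speech, a crossfade at the boundaries would be needed to avoid discontinuities.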
|
4 |
Real-time monitoring of voice characteristics using accelerometer and microphone measurements
Virebrand, Marcus, January 2011
VoxLog is a portable voice accumulator that collects data using both an accelerometer, which measures skin vibrations, and a regular microphone. The goal of the thesis was to implement and evaluate methods that use this data to estimate three voice parameters: fundamental frequency, phonation and sound pressure level. For pitch, three different methods were evaluated. All of them require relatively low computational power, since the goal was to implement at least one of them on the digital signal processor in the VoxLog. The results of these evaluations show that the best pitch estimates were made with an FFT-based approach that uses phase information to obtain an estimate with high frequency resolution. Phonation is estimated with an energy-based voice activity detection method. This estimate is then used to choose when sound pressure level should be estimated. Here, one of the main problems was to distinguish between when sound pressure level should be estimated for the wearer of the VoxLog and when it should be estimated for the background noise. This was solved by implementing a time window before and after phonation in which neither is estimated. For both pitch and sound pressure level, a feedback function was implemented. The feedback is given to the user via vibrations in the VoxLog when estimated parameters break set limits on pitch or sound pressure level.
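The energy-based voice activity detection and the guard window around phonation can be sketched as below. This is an illustration under assumptions: the VoxLog's actual thresholds, frame sizes, and window lengths are not given in the abstract, so `threshold`, `guard`, and the label names are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def label_frames(energies, threshold, guard=2):
    """Label each frame 'voice', 'noise', or 'guard'.

    'voice' frames would feed the wearer's sound-pressure-level estimate and
    'noise' frames the background estimate. Frames within `guard` frames of
    any voiced frame are labeled 'guard' and feed neither estimate.
    """
    voiced = [e > threshold for e in energies]
    labels = []
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            labels.append("voice")
        else:
            lo = max(0, i - guard)
            hi = min(len(voiced), i + guard + 1)
            labels.append("guard" if any(voiced[lo:hi]) else "noise")
    return labels
```

The guard frames discard the onset and decay of phonation, where the measured level belongs cleanly to neither the voice nor the background.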
|
5 |
A Design of Speech Recognition System for the Mandarin Toponyms
Wei, Hong-jhang, 31 August 2006
In this thesis, a Mandarin toponym speech recognition system is developed using MFCC, LPC and HMM under Red Hat Linux 9.0. The system uses monosyllable HMMs to select the initial toponym candidates, and the final classification result is obtained through further pitch identification mechanisms. In the speaker-dependent case, a recognition rate of approximately 90% is achieved, and recognition completes within 1.5 seconds on average.
|
6 |
Language Independent Speech Visualization
Braunisch, Jan, January 2011
A speech visualization system is proposed that could be used by a deaf person for understanding speech. Several novel techniques are proposed, including: (1) Minimizing spectral leakage in the Fourier transform by using a variable-length window. (2) Making use of the fact that there is no spectral leakage in order to calculate how much of the energy of the speech signal is due to its periodic component vs. its nonperiodic component. (3) Modelling the mouth and lips as a band-pass filter and estimating the central frequency and bandwidth of this filter in order to assign colours to unvoiced speech sounds.
|
7 |
Αυτόματο σύστημα εκμάθησης μουσικών οργάνων / Automatic Musical Instrument Learning System
Κομπογιάννης, Ηλίας, 30 December 2014
The purpose of this thesis is the construction of a musical-instrument learning system. Specifically, the guitar was studied. The system was built in Matlab: it takes the original music track and the track played by the student and compares the two. Some preliminary steps are needed before the comparison can be made. First, we identify the times at which the notes are played, that is, we find the onset points. Then, we determine which note is played at each of those times by finding the fundamental frequency with the Harmonic Product Spectrum method. Finally, we define the comparison criteria and the results to be reported.
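The Harmonic Product Spectrum step can be sketched as follows. This is a minimal illustration, not the thesis's Matlab code: it uses a naive DFT to stay dependency-free, applies no window function, and the number of harmonics and the function names are assumptions.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitudes of the first n/2 DFT bins (naive O(n^2) transform)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2)
    ]


def hps_pitch(signal, sample_rate, harmonics=3):
    """Estimate the fundamental as the bin k maximizing |X[k]| * |X[2k]| * ... * |X[hk]|."""
    n = len(signal)
    mags = magnitude_spectrum(signal)
    limit = len(mags) // harmonics  # keeps k * harmonics inside the spectrum
    best_bin, best_val = 1, -1.0
    for k in range(1, limit):
        product = 1.0
        for h in range(1, harmonics + 1):
            product *= mags[k * h]
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * sample_rate / n
```

Because the product rewards bins whose multiples also carry energy, the fundamental is found even when a higher harmonic is the strongest spectral peak, which is common for plucked guitar strings.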
|
8 |
Strojový přepis kytarových melodií do tabulatury / Computer Aided Transformation of Guitar Solos from Recorded Song to Tabs
Joščák, Juraj, January 2019
The aim of this thesis was automatic pitch detection in melodic guitar lines and subsequent transcription to guitar tablature. The final system uses comb filtering to detect pitch. Individual notes are separated by beat detection. An algorithm for transcribing notes to guitar tablature, based on minimization of hand movement, is proposed.
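One way to frame tablature transcription as minimization of hand movement is a shortest-path search over candidate string and fret positions. The sketch below is an assumption-laden illustration, not the thesis's algorithm: it takes notes as MIDI numbers, assumes standard tuning and a 20-fret neck, and uses the absolute fret difference between consecutive notes as a stand-in cost for hand movement.

```python
OPEN_STRINGS = [40, 45, 50, 55, 59, 64]  # standard tuning E2 A2 D3 G3 B3 E4 as MIDI notes
MAX_FRET = 20                            # assumed neck length


def positions(midi_note):
    """All (string_index, fret) pairs that produce the given MIDI note."""
    return [
        (s, midi_note - open_note)
        for s, open_note in enumerate(OPEN_STRINGS)
        if 0 <= midi_note - open_note <= MAX_FRET
    ]


def to_tab(notes):
    """Choose one position per note so that total fret movement is minimal."""
    if not notes:
        return []
    layers = [positions(n) for n in notes]
    if any(not layer for layer in layers):
        raise ValueError("a note is outside the playable range")
    cost = [0] * len(layers[0])  # best cost of ending at each position of layer 0
    back = []                    # back-pointers for each later layer
    for i in range(1, len(layers)):
        prev = layers[i - 1]
        new_cost, pointers = [], []
        for _, fret in layers[i]:
            moves = [cost[j] + abs(fret - prev[j][1]) for j in range(len(prev))]
            j_best = min(range(len(prev)), key=lambda j: moves[j])
            new_cost.append(moves[j_best])
            pointers.append(j_best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last note to the first
    j = min(range(len(cost)), key=lambda j: cost[j])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [layers[i][j] for i, j in enumerate(path)]
```

A more realistic cost would also penalize string skips and reward open strings, but the dynamic-programming structure stays the same.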
|
9 |
Pitcher: An automatic guitar tuner / Automatisk gitarrstämmare
Andersson, Hannes; Sjöberg, John, January 2021
Pitcher is a prototype that lets inexperienced guitar players tune their guitars without any prior knowledge. This thesis explores how the construction differs between a DC motor and a stepper motor, how reliable the tuner is, and how long it takes to tune the guitar. The tuner captures sound with a microphone and calculates the current frequency of the string with YIN autocorrelation. Based on that frequency, a control-system regulator determines the speed and direction of a motor that turns the tuning peg; this is repeated until the string is in tune. Thirty tests were conducted from different starting frequencies, measuring the time the tuner took to find the right pitch and the string's resulting frequency. Some of the measurements were a couple of Hz off pitch, and only about half of the measured frequencies fell within the interval where there is no noticeable difference in pitch, so the tuner could not be considered reliable. The time it takes to tune the guitar depends on how far off pitch the string is, and it does not grow linearly with the starting frequency: it increases faster the further off pitch the string is. The tuner is portable; to use it, one hand holds it on the tuning peg while the other hand plucks the string.
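The regulator step, turning a measured frequency into a motor speed and direction, can be sketched with a simple proportional controller on the pitch error in cents. The abstract does not specify the regulator type, so the proportional law, the gain, the tolerance, and the sign convention below are all assumptions.

```python
import math


def cents_off(measured_hz, target_hz):
    """Pitch error in cents (1200 cents per octave); positive means sharp."""
    return 1200.0 * math.log2(measured_hz / target_hz)


def motor_command(measured_hz, target_hz, gain=0.02, tolerance_cents=5.0, max_speed=1.0):
    """Return a signed motor speed in [-max_speed, max_speed].

    Positive values are taken to mean 'tighten the string' (raise the pitch),
    negative values 'loosen'. Returns 0.0 once the string is within tolerance.
    gain, tolerance_cents and max_speed are illustrative values.
    """
    error = cents_off(measured_hz, target_hz)
    if abs(error) <= tolerance_cents:
        return 0.0
    return max(-max_speed, min(max_speed, -gain * error))
```

In a tuning loop, the frequency would be re-measured after each motor step and `motor_command` called again until it returns 0.0.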
|
10 |
Investigation Of The Significance Of Periodicity Information In Speaker Identification
Gursoy, Secil, 01 April 2008
In this thesis, general feature selection methods, and especially the use of periodicity and aperiodicity information in the speaker verification task, are investigated. A software system is constructed to obtain periodicity and aperiodicity information from speech. This information is obtained by using a 16-channel filterbank and analyzing the channel outputs frame by frame according to the pitch of that frame. The pitch value of a frame is also found by using periodicity algorithms. A Parzen window (kernel density estimation) is used to represent each person's selected phoneme. The constructed method is tested on different phonemes to assess its usability across phonemes. Periodicity features are also combined with MFCC features to determine their contribution to the speaker identification problem.
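Parzen-window (kernel density) modelling of a speaker's feature values can be sketched as below. This is a one-dimensional illustration under assumptions: the thesis's features are multi-channel, and the Gaussian kernel, the bandwidth, and the function names here are chosen only for the example.

```python
import math


def parzen_density(x, samples, bandwidth):
    """Gaussian-kernel Parzen estimate of the density at x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)


def most_likely_speaker(x, models, bandwidth):
    """Pick the speaker whose training samples give x the highest density.

    `models` maps a speaker name to a list of that speaker's training values.
    """
    return max(models, key=lambda name: parzen_density(x, models[name], bandwidth))
```

Each speaker's model is simply the set of training values. No parameters are fitted beyond the bandwidth, which controls how smooth the estimated density is.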
|