1 |
Possible Applications of ECG Signal Harmonics / Kao, Ruei-Da, 19 July 2012
By delivering blood, the heart transfers oxygen and nutrients to the various organs and is thus a highly influential part of the circulatory system. To adapt to changing physiological conditions, the intensity and frequency of heartbeats change with time. Careful observation shows that the time intervals between heartbeats often differ even when the body is at rest. Such heart rate variability (HRV) has been used to estimate the activity of the autonomic nervous system, which is divided into sympathetic and parasympathetic subsystems, both of which can significantly affect the physiology of the human body. As a result, HRV has been used as a physiological indicator to assist doctors in making diagnostic decisions.
Many HRV studies analyze the ECG signal by examining the QRS complex waveform to determine the time intervals between R-peaks, and then analyze these R-R intervals in the time and frequency domains. Unlike the conventional R-R-interval-based approach, this work introduces new HRV feature variables by computing the spectrogram of the ECG signal waveform. In particular, based on the harmonics of the spectrum, we introduce the concept of modes. By finding the relative amount of energy and the degree of energy concentration associated with each mode, this work introduces two sets of new HRV features. In addition, we investigate how these variables change with time and how the features correlate with one another.
To demonstrate the potential of the proposed features, their values are compared for healthy individuals versus OSA patients, young versus old, and male versus female. The experimental results show that the differences in many of the tested features are statistically significant.
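The abstract does not define the modes precisely, so the following is only a rough sketch under stated assumptions: mode k is taken to be the band of width `half_width` around the k-th harmonic of a known fundamental `F0`, the "relative energy" is each band's share of the total band energy, and the "degree of energy concentration" is approximated as the fraction of a band's energy found in its strongest bin. The synthetic signal, the Hann window, and all function and parameter names are hypothetical and are not taken from the thesis.

```python
import cmath
import math


def power_spectrum(frame, sample_rate):
    """Return (freqs, power) of a Hann-windowed frame using a direct DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        freqs.append(k * sample_rate / n)
        power.append(abs(acc) ** 2)
    return freqs, power


def mode_features(freqs, power, f0, n_modes, half_width):
    """Relative energy and concentration for each band k*f0 +/- half_width.

    Both definitions are assumptions made for this sketch.
    """
    band_energy, concentration = [], []
    for k in range(1, n_modes + 1):
        band = [p for f, p in zip(freqs, power) if abs(f - k * f0) <= half_width]
        total = sum(band)
        band_energy.append(total)
        concentration.append(max(band) / total if total > 0 else 0.0)
    all_energy = sum(band_energy)
    relative = [e / all_energy for e in band_energy]
    return relative, concentration


# Synthetic stand-in for an ECG: three harmonics of a 1.2 Hz (72 bpm) rhythm.
SAMPLE_RATE = 50.0   # Hz, assumed
F0 = 1.2             # Hz, assumed known fundamental
FRAME, HOP = 500, 250
amplitudes = [1.0, 0.5, 0.25]
signal = [sum(a * math.sin(2 * math.pi * (k + 1) * F0 * i / SAMPLE_RATE)
              for k, a in enumerate(amplitudes))
          for i in range(1500)]

# One feature pair per spectrogram frame, so the features can be tracked over time.
per_frame = []
for start in range(0, len(signal) - FRAME + 1, HOP):
    freqs, power = power_spectrum(signal[start:start + FRAME], SAMPLE_RATE)
    per_frame.append(mode_features(freqs, power, F0, n_modes=3, half_width=0.3))

relative, concentration = per_frame[0]
print("relative energy per mode:", [round(r, 3) for r in relative])
print("concentration per mode:", [round(c, 3) for c in concentration])
```

Computing the features frame by frame is what would allow their change over time and their mutual correlations to be examined; a real ECG would also need the fundamental to be estimated rather than assumed.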
|
2 |
ASMR, Spectrograms, and Adam Young: Shaping a Genre Through Frequencies / McMullen, Cheyenne, 01 June 2021
This thesis presents a spectrogram analysis of the most popular works from Adam Young’s four major projects: Owl City, Sky Sailing, Port Blue, and The Score Project. Spectrograms reveal several elements beyond what waveforms can show, and better show elements such as frequency saturation, frequency ranges, overtones, and timbral sections. These elements can also be used to describe the phenomenon of the Autonomous Sensory Meridian Response (ASMR), and they can likewise be seen on a spectrogram. Because it makes these elements visible, a spectrogram is a good vehicle for analyzing and comparing Young’s works and ASMR. The analysis shows that Young’s works produce spectrograms similar to those of ASMR content. Although the link between the two in terms of influence is not certain, it is probable that popular music and the ASMR phenomenon are linked in some measurable way. This thesis provides insight into how to continue music and ASMR research.
|
3 |
Descrição fonético-acústica das vibrantes no português e no espanhol [Acoustic-phonetic description of the vibrants in Portuguese and Spanish] / Carvalho, Kelly Cristiane Henschel Pobbe de, January 2004
Advisor: Rafael Eugenio Hoyos-Andrade / Committee: Adelaide Hercília P. Silva / Committee: Gisele Domingos do Mar / Committee: Mirian Therezinha da Matta Machado / Committee: Zilda Maria Zapparoli Castro Melo / Abstract: This dissertation deals with the acoustic analysis of trills and taps in Portuguese and in Spanish. These consonants were studied spectrographically in the different phonetic contexts in which they appear in both languages. The analysis was carried out with the Kay Elemetrics Multi-Speech for Windows software, which provided the waveforms and sound spectrograms needed for our purpose: an acoustic description of the selected sounds as the basis for a contrastive description of the "r type" consonants. The study was limited to the Portuguese spoken in the Assis area (interior of São Paulo state, Brazil) and to the Spanish spoken in Bogotá (Colombia). The data were recorded in the Language Laboratory of the Faculdade de Ciências e Letras de Assis (UNESP), using a professional cassette recorder in an acoustically isolated room. Although this study is primarily descriptive, it may eventually help those devoted to teaching and learning Portuguese and Spanish as foreign languages, since it provides relevant contrastive information about the phonetic component of the two languages with respect to the so-called trills or vibrant consonants. / Degree: Doctorate
|
4 |
Descrição fonético-acústica das vibrantes no português e no espanhol [Acoustic-phonetic description of the vibrants in Portuguese and Spanish] / Carvalho, Kelly Cristiane Henschel Pobbe de [UNESP], 02 March 2004
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / Abstract: This dissertation deals with the acoustic analysis of trills and taps in Portuguese and in Spanish. These consonants were studied spectrographically in the different phonetic contexts in which they appear in both languages. The analysis was carried out with the Kay Elemetrics Multi-Speech for Windows software, which provided the waveforms and sound spectrograms needed for our purpose: an acoustic description of the selected sounds as the basis for a contrastive description of the "r type" consonants. The study was limited to the Portuguese spoken in the Assis area (interior of São Paulo state, Brazil) and to the Spanish spoken in Bogotá (Colombia). The data were recorded in the Language Laboratory of the Faculdade de Ciências e Letras de Assis (UNESP), using a professional cassette recorder in an acoustically isolated room. Although this study is primarily descriptive, it may eventually help those devoted to teaching and learning Portuguese and Spanish as foreign languages, since it provides relevant contrastive information about the phonetic component of the two languages with respect to the so-called trills or vibrant consonants.
|
5 |
Demodulation of Narrowband Speech Spectrograms / Aragonda, Haricharan, January 2014
Speech is a non-stationary signal and contains modulations in both the spectral and temporal domains. Based on the type of modulations studied, most speech processing algorithms can be classified as short-time analysis algorithms, narrowband analysis algorithms, or joint spectro-temporal analysis algorithms. Traditional methods of speech analysis study the modulation along either time (short-time analysis) or frequency (narrowband analysis), one dimension at a time. A new class of algorithms that work simultaneously along both the temporal and spectral dimensions, called spectro-temporal analysis algorithms, has become prominent over the past decade.
Joint spectro-temporal analysis (also referred to as 2-D speech analysis) has shown promise in applications such as formant estimation, pitch estimation, speech recognition, etc.
Over the past decade, 2-D speech analysis has been independently motivated from several directions. Broadly, these motivations for 2-D speech models can be grouped into speech-production motivated, source-separation/machine-learning motivated, and neurophysiology motivated.
In this thesis, we develop a 2-D speech model based on the speech-production motivation. The thesis is organized as follows. In Chapter one, we develop the context of 2-D speech processing. In Chapter two, we develop a 2-D multicomponent AM-FM model for a narrowband spectrogram patch of voiced speech and experiment with the perceptual significance of the number of components needed to represent a spectrogram patch. In Chapter three, we develop a demodulation algorithm called in-phase and quadrature-phase (IQ) demodulation. Compared to the state-of-the-art sinusoidal demodulation, the AM obtained using this method is more robust to carrier estimation errors. The demodulation algorithm was verified on all voiced sentences taken from the TIMIT database. In Chapter four, we develop a demodulation algorithm based on the Riesz transform, a natural extension of the Hilbert transform to higher dimensions. Unlike the sinusoidal and IQ demodulation techniques, Riesz-transform-based demodulation does not require explicit carrier estimation and is also robust to pitch discontinuities in patches. The algorithm was validated on all voiced sentences from the TIMIT database. Both the IQ and Riesz-transform-based methods were found to give more accurate estimates of the 2-D AM (which relates to the vocal tract) and the 2-D carrier (which relates to the source) than sinusoidal demodulation. In Chapter five, we show applications of the demodulated AM and carrier to pitch estimation and to the creation of hybrid sounds. The hybrid sounds created were found to have better perceptual quality than their counterparts created using linear prediction analysis. In Chapter six, we summarize the work and present possible directions for future research.
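The thesis applies IQ demodulation to 2-D spectrogram patches; the sketch below is only a 1-D analogue meant to illustrate the general in-phase/quadrature idea, not the author's algorithm. It assumes a single AM component, a known carrier frequency, and a moving-average low-pass filter whose length spans a whole number of periods of the double-frequency term. All names and signal parameters are hypothetical.

```python
import math


def iq_demodulate(signal, sample_rate, carrier_hz, window):
    """Estimate the amplitude envelope of a single AM component.

    Returns (times, envelope); each estimate is placed at the centre of its
    averaging window. `window` should cover a whole number of periods of the
    2 * carrier_hz term so that the moving average rejects it.
    """
    omega = 2 * math.pi * carrier_hz / sample_rate
    # Mix the signal down with a cosine (in-phase) and a sine (quadrature).
    in_phase = [x * math.cos(omega * n) for n, x in enumerate(signal)]
    quadrature = [-x * math.sin(omega * n) for n, x in enumerate(signal)]
    times, envelope = [], []
    for start in range(len(signal) - window + 1):
        i_avg = sum(in_phase[start:start + window]) / window
        q_avg = sum(quadrature[start:start + window]) / window
        times.append((start + (window - 1) / 2) / sample_rate)
        # Each low-passed branch carries half the amplitude, hence the factor 2.
        envelope.append(2 * math.hypot(i_avg, q_avg))
    return times, envelope


def true_envelope(t):
    """Slow 2 Hz amplitude modulation used to build the test signal."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * 2.0 * t)


SAMPLE_RATE = 1000   # Hz, assumed
CARRIER_HZ = 50.0    # Hz, assumed known
PHASE = 0.7          # arbitrary carrier phase offset in radians
signal = [true_envelope(n / SAMPLE_RATE)
          * math.cos(2 * math.pi * CARRIER_HZ * n / SAMPLE_RATE + PHASE)
          for n in range(1000)]

times, envelope = iq_demodulate(signal, SAMPLE_RATE, CARRIER_HZ, window=40)
max_error = max(abs(e - true_envelope(t)) for t, e in zip(times, envelope))
print("largest envelope error:", round(max_error, 4))
```

In this sketch the envelope is the magnitude of the I/Q pair, so it does not depend on the carrier phase offset. How the 2-D method handles errors in the estimated carrier frequency is described in the thesis itself and is not reproduced here.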
|
6 |
Deep Learning Models for Human Activity Recognition / Albert Florea, George, Weilid, Filip, January 2019
The Augmented Multi-party Interaction (AMI) Meeting Corpus database is used to investigate group activity recognition in an office environment. The AMI Meeting Corpus database provides researchers with remote-controlled meetings and natural meetings in an office environment; the meeting scenario takes place in an office room sized for four people. To achieve group activity recognition, video frames and 2-dimensional audio spectrograms were extracted from the AMI database. The video frames were RGB color images and the audio spectrograms had one color channel. The video frames were produced in batches so that temporal features could be evaluated together with the audio spectrograms. It has been shown that including temporal features, both during model training and when predicting the behavior of an activity, increases the validation accuracy compared to models that only use spatial features [1]. Deep learning architectures have been implemented to recognize different human activities in the AMI office environment using the data extracted from the AMI database. The neural network models were built using the Keras API together with the TensorFlow library. Of the different types of neural network architecture, those investigated in this project were Residual Neural Network, Visual Geometry Group 16, Inception V3, and RCNN (Recurrent Neural Network, LSTM). ImageNet weights were used to initialize the weights of the neural network base models. The ImageNet weights were provided by the Keras API and were optimized for each base model [2]. The base models use the ImageNet weights when extracting features from the input data. Feature extraction using ImageNet weights or random weights together with the base models showed promising results. Both the deep learning approach using dense layers and the LSTM spatio-temporal sequence prediction were implemented successfully.
|
7 |
A smart sound fingerprinting system for monitoring elderly people living alone / El Hassan, Salem, January 2021
There is a sharp increase in the number of old people living alone throughout the world. More often than not, such people require continuous and immediate care and attention in their everyday lives, hence the need for round-the-clock monitoring, albeit in a respectful, dignified and non-intrusive way. For example, continuous care is required when they become frail and less active, and immediate attention is required when they fall or remain in the same position for a long time. To this end, various monitoring technologies have been developed, yet there are major improvements still to be realised.
Current technologies include indoor positioning systems (IPSs) and health monitoring systems. The former rely on defined configurations of various sensors to capture a person's position within a given space in real-time. The functionality of the sensors varies depending on receiving appropriate data using WiFi, radio frequency identification (RFID), ultra-wideband (UWB), dead reckoning (DR), infrared (IR), Bluetooth (BLE), acoustic signal, visible light detection, and sound signal monitoring. The systems use various algorithms to capture proximity, location detection, time of arrival, time difference of arrival, angle, and received signal strength data. Health monitoring technologies capture important health data using accelerometers and gyroscope sensors. In some studies, audio fingerprinting has been used to detect variation in indoor environmental sound and has largely been based on recognising TV sound and songs. This has been achieved using various staging methods, including pre-processing, framing, windowing, time/frequency domain feature extraction, and post-processing. The time/frequency domain feature extraction tools used include Fourier Transforms (FTs), Modified Discrete Cosine Transform (MDCT), Principal Component Analysis (PCA), Mel-Frequency Cepstrum Coefficients (MFCCs), Constant Q Transform (CQT), Local Energy Centroid (LEC), and Wavelet transform. Artificial intelligence (AI) and probabilistic algorithms have also been used in IPSs to classify and predict different activities, with interesting applications in healthcare monitoring. Several tools have been applied in IPSs and audio fingerprinting. They include Radial Basis Kernel (RBF), Support Vector Machine (SVM), Decision Trees (DTs), Hidden Markov Models (HMMs), Naïve Bayes (NB), Gaussian Mixture Modelling (GMM), clustering algorithms, Artificial Neural Networks (ANNs), and Deep Learning (DL).
Despite all these attempts, there is still a major gap for a completely non-intrusive system capable of monitoring what an elderly person living alone is doing, where, and for how long, and of providing a quick traffic-like risk score that prompts immediate action or otherwise.
In this thesis, a cost-effective and completely non-intrusive indoor positioning and activity-monitoring system for elderly people living alone has been developed, tested and validated in a typical residential living space. The proposed system works based on five phases:
(1) Set-up phase that defines the typical activities of daily living (TADLs).
(2) Configuration phase that optimises the implementation of the required sensors in exemplar flat No.1.
(3) Learning phase whereby sounds and position data of the TADLs are collected and stored in a fingerprint reference data set.
(4) Listening phase whereby real-time data is collected and compared against the reference data set to provide information as to what a person is doing, when, and for how long.
(5) Alert phase whereby a health frailty score varying from 0 (unwell) to 10 (healthy) is generated in real-time.
Two typical but different residential flats (referred to here as Flats No.1 and No.2) are used in the study.
The system is implemented in the bathroom, living room, and bedroom of flat No.1, which includes various floor types (carpet, tiles, laminate) so that the different sounds generated by walking on such floors can be distinguished. The data captured during the Learning Phase yields the reference data set, which includes position and sound fingerprints. The sound fingerprints are generated from test recordings of a specific TADL, providing time- and frequency-based extracted features: frequency peak magnitude (FPM), Zero Crossing Rate (ZCR), and Root Mean Square Error (RMSE). The position fingerprints are generated from distance measurements. The sampling rate of the recorded sound is 44.1 kHz. A Fast Fourier Transform (FFT) is applied to 0.1-second intervals of the recorded sound, with a Hamming window used to minimise spectral leakage. The frequency peaks are detected from the spectrogram matrices to find the most appropriate FPM match between the reference and sample data. Position detection of the monitored person is based on comparing, in real-time, the distance measurements captured in the learning and listening phases.
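As a rough illustration of the framing step described above, the sketch below splits a signal into 0.1-second frames, applies a Hamming window, and extracts a frequency peak and its magnitude together with the zero crossing rate for each frame. It is a minimal sketch and not the thesis implementation: a direct DFT stands in for the FFT to keep the example dependency-free, the demo uses an assumed 8 kHz sampling rate instead of 44.1 kHz so that it runs quickly, the test tones are synthetic, and all function names are hypothetical. The RMSE feature and the matching against the reference data set are not shown because the abstract does not define them.

```python
import cmath
import math


def hamming(n):
    """Hamming window coefficients for a frame of n samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_features(frame, sample_rate):
    """Return (peak_hz, peak_magnitude, zero_crossing_rate) for one frame."""
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hamming(n))]
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        mag = abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                      for i, x in enumerate(windowed)))
        if mag > best_mag:
            best_bin, best_mag = k, mag
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return best_bin * sample_rate / n, best_mag, crossings / (n - 1)


def fingerprint(signal, sample_rate, frame_seconds=0.1):
    """Features for consecutive non-overlapping frames of the signal."""
    size = int(round(frame_seconds * sample_rate))
    return [frame_features(signal[s:s + size], sample_rate)
            for s in range(0, len(signal) - size + 1, size)]


def tone(hz, n, sample_rate):
    """Synthetic sine tone used only as demo input."""
    return [math.sin(2 * math.pi * hz * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 8000  # demo value only; the recordings in the abstract use 44.1 kHz
signal = tone(440.0, 800, SAMPLE_RATE) + tone(1000.0, 800, SAMPLE_RATE)
features = fingerprint(signal, SAMPLE_RATE)
for peak_hz, peak_mag, zcr in features:
    print(f"peak {peak_hz:.0f} Hz, magnitude {peak_mag:.1f}, ZCR {zcr:.3f}")
```

With real recordings, an FFT routine from a numerical library would replace the direct DFT, and the per-frame peaks would be stacked into the spectrogram matrix from which reference and sample fingerprints are compared.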
A typical furnished one-bedroom flat (flat No.2) is used to validate the system. The topologies and floorings of flats No.1 and No.2 are different. The validation is based on "happy" and "unusual" but typical behaviours. Happy behaviours include the typical TADLs of a healthy elderly person living alone, with a risk metric higher than 8. Unusual behaviours mimic acute or chronic activities (or the lack thereof), for example falling and remaining on the floor, or staying in bed for long periods. These are scenarios in which an elderly person may be in a compromised situation, which is detected in real-time by a sudden drop of the risk metric to below 4.
Machine learning classification algorithms are used to identify the location, activity, and time interval in real-time, with a promising early performance of 94% in detecting the right activity and the right room at the right time.
|