  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Digital Image Processing via Combination of Low-Level and High-Level Approaches.

Wang, Dong January 2011
With the growth of computing power, digital image processing plays an increasingly important role in the modern world, in fields including industry, medicine, communications, and spaceflight technology. There is no clear-cut way to divide digital image processing, but it is normally described as three main steps: low-level, mid-level and high-level processing. Low-level processing involves primitive operations such as image preprocessing to reduce noise, contrast enhancement, and image sharpening. Mid-level processing involves tasks such as segmentation (partitioning an image into regions or objects), description of those objects to reduce them to a form suitable for computer processing, and classification (recognition) of individual objects. Finally, high-level processing involves "making sense" of an ensemble of recognised objects, as in image analysis. Based on this division, the thesis is organised in three parts: Colour Edge and Face Detection; Hand Motion Detection; Hand Gesture Detection and Medical Image Processing. In Colour Edge Detection, two new images, the G-image and the R-image, are built through a colour space transform; the edges extracted from the G-image and the R-image are then combined to obtain the final edge. In Face Detection, a skin model is built first, and the boundary condition of this skin model is then extracted so that it covers almost all of the skin pixels. After skin detection, knowledge about size, size ratio, and the locations of the ears and mouth is used to recognise the face in the skin regions. In Hand Motion Detection, the frame difference is compared with an automatically chosen threshold in order to identify the moving object. For some special situations with slow or smooth object motion, background modelling and frame differencing are combined in order to improve the performance.
In Hand Gesture Recognition, three features of every test image are input to a Gaussian Mixture Model (GMM), and the Expectation Maximization (EM) algorithm is then used to compare the GMM from the test images with the GMM from the training images in order to classify the results. In Medical Image Processing (mammograms), an Artificial Neural Network (ANN) and a clustering rule are applied to choose the features. Two classifiers, ANN and Support Vector Machine (SVM), are applied to classify the results; in this processing, balance learning theory and an optimized decision have been developed and applied to improve the performance.
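The frame-differencing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the threshold is chosen automatically, so the rule used here (mean plus `k` standard deviations of the absolute difference) is purely an assumption for illustration, as are the function name `frame_difference_mask` and the toy frames.

```python
from statistics import mean, pstdev


def frame_difference_mask(prev_frame, curr_frame, k=2.0):
    """Return (mask, threshold) for two equal-sized grayscale frames.

    Frames are 2D lists of pixel intensities. The threshold rule
    (mean + k * std of the absolute differences) is an illustrative
    assumption, not the rule used in the thesis.
    """
    # Per-pixel absolute difference between consecutive frames.
    diff = [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]
    flat = [d for row in diff for d in row]
    # Threshold derived from the difference image itself.
    threshold = mean(flat) + k * pstdev(flat)
    # Pixels whose change exceeds the threshold are marked as moving.
    mask = [[1 if d > threshold else 0 for d in row] for row in diff]
    return mask, threshold


# Toy example: a static 4x4 background with one pixel that changes.
prev_frame = [[10] * 4 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[1][2] = 200

mask, threshold = frame_difference_mask(prev_frame, curr_frame)
moving_pixels = sum(sum(row) for row in mask)
```

With slow or smooth motion the per-frame difference stays small, which is presumably why the abstract combines frame differencing with background modelling in those cases.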
12

Localization, Characterization and Recognition of Singing Voices

Regnier, Lise 08 March 2012
This dissertation is concerned with the problem of describing the singing voice within the audio signal of a song. This work is motivated by the fact that the lead vocal is the element that attracts the attention of most listeners. For this reason it is common for music listeners to organize and browse music collections using information related to the singing voice such as the singer's name. Our research concentrates on the three major problems of music information retrieval: the localization of the source to be described (i.e. the recognition of the elements corresponding to the singing voice in the signal of a mixture of instruments), the search for pertinent features to describe the singing voice, and finally the development of pattern recognition methods based on these features to identify the singer. For this purpose we propose a set of novel features computed on the temporal variations of the fundamental frequency of the sung melody. These features, which aim to describe the vibrato and the portamento, are obtained with the aid of a dedicated model. In practice, these features are computed on the time-varying frequency of partials obtained using the sinusoidal model. In the first experiment we show that partials corresponding to the singing voice can be accurately differentiated from the partials produced by other instruments using decisions based on the parameters of the vibrato and the portamento. Once the partials emitted by the singer are identified, the segments of the song containing singing can be directly localized. To improve the recognition of the partials emitted by the singer we propose to group partials that are related harmonically. Partials are clustered according to their degree of similarity. This similarity is computed using a set of CASA cues including their temporal frequency variations (i.e. the vibrato and the portamento).
The clusters of harmonically related partials corresponding to the singing voice are identified using the vocal vibrato and the portamento parameters. Groups of vocal partials can then be re-synthesized to isolate the voice. The result of the partial grouping can also be used to transcribe the sung melody. We then propose to go further with these features and study whether the vibrato and portamento characteristics can be considered part of the singer's signature. Previous works on singer identification describe audio signals using features extracted from the short-term amplitude spectrum. The latter features aim to characterize the timbre of the sound, which, in the case of singing, is related to the vocal tract of the singer. The features we develop in this document capture long-term information related to the intonation of the singer, which is relevant to the style and the technique of the singer. We propose a method to combine these two complementary descriptions of the singing voice to increase the recognition rate of singer identification. In addition we evaluate the robustness of each type of feature against a set of variations. We show that the singing voice is a highly variable instrument. To obtain a representative model of a singer's voice it is thus necessary to build models using a large set of examples covering the full tessitura of a singer. In addition, we show that features extracted directly from the partials are more robust to the presence of an instrumental accompaniment than features derived from the amplitude spectrum.
13

Detekce stresu / Stress detection

Jindra, Jakub January 2019
Stress detection based on non-EEG physiological data can be useful for monitoring drivers and pilots, and also for monitoring people in ordinary situations where standard EEG monitoring is unsuitable. This work uses the Non-EEG database, freely available from Physionet. The database contains records of heart rate, blood oxygen saturation, motion, skin conductance and temperature, recorded for 3 types of stress alternating with a relaxed state. Two final models were created in this thesis: the first for binary classification (stress/relax), the second for classification of 4 different types of psychological state. The best results were reached using a model created by the decision tree algorithm, with 8 features for binary classification and with 8 features for classification of the 4 psychological states. The accuracy of the final models is approximately 95 % for the binary model and 99 % for classification of the 4 psychological states. All algorithms were implemented in Python.
14

Analýza Parkinsonovy nemoci pomocí segmentálních řečových příznaků / Analysis of Parkinson's disease using segmental speech parameters

Mračko, Peter January 2015
This project describes the design of a system for diagnosing Parkinson's disease based on speech. Parkinson's disease is a neurodegenerative disorder of the central nervous system. One of the symptoms of this disease is an impairment of the motor aspects of speech, called hypokinetic dysarthria. The design of the system in this work is based on the best-known segmental features, such as LPC, PLP, MFCC and LPCC coefficients, but also on less well-known ones such as CMS, ACW and MSC. These coefficients are calculated from speech recordings of patients affected by Parkinson's disease and of healthy controls; a feature selection process and subsequent classification are then performed. The best result obtained in this project reached a classification accuracy of 77.19%, a sensitivity of 74.69% and a specificity of 78.95%.
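For readers unfamiliar with the three reported measures, the sketch below shows how accuracy, sensitivity and specificity are computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the thesis.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion counts.

    tp: patients correctly classified as patients (true positives)
    fn: patients classified as healthy (false negatives)
    tn: healthy controls correctly classified as healthy (true negatives)
    fp: healthy controls classified as patients (false positives)
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all correct decisions
        "sensitivity": tp / (tp + fn),   # share of patients detected
        "specificity": tn / (tn + fp),   # share of controls cleared
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=30, fn=10, tn=45, fp=15)
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can hide poor detection of one class when the patient and control groups differ in size.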
