In this thesis, an audio-video database for speaker recognition is constructed using a digital camcorder. Video of fifteen hundred speakers is recorded in three separate sessions. For each speaker, 20 still images per session are also extracted from the video data. It is hoped that this database can serve as an appropriate training and testing resource for person identification using both voice and face features.
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0905108-021052 |
Date | 05 September 2008 |
Creators | Chen, Chun-chi |
Contributors | Chii-Maw Uang, Chih-Chien Chen, Tsung Lee |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | Cholon |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0905108-021052 |
Rights | not_available, Copyright information available at source archive |