About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Lip password-based speaker verification system with unknown language alphabet

Zhou, Yichao 31 August 2018 (has links)
Traditional security systems that verify a user's identity by password face the risk of leaking the password contents. To address this problem, biometrics such as the face, iris, and fingerprint have come into wide use for identity verification. However, these biometrics cannot be changed if the database is hacked. Moreover, verification systems based on traditional biometrics can be fooled by a fake fingerprint or a photograph.

Liu and Cheung (2014) recently introduced the concept of the lip password, which combines a password embedded in the lip movement with the underlying characteristics of lip motion [26]. Subsequently, a lip password-based system for visual speaker verification was developed. Such a system can detect a target speaker saying the wrong password or an impostor who knows the correct password; that is, only the target user speaking the correct password is accepted. Nevertheless, it recognizes the lip password with a lip-reading algorithm, which must know the language alphabet of the password in advance, limiting its applications.

To tackle this problem, this thesis studies lip password-based visual speaker verification with an unknown language alphabet. First, we propose a method that verifies the lip password from the key frames of lip movement instead of recognizing individual password elements, so that verification can proceed without knowing the password alphabet beforehand. To detect these key frames, we extract the lip contours and find the intervals where they vary significantly. Moreover, to avoid accurate alignment of feature sequences or detection of mouth status, both of which are computationally expensive, we design a novel overlapping subsequence matching approach to encode the information in lip passwords. This technique samples the feature sequences extracted from lip videos into overlapping subsequences and matches them individually. The log-likelihoods of all subsequences form the final feature of the sequence, which is verified by its Euclidean distance to the positive sample centers. We evaluate the two proposed methods on a database containing eight kinds of lip passwords, including English digits and Chinese phrases. Experimental results show the superiority of the proposed methods for visual speaker verification.

Next, we propose a novel visual speaker verification approach based on diagonal-like pooling and a pyramid structure of lips. We exploit the diagonal structure of sparse representation to preserve the temporal order of lip sequences by employing a diagonal-like mask in the pooling stage, and we build pyramid spatiotemporal features that capture the structural characteristics of the lip password. This approach eliminates the need to segment the lip password into words or visemes; consequently, a lip password in any language can be used for visual speaker verification. Experiments show the efficacy of the proposed approach compared with the state of the art.

Additionally, to further evaluate the system, we develop a prototype of the lip password-based visual speaker verification system with a graphical user interface (GUI) that makes it easy for users to operate.
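A minimal sketch of the overlapping subsequence matching idea described in this abstract, assuming a per-subsequence scoring model (e.g., a trained HMM) is already available; the names `window`, `step`, and `loglik_fn` are illustrative, not from the thesis:

```python
import numpy as np

def overlapping_subsequences(features, window=10, step=5):
    """Slice a (T, D) feature sequence into overlapping windows."""
    return [features[s:s + window]
            for s in range(0, len(features) - window + 1, step)]

def sequence_score(features, loglik_fn, window=10, step=5):
    """Score each subsequence with a trained model; the stacked
    log-likelihoods form the final feature vector of the sequence.
    Assumes sequences are length-normalized so vectors align."""
    subs = overlapping_subsequences(features, window, step)
    return np.array([loglik_fn(sub) for sub in subs])

def verify(features, loglik_fn, positive_center, threshold):
    """Accept iff the score vector lies within a Euclidean radius
    of the positive-sample center, as the abstract describes."""
    score = sequence_score(features, loglik_fn)
    return np.linalg.norm(score - positive_center) <= threshold
```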
22

A standardization of the "Children's Speechreading Test" on normal children

Newcombe, Lorna Helen 01 January 1969 (has links)
This study is a limited standardization of the "Children's Speechreading Test" designed by Dr. Dolores S. Butt of the University of New Mexico. After studying the development of language skills in young acoustically handicapped children, she randomly selected subjects from 10 nursery schools and primary departments of schools for the deaf and administered her test to these children. The purpose of the present study is to provide a limited standardization of the "Children's Speechreading Test" on normal-hearing children. Although Dr. Butt indicates some relation of her test to intelligence, no attempt was made in this pilot study to correlate mental ability with speechreading ability. The "Children's Speechreading Test," reproduced in complete form in Appendix A, was administered to 20 normal-hearing children, all of whom were in the first grade. Information in the form of raw scores was then utilized to calculate the standard deviation and percentile scores. Information gained from administering the test was also to be used to suggest further investigation regarding the usefulness of this test. In addition, a further purpose of this pilot project was to compile the materials necessary for administering the test for future use in the Portland State University Speech and Hearing Clinic.
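For illustration, the descriptive statistics the study derives from raw scores can be computed directly; the scores below are hypothetical, not Newcombe's data:

```python
import numpy as np

# Hypothetical raw scores for 20 first-grade children (illustration only)
raw = np.array([12, 15, 9, 14, 11, 13, 10, 16, 12, 14,
                8, 13, 15, 11, 12, 10, 14, 13, 9, 12])

mean = raw.mean()
sd = raw.std(ddof=1)  # sample standard deviation
pct = {p: np.percentile(raw, p) for p in (10, 25, 50, 75, 90)}
print(f"mean={mean:.2f}, sd={sd:.2f}, percentiles={pct}")
```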
23

A framework for speechreading acquisition tools

Gorman, Benjamin Millar January 2018 (has links)
At least 360 million people worldwide have disabling hearing loss that frequently causes difficulties in day-to-day conversations. Hearing aids often fail to offer enough benefit and have low adoption rates. However, people with hearing loss find that speechreading can improve their understanding during conversation. Speechreading (often called lipreading) refers to using visual information about the movements of a speaker's lips, teeth, and tongue to help understand what they are saying. Speechreading is commonly used by people with all severities of hearing loss to understand speech, and people with typical hearing also speechread (albeit subconsciously) to help them understand others. However, speechreading is a skill that takes considerable practice to acquire. Publicly funded speechreading classes are sometimes provided, and have been shown to improve speechreading acquisition. However, classes are only provided in a handful of countries around the world and students can only practice effectively when attending class. Existing tools have been designed to help improve speechreading acquisition, but are often not effective because they have not been designed within the context of contemporary speechreading lessons or practice. To address this, in this thesis I present a novel speechreading acquisition framework that can be used to design Speechreading Acquisition Tools (SATs) - a new type of technology to improve speechreading acquisition. I interviewed seven speechreading tutors and used thematic analysis to identify and organise the key elements of the framework. I evaluated the framework by using it to: 1) categorise every tutor-identified speechreading teaching technique, 2) critically evaluate existing Conversation Aids and SATs, and 3) design three new SATs. I then conducted a postal survey with 59 speechreading students to understand students' perspectives on speechreading, and how their thoughts could influence future SATs. To further evaluate the framework's effectiveness, I then developed and evaluated two new SATs (PhonemeViz and MirrorMirror) designed using the framework. The findings from the evaluation of these two new SATs demonstrate that using the framework can help design effective tools to improve speechreading acquisition.
24

Lip motion tracking and analysis with application to lip-password based speaker verification

Liu, Xin 01 January 2013 (has links)
No description available.
25

The Effect of Parent-Child Interaction on the Language Development of the Hearing-Impaired Child

Melum, Arla J. 01 January 1982 (has links)
In recent years much interest has been focused on the manner in which the young child acquires language. Some researchers (Chomsky, 1965; McNeil) have postulated an inherent capacity to comprehend and utilize linguistic structures, while others, such as Irwin (1960), Hess and Shipman (1965), and Greenstein et al. (1974), have focused on experiential determinants of language competence in early childhood. As with all children, the social and emotional behavior of deaf children is greatly influenced by their ability to communicate with significant others. Interactions between the normally developing child and his parents are characterized by mutual responsiveness, where each initiates and reciprocates communication. When a child's language development is delayed or impaired (as with a hearing loss), this communication process may also become impaired, with parents being unable to respond appropriately to confused or reduced messages from the child. This paper reviews some of the pertinent research regarding the behavioral interaction between parent and child and its effect on communication and psychosocial development. The implications of these data for the hearing-impaired child and his family are considered. It addresses the question: "What is it that parents of young hearing-impaired children do that facilitates or impedes speech and language development?" A methodology is also presented for developing effective communication between such children and their parents.
26

Speechreading ability in elementary school-age children with and without functional articulation disorders

Habermann, Barbara L. 01 January 1990 (has links)
The purpose of this study was to compare the speechreading abilities of elementary school-age children with mild to severe articulation disorders with those of children with normal articulation. Speechreading ability, as determined by a speechreading test, indicates how well a person recognizes the visual cues of speech. Speech sounds that have similar visual characteristics have been defined as visemes by Jackson in 1988 and can be categorized into distinct groups based on their place of articulation. A relationship between recognition of these visemes and correct articulation was first proposed by Woodward and Barber in 1960. Dodd, in 1983, noted that speechread information shows a child how to produce a sound, while aural input simply offers a target at which to aim.
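Viseme groups of the sort this abstract mentions can be represented as a simple mapping; the grouping below is an illustrative subset based on place of articulation, not Jackson's (1988) exact categories:

```python
# Illustrative viseme groups keyed by place of articulation
VISEME_GROUPS = {
    "bilabial":    {"p", "b", "m"},   # visually near-identical on the lips
    "labiodental": {"f", "v"},
    "rounded":     {"w", "r"},
    "interdental": {"th"},
}

def viseme_of(phoneme):
    """Return the viseme group a phoneme belongs to, or None."""
    for group, members in VISEME_GROUPS.items():
        if phoneme in members:
            return group
    return None

print(viseme_of("b"))  # -> "bilabial"
```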
27

Vocalisations with a better view : hyperarticulation augments the auditory-visual advantage for the detection of speech in noise

Lees, Nicole C., University of Western Sydney, College of Arts January 2007 (has links)
Recent studies have shown that there is a visual influence early in speech processing - visual speech enhances the ability to detect auditory speech in noise. However, identifying exactly how visual speech interacts with auditory processing at such an early stage has been challenging, because this so-called AV speech detection advantage is both highly related to a specific lower-order, signal-based, optic-acoustic relationship between the second formant amplitude and the area of the mouth (F2/Mouth-area), and mediated by higher-order, information-based factors. Previous investigations have either maximised or minimised information-based factors, or minimised signal-based factors, in order to tease out the relative importance of these sources of the advantage, but they have not yet been successful in this endeavour. Maximising signal-based factors has not previously been explored. This avenue was explored in this thesis by manipulating speaking style: hyperarticulated speech was used to maximise signal-based factors and hypoarticulated speech to minimise them, in order to examine whether the AV speech detection advantage is modified by these means, and to provide a clearer idea of the primary source of visual influence in the AV detection advantage. Two sets of six studies were conducted. In the first set, three recorded speech styles - hyperarticulated, normal, and hypoarticulated - were extensively analysed in physical (optic and acoustic) and perceptual (visual and auditory) dimensions ahead of stimulus selection for the second set of studies. The analyses indicated that the three styles comprise distinctive categories on the Hyper-Hypo continuum of articulatory effort (Lindblom, 1990). Most relevantly, hyperarticulated speech was more informative, and hypoarticulated speech less informative, than normal speech with regard to signal-based movement factors, both optically and visually. However, the F2/Mouth-area correlation was similarly strong for all speaking styles, thus allowing examination of signal-based visual informativeness on AV speech detection with optic-acoustic association controlled. In the second set of studies, six Detection Experiments incorporating the three speaking styles were designed to examine whether, and if so why, more visually informative (hyperarticulated) speech augmented, and less visually informative (hypoarticulated) speech attenuated, the AV detection advantage relative to normal speech, and to examine visual influence when auditory speech was absent. Detection Experiment 1 used a two-interval, two-alternative (first or second interval, 2I2AFC) detection task, and indicated that hyperarticulation provided an AV detection advantage greater than for normal and hypoarticulated speech, with less of an advantage for hypoarticulated than for normal speech. Detection Experiment 2 used a single-interval, yes-no detection task to assess responses in signal-absent conditions independently of signal-present conditions, as a means of addressing participants' reports that speech was heard when it was not presented in the 2I2AFC task. Hyperarticulation resulted in an AV detection advantage, and for all speaking styles there was a consistent response bias to indicate speech was present in signal-absent conditions.
To examine whether the AV detection advantage for hyperarticulation was due to visual, auditory or auditory-visual factors, Detection Experiments 3 and 4 used mismatching AV speaking style combinations (AnormVhyper, AnormVhypo, AhyperVnorm, AhypoVnorm) that were onset-matched or time-aligned, respectively. The results indicated that higher rates of mouth movement can be sufficient for the detection advantage with weak optic-acoustic associations, but, in circumstances where these associations are low, even high rates of movement have little impact on augmenting detection in noise. Furthermore, in Detection Experiment 5, in which visual stimuli consisted only of the mouth movements extracted from the three styles, there was no AV detection advantage; it seems that this is so because extra-oral information is required, perhaps to provide a frame of reference that improves the availability of mouth movement to the perceiver. Detection Experiment 6 used a new 2I-4AFC task and the measures of false detections and response bias to identify whether visual influence in signal-absent conditions is due to response bias or an illusion of hearing speech in noise (termed here the Speech in Noise, SiN, Illusion). In the event, the SiN illusion occurred for both the hyperarticulated and the normal styles - styles with reasonable amounts of movement change. For normal speech, the responses in signal-absent conditions were due only to the illusion of hearing speech in noise, whereas for hypoarticulated speech such responses were due only to response bias. For hyperarticulated speech there is evidence for the presence of both types of visual influence in signal-absent conditions. It seems that there is more doubt about the presence of auditory speech for non-normal speech styles. An explanation of past and present results is offered within a new framework, the Dynamic Bimodal Accumulation Theory (DBAT), developed in this thesis to address the limitations of, and conflicts between, previous theoretical positions. DBAT suggests a bottom-up influence of visual speech on the processing of auditory speech; specifically, it is proposed that the rate of change of visual movements guides auditory attention rhythms 'on-line' at corresponding rates, which allows selected samples of the auditory stream to be given prominence. Any patterns contained within these samples then emerge from the course of auditory integration processes. By this account, there are three important elements of visual speech necessary for enhanced detection of speech in noise. First and foremost, when speech is present, visual movement information must be available (as opposed to hypoarticulated and synthetic speech). Then the rate of change and optic-acoustic relatedness also have an impact (as in Detection Experiments 3 and 4). When speech is absent, visual information has an influence, and the SiN illusion (Detection Experiment 6) can be explained as a perceptual modulation of a noise stimulus by visually-driven rhythmic attention. In sum, hyperarticulation augments the AV speech detection advantage, and, whenever speech is perceived in noisy conditions, there is either a response bias to perceive speech or a SiN illusion, or both. DBAT provides a detailed description of these results, with wider-ranging explanatory power than previous theoretical accounts. Predictions are put forward so that the predictive power of DBAT can be examined in future studies. / Doctor of Philosophy (PhD)
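The false-detection and response-bias measures used in these detection experiments follow standard yes-no signal detection theory; a sketch of the conventional computation (the thesis's exact analysis may differ):

```python
from scipy.stats import norm

def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') and response bias (c) from a yes-no task.
    A log-linear correction keeps z-scores finite at rates of 0 or 1."""
    hr = (hits + 0.5) / (hits + misses + 1.0)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z_hr, z_far = norm.ppf(hr), norm.ppf(far)
    d_prime = z_hr - z_far            # detectability of speech in noise
    c = -0.5 * (z_hr + z_far)         # c < 0: bias toward reporting speech
    return d_prime, c

# A bias to report speech in signal-absent trials shows up as negative c
print(dprime_and_criterion(70, 30, 40, 60))
```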
28

Speechreading abilities of Cantonese-speaking hearing-impaired children on consonants and words

Ho, Eve. January 1997 (has links)
Thesis (B.Sc)--University of Hong Kong, 1997. / "A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, April 30, 1997." Also available in print.
29

On study of lip segmentation in color space

Li, Meng 08 August 2014 (has links)
This thesis mainly addresses two issues: 1) how to perform lip segmentation without knowing the true number of segments in advance, and 2) how to effectively select the locally optimal observation scale for each structure from the viewpoint of lip segmentation. Regarding the first issue, two lip segmentation methods that are independent of the predefined number of segments are proposed. In the first, a multi-layer model is built in which each layer corresponds to one segment cluster. A Markov random field (MRF) derived from this model is then obtained, so that the segmentation problem is formulated as a labeling optimization problem under the maximum a posteriori-Markov random field (MAP-MRF) framework. Suppose the pre-assigned number of segments over-estimates the ground truth, leading to over-segmentation. An iterative algorithm capable of performing segment clustering and over-segmentation elimination simultaneously is presented. Based upon this algorithm, a lip segmentation scheme is proposed, featuring robust performance with respect to the estimated number of segment clusters. In the second method, a fuzzy clustering objective function is presented, which is a variant of the partition entropy (PE) implemented using Havrda-Charvat's structural α-entropy. This objective function features that coincident cluster centroids in pattern space can be equivalently substituted by one centroid with the function value unchanged. The minimum of the proposed objective function can be reached provided that: (1) the number of positions occupied by cluster centroids in pattern space equals the true cluster number, and (2) these positions coincide with the optimal cluster centroids obtained under the PE criterion. In the implementation, the clusters are randomly initialized, provided that the number of clusters is greater than or equal to the ground truth. An iterative algorithm is then utilized to minimize the proposed objective function. The initial over-partition gradually fades out, with the redundant centroids superposed, over the convergence of the algorithm. For the second issue, an MRF-based method that takes local scale variation into account is proposed to deal with the lip segmentation problem. Supposing each pixel of the target image has an optimal local scale from the segmentation viewpoint, the lip segmentation problem can be treated as a combination of observation scale selection and observed data classification. Accordingly, a multi-scale MRF model is proposed to represent both the membership map of each input pixel to a specific segment and the local-scale map. The optimal scale map and the corresponding segmentation result are obtained by minimizing the objective function via an iterative algorithm. Finally, lip segmentation experiments are conducted with each of the three proposed methods. The results show the efficacy of the proposed methods in comparison with existing counterparts.
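The MAP-MRF labeling framework this abstract invokes can be illustrated with a generic iterated conditional modes (ICM) solver under a Potts smoothness prior; this is a textbook baseline, not the thesis's multi-layer model or its over-segmentation elimination step:

```python
import numpy as np

def icm_labeling(unary, beta=1.0, iters=10):
    """MAP-MRF labeling by iterated conditional modes.
    unary: (H, W, K) per-pixel negative log-likelihoods for K segments.
    A Potts prior with weight beta penalises label disagreement
    between 4-connected neighbours."""
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)  # initialise from the data term alone
    for _ in range(iters):
        for y in range(H):
            for x in range(W):
                cost = unary[y, x].copy()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        # add beta for each candidate label that disagrees
                        cost += beta * (np.arange(K) != labels[ny, nx])
                labels[y, x] = cost.argmin()  # greedy per-pixel update
    return labels
```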
30

Adaptive threshold optimisation for colour-based lip segmentation in automatic lip-reading systems

Gritzman, Ashley Daniel January 2016 (has links)
A thesis submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, September 2016 / Having survived the ordeal of a laryngectomy, the patient must come to terms with the resulting loss of speech. With recent advances in portable computing power, automatic lip-reading (ALR) may become a viable approach to voice restoration. This thesis addresses the image processing aspect of ALR and focuses on three contributions to colour-based lip segmentation. The first contribution concerns the colour transform used to enhance the contrast between the lips and skin. This thesis presents the most comprehensive study to date by measuring the overlap between lip and skin histograms for 33 different colour transforms. The hue component of HSV obtains the lowest overlap of 6.15%, and results show that selecting the correct transform can increase the segmentation accuracy by up to three times. The second contribution is the development of a new lip segmentation algorithm that utilises the best colour transforms from the comparative study. The algorithm is tested on 895 images and achieves a percentage overlap (OL) of 92.23% and a segmentation error (SE) of 7.39%. The third contribution focuses on the impact of the histogram threshold on segmentation accuracy and introduces a novel technique called Adaptive Threshold Optimisation (ATO) to select a better threshold value. The first stage of ATO incorporates -SVR to train the lip shape model. ATO then uses feedback of shape information to validate and optimise the threshold. After applying ATO, the SE decreases from 7.65% to 6.50%, corresponding to an absolute improvement of 1.15 pp or a relative improvement of 15.1%. While this thesis concerns lip segmentation in particular, ATO is a threshold selection technique that can be used in various segmentation applications. / MT2017
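The histogram-overlap measurement behind the first contribution can be sketched as a histogram intersection on the hue channel; the overlap definition, file name, and mask names here are assumptions for illustration, not necessarily the thesis's exact measure:

```python
import numpy as np
import cv2

def histogram_overlap(lip_vals, skin_vals, bins=64, rng=(0, 180)):
    """Percentage overlap of two normalised 1-D histograms,
    taken as the sum of bin-wise minima (histogram intersection)."""
    h_lip, _ = np.histogram(lip_vals, bins=bins, range=rng)
    h_skin, _ = np.histogram(skin_vals, bins=bins, range=rng)
    h_lip = h_lip / h_lip.sum()
    h_skin = h_skin / h_skin.sum()
    return 100.0 * np.minimum(h_lip, h_skin).sum()

img = cv2.imread("face.png")                         # hypothetical input image
hue = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:, :, 0]  # OpenCV hue spans 0-179
# lip_mask and skin_mask would be ground-truth boolean masks:
# print(histogram_overlap(hue[lip_mask], hue[skin_mask]))
```

A lower overlap means the transform separates lips from skin better, which is why hue's 6.15% stands out among the 33 transforms tested.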
