In human-human dialogues, face-to-face meetings are often preferred over phone conversations. One explanation is that non-verbal modalities such as gesture provide additional information, making communication more efficient and accurate. If so, computer processing of natural language could improve by attending to non-verbal modalities as well. We consider the problem of sentence segmentation, using hand-annotated gesture features to improve recognition. We find that gesture features correlate well with sentence boundaries, but that these features improve the overall performance of a language-only system only marginally. This finding is in line with previous research on this topic. We provide a regression analysis, revealing that for sentence boundary detection, the gestural features are largely redundant with the language model and pause features. This redundancy suggests that gestural features may still be useful when speech recognition is inaccurate, since gesture conveys similar information through a channel unaffected by recognition errors.
Identifier | oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/30540 |
Date | 19 April 2005 |
Creators | Eisenstein, Jacob, Davis, Randall |
Source Sets | M.I.T. Theses and Dissertations |
Language | en_US |
Detected Language | English |
Format | 13 p., 13772256 bytes, 521371 bytes, application/postscript, application/pdf |
Relation | Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory |