| Human Language Technology: Opportunities and Challenges |
2005 |
5 pages |
| Authors:
Mari Ostendorf; Elizabeth Shriberg; Andreas Stolcke; SRI INTERNATIONAL MENLO PARK CA
|
 | In recent years, there has been dramatic progress in both speech and language processing, in many cases leveraging some of the same underlying methods. This progress and the growing technical ties motivate efforts to combine speech and language technologies in spoken document processing applications. This paper outlines some of the issues involved, as well as the opportunities, presenting an overview of the special double session on this topic. |
|
| Structural Metadata Research in the EARS Program |
2005 |
5 pages |
| Authors:
Yang Liu; Elizabeth Shriberg; Andreas Stolcke; Barbara Peskin; Jeremy Ang; Dustin Hillard; Mari Ostendorf; Marcus Tomalin; Phil Woodland; Mary Harper; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS rich transcription program. Tasks include detection of sentence boundaries, filler words, and disfluencies. Modeling approaches combine lexical, prosodic, and syntactic information, using various modeling techniques for knowledge source integration. The performance of these methods is evaluated by task, ... |
|
| Parsing Conversational Speech Using Enhanced Segmentation |
2004 |
5 pages |
| Authors:
Jeremy G. Kahn; Mari Ostendorf; Ciprian Chelba; WASHINGTON UNIV SEATTLE DEPT OF ELECTRICAL ENGINEERING
|
 | The lack of sentence boundaries and presence of disfluencies pose difficulties for parsing conversational speech. This work investigates the effects of automatically detecting these phenomena on a probabilistic parser's performance. We demonstrate that a state-of-the-art segmenter, relative to a pause-based segmenter, gives more than 45% of the possible error reduction in parser performance, and that presentation of interruption points to the parser improves performance over using sentence boundaries alone. Parsing ... |
|
| Detecting Structural Metadata with Decision Trees and Transformation-Based Learning |
2004 |
9 pages |
| Authors:
Joungbum Kim; Sarah E. Schwarm; Mari Ostendorf; WASHINGTON UNIV SEATTLE DEPT OF ELECTRICAL ENGINEERING
|
 | The regular occurrence of disfluencies is a distinguishing characteristic of spontaneous speech. Detecting and removing such disfluencies can substantially improve the usefulness of spontaneous speech transcripts. This paper presents a system that detects various types of disfluencies and other structural information with cues obtained from lexical and prosodic information sources. Specifically, combinations of decision trees and language models are used to predict sentence ends and interruption points and given these ... |
|
| Language Modeling With Sentence-Level Mixtures |
1994 |
7 pages |
| Authors:
Rukmini Iyer; Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA
|
 | This paper introduces a simple mixture language model that attempts to capture long distance constraints in a sentence or paragraph. The model is an m-component mixture of trigram models. The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. Using the BU recognition system, experiments show a 7% improvement in recognition accuracy with the mixture trigram models as compared to ... |
|
| Recognition Using Classification and Segmentation Scoring |
1992 |
6 pages |
| Authors:
Owen Kimball; Mari Ostendorf; Robin Rohlicek; BOSTON UNIV MA
|
 | Traditional statistical speech recognition systems typically make strong assumptions about the independence of observation frames and generally do not make use of segmental information. In contrast, when the segmentation is known, existing classifiers can readily accommodate segmental information in the decision process. We describe an approach to connected word recognition that allows the use of segmental information through an explicit decomposition of the recognition criterion into classification and segmentation scoring. ... |
|
| Weight Estimation for N-Best Rescoring |
1992 |
3 pages |
| Authors:
Ashvin Kannan; Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA
|
 | This paper describes recent improvements in the weight estimation technique for sentence hypothesis rescoring using the N-Best formalism. Mismatches between training and test data are also explored. |
|
| The Use of Prosody in Syntactic Disambiguation |
1991 |
7 pages |
| Authors:
Patti Price; Mari Ostendorf; Stefanie Shattuck-Hufnagel; Cynthia Fong; SRI INTERNATIONAL MENLO PARK CA
|
 | Prosodic structure and syntactic structure are not identical; neither are they unrelated. Knowing when and how the two correspond could yield better quality speech synthesis, could aid in the disambiguation of competing syntactic hypotheses in speech understanding, and could lead to a more comprehensive view of human speech processing. In a set of experiments involving 35 pairs of phonetically similar sentences representing seven types of structural contrasts, the perceptual evidence ... |
|