| Comparing Evaluation Metrics for Sentence Boundary Detection |
2007 |
5 pages |
| Authors:
Yang Liu; Elizabeth Shriberg; TEXAS UNIV AT DALLAS RICHARDSON
|
 | In recent NIST evaluations on sentence boundary detection, a single error metric was used to describe performance. Additional metrics, however, are available for such tasks, in which a word stream is partitioned into subunits. This paper compares alternative evaluation metrics, including the NIST error rate, classification error rate per word boundary, precision and recall, ROC curves, DET curves, precision-recall curves, and area under the curves, and discusses advantages and disadvantages ... |
|
| On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings |
2006 |
5 pages |
| Authors:
Jachym Kolar; Elizabeth Shriberg; Yang Liu; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | We explore speaker-specific prosodic modeling for dialog act segmentation of speech from the ICSI Meeting Corpus. We ask whether features beyond pauses help individual speakers, and whether some speakers benefit from prosody models trained on only their speech. We find positive results for both questions, although the second is more complex. Feature analysis reveals that duration is the most used feature type, followed by pause and pitch features. Results also ... |
|
| Toward Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings |
2005 |
8 pages |
| Authors:
Matthias Zimmermann; Yang Liu; Elizabeth Shriberg; Andreas Stolcke; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | The authors present baseline results for the joint segmentation and classification of dialog acts (DAs) of the International Computer Science Institute (ICSI) Meeting Corpus. Two simple approaches based on word information are investigated and compared with previous work on the same task. The first approach is based on a Hidden-Event Language Model (HE-LM), and the second relies on a Hidden Markov Model (HMM) based tagger. The HE-LM is frequently used ... |
|
| Human Language Technology: Opportunities and Challenges |
2005 |
5 pages |
| Authors:
Mari Ostendorf; Elizabeth Shriberg; Andreas Stolcke; SRI INTERNATIONAL MENLO PARK CA
|
 | In recent years, there has been dramatic progress in both speech and language processing, in many cases leveraging some of the same underlying methods. This progress and the growing technical ties motivate efforts to combine speech and language technologies in spoken document processing applications. This paper outlines some of the issues involved, as well as the opportunities, presenting an overview of the special double session on this topic. |
|
| Structural Metadata Research in the EARS Program |
2005 |
5 pages |
| Authors:
Yang Liu; Elizabeth Shriberg; Andreas Stolcke; Barbara Peskin; Jeremy Ang; Dustin Hillard; Mari Ostendorf; Marcus Tomalin; Phil Woodland; Mary Harper; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on structural metadata extraction in the DARPA EARS rich transcription program. Tasks include detection of sentence boundaries, filler words, and disfluencies. Modeling approaches combine lexical, prosodic, and syntactic information, using various modeling techniques for knowledge source integration. The performance of these methods is evaluated by task, ... |
|
| The ICSI Meeting Recorder Dialog Act (MRDA) Corpus |
2004 |
5 pages |
| Authors:
Elizabeth Shriberg; Raj Dhillon; Sonali Bhagat; Jeremy Ang; Hannah Carvey; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | We describe a new corpus of over 180,000 hand-annotated dialog act tags and accompanying adjacency pair annotations for roughly 72 hours of speech from 75 naturally-occurring meetings. We provide a brief summary of the annotation system and labeling procedure, inter-annotator reliability statistics, overall distributional statistics, a description of auxiliary files distributed with the corpus, and information on how to obtain the data. |
|
| The Meeting Project at ICSI |
2001 |
8 pages |
| Authors:
Nelson Morgan; Don Baron; Jane Edwards; Dan Ellis; David Gelbart; Adam Janin; Thilo Pfau; Elizabeth Shriberg; Andreas Stolcke; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | In collaboration with colleagues at UW, OGI, IBM, and SRI, we are developing technology to process spoken language from informal meetings. The work includes a substantial data collection and transcription effort, and has required a nontrivial degree of infrastructure development. We are undertaking this because the new task area provides a significant challenge to current HLT capabilities, while offering the promise of a wide range of potential applications. In this ... |
|
| A System for Labeling Self-Repairs in Speech |
22 FEB 1993 |
10 pages |
| Authors:
John Bear; John Dowding; Elizabeth Shriberg; Patti Price; SRI INTERNATIONAL MENLO PARK CA ARTIFICIAL INTELLIGENCE CENTER
|
 | This document outlines a system for labeling self-repairs in spontaneous speech. The system marks the location and extent of a repair, as well as relevant words in the region of the repair. Together these labels determine the relationship between the "error" and the hypothesized "correction." The system is designed to be able to capture distinctions among different repair patterns while remaining easy to learn, apply, and integrate into existing transcription ... |
|
| Detection and Correction of Repairs in Human-Computer Dialog |
05 MAY 1992 |
12 pages |
| Authors:
John Bear; John Dowding; Elizabeth Shriberg; SRI INTERNATIONAL MENLO PARK CA
|
 | The authors have analyzed 607 sentences of spontaneous human-computer speech data containing repairs that were drawn from a total corpus of 10,718 sentences. In this paper, they present criteria and techniques for automatically detecting the presence of a repair and its location, and for making the appropriate correction. The criteria involve integration of knowledge from several sources: pattern matching, syntactic and semantic analysis, and acoustics. In summary, disfluencies occur at high enough ... |
|
| Spontaneous Speech Effects in Large Vocabulary Speech Recognition Applications |
FEB 1992 |
6 pages |
| Authors:
John Butzberger; Hy Murveit; Elizabeth Shriberg; Patti Price; SRI INTERNATIONAL MENLO PARK CA
|
 | We describe three analyses on the effects of spontaneous speech on continuous speech recognition performance. We have found that: (1) spontaneous speech effects significantly degrade recognition performance, (2) fluent spontaneous speech yields word accuracies equivalent to read speech, and (3) using spontaneous speech training data can significantly improve performance for recognizing spontaneous speech. We conclude that word accuracy can be improved by explicitly modeling spontaneous effects in the recognizer, and ... |
|