Storming Media: Pentagon Reports and Documents
Communications: Voice Communications

Total Results: 1839 (page 9)
Segment-Based Acoustic Models for Continuous Speech Recognition 31 DEC 94 18 pages
Authors:  Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA
The full text of this report is available for sale. This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition by extending previous work in segment-based modeling, by introducing a new hierarchical approach to representing intra-utterance statistical dependencies, and by developing language models that capture topic dependencies. These techniques, which have high computational costs because of the large search space associated with higher order models, are made feasible through a multi-pass search strategy that ...


Voice Analysis Using the Bispectrum DEC 94 73 pages
Authors:  Deborah A. Douglass; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH
The full text of this report is available for sale. The theory of the bispectrum has been studied, though very few practical applications have yet been considered in any depth. One application mentioned in the literature is the use of the bispectrum for voice signal processing. The aim of this thesis was to research the bispectrum towards the particular application of speech enhancement. The technique is based on the fact that the bispectrum is zero for a Gaussian white noise ...


Multiclassifier Fusion of an Ultrasonic Lip Reader in Automatic Speech Recognition DEC 94 100 pages
Authors:  David L. Jennings; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH
The full text of this report is available for sale. This thesis investigates the use of two active ultrasonic devices in collecting lip information for performing and enhancing automatic speech recognition. The two devices explored are called the 'Ultrasonic Mike' and the 'Lip Lock Loop.' The devices are tested in a speaker dependent isolated word recognition task with a vocabulary consisting of the spoken digits from zero to nine. Two automatic lip readers are designed and tested based on the ...


Isolated Digit Recognition Without Time Alignment DEC 94 148 pages
Authors:  Jeffrey M. Gay; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH
The full text of this report is available for sale. This thesis examines methods for isolated digit recognition without using time alignment. Resource requirements for isolated word recognizers that use time alignment can become prohibitively large as the vocabulary to be classified grows. Thus, methods capable of achieving recognition rates comparable to those obtained with current methods using these techniques are needed. The goals of this research are to find feature sets for speech recognition that perform well without using ...


High-Performance Speech Recognition Using Consistency Modeling DEC 94 149 pages
Authors:  Vassilios Digalakis; Hy Murveit; Peter Monaco; Leo Neumeyer; Ananth Sankar; SRI INTERNATIONAL MENLO PARK CA
The full text of this report is available for sale. The goal of SRI's consistency modeling project is to improve the raw acoustic modeling component of SRI's DECIPHER speech recognition system and develop consistency modeling technology. Consistency modeling aims to reduce the number of improper independence assumptions used in traditional speech recognition algorithms so that the resulting speech recognition hypotheses are more self-consistent and, therefore, more accurate. At the initial stages of this effort, SRI focused on developing the appropriate ...


Speech Analysis and Synthesis Based on Pitch-Synchronous Segmentation of the Speech Waveform 09 NOV 94 55 pages
Authors:  George S. Kang; Lawrence J. Fransen; NAVAL RESEARCH LAB WASHINGTON DC
The full text of this report is available for sale. This report describes a new speech analysis/synthesis method. This new technique does not attempt to model the human speech production mechanism. Instead, we represent the speech waveform directly in terms of the speech waveform defined in a pitch period. A significant merit of this approach is the complete elimination of pitch interference because each pitch-synchronously segmented waveform does not include a waveform discontinuity. One application of this new speech analysis/synthesis ...


Joint Maritime Command Information System (JMCIS) Synthetic Theater of War (STOW) Interface OCT 94 31 pages
Authors:  J. K. Byram; NAVAL COMMAND CONTROL AND OCEAN SURVEILLANCE CENTER RDT AND E DIV SAN DIEGO CA
The full text of this report is available for sale. This particular effort, a subset of the ADS program, provides the ability to interface the Joint Maritime Command Information System (JMCIS) to the Synthetic Theater of War (STOW). JMCIS is an operational command control system providing tactical C2I planning, execution and supervision support for all warfare areas at over 250 installations afloat and ashore. STOW is a spatially distributed synthetic battlefield represented by real world forces, simulators, and models and ...


Vocoded KING Data Base OCT 94 43 pages
Authors:  P. C. Grossnickle; NAVAL COMMAND CONTROL AND OCEAN SURVEILLANCE CENTER RDT AND E DIV SAN DIEGO CA
The full text of this report is available for sale. This document reports the development of a data base of vocoded speech that has been passed through several popular voice coders and their corresponding decoders. The KING data base consists of samples from 26 male speakers recorded sequentially on DAT tape. Two products were produced: five DAT tapes, each consisting of the original KING data base on one channel and the vocoded version on the other; and raw binary files, ...


Vocal Cord Function and Voice Quality Evaluation of Active Duty U.S. Army Drill Instructors OCT 94 11 pages
Authors:  Eric Mann; Jeffrey Paffrath; WALTER REED ARMY MEDICAL CENTER WASHINGTON DC
The full text of this report is available for sale. Leaders of small military units readily admit to deterioration or even loss of voice during training and field exercises. Degradation of voice quality can severely impair field communication, and could potentially adversely affect a leader's ability to safely and effectively command his/her unit. Frequently, such voice changes resolve only after prolonged voice rest, and repeated episodes have led to permanent vocal cord pathology and socially unacceptable voice quality in some ...


Usable, Real-Time, Interactive Spoken Language Systems SEP 94 87 pages
Authors:  J. Makhoul; M. Bates; BBN SYSTEMS AND TECHNOLOGIES CORP CAMBRIDGE MA
The full text of this report is available for sale. The objective of this project was to make the next significant advance in human-machine interaction by developing a spoken language system (SLS) that operates in real-time while maintaining high accuracy on cost-effective COTS (commercial, off-the-shelf) hardware. The system has a highly interactive user interface, is largely user independent, and is intended to be easily portable to new applications. The BBN HARC spoken language system consists of the Byblos speech recognition system and ...


The TRAINS Project: A Case Study in Building a Conversational Planning Agent SEP 94 60 pages
Authors:  J. F. Allen; L. K. Schubert; G. M. Ferguson; P. A. Heeman; C. H. Hwang; ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. The TRAINS project is an effort to build a conversationally proficient planning assistant. A key part of the project is the construction of the TRAINS system, which provides the research platform for a wide range of issues in natural language understanding, mixed-initiative planning systems, and representing and reasoning about time, actions and events. Four years have now passed since the beginning of the project. Each year we have produced a ...


Robust Continuous Speech Recognition SEP 94 50 pages
Authors:  J. Makhoul; R. Schwartz; BBN SYSTEMS AND TECHNOLOGIES CORP CAMBRIDGE MA
The full text of this report is available for sale. The objective of this basic research is to develop accurate and detailed mathematical models of the fundamental units of speech (phonemes) for large-vocabulary continuous speech recognition. The important goals of this work are to achieve the highest possible word recognition accuracy in continuous speech and to develop methods for the rapid adaptation of phonetic models to the voice of a new speaker.


Semi-Automated Speech Transcription System Study AUG 94 21 pages
Authors:  Janet Baker; DRAGON SYSTEMS INC NEWTON MA
The full text of this report is available for sale. This report describes preliminary explorations towards the design of a semi-automatic transcription system. Current transcription practices were studied and are described in this report. The promising results of several speech recognition experiments as well as a topic identification experiment, all performed on broadcast data, are reported. These experiments were designed to gauge the quality of speech recognition on broadcast data and to explore possible uses of a continuous speech recognizer ...


TRAINS: Dialogue Transcription Tools AUG 94 61 pages
Authors:  Peter A. Heeman; James F. Allen; ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. This document describes a toolkit and guidelines for the transcription of dialogues. The premise of these tools is that a dialogue between two people can be broken down into a series of utterance files, each spoken by one participant. This allows the transcription tools and standards already designed for single speaker speech to be used. (AN)


Detection of Stress by Voice: Analysis of the Glottal Pulse AUG 94 80 pages
Authors:  Jeff Waters; Steve Nunn; Brenda Gillcrist; Eric VonColln; NAVAL COMMAND CONTROL AND OCEAN SURVEILLANCE CENTER RDT AND E DIV SAN DIEGO CA
The full text of this report is available for sale. An Independent Exploratory Development (IED) study was performed to determine whether or not significant measures are present in the human voice for detecting the emotional reaction, 'stress.' A technique was implemented for automatically measuring parameters of the glottal pulse, to see which might be indicators of stress. The results of this IED study confirmed that several of the measures are significant indicators of stress; for example, the glottal pulse generally ...


A Comparison of Signal-Processing Front Ends for Automatic Speech Recognition 18 JUL 94
Authors:  C. R. Jankowski Jr.; H-d. H. Vo; R. P. Lippmann; MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB
The full text of this report is not available and therefore is not for sale. This information is provided for reference purposes only. The first stage of any system for automatic speech recognition (ASR) is a signal-processing front end that converts a sampled speech waveform into a more suitable representation for later processing. Several front ends are compared, three of which are based on knowledge about the human auditory system. The performance of an ASR system with these front ends was compared to a control mel filter bank (MFB)-based cepstral representation in clean ...


A Variable Rate Voice Coder using LPC-10E 12 JUL 1994 3 pages
Authors:  J. P. Macker; R. B. Adamson; NAVAL RESEARCH LAB WASHINGTON DC COMMUNICATION SYSTEMS BRANCH
The full text of this report is available for sale. This paper describes the current status of an ongoing research and development effort whose objective is a variable rate voice coder with a high degree of speech intelligibility and natural voice quality with an average throughput rate near 1200 b/s. The voice coder described here is based on the DoD FS 1015 LPC-10 2400 b/s vocoder. A method of silence detection and the use of variable size data frame formats ...


Spoken Dialogue Understanding and Local Context JUL 94 56 pages
Authors:  Peter A. Heeman; ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Spoken dialogue poses many new problems to researchers in the field of computational linguistics. In particular, conversants must detect and correct speech repairs, segment a turn into individual utterances, and identify discourse markers. These problems are interrelated. For instance, there are some lexical items whose role in an utterance can be ambiguous: they can act as discourse markers, signal a speech repair, or even be part of the content of ...


Segment-Based Acoustic Models for Continuous Speech Recognition 30 JUN 94 13 pages
Authors:  Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA
The full text of this report is available for sale. This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition by extending previous work in segment-based modeling, by introducing a new hierarchical approach to representing intra-utterance statistical dependencies, and by developing language models that capture topic dependencies. These techniques, which have high computational costs because of the large search space associated with higher order models, are made feasible through a multi-pass search strategy that ...


Tutorial on Set-Up and Communications Delays for all UHF SATCOM DAMA modes of Operation 20 JUN 1994 144 pages
Authors:  JOINT INTEROPERABILITY AND ENGINEERING ORGANIZATION FORT MONMOUTH NJ
The full text of this report is available for sale. The objective of this briefing is to address two specific requirements identified at the MILSATCOM Users Conference 93-1. These requirements are stated in a Joint staff tasking memorandum, Subject: Tasking Based on MILSATCOM Users Conference 93-1, dated 10 September 1993. Requirements (a) and (b) of that memorandum request that the following actions be taken: (1) Identify voice services supported by the UHF DAMA standards (MIL-STD-188-181 through 183). Describe access and ...


Noise Reduction for Speech Enhancement Using Non-Linear Wavelet Processing JUN 94 201 pages
Authors:  Hassan Dehmani; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING
The full text of this report is available for sale. The problem of speech enhancement presents many obstacles in the speech processing field. This thesis develops several speech de-noising systems that can be used in the time, Fourier, and wavelet domains. We present two thresholding techniques: soft and hard. The application of these thresholding techniques to noisy speech data is discussed. The combination of both the wavelet and Fourier domains with noisy phase restoration proves to yield the best results ...


Speech Recognition of Foreign Accent JUN 94 82 pages
Authors:  John K. Dewey; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
The full text of this report is available for sale. This thesis investigates the application of AutoRegressive (AR) modeling techniques on single syllable words to detect foreign accents in spoken American English. The study involves thirty-one native American English speakers, and six native Brazilian speakers. Five different distance measures are used for classification. Results show that correct classification is obtained for 88% of the native English speakers and 80.5% of the non-native (foreign) English speakers. Speech processing, Foreign accent ...


An Analysis of Tower (Local) Controller - Pilot Voice Communications JUN 94 28 pages
Authors:  Kim M. Cardosi; JOHN A VOLPE NATIONAL TRANSPORTATION SYSTEMS CENTER CAMBRIDGE MA
The full text of this report is available for sale. The purpose of this analysis was to examine current pilot-controller communication practices in the terminal environment. Forty-nine hours of voice tapes from local positions in ten Air Traffic Control Towers (ATCTs) were examined. There were 8,444 controller-to-pilot messages (e.g., clearances to take off or land, instructions to hold short or change radio frequencies, etc.) examined in this study. The complexity of the controller's message (i.e., the number of pieces ...


An Analysis of Voice Responses for the Detection of Deception JUN 94 40 pages
Authors:  Victor L. Cestaro; Andrew B. Dollins; DEPARTMENT OF DEFENSE POLYGRAPH INST FORT MCCLELLAN AL
The full text of this report is available for sale. This study was designed to examine the feasibility of using audio pitch analysis and spectrum decomposition techniques to aid in the detection of deception following a numbers test. Usable audio recordings from 28 of 44 male subjects' responses during a Peak of Tension (POT) test were made while a Lafayette field polygraph was used to collect respiration, cardiovascular, and electrodermal responses for manual evaluation. Half of the examinees were programmed 'deceptive' ...


Signal Processing via Fourier-Bessel Series Expansion 31 MAY 94
Authors:  Jim Schroeder; DENVER UNIV CO COLL OF ENGINEERING
The full text of this report is not available and therefore is not for sale. This information is provided for reference purposes only. In many cases it may not be desirable or even practical to represent a signal by its sample values directly or by an analytical function if a suitable function is available. For example, a signal may be determined by time domain sample values when the parameters of interest are more compact within the frequency domain. Many practical signals are highly redundant; both image and speech signals fall into this category, ...


Summary of International Workshop on Networked Reality in Telecommunication (1st) Held in Tokyo, Japan on 13-14 May 1994 14 MAY 94 22 pages
Authors:  T. Davis; ASIAN OFFICE OF AEROSPACE RESEARCH AND DEVELOPMENT APO AP 96337-0007
The full text of this report is available for sale. Tele-presence makes it possible to convene quickly a meeting of several people in different offices, cities, or continents. As a result, time otherwise spent in travel may be used more efficiently, and urgent matters may be considered quickly for timely action. The design should be a social activity the interactions of individuals within groups and the relation of groups to one another. The communication needs of designers are increasing s ...


Segment-Based Acoustic Models for Continuous Speech Recognition 11 MAY 94 14 pages
Authors:  Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA DEPT OF ELECTRICAL COMPUTER AND SYSTEMS ENGINEERING
The full text of this report is available for sale. This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition by extending previous work in segment-based modeling and by introducing a new hierarchical approach to representing intra-utterance statistical dependencies. These techniques, which have high computational costs because of the large search space associated with higher order models, are made feasible through rescoring a set of HMM-generated N-best sentence hypotheses. We expect these different modeling, ...


Voice and Video Transmission Using XTP and FDDI MAY 94 8 pages
Authors:  John Drummond; Edwin Cheng; Will Gex; NAVAL COMMAND CONTROL AND OCEAN SURVEILLANCE CENTER RDT AND E DIV SAN DIEGO CA
The full text of this report is available for sale. The use of XTP and FDDI provides a high speed and high performance network solution to multimedia transmission that requires high bandwidth. FDDI is an ANSI and ISO standard for a MAC and Physical layer protocol that provides a signaling rate of 100 Mbits/sec and fault tolerance. XTP is a Transport and Network layer protocol designed for high performance and efficiency and is the heart of the SAFENET Lightweight Suite ...


Modeling the Interaction between Speech and Gesture MAY 94 15 pages
Authors:  Justine Cassell; Matthew Stone; Brett Douville; Scott Prevost; Brett Achorn; MOORE SCHOOL OF ELECTRICAL ENGINEERING PHILADELPHIA PA DEPT OF COMPUTER AND INFORMATION SCIENCES
The full text of this report is available for sale. This paper describes an implemented system that generates spoken dialogue, including speech, intonation, and gesture, using two copies of an identical program that differ only in knowledge of the world and which must cooperate to accomplish a goal. The output of the dialogue generation is used to drive a three-dimensional interactive animated model - two graphic figures on a computer screen who speak and gesture according to the rules of ...


Technology Status: Damage Control Hull Communications (DC HULLCOM). Part 1. Advanced Development Model 22 MAR 94 31 pages
Authors:  Thomas T. Street; John Vodzak; Tung Pham; NAVAL RESEARCH LAB WASHINGTON DC
The full text of this report is available for sale. A noninterruptible and survivable damage control communications system (DC HULLCOM) has been developed that uses ultrasonic energy to communicate voice and casualty data through the ship's hull and structure. The system consists of a portable, battery-powered transceiver unit capable of voice and/or data communications; an acoustic transducer with clamp attachment; and a headset and Voice-Ducer (bone-conduction microphone and earphone) with cable assemblies. The system can be interfaced to the hull-mounted ...


Noise Cancellation for CELP Voice Encoders in an F/A-18 Noise Environment 18 MAR 94 23 pages
Authors:  David A. Heide; NAVAL RESEARCH LAB WASHINGTON DC
The full text of this report is available for sale. Because of the severe noise environment in the Navy's F/A-18 jet aircraft, it has always been very difficult to achieve highly intelligible speech using low data rate voice encoders such as the 2.4 kbps LPC-10. As a result, all voice encoding has been done with a high data rate 16.0 kbps CVSD algorithm. The main focus of this research was to develop a technique that could retain the acceptable intelligibility ...


High-Performance Speech Recognition Using Consistency Modeling MAR 94 11 pages
Authors:  Vassilios Digalakis; Peter Monaco; Hy Murveit; Mitchel Weintraub; SRI INTERNATIONAL MENLO PARK CA
The full text of this report is available for sale. The goal of this project conducted by SRI International (SRI) is to develop consistency modeling technology. Consistency modeling aims to reduce the number of improper independence assumptions used in traditional speech-recognition algorithms so that the resulting speech-recognition hypotheses are more self-consistent and, therefore, more accurate. Consistency is achieved by conditioning HMM output distributions on state and observation histories, P(x | s, H). The technical objective of the project is to find ...


Clustering Techniques in Speaker Recognition MAR 94 99 pages
Authors:  Douglas N. Prescott; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING
The full text of this report is available for sale. This thesis presents a comparison, based on identification rate, of three clustering techniques applied to cepstral features for speaker identification. LBG vector quantization, as developed by Linde, Buzo and Gray, is used to provide benchmark performance for comparison with Fuzzy clustering (based on the unsupervised fuzzy partition-optimal number of classes, UFP-ONC algorithm by Gath and Geva) and an Artificial Neural Network, the Multilayer Perceptron. Cepstral features from the TIMIT, King ...


Segment-Based Acoustic Models for Continuous Speech Recognition 11 FEB 94 12 pages
Authors:  Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA DEPT OF ELECTRICAL COMPUTER AND SYSTEMS ENGINEERING
The full text of this report is available for sale. In this work, we are interested in the problem of large vocabulary, speaker-independent continuous speech recognition, and primarily in the acoustic modeling component of this problem. In developing acoustic models for speech recognition, we have conflicting goals. On one hand, the models should be robust to inter- and intra-speaker variability, to the use of a different vocabulary in recognition than in training, and to the effects of moderately noisy environments. In ...


Connected Speech Study for Cockpit Applications FEB 94 41 pages
Authors:  Timothy P. Barry; Thomas J. Solz; John M. Reising; WRIGHT LAB WRIGHT-PATTERSON AFB OH
The full text of this report is available for sale. Eleven subjects participated in a study designed to test the accuracy of a newer generation connected speech recognition system using 49 vocabulary words likely to be tested in an aircraft cockpit environment. The 49 vocabulary words were used to create 392 phrases. These phrases were divided into three groups: COMPLEX phrases, which contained more than five words, and two groups of SIMPLE phrases, which contained 5 words or less. The ...


Advanced Distributed Simulation Technology, Digital Voice Gateway Reference Guide 28 JAN 94 43 pages
Authors:  Dan Van Hook; Ed Stadler; LORAL SYSTEMS CO ORLANDO FL ADST PROGRAM OFFICE
The full text of this report is available for sale. The Digital Voice Gateway (referred to as the 'DVG' in this document) transmits and receives four full duplex encoded speech channels over the Ethernet. The information in this document applies only to DVGs running firmware of the version listed on the title page. This document, previously named Digital Voice Gateway Reference Guide, BBN Systems and Technologies Corporation, Cambridge, MA 02138, was revised for revision 2.00. This new revision changes the ...


A Transducer/Equipment System for Capturing Speech Information for Subsequent Processing by Computer Systems 07 JAN 1994 39 pages
Authors:  Benjamin Tirabassi; TECHNICAL EVALUATION RESEARCH INC LITTLE SILVER NJ
The full text of this report is available for sale. The objective of the Phase I Small Business Innovative Research (SBIR) Program Topic A93-033, entitled A Transducer/Equipment System for Capturing Speech Information for Subsequent Processing by Computer Systems, is exploratory research to parameterize the speech signal, provide benchmark measurement techniques, and to assemble a superior speech capture system while minimizing the effects of noise and interference. Human verbal responses under controlled conditions and when exposed to stress exhibit quantifiable differences ...


Adaptation to New Microphones Using Tied-Mixture Normalization 1994 6 pages
Authors:  Anastasios Anastasakos; Francis Kubala; John Makhoul; Richard Schwartz; BBN SYSTEMS AND TECHNOLOGIES CORP CAMBRIDGE MA
The full text of this report is available for sale. In this paper, we present several approaches designed to increase the robustness of BYBLOS, the BBN continuous speech recognition system. We address the problem of increased degradation in performance when there is mismatch in the characteristics of the training and the test microphones. We introduce a new supervised adaptation algorithm that computes a transformation from the training microphone codebook to that of a new microphone, given some information about the ...


High-Accuracy Large-Vocabulary Speech Recognition Using Mixture Tying and Consistency Modeling 1994 7 pages
Authors:  Vassilios Digalakis; Hy Murveit; SRI INTERNATIONAL MENLO PARK CA
The full text of this report is available for sale. Improved acoustic modeling can significantly decrease the error rate in large-vocabulary speech recognition. Our approach to the problem is twofold. We first propose a scheme that optimizes the degree of mixture tying for a given amount of training data and computational resources. Experimental results on the Wall Street Journal (WSJ) Corpus show that this new form of output distribution achieves a 25% reduction in error rate over typical tied-mixture ...


Microphone-Independent Robust Signal Processing Using Probabilistic Optimum Filtering 1994 7 pages
Authors:  Leonardo Neumeyer; Mitchel Weintraub; SRI INTERNATIONAL MENLO PARK CA
The full text of this report is available for sale. A new mapping algorithm for speech recognition relates the features of simultaneous recordings of clean and noisy speech. The model is a piecewise nonlinear transformation applied to the noisy speech feature. The transformation is a set of multidimensional linear least-squares filters whose outputs are combined using a conditional Gaussian model. The algorithm was tested using SRI's DECIPHER(Trademark) speech recognition system. Experimental results show how the mapping is used to reduce ...


Techniques to Achieve an Accurate Real-Time Large-Vocabulary Speech Recognition System 1994 7 pages
Authors:  Hy Murveit; Peter Monaco; Vassilios Digalakis; John Butzberger; SRI INTERNATIONAL MENLO PARK CA
The full text of this report is available for sale. In addressing the problem of achieving high-accuracy real-time speech recognition systems, we focus on recognizing speech from ARPA's 20,000-word Wall Street Journal (WSJ) task, using current UNIX workstations. We have found that our standard approach (using a narrow beam width in a Viterbi search for simple discrete-density hidden Markov models, or HMMs) works in real time with only very low accuracy. Our most accurate algorithms recognize speech many times slower than real time. ...


Subphonetic Acoustic Modeling for Speaker-Independent Continuous Speech Recognition 17 DEC 93
Authors:  Mei-Yuh Hwang; CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER SCIENCE
The full text of this report is not available and therefore is not for sale. This information is provided for reference purposes only. To model the acoustics of a large vocabulary well while staying within a reasonable memory capacity, most speech recognition systems use phonetic models to share parameters across different words in the vocabulary. This dissertation investigates the merits of modeling at the subphonetic level. We demonstrate that sharing parameters at the subphonetic level provides more accurate acoustic models than sharing at the phonetic level. The concept of subphonetic parameter sharing can ...


Identity Verification Through the Fusion of Face and Speaker Recognition DEC 93 186 pages
Authors:  John G. Keller; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING
The full text of this report is available for sale. In this research, face recognition and speaker identification systems are each converted into verification systems. The two verification systems are then fused to form a single identity verification system. Finally, the use of the Karhunen-Loeve Transform (KLT) for dimensional reduction is examined for suitability in the verification task. The base face recognition system used the KLT for feature reduction and a back-propagation neural net for classification. Verification involved training a ...


An Exploratory Study of Neuro Linguistic Programming and Communication Anxiety DEC 93 96 pages
Authors:  Lois M. Brunner; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
The full text of this report is available for sale. This thesis is an exploratory study of Neuro-Linguistic Programming (NLP), and its capabilities to provide a technique or a composite technique that will reduce the anxiety associated with making an oral brief or presentation before a group, sometimes referred to as Communication Apprehension. The composite technique comes from NLP and Time Line Therapy, which is an extension to NLP. Student volunteers (17) from a Communications course given by the Administrative ...


ATC/Pilot Voice Communications - A Survey of the Literature NOV 93 39 pages
Authors:  O. V. Prinzo; Thomas W. Britton; FEDERAL AVIATION ADMINISTRATION WASHINGTON DC OFFICE OF AVIATION MEDICINE
The full text of this report is available for sale. The first radio-equipped control tower in the United States opened at the Cleveland Municipal Airport in 1930. From that time to the present, voice radio communications have played a primary role in air safety. Verbal communications in air traffic control (ATC) operations have been frequently cited as causal factors in operational errors and pilot deviations in the FAA Operational Error and Deviation System, the NASA Aviation Safety Reporting System (ASRS), ...


NCVS Status and Progress Report, Volume 5, November 1993 NOV 93 106 pages
Authors:  NATIONAL CENTER FOR VOICE AND SPEECH IOWA CITY IA
The full text of this report is available for sale. This 5th Status and Progress Report comes at a time when we are all thinking of the long term future of the National Center for Voice and Speech. Many things are changing right now in the health science arena. All of us are familiar, of course, with the day to day developments of President Clinton's health care package. One thing is certain - each year we have to become more ...


Segment-Based Acoustic Models for Continuous Speech Recognition 08 OCT 93 12 pages
Authors:  Mari Ostendorf; J. R. Rohlicek; BOSTON UNIV MA DEPT OF ELECTRICAL COMPUTER AND SYSTEMS ENGINEERING
The full text of this report is available for sale. This research aims to develop new and more accurate stochastic models for speaker-independent continuous speech recognition by extending previous work in segment-based modeling and by introducing a new hierarchical approach to representing intra-utterance statistical dependencies. These techniques, which have high computational costs because of the large search space associated with higher order models, are made feasible through rescoring a set of HMM-generated N-best sentence hypotheses. We expect these different modeling ...


Template Based Low Data Rate Speech Encoder 30 SEP 93 15 pages
Authors:  Lawrence Fransen; NAVAL RESEARCH LAB WASHINGTON DC
The full text of this report is available for sale. The 2400-b/s linear predictive coder (LPC) is currently being widely deployed to support tactical voice communication over narrowband channels. However, there is a need for lower-data-rate voice encoders for special applications: improved performance in high bit-error conditions, low-probability-of-intercept (LPI) voice communication, and narrowband integrated voice/data systems. An 800-b/s voice encoding algorithm is presented which is an extension of the 2400-b/s LPC. To construct template tables, speech samples of 420 ...


A Real-Time Spoken-Language System for Interactive Problem-Solving, Combining Linguistic and Statistical Technology for Improved Spoken Language Understanding 30 SEP 93 8 pages
Authors:  Robert C. Moore; Michael H. Cohen; SRI INTERNATIONAL MENLO PARK CA
The full text of this report is available for sale. Under this effort, SRI has developed spoken-language technology for interactive problem solving, featuring real-time performance for up to several thousand word vocabularies, high semantic accuracy, habitability within the domain, and robustness to many sources of variability. Although the technology is suitable for many applications, efforts to date have focussed on developing an Air Travel Information System (ATIS) prototype application. SRI's ATIS system has been evaluated in four ARPA benchmark evaluations, ...


Perception of Complex Auditory Patterns 14 SEP 93
Authors:  Charles S. Watson; INDIANA UNIV AT BLOOMINGTON HEARING AND COMMUNICATION LAB
The full text of this report is not available and therefore is not for sale. This information is provided for reference purposes only. This report describes research progress in three areas: the perception of complex sounds, including tonal sequences and bursts of frozen Gaussian noise; models for the discrimination of complex sounds; and the perception of speech sounds, under various degrees of stimulus uncertainty and levels of training. Major accomplishments during this period include: the finding that the ability to detect very small frequency changes in single components of tonal sequences, previously assumed ...

