| Phraselator Questionnaire Responses |
May-2009 |
14 pages |
| Authors:
James D Walrath; ARMY RESEARCH LAB ADELPHI MD COMPUTATIONAL AND INFORMATION SCIENCES DIRECTORATE
|
 | A questionnaire designed to elicit Soldier judgments of, and comments about, the upgraded Phraselator speech translation device was completed by seven personnel serving in Iraq. Aggregate responses to yes or no questions are provided, as are the frequency of usage by mission, acceptability by mission, and all comments made by personnel to open-ended questions about the usefulness of the Phraselator during actual missions. |
|
| The ICAO English Language Proficiency Rating Scale Applied to Enroute Voice Communication of U.S. and Foreign Pilots |
May-2009 |
20 pages |
| Authors:
O Veronika Prinzo; Audrey C Thompson; FEDERAL AVIATION ADMINISTRATION OKLAHOMA CITY OK CIVIL AEROSPACE MEDICAL INST
|
 | This is the third and final report in a series that examined communications between pilots and air traffic controllers during en route operations. The first report examined message complexity and message length as factors associated with communication problems (e.g., readback errors (RBEs), requests for repeats (RfR), and breakdowns in communication (BIC)). The second report examined these same communication problems by differentiating between pilots flying U.S.- and foreign-registry aircraft. Aircraft ... |
|
| Law Enforcement Head-Borne Personal Protective Equipment Hearing Attenuation |
Apr-2009 |
48 pages |
| Authors:
Qi Li; Joshua Haijeck; Tom Burchfield; LI CREATIVE TECHNOLOGIES INC FLORHAM PARK NJ
|
 | Test methods were developed to quantify and assess the effects of personal protection equipment (PPE) on hearing. The tests use a head and torso simulator that is able to don PPE and employs advanced acoustic, signal processing, and measurement techniques. The tests measure localization and speech intelligibility effects of PPE. The methods also assess the effects of noise generated by PPE fabric and/or electro/mechanical noise. Localization effects are evaluated in ... |
|
| Voice Traffic over Mobile Ad Hoc Networks: A Performance Analysis of the Optimized Link State Routing Protocol |
26-Mar-2009 |
102 pages |
| Authors:
Noreen P Santos; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH GRADUATE SCHOOL OF ENGINEERING AND MANAGEMENT
|
 | This thesis investigates the performance of the Optimized Link State Routing (OLSR) protocol on Voice over Internet Protocol (VoIP) applications in Mobile Ad hoc Networks (MANETs). Representative VoIP traffic is submitted to a MANET and end-to-end delay and packet loss are observed. Node density, number of data streams, and mobility are varied, creating a full-factorial experimental design with 18 distinct scenarios. The MANET is simulated in OPNET and VoIP traffic ... |
|
| Regulation Scheme for Improved Innovation and Efficiency in Wireless Communications |
Mar-2009 |
123 pages |
| Authors:
John R Kajmowicz; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | Current FCC regulation of the electromagnetic spectrum hinders the growth of wireless communication technology and fails to make efficient use of an extremely valuable asset. Current policies have failed to keep pace with advancing technology and require a completely new allocation scheme in order to promote growth in the wireless communications industry. This paper proposes a new allocation scheme for spectrum regulation that promotes competition in the marketplace in order ... |
|
| Subjective Audio Quality over a Secure IEEE 802.11n Draft 2.0 Wireless Local Area Network |
Mar-2009 |
90 pages |
| Authors:
Benjamin W Ramsey; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING AND MANAGEMENT
|
 | This thesis investigates the quality of audio generated by a G.711 codec and transmission over an IEEE 802.11n draft 2.0 wireless local area network (WLAN). Decline in audio quality due to additional calls or by securing the WLAN with Internet Protocol Security (IPsec) is quantified. Audio quality over an IEEE 802.11n draft 2.0 WLAN is also compared to that of IEEE 802.11b and IEEE 802.11g WLANs under the same conditions. ... |
|
| Intelligibility of Target Signals in Sequential and Simultaneous Segregation Tasks |
Mar-2009 |
43 pages |
| Authors:
Douglas Brungart; Brian D Simpson; Nandini Iyer; AIR FORCE RESEARCH LAB WRIGHT-PATTERSON AFB OH BATTLESPACE ACOUSTICS BRANCH
|
 | Two experiments are described in the report: the first experiment investigated target intelligibility in a sequential segregation task, while the second experiment investigated performance in a simultaneous segregation task. In the first experiment, speech intelligibility was examined in the presence of four types of maskers (continuous noise, interrupted noise, continuous speech and interrupted speech) at seven interruption rates. Results highlight the important role that target-masker similarity plays in the segregation ... |
|
| Exploring the Plausibility of a National Multi-Agency Communications System for the Homeland Security Community: A Southeast Ohio Half-Duplex Voice Over IP Case Study |
Mar-2009 |
93 pages |
| Authors:
Christopher S Smith; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | Since 9-11, it has become apparent that the Homeland Security Community consists of more than first responders, and is, in essence, a Megacommunity composed of three components: government, business, and nonprofit organizations. Unfortunately, this reality has not translated into a common communications strategy, which presently focuses on radios for first responders in an emergency. Many reasons exist for not addressing this gap, including the myths that it is impossible or ... |
|
| Prosodic Stress, Information, and Intelligibility of Speech in Noise |
28-Feb-2009 |
18 pages |
| Authors:
Pierre L Divenyi; EAST BAY INST FOR RESEARCH AND EDUCATION INC MARTINEZ CA
|
 | Prosodic stress increases the salience of stressed syllables. The project investigated whether this property of speech is used by listeners for the understanding of spoken sentences presented in noise. Stressed syllables in the 720-sentence IEEE corpus were marked and envelope contours were generated to increase or decrease the level of speech-spectrum noise in synchrony with the occurrence of stressed syllables. Data from ten normal-hearing young listeners indicate that signal-to-noise ratio ... |
|
| Applying State-of-the-Art Technologies to Reduce Escape Times from Fires Using Environmental Sensing, Improved Occupant Egress Guidance, and Multiple Communication Protocols |
06-Feb-2009 |
37 pages |
| Authors:
Frederick W Williams; Thomas T Street; Mark H Hammond; NAVAL RESEARCH LAB WASHINGTON DC CHEMISTRY DIV
|
 | In 2006, under contract to the Consumer Product Safety Commission (CPSC), the Naval Research Laboratory (NRL) was tasked with investigating various technologies and concepts--such as visual signals and unique audible sounds--that have the potential to improve residential occupant escape in the event of fire. The investigation included an evaluation of the feasibility of incorporating new technologies or concepts that aid escape capabilities and may improve egress times in residential ... |
|
| Rapidly Customizable Spoken Dialogue Systems |
28-Jan-2009 |
12 pages |
| Authors:
James Allen; FLORIDA INSTITUTE FOR HUMAN AND MACHINE COGNITION INC PENSACOLA FL
|
 | Building a robust spoken dialogue system for a new application, task, or domain currently requires considerable effort, including substantial efforts in data collection, building language models, grammar/parser development, building a custom dialogue manager, and developing the connection to the system's back-end systems (e.g., a database query or knowledge based system). This project developed key parts of a technology base upon which spoken dialogue systems can be rapidly constructed for new ... |
|
| Smart Caching for Efficient Information Sharing in Distributed Information Systems |
01-Sep-2008 |
67 pages |
| Authors:
Dirk Ableiter; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | Remarkable technical advances in cell phones and smart phones have resulted in a worldwide marketplace permeated by mobile devices. These capabilities, in combination with increasing consumer demand to share information, show the need for using mobile devices more efficiently. However, within a distributed network of mobile devices like TwiddleNet, the two most limiting resources are still battery power and bandwidth. By distributing only small sets of data that represent ... |
|
| A Review of Contributions by Australian Research Institutions into Speech Processing |
Aug-2008 |
50 pages |
| Authors:
Trevor C Tao; DEFENCE SCIENCE AND TECHNOLOGY ORGANISATION EDINBURGH (AUSTRALIA) COMMAND CONTROL COMMUNICATIONS AND INTELLIGENCE DIV
|
 | This report is a survey of contributions by various research institutions within Australia into several important applications of speech processing, such as speech and speaker recognition. The purpose of this report is to give a rough snapshot of where a number of individual research institutions stand. For each application, a number of research papers within Australia are discussed in detail. Although much of the above research is directed towards simple ... |
|
| Department of the Navy Naval Networking Environment (NNE)-2016. Strategic Definition, Scope and Strategy Paper, Version 1.1 |
13-May-2008 |
35 pages |
| Authors:
NAVAL NETWORKING ENVIRONMENT
|
 | The Department of the Navy (DON) Chief Information Officer (CIO) has led the effort to define the vision, scope, strategy, and concept of operations (CONOPS), for the Department of the Navy's future Naval Networking Environment (NNE), in the 2016 timeframe. The information contained in this paper will allow the Department to formulate the strategy behind NNE-2016 that is linked to the warfighting and warfighting support needs of the Department. |
|
| Automated Speech Intelligibility System for Head-Borne Personal Protective Equipment: Proof of Concept |
01-Apr-2008 |
20 pages |
| Authors:
Karen M Coyne; Daniel J Barker; Jonathan P Eshbaugh; EDGEWOOD CHEMICAL BIOLOGICAL CENTER ABERDEEN PROVING GROUND MD RESEARCH AND TECHNOLOGY DIR
|
 | An automated objective test system was developed to assess the impact of head-borne personal protective equipment on speech intelligibility and transmission. The system comprised talker and listener headforms, speech recordings, and speech recognition software. A recording of sentences from the Speech Perception in Noise test was transmitted from the speaker in the talker headform to microphones in the ears of the listener headform. The speech recognition software recorded the speech ... |
|
| Speaker Localisation Using Time Difference of Arrival |
01-Apr-2008 |
105 pages |
| Authors:
Derek Z Thai; Matthew Trinkle; Ahmad Hashemi-Sakhtsari; Tim Pattison; DEFENCE SCIENCE AND TECHNOLOGY ORGANISATION EDINBURGH (AUSTRALIA) COMMAND CONTROL COMMUNICATIONS AND INTELLIGENCE DIV
|
 | This report describes the research and development of speaker localisation to locate the position of a person speaking. Two closed-form localisation techniques were analysed: the first, developed by Schau and Robinson (1987), is based on spherical intersection, and the other was developed by Chan and Ho (1994). Both techniques are based on time difference of arrival measurements. Accordingly, three time delay estimators, namely cross-correlation, generalised cross-correlation, and an eigenvalue decomposition based ... |
|
| CrossTalk: The Journal of Defense Software Engineering. Volume 21, Number 4 |
01-Apr-2008 |
33 pages |
| Authors:
Capers Jones; Kym Henderson; Ofer Zwikael; Walt Lipke; David J Coe; David Premeaux; Phillip G Armour; SOFTWARE TECHNOLOGY SUPPORT CENTER HILL AFB UT
|
 | CONTENTS: 1) Software Tracking: The Last Defense Against Failure by Capers Jones: This article concentrates on four worst practices and the factors that most often lead to failure and litigation and gives advice on how to avoid them. 2) Does Project Performance Stability Exist? A Re-examination of CPI and Evaluation of SPI(t) Stability by Kym Henderson and Dr. Ofer Zwikael: This article investigates whether the SPI(t) exhibits similar stability characteristics to ... |
|
| Iterated Class-Specific Subspaces for Speaker-Dependent Phoneme Classification |
Jan-2008 |
6 pages |
| Authors:
Paul M Baggenstoss; NAVAL UNDERSEA WARFARE CENTER DIV NEWPORT RI
|
 | The features based on the MEL cepstrum have long dominated probabilistic methods in automatic speech recognition (ASR). This feature set has evolved to maximize general ASR performance within a Bayesian classifier framework using a common feature space. Now, however, with the advent of the PDF projection theorem (PPT) and the class-specific method (CSM), it is possible to design features separately for each phoneme and compare log-likelihood values fairly across various ... |
|
| A Multi-Resolution Hidden Markov Model Using Class-Specific Features |
Jan-2008 |
6 pages |
| Authors:
Paul M Baggenstoss; NAVAL UNDERSEA WARFARE CENTER DIV NEWPORT RI
|
 | We address the problem in signal classification applications, such as automatic speech recognition (ASR) systems that employ the hidden Markov model (HMM), that it is necessary to settle for a fixed analysis window size and a fixed feature set. This is despite the fact that complex signals such as human speech typically contain a wide range of signal types and durations. We apply the probability density function (PDF) projection theorem ... |
|
| Voice Over Internet Protocol Testbed Design for Non-Intrusive, Objective Voice Quality Assessment |
Sep-2007 |
117 pages |
| Authors:
David L. Manka; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | Voice over Internet Protocol (VoIP) is an emerging technology with the potential to assist the United States Marine Corps in solving communication challenges stemming from modern operational concepts. This thesis conducts a review of VoIP standards and develops an H.323-based testbed for the study of tactical wireless VoIP performance. Methods of collecting and presenting voice quality parameters in packet-based networks are explored. Incorporation of an Adtech SX/14 Data Channel Simulator ... |
|
| Testing and Demonstrating Speaker Verification Technology in Iraqi-Arabic as Part of the Iraqi Enrollment Via Voice Authentication Project (IEVAP) in Support of the Global War on Terrorism (GWOT) |
Sep-2007 |
130 pages |
| Authors:
Jeffrey W. Withee; Edwin D. Pena; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | This thesis documents the findings of an Iraqi-Arabic language test and concept of operations for speaker verification technology as part of the Iraqi Banking System in support of the Iraqi Enrollment via Voice Authentication Project (IEVAP). IEVAP is an Office of the Secretary of Defense (OSD) sponsored research project commissioned to study the feasibility of speaker verification technology in support of security requirements of the Global War on Terrorism (GWOT). The ... |
|
| MEMS PolyMUMPS-Based Miniature Microphone for Directional Sound Sensing |
Sep-2007 |
106 pages |
| Authors:
Timothy J. Shivok; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | A miniature directional sound sensor was fabricated using micro-electro-mechanical system (MEMS) technology based on the operational principle of the Ormia ochracea fly's hearing organ. The fly uses coupled bars hinged at the center to achieve directional sound sensing by monitoring the difference in vibration amplitude between them. The MEMS sensor design employed in this thesis was fabricated using the PolyMUMPs process. The sound sensor has two primary vibrational modes ... |
|
| Radio Interoperability: There Is More to It Than Hardware |
01-Jun-2007 |
28 pages |
| Authors:
Ronald P Timmons; Susan G Hutchins; NAVAL POSTGRADUATE SCHOOL MONTEREY CA DEPT OF INFORMATION SCIENCES
|
 | Radio Interoperability: The Problem. (1) Superfluous radio transmissions contribute to auditory overload of first responders; they obscure development of an accurate operational picture for all involved, and radio spectrum is a limited commodity: once it's full, it's full. (2) There is a practical limit to the number of people who can operate on a common platform before quality of communications deteriorates. (3) Policies and practices need to be reexamined to develop new strategies which will facilitate effective communications. |
|
| Voice and Video Capacity of a Secure Wireless System |
Jun-2007 |
114 pages |
| Authors:
Jason R. Seyba; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING AND MANAGEMENT
|
 | Improving the security and availability of secure wireless multimedia systems is the purpose of this thesis. Specifically, this thesis answered research questions about the capacity of wireless multimedia systems and how three variables relate to this capacity. The effects of securing the voice signal, real-time traffic originating foreign to a wireless local area network and use of an audio-only signal compared with a combined signal were all studied. The research ... |
|
| Effects of the Wireless Channel, Signal Compression and Network Architecture on Speech Quality in VoIP Networks |
Jun-2007 |
105 pages |
| Authors:
Tiantioukas Nikolaos; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | Voice over Internet Protocol (VoIP) telephony is an emerging technology slowly finding its way into military applications. It provides several advantages over PSTN but falls short on performance, quality of service and availability. The purpose of this thesis is to measure the quality of voice in VoIP communications. More specifically, it investigates the effects of wireless channel conditions as well as channel coding and compression on the received speech quality. ... |
|
| Effects of the Advanced Combat Helmet (ACH) and Selected Communication and Hearing Protection Systems (C&HPSs) on Speech Communication: Talk-Through Systems |
Apr-2007 |
36 pages |
| Authors:
Rachel A. Weatherless; Rhoda M. Wilson; Lamar Garrett; Tomasz R. Letowski; Mary S. Binseel; ARMY RESEARCH LAB ABERDEEN PROVING GROUND MD
|
 | Communication in military settings must be clear and understandable to avoid possible fatal accidents and mistakes. Speech intelligibility is the overall quality of speech that makes it comprehensible. Intelligibility of speech depends on the properties of the talker, transmission channel, and the listener. The purpose of the reported study was to evaluate intelligibility of speech provided by five communication and hearing protection systems (C&HPSs) operating in talk-through mode. The systems ... |
|
| A Model for Predicting Intelligibility of Binaurally Perceived Speech |
Apr-2007 |
36 pages |
| Authors:
Angelique A. Scharine; Paula P. Henry; Mohan D. Rao; Jason T. Dreyer; ARMY RESEARCH LAB ABERDEEN PROVING GROUND MD
|
 | Predicting and modeling intelligibility of monaurally or binaurally presented speech is difficult because it depends primarily on the accuracy and interdependency of frequency, time, and spatial information arriving at the listener. Despite these complex relationships, a new pragmatic model is suggested for speech mixed with broadband noise. A form of the logistic regression function is used to characterize human performance data. The regression of these signal properties onto empirical speech ... |
|
| Variable Data Rate Voice Encoder for Narrowband and Wideband Speech |
02-Mar-2007 |
30 pages |
| Authors:
Thomas M. Moran; David A. Heide; Yvette T. Lee; George S. Kang; NAVAL RESEARCH LAB WASHINGTON DC
|
 | Past designs for many military communications systems were based upon specific radio links with fixed and limited channel capacities. Accordingly, many different voice compression algorithms, operating at various fixed rates, were implemented. While still being used today, these incompatible systems are an obstacle to interoperable communications. Emerging net-centric communications promise to provide connectivity to all military users but voice interoperability will still require compatible voice encoding as well as encryption ... |
|
| The Matrix Pencil and its Applications to Speech Processing |
Mar-2007 |
152 pages |
| Authors:
Darren H. Haddad; Andrew J. Noga; AIR FORCE RESEARCH LAB ROME NY INFORMATION DIRECTORATE
|
 | Matrix Pencils facilitate the study of differential equations resulting from oscillating systems. Certain problems in linear ordinary differential equations, such as speech processing, can be represented as the problem of finding a canonical pencil strictly equivalent to a given pencil. It was originally applied by the radar community to phased array radar for signal directional finding applications. The Matrix Pencil (MP) algorithm is a direct data approach, and is a ... |
|
| Comparison of Hearing in Noise Test (HINT) Scores Using Three Different Transducers |
Feb-2007 |
16 pages |
| Authors:
John Ribera; ARMY AEROMEDICAL RESEARCH LAB FORT RUCKER AL
|
 | Army aircrew in hostile listening environments rely on hearing for crew coordination--a critical component of rotary-wing aviation. New technologies may require aviators to have the ability not only to hear in noise but also to localize warning and other signals. The Hearing in Noise Test (HINT) evaluates functional hearing in noise but has only been normalized using supra-aural headphones. Method: Sixty normal-hearing students from Utah State University were equally ... |
|
| Auditory Modeling as a Basis for Spectral Modulation Analysis with Application to Speaker Recognition |
31-Jan-2007 |
|
| Authors:
Tianyu T. Wang; Thomas F. Quatieri; MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB
|
 | This report explores auditory modeling as a basis for robust automatic speaker verification. Specifically, we have developed feature-extraction front-ends that incorporate (1) time-varying, level-dependent filtering, (2) variations in analysis filter-bank size, and (3) nonlinear adaptation. Our methods are motivated both by a desire to better mimic auditory processing relative to traditional front-ends (e.g., the mel-cepstrum) and by reported gains in automatic speech recognition robustness exploiting similar principles. Traditional ... |
|
| Evaluation of a Spoken Dialogue System for Virtual Reality Call for Fire Training |
2007 |
9 pages |
| Authors:
Susan M. Robinson; Antonio Roque; Ashish Vaswani; David Traum; Charles Hernandez; Bill Millspaugh; UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY CA INST FOR CREATIVE TECHNOLOGIES
|
 | We present an evaluation of a spoken dialogue system that engages in dialogues with soldiers training in an immersive Call for Fire (CFF) simulation. We briefly describe aspects of the Joint Fires and Effects Trainer System, and the Radiobot-CFF dialogue system, which can engage in voice communications with a trainee in call for fire dialogues. An experiment is described to judge performance of the Radiobot-CFF system compared with human ... |
|
| Extended Littoral Battlespace (ELB) Secure Network Voice Gateway |
2007 |
5 pages |
| Authors:
R. B. Adamson; Tom Moran; Raymond Cole Jr.; Michael S. McBeth; NEWLINK GLOBAL ENGINEERING INC SPRINGFIELD VA
|
 | The Extended Littoral Battlespace (ELB) Advanced Concept Technology Demonstration (ACTD) uses wireless Local Area Network (LAN) technology to provide U.S. Marines in the field with multimedia connectivity to shore-based and afloat command and control centers. Computer network voice communication services are being evaluated and demonstrated as part of the ELB project. A gateway is needed for network voice users to communicate with users on other tactical voice and military telephone ... |
|
| Entropy Based Classifier Combination for Sentence Segmentation |
2007 |
5 pages |
| Authors:
M. Magimai-Doss; D. Hakkani-Tur; O. Cetin; E. Shriberg; J. Fung; N. Mirghafori; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
|
 | We describe recent extensions to our previous work, where we explored the use of individual classifiers, namely, boosting and maximum entropy models for sentence segmentation. In this paper we extend the set of classification methods with support vector machine (SVM). We propose a new dynamic entropy-based classifier combination approach to combine these classifiers, and compare it with the traditional classifier combination techniques, namely, voting, linear regression and logistic regression. Furthermore, ... |
|
| Comparing Evaluation Metrics for Sentence Boundary Detection |
2007 |
5 pages |
| Authors:
Yang Liu; Elizabeth Shriberg; TEXAS UNIV AT DALLAS RICHARDSON
|
 | In recent NIST evaluations on sentence boundary detection, a single error metric was used to describe performance. Additional metrics, however, are available for such tasks, in which a word stream is partitioned into subunits. This paper compares alternative evaluation metrics including the NIST error rate, classification error rate per word boundary, precision and recall, ROC curves, DET curves, precision-recall curves, and area under the curves and discusses advantages and disadvantages ... |
|
| Adapting the Vehicle Mounted Tactical Loudspeaker System to Today's Operational Environment |
01-Dec-2006 |
71 pages |
| Authors:
Jonathan B. Keiser; Mark C. Engen; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | From the time they were first used by the United States Army during World War II, loudspeakers have proven to be an effective means for tactical psychological operations (PSYOP) teams to disseminate messages to their intended target audiences. The vehicle mounted family of loudspeakers (FOL) is the loudspeaker system currently being utilized by tactical psychological operations forces as the primary mobile means of disseminating messages or sound effects to their ... |
|
| The Outcome of ATC Message Complexity on Pilot Readback Performance |
Nov-2006 |
36 pages |
| Authors:
O. V. Prinzo; Alfred M. Hendrix; Ruby Hendrix; FEDERAL AVIATION ADMINISTRATION OKLAHOMA CITY OK CIVIL AEROMEDICAL INST
|
 | Field data and laboratory studies conducted in the 1990s reported that the rate of pilot readback errors and communication problems increased as controller transmissions became more complex. This resulted in the recommendation that controllers send shorter messages to reduce the memory load imposed on pilots by complex messages. More than 10 years have passed since a comprehensive analysis quantified the types and frequency of readback errors and communication problems that ... |
|
| Informational and Energetic Masking Effects in Multitalker Speech Perception |
Oct-2006 |
6 pages |
| Authors:
Douglas S. Brungart; AIR FORCE RESEARCH LAB WRIGHT-PATTERSON AFB OH HUMAN EFFECTIVENESS DIRECTORATE
|
 | When a speech signal is obscured by a second simultaneous competing speech signal, two types of masking contribute to overall performance. Traditional "energetic" masking occurs when both utterances contain energy in the same critical bands at the same time and portions of one or both of the speech signals are rendered inaudible at the periphery. Higher-level "informational masking" occurs when the signal and masker are both audible but the listener ... |
|
| Multilingual Phoneme Models for Rapid Speech Processing System Development |
Sep-2006 |
85 pages |
| Authors:
Eric G. Hansen; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH SCHOOL OF ENGINEERING AND MANAGEMENT/DEPT OF ENGINEERING PHYSICS
|
 | Current speech recognition systems tend to be developed only for commercially viable languages. The resources needed for a typical speech recognition system include hundreds of hours of transcribed speech for acoustic models and 10 to 100 million words of text for language models; both of these requirements can be costly in time and money. The goal of this research is to facilitate rapid development of speech systems for new languages ... |
|
| Testing Template and Testing Concept of Operations for Speaker Authentication Technology |
Sep-2006 |
119 pages |
| Authors:
Marek M. Sipko; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | This thesis documents the findings of developing a generic testing template and supporting concept of operations for speaker verification technology as part of the Iraqi Enrollment via Voice Authentication Project (IEVAP). The IEVAP is an Office of the Secretary of Defense sponsored research project commissioned to study the feasibility of speaker verification technology in support of the Global War on Terrorism security requirements. The intent of this project is to ... |
|
| A Chorus of Whales: Evaluation of Sequential and Batch Approaches to Time-Series Tracking |
Sep-2006 |
7 pages |
| Authors:
Stefano Coraluppi; Odile Gerard; Walter Zimmer; Peter Willett; NATO UNDERSEA RESEARCH CENTRE LA SPEZIA (ITALY)
|
 | This paper applies target-tracking technology to a novel application: the processing of mammal vocalizations or clicks, with the goal of identifying the number of marine mammals in a surveillance region. This problem has direct application to marine mammal mitigation efforts in the context of active sonar operations. |
|
| Phrase-based Multimedia Information Extraction |
Jul-2006 |
26 pages |
| Authors:
Eric Cohen; Evelyne Tzoukermann; STREAMSAGE INC WASHINGTON DC
|
 | StreamSage proposed to develop a prototype software system that would specifically deal with the two primary challenges that speech data pose to the performance of information extraction: degraded input data and the time-based nature of the content. In order to overcome these two challenges, this effort focused on two general areas: mitigating the degraded quality of speech data and improving entity identification. Technologies developed under this project for audio/video named entity ... |
|
| Nonverbal Communication and Aircrew Coordination in Army Aviation: Annotated Bibliography |
Jun-2006 |
119 pages |
| Authors:
Lawrence C. Katz; Gretchen Kambe; Kurt F. Kline; Gary N. Grubb; ARMY RESEARCH INST FOR THE BEHAVIORAL AND SOCIAL SCIENCES ALEXANDRIA VA
|
 | The Army's Aircrew Coordination Training (ACT) programs emphasize the importance of verbal communications between crewmembers during mission execution. While this is a critical component of effective crew coordination, little attention has been directed towards the influence of nonverbal communication on effective crew coordination. Nonverbal communication transactions occur in the cockpit, but the extent to which they supplement verbal communication and their contribution to safe mission performance remain unclear. The report ... |
|
| Extended Range Underwater Loudhailer for Port Security Applications |
Jun-2006 |
158 pages |
| Authors:
Ric Walker; Bruce Abraham; COAST GUARD RESEARCH AND DEVELOPMENT CENTER GROTON CT
|
 | The U.S. Coast Guard (CG) has developed an Integrated Anti-swimmer System (IAS) to aid enforcement of security zones around high-value maritime assets. The IAS includes a diver recall system to issue verbal warnings and commands as the first response to a detected underwater intruder. However, the range of the recall system does not meet the CG's requirement of 500 yards for security zone enforcement. Consequently, this system must be ... |
|
| A Methodology to Predict Specific Communication Themes from Overall Communication Volume for Individuals and Teams |
Jun-2006 |
|
| Authors:
Elliot E. Entin; Shawn A. Weil; APTIMA INC WOBURN MA
|
 | We focus on a means to code voice communications and derive communication measures because communication plays such a critical role in military decision making and mission accomplishment. Voice communication has proved labor intensive to code manually and, beyond simple counts of utterances, has proved relatively intractable to automate coding even for powerful computers. The methodology we describe has the potential to alleviate a significant portion of the current coding burden. ... |
|
| Advancing Noise Robust Automatic Speech Recognition for Command and Control Applications |
31-Mar-2006 |
29 pages |
| Authors:
James D. Bass; ARMY WAR COLL CARLISLE BARRACKS PA
|
 | This is a technical assessment paper intended for use by engineers and research scientists working on the development and integration of Automatic Speech Recognition (ASR). It covers the state of speech and recognition technologies with emphasis on noise-robust command and control (C2) applications. The reliable elimination of the keyboard and mouse in mounted and un-mounted C2 systems has been a desire of systems developers and requirements writers since ... |
|
| Recognition of In-Ear Microphone Speech Data Using Multi-Layer Neural Networks |
Mar-2006 |
187 pages |
| Authors:
Gokhan Bulbuller; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | Speech collected through a microphone placed in front of the mouth has been the primary source of data collection for speech recognition. There are only a few speech recognition studies using speech collected from the human ear canal. In this study, a speech recognition system is presented, specifically an isolated word recognizer which uses speech collected from the external auditory canals of the subjects via an in-ear microphone. Currently, the ... |
|
| Isolated Word Recognition From In-Ear Microphone Data Using Hidden Markov Models (HMM) |
Mar-2006 |
177 pages |
| Authors:
Remzi S. Kurcan; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
|
 | This thesis is part of an ongoing larger scale research study started in 2004 at the Naval Postgraduate School (NPS) which aims to develop a speech-driven human-machine interface for the operation of semi-autonomous military robots in noisy operational environments. Earlier work included collecting a small database of isolated word utterances of seven words from 20 adult subjects using an in-ear microphone. The research conducted here develops a speaker-independent isolated word ... |
|
| Speech Recognition Using the Mellin Transform |
Mar-2006 |
75 pages |
| Authors:
Jesse R. Hornback; AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH DEPT OF ELECTRICAL AND COMPUTER ENGINEERING
|
 | The purpose of this research was to improve performance in speech recognition. Specifically, a new approach was investigated by applying an integral transform known as the Mellin transform (MT) on the output of an auditory model to improve the recognition rate of phonemes through the scale-invariance property of the Mellin transform. Scale-invariance means that as a time-domain signal is subjected to dilations, the distribution of the signal in the MT ... |
|
| Supervised and Unsupervised Speaker Adaptation in the NIST 2005 Speaker Recognition Evaluation |
Feb-2006 |
10 pages |
| Authors:
Eric G. Hansen; Raymond E. Slyh; Timothy R. Anderson; AIR FORCE RESEARCH LAB WRIGHT-PATTERSON AFB OH HUMAN EFFECTIVENESS DIRECTORATE
|
 | Since 2004, the annual NIST Speaker Recognition Evaluation (SRE) has included an optional unsupervised speaker adaptation track where test files are processed sequentially and one may update the target model. In this paper, various model adaptation techniques are implemented using a supervised (ideal) adaptation scheme. Once the best performing model adaptation method is found, unsupervised adaptation experiments are run using a threshold to determine when to update the target ... |
|