Storming Media: Pentagon Reports and Documents
Pentagon Reports: Fast. Definitive. Complete.
Reports by Keyword(s): NATURAL LANGUAGE
Blog Fingerprinting: Identifying Anonymous Posts Written by an Author of Interest Using Word and Character Frequency Analysis Sep-2009 93 pages
Authors:  David J Dreier; NAVAL POSTGRADUATE SCHOOL MONTEREY CA DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Internet blogs are an easily accessible means of global communications. Monitoring blogs for criminal and terrorist activity is a serious challenge, due to blogs' anonymous nature and the sheer volume of data. The intelligence community is often faced with more information than it can process. The need exists to develop methods for processing the massive amounts of data this media presents, without a significant increase in manpower. An automated tool ...


Topic Detection in Online Chat Sep-2009 102 pages
Authors:  Jonathan S Durham; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
The full text of this report is available for sale. The ubiquity of Internet chat applications has benefited many different segments of society. It also creates opportunities for criminal enterprise, terrorism, and espionage. This thesis proposes statistical Natural Language Processing (NLP) methods for creating systems that would detect the topic of chat in support of larger NLP goals such as information retrieval, text classification and illicit activity detection. We propose a novel method for determining the topic of chat discourse. ...


Synergist: Collaborative Analyst Assistant Apr-2009 56 pages
Authors:  Munirathnam Srikanth; Marta Tatu; Adriana Badulescu; Guillaume Bailey; Marian Olteanu; Christine Clark; LYMBA CORP RICHARDSON TX
The full text of this report is available for sale. Intelligence Analysts work with a significant amount of information expressed in natural language. The textual information that Analysts work with or generate provide important clues on what is known to them, what are their immediate and long term needs, and what aspects are typically explored or missed in the context of a particular topic. For CASE, Lymba developed Synergist: Collaborative Analyst Assistant, an automated system that understands unstructured content, extracts ...


Proactive Intelligence (PAINT) Simulated Exploration of Executable Design Strategies (SEEDS) Apr-2009 28 pages
Authors:  Rafael Alonso; Sven Brueckner; Diane Yang; Andrew Yinger; TECHTEAM GOVERNMENT SOLUTIONS INC CHANTILLY VA DIGITAL SUPPORT
The full text of this report is available for sale. The ProActive Intelligence (PAINT) Simulated Exploration of Executable Design Strategies (SEEDS) project first defined the overall systems architecture for the creation of robust, complex, & optimized probes based on a generate & test approach, then specified the architecture & processes within the Possible World Generator, our key component for testing & evaluating probe candidates, & finally, implemented & demonstrated an illustration-of-concept interactive prototype that shows the primary interactions of the key ...


Recognizing Connotative Meaning in Military Chat Communications Apr-2009 9 pages
Authors:  Sharon M Walter; Emily R Budlong; Ozgur Yilmazel; AIR FORCE RESEARCH LAB ROME NY INFORMATION DIRECTORATE
The full text of this report is available for sale. Over the last five to seven years the use of chat in military contexts has expanded quite significantly, in some cases becoming a primary means of communicating time-sensitive data to decision makers and operators. For example, during humanitarian operations with Joint Task Force-Katrina, chat was used extensively to plan, task, and coordinate predeployment and ongoing operations. The informal nature of chat communications allows the relay of far more information than ...


DARPA CS Study Panel 2007 Mar-2009 5 pages
Authors:  Noah Smith; CARNEGIE-MELLON UNIV PITTSBURGH PA
The full text of this report is available for sale. I have attended the four Computer Science Study Panel (CSSP) sessions, in April, June, July, and October 2007. I found the sessions to be highly interesting, both as a citizen and as a scientist. Because my research deals with information -- in particular, automated processing of language data like text and speech -- I saw the greatest connection with my research in the final session when we interact with the ...


Form and Function of Linguistic Elements. Formal Systems for Representing Changing Situations. Dynamic Information Systems: Notes on some systems of grammar and interpretation 20-Feb-2009 50 pages
Authors:  Emmon Bach; Wynn Chao; UNIVERSITY COLL LONDON (UNITED KINGDOM)
The full text of this report is available for sale. This report results from a contract tasking University of London as follows: In two papers, the DoubleR Model will be assessed in light of other current formal theories. The candidate theories/issues for initial consideration will be those that address the nature of the syntax-semantics interface. Beyond the basic level, aspects of meaning relevant to information structural content (topic, focus, contrast and backgrounding) may become relevant, and may help explain some ...


Extracting Formal Models from Informal Requirements and Using Them for Validation Jan-2009 9 pages
Authors:  Insup Lee; PENNSYLVANIA UNIV PHILADELPHIA BOARD OF TRUSTEES
The full text of this report is available for sale. The goal of the project is to study formalization of regulations and regulatory compliance. Technical objectives involve addressing two verification problems: 1. Consistency of regulation / Compliance can be achieved only if the regulation is internally consistent. This verification problem answers the question whether any organization is capable of complying with the regulation. 2. Compliance of organizations / This verification problem answers the question whether the operation of an organization ...


Natural Language Dialogue Architectures for Tactical Questioning Characters Dec-2008 9 pages
Authors:  Anton Leuski; David Traum; Bilyana Martinovski; Susan Robinson; Antonio Roque; Sudeep Gandhe; David DeVault; Jillian Gerten; UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY CA INST FOR CREATIVE TECHNOLOGIES
The full text of this report is available for sale. In this paper we contrast three architectures for natural language questioning characters. We contrast the relative costs and benefits of each approach in building characters for tactical questioning. The first architecture works purely at the textual level, using cross-language information retrieval techniques to learn the best output for any input from a training set of linked questions and answers. The second architecture adds a global emotional model and computes a compliance ...


Looking Under the Hood of Stochastic Machine Learning Algorithms for Parts of Speech Tagging 01-Jul-2008 36 pages
Authors:  Jana Diesner; Kathleen M Carley; CARNEGIE-MELLON UNIV PITTSBURGH PA INST OF SOFTWARE RESEARCH INTERNAT
The full text of this report is available for sale. A variety of Natural Language Processing and Information Extraction tasks, such as question answering and named entity recognition, can benefit from precise knowledge about a word's syntactic category or Part of Speech (POS) (Church, 1988; Rabiner, 1989; Stolz, Tannenbaum, & Carstensen, 1965). POS taggers are widely used to assign a single best POS to every word in text data, with stochastic approaches achieving accuracy rates of up to 96% to ...


Authorship Discovery in Blogs Using Bayesian Classification with Corrective Scaling 01-Jun-2008 51 pages
Authors:  Grant T Gehrke; NAVAL POSTGRADUATE SCHOOL MONTEREY CA DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Widespread availability of free, public blog platforms has facilitated growth in the amount of individually written electronic text available online. Our research leverages an extremely large blog corpus for a study in authorship discovery, both to evaluate a traditional technique as applied to blogs, as well as to demonstrate the implications of authorship discovery in blogs for intelligence and forensic purposes. Our study uses a Bayesian classifier with two important ...


Data Analysis Project: Leveraging Massive Textual Corpora Using n-Gram Statistics 01-May-2008 31 pages
Authors:  Ian Fette; Andrew Carlson; Tom M Mitchell; CARNEGIE-MELLON UNIV PITTSBURGH PA SCHOOL OF COMPUTER SCIENCE
The full text of this report is available for sale. We study methods of efficiently leveraging massive textual corpora through n-gram statistics. Specifically, we explore algorithms that use a database of frequency counts for sequences of tokens in a teraword Web corpus to correct spelling mistakes and to extract a list of instances of some category given only the name of the target category. For spelling correction, we use a novel correction algorithm and demonstrate high accuracy in correcting both ...


Improving Information Extraction and Translation Using Component Interactions JAN 2008 163 pages
Authors:  Heng Ji; NEW YORK UNIV NY DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. The traditional natural language processing (NLP) pipeline incorporates multiple stages of linguistic analysis. Although errors are typically compounded through the pipeline, it is possible to reduce the errors in one stage by harnessing the results of the other stages. This thesis presents a new framework based on component interactions to approach this goal. The new framework applies all stages in a suitable order, with each stage generating multiple hypotheses and ...


Techniques for Automatically Generating Biographical Summaries from News Articles SEP 2007 211 pages
Authors:  Matthew W. Esparza; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
The full text of this report is available for sale. The work of manually creating a biographical summary from multiple information sources is both time-intensive and detail-oriented. Automating the task is also non-trivial because of the many natural language processing (NLP) areas that must be used to efficiently extract the relevant facts. Yet, no study has been done to determine how powerful a biographical summarization system must be to achieve the basic goal of filling slots in a biography template. ...


Natural Language Dialogue for Intelligent Tutoring Systems 02 AUG 2007 15 pages
Authors:  Barbara Di Eugenio; ILLINOIS UNIV AT CHICAGO
The full text of this report is available for sale. We study tutorial dialogue with two aims: understanding what promotes learning in one on one tutoring; developing language interfaces to Intelligent Tutoring Systems (ITSs). We worked in three different domains. Our work comprises: linguistic analysis, data mining, computational modeling (e.g., discourse planning) implementation, and empirical evaluation with human subjects. Our results show that interfaces developed on the basis of the tutorial dialogue analysis engender significantly more learning than other types ...


A Survey of Statistical Machine Translation APR 2007 51 pages
Authors:  Adam Lopez; MARYLAND UNIV COLLEGE PARK DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context ...


Automated Run-Time Mission and Dialog Generation MAR 2007 75 pages
Authors:  John D. Kelly; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
The full text of this report is available for sale. Current mission driven systems, be they games or training simulations, are generally restricted to using a set of training missions that are hard coded into the system. This has the unfortunate effect of limiting the number of times a person or team can be sent through a simulator before it begins to lose its training value or the number of times a person can replay a game without it becoming ...


The TextLearner System: Reading Learning Comprehension JUN 2006 51 pages
Authors:  Jon Curtis; Michael Witbrock; John Cabral; David Baxter; Peter Wagner; Bjorn Aldag; Keith Goolsbey; Ben Gottesman; Zelal Gungordu; Robert C. Kahlert; CYCORP AUSTIN TX
The full text of this report is available for sale. The goal of DARPA's Reading Learning Comprehension seedling was to determine the feasibility of autonomous knowledge acquisition through the analysis of text. This report describes the results of that effort by detailing the capabilities of the TextLearner prototype, a knowledge-acquisition program that represents the culmination of the year-long effort. Built atop the Cyc Knowledge Base and implemented almost entirely in the formal representation language of CycL, TextLearner is an anomaly ...


Effectively Using Syntax for Recognizing False Entailment JUN 2006 9 pages
Authors:  Rion Snow; Lucy Vanderwende; Arul Menezes; STANFORD UNIV CA DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Recognizing textual entailment is a challenging problem and a fundamental component of many applications in natural language processing. We present a novel framework for recognizing textual entailment that focuses on the use of syntactic heuristics to recognize false entailment. We give a thorough analysis of our system, which demonstrates state-of-the-art performance on a widely-used test set.


Exploring the Utility of ResearchCyc for Reasoning from Natural Language MAY 2006 17 pages
Authors:  Christopher Manning; Andrew Ng; LELAND STANFORD JUNIOR UNIV STANFORD CA
The full text of this report is available for sale. This project investigated the potential for using ResearchCyc in natural language processing systems. The project focused particularly on natural language problems connected to sentence understanding, such as reading comprehension and robust textual inference. The project completed studies of the possibilities for using ResearchCyc knowledge for problems from this domain, it developed software for interaction between robust natural language processing systems and ResearchCyc, via its Java interface, and evaluated in a ...


Steps Toward the Alignment of Complementary Lexical Resources and Knowledge Databases JAN 2006 19 pages
Authors:  INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
The full text of this report is available for sale.


Image Browsing and Natural Language Paraphrases of Semantic Web Annotations 2006 13 pages
Authors:  Christian Halaschek-Wiener; Jennifer Golbeck; Bijan Parsia; Vladimir Kolovski; Jim Hendler; MARYLAND UNIV COLLEGE PARK DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Recently, there has been interest in marking up digital images with annotations describing the content of the images using Web-based ontologies encoded in the W3C's Web Ontology Language (OWL). The annotations are subsequently exploited to improve the user experience of large collections of images, whether by enhanced search or by a structured browsing experience. In the latter case, the complexity and unfamiliarity of logic-based ontology languages may do more to ...


The Problem of Ontology Alignment on the Web: A First Report 2006 9 pages
Authors:  Davide Fossati; Gabriele Ghidoni; Barbara Di Eugenio; Isabel Cruz; Huiyong Xiao; Rajen Subba; ILLINOIS UNIV AT CHICAGO DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. This paper presents a general architecture and four algorithms that use Natural Language Processing for automatic ontology matching. The proposed approach is purely instance based, i.e., only the instance documents associated with the nodes of ontologies are taken into account. The four algorithms have been evaluated using real world test data, taken from the Google and LookSmart online directories. The results show that NLP techniques applied to instance documents help ...


Mitre's Qanda at TREC-11 2006 11 pages
Authors:  John D. Burger; Lisa Ferro; Warren Greiff; John Henderson; Marc Light; Scott Mardis; Alex Morgan; MITRE CORP BEDFORD MA
The full text of this report is available for sale. Qanda is MITRE's TREC-style question answering system. Since last year's evaluation, principal improvements to the system have been aimed at making it faster and more robust. We discuss the current architecture of the system in Section 1. Some work has gone into better answer formation and ranking, which we discuss in Section 2. After this year's evaluation, we have done a number of ROVER-style system combination experiments using the judged ...


A Note on Topical N-Grams 24 DEC 2005 9 pages
Authors:  Xuerui Wang; Andrew McCallum; MASSACHUSETTS UNIV AMHERST DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Most of the popular topic models (such as Latent Dirichlet Allocation) have an underlying assumption: bag of words. However, text is indeed a sequence of discrete word tokens, and without considering the order of words (in other words, the nearby context where a word is located), the accurate meaning of language cannot be exactly captured by word co-occurrences only. In this sense, collocations of words (phrases) have to be considered. ...


QuARS: A Tool for Analyzing Requirements SEP 2005
Authors:  Giuseppe Lami; CARNEGIE-MELLON UNIV PITTSBURGH PA SOFTWARE ENGINEERING INST
The full text of this report is not available and therefore is not for sale. This information is provided for reference purposes only. Numerous tools and techniques are available for managing requirements. Many are designed to define requirements, provide configuration management, and control distribution. However, there are few automatic tools to support the quality analysis of natural language (NL) requirements. Ambiguity analysis and consistency and completeness verification are usually carried out by human reviewers who read requirements documents and look for defects. This clerical activity is boring, time consuming, and often ineffective. This ...


Intricacies of Collins' Parsing Model 31 AUG 2005 33 pages
Authors:  Daniel M. Bikel; PENNSYLVANIA UNIV PHILADELPHIA DEPT OF COMPUTER AND INFORMATION SCIENCE
The full text of this report is available for sale. This paper documents a large set of heretofore unpublished details Collins used in his parser, such that, along with Collins' thesis (Collins, 1999), this paper contains all information necessary to duplicate Collins' benchmark results. Indeed, these as-yet-unpublished details account for an 11% relative increase in error from an implementation including all details to a clean-room implementation of Collins' model. We also show a cleaner and equally-well-performing method for the handling ...


Capturing and Modeling Domain Knowledge Using Natural Language Processing Techniques JUN 2005
Authors:  Alain Auger; DEFENCE RESEARCH AND DEVELOPMENT CANADA VALCARTIER (QUEBEC)
The full text of this report is not available and therefore is not for sale. This information is provided for reference purposes only. Command and control (C2) and the decision making domain are seriously threatened, facing information overload and uncertainty issues. To make sense out of the flood of information, military have to create new ways of processing sensor and intelligence information, and of providing the results to commanders. Initiated in 2004 at Defense Research and Development Canada (DRDC), the SACOT knowledge engineering research project is currently investigating, developing and validating innovative natural ...


Plan Critiquing and Look-Ahead Constraint Reasoning for Active Templates OCT 2004 70 pages
Authors:  Christopher M. White; ALPHATECH INC BURLINGTON MA
The full text of this report is available for sale. The primary objective of this program was to develop knowledge-based Active Templates technology to help Special Operations Forces (SOF) planning staff quickly build robust and agile plans, and then monitor their execution. To this end, we conducted research and development in four major areas. First, we developed a SOF knowledge representation language and example scenario for representing mission plans and constraints on those plans. Second, we developed a template-based natural ...


Advanced Capabilities for Evidence Extraction (ACEE) JUL 2004 56 pages
Authors:  Elizabeth Liddy; Nancy McCracken; Eileen Allen; SYRACUSE UNIV NY CENTER FOR NATURAL LANGUAGE PROCESSING
The full text of this report is available for sale. The Center for Natural Language Processing (CNLP) at Syracuse University recently completed the Advanced Capabilities for Evidence Extraction (ACEE) Project, which has improved the effectiveness of its basic entity, relation, and event extraction technology and extended these capabilities in several ways. First, the Information Extraction (IE) technology can now be quickly ported to new domains by use of algorithms utilizing Transformation-Based Learning to specialize generic relation extraction to specific ...


From Language to Knowledge: Starting Hawk JUN 2004 39 pages
Authors:  Boris Katz; Gary Borchardt; Sue Felshin; MASSACHUSETTS INST OF TECHNOLOGY CAMBRIDGE PRECISION SYSTEMS DESIGN AND MANUFACTURING LAB
The full text of this report is available for sale. This report describes work completed by the MIT Computer Science and Artificial Intelligence Laboratory in support of DARPA's Rapid Knowledge Formation (RKF) program over the period from July 2000 to September 2003. The primary focus of the RKF program is to develop new technology to automate the task of transforming raw human-understandable information into encoded, machine-understandable information. The project described in this report addresses a central subtask of this ...


Understanding of Navy Technical Language via Statistical Parsing JUN 2004 37 pages
Authors:  Neil C. Rowe; NAVAL POSTGRADUATE SCHOOL MONTEREY CA DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. A key problem in indexing technical information is the interpretation of technical words and word senses, expressions not used in everyday language. This is important for captions on technical images, whose often pithy descriptions can be valuable to decipher. We describe the natural-language processing for MARIE-2, a natural-language information retrieval system for multimedia captions. Our approach is to provide general tools for lexicon enhancement with the specialized words and word ...


Understanding Natural Language Descriptions of Physical Phenomena 07-May-2004 236 pages
Authors:  Sven E Kuehne; NORTHWESTERN UNIV EVANSTON IL DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. The fact that human readers can learn about the physical world from textual descriptions leads to a number of interesting questions about the connections between our conceptual understanding of the physical world and how it is reflected in natural language. This thesis investigates some forms in which information about physical phenomena is typically expressed in natural language and how this knowledge can be used to construct models of the underlying ...


Interaction on Emotions 16 JAN 2004 148 pages
Authors:  Arno Hartholt; Tijmen J. Muller; UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY CA INST FOR CREATIVE TECHNOLOGIES
The full text of this report is available for sale. This report describes the addition of an emotion dialogue to the Mission Rehearsal Exercise (MRE) system. The goal of the MRE system is to provide an immersive learning environment for army officer recruits. The user can engage in conversation with several intelligent agents in order to accomplish the goals within a certain scenario. Although these agents did already possess emotions, they were unable to express them verbally. A question - ...


HITIQA: A Data Driven Approach to Interactive Analytical Question Answering 2004 5 pages
Authors:  Sharon Small; Tomek Strzalkowski; STATE UNIV OF NEW YORK AT ALBANY
The full text of this report is available for sale. In this paper we describe the analytic question answering system HITIQA (High-Quality Interactive Question Answering) which has been developed over the last 2 years as an advanced research tool for information analysts. HITIQA is an interactive open-domain question answering technology designed to allow analysts to pose complex exploratory questions in natural language and obtain relevant information units to prepare their briefing reports. The system uses novel data-driven semantics to conduct ...


Syntactic Simplification for Improving Content Selection in Multi-Document Summarization 2004 8 pages
Authors:  Advaith Siddharthan; Ani Nenkova; Kathleen McKeown; COLUMBIA UNIV NEW YORK DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. In this paper, we explore the use of automatic syntactic simplification for improving content selection in multi-document summarization. In particular, we show how simplifying parentheticals by removing relative clauses and appositives results in improved sentence clustering, by forcing clustering based on central rather than background information. We argue that the inclusion of parenthetical information in a summary is a reference-generation task rather than a content-selection one, and implement a baseline ...


Robust Reading: Identification and Tracing of Ambiguous Names 2004 9 pages
Authors:  Xin Li; Paul Morie; Dan Roth; ILLINOIS UNIV AT URBANA DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. A given entity, representing a person, a location or an organization, may be mentioned in text in multiple, ambiguous ways. Understanding natural language requires identifying whether different mentions of a name, within and across documents, represent the same entity. We develop an unsupervised learning approach that is shown to resolve accurately the name identification and tracing problem. At the heart of our approach is a generative model of how documents ...


Question Answering Based on Semantic Structures 2004 10 pages
Authors:  Srini Narayanan; Sanda Harabagiu; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
The full text of this report is available for sale. The ability to answer complex questions posed in Natural Language depends on (1) the depth of the available semantic representations and (2) the inferential mechanisms they support. In this paper we describe a QA architecture where questions are analyzed and candidate answers generated by 1) identifying predicate argument structures and semantic frames from the input and 2) performing structured probabilistic inference using the extracted relations in the context of a ...


Training Tree Transducers 2004 9 pages
Authors:  Jonathan Graehl; Kevin Knight; UNIVERSITY OF SOUTHERN CALIFORNIA MARINA DEL REY INFORMATION SCIENCES INST
The full text of this report is available for sale. Many probabilistic models for natural language are now written in terms of hierarchical tree structure. Tree-based modeling still lacks many of the standard tools taken for granted in (finite-state) string-based modeling. The theory of tree transducer automata provides a possible framework to draw on, as it has been worked out in an extensive literature. We motivate the use of tree transducers for natural language and address the training problem for ...


A Linear Programming Formulation for Global Inference in Natural Language Tasks 2004 9 pages
Authors:  Dan Roth; Wen-tau Yih; ILLINOIS UNIV AT URBANA-CHAMPAIGN DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. Given a collection of discrete random variables representing outcomes of learned local predictors in natural language, e.g., named entities and relations, we seek an optimal global assignment to the variables in the presence of general (non-sequential) constraints. Examples of these constraints include the type of arguments a relation can take, and the mutual activity of different relations, etc. We develop a linear programming formulation for this problem and evaluate it ...


A Statistical Model for Multilingual Entity Detection and Tracking 2004 9 pages
Authors:  R. Florian; H. Hassan; A. Ittycheriah; H. Jing; N. Kambhatla; X. Luo; H. Nicolov; S. Roukos; IBM THOMAS J WATSON RESEARCH CENTER YORKTOWN HEIGHTS NY
The full text of this report is available for sale. Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for identifying and tracking named, nominal and pronominal references to entities within unrestricted text documents, and chaining them into clusters corresponding to each logical entity present in the text. Both the mention detection model and the novel entity tracking model can use arbitrary feature types, ...


The ICSI Meeting Recorder Dialog Act (MRDA) Corpus 2004 5 pages
Authors:  Elizabeth Shriberg; Raj Dhillon; Sonali Bhagat; Jeremy Ang; Hannah Carvey; INTERNATIONAL COMPUTER SCIENCE INST BERKELEY CA
The full text of this report is available for sale. We describe a new corpus of over 180,000 hand-annotated dialog act tags and accompanying adjacency pair annotations for roughly 72 hours of speech from 75 naturally-occurring meetings. We provide a brief summary of the annotation system and labeling procedure, inter-annotator reliability statistics, overall distributional statistics, a description of auxiliary files distributed with the corpus, and information on how to obtain the data.


On the Representation of Physical Quantities in Natural Language Text 2004 7 pages
Authors:  Sven E. Kuehne; NORTHWESTERN UNIV EVANSTON IL
The full text of this report is available for sale. In this paper we investigate the forms in which quantity information can appear in written natural language. Our focus is on physical quantities found in descriptions of physical processes, such as expansion, movement, or transfer. Using Qualitative Process Theory as our underlying formalism, we show how information extracted from natural language text corresponds to the five constituents of physical quantities. The results of this analysis can be used for the ...


Robustness Versus Fidelity in Natural Language Understanding 2004 9 pages
Authors:  Mark G. Core; Johanna D. Moore; EDINBURGH UNIV (UNITED KINGDOM) DIVISION OF INFORMATICS
The full text of this report is available for sale. A number of issues arise when trying to scale up natural language understanding (NLU) tools designed for relatively simple domains (e.g. flight information) to domains such as medical advising or tutoring where deep understanding of user utterances is necessary. Because the subject matter is richer, the range of vocabulary and grammatical structures is larger, meaning NLU tools are more likely to encounter out-of-vocabulary words or extra-grammatical utterances. This is especially true ...


Consolidating the Results of the CIRCSIM-Tutor Project and Further Consolidation of the Results of the CIRCSIM-Tutor Project 31 DEC 2003 18 pages
Authors:  Martha W. Evens; ILLINOIS INST OF TECH CHICAGO DEPT OF COMPUTER SCIENCE
The full text of this report is available for sale. This grant supported the writing of a book on the author's experiments in human and computer tutoring and enabled the running of one last experiment with the CIRCSIM-Tutor system, Version 2.9. The experiment compared the learning gains made by medical students using the CIRCSIM-Tutor system with those made by students reading a carefully edited relevant text. The experiment was suggested at a meeting of ONR Grantees in the tutoring portion ...


An Architecture for the Semantic Processing of Natural Language Input to a Policy Workbench MAR 2003 107 pages
Authors:  E. J. Custy; NAVAL POSTGRADUATE SCHOOL MONTEREY CA
The full text of this report is available for sale. Formal methods hold significant potential for automating the development, refinement, and implementation of policy. For this potential to be realized, however, improved techniques are required for converting natural-language statements of policy into a computational form. In this paper we present and analyze an architecture for carrying out this conversion. The architecture employs semantic networks to represent both policy statements and objects in the domain of those statements. We present ...


Spatial Language for Human-Robot Dialogs 2003 40 pages
Authors:  Marjorie Skubic; Dennis Perzanowski; Sam Blisard; Alan Schultz; William Adams; Magda Bugajska; Derek Brock; NAVAL RESEARCH LAB WASHINGTON DC CENTER FOR APPLIED RESEARCH IN ARTIFICIAL INTELLIGENCE
The full text of this report is available for sale. In conversation, people often use spatial relationships to describe their environment, e.g., "There is a desk in front of me and a doorway behind it", and to issue directives, e.g., "Go around the desk and through the doorway." In our research, we have been investigating the use of spatial relationships to establish a natural communication mechanism between people and robots, in particular, for novice users. In this paper, the ...


Linguistic Resource Creation for Research and Technology Development: A Recent Experiment 2003 30 pages
Authors:  Stephanie Strassel; Mike Maxwell; Christopher Cieri; PENNSYLVANIA UNIV PHILADELPHIA
The full text of this report is available for sale. Advances in statistical machine learning encourage language-independent approaches to linguistic technology development. Experiments in porting technologies to handle new natural languages have revealed a great potential for multilingual computing, but also a frustrating lack of linguistic resources for most languages. Recent efforts to address the lack of available resources have focused either on intensive resource development for a small number of languages or development of technologies for rapid porting. The ...


Utterance Classification in Auto Tutor 2003 9 pages
Authors:  Andrew Olney; Max Louwerse; Eric Matthews; Johanna Marineau; Heather Hite-Mitchell; Arthur Graesser; MEMPHIS UNIV TN
The full text of this report is available for sale. This paper describes classification of typed student utterances within AutoTutor, an intelligent tutoring system. Utterances are classified to one of 18 categories including 16 question categories. The classifier presented uses part of speech tagging, cascaded finite state transducers, and simple disambiguation rules. Shallow NLP is well suited to the task: session log file analysis reveals significant classification of eleven question categories, frozen expressions, and assertions.


HITIQA: An Interactive Question Answering System. A Preliminary Report 2003 9 pages
Authors:  Sharon Small; Ting Liu; Nobuyuki Shimizu; Tomek Strzalkowski; STATE UNIV OF NEW YORK AT ALBANY
The full text of this report is available for sale. HITIQA is an interactive question answering technology designed to allow intelligence analysts and other users of information systems to pose questions in natural language and obtain relevant answers, or the assistance they require in order to perform their tasks. Our objective in HITIQA is to allow the user to submit exploratory, analytical, non-factual questions, such as "What has been Russia's reaction to U.S. bombing of Kosovo?" The distinguishing property of ...

