Tools were developed for representing and analyzing uncertainty in INTEL data and for targeted uncertainty reduction. The purpose is to help INTEL analysts answer the following questions: 1) Which hypotheses can be validated or refuted based on the available uncertain data, and at what level of certainty? 2) Which missing data are critical for validating or refuting given hypotheses and for increasing the certainty of current conclusions? 3) What are the tradeoffs between the ...
This paper presents the CMU submission to the 2008 TREC blog distillation track. As in last year's experiments, we evaluate different retrieval models and apply a query expansion method that leverages the link structure in Wikipedia. We also explore a corpus that combines several representations of the documents, drawing on both the feed XML and the permalink HTML, and report initial experiments with spam filtering.