Scratch space for joint project by Ringger & Giraud-Carrier

Focus: MAHT

Focus: TDT

Sub-problems

Areas of interest in 2004

  • Story Segmentation (* not evaluated in 2004)
  • New Event Detection = First Story Detection
    • System Goal: To detect the first story that discusses each topic
  • Link Detection
    • System Goal: To detect whether a pair of stories discuss the same topic.
  • Topic Tracking
    • System Goal: To detect stories that discuss the target topic, in multiple source streams
      • Supervised Training: Given Nt sample stories that discuss a given target topic
      • Testing: Find all subsequent stories that discuss the target topic
    • Traditional (non-adaptive?)
    • Supervised Adaptive Topic Tracking (* experimental task)
      • System Goal: To detect stories that discuss the target topic when a human provides feedback to the system (the system receives a human on-topic or off-topic judgment for every retrieved story)
  • Topic Detection
    • Traditional (flat?)
    • Hierarchical Topic Detection (* experimental task)
      • System Goal: To detect topics in terms of the (clusters of) stories that discuss them
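
Sketch (not from the TDT task definitions): one way to make the Link Detection and New Event Detection goals above concrete. Everything here is an assumption chosen for illustration: the crude tokenizer, raw term counts, cosine similarity as the story-pair score, and the 0.2 threshold. Actual TDT systems used a variety of similarity measures and tuned thresholds.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lower-cased bag-of-words term counts (a deliberately crude tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_topic(story_a, story_b, threshold=0.2):
    """Link detection: YES/NO decision for one pair of stories.

    The threshold is a hypothetical value, not a tuned one.
    """
    return cosine(term_vector(story_a), term_vector(story_b)) >= threshold


def first_stories(stream, threshold=0.2):
    """New event detection: flag each story that resembles no earlier story.

    A story is flagged as a first story when its best similarity to any
    previously seen story falls below the (hypothetical) threshold.
    """
    seen = []
    flags = []
    for story in stream:
        vec = term_vector(story)
        best = max((cosine(vec, old) for old in seen), default=0.0)
        flags.append(best < threshold)
        seen.append(vec)
    return flags


stream = [
    "earthquake hits city centre",
    "city centre earthquake damage rises",
    "election results announced tonight",
]
print(first_stories(stream))              # first and third stories are flagged
print(same_topic(stream[0], stream[1]))   # the two earthquake stories link
```

Note that both tasks reduce to thresholding a score, which is why they share the detection-style metrics listed under Metrics.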

Metrics

  • New Event Detection and Link Detection:
    • Detection Cost
    • Detection error tradeoff (DET) Curves
    • Notes:
      • Same as in spoken language identification! Identification is detection, as opposed to classification.
  • Supervised Adaptive Tracking
    • (Normalized) Detection Cost
    • Linear Utility Measure (a la TREC 2002 Filtering Track, per Robertson & Soboroff)
  • Hierarchical Topic Detection
    • Weighted combination of Detection Cost and Travel Cost
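
Sketch of the detection metrics above, under stated assumptions. The cost is written in the usual form C_det = C_miss * P_miss * P_target + C_fa * P_fa * (1 - P_target), normalized by the cheaper of the two trivial systems (always YES or always NO). The default constants (C_miss = 1, C_fa = 0.1, P_target = 0.02) are the values commonly quoted for TDT and should be checked against the evaluation plan. The utility weights (credit 2, penalty 1) are believed to match the TREC 2002 filtering form but are likewise an assumption. All function names are hypothetical.

```python
def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Expected cost of misses and false alarms (default constants are assumed)."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Detection cost divided by the cost of the better trivial system."""
    floor = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return detection_cost(p_miss, p_fa, c_miss, c_fa, p_target) / floor


def det_points(scores, labels):
    """(threshold, P_miss, P_fa) at every distinct score used as a threshold.

    A trial is answered YES when score >= threshold. Plotting P_miss against
    P_fa for these points (usually on normal-deviate axes) gives a DET curve.
    """
    n_target = sum(1 for y in labels if y)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")
    points = []
    for threshold in sorted(set(scores)):
        misses = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
        false_alarms = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
        points.append((threshold, misses / n_target, false_alarms / n_nontarget))
    return points


def linear_utility(on_topic_retrieved, off_topic_retrieved, credit=2.0, penalty=1.0):
    """Linear utility: credit per on-topic retrieval minus penalty per off-topic one.

    The default weights are an assumption, not taken from these notes.
    """
    return credit * on_topic_retrieved - penalty * off_topic_retrieved


# Toy trials: system scores with 1 = on-topic (target), 0 = off-topic.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
for threshold, p_miss, p_fa in det_points(scores, labels):
    print(threshold, p_miss, p_fa, normalized_detection_cost(p_miss, p_fa))
```

A normalized cost of 1.0 or more means the system does no better than answering the same way on every trial, which is the main reason to report the normalized figure.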

Terminology, Acronyms

  • An event: a specific thing that happens at a specific time and place along with all necessary preconditions and unavoidable consequences.
  • A topic: an event or activity, along with all directly related events and activities
  • A broadcast news story: a section of transcribed text with substantive information content and a unified topical focus

Papers to acquire

  • Title: Detection As Multi-Topic Tracking. (Author abstract).
    • Author(s): James Allan.
    • Source: Information Retrieval 5.2 (April 2002): p. 139.
    • Document Type: Magazine/Journal
    • Byline: James Allan (1)
    • Keywords: topic detection and tracking (TDT); event-based information organization; information filtering; evaluation
    • Abstract: The topic tracking task from TDT is a variant of information filtering tasks that focuses on event-based topics in streams of broadcast news. In this study, we compare tracking to another TDT task, detection, which has the goal of partitioning all arriving news into topics, regardless of whether the topics are of interest to anyone, and even when a new topic appears that had not been previously anticipated. There are clear relationships between the two tasks (under some assumptions, a “perfect” tracking system could “solve” the detection problem), but they are evaluated quite differently. We describe the two tasks and discuss their similarities. We show how viewing detection as a form of multi-topic parallel tracking can illuminate the performance tradeoffs of detection over tracking.
    • Source Citation: Allan, James. “Detection As Multi-Topic Tracking.” Information Retrieval 5.2 (April 2002): 139. Academic OneFile. Gale. Brigham Young University - Utah. 4 Apr. 2008
    • Gale Document Number: A155176171
nlp-private/ringger-giraud-carrier.txt · Last modified: 2015/04/23 13:20 by ryancha
CC Attribution-Share Alike 4.0 International