Dependent Variable

Time: the time in seconds that the subject spent on the current case.

Sentence at a Time

Batch Oracular Model

  • Length
  • Number Needing Correction
  • Conditional Entropy
  • Accuracy on Test Set

Descriptive Oracular Model

  • Length: The number of tokens in the sentence. When annotating a single word it is the length of the sentence in which the word appears.
  • Subject Accuracy: The percentage of tokens correctly tagged by the subject. When annotating a single word this is either 0% or 100%.
  • Location: Index of the current case in the session
  • Tagger Accuracy: The percentage of words in the sentence correctly tagged by the automatic tagger. When annotating a single word this is either 0% or 100%.
  • Number Needing Correction: the number of words in the case needing correction
  • Percent Done: percentage of the cases assigned to the current subject already encountered
  • Conditional Entropy:
    • For whole sentence annotation, an estimate of the total tag sequence entropy given the words in the current sentence.
    • For single word annotation, the entropy of the tag distribution for the current word.
    • Probably of little use as a predictor: sentences were selected for high entropy, so this value likely varies little across cases.
  • From Tagger: The accuracy, on the test set, of the tagger that provided the candidate tags.
  • Native English Speaker: a 0/1 indicator of whether the subject is a native English speaker
  • Previously Participated in Study: a 0/1 indicator of whether the subject was part of a previous (similar) tagging exercise
  • Self Evaluation Tagging Proficiency: a 0/1/2 indicator of the subject's self-evaluation of tagging proficiency.
  • Self Evaluation of Performance in Study: a 0/1/2 indicator of the subject's self-evaluation of tagging accuracy in this study.
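
The Conditional Entropy feature above can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the dict-of-probabilities input are assumptions, and summing per-token entropies is only one possible estimate of the sequence entropy (it is an upper bound on the true joint entropy). The page does not say which estimator was actually used.

```python
import math

def tag_entropy(tag_probs):
    """Shannon entropy, in bits, of the tag distribution for one word.

    tag_probs maps each candidate tag to its probability (assumed to sum to 1).
    """
    return -sum(p * math.log2(p) for p in tag_probs.values() if p > 0)

def summed_token_entropy(per_token_probs):
    """Crude whole-sentence estimate: the sum of the per-token entropies.

    Assumption for illustration only; this ignores dependencies between tags.
    """
    return sum(tag_entropy(dist) for dist in per_token_probs)

# Hypothetical tagger output for a two-word sentence.
sentence = [
    {"NN": 0.5, "VB": 0.5},   # ambiguous word
    {"DT": 1.0},              # unambiguous word
]
word_feature = tag_entropy(sentence[0])
sentence_feature = summed_token_entropy(sentence)
```

For single word annotation the feature is `tag_entropy` of the current word; for whole sentence annotation it is the sentence-level estimate.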

Annotation-Time Model

  • Possible additional feature: a running average of the time spent on previous cases, normalized by length
  • Length: The number of tokens in the sentence. When annotating a single word it is the length of the sentence in which the word appears.
  • Location: Index of the current case in the session
  • Tagger Accuracy: The percentage of words in the sentence correctly tagged by the automatic tagger. When annotating a single word this is either 0% or 100%.
    • Approximated by a running average over previous cases
  • Number Needing Correction: the number of words in the case needing correction
    • Approximated by (1 - accuracy) * length
  • Percent Done: percentage of the cases assigned to the current subject already encountered
  • Conditional Entropy:
    • For whole sentence annotation, an estimate of the total tag sequence entropy given the words in the current sentence.
    • For single word annotation, the entropy of the tag distribution for the current word.
    • Probably of little use as a predictor: sentences were selected for high entropy, so this value likely varies little across cases.
  • Native English Speaker: a 0/1 indicator of whether the subject is a native English speaker
  • Previously Participated in Study: a 0/1 indicator of whether the subject was part of a previous (similar) tagging exercise
  • Self Evaluation Tagging Proficiency: a 0/1/2 indicator of the subject's self-evaluation of tagging proficiency.
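
The approximations listed above (running-average tagger accuracy, Number Needing Correction as (1 - accuracy) * length, and the length-normalized running average of time) might be computed as in the sketch below. The class and method names are hypothetical, and it assumes that the tagger's accuracy on a finished case can be observed from the number of tags the subject corrected, which the page does not state.

```python
class RunningEstimates:
    """Annotation-time estimates built only from cases already completed."""

    def __init__(self, prior_accuracy=0.0):
        # prior_accuracy is an assumed fallback used before any case is seen.
        self.prior_accuracy = prior_accuracy
        self.cases = 0
        self.accuracy_sum = 0.0
        self.seconds_per_token_sum = 0.0

    def update(self, length, n_corrected, seconds):
        """Record a finished case: token count, tags corrected, time spent."""
        self.cases += 1
        self.accuracy_sum += 1.0 - n_corrected / length
        self.seconds_per_token_sum += seconds / length

    def tagger_accuracy(self):
        """Running average of per-case tagger accuracy (a fraction in [0, 1])."""
        if self.cases == 0:
            return self.prior_accuracy
        return self.accuracy_sum / self.cases

    def number_needing_correction(self, length):
        """Approximation from the list above: (1 - accuracy) * length."""
        return (1.0 - self.tagger_accuracy()) * length

    def seconds_per_token(self):
        """Running average of time on previous cases, normalized by length."""
        if self.cases == 0:
            return 0.0
        return self.seconds_per_token_sum / self.cases

# Made-up numbers, purely for illustration.
est = RunningEstimates()
est.update(length=10, n_corrected=2, seconds=30.0)
est.update(length=20, n_corrected=2, seconds=40.0)
expected_corrections = est.number_needing_correction(length=20)
```

The estimates are read before `update` is called for the current case, so each feature depends only on information available at annotation time.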

Word at a Time

Batch Oracular Model

Descriptive Oracular Model

Annotation-Time Model

nlp-private/cost-models-from-the-user-study-data.txt · Last modified: 2015/04/23 20:39 by ryancha