This page discusses the Experimentation API defined within the edu.byu.nlp.experimentation package.

capabilities

The interfaces in this package allow implementors to define simple actions that can be performed on a given type of object, and to specify objects that can perform such actions.

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/capabilities/Capability.java Capability<T>]

    : The implementor can perform an action on objects of type T (by means of

    public void performAction(T onThisObject)

    )

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/capabilities/Capable.java Capable<T>]

    : The implementor provides a list of Capability<T>'s. Thus an implementor of Capable<T> is capable of performing various actions on T's.

In practice, the FEC user interface uses this abstraction to populate context menus. For example, when the user right-clicks a representation of an Experiment (which implements Capable<Experiment>), the pop-up context menu is populated, at least in part, by calling the Experiment's getCapabilities() method and adding a menu item for each Capability<Experiment> it returns.
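The menu-population pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the StatNLP source: only performAction(T) and getCapabilities() come from the description on this page, while getLabel(), DemoExperiment, and menuLabels are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Mirrors the interface described above; getLabel() is a hypothetical addition.
interface Capability<T> {
    String getLabel();
    void performAction(T onThisObject);
}

// An implementor advertises the actions that can be performed on it.
interface Capable<T> {
    List<Capability<T>> getCapabilities();
}

// Hypothetical stand-in for an Experiment implementation.
class DemoExperiment implements Capable<DemoExperiment> {
    String name;

    DemoExperiment(String name) { this.name = name; }

    @Override
    public List<Capability<DemoExperiment>> getCapabilities() {
        Capability<DemoExperiment> rename = new Capability<DemoExperiment>() {
            public String getLabel() { return "Rename"; }
            public void performAction(DemoExperiment e) { e.name = e.name + "-renamed"; }
        };
        Capability<DemoExperiment> reset = new Capability<DemoExperiment>() {
            public String getLabel() { return "Reset"; }
            public void performAction(DemoExperiment e) { e.name = "untitled"; }
        };
        return Arrays.asList(rename, reset);
    }
}

class CapabilityMenuSketch {
    // Collects one label per capability, as a context-menu builder might.
    static <T extends Capable<T>> List<String> menuLabels(T target) {
        List<String> labels = new ArrayList<>();
        for (Capability<T> capability : target.getCapabilities()) {
            labels.add(capability.getLabel());
        }
        return labels;
    }
}
```

A real menu builder would presumably attach each capability's performAction call to the corresponding menu item's action listener rather than just collecting labels.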

categorization

As classification/identification tasks involve placing various items into bins or categories, we provide means for describing such categorization systems. The primary interfaces are:

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/Categorization.java Categorization]

    : A String→Category map representing a system for categorizing things. Also provides a name (

    getName()

    ) and a convenient indexed-access method (

    getCategoryAtIndex(Integer i)

    ).

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/Category.java Category]

    : Represents a category within a categorization.

    getShortName()

    returns an identifier unique within the Categorization, while

    getLongName()

    gives a descriptive string for display purposes. These parallel methods may eventually be merged into a single

    getID()

    as the current two-method arrangement seems unnecessarily complex.

Some classes included for convenience:

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/Categorizations.java Categorizations]

    : A utility class that provides means of generating all two-category combinations of categories in a categorization, and a join facility like that provided by Ruby's Array class.

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/AbstractCategorization.java AbstractCategorization]

    /

    [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/SimpleCategorization.java SimpleCategorization]

    and

    [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/AbstractCategory.java AbstractCategory]

    /

    [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/categorization/SimpleCategory.java SimpleCategory]

    : Various degrees of partial and simple implementations of Categorization and Category.
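The two facilities attributed to Categorizations above (two-category combinations and a Ruby-style join) might look roughly like the sketch below. The actual signatures are not shown on this page, so the method names, parameter types, and the use of plain short-name strings instead of Category objects are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class CategoryPairsSketch {
    // Returns every unordered two-element combination, preserving input order.
    // (Hypothetical signature; the real method works on a categorization.)
    static List<String[]> generatePairs(List<String> shortNames) {
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < shortNames.size(); i++) {
            for (int j = i + 1; j < shortNames.size(); j++) {
                pairs.add(new String[] { shortNames.get(i), shortNames.get(j) });
            }
        }
        return pairs;
    }

    // Ruby-style join: concatenates the items with a separator between them.
    static String join(List<String> items, String separator) {
        return String.join(separator, items);
    }
}
```

For n categories this yields n(n-1)/2 pairs, which is the number of series a ONE_ON_ONE chart would need.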

display

Miscellaneous types that facilitate the graphical display of the data contained within an ExperimentationSystem's data structures.

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/display/ChartType.java enum ChartType]

    : This enumeration specifies whether a chart should be globally, categorically, or individually scoped. Specifically:

    • GLOBAL

      : One series should be generated aggregating across all categories.

    • PER_CATEGORY

      : One series should be generated for each category in the current categorization.

    • ONE_ON_ONE

      : One series should be generated for each two-category combination in the current categorization. See

      Categorizations.generateCategoryPairs

      .

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/display/AddableSeriesDataset.java AddableSeriesDataset]

    : This interface abstracts away from JFreeChart's Dataset type so we can avoid explicitly depending upon that library. It does this using the

    addSeries(String seriesLabel, List<Pair<Double, Double>> points)

    method, which allows for adding of a labeled two-dimensional data series.

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/display/NotableScoreTracker.java NotableScoreTracker]

    and

    BasicNotableScoreTracker

    : Mechanism for tracking high and low values. This becomes useful in displays like the CostComponentMatrix, though I can't at the moment say why this needs to be here in StatNLP. (Maybe it doesn't??)

  • [http://nlp.cs.byu.edu/trac/statnlp/browser/trunk/src/edu/byu/nlp/experimentation/display/PlotSeriesLoader.java PlotSeriesLoader<R extends ClassificationResult>]

    : Centering around the

    load(AddableSeriesDataset dataset, R result, ChartType type)

    method, a PlotSeriesLoader loads an R (Result) into an AddableSeriesDataset, generating GLOBAL, PER_CATEGORY, or ONE_ON_ONE series as requested. This allows an ExperimentationSystem to take responsibility for deciphering its own results and plotting them for higher-level display purposes. (Is it possible that this is unnecessary, as the ClassificationResult abstraction should provide enough information to do this anyway?)
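How the three ChartType scopes translate into series can be sketched as follows. This is a simplified illustration: the addSeries signature is taken from the description above, but the Pair class, InMemoryDataset, and a load method that takes a list of category names in place of a result object are hypothetical stand-ins. The series are left empty because the real points would be computed from the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

enum ChartType { GLOBAL, PER_CATEGORY, ONE_ON_ONE }

// Hypothetical minimal pair type; the real Pair class is not shown on this page.
final class Pair<A, B> {
    final A first;
    final B second;
    Pair(A first, B second) { this.first = first; this.second = second; }
}

interface AddableSeriesDataset {
    void addSeries(String seriesLabel, List<Pair<Double, Double>> points);
}

// Hypothetical in-memory implementation, standing in for a chart-library dataset.
class InMemoryDataset implements AddableSeriesDataset {
    final Map<String, List<Pair<Double, Double>>> series = new LinkedHashMap<>();

    @Override
    public void addSeries(String seriesLabel, List<Pair<Double, Double>> points) {
        series.put(seriesLabel, points);
    }
}

class SeriesLoaderSketch {
    // Adds one (empty) series per scope unit requested by the chart type.
    static void load(AddableSeriesDataset dataset, List<String> categories, ChartType type) {
        switch (type) {
            case GLOBAL:
                dataset.addSeries("all categories", emptySeries());
                break;
            case PER_CATEGORY:
                for (String category : categories) {
                    dataset.addSeries(category, emptySeries());
                }
                break;
            case ONE_ON_ONE:
                for (int i = 0; i < categories.size(); i++) {
                    for (int j = i + 1; j < categories.size(); j++) {
                        dataset.addSeries(categories.get(i) + " vs " + categories.get(j), emptySeries());
                    }
                }
                break;
        }
    }

    // Placeholder: a real loader would derive the points from the result.
    private static List<Pair<Double, Double>> emptySeries() {
        return new ArrayList<>();
    }
}
```

Because callers only see AddableSeriesDataset, a JFreeChart-backed implementation could be swapped in without the loader depending on that library.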

experiments

Experiments are, naturally, at the heart of the

experimentation

API. Three key interfaces make up a hierarchy such that

Experiment → ClassificationExperiment → IdentificationExperiment

(where → means "is the superclass of").

  • Experiment

    : Provides expected facilities such as name, description, and status indications for data generation and model training. Additionally, Experiment keeps track of FeatureTemplate's used in this experiment, of ExperimentData generated or retrieved, and of various Result's generated in experiment runs.

    • ClassificationExperiment

      : In addition to the methods provided by Experiment, ClassificationExperiment provides mechanisms that imply the existence and use of categorizations. The separation of Experiment and ClassificationExperiment implies that it may in the future be desirable to accommodate kinds of experiments other than classification and identification.

      • IdentificationExperiment

        : Further refinement of ClassificationExperiment to allow for continuous outcomes and thresholded results.

jobs

  • Job

    : Job is an abstract class extending Java 6's SwingWorker. Beyond SwingWorker's capabilities, Job provides type and description info, a locking mechanism, and Capability<Job>'s useful in managing running job instances. Job's are the common currency of FEC's EngineeringEnvironment thread pool/job execution system, as seen in

    EngineeringEnvironment.queueJob(Job job)

    . Thus, for an ExperimentationSystem to be able to schedule jobs as part of running its experiments, it must be able to produce an instance of a class extending Job.

metric

A standardized mechanism for defining metrics with which to evaluate classification or identification performance. This was derived from/inspired by Robbie's work in

edu.byu.langid.metric

. The two primary interfaces are:

  • DecisionCostFunction

    : This extension of

    edu.berkeley.nlp.math.Function

    guarantees that the implementor is an object capable of calculating

    cost(final int i)

    (the total cost of all errors for category i, where i is an index into a categorization),

    costComponent(final int i, final int j)

    (the relevant cost for all errors occurring at the combination of category i and category j), and

    averageCost()

    .

    • ThresholdedDCF

      : ThresholdedDCF adds three corresponding methods suffixed with -At and requiring an additional threshold argument:

      costAt(final int i, final double threshold)

      ,

      costComponentAt(final int i, final int j, final double threshold)

      , and

      averageCostAt(final double threshold)

      .

Beyond the fundamental interfaces, abstract partial implementations (

AbstractDCF

and

AbstractThresholdedDCF

) and one simplistic full implementation (

CountsOnlyDCF

) are provided for convenience.
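To make the three DecisionCostFunction methods concrete, here is a minimal counts-only sketch. It does not reproduce CountsOnlyDCF: the class name, the double return types, the error-count matrix representation, the unit cost per error, and averaging over categories are all assumptions made for illustration.

```java
class CountsDcfSketch {
    // errorCounts[i][j]: number of erroneous decisions for the combination of
    // category i and category j (hypothetical representation).
    private final int[][] errorCounts;

    CountsDcfSketch(int[][] errorCounts) {
        this.errorCounts = errorCounts;
    }

    // Cost of the errors at one (i, j) combination; each error costs 1 here.
    double costComponent(final int i, final int j) {
        return errorCounts[i][j];
    }

    // Total cost of all errors for category i.
    double cost(final int i) {
        double total = 0.0;
        for (int j = 0; j < errorCounts[i].length; j++) {
            total += costComponent(i, j);
        }
        return total;
    }

    // Mean of cost(i) over all categories (assumed definition).
    double averageCost() {
        double total = 0.0;
        for (int i = 0; i < errorCounts.length; i++) {
            total += cost(i);
        }
        return total / errorCounts.length;
    }
}
```

A thresholded variant would add costAt, costComponentAt, and averageCostAt, each recomputing the error counts for the given threshold before applying the same sums.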

models

  • ModelInfo

    : Meta-information about a model. This is an extremely simple interface derived from the former ModelInfo class (now ModelConfig) inside of ConfigParser. In addition, it subsumes the former edu.byu.nlp.experimentation.ModelFile. (I don't really know if this is necessary, but it's here and it's being used somewhere….)

    • SimpleModelInfo

      : A probably-superfluous subclass of ModelInfo.

quantization

Quantizations are used in the Language-ID feature definition system (which we hope to eventually generalize and incorporate into StatNLP), and a UI has been provided within FEC for managing and using them. Much like the Categorization/Category arrangement, we have these two interfaces:

  • Quantization

    : A Quantization assigns a value to the Quantile it belongs to, based on that value and on the minimum and maximum of each Quantile (this interface implies an internal representation of the Quantile's).

  • Quantile

    : A Quantile is a subdivision of a range, having a minimum and a maximum value. Each Quantile also provides a name (

    getName()

    ) and reports whether a given value falls inside it (

    Boolean isInside(Double value)

    ).
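The Quantization/Quantile relationship can be sketched as below. Only getName() and isInside(Double) are taken from the description above; the class names, the quantileFor lookup method, and the half-open [min, max) boundary convention are assumptions, since the page does not say how boundary values are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simple Quantile: a named sub-range [min, max).
class SimpleQuantile {
    private final String name;
    private final double min;
    private final double max;

    SimpleQuantile(String name, double min, double max) {
        this.name = name;
        this.min = min;
        this.max = max;
    }

    String getName() { return name; }

    // Assumed convention: minimum inclusive, maximum exclusive.
    Boolean isInside(Double value) {
        return value >= min && value < max;
    }
}

// Hypothetical simple Quantization: an ordered list of quantiles.
class SimpleQuantization {
    private final List<SimpleQuantile> quantiles = new ArrayList<>();

    void add(SimpleQuantile quantile) { quantiles.add(quantile); }

    // Returns the first quantile containing the value, or null if none does.
    SimpleQuantile quantileFor(double value) {
        for (SimpleQuantile quantile : quantiles) {
            if (quantile.isInside(value)) {
                return quantile;
            }
        }
        return null;
    }
}
```

With a half-open convention, adjacent quantiles that share a boundary never both claim the same value.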

results

Results are the point of experiments. They tell us how the system performed in its task. As with the experiments, results are defined using a three-interface hierarchy:

Result → ClassificationResult → IdentificationResult.

  • Result

    : Simply provides a name, a relevant Experiment, and a full name. (The name/full name distinction is unclear to me at the moment.)

    • ClassificationResult

      : Using the

      edu.byu.nlp.experimentation.trials.TrialType

      enum, this interface allows users to query counts and rates for all possible statuses of classifier trials (TrialType's

      TRUE_NEGATIVE

      ,

      FALSE_ALARM

      ,

      HIT

      ,

      MISS

      ) and in total (TrialType

      ALL_TRIALS

      ).

      • IdentificationResult

        : Extends ClassificationResult much as ThresholdedDCF extends DecisionCostFunction: using a series of -At methods, IdentificationResult allows queries based also on a threshold value. Additionally, IdentificationResult provides

        getCost()

        ,

        getEqualErrorRate()

        , and

        getOptimalGlobalThreshold()

        methods relevant to identification experiments.

Likewise, some useful classes are provided:

  • AbstractClassificationResult

    : This key class uses a HashMap-based structure to keep track of every trial in a classification run. The

    TruthMap trials

    member points to various HypothesisMap's, which in turn provide TrialGroup's from which individual trials can be extracted. A MetaIterator class provides for one-at-a-time iteration over all of the trials referenced by the TruthMap. AbstractClassificationResult also provides the obvious implementations of getCount and getRate methods.

    • AbstractIdentificationResult

      : This class expands upon its parent (AbstractClassificationResult) in obvious ways suggested by the IdentificationResult interface.
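The count and rate queries described for ClassificationResult can be illustrated with a small tally. This sketch does not reproduce the TruthMap/HypothesisMap structure of AbstractClassificationResult; the TrialTallySketch class and its record method are hypothetical, and defining a rate as a fraction of all recorded trials is an assumption (the real implementation may normalize differently, for example per target or non-target trials).

```java
import java.util.EnumMap;
import java.util.Map;

enum TrialType { TRUE_NEGATIVE, FALSE_ALARM, HIT, MISS, ALL_TRIALS }

class TrialTallySketch {
    private final Map<TrialType, Integer> counts = new EnumMap<>(TrialType.class);

    // Records one trial outcome; ALL_TRIALS is maintained as the running total.
    void record(TrialType type) {
        if (type == TrialType.ALL_TRIALS) {
            throw new IllegalArgumentException("ALL_TRIALS is a selector, not an outcome");
        }
        counts.merge(type, 1, Integer::sum);
        counts.merge(TrialType.ALL_TRIALS, 1, Integer::sum);
    }

    int getCount(TrialType type) {
        return counts.getOrDefault(type, 0);
    }

    // Assumed definition: fraction of all recorded trials with this status.
    double getRate(TrialType type) {
        int total = getCount(TrialType.ALL_TRIALS);
        return total == 0 ? 0.0 : (double) getCount(type) / total;
    }
}
```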

systems

To begin outfitting a system for FEC-based interaction, implement either ClassificationSystem or IdentificationSystem as appropriate. As with other hierarchies, the distinction between ExperimentationSystem and ClassificationSystem is somewhat arbitrary and implies the possibility of future implementors of ExperimentationSystem that are neither classification nor identification systems.

  • ExperimentationSystem<E extends Experiment>

    : Implementors of this interface are systems for managing an entire environment for experimentation. These systems expose, through this interface, mechanisms for managing or inspecting experiments, results, quantizations, feature templates, runs, and regression tests.

    • ClassificationSystem<E extends ClassificationExperiment>

      : Beyond ExperimentationSystem, ClassificationSystem provides means of dealing with categorizations, one-on-one experiments, and TrialTypeGroup's.

      • IdentificationSystem<E extends IdentificationExperiment>

        : Aside from narrower bounds for the type parameter, this interface does not currently add anything to ClassificationSystem.
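The way the type parameter bound narrows down the hierarchy can be sketched as follows. The interface names and bounds are those given above; the getName() and getExperiments() methods and the two Demo classes are hypothetical, included only so the example compiles and can be exercised.

```java
import java.util.ArrayList;
import java.util.List;

interface Experiment { String getName(); }
interface ClassificationExperiment extends Experiment { }
interface IdentificationExperiment extends ClassificationExperiment { }

// Each level narrows the bound on E to the matching experiment type.
interface ExperimentationSystem<E extends Experiment> {
    List<E> getExperiments(); // hypothetical accessor
}
interface ClassificationSystem<E extends ClassificationExperiment>
        extends ExperimentationSystem<E> { }
interface IdentificationSystem<E extends IdentificationExperiment>
        extends ClassificationSystem<E> { }

// Hypothetical concrete experiment and system.
class DemoIdExperiment implements IdentificationExperiment {
    private final String name;
    DemoIdExperiment(String name) { this.name = name; }
    @Override
    public String getName() { return name; }
}

class DemoIdSystem implements IdentificationSystem<DemoIdExperiment> {
    private final List<DemoIdExperiment> experiments = new ArrayList<>();

    void add(DemoIdExperiment experiment) { experiments.add(experiment); }

    @Override
    public List<DemoIdExperiment> getExperiments() { return experiments; }
}
```

Code written against ExperimentationSystem<? extends Experiment> accepts a DemoIdSystem unchanged, while code that needs identification-specific behaviour can require the narrower bound.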

tests

  • RegressionTest

    : This painfully ad hoc interface provides a standard way to run and receive results from regression tests. Some of the semantics of

    passed()

    and

    hasBeenRunBefore()

    are a bit weird, if I remember right.

trials

A trial is an instance of a classifier attempting to classify something, generally as part of a classification run that processes many things in a batch.

  • ClassificationTrial

    : ClassificationTrial's store information about the name of the sample in question, the true category in which it belongs, and the hypothesized category that the classifier is being queried about. The interface also allows inspection of the classifier's conclusion (

    hypothesisAssertedAsTrue()

    ) and inspection of the logical status of that decision by means of the

    is-

    methods. Thus it can quickly be determined if the trial resulted in a false alarm, miss, true negative, or hit.

    • IdentificationTrial

      : IdentificationTrial adds to ClassificationTrial thresholded versions of the

      is-

      methods and

      hypothesisAssertedAsTrue

      , plus

      Double getOutcome()

      to retrieve a rating of the classifier's confidence that the hypothesis is true.

  • enum TrialType

    : Specifies the logical status of a trial:

    • TRUE_NEGATIVE

      : The sample was not of class hypothesis, and the classifier correctly determined this to be the case.

    • FALSE_ALARM

      : The sample was not of class hypothesis, but the classifier erroneously asserted that it was.

    • HIT

      : The sample was of class hypothesis and the classifier correctly determined this to be the case.

    • MISS

      : The sample was of class hypothesis, but the classifier erroneously asserted that it was not.

    • ALL_TRIALS

      : This special value does not fit into the truth table to which the previous four correspond, but is rather used when trials of any logical status need to be selected.
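The truth table behind the first four TrialType values, and its thresholded counterpart for identification trials, can be written out as a short sketch. The TrialTypeSketch class and its method names are hypothetical, and treating an outcome at or above the threshold as "asserted true" is an assumption about how the thresholded methods behave.

```java
enum TrialType { TRUE_NEGATIVE, FALSE_ALARM, HIT, MISS, ALL_TRIALS }

class TrialTypeSketch {
    // Maps (ground truth, classifier decision) onto the four-way truth table.
    static TrialType classify(boolean sampleIsOfHypothesizedClass,
                              boolean hypothesisAssertedAsTrue) {
        if (sampleIsOfHypothesizedClass) {
            return hypothesisAssertedAsTrue ? TrialType.HIT : TrialType.MISS;
        }
        return hypothesisAssertedAsTrue ? TrialType.FALSE_ALARM : TrialType.TRUE_NEGATIVE;
    }

    // Thresholded variant: the decision is derived from a continuous outcome.
    // Assumption: outcome >= threshold means the hypothesis is asserted as true.
    static TrialType classifyAt(boolean sampleIsOfHypothesizedClass,
                                double outcome, double threshold) {
        return classify(sampleIsOfHypothesizedClass, outcome >= threshold);
    }
}
```

ALL_TRIALS never comes out of either method, which matches its role as a selector rather than a trial status.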

Other Classes

  • ExperimentData
  • ExperimentRun
  • FeatureTemplate


nlp-private/edu.byu.nlp.experimentation.txt · Last modified: 2015/04/23 13:16 by ryancha