

Planned Meeting Topics


Past Meetings


Future Meeting Agenda Items

For Now:

  • Review annotation tools, e.g. Lehigh's http://dae.cse.lehigh.edu/DAE/, GEDI, PixLabeler.
  • Dan and Dr. Ringger present a good paper on relevant current research.
  • Get the CIKM “AND” proceedings for '07, '08, and '09; this is a very relevant venue for this group.
  • Josh Hansen's HMM-LDA project?
  • Report on state of data.
  • Decide on an annotation tool (Image Annotation Tools)
  • Decide on an OCR engine (OCR Engines)
  • Document image data sets (Document Image Data Sets)
  • Funding discussion.
  • Aaron to demo FOCIH-based image annotation tool.

For Later:

  • Create annotation plan for Ancestry and other data.
  • Hire undergrads, ask the library, or request Ancestry employees to do annotation.
  • Dr. Ringger to invite Dan Lopresti to come to BYU.
  • Decide whether to organize a competition.
  • Get more clarity on directions and contributions, including specific projects, which tools to use (OCRopus?), how to leverage those tools, what data to work on, and how to magnify our efforts by working together as a group.
  • Plan for zoning, layout, language modeling, table interpretation, and E/R classification earlier in the (OCRopus) pipeline.
  • Should we use OCRopus, train a character recognition model on Internet Archive data, hire an undergrad to do this and also output bounding boxes, etc.?
  • CHURP to NSF Proposal (sometime in June to August).
  • OCRopus update.


Major Tasks

These are big tasks affecting everyone in the NOCR group.


Participants

  • Aaron Stewart
  • Bill Lund
  • Dan Walker
  • Thomas Packer (tpacker@byu.net)
  • Dr. David Embley
  • Dr. Eric Ringger
nlp-private/noisy-ocr-group.txt · Last modified: 2015/04/23 19:20 by ryancha
CC Attribution-Share Alike 4.0 International