nlp:on-ramp [2015/04/21 22:23] (current)
ryancha created
 +... to the ALFA project.
  
 +== April 28, 2009 ==
 +(continued on 4/30/09)
 +
 +* Topic: Welcome
 +* Topic: Intro. to Probability Theory
 +** Presenter: Eric Ringger
 +* Reading assignment: Manning & Schuetze 2.1, 2.2, 3, 4
 +* Reading assignment: Russell & Norvig 14.1-14.4
 +* Homework: https://cswiki.cs.byu.edu/cs479/index.php/Homework_0.1
 +* Optional homework: https://cswiki.cs.byu.edu/cs479/index.php/Homework_0.2
 +
 +== May 5, 2009 ==
 +
 +* Topic: Word Sense Disambiguation as motivation for Feature Engineering
 +** Presenter: Eric Ringger
 +* Topic: Feature Engineering Console
 +** Presenter: Josh Hansen
 +* Topic: Maximum Entropy Models
 +** Presenter: Peter McClanahan
 +* Reading assignment: M&S 7, M&S 16
 +* Optional reading assignment: [http://www.cs.cmu.edu/afs/cs/user/aberger/www/html/tutorial/tutorial.html Berger's MaxEnt tutorial]
 +* Homework: https://cswiki.cs.byu.edu/cs479/index.php/Project_2.2
 +** BUT: Use the Feature Engineering Console! http://nlp.cs.byu.edu/mediawiki-private/index.php/Feature_Engineering_Console (on the Private wiki -- BYU NLP only -- requires authentication)
 +** Write as little extra code as possible. Possible exceptions: new feature templates/extractors.
 +** Work with Josh Hansen if you want to improve the FEC itself.
 +
 +== May 12, 2009 ==
 +
 +* Topic: Active Learning
 +** Presenter: Robbie Haertel
 +* Reading assignment: [http://pages.cs.wisc.edu/~bsettles/pub/settles.activelearning.pdf Survey of Active Learning by Burr Settles]
 +* Homework:
 +** Implement one active learning selection function
 +** Reference: http://nlp.cs.byu.edu/mediawiki/index.php/Using_the_active_learner
 +** Plot learning curve for chosen function, versus random, using Gnuplot or Excel
 +** Work with Robbie Haertel to bring the plotting code back to life
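
As a rough illustration of what "one active learning selection function" could look like, here is a minimal sketch of entropy-based uncertainty sampling next to the random baseline it would be plotted against. It is not the interface of the project's active learner: the function names and the representation of the pool as a list of predicted class distributions are assumptions made for this example.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_by_entropy(pool_probs, batch_size):
    """Pick the batch_size pool items whose predicted distribution is most uncertain.

    pool_probs is a hypothetical input: one list of class probabilities per
    unlabeled item, as produced by whatever model is currently trained.
    """
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:batch_size]


def select_random(pool_probs, batch_size, rng=random):
    """Random baseline: pick batch_size distinct pool items uniformly at random."""
    return rng.sample(range(len(pool_probs)), batch_size)


if __name__ == "__main__":
    pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
    print(select_by_entropy(pool, 2))
    print(select_random(pool, 2))
```

Running each selector inside the same train/select/label loop and recording accuracy after every batch gives the two learning curves to compare.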
 +
 +== May 19, 2009 ==
 +
 +* Topic: Sequence Labeling
 +** Presenter: George Busby
 +* Reading assignment: M&S 9, M&S 10
 +* Reading assignment: [http://faculty.cs.byu.edu/~ringger/CS401R/papers/ToutanovaManning_POS-emnlp2000.pdf Paper by Toutanova & Manning on MEMMs]
 +* Optional reading assignment: [http://faculty.cs.byu.edu/~ringger/CS401R/papers/Brants_POS-00tnt.pdf Paper on TnT by Brants]
 +* Homework:
 +** Continue on Active Learning experiments
 +** Focus on PNP classification task
 +*** Plot the mean over multiple (around 5) random runs against the number of iterations.
 +*** It would also be interesting to plot the variance over those random runs against the number of iterations.
 +** Try the POS tagging task with a small batch size B and number of iterations N, such that N x B is approximately 300
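
The per-iteration mean and variance over several random runs can be computed along the following lines. This is only a sketch: it assumes each run is stored as a list of accuracies, one per iteration, and the helper names are made up for the example.

```python
from statistics import mean, variance


def summarize_runs(runs):
    """Return (means, variances) per iteration across runs.

    runs is a hypothetical input: one list of accuracies per run, all of the
    same length. At least two runs are needed for the sample variance.
    """
    per_iteration = list(zip(*runs))  # regroup: one tuple of values per iteration
    means = [mean(values) for values in per_iteration]
    variances = [variance(values) for values in per_iteration]
    return means, variances


def iterations_for_budget(batch_size, budget=300):
    """Number of iterations N such that N x B is roughly the annotation budget."""
    return round(budget / batch_size)


if __name__ == "__main__":
    runs = [[0.50, 0.60, 0.65], [0.70, 0.80, 0.85]]
    means, variances = summarize_runs(runs)
    print(means, variances)
    print(iterations_for_budget(10))
```

The two returned lists can be written out as columns for Gnuplot or pasted into Excel, with the iteration number as the x-axis.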
 +
 +== May 26, 2009 ==
 +
 +* Topic: Intro. to StatNLP Code-base
 +** Presenter: Robbie Haertel
 +* Reading assignment: [http://aclweb.org/anthology-new/W/W07/W07-1516.pdf our ALFA Paper at the LAW]
 +* Reading assignment: [http://aclweb.org/anthology-new/D/D07/D07-1051.pdf Tomanek et al., "An Approach to Text Corpus Construction which Cuts Annotation Costs and Maintains Reusability of Annotated Data"]
 +* Homework: https://cswiki.cs.byu.edu/cs479/index.php/Project_3.1
 +
 +== May 31 - June 6, 2009 ==
 +
 +* [http://www.naacl2009.org NAACL HLT 2009]
 +* [http://nlp.cs.byu.edu/alnlp/ Workshop on Active Learning for NLP]
 +
 +== June 9, 2009 ==
 +
 +* Topic: Named Entity Recognition
 +** Presenter: Eric Ringger
 +* Reading assignment: [http://nlp.stanford.edu/pubs/conll-ner.pdf Klein et al. paper from 2003: "Named Entity Recognition with Character-Level Models"]
 +* Reading assignment: [http://www.cs.umass.edu/~mccallum/papers/mccallum-conll2003.pdf McCallum et al. paper in CoNLL 2003: "Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons"]
 +* Reading assignment: [http://l2r.cs.uiuc.edu/~danr/Papers/RatinovRo09.pdf Ratinov and Roth paper in CoNLL 2009: "Design Challenges and Misconceptions in Named Entity Recognition"]
 +* Homework:
 +** Data: CoNLL 2003 Named Entity shared task data set: http://www.cnts.ua.ac.be/conll2003/ner/
 +** Baseline: dictionary look-up method on CoNLL named entity recognition shared task data
 +*** dictionary is simply the list of named entities extracted from the training set
 +** Baseline: MEMM for Named Entity Recognition on the CoNLL data
 +*** Improve on this by doing error analysis and feature engineering, as you did for the POS tagging task
 +** Run both methods (dictionary look-up and MEMM) on noisy OCR data
 +** Coordinate with Thomas Packer for noisy OCR data (esp. the labeled dev test set)
 +*** Private wiki site for the noisy OCR data: http://nlp.cs.byu.edu/mediawiki-private/index.php/Ancestry_dot_Com
 +** Pick one 3rd-party tool (distinct from other students) from the list of open source tools on Wikipedia: http://en.wikipedia.org/wiki/Named_entity_recognition
 +*** Prefer one of the following:
 +**** Stanford Named Entity tagger
 +**** CCG group at UIUC: Named Entity + semantic role-labeling tagger
 +**** Mallet from U. Mass. Amherst
 +** Run 3rd-party tool on CoNLL data and noisy OCR data
 +** Report results
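
The dictionary look-up baseline described above might be sketched as follows. Several things here are assumptions rather than part of the assignment: the training data is taken to be already loaded as sentences of (token, tag) pairs with BIO-style tags, matching is greedy longest-match, an entity string seen with several types keeps the first one, and all function names are invented for the example.

```python
def extract_entities(tagged_sentence):
    """Collect (token_tuple, entity_type) pairs from one sentence of (token, tag) pairs."""
    entities, current, current_type = [], [], None
    for token, tag in tagged_sentence:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type)
        if starts_new:
            if current:
                entities.append((tuple(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-"):
            current.append(token)  # continues the open entity
        else:  # "O" closes any open entity
            if current:
                entities.append((tuple(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((tuple(current), current_type))
    return entities


def build_dictionary(training_sentences):
    """Map each entity (as a tuple of tokens) to the first type seen in training."""
    dictionary = {}
    for sentence in training_sentences:
        for tokens, entity_type in extract_entities(sentence):
            dictionary.setdefault(tokens, entity_type)
    return dictionary


def tag_with_dictionary(tokens, dictionary):
    """Tag tokens by greedy longest match against the dictionary."""
    max_len = max((len(key) for key in dictionary), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            entity_type = dictionary.get(tuple(tokens[i:i + n]))
            if entity_type is not None:
                tags[i] = "B-" + entity_type
                tags[i + 1:i + n] = ["I-" + entity_type] * (n - 1)
                i += n
                break
        else:
            i += 1  # no entry starts at this position
    return tags
```

Because the dictionary matches exact token sequences, it is a natural point of comparison on the noisy OCR data, where a single misrecognized character is enough to miss an entry.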
 +
 +== June 16, 2009 ==
 +
 +* Topic: User Study and Regression Results
 +** Presenter: Kevin Seppi
 +** Reading assignment: [http://www.lrec-conf.org/proceedings/lrec2008/pdf/832_paper.pdf our LREC Paper]
 +** Reading assignment: [http://aclweb.org/anthology-new/W/W09/W09-1903.pdf Shilpa Arora et al.'s paper at ALNLP 2009]
 +* Homework:
 +** Reproduce the regression results from the LREC paper using R
 +** Apply methods from Arora's study to our data using SVM Regression
 +*** overall cost model
 +*** per-subject cost model
 +*** per-subject-type cost model
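
The three granularities of cost model differ only in how the timing records are grouped before fitting. The sketch below illustrates that grouping with a plain ordinary least-squares line fit standing in for the SVM regression named in the homework. The record fields (`subject`, `subject_type`, `length`, `seconds`) and the choice of a single predictor are hypothetical and are not taken from either paper.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes the x values are not all identical
    return slope, mean_y - slope * mean_x


def fit_cost_models(records, key=None):
    """Fit one model per group of records.

    key=None gives a single overall model (stored under "all");
    key="subject" or key="subject_type" gives one model per group.
    """
    groups = defaultdict(list)
    for record in records:
        group = record[key] if key else "all"
        groups[group].append((record["length"], record["seconds"]))
    return {group: fit_line(points) for group, points in groups.items()}


if __name__ == "__main__":
    records = [
        {"subject": "a", "subject_type": "novice", "length": 1, "seconds": 2.0},
        {"subject": "a", "subject_type": "novice", "length": 2, "seconds": 4.0},
        {"subject": "b", "subject_type": "expert", "length": 1, "seconds": 3.0},
        {"subject": "b", "subject_type": "expert", "length": 2, "seconds": 5.0},
    ]
    print(fit_cost_models(records))
    print(fit_cost_models(records, key="subject"))
```

Swapping `fit_line` for a call into an SVM regression package would leave the grouping logic unchanged.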
nlp/on-ramp.txt · Last modified: 2015/04/21 22:23 by ryancha
CC Attribution-Share Alike 4.0 International