This shows you the differences between two versions of the page.


cs-401r:assignment-b [2014/12/23 15:10] ringger [Language Modeling] |
cs-401r:assignment-b [2014/12/23 15:17] ringger [Report Requirements] |

Line 55: | Line 55: | ||

Once you have these probabilities you can use the Viterbi algorithm to find the most likely sequence of tags given an unlabeled sequence of text. The Viterbi algorithm is described in section 15.2.3 (for some reason, they do not tell you it is called Viterbi until the end of the section!). | Once you have these probabilities you can use the Viterbi algorithm to find the most likely sequence of tags given an unlabeled sequence of text. The Viterbi algorithm is described in section 15.2.3 (for some reason, they do not tell you it is called Viterbi until the end of the section!). | ||
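As a rough illustration of the decoding step described above, here is a minimal first order Viterbi sketch in Python. The function name `viterbi`, the dictionary layout of `start_p`, `trans_p`, and `emit_p`, and the `floor` used in place of real smoothing are all assumptions made for this example; they are not taken from the assignment or the textbook.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Most likely tag sequence for `words` under a first order HMM.

    Assumed (hypothetical) layout: start_p[tag], trans_p[prev_tag][tag],
    emit_p[tag][word]. Missing entries count as probability 0 and are
    clamped to `floor`, which is a crude stand-in for smoothing.
    """
    if not words:
        return []

    def lp(p):
        # Work in log space so long sentences do not underflow.
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag path ending in t at position i.
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((s, best[i - 1][s] + lp(trans_p[s].get(t, 0.0))) for s in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path
```

Each position costs one pass over all pairs of tags, so decoding a sentence of n words with T tags takes on the order of n·T² steps.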

- | Repeat this task but for a second order Markov process for the transitions, where the transition probabilities depend on two POSs of context instead of just one. | + | Repeat this task but for a second order Markov process for the transitions, where the transition probabilities depend on two POSs of context instead of just one. Lecture #23 includes some helpful explanation about how to extend the method to Markov order two. |
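One possible way to estimate second order transition probabilities is by relative frequency over tag trigrams, sketched below. The function name `second_order_transitions`, the `<s>` padding symbol, and the nested dictionary output are assumptions for illustration only, and the estimates are unsmoothed, so unseen contexts simply have no entry.

```python
from collections import Counter, defaultdict

def second_order_transitions(tag_sequences, start="<s>"):
    """Estimate P(tag | two previous tags) by relative frequency.

    `tag_sequences` is an iterable of tag lists, one per sentence.
    Each sentence is padded with two hypothetical start symbols so the
    first two tags also have a two-tag context.
    """
    counts = defaultdict(Counter)
    for tags in tag_sequences:
        padded = [start, start] + list(tags)
        # Slide a window of three tags: (context a, b) followed by c.
        for a, b, c in zip(padded, padded[1:], padded[2:]):
            counts[(a, b)][c] += 1
    # Normalize each context's counts into a probability distribution.
    return {
        ctx: {t: n / sum(nxt.values()) for t, n in nxt.items()}
        for ctx, nxt in counts.items()
    }
```

Decoding with these probabilities can reuse a first order decoder if each state is treated as a pair of consecutive tags, at the cost of a much larger state space.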

== Data == | == Data == | ||

Line 73: | Line 72: | ||

== Report Requirements == | == Report Requirements == | ||

- | Please limit your report to 6 pages of prose, not including large tables or figures. | + | |

+ | ''Please limit the non-code portion of your report to about 6 pages.'' | ||

Write a report on the work you have performed. You should describe what you built, what choices you had to make, why you made the choices you did, how well they worked out, and what you might do to improve things further. | Write a report on the work you have performed. You should describe what you built, what choices you had to make, why you made the choices you did, how well they worked out, and what you might do to improve things further. | ||

- | * [4 points] Clear writing is an important aspect of your report. Also, label the sections of your report. Include an introduction and conclusion. Structure your report in such a way that it is easy to read and follow. | + | * [10 points] Clear writing is an important aspect of your report. Also, label the sections of your report. Include an introduction and conclusion. Structure your report in such a way that it is easy to read and follow. |

- | * [4 points] Show that you obtained reasonable transition probabilities in the first order language model. Here and below, when I ask for probabilities I expect to see numbers for the training data or for some test case that you provide. | + | * [5 points] Show that you obtained reasonable transition probabilities in the first order language model. Here and below, when I ask for probabilities I expect to see numbers for the training data or for some test case that you provide. |

- | * [4 points] Show that you can generate reasonable looking random text from the first order model. Again, here and below, I expect to see generated text. | + | * [5 points] Show that you can generate reasonable looking random text from the first order model. Again, here and below, I expect to see generated text. |

- | * [3 points] Show that you obtained reasonable transition probabilities in the second order language model. | + | * [5 points] Show that you obtained reasonable transition probabilities in the second order language model. |

- | * [3 points] Show that you can generate better looking random text from the second order model. | + | * [5 points] Show that you can generate better looking random text from the second order model. |

* [5 points] Show that you obtained reasonable transition and emission probabilities in the first order HMM. | * [5 points] Show that you obtained reasonable transition and emission probabilities in the first order HMM. | ||

- | * [10 points] Show that you can predict reasonable looking POS tags for new (test) text using the first order HMM. Show some tags and the sentence they came from. | + | * [15 points] Show that you can predict reasonable looking POS tags for new (test) text using the first order HMM. Show some sentences and their tags. |

* [5 points] Show that you obtained reasonable transition and emission probabilities in the second order HMM. | * [5 points] Show that you obtained reasonable transition and emission probabilities in the second order HMM. | ||

- | * [12 points] Show that you can predict better POS tags for new (test) text using the second order HMM. | + | * [10 points] Show that you can predict better POS tags for new (test) text using the second order HMM. |

- | * [8 points] Include confusion matrices (described in class, or see http://en.wikipedia.org/wiki/Confusion_matrix) to evaluate your ability to predict POS tags for the first and second order models (above). Since the full tag confusion matrices are large, you could display only the most interesting parts of a confusion matrix. One possible way to do this is to implement a total threshold to filter rows or columns (so that rows and columns which are confused fewer than a given number of times will not be displayed). I still expect to see a matrix, not just a few isolated values. | + | * [15 points] Include confusion matrices (described in class, or see http://en.wikipedia.org/wiki/Confusion_matrix) to evaluate your ability to predict POS tags for the first and second order models (above). Since the full tag confusion matrices are large, you could display only the most interesting parts of a confusion matrix. One possible way to do this is to implement a total threshold to filter rows or columns (so that rows and columns which are confused fewer than a given number of times will not be displayed). I still expect to see a matrix, not just a few isolated values. |

- | * [8 points] Throughout, provide evidence that you looked at the specific behavior of your models, thought about what they are doing, and took appropriate action when you found something that was wrong. | + | * [10 points] Throughout, provide evidence that you looked at the specific behavior of your models, thought about what they are doing, and took appropriate action when you found something that was wrong. |

- | * [6 points] Working code. | + | * [10 points] Working code. |
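The threshold idea for trimming a confusion matrix, mentioned in the list above, could be sketched as follows. The helper names `confusion_counts` and `filtered_matrix` are hypothetical, and this particular filter (keep a tag when its off-diagonal row total plus off-diagonal column total reaches the threshold) is only one reading of "a total threshold"; other filters would be equally valid.

```python
from collections import Counter

def confusion_counts(gold, predicted):
    """Count (gold tag, predicted tag) pairs over aligned tag lists."""
    return Counter(zip(gold, predicted))

def filtered_matrix(counts, threshold):
    """Return (tags, matrix) restricted to frequently confused tags.

    A tag is kept when the errors in its row (it was the gold tag) plus
    the errors in its column (it was the wrong prediction) total at
    least `threshold`. Rows are gold tags, columns are predicted tags.
    """
    errors = Counter()
    for (g, p), n in counts.items():
        if g != p:
            errors[g] += n  # off-diagonal mass in row g
            errors[p] += n  # off-diagonal mass in column p
    keep = sorted(t for t, n in errors.items() if n >= threshold)
    matrix = [[counts.get((g, p), 0) for p in keep] for g in keep]
    return keep, matrix
```

The result is still a square matrix over the surviving tags, so it can be printed as a table with `keep` supplying both the row and column labels.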

You are also required to include at the top of page 1 of your report a clear measure (in hours) of how long it took you (each) to complete this project. | You are also required to include at the top of page 1 of your report a clear measure (in hours) of how long it took you (each) to complete this project. | ||

Please also include in your report a short section titled "Feedback". Reflect on your experience in the project and provide any concrete feedback you have for us that would help make this project a better learning experience. | Please also include in your report a short section titled "Feedback". Reflect on your experience in the project and provide any concrete feedback you have for us that would help make this project a better learning experience. |