The final exam will be held in the classroom on the date scheduled by the University (see the course schedule on Learning Suite). You will have a 3-hour time limit. The exam is closed book and closed notes. No calculators or digital assistants are allowed (you won't need one). I think a well-prepared student will be able to complete the exam in two hours. You're going to do well!

The exam is comprehensive. The format is short answers, worked mathematical solutions, and possibly some T/F.

The difficulty level is comparable to that of the mid-term exam.

You will be expected to show your work. You will be graded not only on the correctness of your answers but also on how clearly you express the rationale for them, so be neat. It is your job to make your understanding clear to the graders; messy work is likely to earn a lower grade. If using a pencil (rather than a pen) helps you be neat, please plan accordingly.

I recommend the following activities to study the topics covered by the exam:

- Review the lecture notes and identify the topics we emphasized in class. Focus on those topics listed below.
- Compare the homework solution keys to your homework assignments, and make sure that you understand the major *principles* covered in the homework problems.
- While you are reviewing the lecture notes, the homework solutions, and the topics in the mid-term study guide and this final study guide, I **strongly encourage** you to build the following lists:
  - Problems (e.g., classification, clustering)
  - Models (e.g., Naive Bayes, Gaussian Mixture Model)
  - Algorithms (e.g., the Viterbi algorithm, the Expectation Maximization (EM) algorithm)
  - Theories (e.g., probability theory)
  - Methodologies (e.g., feature engineering, unsupervised learning)

- Identify common themes and ideas within each of the lists. This will aid you in organizing your thoughts and making comparisons and contrasts.

The final exam is **comprehensive** and will cover a subset of the following topics as well as topics from the mid-term exam study guide:

- Steps of the Expectation Maximization algorithm
- Mixture models
- Mixture of multinomials model
- NO deriving new EM algorithms for new models
- Initialization for Expectation Maximization
- Computing the likelihood of the data according to a model
- Converting likelihood expressions into log-space
- Interpreting Hierarchical Bayesian models
- (Multivariate) Gaussian distributions
- Gaussian Mixture Models (GMMs)
- Sequence labeling
- Part-of-speech tagging
- Hidden Markov Models (HMMs)
- Independence assumptions in HMMs
- The Viterbi algorithm
- Components of a speech recognition system
- Application of HMMs in speech recognition
- Application of GMMs in speech recognition
- Formulating recognition problems in the source/channel (aka “noisy channel”) paradigm
- Language models as Markov chains
- Decoding as search
- Beam search as an approximation to the Viterbi algorithm
- The Monte Carlo principle
- Gibbs Sampling
- Justifying steps in the derivation of complete conditional distributions for Gibbs sampling
- NO novel derivations of complete conditional distributions for Gibbs sampling
- Document clustering with Gibbs sampling on a mixture of multinomials
- Metrics for clustering
- Topic modeling and topic discovery
- Latent Dirichlet Allocation (LDA): the generative story and model
- Inference in LDA using Gibbs sampling
- Strengths and limitations of joint models
- Strengths and limitations of conditional models
- Answering conditional queries using a joint model versus using a conditional model directly
- Maximum entropy classifiers / Logistic regression
- NO derivations of gradients of the likelihood (using differential calculus) for gradient descent / ascent learning of maximum entropy model parameters
- The feature engineering cycle
- Pros and cons of Naive Bayes versus Maximum entropy as classifiers
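
As one illustration of two topics from the list above ("Computing the likelihood of the data according to a model" and "Converting likelihood expressions into log-space"), here is a minimal Python sketch for a mixture of multinomials. It is a study aid under stated assumptions, not code from the course: the function names `log_sum_exp` and `doc_log_likelihood`, the dictionary-based data layout, and the toy probabilities are all hypothetical, and the multinomial coefficient is omitted because it does not depend on the cluster.

```python
import math


def log_sum_exp(log_values):
    """Return log(sum(exp(v) for v in log_values)) without underflow.

    Factoring out the largest term keeps every exponent <= 0.
    """
    largest = max(log_values)
    if largest == float("-inf"):
        return largest  # every term has probability zero
    return largest + math.log(sum(math.exp(v - largest) for v in log_values))


def doc_log_likelihood(word_counts, log_priors, log_word_probs):
    """Log-likelihood of one document under a mixture of multinomials.

    Hypothetical data layout (an assumption for this sketch):
      word_counts    -- dict: word -> count in the document
      log_priors     -- list: log P(cluster k)
      log_word_probs -- list of dicts: word -> log P(word | cluster k)
    """
    per_cluster = []
    for log_prior, log_probs in zip(log_priors, log_word_probs):
        # In log-space, the product over words becomes a sum of count * log p.
        total = log_prior
        for word, count in word_counts.items():
            total += count * log_probs[word]
        per_cluster.append(total)
    # The sum over clusters stays a sum, so it needs log-sum-exp.
    return log_sum_exp(per_cluster)


# Toy example: two equally likely clusters over the vocabulary {"a", "b"}.
log_priors = [math.log(0.5), math.log(0.5)]
log_word_probs = [
    {"a": math.log(0.9), "b": math.log(0.1)},
    {"a": math.log(0.2), "b": math.log(0.8)},
]
short_doc = {"a": 2, "b": 1}
long_doc = {"a": 10000, "b": 10000}

short_ll = doc_log_likelihood(short_doc, log_priors, log_word_probs)
long_ll = doc_log_likelihood(long_doc, log_priors, log_word_probs)
```

For the short document, the result can be checked by hand: 0.5 · 0.9² · 0.1 + 0.5 · 0.2² · 0.8 = 0.0565, and `math.exp(short_ll)` agrees with that value up to floating-point rounding. For the long document, multiplying the raw probabilities would underflow to zero in floating point, while the log-space version stays finite.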