# Mid-Term Exam Study Guide

## Plan

The mid-term exam is offered in the Testing Center on a Thursday, Friday, and Saturday (see the course schedule on Learning Suite). You will have a three-hour time limit. The exam is closed book, and you may not use notes, calculators, or digital assistants (you won't need them). I think a well-prepared student will be able to complete the exam in two hours. You're going to do well!

The format consists of short-answer questions, worked mathematical solutions, and possibly some true/false questions.

## Study

I recommend the following activities to study the topics covered by the exam:

- Review the lecture notes and identify the topics we emphasized in class. Focus on those listed below.
- Compare the homework solution keys to your homework assignments, and make sure that you understand the major principles covered in the homework problems.

## Topics

The exam will cover a subset of the following topics:

1. Probability theory: sample spaces, sigma algebras, probability functions
2. The three axioms of probability
3. NO proofs involving set theory
4. Definition of conditional probability
5. Marginalization, Law of Total Probability
6. Product rule, chain rule
7. Independence and conditional independence of events
8. Random variables
9. Independence and conditional independence of random variables
10. Bayes' rule
11. Basic discrete distributions: Bernoulli, binomial, categorical, multinomial
12. Parametric distributions; parameters of distributions
13. Expected value of a random variable
14. Querying joint distributions
15. Efficiency of storage in joint distributions as tables
16. Rationale for directed graphical models
17. Directed graphical models as joint distributions
18. Visual language of directed graphical models
19. Reading independence and conditional independence in a directed graphical model
20. Reading influence / information flow in a directed graphical model
21. VERY IMPORTANT: Answering questions on directed graphical models: joint queries, marginal queries, conditional queries
22. Efficiency of answering conditional queries
23. Text classification
24. Other kinds of classification problems
25. “Bag-of-words” assumption
26. VERY IMPORTANT: Naive Bayes as a directed graphical model, classifying with Naive Bayes, shortcomings of Naive Bayes models
27. Various event models for Naive Bayes: multivariate Bernoulli, multivariate categorical, multinomial (especially multivariate categorical)
28. Class-conditional language models as classifiers
29. Evaluating classifiers
30. Maximum likelihood estimation for the categorical distribution
31. NO Lagrange Multipliers
32. The purpose, shapes, and parametrization of the Beta distribution
33. The purpose, shapes, and parametrization of the Dirichlet distribution
34. NO analytical forms of the Beta and Dirichlet distributions
35. Beta-Binomial conjugacy
36. Dirichlet-Multinomial conjugacy
37. NO Completing the integral
38. Point estimates to summarize the posterior distribution
39. Maximum a Posteriori (MAP) parameter estimation for the categorical distribution
40. Relationship between MAP estimation and add-one smoothing
41. Reading generative stories from a directed graphical model
42. Plate notation
43. High-level steps of the Expectation-Maximization (EM) algorithm
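As a study aid for the directed graphical model queries listed above (joint, marginal, and conditional), here is a minimal sketch that answers all three by enumerating the joint distribution of a tiny network. The sprinkler-style network and its CPT numbers are made up for illustration and are not from the course:

```python
import itertools

# Toy network (assumed for illustration): Rain and Sprinkler are root
# variables; WetGrass depends on both. All probabilities are invented.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: 0.3, False: 0.7}
P_wet_given = {  # P(WetGrass=True | Rain, Sprinkler)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.05,
}

def joint(r, s, w):
    """Joint query: the DGM factorizes as P(R) * P(S) * P(W | R, S)."""
    pw = P_wet_given[(r, s)]
    return P_rain[r] * P_sprinkler[s] * (pw if w else 1 - pw)

def marginal_wet():
    """Marginal query: sum the joint over the unobserved variables."""
    return sum(joint(r, s, True)
               for r, s in itertools.product([True, False], repeat=2))

def cond_rain_given_wet():
    """Conditional query via Bayes' rule: P(R=True | W=True)."""
    numerator = sum(joint(True, s, True) for s in [True, False])
    return numerator / marginal_wet()
```

Enumerating the full joint like this is exponential in the number of variables, which is exactly the efficiency concern the topic list raises; it is fine for study purposes on networks this small.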