Expectation Maximization Part 1
Objectives
This assignment is designed to:
provide practice with essential ideas for using the expectation maximization (EM) algorithm
help you become more fluent with the terminology and the techniques
prepare you for the next assignment dedicated to using EM for document clustering
Instructions
This is primarily a mathematical homework assignment. Show your work. Be clear and concise. Where code is required, include your code. Type your assignment.
Work through these exercises as soon as possible for their instructional content. If you have general questions about the assignment, please post on the Google Group. Finish early, and earn the early bonus.
Exercises
Question #1
[5 points]
I have one coin (coin 0) in my pocket: Coin 0 has probability $\lambda$ of heads.
I toss the coin 10 times, and see the following sequence: HHTTHHHTHH (7 heads out of 10)
What would you guess $\lambda$ to be?
Why?
Problem Description: for questions #2-#8
I have three coins (0, 1, 2) in my pocket:
Coin 0 has probability $\lambda$ of heads
Coin 1 has probability $\beta_1$ of heads
Coin 2 has probability $\beta_2$ of heads
I now run several trials. For each trial I do the following:
First I toss coin 0. The result is the value of random variable $Y$.
If coin 0 turns up heads, then:
I toss coin 1. The result is the value of random variable $X_1$.
I toss coin 1 a second time yielding the value of random variable $X_2$.
I toss coin 1 a third time yielding the value of random variable $X_3$.
If coin 0 turns up tails, then:
I toss coin 2. The result is the value of random variable $X_1$.
I toss coin 2 a second time yielding the value of random variable $X_2$.
I toss coin 2 a third time yielding the value of random variable $X_3$.
Group the three random variables $\langle X_1,X_2,X_3 \rangle$ together as $X$.
Note that $\langle X_1,X_2,X_3 \rangle$ represent flip tokens, by analogy with word tokens.
Consider two scenarios:
Scenario 1: I tell you whether coin 0 came up heads or tails, and I tell you what the outcomes from flipping coin 1 or 2 were.
Scenario 2: I don’t tell you whether coin 0 came up heads or tails or whether coin 1 or 2 was tossed subsequently, but I do tell you what the outcomes from flipping coin 1 and/or coin 2 were.
The problem to solve: given observed coin flips, estimate the values for $\lambda, \beta_1$, and $\beta_2$.
Question #2
[5 points]
In scenario #1, you make the following complete observations: $(H, \langle HHH \rangle),(T, \langle TTT \rangle),(H, \langle HHH \rangle),(T, \langle TTT \rangle),(H, \langle HHH \rangle)$
Note that the notation $(H, \langle HHH \rangle)$ is short-hand for the joint event $(Y=H, X=\langle HHH \rangle) = (Y=H, X_1=H, X_2=H, X_3=H)$
Compute the maximum likelihood estimates for $\lambda, \beta_1, \beta_2$.
For questions #3-#8
In scenario #2, you see a sequence of observations. (e.g., $\langle HHH \rangle,\langle TTT \rangle,\langle HHH \rangle,\langle TTT \rangle,\langle HHH \rangle$).
To compute the maximum likelihood estimates for $\lambda, \beta_1, \beta_2$, you will need to employ the EM algorithm to estimate those parameters, so the following problems will help you think and work through the steps.
Question #3
[20 points]
Draw the directed graphical model for the situation described above (under “Problem Description”) consistent with scenario #2.
Refer to the parameters collectively as $\Theta = \{\lambda, \beta_1, \beta_2\}$.
Specify the conditional probability table – symbolically in terms of $\lambda, \beta_1$, and $\beta_2$ – for the random variable $Y$ corresponding to the outcome for flipping coin 0: $P(Y|\Theta)$
Specify the conditional probability table – symbolically in terms of $\lambda, \beta_1$, and $\beta_2$ – for the random variables $X$ corresponding to the outcomes for flipping coins 1 or 2: $P(X|Y,\Theta)$
Question #4
[20 points]
What is the probability $P(X=\langle THT \rangle,Y=H|\Theta)$ symbolically in terms of $\lambda, \beta_1$, and $\beta_2$ ?
What is the probability $P(X=\langle THT \rangle,Y=T|\Theta)$ symbolically in terms of $\lambda, \beta_1$, and $\beta_2$ ?
What is the probability $P(X=\langle THT \rangle|\Theta)$ symbolically in terms of $\lambda, \beta_1$, and $\beta_2$ ?
What is the probability $P(Y=H|X=\langle THT \rangle, \Theta)$ symbolically in terms of $\lambda, \beta_1$, and $\beta_2$ ?
Hint: in each case, ask yourself what kind of query the expression represents, and use the methods with which you are already familiar.
Question #5
[30 points]
Let the partially observed data be $(\underline{e}_1, \underline{e}_2, \underline{e}_3, \underline{e}_4, \underline{e}_5) = (\langle HHH \rangle,\langle TTT \rangle,\langle HHH \rangle,\langle TTT \rangle,\langle HHH \rangle)$. (No labels)
You will compute the maximum likelihood estimates for parameters $\lambda, \beta_1, \beta_2$ using the Expectation Maximization algorithm.
Construct a table formatted as shown here to capture your results.
Show your work separately. If you show the careful work for E-step #1 and M-step #1 and you are confident about how to proceed with the next iteration, then it would be reasonable to be more terse in showing work for the subsequent steps.
Iteration | $\lambda$ | $\beta_1$ | $\beta_2$ | $P(Y=H | \underline{e}_1, \Theta)$ | $P(Y=H | \underline{e}_2, \Theta)$ | $P(Y=H | \underline{e}_3, \Theta)$ | $P(Y=H | \underline{e}_4, \Theta)$ | $P(Y=H | \underline{e}_5, \Theta)$ |
0 | | | | | | | | |
1 | | | | | | | | |
2 | | | | | | | | |
3 | | | | | | | | |
Step 0
Step 1
Step 2
Step 3
Step 4
Question #6
Question #7
[10 points]
Add a new column labeled $L(\Theta)$, the log-likelihood of the observed data according to the current parameters $\Theta$, to your table from Question #5.
Compute the log-likelihood of the observed data for each row in the table.
Question #8
Submission
Submit a .pdf document through Learning Suite.
Back to top