In the variable elimination algorithm, the basic computational step is the manipulation of factors. In the clique tree view, a factor $\phi_i$ is viewed as a computational data structure, which receives messages generated by other factors and sends a message that is used by yet another factor.
A cluster graph for a set of factors $F$ is an undirected graph with the following properties:

- Each node $i$ is associated with a cluster $C_i$, a subset of the variables.
- Family preservation: each factor $\phi \in F$ has its scope contained in some cluster $C_i$.
- Each edge between clusters $C_i$ and $C_j$ is labeled with a sepset $S_{i,j} \subseteq C_i \cap C_j$.
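As a minimal sketch of these properties in code, where the cluster contents and factor scopes are illustrative placeholders loosely following the example network used later:

```python
# Sketch of a cluster graph: clusters, sepset-labeled edges, factor scopes.
# All contents below are illustrative placeholders.
clusters = {
    1: {"C", "D"},
    2: {"D", "I", "G"},
}
edges = {(1, 2): {"D"}}                     # edge (i, j) labeled with sepset S_ij
factor_scopes = [{"C"}, {"C", "D"}, {"G", "I", "D"}]

# Family preservation: every factor's scope fits inside some cluster.
assert all(any(scope <= c for c in clusters.values()) for scope in factor_scopes)

# Sepset condition: S_ij must be a subset of the intersection of C_i and C_j.
assert all(s <= clusters[i] & clusters[j] for (i, j), s in edges.items())
```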
Example: from a variable elimination execution to a cluster tree
Fig.1 Original graph
Fig.2 Cluster tree after the variable elimination execution
An execution of variable elimination defines a cluster graph.
A clique tree is a cluster tree that satisfies the running intersection property: whenever a variable $X$ appears in two cliques $C_i$ and $C_j$, it must also appear in every clique on the (unique) path between them.
So the tree shown in Fig.2 is also a clique tree: each box represents a clique $C_i$, and each edge between two cliques is labeled with their sepset $S_{i,j}$.
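The running intersection property can be checked mechanically: for each variable, the cliques containing it must induce a connected subtree. Here is a small sketch; the clique contents follow the cliques named later in this example, and $C_1$'s exact scope is an assumption.

```python
from collections import deque

def connected(nodes, edges):
    """Is the subgraph induced by `nodes` connected? (breadth-first search)"""
    nodes = set(nodes)
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    seen, queue = set(), deque([next(iter(nodes))])
    while queue:
        n = queue.popleft()
        if n in seen:
            continue
        seen.add(n)
        queue.extend(adj[n] - seen)
    return seen == nodes

cliques = {1: {"C", "D"}, 2: {"G", "I", "D"},
           3: {"G", "S", "I"}, 4: {"G", "J", "S", "L"}}
tree_edges = [(1, 2), (2, 3), (3, 4)]

for v in set().union(*cliques.values()):
    holders = [i for i, c in cliques.items() if v in c]
    assert connected(holders, tree_edges), f"running intersection fails for {v}"
print("running intersection property holds")
```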
Given a clique tree as a starting point, we can use this data structure to perform variable elimination. Furthermore, the clique tree allows multiple executions of variable elimination to be performed much more efficiently than simply performing each one separately.
Here is an example of using a clique tree to perform variable elimination.
Fig.3 A possible clique tree given at the start
Notice that this clique tree is different from the one in Fig.2, which was obtained from the variable elimination execution. One can verify that this clique tree satisfies the definition.
Let's define one term before presenting the clique tree algorithm.
The initial potential $\pi^0_i(C_i)$ is computed by multiplying the initial factors assigned to the clique $C_i$. For example, $\pi^0_4(G,J,S,L) = P(L|G)\,P(J|S,L)$.
Fig.4 Initial potentials
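As a sketch of how an initial potential might be computed in code, the following multiplies two CPD factors. The dict-based factor representation, the binary domains, and the uniform 0.5 entries are assumptions for illustration, not the network's real CPDs.

```python
from itertools import product as cartesian

# All variables binary; CPD entries below are made-up placeholders.
domains = {v: (0, 1) for v in "GJSL"}

def factor_product(f1, f2):
    """A factor is (scope, table): scope is a tuple of variable names, and
    table maps assignment tuples (ordered like scope) to reals."""
    s1, t1 = f1
    s2, t2 = f2
    scope = tuple(dict.fromkeys(s1 + s2))        # ordered union of scopes
    table = {}
    for asst in cartesian(*(domains[v] for v in scope)):
        a = dict(zip(scope, asst))
        table[asst] = t1[tuple(a[v] for v in s1)] * t2[tuple(a[v] for v in s2)]
    return scope, table

# P(L|G) and P(J|S,L) with placeholder numbers.
P_L_given_G = (("L", "G"),
               {(l, g): 0.5 for l in (0, 1) for g in (0, 1)})
P_J_given_SL = (("J", "S", "L"),
                {(j, s, l): 0.5 for j in (0, 1) for s in (0, 1) for l in (0, 1)})

pi0_4 = factor_product(P_L_given_G, P_J_given_SL)
print(pi0_4[0])   # scope of the initial potential: ('L', 'G', 'J', 'S')
```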
Now, assume that we want to compute the probability $P(J)$. We run the variable elimination process in such a way that $J$ is not eliminated: we select a clique containing $J$ as our root clique, for example $C_4$.
Each clique $C_i$, except for the root, performs a message passing computation and sends a message to its upstream neighbor $C_j$. The clique $C_i$ multiplies all incoming messages from its other neighbors with its initial potential, resulting in a factor whose scope is the clique. It then sums out all variables except those in the sepset $S_{i,j}$ between $C_i$ and $C_j$, and sends the resulting factor as a message to $C_j$.
Fig.5 Message passing
In Fig.5, for example, the clique $C_2(G,I,D)$ generates a factor (message) by multiplying the message $\delta_{1 \rightarrow 2}(D)$ from $C_1$ with its initial potential $\pi^0_2(G,I,D)$ and then summing out the variable $D$. The resulting factor (message), whose scope is $\{G, I\}$, is then sent to clique $C_3(G,S,I)$.
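To make this step concrete, here is a small sketch of the single message computation $\delta_{2 \rightarrow 3} = \sum_D \delta_{1 \rightarrow 2} \cdot \pi^0_2$. Binary variables and placeholder numbers are assumptions; `factor_product` is the same assumed helper as in the earlier sketch.

```python
from itertools import product as cartesian

domains = {v: (0, 1) for v in "DIG"}

def factor_product(f1, f2):
    s1, t1 = f1
    s2, t2 = f2
    scope = tuple(dict.fromkeys(s1 + s2))
    table = {}
    for asst in cartesian(*(domains[v] for v in scope)):
        a = dict(zip(scope, asst))
        table[asst] = t1[tuple(a[v] for v in s1)] * t2[tuple(a[v] for v in s2)]
    return scope, table

def sum_out(f, var):
    """Marginalize `var` out of factor f."""
    scope, table = f
    new_scope = tuple(v for v in scope if v != var)
    out = {}
    for asst, val in table.items():
        key = tuple(x for v, x in zip(scope, asst) if v != var)
        out[key] = out.get(key, 0.0) + val
    return new_scope, out

delta_1_to_2 = (("D",), {(0,): 0.6, (1,): 0.4})          # incoming message
pi0_2 = (("G", "I", "D"),                                # initial potential
         {a: 0.125 for a in cartesian((0, 1), repeat=3)})

delta_2_to_3 = sum_out(factor_product(delta_1_to_2, pi0_2), "D")
print(delta_2_to_3[0])   # scope of the outgoing message: ('G', 'I')
```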
We say that a clique is ready when it has received all of its incoming messages. Thus, in this example, the leaf clique $C_1$ is ready at the very start of the algorithm, and the computation associated with it can be performed at any point in the execution.
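This readiness condition induces a valid message schedule. The sketch below, using the clique indices of this example, repeatedly picks a non-root clique that has heard from all neighbors but one:

```python
# Schedule messages toward a chosen root: a clique is "ready" once it has
# heard from all neighbors except the one it will send to.
tree_edges = [(1, 2), (2, 3), (3, 4)]
root = 4

def upward_schedule(edges, root):
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    received = {n: set() for n in neighbors}
    pending = set(neighbors) - {root}
    order = []
    while pending:
        # a non-root clique i is ready when all neighbors but one have sent to it
        i = next(i for i in pending if len(neighbors[i] - received[i]) == 1)
        (j,) = neighbors[i] - received[i]    # the remaining neighbor is upstream
        order.append((i, j))
        received[j].add(i)
        pending.remove(i)
    return order

print(upward_schedule(tree_edges, root))     # e.g. [(1, 2), (2, 3), (3, 4)]
```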
If we want to compute the posterior distribution over every random variable in the network with the least computation, we can use the Shafer-Shenoy algorithm: pick a root, pass messages upward from the leaves to the root, and then pass messages downward from the root back to the leaves. After these two passes, every clique has received all of its incoming messages.
Fig.6 Message passing in two passes
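Below is a compact sketch of this two-pass calibration on the chain-shaped clique tree of this example. The initial potentials are uniform placeholders rather than the network's actual CPDs, all variables are treated as binary, and the helper names (`fproduct`, `marginalize_to`, `uniform`) are my own.

```python
from itertools import product as cartesian

domains = {v: (0, 1) for v in "CDIGSLJ"}

def fproduct(f1, f2):
    s1, t1 = f1; s2, t2 = f2
    scope = tuple(dict.fromkeys(s1 + s2))
    table = {}
    for asst in cartesian(*(domains[v] for v in scope)):
        a = dict(zip(scope, asst))
        table[asst] = t1[tuple(a[v] for v in s1)] * t2[tuple(a[v] for v in s2)]
    return scope, table

def marginalize_to(f, keep):
    """Sum out every variable of f that is not in `keep`."""
    scope, table = f
    new_scope = tuple(v for v in scope if v in keep)
    out = {}
    for asst, val in table.items():
        key = tuple(x for v, x in zip(scope, asst) if v in keep)
        out[key] = out.get(key, 0.0) + val
    return new_scope, out

def uniform(scope):
    return scope, {a: 1.0 for a in cartesian(*(domains[v] for v in scope))}

cliques = {1: ("C", "D"), 2: ("G", "I", "D"),
           3: ("G", "S", "I"), 4: ("G", "J", "S", "L")}
edges = [(1, 2), (2, 3), (3, 4)]
pi0 = {i: uniform(c) for i, c in cliques.items()}   # placeholder potentials

neighbors = {i: set() for i in cliques}
for a, b in edges:
    neighbors[a].add(b); neighbors[b].add(a)

def message(i, j, delta):
    """delta_{i->j}: multiply pi0_i by messages from all neighbors but j,
    then marginalize onto the sepset (intersection of C_i and C_j)."""
    f = pi0[i]
    for k in neighbors[i] - {j}:
        f = fproduct(delta[(k, i)], f)
    return marginalize_to(f, set(cliques[i]) & set(cliques[j]))

delta = {}
# upward pass (leaves to root C4), then downward pass (root back to leaves);
# the order is hard-coded for this chain-shaped tree
for i, j in [(1, 2), (2, 3), (3, 4)] + [(4, 3), (3, 2), (2, 1)]:
    delta[(i, j)] = message(i, j, delta)

# beliefs: each clique's initial potential times all incoming messages
beliefs = {}
for i in cliques:
    f = pi0[i]
    for k in neighbors[i]:
        f = fproduct(delta[(k, i)], f)
    beliefs[i] = f

# e.g. the (unnormalized) marginal of J, read off the root clique
print(marginalize_to(beliefs[4], {"J"}))
```

With the real initial potentials in place of the uniform placeholders, each `beliefs[i]` would be proportional to the joint marginal over $C_i$, so any variable's marginal can be read off any clique containing it.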
A key point is that the resulting marginal distribution does not depend on which clique we select. That is, if a variable $X$ appears in two cliques, they must agree on its marginal. This leads to the definition of a calibrated clique tree: two adjacent cliques $C_i$ and $C_j$ are calibrated if $\sum_{C_i \setminus S_{i,j}} \beta_i(C_i) = \sum_{C_j \setminus S_{i,j}} \beta_j(C_j)$, where the belief $\beta_i(C_i)$ is the product of the initial potential $\pi^0_i$ with all incoming messages; a clique tree is calibrated when every pair of adjacent cliques is calibrated.
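As a tiny numeric illustration of this agreement condition, assume two neighboring beliefs over $(A,B)$ and $(B,C)$ with sepset $\{B\}$; the numbers below are made up so that calibration holds.

```python
# Calibration check: neighboring beliefs must agree on the sepset marginal.
beta_i = {(0, 0): .1, (0, 1): .2, (1, 0): .3, (1, 1): .4}    # over (A, B)
beta_j = {(0, 0): .25, (0, 1): .15, (1, 0): .35, (1, 1): .25} # over (B, C)

marg_i = {b: sum(p for (a, bb), p in beta_i.items() if bb == b) for b in (0, 1)}
marg_j = {b: sum(p for (bb, c), p in beta_j.items() if bb == b) for b in (0, 1)}
assert marg_i == marg_j   # both give the sepset marginal over B: {0: 0.4, 1: 0.6}
```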
An alternative approach to computing the message $\delta_{i \rightarrow j}$ is to multiply in all of the incoming messages, yielding the belief $\beta_i$, and then divide the marginalized result by the message coming back from $C_j$:

$$\delta_{i \rightarrow j} = \dfrac{\sum_{C_i \setminus S_{i,j}} \beta_i(C_i)}{\delta_{j \rightarrow i}}$$
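A minimal sketch of this division-based update over a single binary sepset variable, with placeholder numbers:

```python
# Marginalize the current belief onto the sepset, then divide out the
# message previously received from the neighbor j.
sepset_marginal = {(0,): 0.3, (1,): 0.42}   # sum over C_i \ S_ij of beta_i
delta_j_to_i = {(0,): 0.6, (1,): 0.7}       # message previously sent j -> i

delta_i_to_j = {a: sepset_marginal[a] / delta_j_to_i[a] for a in sepset_marginal}
print(delta_i_to_j)   # approximately {(0,): 0.5, (1,): 0.6}
```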
In principle, both use the same basic operations of multiplying factors and summing out variables, and the cliques in a clique tree essentially play the role of the factors in variable elimination. In practice, clique trees have a key advantage: a single calibration (two passes of messages) yields the marginals of all variables at once, and the computed messages can be cached and reused, whereas plain variable elimination must be rerun from scratch for each query.
Given a Bayesian network, construct your own clique tree and compute the posterior marginal distribution of each node in the network, using the calibrated clique tree algorithm.