__NOTOC__
The Bayes Network we all know and love is easily extended to encode the fundamental concepts used in decision theory. In this introduction to decision-theoretic graphical models I briefly review some mathematical foundations of decision theory. Next I introduce a simple decision-theoretic graph structure known as the *decision tree*. That discussion will lead naturally into the presentation of *influence diagrams*.

- $\,\!\Theta$: the state of the world
- $\,\!Z$: the set of possible outcomes
- $\,\!A$: the set of possible actions, where each action maps states to outcomes: $\,\!\forall a \in A,\; a : \Theta \rightarrow Z$
- Local Utility - how much we value a particular outcome: $\,\!u : Z \rightarrow \mathfrak{R}$
- Global Utility - given our current beliefs $\,\!\pi$ about the state of the world, how much utility we expect to derive overall from all outcomes together after taking action $\,\!a$: $\,\! U_\pi(a)=\int_{\theta}{u(a(\theta))\pi(\theta)d\theta}$
- Bayes Action - the action that maximizes expected utility: $\,\!a^{*} = \operatorname{argmax}_a U_\pi (a)$
- Distributional Utility - how much utility we expect to derive with beliefs $\,\!\pi$ assuming we take the utility-maximizing (Bayes) action: $\,\!V(\pi) = \sup_a U_\pi(a) = U_\pi(a^{*})$
- Expected Value of Sample Information (EVSI) - the expected gain in utility from observing $\,\!x$ before acting: $\,\!EVSI(\mathcal{E}) = E_x[V(\pi_x)-U_{\pi_x}(a^{*})] = E_x[V(\pi_x)]-V(\pi) = E_x[V(\pi_x)]-V(E_x[\pi_x])$, where $\,\!\pi_x$ is our posterior belief about the world after observing $\,\!x$ and $\,\!a^{*}$ is the Bayes action under the prior $\,\!\pi$.

Note: this notation is basically that of Parmigiani and Inoue in *Decision Theory: Principles and Approaches*.
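As a concrete (if toy) illustration, the definitions above can be coded directly for a finite world. The states, actions, and utility numbers below are made up for illustration; they are not taken from any particular example.

```python
# A discrete sketch of the decision-theoretic quantities defined above.
# All names and numbers here are hypothetical.

THETA = ["oil", "no_oil"]          # states of the world (Theta)
ACTIONS = {                        # each action a : Theta -> Z, encoded as a dict
    "drill":      {"oil": "strike", "no_oil": "dry_hole"},
    "dont_drill": {"oil": "walk_away", "no_oil": "walk_away"},
}
u = {"strike": 100.0, "dry_hole": -40.0, "walk_away": 0.0}  # local utility u : Z -> R

def global_utility(pi, action):
    """U_pi(a) = sum_theta u(a(theta)) * pi(theta) — the discrete analogue of the integral."""
    return sum(u[ACTIONS[action][theta]] * pi[theta] for theta in THETA)

def bayes_action(pi):
    """a* = argmax_a U_pi(a)."""
    return max(ACTIONS, key=lambda a: global_utility(pi, a))

def distributional_utility(pi):
    """V(pi) = U_pi(a*)."""
    return global_utility(pi, bayes_action(pi))

pi = {"oil": 0.5, "no_oil": 0.5}   # current beliefs about the state of the world
```

With these made-up numbers, drilling has expected utility 0.5·100 − 0.5·40 = 30 versus 0 for not drilling, so the Bayes action is to drill.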

Most readers will be familiar with a type of decision tree from machine learning, such as that generated by the ID3 and C4.5 algorithms. In the realm of probabilistic graphical models, however, a decision tree is “a representation of the different scenarios that might be encountered by the decision maker in the context of a particular decision problem.” Additionally, “[t]he decision-tree representation allows us to encode decision scenarios in a way that reveals much more of their internal structure than the abstract setting of outcomes and utilities.” (Koller & Friedman, 1083-4).

A decision tree has two types of nodes, both of which fittingly represent *decisions*:

[Figure: The oil well example encoded as a decision tree]

Decision_tree_agent_node.png *Agent nodes* represent the decisions of an agent. Their outbound edges represent the possible choices of the agent. The root of the tree is an agent node, reflecting the fact that while the decision-tree paradigm does model the world, it is ultimately focused on determining optimal agent actions.

Decision_tree_nature_node.png *Nature nodes* represent the decisions of nature – in other words, the outcomes of random processes not under the agent's control. Their outbound edges represent the possible values of a random variable and are labeled with the probabilities of those outcomes.

Each node is labeled with the expected utility at that point in the decision process, and the outgoing edges of terminal nodes are simply drawn pointing to a number representing the final utility. The optimal decision at an agent node can be marked with a bolded edge.

As an example, the simple oil well scenario from class is shown here as a decision tree. The root agent node represents the choice of whether to drill or not. The two nature nodes represent the possibility of there actually being oil at the drill site. The optimal decision (to drill) is marked with a bolded “Yes” edge. Note the redundancy in the two nature nodes.

Fortunately, the backward induction algorithm can efficiently compute the optimal decision sequence for a given decision tree. Unfortunately, though, decision tree node counts grow exponentially in the number of agent or nature decisions involved in the decision scenario.
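The backward induction pass is short to write down: agent nodes take the maximum over their children, nature nodes take the probability-weighted average, and leaves carry final utilities. The node encoding and the oil/no-oil numbers below are illustrative assumptions, not the values from the class example.

```python
# A minimal backward-induction sketch over a decision tree.
# Node encoding (an assumption of this sketch): dicts with a "kind" field.

def backward_induction(node):
    """Return the (maximum) expected utility of `node`."""
    kind = node["kind"]
    if kind == "leaf":
        return node["utility"]
    if kind == "nature":
        # nature node: expected utility over the random outcomes
        return sum(p * backward_induction(child)
                   for p, child in node["children"])
    if kind == "agent":
        # agent node: the agent picks the child with the highest expected utility
        return max(backward_induction(child)
                   for _, child in node["children"])
    raise ValueError(f"unknown node kind: {kind}")

# A toy oil-well tree: drill (oil with probability 0.5) or don't.
tree = {"kind": "agent", "children": [
    ("drill", {"kind": "nature", "children": [
        (0.5, {"kind": "leaf", "utility": 100.0}),
        (0.5, {"kind": "leaf", "utility": -40.0}),
    ]}),
    ("dont_drill", {"kind": "leaf", "utility": 0.0}),
]}

meu_root = backward_induction(tree)
```

Here the root's value is max(0.5·100 − 0.5·40, 0) = 30, and the bolded edge would be "drill".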

Influence diagrams overcome some of the limitations of decision trees, particularly redundant tree structure and a representation of utilities that obscures the way those utilities are computed. The main innovation of influence diagrams over decision trees is to condition nodes on their parents rather than encoding the same structure multiple times throughout the graph. An influence diagram is a directed acyclic graph with three types of nodes (*chance*, *decision*, and *utility*) with the following properties:

Let $\,\!X$ be a set of chance variables such that for each $\,\!x \in X$ there is a conditional probability distribution $\,\!P(x|Parents(x))$.

Let $\,\!D$ be a set of decision variables such that for each $\,\!d \in D$ some agent chooses amongst $\,\!d$'s options based on incoming “information edges.” A decision rule exists as a CPD for each $\,\!d \in D$, with deterministic decisions being CPDs that place all probability mass on a single option. (For single-agent situations the agent can always have a deterministic optimal strategy, but presumably there are multi-agent situations where only probabilistically optimal solutions exist.)
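A deterministic decision rule can be written out as a degenerate CPD along these lines; the observation and option names below are hypothetical.

```python
# A deterministic decision rule as a CPD: for each incoming observation,
# all probability mass sits on the single chosen option.

def deterministic_rule(options, mapping):
    """Turn {observation: chosen_option} into a CPD {observation: {option: probability}}."""
    return {obs: {opt: (1.0 if opt == choice else 0.0) for opt in options}
            for obs, choice in mapping.items()}

# Hypothetical rule: drill on a positive test, don't drill otherwise.
cpd = deterministic_rule(
    ["drill", "dont_drill"],
    {"test_pos": "drill", "test_neg": "dont_drill"})
```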

Let $\,\!U$ be a set of utility variables. For all $\,\!u \in U$, there exists a deterministic utility function $\,\!f_{u} : \bigotimes_{p \in Parents(u)} Values(p) \rightarrow \mathfrak{R}$ (where we interpret $\,\!\bigotimes$ as an iterated cartesian product), and $\,\!Children(u)=\emptyset$.

The approach to computing MEU on an influence diagram described in Koller & Friedman is as follows:

- Convert the influence diagram to an equivalent decision tree representation.
- Use the backward induction algorithm to compute the expected utility at each chance node, choosing the Bayes action at each decision node.
- Return the MEU of the highest-MEU child of the top decision node as the MEU of the entire network.

Now that we have a method for computing the MEU of an influence diagram it is trivial to compute the value of a piece of information. (This is the value of perfect information – not quite the same as EVSI, but comparable.) $\,\!VPI(I,i) = MEU(I) - MEU(I - i)$ where $\,\!i$ is an information edge representing the piece of information in question, $\,\!I$ is the influence diagram including the information edge $\,\!i$, and $\,\!I - i$ is the influence diagram omitting the information edge $\,\!i$.
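Both computations can be sketched by brute force: MEU by enumerating decision rules, and VPI as the difference of two MEUs, with and without the information edge. The prior and utility numbers below are illustrative assumptions; only the test likelihoods echo the exercise that follows.

```python
# VPI on a tiny hypothetical influence diagram: chance node Oil, chance node
# Test (observable via an information edge), decision Drill, one utility node.
# Prior and utilities are assumptions of this sketch.

from itertools import product

P_OIL = {True: 0.5, False: 0.5}                    # prior over the Oil chance node
P_TEST = {True: {"pos": 0.9, "neg": 0.1},          # P(Test | Oil)
          False: {"pos": 0.2, "neg": 0.8}}

def utility(drill, oil):
    """The utility node's deterministic function of its parents."""
    return (100.0 if oil else -40.0) if drill else 0.0

def meu(observe_test):
    """Brute-force MEU: enumerate decision rules delta : observations -> actions."""
    observations = ["pos", "neg"] if observe_test else [None]
    best = float("-inf")
    for choices in product([True, False], repeat=len(observations)):
        delta = dict(zip(observations, choices))   # one candidate decision rule
        eu = sum(P_OIL[oil] * P_TEST[oil][t] *
                 utility(delta[t] if observe_test else delta[None], oil)
                 for oil in (True, False) for t in ("pos", "neg"))
        best = max(best, eu)
    return best

# VPI(I, i) = MEU(I) - MEU(I - i), where i is the Test -> Drill information edge.
vpi = meu(observe_test=True) - meu(observe_test=False)
```

With these made-up utilities, MEU without the information edge is 30, with it 41, so the information edge is worth 11 (in the same units as the utilities).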

A key limitation of influence diagrams is their inability to represent decision sequences whose structure varies depending on earlier choices. Decision trees represent such scenarios naturally, but at the cost of exponential space requirements.

Another issue faced by influence diagrams is the reliance of some formulations on the “perfect recall” assumption: the assumption that a decision at a particular node has access to all preceding information. This can make the representation of decision rules exponential in size. One way of dealing with this is to drop the perfect recall assumption, for example by conditioning a decision only on the information from its immediate predecessor nodes.

Draw the decision tree representing the oil well scenario with the option of conducting a test, where the cost of the test is $1 million, and P(Test|Oil)=0.9 and P(Test|¬Oil)=0.2. Compute the expected utilities and maximum expected utilities of each node from the bottom up using the backward induction algorithm. Should the test be performed?
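Before drawing the tree, the posteriors needed at the nature nodes below each test branch follow from Bayes' rule. The prior P(Oil) = 0.5 is an assumption here (the original example's prior is not restated), and P(Test|·) is read as the probability of a positive test result.

```python
# Bayes updates for the test branches of the decision tree.
# P(Oil) = 0.5 is an assumed prior; the likelihoods are from the exercise.

P_OIL = 0.5
P_POS_GIVEN_OIL, P_POS_GIVEN_NO_OIL = 0.9, 0.2

# P(Test = pos) by total probability
p_pos = P_OIL * P_POS_GIVEN_OIL + (1 - P_OIL) * P_POS_GIVEN_NO_OIL

# Posteriors by Bayes' rule, used at the nature nodes under each test outcome
p_oil_given_pos = P_OIL * P_POS_GIVEN_OIL / p_pos
p_oil_given_neg = P_OIL * (1 - P_POS_GIVEN_OIL) / (1 - p_pos)
```

Under this assumed prior, a positive test raises P(Oil) to 9/11 ≈ 0.82 and a negative test lowers it to 1/9 ≈ 0.11; these are the probabilities that label the nature-node edges in the test branch.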

The test should not be performed. [Note: why is this not breaking even as the original example suggests it should?]