When working with very small probabilities, it is easy to encounter problems with underflow in the floating point representation. To avoid this problem, it is customary to work in the Logarithmic Domain (or “log space”). Logarithms scale the quantities in question into a workable range where machine precision issues are not as detrimental. The base of the logarithm, once fixed, can be ignored.
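To see the problem concretely, here is a small demonstration (the class name and the specific values are ours, chosen for illustration): multiplying 500 probabilities of 0.01 underflows to 0.0 in double precision, while summing their logarithms stays well within range.

```java
public class UnderflowDemo {
    public static void main(String[] args) {
        double product = 1.0;
        double logSum = 0.0;
        for (int i = 0; i < 500; i++) {
            product *= 0.01;           // 0.01^500 = 1e-1000: underflows to 0.0
            logSum += Math.log(0.01);  // stays a perfectly ordinary double
        }
        System.out.println(product);   // 0.0
        System.out.println(logSum);    // about -2302.6
    }
}
```

The smallest positive double is roughly 4.9e-324, so 1e-1000 is simply unrepresentable in regular space, yet its log (about -2302.6) is unremarkable.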

Multiplication, Exponents, and Division

Working in the log domain requires that we recall a couple of identities from algebra:

  • $log(a \cdot b) = log(a) + log(b)$
  • $log(a^b) = b \cdot log(a)$

Consequently,

  • $log(1/a) = log(a^{-1}) = - log(a)$

Hence,

  • $log(a / b) = log(a) - log(b)$

Multiplicative identity:

  • $log 1 = 0$

Additive identity:

  • $log 0 = -\infty$

In other words, we can replace multiplication, exponentiation, and division by simple operations in the log domain.
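Each of these identities can be verified numerically; the following sanity check is our own scaffolding (run with `java -ea` to enable assertions):

```java
public class LogIdentities {
    public static void main(String[] args) {
        double a = 0.3, b = 0.2;
        // multiplication becomes addition
        assert Math.abs(Math.log(a * b) - (Math.log(a) + Math.log(b))) < 1e-12;
        // exponentiation becomes multiplication
        assert Math.abs(Math.log(Math.pow(a, b)) - b * Math.log(a)) < 1e-12;
        // division becomes subtraction
        assert Math.abs(Math.log(a / b) - (Math.log(a) - Math.log(b))) < 1e-12;
        // multiplicative and additive identities
        assert Math.log(1.0) == 0.0;
        assert Math.log(0.0) == Double.NEGATIVE_INFINITY;
        System.out.println("all identities check out");
    }
}
```

Note that Java's `Math.log(0.0)` conveniently returns `Double.NEGATIVE_INFINITY`, matching the additive identity above.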

Addition and Subtraction

A more subtle technique is required to replace addition (and subtraction). If we wish to compute $log(a + b)$ from the quantities $log(a)$ and $log(b)$ already in log space, then we require the $logAdd()$ method.

Conceptually, $logAdd(log(a), log(b))$ takes two arguments in log space, adds the underlying quantities in regular space, and returns the result in log space. Naively this could be implemented as $log(e^{log(a)} + e^{log(b)})$. However, when $a$ and $b$ are so small that they cannot be represented in regular space using floating point ($e^{log(a)}$ or $e^{log(b)}$ would underflow to 0), the implementation must play some math tricks to avoid actually doing the exponentiation.

The $logAdd()$ method below employs those tricks. It takes two arguments of type double in log space, logX and logY, and returns a double value that closely approximates log(X + Y). Both its inputs and its output are in log space, so we never need to convert in and out of log space to complete this operation.

Steps 1-4 are commented in the code. We will prove why step 4 is correct after the code.

   double logAdd(double logX, double logY) {
       // 1. make X the max
       if (logY > logX) {
           double temp = logX;
           logX = logY;
           logY = temp;
       }
       // 2. now X is bigger; if both arguments are log(0), the sum is log(0)
       if (logX == Double.NEGATIVE_INFINITY) {
           return logX;
       }
       // 3. how far "down" (think decibels) is logY from logX?
       //    if it's vastly smaller (more than 20 log-units below, i.e. a
       //    factor of e^20 for natural logs), then ignore it
       double negDiff = logY - logX;
       if (negDiff < -20) {
           return logX;
       }
       // 4. otherwise use some nice algebra to stay in the log domain
       //    (except for negDiff, which is small enough to exponentiate safely)
       return logX + Math.log(1.0 + Math.exp(negDiff));
   }
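Wrapped in a minimal class for demonstration (the class name and test values are ours, not part of the original), logAdd() handles quantities far beyond the reach of regular-space doubles:

```java
public class LogAddDemo {
    static double logAdd(double logX, double logY) {
        if (logY > logX) { double t = logX; logX = logY; logY = t; }
        if (logX == Double.NEGATIVE_INFINITY) return logX;
        double negDiff = logY - logX;
        if (negDiff < -20) return logX;
        return logX + Math.log(1.0 + Math.exp(negDiff));
    }

    public static void main(String[] args) {
        // log(2) "+" log(3) should give log(5)
        System.out.println(logAdd(Math.log(2), Math.log(3)));  // ~1.609 = log(5)
        // Two probabilities of 1e-400 each: unrepresentable as doubles,
        // but their logs (about -921) are fine.
        double logP = -400 * Math.log(10);
        System.out.println(logAdd(logP, logP));  // logP + log(2), i.e. log(2e-400)
    }
}
```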

Here's the proof of correctness for step 4

$log X + log(1.0 + e^{logY-logX})$

$= log(X \cdot (1.0 + e^{logY-logX}))$

$= log(X + X \cdot e^{log(Y/X)})$

$= log(X + X \cdot Y / X)$

$= log(X + Y)$

Couldn't be nicer.

More arguments: logSum()

To log-sum several logarithmic quantities, simply represent those quantities in a list L and call logAdd() to log-accumulate each of them, one at a time. Thus, logSum() can be implemented as follows:

   double logSum(double[] L) {
       double logResult = Double.NEGATIVE_INFINITY;  // log(0), the additive identity
       for (double logX : L) {
           logResult = logAdd(logResult, logX);
       }
       return logResult;
   }
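Putting the two methods together in a self-contained class (the class name and demo values are ours), logSum() can sum a thousand probabilities, none of which is representable in regular space:

```java
public class LogSumDemo {
    static double logAdd(double logX, double logY) {
        if (logY > logX) { double t = logX; logX = logY; logY = t; }
        if (logX == Double.NEGATIVE_INFINITY) return logX;
        double negDiff = logY - logX;
        if (negDiff < -20) return logX;
        return logX + Math.log(1.0 + Math.exp(negDiff));
    }

    static double logSum(double[] logs) {
        double logResult = Double.NEGATIVE_INFINITY;  // log(0), the additive identity
        for (double logX : logs) {
            logResult = logAdd(logResult, logX);
        }
        return logResult;
    }

    public static void main(String[] args) {
        // Sum 1000 probabilities of 1e-400 each; 1e-400 underflows as a double,
        // but log(1e-400), about -921, is well within range.
        double[] logs = new double[1000];
        java.util.Arrays.fill(logs, -400 * Math.log(10));
        double result = logSum(logs);
        System.out.println(result);                 // should match log(1e-397)
        System.out.println(-397 * Math.log(10));    // = log(1000 * 1e-400)
    }
}
```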
cs-401r/log-domain-computations.txt · Last modified: 2014/11/03 17:47 by ringger