Differences

This shows you the differences between two versions of the page.

Link to this comparison view

nlp:log-domain-computations [2015/04/23 21:40] (current)
ryancha created
Line 1: Line 1:
 +When working with very small probabilities,​ it is easy to encounter problems with underflow in the floating point representation. ​ To avoid this problem, it is customary to work in the Logarithmic Domain (or "log space"​). ​ Logarithms scale the quantities in question into a workable range where machine precision issues are not as detrimental. ​ The base of the logarithm, once fixed, can be ignored.
  
 +== Multiplication,​ Exponents, and Division ==
 +
 +Working in the log domain requires that we recall a couple of identities from algebra:
 +
 +* $log(a \cdot b) = log(a) + log(b)$
 +* $log(a^b) = b \cdot log(a)$
 +
 +Consequently,​
 +* $log(1/a) = log(a^{-1}) = - log(a)$
 +
 +Hence,
 +* $log(a / b) = log(a) - log(b)$
 +
 +Multiplicative identity:
 +* $log 1 = 0$
 +
 +Additive identity:
 +
 +* $log 0 = -Infinity$
 +
 +In other words, we can replace multiplication,​ exponentiation,​ and division by simple operations in the log domain.
 +
 +== Addition and Subtraction ==
 +
 +A more subtle technique is required to replace addition (and subtraction). ​ In other words, if we wish to replace $log(a + b)$ by some operation involving $log(a)$ and $log(b)$, we require the logAdd() method (available in the Stanford and Berkeley NLP libraries, specifically in the SloppyMath package by Dan Klein).
 +
 +Here follows a discussion of the logAdd() method:
 +
 +First off, logAdd() takes two arguments of type double in log space, logX and logY.
 +
 +It returns a double value which closely approximates log(X + Y).  Consequently,​ the inputs and output are in log space, and we need not convert in and out of log space to complete this operation.
 +
 +Steps 1-4 are commented in the code.  We will prove why step 4 is correct after the code.
 +
 +    public static double logAdd(double logX, double logY) {
 +        // 1. make X the max
 +        if (logY > logX) {
 +            double temp = logX;
 +            logX = logY;
 +            logY = temp;
 +        }
 +        // 2. now X is bigger
 +        if (logX == Double.NEGATIVE_INFINITY) {
 +            return logX;
 +        }
 +        // 3. how far "​down"​ (think decibels) is logY from logX?
 +        //    if it's really small (20 orders of magnitude smaller), then ignore
 +        double negDiff = logY - logX;
 +        if (negDiff < -20) {
 +            return logX;
 +        }
 +        // 4. otherwise use some nice algebra to stay in the log domain
 +        //    (except for negDiff)
 +        return logX + java.lang.Math.log(1.0 + java.lang.Math.exp(negDiff));​
 +    }
 +
 +=== Here's the proof of correctness for step 4 ===
 +
 +$logX + log(1.0 + e^{logY-logX})$
 +
 +$= log(X \cdot (1.0 + e^{logY-logX}))$
 +
 +$= log(X + X \cdot e^{logY/​X})$
 +
 +$= log(X + X \cdot Y / X)$
 +
 +$= log(X + Y)$
 +
 +Couldn'​t be nicer.
nlp/log-domain-computations.txt ยท Last modified: 2015/04/23 21:40 by ryancha
Back to top
CC Attribution-Share Alike 4.0 International
chimeric.de = chi`s home Valid CSS Driven by DokuWiki do yourself a favour and use a real browser - get firefox!! Recent changes RSS feed Valid XHTML 1.0