6.5.16 Log-Space Implementation and Numerical Stability
6.5.16 Log-Space Implementation and Numerical Stability
Section titled “6.5.16 Log-Space Implementation and Numerical Stability”In the previous sections, we formulated the Viterbi algorithm in terms of probabilities that are multiplied along a path. While this formulation is conceptually straightforward, it leads to a serious practical problem.
The problem of numerical underflow
Section titled “The problem of numerical underflow”Probabilities are values between 0 and 1. When we multiply many such values, the result decreases exponentially with sequence length.
For example, even moderate probabilities such as
already produce a small number. For realistic biological sequences with hundreds or thousands of positions, the resulting values quickly become so small that they cannot be represented accurately in floating-point arithmetic. This phenomenon is known as numerical underflow.
As a consequence, a direct implementation of the Viterbi algorithm in probability space is unstable and unreliable for real-world applications.
Logarithmic Transformation
Section titled “Logarithmic Transformation”To address this issue, we transform the computation into logarithmic space.
The key observation is that multiplication of probabilities becomes addition in log-space:
Applying this transformation to the Viterbi recursion yields a numerically stable formulation.
Log-Space Viterbi Scores
Section titled “Log-Space Viterbi Scores”We define the log-Viterbi score:
Using this definition, the initialization step becomes:
Recursion in log-space
Section titled “Recursion in log-space”The recursive step becomes:
This expression replaces products with sums, which are numerically stable even for long sequences.
Backpointers remain unchanged
Section titled “Backpointers remain unchanged”Importantly, the backpointer computation is unaffected:
Thus, the structure of the algorithm remains identical. Only the arithmetic changes.
Interpretation
Section titled “Interpretation”Working in log-space also provides a useful conceptual reinterpretation.
- Instead of maximizing a product of probabilities, we maximize a sum of log-probabilities
- This can be viewed as accumulating a score along a path
In this perspective, each transition and emission contributes an additive term to the total score. The Viterbi algorithm then finds the path with the highest total score.
This formulation is often more intuitive when thinking about the relative contributions of different parts of the model.
Handling Zero Probabilities
Section titled “Handling Zero Probabilities”A practical detail arises when probabilities are zero. Since
such transitions or emissions are effectively forbidden. In implementation, this is typically handled by representing impossible events with a sufficiently large negative number or by explicitly encoding (-\infty).
This interpretation is consistent with the probabilistic model: a zero probability means that a particular transition or emission cannot occur.
Comparison to the Forward Algorithm
Section titled “Comparison to the Forward Algorithm”It is worth noting that while the Viterbi algorithm uses the maximum operator in its recursion, the forward algorithm uses a sum over all predecessor states.
In log-space, this leads to an important difference:
-
Viterbi: [ \max_k (\cdot) ]
-
Forward: [ \log \sum_k \exp(\cdot) ]
The latter expression is known as the log-sum-exp operation and requires special care for numerical stability. We will return to this point when discussing the forward algorithm.
Practical Recommendation
Section titled “Practical Recommendation”In practice, implementations of the Viterbi algorithm should always be carried out in log-space. This ensures:
- numerical stability for long sequences
- robustness against underflow
- improved interpretability as additive scoring
Conceptual Summary
Section titled “Conceptual Summary”The log-space formulation preserves the structure of the Viterbi algorithm while making it computationally feasible for realistic biological data.
- Multiplication → addition
- Probabilities → log-scores
- Same dynamic programming structure
This transformation is a standard technique in probabilistic modeling and will reappear in other algorithms throughout this chapter.
Self-Check Questions
Section titled “Self-Check Questions”- Why does multiplying probabilities lead to numerical underflow?
- How does the logarithmic transformation address this issue?
- What changes in the Viterbi recursion when moving to log-space?
- Why does the backpointer computation remain unchanged?
- What is the conceptual difference between maximizing probabilities and maximizing log-scores?