First Links in the Markov Chain
Probability and poetry were unlikely partners in the creation of a computational tool
Past, Present and Future
A diagram made up of dots and arrows shows the structure of a Markov chain. Dots represent states; arrows indicate transitions. Each arrow has an associated number, which gives the probability of that transition. Because these numbers are probabilities, they must lie between 0 and 1, and all the probabilities issuing from a dot must add up to exactly 1. In such a diagram you can trace a pathway that defines a sequence of states—perhaps sunny, sunny, cloudy, rainy in the weather example. To calculate the probability of this specific sequence, just multiply the probabilities associated with the corresponding transition arrows.
The chain can also answer questions such as, “If it’s cloudy today, what is the probability of rain two days from now?” The answer is found by summing the contributions of all pathways that lead from the cloudy state to the rainy state in exactly two steps. This sounds like a tedious exercise, but there’s an easy way to organize the computation, based on the arithmetic of matrices.
The transition probabilities for a three-state Markov chain can be arranged in a three-by-three matrix—a square array of nine numbers. As shown in the illustration at right, the process for calculating multistage transitions is equivalent to matrix multiplication. The matrix itself (call it P) predicts tomorrow’s weather; the product P×P, or P2, gives weather probabilities for the day after tomorrow; P3 defines the probabilities for three days hence, and so on. The entire future unfolds from this one matrix.
Given the hypothetical probabilities in the weather example shown at right, the successive powers of the matrix rapidly converge to a stationary configuration in which all the rows are identical and all the columns consist of a single repeated value. This outcome has a straightforward interpretation: If you let the system evolve long enough, the probability of a given state no longer depends on the initial state. In the case of the weather, such a result is unsurprising. Knowing that it’s raining today may offer a clue about tomorrow’s weather, but it’s not much help in predicting the state of the skies three weeks from now. For such an extended forecast you may as well consult the long-term averages (which are the values to which the Markov chain converges).
Markov’s scheme for extending the laws of probability beyond the realm of independent variables has one crucial restriction: The probabilities must depend only on the present state of the system, not on its earlier history. The Markovian analysis of Monopoly, for example, considers a player’s current position on the board but not how he or she got there. This limitation is serious. After all, life presents itself as a long sequence of contingent events—kingdoms are lost for want of a nail, hurricanes are spawned by butterflies in Amazonia—but these causal chains extending into the distant past are not Markov chains.
On the other hand, a finite span of history can often be captured by encoding it in the current state. For example, tomorrow’s weather could be made dependent on both yesterday’s and today’s by creating a nine-state model in which each state is a two-day sequence. The price to be paid is an exponential increase in the number of states.