First Links in the Markov Chain
Probability and poetry were unlikely partners in the creation of a computational tool
A Markovian Weather Forecast
Probability theory has its roots in games of chance, where every roll of the dice or spin of the roulette wheel is a separate experiment, independent of all others. It’s an article of faith that one flip of a coin has no effect on the next. If the coin is fair, the probability of heads is always 1/2.
This principle of independence makes it easy to calculate compound probabilities. If you toss a fair coin twice, the chance of seeing heads both times is simply 1/2×1/2, or 1/4. More generally, if two independent events have probabilities p and q, the joint probability of both events is the product pq.
However, not all aspects of life adhere to this convenient principle. Suppose the probability of a rainy day is 1/3; it does not follow that the probability of rain two days in a row is 1/3×1/3=1/9. Storms often last several days, so rain today may signal an elevated chance of rain tomorrow.
For another example where independence fails, consider the game of Monopoly. Rolling the dice determines how many steps your token advances around the board, but where you land at the end of a move obviously depends on where you begin. From different starting points, the same number of steps could take you to the Boardwalk or put you in jail. The probabilities of future events depend on the current state of the system. The events are linked, one to the next; they form a Markov chain.
To be considered a proper Markov chain, a system must have a set of distinct states, with identifiable transitions between them. A simplified model of weather forecasting might have just three states: sunny, cloudy and rainy. There are nine possible transitions (including “identity” transitions that leave the state unchanged). For Monopoly, the minimal model would require at least 40 states, corresponding to the 40 squares around the perimeter of the board. For each state there are transitions to all other states that can be reached in a roll of the dice—generally those from 2 to 12 squares away. A realistic Monopoly model incorporating all of the game’s quirky rules would be much larger.
Recent years have seen the construction of truly enormous Markov chains. For example, the PageRank algorithm devised by Larry Page and Sergey Brin, the founders of Google, is based on a Markov chain whose states are the pages of the World Wide Web—perhaps 40 billion of them. The transitions are links between pages. The aim of the algorithm is to calculate for each web page the probability that a reader following links at random will arrive at that page.