COMPUTING SCIENCE
Rumours and Errours
Brian Hayes
My Bad
If you have already figured out where my reasoning went astray, I
offer my congratulations. My own belated enlightenment came when I
finally drew the matrix of all nine possible encounters of
ignorants, spreaders and stiflers. As shown in Figure 4, this
diagram can serve as more than just an enumeration of possible
outcomes; it encodes the entire structure and operation of the
model. If we make the widths of the columns and rows proportional to
the sizes of the three subpopulations, then the area of each of the
nine boxes gives the probability of the corresponding two-person
encounter. Choosing two participants at random is equivalent to
choosing a point at random within the diagram; the outcome of the
interaction is then decided by which of the nine boxes the chosen
point lies within. (I am again glossing over the issue of spreading
a rumor to oneself; it's a minor correction.)


To understand where I went wrong, it's enough to analyze the
simplest case, where the three subpopulations are of equal size and
all nine kinds of encounters have the same probability, namely 1/9.
The boxes at the four corners of the diagram correspond to events
that do not involve a spreader and that change no one's status. Two
other boxes describe ignorant-spreader encounters, which therefore
have a total probability of 2/9. Two more boxes correspond to
spreader-stifler meetings, so those events also occur with
probability 2/9. But there is only one box representing
spreader-spreader interactions, which accordingly must be assigned a
probability of 1/9. The key point is that ignorant-spreader and
spreader-stifler events are each twice as likely as
spreader-spreader meetings.
Now consider what happens in my first program for the Daley-Kendall
model. By always choosing a spreader first, I confined all events to
the middle row of the matrix, and the three boxes in this row were
selected with equal probability. As a result, spreader-spreader
interactions were twice as frequent as they should have been, and
the rumor was extinguished prematurely.
From the point of view of probability theory, the error is an
elementary one of failing to count cases properly. Perhaps a
professional programmer would cite a different root cause: I had
violated the old adage, "Don't start optimizing your program
until you've finished writing it."
» Post Comment