COMPUTING SCIENCE

# How to Avoid Yourself

# Coming and Going

If you look at a complete set of *n*-step self-avoiding walks, the first thing you're likely to notice is a lot of repetition. Many of the walks have the same basic shape; they differ only by a rotation or reflection.

On the square lattice any path can be rotated into four orientations and reflected across four axes (vertical, horizontal and two diagonals). Should the results of these transformations be considered eight distinct walks or eight variations on a single walk? The answer depends on what you want to do with the walks, but if you're merely counting them, it's clearly foolish to count by ones when you can count by eights. Programs for walk counting generate just one of the eight configurations, and then multiply to get the total number.

To eliminate the fourfold rotational symmetry of the lattice, you can generate only those walks that start with a step in some particular direction, say east. The four mirror symmetries disappear if you consider only walks whose first turn after the initial step is in one specified direction, say north. In this way the number of walks to be counted is reduced to approximately one-eighth the total number. Why *approximately*? Because of one small complication: A straight walk makes no turns away from its initial direction and is therefore left unchanged by mirror reflection. Hence there are only four distinguishable straight walks instead of eight, and the total number of walks is reduced by 4.

In the symmetries described so far, walks are considered to be *rooted*, that is, one end of each walk is distinguished as the starting end, as if it were planted in the ground. In many contexts, however, the direction in which a walk is traversed has no significance. For example, if you were folding a polymer along the path of a self-avoiding walk, you could start at either end of the walk. Viewing self-avoiding walks as unrooted paths makes many pairs of walks indistinguishable, so that the number of distinct *n*-step walks is reduced by a further factor of approximately 2.

I have been unable to find any published literature on enumerating unrooted self-avoiding walks, although it seems unlikely I am the first to consider the problem. The search for a practical algorithm proved an interesting challenge.

For unrooted walks, two paths through the lattice are identical if traversing one of them forward yields the same path as following the other one backward. The symmetry is easiest to understand when you represent a walk not as a list of lattice sites or as absolute directions but as instructions telling you what turns to take at every intersection along the path. Suppose the route from your home to your office is abbreviated [*lfrff*], where *l* stands for *left*, *r* for *right* and *f* for *forward*. (The notation is borrowed from the turtle graphics system of the Logo programming language.) On your way home, if you want to retrace your steps, you would obey the turtle-graphics commands [*fflfr*]. The two lists of instructions are related by a transformation I shall call retroreflection: The sequence is letters are reversed, and all the *lefts* and *rights* are interchanged.

In writing a program to enumerate the unrooted self-avoiding walks, the aim is to select one member of each retroreflected pair and discard the other. (It doesn't matter which one is kept.) The direct solution would be to maintain an archive of all the walks seen so far. Then as you generate each new walk, you reverse the list of turtle-graphics commands, flip the *lefts* and *rights*, and compare the result with the archive of stored walks, keeping the new walk only if the retroreflected version hasn't been seen already. This strategy would work, but it would be hideously slow and a memory hog.

There is a better way. The key idea is to transform the list of turtle-graphics commands into a number, specifically a ternary (base 3) number, with the digit 0 representing *forward*, 1 representing *left* and 2 representing *right*. Then every *n*-step self-avoiding walk has a unique encoding as an (*n*–1)-digit ternary number; equally important, every (*n*–1)-digit ternary number specifies an *n*-step nonreversing (though not necessarily self-avoiding) walk. For example, counting in ternary from 0000, 0001, 0002, 0010 up to 2222 generates every five-step nonreversing walk. Discarding the walks that fail a test for self-intersections leaves just the self-avoiding five-step walks. (In practice, it's better to write the walk numbers in "balanced ternary" notation, where the digits are –1, 0 and +1; then interchanging *left* and *right* is just multiplying by –1. Balanced ternary is the only numbering system I know that has its own Web page.)

The reason for treating the walks as numbers is that it imposes a total ordering on them. Given any two distinct finite numbers, there is always a larger and a smaller. Likewise, given any two distinct walks related by the retroreflective symmetry, one walk's ternary encoding must be less than the other's. This immediately suggests an efficient algorithm for splitting the symmetrical pairs: As you generate each walk, compare the ternary representation with its retroreflection. If the original number is greater, discard the walk; otherwise keep it. In this scheme there is no need to maintain and search an archive of walks; every decision is made with a single numerical comparison.

Figure 5 gives the number of unrooted self-avoiding walks of up to 24 steps. As would be expected, the numbers are approximately one-sixteenth the number of all self-avoiding walks of the same length, and the ratio approaches 1:16 more closely as *n* increases; and yet the ratio is not exact. The reason is another little complication, analogous to the problem of straight walks that are invariant under reflection. In this case a class of walks pass through retroreflection unchanged. They are the palindromic walks. Consider the path [*flfrf*]; if you reverse the sequence of instructions and interchange *lefts* and *rights*, you wind up with the same sequence of commands again. The path is its own retroreflection. To get a correct census of the unrooted walks it's crucial that such paths be counted once and only once.

EMAIL TO A FRIEND :

**Of Possible Interest**

**Computing Science**: Belles lettres Meets Big Data

**Technologue**: Quantum Randomness

**Technologue**: The Quest for Randomness