COMPUTING SCIENCE

# Identity Crisis

Some Are Less Equal Than Others

Numbers aren't the only things that can be equal or unequal. Most programming languages also have equality operators for other simple data objects, such as alphabetic characters; thus a = a but a ≠ b. (Whether a = A is a matter up for debate.)

Sequences of characters (usually called strings) are also easy to compare. Two strings are equal if they consist of the same characters in the same sequence, which implies the strings also have the same length. Hence an equality operator for strings simply marches through the two strings in parallel, matching up the characters one by one. Certain other data structures, such as arrays, are handled in much the same way.

But one important kind of data structure can be problematic. The
most flexible way of organizing data elements is with links, or
pointers, from one item to another. For example, the symbols
*a*, *b* and *c* might be linked into the list
*a* → *b* → *c* → *nil*,
where *nil* is a special value that marks the end of a chain
of pointers. Comparing two such structures for equality is
straightforward: Just trace the two chains of pointers, and if both
reach *nil* at the same time without having encountered any
discrepancies along the way, they are identical.

The pointer-following algorithm works well enough in most cases, but consider this structure:

An algorithm that attempts to trace the chain of pointers until
reaching *nil* will never terminate, and so structural
equality will never be decided. This problem can be solved—the
workaround is to lay down a trail of breadcrumbs as you go, and stop
following the pointers as soon as you recognize a site you've
already visited—but the technique is messy.

There's something else inside the computer that's remarkably hard to
test for equality: programs. Even in the simplest cases, where the
program is the computational equivalent of a mathematical function,
proving equality is a challenge. A function is a program that
accepts inputs (called the arguments of the function) and computes a
value, but does nothing else to alter the state of the computer. The
value returned by the function depends only on the arguments, so
that if you apply the function to the same arguments repeatedly, it
always returns the same value. For example, *f(x)* =
*x* ^{2} is a function of the single argument
*x*, and its returned value is the square of *x*.

A given function could be written as a computer program in many
different ways. At the most trivial level, *f(x)* =
*x* ^{2} might be replaced by *f(y)* =
*y* ^{2}, where the only change is to the name of the
variable. Another alternative might be *f(x)* = *x* ?
*x*, or perhaps *f(x)* = exp(*2*
log(*x*)). It seems reasonable to say that two such functions
are identical if they return the same value when applied to the same
argument. But if that criterion were to serve as a test of function
equality, you would have to test all possible arguments within the
domain of the function. Even when the domain is not infinite, it is
often inconveniently large. The alternative to such an
"extensional" test of equality is an
"intensional" test, which tries to prove that the texts of
the two programs have the same meaning. Fabricating such proofs is
not impossible—optimizing compilers do it all the time when
they substitute a faster sequence of machine instructions for a
slower one—but it is hardly a straightforward task.

*Programs and Machines*.)

**IN THIS SECTION**

EMAIL TO A FRIEND :

**Of Possible Interest**

**Perspective**: Searching for Great Adventures

**Computing Science**: New Dilemmas for the Prisoner

**Feature Article**: Digital Forensics