Connecting the Dots

Can the tools of graph theory and social-network studies unravel the next big plot?

Brian Hayes

Who Calls Whom

The NSA is the U.S. espionage service with responsibility for cryptography and "signals intelligence." Although its budget and staffing are secret, it is often said to be the largest of the U.S. intelligence agencies and also, incidentally, the largest employer of mathematicians in the United States and perhaps in the world. And it is assumed to possess prodigious computing resources.

Exploration of the call graph belongs to the branch of signals intelligence known as traffic analysis. In a battlefield situation, you might intercept an enemy's radio transmissions but be unable to read their encrypted content. Nevertheless, just counting the messages can yield valuable information. A flurry of activity might signal an impending troop movement; sudden radio silence could be even more ominous. If you can identify the source and the intended recipient of each message—in effect, constructing a call graph—you can learn even more, since lines of communication often reveal something about the organization of a military force.
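As a rough illustration of the two ideas in this paragraph (counting messages, then linking senders to recipients), here is a minimal Python sketch. The intercept log, the station names and the helper functions are all hypothetical, invented for this example; only the metadata is used, never the message contents.

```python
from collections import Counter, defaultdict

# Hypothetical intercept log: (hour, sender, recipient).
# The encrypted contents are assumed unreadable and are not stored.
INTERCEPTS = [
    (0, "HQ", "A"), (0, "HQ", "B"),
    (1, "A", "A1"), (1, "HQ", "A"), (1, "B", "B1"),
    (2, "HQ", "A"), (2, "HQ", "B"), (2, "A", "A1"),
    (2, "A", "A2"), (2, "B", "B1"),
]

def volume_by_hour(intercepts):
    """Count messages per hour; a spike or a sudden silence stands out here."""
    return Counter(hour for hour, _, _ in intercepts)

def call_graph(intercepts):
    """Build a directed graph: sender -> Counter of recipients."""
    graph = defaultdict(Counter)
    for _, sender, recipient in intercepts:
        graph[sender][recipient] += 1
    return graph

def out_degree(graph):
    """Number of distinct recipients each sender has contacted."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

if __name__ == "__main__":
    print(volume_by_hour(INTERCEPTS))
    print(out_degree(call_graph(INTERCEPTS)))
```

In this toy data the hourly counts rise from hour to hour, and the station labeled HQ is the only one that talks to both A and B, which hints at the kind of organizational structure the text describes.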

The search for meaningful patterns in telephone records could rely on similar principles, but the problem is much harder. In the military situation, messages between enemy units are readily identified as such. In the telephone database, calls among a few dozen conspirators would all too easily get lost in the background noise of other conversations.

The records in the call database are collected not for the sake of national security but for mundane commercial purposes. In order to send you an itemized bill at the end of the month, a phone company needs to keep track of every call completed, with the originating and receiving phone numbers and the starting and ending times. The largest companies handle roughly 250 million toll calls a day, and so a month's worth of data amounts to several billion call records. AT&T reports that its database of retained records is approaching two trillion calls and more than 300 terabytes of data.
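The figures above can be checked with a little arithmetic. The sketch below models a call record with the four fields the text names and then works out the monthly volume and the implied storage per record. The class name, the field names, the 30-day month and the decimal definition of a terabyte are assumptions made for this example, and the AT&T totals are treated as round numbers.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class CallRecord:
    """One completed call, with the fields needed for an itemized bill."""
    originating: str   # originating phone number
    receiving: str     # receiving phone number
    start: datetime    # starting time
    end: datetime      # ending time

    @property
    def duration_seconds(self) -> float:
        return (self.end - self.start).total_seconds()

# Monthly volume for one of the largest carriers.
CALLS_PER_DAY = 250_000_000
DAYS_PER_MONTH = 30                      # assumption: a 30-day month
records_per_month = CALLS_PER_DAY * DAYS_PER_MONTH

# Rough storage per record implied by the reported totals.
RETAINED_CALLS = 2_000_000_000_000       # "approaching two trillion"
RETAINED_BYTES = 300 * 10**12            # "more than 300 terabytes", decimal TB assumed
bytes_per_record = RETAINED_BYTES / RETAINED_CALLS

if __name__ == "__main__":
    print(f"{records_per_month:,} records per month")
    print(f"about {bytes_per_record:.0f} bytes per record")
```

Under these assumptions a month comes to 7.5 billion records, and the retained database works out to roughly 150 bytes per call.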

Apart from billing, the call graph has other uses within the phone company—some of which are not too different from what the NSA may be doing, and almost as secretive. Historical calling patterns can be used to detect fraud, and some patterns are also of interest in marketing. For example, a company that offers a discounted rate within a "calling circle" can use information from the call graph to estimate the costs and benefits of the program.

In principle, the same kind of traffic data found in telephone call-detail records could also be compiled for other communications channels. For instance, Federal Express and other courier services keep digitized records of their deliveries, which could readily be transformed into a database of senders and receivers. Curiously, the most digital medium of all—the Internet—does not provide for routine retention of who-speaks-to-whom data; there's no direct need for it, since customers do not pay by the message. However, there is no technological barrier to collecting detailed statistics on e-mail messages and other kinds of Internet traffic. A "packet sniffer" installed on the network backbone would simply need to scan the headers of messages and record the to and from addresses. (It's even possible that equipment reportedly installed by the NSA at certain Internet switching centers could have this purpose.)
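To give a sense of how little is involved in recording who-speaks-to-whom data for e-mail, here is a simplified Python sketch using the standard library's email package. It assumes the message text has already been reassembled from network packets, which a real packet sniffer would have to do first; the sample message and the traffic_edges helper are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Hypothetical message; only the headers are examined.
SAMPLE = (
    "From: Alice <alice@example.org>\r\n"
    "To: Bob <bob@example.org>, carol@example.org\r\n"
    "Subject: lunch\r\n"
    "\r\n"
    "The body is never inspected.\r\n"
)

def traffic_edges(raw_message):
    """Return (sender, recipient) address pairs taken from the headers."""
    msg = message_from_string(raw_message)
    _, sender = parseaddr(msg.get("From", ""))
    recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = getaddresses(recipient_headers)
    return [(sender, addr) for _, addr in recipients if addr]

if __name__ == "__main__":
    for edge in traffic_edges(SAMPLE):
        print(edge)
```

Each pair returned is one edge of an e-mail "call graph," in the same sense as a row in a telephone call-detail record.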
