Modern crime often leaves an electronic trail. Finding and preserving that evidence requires careful methods as well as technical skill
Digital forensics relies on a kit of tools and techniques that can be applied equally to suspects, victims, and bystanders. A cell phone found on a dead body without identification would almost certainly be subjected to analysis, but so would a phone dropped during a house burglary. How the analysis is performed is therefore more a matter of legal issues than technological ones. As the field has grown, practitioners have tried to create a consistent but flexible approach for performing investigations, despite policy variations. Several such digital forensic models have been proposed, but most have common elements.
Before data can be analyzed, they are collected from the field (the “scene of the crime”), stabilized, and preserved to create a lasting record. Understanding the inner workings of how computers store data is key to accurate extraction and retention. Although computers are based entirely on computations involving the binary digits 0 and 1, more commonly known as bits, modern computers do most of their work on groups of eight bits called bytes. A byte can represent the sequences 00000000, 00000001, 00000010, through 11111111, which corresponds to the decimal numbers 0 through 255 (there are two options, 0 and 1, with eight combinations, so 28 = 256). One common use for bytes inside the computer is to store written text, where each letter is represented by a specific binary code. UTF-8, a common representation, uses the binary sequence 00100001 to represent the letter A, 00100010 for the letter B, and so on. (Computers often use hexadecimal codes in memory as well; see figure at right.)
When recorded on a hard drive or memory card, these bytes are grouped in blocks called sectors that are typically 512 or 4,096 bytes in length. A sector is the smallest block of data that a drive can read or write. Each sector on the disk has a unique identifying number, called the sector’s logical block address. An email message might require 10 or 20 sectors to store; a movie might require hundreds of thousands. A cell phone advertised as having “8 GB” of storage has 8 billion bytes or roughly 15 million sectors.
Depending on the arrangement of other files on the device, the sectors can be stored as a single sequential stream or fragmented into many different locations. Other sectors contain information that the computer uses to find the stored data; such bookkeeping material is called file system metadata (literally “data about data”).
To preserve the data on a computer or phone, each of these sectors must be individually copied and stored on another computer in a single file called a disk image or physical image. This file, which contains every byte from the target device, naturally includes every visible file. But the physical image also records invisible files, as well as portions of files that have been deleted but not yet overwritten by the operating system.
In cases involving networks instead of individual machines, the actual data sent over the network connection are preserved. Thus network forensics is equivalent to a wiretap—and, indeed, law enforcement is increasingly putting network forensics equipment to this use.
The random access memory (RAM) associated with computer systems is also subject to forensic investigation. RAM gets its name because the data it stores can be accessed in any order. This fast access makes RAM particularly useful as temporary storage and working space for a computer’s operating systems and programs. But RAM is difficult to work with, because its contents change very quickly and are lost when a computer is turned off. RAM must be captured with a dedicated program (a memory imager) and is stored in its own special kind of file, called a memory dump. Although data in RAM can be extracted from all kinds of electronic systems—not just desktops, laptops, and cell phones but also network communications equipment such as wireless routers—each of these systems uses different kinds of internal software structures, so programs designed to analyze one may not work on another. And even though forensics researchers have developed approaches for ensuring the forensic integrity of drive copies, currently there is no widely accepted approach for mathematically ensuring a RAM dump.
Preserving the data is only the first step in the process. Next, an examiner has to explore for information that might be relevant to the investigation. Most examinations are performed with tools that can extract user files from the disk image, search for files that contain a specific word or phrase in a variety of human languages, and even detect the presence of encrypted data. Relevant data are then extracted from the preserved system so they are easier to analyze.