Data recovery with the TestDisk/PhotoRec duo

Valerii Kirsanov, 123RF.com

Fishing for Data

Data loss is painful. But, if dealt with promptly, much of the data that has been lost can be recovered. Our tests with TestDisk and PhotoRec prove that data recovery does not require expensive software.

To conduct this test, I destroyed a partition table and deleted about 600GB of data. Then, I let TestDisk and PhotoRec show what they could do. A common saying about hard drives is that they have just three states: empty, full, and damaged. The latter two states occur more frequently than most users would expect. Damage tends to strike quite unexpectedly once symptoms of mechanical fatigue appear.

Until a few years ago, all hard disks were mechanical. Users could therefore sometimes hear clacking noises when a hard drive was nearing the end of its life. They could then swap out the drive before it became inaccessible. Today's SSDs are different; they die quietly.

One indicator of the health of a hard drive running under Linux is S.M.A.R.T. [1], the monitoring system built into the drive itself. The Linux Smartmontools [2] can process and interpret its data for signs of wear and tear. However, if the hard drive is no longer accessible but is still visible from the BIOS, or if a partition or data within a partition has accidentally been deleted, then it is time to use the forensic tools TestDisk and PhotoRec.
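As a rough sketch of such a health check, smartctl from the Smartmontools prints an overall verdict when called with -H. The helper name smart_passed and the device /dev/sda are assumptions for illustration. The verdict line matched here is the one ATA drives report; SCSI and NVMe drives word it differently.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the smartctl health report on stdin
# contains the "PASSED" verdict that ATA drives print. Other drive types
# use different wording, so a failure here means "look closer", not
# "the drive is dead".
smart_passed() {
    grep -q 'overall-health self-assessment test result: PASSED'
}

# Typical use as root (the device name is an assumption):
#   smartctl -H /dev/sda | smart_passed && echo "drive reports healthy"
#   smartctl -a /dev/sda    # full attribute table for a closer look
```

A passing verdict is no guarantee, so the attribute table from smartctl -a is worth a look as well.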

No Overwrites, Please

A basic rule of data recovery is that deleted data does not disappear the moment the delete key is pressed. Instead, the filesystem only marks the file as deleted. On FAT filesystems, for example, this is done by changing the first character of the file name in the directory entry, which makes the file invisible in the filesystem. (See the box "What Happens When Data Is Deleted?" for more information.)

What Happens When Data Is Deleted?

When data is deleted, it is actually only marked as deleted in the filesystem. This means the data is no longer visible in the file manager and the console. The corresponding disk sectors remain unchanged and, therefore, the data is still on the disk.

Forensic software can track down these unassigned sectors by reading header records and making the data visible again for the filesystem. However, if the physical disk sectors are overwritten with different data, recovery becomes practically impossible.

At this point, the chances of retrieving data are excellent. However, the more data is written to the disk, potentially overwriting the areas that contain the deleted data, the less likely it is that a typical home user with common tools like TestDisk and PhotoRec will be able to recover it. A lot more can be done in a laboratory setting that offers clean room technology and direct work on the magnetic disk platters. However, the costs associated with this type of approach are often too high for private users.

The Data Rescuers

In most distributions, the two programs described here ship together in a single package. This makes sense because PhotoRec is a useful extension for TestDisk, and the two pieces of software are often used in tandem.

TestDisk is primarily responsible for recovering partitions, partition tables, the Master Boot Record (MBR), and data that have been deleted. PhotoRec was developed as recovery software for photos on the internal or the external storage medium for digital cameras. However, it can also deal with almost 450 different file formats, mainly in the areas of multimedia and office. TestDisk can repair the following filesystems: FAT{12,16,32}, NTFS, EXT{2,3,4}, Btrfs, and HFS+. Aside from Linux, both tools are available for various BSD versions as well as for Solaris, Mac OS, and Windows. TestDisk only works via the command line. Recently, the graphical interface QPhotoRec was developed for PhotoRec.

Cool Heads Can Save Data

It is not a good idea to attempt data recovery without first considering a few helpful rules. The first thing to do when you suspect that data has been lost on a hard drive, memory card, or USB stick is to make sure that the storage device is no longer written to. Otherwise, data can be irretrievably overwritten.

Both tools have to be operated as the root user. If at all possible, you should not work with the original data. Instead, you can copy the data with the dcfldd program [6], which is based on the source code of dd. Among other things, it creates copies of files, partitions, and entire hard drives. It also creates MD5 sums and can verify, down to the bit, that original and copy match. Unlike dd, it has a progress indicator, so the remaining copy time is displayed while entire partitions or hard drives are copied. Dcfldd can also automatically split the output across several files.
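The copy-and-verify idea can be sketched as follows. This is a minimal sketch, not the article's exact procedure: it uses plain dd and md5sum because they are present on practically every live system, and the helper name image_and_verify and all paths are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: copy a source (device or file) into an image file
# and verify the copy by comparing MD5 sums of source and image.
image_and_verify() {
    src=$1
    img=$2
    # Plain copy; the image must live on a different drive than the source.
    dd if="$src" of="$img" bs=64k 2>/dev/null || return 1
    sum_src=$(md5sum < "$src" | cut -d' ' -f1)
    sum_img=$(md5sum < "$img" | cut -d' ' -f1)
    echo "source: $sum_src"
    echo "image:  $sum_img"
    # Exit status 0 only if both sums match
    [ "$sum_src" = "$sum_img" ]
}

# Typical use as root (device and target path are assumptions):
#   image_and_verify /dev/sdb /mnt/backup/sdb.img
# dcfldd can hash while it copies, for example:
#   dcfldd if=/dev/sdb of=/mnt/backup/sdb.img hash=md5 hashlog=sdb.md5
```

Once the sums match, the recovery tools can work on the image instead of the damaged original.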

Preparatory Work

Just as you want to avoid working with the original data, you also don't want to work directly in the affected filesystem. Recovered data should never be written to the same partition. For this situation, Linux has an ideal solution. The system makes it possible to work from a current live CD or from specialty tool collections such as Parted Magic, SystemRescueCD, or Ultimate Boot CD. Almost all of the distributions come with the TestDisk/PhotoRec duo.

Adrenalin levels can rise when the user suspects a loss of data, and operating the software can become difficult under such circumstances. Thus, it is a good idea to perform at least one dry run to get a basic idea of how the software works. If these basic conditions are met, you can start the tool of your choice from a terminal in a live medium.

For testing, I intentionally clobbered the partition table of an external FAT-formatted 3TB mechanical hard drive by overwriting its first sector with zeros. Never run this command on a drive whose data you still need:

dd if=/dev/zero of=/dev/sdb bs=512 count=1

I then used TestDisk to reconstruct the partition table. I also deleted about 600GB of mixed data, which I tried to recover with PhotoRec. Test hardware included a current notebook model with a Haswell CPU and 8GB RAM.
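For a dry run without risk, the same dd call can be rehearsed on a scratch file instead of a real drive. The following is a minimal sketch under that assumption; the function names and the file name are made up for illustration.

```shell
#!/bin/sh
# Create a 1 MiB scratch "disk" filled with zeros and write the MBR boot
# signature (bytes 0x55 0xAA) at offset 510 of the first sector.
make_scratch_image() {
    dd if=/dev/zero of="$1" bs=512 count=2048 2>/dev/null
    printf '\125\252' | dd of="$1" bs=1 seek=510 conv=notrunc 2>/dev/null
}

# The dd call from the article, pointed at a file. conv=notrunc keeps the
# rest of the file intact, as would be the case on a real block device.
wipe_first_sector() {
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}

# Succeeds if the first 512 bytes contain nothing but zeros.
first_sector_is_empty() {
    n=$(dd if="$1" bs=512 count=1 2>/dev/null | tr -d '\0' | wc -c)
    [ "$n" -eq 0 ]
}

# Example session (file name is an assumption):
#   make_scratch_image disk.img
#   wipe_first_sector disk.img
#   first_sector_is_empty disk.img && echo "first sector wiped"
```

TestDisk also accepts an image file as an argument (testdisk disk.img), so a scratch file like this can be used to get familiar with the menus, although an empty image holds nothing to recover.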

Live Image

After starting the live image of your choice, the next thing to do is to update TestDisk. The new version 7.0 has been available since April 2015, and it comes with new file types and a GUI for PhotoRec.

To begin, enter su in the terminal to become the root user. No password is required here for a live medium. Then, start TestDisk and confirm that you would like to create a log file (Figure 1). Navigation in TestDisk is performed with the arrow keys. After you confirm with the return key, TestDisk consults the BIOS or the UEFI and then displays a list of the drives it has found. From this list, select the drive you want to repair (Figure 2) and confirm the preset Proceed by pressing the return key. At this point, the program tries to recognize the partition table type. The default entry is Intel, and this usually turns out to be correct (Figure 3). After another confirmation, TestDisk displays the tools it has to offer.

Figure 1: Having a log is always a good thing.
Figure 2: Be careful when you choose a hard drive.
Figure 3: Intel is usually the correct choice.

Writing a New Partition Table

Both the Analyze and Advanced entries are especially useful for damaged partition tables and deleted data (Figure 4). The other options can be used to change the drive geometry, write a new master boot record (MBR) and delete the partition table. See the box "What Is a Partition Table?" for more information.

What Is a Partition Table?

A partition table records which partitions exist on a hard drive; each drive carries its own table. Traditionally, the partition table is part of the master boot record (MBR). Newer systems use the GUID partition table (GPT) instead. If the table is damaged, partitions and possibly entire hard drives are no longer visible to the operating system.
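To make the MBR layout more tangible, the following sketch reads a saved copy of a drive's first sector with od. It relies on the classic MBR layout: four 16-byte partition entries starting at byte 446, the type code in byte 4 of each entry, and the signature 55 AA in the last two bytes. The function names and the file name mbr.img are assumptions.

```shell
#!/bin/sh
# Print the two signature bytes at offset 510 as hex.
# A valid MBR ends in 55aa.
mbr_signature() {
    od -An -tx1 -j510 -N2 "$1" | tr -d ' \n'
}

# Print the type byte of each of the four primary partition entries.
# Type 0x00 marks an unused entry.
mbr_partition_types() {
    i=0
    while [ "$i" -lt 4 ]; do
        off=$((446 + 16 * i + 4))
        type=$(od -An -tx1 -j"$off" -N1 "$1" | tr -d ' \n')
        printf 'entry %d: type 0x%s\n' "$((i + 1))" "$type"
        i=$((i + 1))
    done
}

# Typical use as root (device name is an assumption):
#   dd if=/dev/sdb of=mbr.img bs=512 count=1
#   mbr_signature mbr.img
#   mbr_partition_types mbr.img
```

Common type codes include 0x0c for FAT32 (LBA), 0x07 for NTFS, and 0x83 for Linux filesystems. On a GPT drive, the first sector holds a protective MBR with a single entry of type 0xee.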

Figure 4: Analysis is the first step.

When the user is confronted with a destroyed partition table, the first thing to do is perform an analysis. After confirming the preset setting, TestDisk immediately shows the analysis results for our case, which admittedly is a simple one. The program correctly recognizes the three partitions, although it does so twice for the first partition. If you now go to Quit and then to Advanced in the main menu, the next window indicates damage to the boot sector (Figure 5).

Figure 5: Discovering an invalid boot sector.

Selecting Boot opens options that include Rebuild BS. Choosing this option lets you rebuild the boot sector (Figure 6). After confirming your selection, the process will take a few moments to complete. Once the new boot sector has been created, you should confirm via Write that it will be written to the drive. After this, all three partitions should appear correctly (see Figure 7).

Figure 6: Writing a new boot sector.
Figure 7: A successful recovery. All of the three partitions are visible again.

Choosing Undelete lets you see all of the files with their correct names. This is because no data was actually deleted. Instead, the partitions had become invisible to the operating system. Now you can return to the main menu via Quit, close the program, and start using the drive again after a reboot. The log can be helpful for any remaining questions. It is found as testdisk.log in the home directory.

Now I'll try using PhotoRec to recover the data from the two partitions that I deleted with rm -rf. You should start the version without the GUI in a terminal; its dialogs and key commands will be familiar from TestDisk. In the first window, select the hard drive on which the data loss occurred. Then, confirm with Proceed and select the affected partition. Now you can either start the search immediately with the default settings or use the file options to restrict it to particular file types.

For example, if you have deleted JPGs, it is possible to filter out all of the more than 400 other file types and just look for JPG. Then, you should start the search, select the filesystem, and use the arrow keys to select a different partition as the destination for the recovered data. Then, confirm with C. At this point, it is a good idea to have written down the path for this partition or the directory within the partition. This will prevent mistakes. After confirmation, PhotoRec begins the recovery process (Figure 8).

Figure 8: PhotoRec working on data recovery.

In this test, the program took about six hours to recover about 600GB, which included 10 different file types. Approximately 215,000 files were recovered, primarily JPG, PNG, PDF, and MP3 (Figure 9). The recovered files lie in numbered folders, which are labeled as recup_dir. The individual files now have cryptic file names, such as f12345678.jpg (Figure 10). This is common in this type of software, because it does not work at the level of the filesystem but rather one level further down.

Figure 9: All of the files have been found again.
Figure 10: Cryptic file names require a lot of extra work.
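Some of the extra work can be scripted. The following sketch moves the recovered files out of the numbered recup_dir folders into one folder per file extension. The helper name sort_recovered and all paths are assumptions, and the sketch relies on the recovered file names being unique, so check for name clashes before running it on important data.

```shell
#!/bin/sh
# Hypothetical helper: move every file found below a recup_dir* folder
# in $1 into a subfolder of $2 named after the file's extension.
sort_recovered() {
    src=$1
    dest=$2
    find "$src" -type f -path '*recup_dir*' | while IFS= read -r f; do
        name=${f##*/}
        case $name in
            *.*) ext=${name##*.} ;;   # text after the last dot
            *)   ext=unknown ;;       # files without an extension
        esac
        mkdir -p "$dest/$ext"
        mv -- "$f" "$dest/$ext/"
    done
}

# Example (paths are assumptions):
#   sort_recovered /mnt/rescue /mnt/rescue/sorted
```

With 215,000 files, a per-type layout like this at least makes it possible to work through the results one format at a time.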

The graphical version QPhotoRec is not yet included in the Linux versions of TestDisk (Figure 11). Even if the less experienced user is not fazed by the prospect of operating PhotoRec in a terminal, it's good to know about the graphical interface for PhotoRec. It is available in version 1.0 of the current edition of the Parted Magic CD [7], which is now available only for a fee. It can also be downloaded and installed on DEB and RPM distributions. This is still just an early version, and you probably shouldn't use it for actual data.

Figure 11: QPhotoRec is not quite yet ready for use.

Conclusion

Both of these recovery tasks were resolved with excellent results by TestDisk and PhotoRec. Both tools are on par with paid software for Windows when evaluated against the tasks they were designed to accomplish.

I would like to see a graphical interface for TestDisk rather than for PhotoRec. The latter has just a few options, so a graphical interface isn't really required. TestDisk, on the other hand, could make good use of a GUI to help the layperson sort through its multitude of quite powerful data recovery capabilities. Until then, users can turn to the documentation [8], which is quite good. Otherwise, there appears to be no reason to spend money on a good data recovery package or to change operating systems.