Until a university study emerged last week, few experts suspected that it’s more difficult to erase data stored on solid-state drives (SSDs) than data stored on hard disk drives (HDDs).
Industry experts were taken aback by the study, but noted that some SSDs have native encryption capabilities that can prevent data from being seen even after a drive’s end of life, and that some SSD sanitization methods are more successful than others.
“I don’t think anyone ever knew about this,” said security technologist Bruce Schneier.
The study, conducted by researchers at the University of California at San Diego (UCSD), showed that sanitizing SSDs of data is at best a difficult task and at worst nearly impossible. While overwriting data several times can ensure data erasure on many SSDs, the researchers found they were still able to recover data on some products.
One surefire method for protecting your SSD data is cryptographic erasure, said Kent Smith, senior director of product marketing at SSD controller manufacturer SandForce.
Crypto-erasure involves first encrypting an SSD so that only users holding passwords can access its data. When the SSD is at end of life, the user can delete the encryption keys on the drive, eliminating the possibility of decrypting or accessing the data.
“Unless you can break the 128-bit AES encryption algorithm, there’s just no way to get to the data. The drive is now still a fully functioning drive and effectively able to begin writing again,” Smith said. “That takes a split second.”
The other security method SandForce-based SSDs afford is erasing all the NAND flash memory.
“We go through every single LBA, every single location … that could have held user data, as well as performing the crypto-erase,” Smith said. “That would take longer because you have to erase the flash. That could take a few minutes.”
SandForce’s controllers, used by most major SSD vendors, include native 128-bit AES encryption that allows users to set up passwords. But some SSDs don’t come with native hardware-based encryption.
Crypto erase is performed on the drive either through the Security Erase Unit (SEU) command, or the soon-to-be released addition to the serial ATA specification under Sanitize Device Set.
When a user chooses the SEU command, all LBAs are erased in the Device Configuration Identity, which is everywhere an SSD can store user data. Additionally, the encryption key is zeroed or destroyed, leaving any existing data scrambled, and all mapping data is erased so the drive cannot even locate the prior scrambled data. The controller automatically creates a new encryption key for any new incoming data.
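The two erase options Smith describes can be illustrated with a toy model. The sketch below is not SandForce's firmware or the ATA command set: the class `ToySelfEncryptingDrive`, its methods, and the helper `keystream_xor` are hypothetical names, and a SHA-256-based keystream stands in for AES only because Python's standard library has no AES implementation.

```python
import hashlib
import secrets


def keystream_xor(key, lba, data):
    """XOR data with a SHA-256-derived keystream (a toy stand-in for AES)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        seed = key + lba.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream.extend(hashlib.sha256(seed).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToySelfEncryptingDrive:
    """Hypothetical model of a drive that encrypts everything it stores."""

    def __init__(self):
        self.key = secrets.token_bytes(16)  # 128-bit key held by the controller
        self.flash = []                     # raw flash contents: ciphertext only
        self.mapping = {}                   # LBA -> index into self.flash

    def write(self, lba, plaintext):
        self.flash.append(keystream_xor(self.key, lba, plaintext))
        self.mapping[lba] = len(self.flash) - 1

    def read(self, lba):
        if lba not in self.mapping:
            return None
        return keystream_xor(self.key, lba, self.flash[self.mapping[lba]])

    def crypto_erase(self):
        """Fast option: drop the mapping and replace the key; flash is untouched."""
        self.mapping.clear()
        self.key = secrets.token_bytes(16)

    def full_erase(self):
        """Slower option: crypto-erase and also erase the flash itself."""
        self.crypto_erase()
        self.flash.clear()


drive = ToySelfEncryptingDrive()
drive.write(0, b"payroll records")
old_key = drive.key
leftover = drive.flash[0]      # ciphertext that stays in flash after a crypto-erase

drive.crypto_erase()
after_erase = drive.read(0)    # the drive can no longer locate the old data
with_old_key = keystream_xor(old_key, 0, leftover)
with_new_key = keystream_xor(drive.key, 0, leftover)
```

In this model the leftover ciphertext decrypts only with the destroyed key, which is why the approach depends entirely on the key really being gone and on the strength of the cipher.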
“The effectiveness of cryptographic sanitization relies on the security of the encryption system used (e.g. AES), as well as the designer’s ability to eliminate ‘side channel’ attacks that might allow an adversary to extract the key or otherwise bypass the encryption,” the UCSD researchers wrote in their paper.
AES, or Advanced Encryption Standard, is the successor to the older DES (Data Encryption Standard). The U.S. government uses the standard at 128-bit and 256-bit strengths to encrypt secret and top-secret-level documents, respectively.
But it’s not enough to offer only AES encryption; much depends on how the encryption is deployed.
That’s important in part because users don’t always want to use passwords as long as needed for effective key generation. If a user chooses a password with fewer characters than would make a 128-bit or 256-bit key (one character = 8 bits, so we’re talking about passwords of 16 or 32 characters, respectively), the remaining characters often automatically become zeros.
In such cases, said Charles Kolodgy, research director for secure content and threat management products at IDC, the password can more easily be guessed.
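The padding problem can be made concrete with a short sketch. It assumes the naive scheme described above, in which each password character supplies one byte of the key and the remainder is filled with zeros. The article does not say how any particular drive derives its key, so `naive_key_from_password` and `unknown_bits` are hypothetical helpers for illustration only.

```python
KEY_BYTES = 16  # a 128-bit key is 16 bytes


def naive_key_from_password(password, key_bytes=KEY_BYTES):
    """Use the password bytes directly as key material, zero-padding the rest."""
    raw = password.encode("ascii")[:key_bytes]
    return raw.ljust(key_bytes, b"\x00")


def unknown_bits(password, key_bytes=KEY_BYTES):
    """Upper bound on the key bits an attacker does not already know."""
    return 8 * min(len(password.encode("ascii")), key_bytes)


short_key = naive_key_from_password("hunter2")
short_bits = unknown_bits("hunter2")
phrase_bits = unknown_bits("correct horse battery staple")

print(short_key.hex())
print(short_bits, phrase_bits)
```

A seven-character password leaves at most 56 of the 128 key bits unknown, and fewer in practice because typed characters do not use all eight bits of a byte. A passphrase of 16 or more characters at least fills the whole key.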
Kolodgy recommends users create a passphrase rather than a password. “The first step is to take care of 90 per cent of the users out there,” Kolodgy says. After that, the best solution is to have a random password character generator on the drive.
Even if your drive comes with native encryption capabilities, Schneier believes there is no way to tell whether a vendor’s security is foolproof “apart from a $50,000 or $100,000 engineering effort” as he states in an essay on password security.
Schneier is a proponent of purchasing as inexpensive a drive as possible and then encrypting the data by using freeware, such as TrueCrypt, or a relatively low-cost product such as PGPDisk.
The UCSD researchers agreed that crypto-erasure is a good method of ensuring that an SSD can be sanitized at its end of life or when slated for re-use.
The researchers tested 12 SSDs and found that none of the available software techniques for erasing individual files is effective. Erasing entire SSDs with native sanitize commands was most effective, but only when performed correctly, while software techniques worked most, but not all, of the time.
The researchers did not identify the products used in the test.
UCSD’s Non-volatile Systems Laboratory designed a procedure to bypass the flash translation layer (FTL) on SSDs and directly access the raw NAND flash chips to audit the success of any given sanitization technique.
An SSD’s FTL performs the mapping of data between the logical block addresses (LBAs) via the ATA or SCSI interface and NAND flash memory’s physical pages.
In a paper titled “Reliably Erasing Data from Flash-Based Solid State Drives”, the university researchers wrote that “all single-file overwrite sanitization protocols failed: between 4 per cent and 75 per cent of the files’ contents remained on the SATA SSDs.”
USB flash drives didn’t fare much better. Between 0.57 per cent and 84.9 per cent of the data remained on the drive after an overwrite was attempted.
The researchers even attempted overwriting free space on the drives and defragmenting the drive to redistribute data, encouraging the FTL to reuse more physical storage locations, but it proved to be ineffective.
Of 12 SSDs they tested using the drives’ native “Erase Unit” command, only four were actually erased. One SSD had reported itself to be sanitized, yet the data was recoverable by the researchers.
In a separate overwriting test, which took up to 58 hours for some of the SSDs, researchers found one out of eight remaining disks came back as sanitized. After two overwrites, all but one came back as erased. One drive still had 1 per cent of its data even after 20 overwrites.
Sanitizing a hard disk drive is a simpler task, the researchers found. At the consumer level, hard disks can be reformatted and overwritten. For commercial users, a degausser, which uses a strong magnetic field to demagnetize the disk platters, can effectively erase all data.
But SSDs don’t function in the same way as HDDs.
On a hard drive, the write and erase sectors are the same, meaning when a host overwrites data, it goes to the same block as the original data had been written to.
Flash memory is made up of pages and blocks. Data is written in 8KB pages, and erase operations occur in 2MB blocks, also known as “chunks.” Therefore, when an erasure occurs, an entire 2MB block must be marked for deletion.
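A quick calculation shows the mismatch between write and erase granularity, using the page and block sizes quoted above (real parts vary; the variable names are just for illustration).

```python
PAGE_SIZE = 8 * 1024           # 8KB write unit
BLOCK_SIZE = 2 * 1024 * 1024   # 2MB erase unit

pages_per_block = BLOCK_SIZE // PAGE_SIZE
# Erasing a single page in place would also wipe every other page in its block.
collateral_pages = pages_per_block - 1

print(pages_per_block)   # 256
print(collateral_pages)  # 255
```

With 256 pages sharing one erase block, erasing a single page in place would disturb 255 neighbors, which is why controllers write new data elsewhere instead.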
So writing data to NAND flash memory is a two-step process known as a read-modify-erase-write cycle. First, existing data must be erased and then the old data combined with the new can be written to a different page on the memory. The old data, however, isn’t actually erased at the time of a new write; it’s only marked for deletion.
Manufacturers use ‘garbage collection’ algorithms to go back at a later time, typically when a drive is idle, and erase data marked for deletion. All NAND flash devices work this way. In the meantime, duplicate data exists on the NAND flash memory.
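A toy model shows why a host-level overwrite can leave the original data behind. This is a deliberately simplified sketch, not any vendor's controller: `ToyFTL` and its methods are hypothetical, it erases page by page where real flash erases whole blocks, and `raw_scan` only imitates in spirit the researchers' approach of reading the flash directly rather than through the drive's interface.

```python
class ToyFTL:
    """Hypothetical flash translation layer: maps LBAs to physical pages."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages  # raw physical pages (None = erased)
        self.stale = set()               # pages marked for deletion, not yet erased
        self.mapping = {}                # LBA -> physical page index

    def write(self, lba, data):
        free = next((i for i, p in enumerate(self.pages) if p is None), None)
        if free is None:
            raise RuntimeError("no erased pages left; garbage collection needed")
        if lba in self.mapping:
            # The old copy is only marked for deletion; it is not erased here.
            self.stale.add(self.mapping[lba])
        self.pages[free] = data
        self.mapping[lba] = free

    def read(self, lba):
        return self.pages[self.mapping[lba]]

    def garbage_collect(self):
        """Erase every page previously marked for deletion."""
        for index in self.stale:
            self.pages[index] = None
        self.stale.clear()

    def raw_scan(self, needle):
        """Bypass the mapping and count physical pages still holding needle."""
        return sum(1 for p in self.pages if p is not None and needle in p)


ftl = ToyFTL(num_pages=8)
ftl.write(0, b"SECRET")
ftl.write(0, b"000000")                 # the host "overwrites" LBA 0

visible = ftl.read(0)                   # what the host sees through the interface
before_gc = ftl.raw_scan(b"SECRET")     # what is still sitting in raw flash
ftl.garbage_collect()
after_gc = ftl.raw_scan(b"SECRET")

print(visible, before_gc, after_gc)     # b'000000' 1 0
```

The host reads back zeros immediately, yet the original page survives in raw flash until garbage collection reaches it, and the host has no direct way to tell when, or whether, that happens.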
“And some drives don’t erase all that data,” said Gregory Wong, an analyst with market research firm Forward Insights.
For example, on most of today’s SSDs wear-leveling algorithms are used to more evenly distribute data across the drive so as to not wear out any one area of the NAND flash. The problem is, wear leveling can also defeat data erasure because it relocates blocks between the time when they are first written and then overwritten.
The National Institute of Standards and Technology (NIST) is currently being pushed by the SSD industry to redefine some of the military erase overwrite protocols to recognize encrypting drives that can be cryptographically erased without the need to overwrite the flash.
“But that’s not happening tomorrow. Government agencies take a long time to embrace standards,” Smith said.
Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian, or subscribe to Lucas’s RSS feed.