Recovering an SSD after formatting. Data recovery from SSD drives: what is real and what is not

We perform data recovery from SSDs of all brands: Kingston, OCZ, Transcend, Intel, Corsair, Silicon Power, Patriot, A-Data, Crucial, Western Digital, Samsung, Apacer, etc.

SSDs (Solid State Drives) are high-speed data storage devices based on NAND flash memory. Their capacities and speeds are comparable to those of HDDs, but they have no mechanical parts, which lets them easily withstand external physical influences such as vibration, shocks, and falls.

The structure of an SSD is almost identical to that of a conventional flash drive. It has several NAND flash chips and a management controller. The differences are that SSDs use a faster type of memory and controllers that can work with multiple memory chips in parallel.

How we recover data from SSD

Data recovery from SSD drives consists of several stages:

The main malfunctions that occur with SSD drives:

Physical damage to the SSD. This includes damage to interface connectors, to the controller and memory chips, to the discrete components on the SSD board, and to the printed circuit board as a whole, caused by mechanical or electrical influences.
Logical damage to the file system of the SSD, accidental deletion of information, formatting. Software glitches may occur when working with SSDs, leaving user data inaccessible or damaged.
Damage to the service information area of the SSD, which the controller uses to run the translation mechanism. An SSD contains areas that the drive uses for its own internal purposes. They do not store user data, but damage to the information in them leads to a complete loss of the drive’s functionality.

Recovering data from SSDs is a much more complex and time-consuming process than recovering it from conventional flash drives. The much larger number of memory chips in an SSD greatly increases the number of possible courses of action at each stage of recovery. Because SSDs must meet much more stringent requirements for all basic characteristics than conventional flash drives, the technologies and methods they use for handling information are also more complex. For this reason, recovering data from any SSD requires an individual approach to each case and specialized equipment.

Greetings to all Habr readers!

Today I propose to talk a little about recovering information from faulty SSDs. But first, before we get acquainted with the technology for saving precious kilo-, mega-, and gigabytes, please pay attention to the diagram below. On it we tried to place the most popular SSD models according to the probability of successful data recovery from them.

As you might guess, drives located in the green zone usually have the fewest problems (provided the engineer has the necessary tools, of course). And drives from the red zone can cause a lot of suffering to both their owners and restoration engineers. If such SSDs fail, the chances of getting back lost data are currently too small. If your SSD is located in or near the red zone, then I would advise making a backup before each brushing of your teeth.

Those who have already made a backup today are welcome to read on below the cut.

A small caveat should be made here. Some companies can do a little more, some a little less. The results illustrated in the chart represent an industry average as of 2015.

Today, there are two common approaches to recovering data from faulty SSDs.

Approach #1. Reading dumps of NAND flash chips

This is solving the problem, as they say, head-on. The logic is simple. User data is stored on NAND flash memory chips. The drive is faulty, but what if the chips themselves are fine? In the vast majority of cases this is true: the chips are operational. Some of the data stored on them may be damaged, but the chips themselves function normally. Then you can unsolder each chip from the drive’s printed circuit board and read its contents using a programmer, and then try to assemble a logical image of the drive from the resulting files. This approach is currently used for data recovery from USB flash drives and various memory cards. I’ll say right away that this is thankless work.

Difficulties may arise even at the reading stage. NAND flash memory chips come in different packages, and the adapter required for a specific chip may not be included with the programmer. For such cases, the kit usually includes some kind of universal solder-on adapter. The engineer is forced to connect the required pins of the chip to the corresponding contacts of the adapter using thin wires and a soldering iron. The task is entirely solvable, but it requires steady hands, certain skills, and time. I’m not very handy with a soldering iron myself, so this kind of work commands my respect.

Let's also not forget that an SSD will most likely contain 8 or 16 such chips, and each one will have to be unsoldered and read. And the process of reading a chip is not fast either.
Well, then all that remains is to assemble an image from the dumps obtained, and it’s done! But this is where the fun begins. I will not go into details; I will describe only the main tasks that the engineer and the software he uses must solve.

Bit errors

The nature of NAND flash memory chips is such that errors are bound to appear in the stored data. Individual memory cells begin to be read incorrectly, and consistently so. And this is considered normal as long as the number of errors within a given range stays below a certain threshold. Error correction codes (ECC) are used to combat bit errors. When saving user data, the drive first divides the data block into several ranges and adds some redundant data to each range, which makes it possible to detect and correct errors. The number of errors that can be corrected is determined by the strength of the code.

The stronger the code, the longer the sequence of appended bytes. The process of calculating and appending this sequence is called encoding, and correcting bit errors is called decoding. The encoding and decoding circuits are usually implemented in hardware within the drive controller. When executing a read command, the drive also performs bit error correction, along with other operations. The same decoding procedure must be performed on the dump files obtained. To do this, you need to determine the parameters of the code used.
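To make encoding and decoding concrete, here is a minimal sketch in Python. It deliberately swaps in a tiny Hamming(7,4) code, which corrects a single flipped bit per 7-bit codeword; this is an assumption for illustration only, since real SSD controllers use far stronger codes whose parameters have to be determined per model. The function names are hypothetical.

```python
# Illustrative stand-in only: Hamming(7,4) protects 4 data bits with
# 3 parity bits and can correct one flipped bit per codeword.

def encode(data_bits):
    """Encoding: compute the redundant bits and add them to the data."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    # Codeword positions 1..7 hold: p1 p2 d1 p4 d2 d3 d4
    return [p1, p2, d1, p4, d2, d3, d4]

def decode(codeword):
    """Decoding: locate a single flipped bit, fix it, return the data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 0 means no error was detected
    if syndrome:
        c[syndrome - 1] ^= 1  # the syndrome is the 1-based error position
    return [c[2], c[4], c[5], c[6]]

stored = encode([1, 0, 1, 1])
stored[4] ^= 1            # simulate a cell that is read incorrectly
recovered = decode(stored)
```

With two or more flipped bits in one codeword this code decodes to the wrong data, which mirrors the threshold idea above: a code of a given strength only corrects errors up to its limit.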

Memory chip page format

Memory chips are read and written in units called pages. For modern chips, the page size is approximately 8 KB or 4 KB. Moreover, this value is not an exact power of two, but slightly larger. That is, a page can hold 4 or 8 KB of user data plus something extra. Drives use this extra part to store correction codes and some service data. Typically a page is divided into several ranges. Each range consists of a user data area (UA) and a service data area (SA). The latter holds the correction codes that protect this range.

All pages have the same format, and for successful recovery it is necessary to determine which byte ranges correspond to user data and which are service data.
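As a rough sketch of what determining the page format leads to, the Python function below splits a raw page into its user and service areas. It assumes every range has the same layout, with the user area first and the service area after it; the lengths passed in and the function name are hypothetical, because the real layout differs from controller to controller.

```python
def split_page(page, user_len, service_len):
    """Split a raw page into user data and per-range service data.

    Assumes each range is laid out as [user area][service area].
    Trailing bytes that do not form a full range are ignored.
    """
    range_len = user_len + service_len
    user_parts = []
    service_parts = []
    for start in range(0, len(page) - range_len + 1, range_len):
        user_parts.append(page[start:start + user_len])
        service_parts.append(page[start + user_len:start + range_len])
    return b"".join(user_parts), service_parts

# Toy page: 4 ranges of 8 user bytes + 2 service bytes, plus 3 spare bytes.
toy_page = b"".join(bytes([i]) * 8 + b"\xee\xee" for i in range(4)) + b"\xff" * 3
user_data, service_data = split_page(toy_page, user_len=8, service_len=2)
```

In practice the service areas would then be fed to the decoder together with their user areas before the user data is trusted.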

Scrambling VS Encryption

Most modern SSDs do not store user data in the clear; instead, it is scrambled or encrypted first. The difference between these two concepts is fairly arbitrary. Scrambling is some kind of reversible transformation. Its main task is to turn the source data into something resembling a random sequence of bits. This transformation is not cryptographically secure: knowing the algorithm lets you easily obtain the original data. In the case of encryption, knowing the algorithm alone gives you nothing. You also need to know the decryption key. Therefore, if the drive uses hardware data encryption and you do not know the encryption parameters, you will not be able to recover the data from the dumps. It's better not to even start this task. Fortunately, most manufacturers honestly admit that they use encryption.

Moreover, marketers managed to turn this criminal (from the point of view of data recovery) functionality into an option that supposedly gives a competitive advantage over other drives. And it would be okay if there were separate models for the paranoid, in which there would be high-quality protection against unauthorized access. But now, apparently, the time has come when the lack of encryption is considered bad manners.
In the case of scrambling, things are not so sad. In drives, it is implemented as a bitwise XOR operation (addition modulo 2, exclusive OR) performed on the original data and some generated sequence of bits (the XOR pattern).

This operation is often denoted by the symbol ⊕.

Because XOR-ing any value with itself gives zero (Key ⊕ Key = 0), to obtain the original data it is enough to perform a bitwise XOR of the read buffer with the XOR pattern once more:

(X ⊕ Key) ⊕ Key = X ⊕ (Key ⊕ Key) = X ⊕ 0 = X

It remains to determine the XOR pattern. In the simplest case, the same XOR pattern is used for all pages. Sometimes the drive generates a long pattern, say 256 pages long; then each of the first 256 pages of the chip is XOR-ed with its own piece of the pattern, and this repeats for each subsequent group of 256 pages. But there are more complicated cases, in which a separate pattern is generated for each page according to some rule. In such cases you also have to work out that rule, which, to put it mildly, is not easy.
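The first two cases can be sketched in a few lines of Python. This is a minimal sketch assuming the pattern is already known and is a whole number of pages long; the function name and the toy sizes are hypothetical. A pattern exactly one page long covers the simplest case, and a longer one covers the repeating multi-page case.

```python
def descramble_page(page, pattern, page_index, page_size):
    """XOR one page with its slice of a repeating XOR pattern.

    Because XOR is its own inverse, the same function both scrambles
    and descrambles.
    """
    pages_in_pattern = len(pattern) // page_size
    offset = (page_index % pages_in_pattern) * page_size
    key = pattern[offset:offset + page_size]
    return bytes(a ^ b for a, b in zip(page, key))

# Toy example: 4-byte pages and a pattern that is 2 pages long.
toy_pattern = bytes(range(8))
original = b"ABCD"
scrambled = descramble_page(original, toy_pattern, page_index=3, page_size=4)
restored = descramble_page(scrambled, toy_pattern, page_index=3, page_size=4)
```

The per-page generated patterns of the third case do not fit this scheme: there the key would have to be computed from the page number by whatever rule the controller uses.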

Assembling the image

After all the preliminary transformations have been done (bit error correction, scrambling removal, page format determination, and possibly a few others), the final stage is assembling the image. Because the number of rewrite cycles for the chips' cells is limited, drives are forced to use wear leveling mechanisms to extend the life of the chips. The consequence is that user data is not stored sequentially but is scattered chaotically across the chips. Obviously, the drive needs to somehow remember where it saved each block of data. To do this, it uses special tables and lists, which are also stored on the memory chips. The set of these structures is usually called a translator. It would be more accurate to say that a translator is a kind of abstraction responsible for converting logical addresses (sector numbers) into physical ones (chip and page).

Accordingly, in order to assemble a logical image of the drive, you need to understand the format and purpose of all translator structures, and also know how to find them. Some of the structures are quite large, so the drive does not store them whole in one place, and they too end up scattered in pieces across different pages. In such cases, there must be a structure that describes this distribution. It turns out to be a kind of translator for the translator. Drives usually stop there, but the nesting can go even further.
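Once the translator has been decoded, the assembly step itself is conceptually simple. The Python sketch below assumes the translator has already been reduced to a plain mapping from logical page number to (chip, physical page), and that the chip dumps have already been corrected, descrambled, and stripped of service areas; all names and the toy data are hypothetical.

```python
def assemble_image(translator, dumps, page_size, total_pages):
    """Build a logical image from per-chip dumps.

    translator: dict mapping logical page number -> (chip, physical page)
    dumps:      dict mapping chip number -> bytes holding that chip's
                user data, page after page
    Logical pages absent from the translator are left filled with zeros.
    """
    image = bytearray(page_size * total_pages)
    for logical, (chip, physical) in translator.items():
        src = dumps[chip][physical * page_size:(physical + 1) * page_size]
        image[logical * page_size:(logical + 1) * page_size] = src
    return bytes(image)

# Toy example: two chips, 4-byte pages, data scattered out of order.
toy_dumps = {0: b"CCCCAAAA", 1: b"BBBBDDDD"}
toy_translator = {0: (0, 1), 1: (1, 0), 3: (0, 0)}
image = assemble_image(toy_translator, toy_dumps, page_size=4, total_pages=4)
```

The hard part in practice is everything this sketch takes for granted: finding the translator structures in the dumps and picking the most recent copy of each.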

This approach to data recovery makes it possible to completely emulate the operation of the drive at a low level. This explains the pros and cons of this approach.

Cons:

Labor intensity. Since we are completely emulating the drive, we have to do all the dirty work for it.
Risk of failure. If even one of the tasks cannot be solved, recovery is out of the question. And there are many ways to fail: the chips cannot be read because the programmer does not support them; unknown correction codes; an unknown XOR pattern; encryption; an unknown translator.
Risk of damaging the drive even further. Besides unsteady hands, the heating of the memory chips is a risk in itself. For worn-out chips, it may result in additional bit errors.
Time and cost of the work.

Pros:

Wide range of tasks. All that is needed from the drive is working memory chips. It doesn't matter what condition the other elements are in.

Approach #2. Technological mode

Very often, SSD developers, in addition to implementing the operation of the drive according to the specification, also provide it with additional functionality that allows you to test the operation of individual drive subsystems and change a number of configuration parameters. Commands to the drive that allow this to be done are usually called technological. They also turn out to be very useful when working with faulty drives whose damage is of a software nature.

As mentioned above, bit errors inevitably appear in memory chips over time. According to statistics, in most cases the cause of SSD failure is the appearance of uncorrectable bit errors in service structures. That is, at the physical level all elements work normally, but the SSD cannot initialize correctly because one of the service structures is damaged. Different SSD models handle this situation differently. Some SSDs switch to an emergency mode in which the functionality of the drive is significantly reduced; in particular, the drive returns an error for any read or write command. Often, in order to somehow signal the failure, the drive changes some of its identification data. For example, the Intel 320 series returns a string with an error code instead of its serial number. The most common faults are of the “BAD_CTX %error code%” kind.

In such situations, knowledge of the technological commands comes in very handy. Using them, you can analyze all the service structures, read the drive's internal logs, and try to find out what went wrong during initialization. In fact, this is most likely why the technological commands were added in the first place: so that the manufacturer could find out why its drives fail and try to improve their operation. Having determined the cause of the malfunction, you can try to eliminate it and bring the drive back to life. But all this requires truly in-depth knowledge of the device architecture. By architecture I mostly mean the drive’s firmware and the service data it operates on. Only the developers themselves have this level of knowledge. Therefore, if you are not one of them, you must either have comprehensive documentation for the drive or spend a fair number of hours studying the model. Clearly, the developers are in no hurry to share their work, and no such documentation is freely available. Frankly speaking, I doubt that such documentation exists at all.

Currently, there are too many SSD manufacturers, and new models appear too often, and there is no time for detailed study. Therefore, a slightly different approach is practiced.

Among the technological commands, those that let you read pages of the memory chips are very useful. With them, you can read entire dumps through the drive's SATA interface without opening the SSD case. In this case, the drive itself acts as a programmer for its NAND flash memory chips. In principle, such actions should not even violate the terms of the drive's warranty.

The handlers for the technological commands that read memory chips are often implemented in such a way that bit error correction, and sometimes data decryption, can be left to the drive. This, in turn, greatly simplifies the data recovery process. In fact, all that remains is to figure out the translation mechanisms, and the solution is, one might say, ready.

Of course, it only sounds simple in words. Developing such solutions takes a lot of man-hours, and the result of all that effort is support for just one more SSD model.

But the data recovery process itself is greatly simplified! With such a utility in hand, all that remains is to connect the drive to the computer and run the utility, which will build a logical image using technological commands and analysis of the service structures. After that, only the analysis of partitions and file systems is left, which can also be a difficult task. But in most cases, the image makes it possible to restore most of the user data without much difficulty.

Cons:

Complexity and cost of development. Very few companies can afford to maintain their own development department and conduct this kind of research.
Solutions are individual: each one covers a single model.
Limited range of tasks. This approach is not applicable to all drives. The SSD must be physically intact. Also, rarely but still, damage to some service structures rules out recovery of user data.

Pros:

Simplicity.
In some cases, it allows you to bypass encryption. In fact, the approach to data recovery using technological commands is currently the only known way to recover data from some drives that use hardware data encryption.

Conclusion

In war, all means are good. But personally, I prefer the second approach as the more subtle tool. It is also the most promising one, since increasingly widespread hardware encryption rules out restoring information from “raw” chip dumps. However, the first approach has its own niche. By and large, these are the tasks that cannot be solved using the technological functions of the drive. First of all, these are drives with a hardware malfunction where the damaged element cannot be identified or the nature of the damage rules out repair. And it is advisable to get down to business only if you already have successful experience recovering information from a similar SSD model, or if you have information about the solution. You need to know what you will encounter: whether encryption or scrambling is used, which XOR pattern is most likely used, and whether the translator format is known (whether an image assembler exists). Otherwise, the chances of success are low; at the very least, the problem will not be solved quickly. In addition, heating harms worn-out memory chips and may cause additional bit errors, which, in turn, can add their own fly in the ointment later on.

That's all for now. Take care of yourself! And may backup protect your data!

And on LiveJournal I repost:

OCZ Vertex series disks have an unpleasant feature (possibly inherent in disks from other manufacturers) that I had to deal with.

Sometimes when the power is cut (for example, the laptop's battery runs out, or the computer freezes and you have to restart it), these drives lock themselves with an ATA password. To regain access to the SSD, you need to somehow unlock it. This is a firmware bug, so it is strongly recommended to flash every SSD with the latest firmware version immediately after purchase!
I didn’t reflash my disk, and this is exactly what happened to my Vertex 450: the computer froze, I rebooted it, and the disk was locked. After that, nothing could be done with the disk, not even formatting. Googling turned up no good information; it all boiled down to the advice that you can try OCZ Toolbox, and it might help. It did not help. Even trying to run secure erase in this toolbox did not help at all: the disk does not let you do anything with it. The only alternative is to return the disk under warranty. This is a warranty case, and in response to such complaints on the OCZ forum people are advised simply to send the disk in under warranty, and everything will be OK. But first, I did not feel like hauling the disk anywhere, and second, it was interesting to solve this problem myself (and today, not whenever the warranty service gets around to it).

What saved me was googling information on the hdparm utility for Linux. How I came across this utility is a completely different story, but that doesn’t matter.

2. Burn the image to a CD/DVD disc.

3. Reboot the computer and, if there are drives other than the one SSD, disable all other hard disks in the BIOS, but leave the CD/DVD drive enabled, of course.

3. Boot from the Ubuntu disc and select Live CD mode (“Try Ubuntu”).

4. Click the button with the Ubuntu logo in the top left corner, type terminal there, and launch Terminal from the programs found.

5. Enter the command

sudo hdparm -I /dev/sda

6. Read the command output, there will be something like this:

Model Number: OCZ-VERTEX450

We need to make sure that this is the right disk, and this is it. Ok, let's move on.

7. At the very end of the command output we look for this:

Security:

supported
enabled
locked
not frozen
not expired: security count
not supported: enhanced erase
Security level high

We are interested in “locked”: that is where the problem lies; it should say “not locked”! This means the disk really is locked.

8. Unlock the disk with an empty password:

sudo hdparm --security-unlock "" /dev/sda

Here "" is two double quotes with nothing between them, that is, an empty password. I don’t know how it is on other drives, but on the Vertex 450 the empty password worked.

9. Run sudo hdparm -I /dev/sda again.
We see:
Security:
Master password revision code = 24519
supported
enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
Security level high
Everything is ok, “not locked”!

10. Now we disable security (so far we have only entered a password to gain access), so that after a reboot everything will be fine:

sudo hdparm --security-disable "" /dev/sda

11. Now download the OCZ Toolbox utility and use it to update the SSD firmware: http://ocz.com/consumer/download/firmware

Under Ubuntu, this is easy to do by downloading the archive for Linux from the link above, unpacking it to your desktop and entering the command:

sudo ~/Desktop/OCZToolbox

The firmware update should succeed, after which the computer should boot with this disk and everything should work without problems. The whole job takes 10-20 minutes!
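The check made in steps 7 and 9 can also be scripted. Below is a minimal Python sketch that looks for the “locked” / “not locked” line after the “Security:” header in text captured from sudo hdparm -I /dev/sda. It assumes the output has the shape shown above; the function name is hypothetical.

```python
def is_locked(hdparm_output):
    """Return True or False for the ATA security lock state, None if unknown."""
    in_security_section = False
    for raw_line in hdparm_output.splitlines():
        line = " ".join(raw_line.split())  # collapse tabs and repeated spaces
        if line == "Security:":
            in_security_section = True
        elif in_security_section and line == "locked":
            return True
        elif in_security_section and line == "not locked":
            return False
    return None

sample_before = """Security:
    supported
    enabled
    locked
    not frozen
"""
sample_after = sample_before.replace("    locked", "    not locked")
locked_before = is_locked(sample_before)
locked_after = is_locked(sample_after)
```

Returning None when no Security section is found keeps the "could not tell" case separate from a definite "not locked".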

December 3, 2011 at 12:32 pm

Recovering a damaged SSD

I thought this topic might make some SSD owners think about backups, make others treat their drives more carefully in general, and spare yet others from dealing with a not-too-hasty support service. Everything written here applies not only to devices of the same series and manufacturer as mine.

About 10 days ago I happened to leave a laptop with a battery in critical condition without charging overnight. I'm not too worried about battery life, but the blow came from the other side. In the morning, when I plugged in the laptop and turned it on, I was surprised to find that:

The hard drive was detected in the BIOS. Grabbing the Ubuntu Live CD on a flash drive that was at hand, and armed with the command line, I got ready to debug.
It’s worth saying right away that in cases of such failures, it would be more convenient to use some kind of Data Rescue Live CD, with diagnostic utilities already installed, instead of a completely unnecessary office suite, but nonetheless.

Let's collect an arsenal that will be useful to us:

$ sudo apt-get install hdparm partx smartmontools

Let's see what happened to us:

$ sudo partx -s /dev/sda
partx: /dev/sda: failed to read partition table

So, it seems like you can say goodbye to the partition table.

$ sudo smartctl -s on -d ata -A /dev/sda -T verypermissive
smartctl 5.41 2011-06-09 r3365 (local build)
Copyright 2002-11 by Bruce Allen,
SMART support is: Unavailable - device lacks SMART capability.
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Enable failed: Input/output error

I/O error? The drive doesn't support SMART? This is already some kind of nonsense.

$ sudo hdparm -I /dev/sda
ATA device, with non-removable media
Model Number: INTEL SSDSA2CW080G3
Serial Number: BAD_CTX 00000150
Firmware Revision: 4PC10302
…
Configuration:
Logical max current
cylinders 16383 16
heads 16 16
sectors/track 63 63
-
CHS current addressable sectors: 16128
LBA user addressable sectors: 156301488
LBA48 user addressable sectors: 156301488
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
device size with M = 1024*1024: 76319 MBytes
device size with M = 1000*1000: 80026 MBytes (80 GB)

Yeah. You can see that the number of notional SSD cylinders has dropped about a thousandfold (from 16383 to 16), and according to the desktop GParted the size of the hard drive is 8 MB (I confess that the console command and its output showing this disgrace were not saved in the logs, so please take my word for it). The serial number is missing, and in its place is something like BAD_CTX. Okay, the symptoms are clear, so it is time to turn to search and to support. Indeed, it turns out the problem is far from isolated, but, alas, I’m the only idiot with Linux.
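The figures in the hdparm output above can be cross-checked with a little arithmetic. This short Python sketch uses only numbers taken from that output; the helper name is hypothetical.

```python
SECTOR_SIZE = 512  # "Logical Sector size" reported by hdparm

def chs_capacity_bytes(cylinders, heads, sectors_per_track):
    """Capacity addressable through a cylinders/heads/sectors geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE

# "current" column: 16 cylinders, 16 heads, 63 sectors per track
current_sectors = 16 * 16 * 63
current_bytes = chs_capacity_bytes(16, 16, 63)

# "LBA user addressable sectors: 156301488"
lba_bytes = 156301488 * SECTOR_SIZE

print(current_sectors, current_bytes / 10**6)
print(lba_bytes // 10**6, lba_bytes // 2**20)
```

The current geometry gives 16128 sectors, matching the "CHS current addressable sectors" line, which is about 8.26 MB and fits the 8 MB that GParted reported. The LBA figure gives 80026 and 76319, matching the two "device size" lines.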

In short, for those unfamiliar with the language and for the lazy: forum users report that all Intel SSDs are widely susceptible to this bug, with the 320 series and X25M affected most. There is news of firmware 0362, which is supposed to eliminate this particular bug, but the number of reports from people already running this firmware with the same symptoms indicates that the problem has not been resolved. Yes, the best solution in this case would be to send the drive back to Intel, so that they have an incentive to correct their mistakes.

Unfortunately, Intel support is not very prompt: it replies about once a day, obtusely on technical issues, and strongly recommends installing their SSD Toolbox to diagnose the problem. I would like to note separately that a major segment of SSD users are MacBook owners who, like me, have difficulty installing Windows software. It is worth special mention that this tool, designed for diagnosing faults, requires:
- Java
- .NET 3.5
- Windows Media Player Redistributable 11
which makes installing it on a computer booted from a Live CD an almost impossible task (first, because of the capacity limits of the virtual disk, and second, because WMP 11 requires Windows validation, which only a few particularly outstanding individuals manage to get through under Wine, with much creaking and groaning).
Warm greetings to the developers of this software.
I miraculously managed to explain the situation to support, and they agreed to a replacement, but for a replacement I need to fill out an incredible number of forms, to which I also need to attach confirmation of my purchase of the device. As fate would have it, I am now ten thousand kilometers from home, and I did not expect such a catch.

Fortunately, everyone on the forums agrees: the contents of the disk cannot be restored, but the drive's functionality can. And the time spent on correspondence with the support service was not wasted; it was put to good use reading forums and experimenting, and the brief results are given here.

It is necessary to restore the number of cylinders, returning the treasured 16383.
For this operation we will need two commands, both of which are guarded against fools and vandals, so neither can be run by accident.

We set the user and password for master operations on the disk.

$ sudo hdparm --user-master user --security-set-pass abc /dev/sda

Next, we need to unlock an extended set of ATA commands, in particular secure-erase, which are blocked when the system boots. There are several ways to do this, one of which is to turn the external box off and on. I didn’t have an external box, but sending the laptop to sleep and waking it up miraculously worked.

The following commands perform a secure erase. I ran both, since I wasn’t sure which one would be needed. Before each one, I set the password again and closed and reopened the laptop lid.

$ sudo hdparm --user-master user --security-erase abc /dev/sda
$ sudo hdparm --user-master u --security-erase-enhanced abc /dev/sda

Now for something that is not for the faint of heart: resetting the disk settings to factory defaults. Running the command requires one more flag, which the command line itself will tell you about; to keep my conscience clear I will not give it here, and I will only mention that the documentation marks this command as ESPECIALLY DANGEROUS and DO NOT RUN.

$ sudo hdparm --dco-restore /dev/sda

The conclusions I drew for myself:
- keep a Live CD handy
- do not leave the laptop completely without power at a critical charge
- make backups, including keyrings, lists of installed packages, configs, and RSA keys
- update the firmware (after you have made sure that it definitely works well)
- take care of your nerves

I would also like to note that this method does not always completely restore functionality, and that sometimes the disk remains glitchy and slow.

Once again I send my warm regards to Intel support and inform them that I still cannot log in to their community with my username and password to publish this miraculous recipe there, and I remind them that I have been waiting a week for at least some answer as to why I cannot do this.

In the next topic I will tell you about interesting statistics on SSD deaths, returns, repairs and errors in work by manufacturer and model.

PS Dear Habr, please correct the display of the “code” tag.
PPS Found it by chance