"scrubbing" below the FTL

Hi,

When implementing a persistent store in FLASH technology, how do you "scrub" below the FTL? I.e., is this even possible without windows *through* the FTL? Or, support for a "scrub" operation exported by the FTL?

Building the example from more conventional media (e.g., magnetic): ideally, any "sectors" (let's ignore blocking for the moment) that are written in a file are effectively scrubbed *by* the write operation itself. I.e., the new data overlays the old.

An "OS" can ensure this remains the case even if the file being written doesn't completely "fill" some integral number of sectors (pad the balance with zero and/or random noise).

Similarly, it (the OS) can protect against an application leaving the tail of a file "exposed" (e.g., by explicitly overwriting any portions of the PREVIOUS FILE CONTENTS that have been snipped off the tail -- through a truncate() or operations on the directory itself (changing the file length)).

So, a careless *or* clever application can't see "leftovers" from a file as it existed in an earlier state (perhaps before the current actor had permission to look at it!)

Or, if that particular "block" on the medium happens to eventually end up being reused in some *other* file for which another actor has full rights, etc.

In the degenerate case, you can "wipe" the entire media to be sure everything is "gone". Every "sector" is explicitly accessible. You could craft an algorithm that scrubs a single file, all free space or the entire medium. Predictably!

Now, throw in the FTL and the management functions it adds. When I write a "sector", in reality, I may be writing a different physical sector chosen by the FTL based on wear-leveling criteria, etc. The "stuff" in the sector previously still exists -- the FTL has just moved it aside and found a "better" sector for me to use for this "new data". I.e., I can't force *that* "old" sector to be overwritten!

I.e., how do COTS SSD users scrub *portions* of their disks? (Presumably thermite works when the object is to scrub the ENTIRE disk!) Is there a "scrub unused areas" API? Or, "scrub the blocks currently mapped to this sector"?

Thx,

--don

Reply to
Don Y

It's tricky. About the best you can do, absent special support, is to create enough files (full of nulls or random data or whatever you think will work best) to fill up the rest of the SSD (as seen by the OS), and then delete them. This won't catch any "spare" sectors that the FTL uses, and has other problems, but it should get most of it.
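A rough sketch of that fill-then-delete trick, assuming a POSIX host and a hypothetical mount point (/mnt/ssd). As noted, it still misses the spare/over-provisioned areas:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

#define MOUNTPOINT "/mnt/ssd"     /* hypothetical mount point of the SSD */
#define CHUNK      (64 * 1024)    /* write in 64 KB chunks */

int main(void)
{
    static unsigned char buf[CHUNK];
    char name[256];
    int i, n = 0;

    int rnd = open("/dev/urandom", O_RDONLY);
    if (rnd < 0)
        return 1;

    for (i = 0; ; i++) {                    /* create junk files until the disk is full */
        snprintf(name, sizeof name, MOUNTPOINT "/fill.%d", i);
        int fd = open(name, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd < 0)
            break;
        ssize_t wrote = 0, w;
        while (read(rnd, buf, CHUNK) == CHUNK &&
               (w = write(fd, buf, CHUNK)) > 0)   /* stops on ENOSPC */
            wrote += w;
        fsync(fd);                          /* push the data out to the medium */
        close(fd);
        n = i + 1;
        if (wrote == 0)                     /* last file took nothing: we're full */
            break;
    }

    for (i = 0; i < n; i++) {               /* now give the space back */
        snprintf(name, sizeof name, MOUNTPOINT "/fill.%d", i);
        unlink(name);
    }
    close(rnd);
    return 0;
}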

UCSD has done some research on this - see

formatting link
and in particular
formatting link, where they wrote known data to drives, tried various "erase all" strategies, and then pulled the flash chips off of the drives and checked them to see what was still there.

Then you have to trust the FTL. At least one of the drives tested in the paper above said it had a "secure erase" function, so they invoked it, and the drive said "OK, done". On testing the flash chips, they found that the sole effect of the "secure erase" command was to say "OK, done" - all the data was still there!

If the stuff that's on the SSD got there by traveling through a major Internet backbone in the United States, even using thermite on the SSD is probably futile - No Such Agency still has a copy of the file. :)

Matt Roberds

Reply to
mroberds

Wait, I thought Flash can only be erased in blocks. Doesn't the current block get set aside and a new block used for the new data? The old block is then erased at some point in time when it is needed, e.g., no more free blocks, or when a TRIM is sent for SSDs. I guess if you had low-level control and managed the blocks yourself you could erase them.

Cheers

Reply to
Martin Riddle

Basically yes, but if you can force the equivalent of a garbage collection or compaction action on the SSD, all of the "old" sectors will end up in erasable blocks. TRIM doesn't help this*; it's just a quick way to release a now-unused area of the disk.

*There is a good case to be made that TRIM doesn't really help at all**, and in the few scenarios where it might, a tiny bit of smarts on the controller and a write of a zeroed sector from the host would be just as effective. The widespread buggy implementations of TRIM (now considerably better) didn't really help the case either.

**For example, in many cases the TRIM is followed in fairly short order by an actual rewrite of the sectors, which makes the TRIM just pointless overhead.
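For what it's worth, on Linux a host can issue a discard (TRIM) for a byte range from user space with the BLKDISCARD ioctl. A sketch (the device path is illustrative; the range generally needs to be sector-aligned, and a discard is only a hint -- it does not promise the old data is unreadable):

#include <stdint.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>        /* BLKDISCARD */

int discard_range(const char *dev, uint64_t offset, uint64_t length)
{
    uint64_t range[2] = { offset, length };   /* start, length in bytes */
    int fd = open(dev, O_WRONLY);
    if (fd < 0) { perror("open"); return -1; }
    if (ioctl(fd, BLKDISCARD, &range) < 0) {  /* issue the discard (TRIM) */
        perror("BLKDISCARD");
        close(fd);
        return -1;
    }
    close(fd);
    return 0;
}

/* e.g. discard_range("/dev/sdb", 0, 1ULL << 30) discards the first 1 GB;
 * afterwards the old data may still be physically present on the flash. */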
Reply to
Robert Wessel


That's the essence of my question -- what should that "special support" look like in the FTL (which I will have to write).

Then, how to present those "scrub" semantics to the file system API (i.e., *through* the OS) to, ultimately, the application & user.

E.g., adding an "fscrub(2c)" seems heavy-handed...

It won't deal with over-provisioning. And, it also doesn't ensure the portions of the media associated with file system *housekeeping* get scrubbed! (how do you know when space set aside for directory structures has effectively been scrubbed, etc.)

And, to deal with scrubbing *one* file, it seems like overkill: fill the medium -- so there is no place for the file to hide (i.e., be relocated to by the FTL!) -- then erase the file and fill the space this has created. Then, *free* all of this space, etc.

I.e., I could easily implement a "scrub" function on a magnetic medium and do so without buggering the standard API. And, for a modest cost. But, with an FTL interposed, that becomes problematic -- even if I write the FTL myself (because the FTL has a different goal, in essence: it wants to KEEP me from accessing the same physical block "again").

Ah, OK. I'll give them a read. But, that basically tests how well drive manufacturers have implemented their FTLs? They typically are more concerned with *retaining* existing data and increasing media longevity (esp wrt SSD's!) than ensuring erasure! :<

I had to reread this a couple of times as it didn't make sense the first few attempts. Now I see that it actually *doesn't* make sense! I.e., as if the controller manufacturer's "hadn't got around to" implementing that feature -- yet! :<

I'm not worried about spooks or folks microprobing die, etc.

Rather, I am concerned about "sleight" in OS/application implementation that leads to unintended data "leakage" (via exploits). And, given we tend *not* to be talking about "mainframes" here (where the hardware resources tend to be physically access controlled), it is far more likely that a user ("possessor") of such a device may be inspired/motivated to see what remnants of data remain accessible after they were *thought* to have been "erased".

[E.g., we would regularly encounter donated computers with personal information still accessible (i.e., not even "erased"). And, very few devices that had been *securely* erased (i.e., that you couldn't easily recover with OTS tools). OTOH, some devices arrived as if they had been subjected to strong magnetic fields (servo tracks on media were apparently absent) while others showed evidence of *holes* drilled through the platters (i.e., a forensic lab might be able to recover portions of the data but no one "less motivated"!)]

If the medium is always accessed via the "intended" software, then it (devoid of exploits) can refuse to expose "erased" data -- even if the media hasn't actually been scrubbed!

OTOH, if the "current possessor" can install other software that ignores those "conventions" (e.g., root-ing a device), then anything not ACTUALLY scrubbed is vulnerable.

E.g., you might want to be able to leave the vast majority of a medium's contents "as is" -- the software that gives the device its functionality, for example -- yet *scrub* (not just "erase") any user-specific data from the device before passing it on to another owner/user (who will be able to do whatever he wants with that hardware now in his possession!).

It would be nice if applications could "register" data that is of a sensitive nature (e.g., "My Documents") so that a mechanism (or even a CONVENTION!) could be used to ensure all this stuff goes away when intended. I.e., without having to purge (thermite) the entire medium!

[Imagine how you would handle donating a computer with a SSD to an unknown "new user" -- one who, perhaps, buys them allegedly for their "scrap" value but, in fact, is "data diving" and looking to score big on someone else's ignorance ("Jackpot! An unmolested copy of last year's tax return..." or "Ha! He *thought* he had 'erased' it... little did he know how easily *I* can recover it given the tools in which I've invested!")]
Reply to
Don Y

Yes. A flash device consists of some number of "Blocks" each of which contain some number of "Pages" -- of some "Size".

Reads and Writes happen in units of Pages. Erases happen in units of Blocks.

Note that the file system may choose to deal with units of "buffers" (trying to avoid using the term "blocks", again) or "sectors", etc.

And, an application may be interested in "records" of some other size entirely.

I.e., mapping between what the application wants to deal with and the constraints imposed by the hardware requires dealing with several different sized "units" in any given transaction (as well as possibly requiring several different *actions* therein!)

E.g., page sizes of 512B, 2KB, 4KB, etc. are common. Block sizes can range from 16KB to 1MB or more! Erasing is, thus, very "precious"!
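Just to pin down the vocabulary, a sketch of that hierarchy in C (the numbers in the comments are illustrative only):

#include <stdint.h>

struct flash_geometry {
    uint32_t page_size;        /* program/read unit, e.g. 512 B, 2 KB, 4 KB */
    uint32_t pages_per_block;  /* e.g. 64 or 128 */
    uint32_t block_count;      /* number of erasable Blocks on the die */
};

static inline uint32_t block_size(const struct flash_geometry *g)
{
    return g->page_size * g->pages_per_block;   /* erase unit, e.g. 16 KB .. 1 MB+ */
}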

(assuming you mean "Block" in the above sense and not "block" in the more traditional "512B" sense...)

No, because there are lots of Pages in that Block. Instead, the Page(s) in question are "discarded" (marked as dirty) and new Pages are found to hold the new data. References (pointers) are updated to reflect the moved location of the new data (in the "original file") and the Pages holding the stale data are remembered as ready to be collected.

Of course, if there are other Pages currently "in use" in the Block that contains those Pages, the Block can't yet be erased (well, it *could* but you would first have to move the "in use" Pages to some other location -- possibly even *here* after the erase is complete -- before you could perform the erase)

Keep in mind that you are also trying to keep track of how many cycles a particular Block has experienced so you can spread the "work" around the die. (you also have to watch the number of *reads* encountered in "nearby" Blocks as they can corrupt the data in *your* Block, etc.)
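A minimal sketch of that remap-on-write, with hypothetical FTL structures (a real FTL also has to journal the map, survive power loss, handle ECC, etc.):

#include <stdint.h>

#define INVALID_PAGE 0xFFFFFFFFu

struct ftl {
    uint32_t *l2p;           /* logical sector -> physical page map */
    uint8_t  *page_state;    /* 0 = erased, 1 = in use, 2 = dirty (stale) */
    uint32_t *block_erases;  /* wear counter per Block */
};

/* assumed low-level helpers (not shown): */
uint32_t pick_erased_page(struct ftl *f);                 /* wear-aware allocator */
void     program_page(uint32_t ppage, const void *data);  /* NAND page program */

void ftl_write_sector(struct ftl *f, uint32_t lsec, const void *data)
{
    uint32_t old = f->l2p[lsec];
    uint32_t fresh = pick_erased_page(f);   /* "better" Page chosen by wear leveling */

    program_page(fresh, data);              /* new data lands in a NEW physical Page */
    f->l2p[lsec] = fresh;
    f->page_state[fresh] = 1;

    if (old != INVALID_PAGE)
        f->page_state[old] = 2;             /* old data is only MARKED stale -- it is
                                               still physically present until its
                                               containing Block is erased */
}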

And, you also have to remember that not all writes require a "new Page"!

E.g., if a particular Page happens to be used at the end of a file, it will typically NOT be fully utilized. File_Size % Page_Size bytes will be used in that Page with the rest "unused".

But, you have to specify *some* value for all of those bytes because you have to write the entire page in one operation!

(assume '1' is the erased state of the memory cells)

One approach is to pad the balance of the Page with 16rFF. This allows a subsequent *append* operation to OVERWRITE the contents of the same Page without forcing all the above mechanism into place! I.e., the first File_Size % Page_Size locations in the Page are unchanged, so they are rewritten with their same values (a safe bet). And, some number of the locations *following* this, that had previously been written as 16rFF, now see new values -- which can overwrite the "FF"s.

[Keep in mind that you could conceivably have part of the Page holding one file while another part of the Page holds another(s)! It boils down to how finely you want to manage Pages, Blocks, subPages ("buffers" in the above), etc. And, the usage pattern you expect to see! So, a "Page Update" -- like the append I described -- is really: copy old Page contents into memory, modify contents taking care never to cause a 0 bit to become a 1 bit, write Page back into store.]
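A sketch of that in-place append (NOP permitting; the read_page()/program_page() helpers are assumed, not a real API):

#include <stdint.h>

#define PAGE_SIZE 2048   /* illustrative */

void read_page(uint32_t ppage, uint8_t *buf);
void program_page(uint32_t ppage, const uint8_t *buf);

/* Append 'len' bytes at byte offset 'tail' of physical page 'ppage'.
 * Works only because the bytes past 'tail' were written as 16rFF (the
 * erased value), so no bit has to go from 0 back to 1. */
int page_append(uint32_t ppage, uint32_t tail, const uint8_t *data, uint32_t len)
{
    uint8_t buf[PAGE_SIZE];

    if (tail + len > PAGE_SIZE)
        return -1;                     /* spills into the next Page -- not handled here */

    read_page(ppage, buf);
    for (uint32_t i = 0; i < len; i++) {
        if (buf[tail + i] != 0xFF)     /* would need a 0 -> 1 transition: not allowed */
            return -1;
        buf[tail + i] = data[i];
    }
    program_page(ppage, buf);          /* rewrite: old bytes unchanged, FFs overwritten */
    return 0;
}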

Now, look at the opposite of the append -- a truncate! Lop the end off the existing file. Here, some number of the File_Size % Page_Size bytes in the original last Page of the file will be discarded.

The *cheap* way to do this is just to update housekeeping to claim the truncated file size as the *new* file size -- i.e., ignore those bytes beyond the *new* end of the file (even though their previous values remain intact -- they just can't be "seen" if everything behaves!)

You can keep truncating in this way until the file's size is "0".

However, any attempts to append to a file of this sort will require the last page to be discarded -- because the appended data won't (typically) coincide with the OLD data that is already present there and a fresh Page will be required.

Another option for truncate is to overwrite the tail end of the data with "0" -- a safe bet regardless of whatever data may have been there! This costs more than the first scheme because a Page update is required. But, the data-to-be-discarded (truncated) that *was* there is definitely *gone*! The same "subsequent append" problems exist.

Yet another option for truncate is to create a new Page, copy the data that is to be preserved into it filling unused space with 16rFF. This is the *most* expensive option (it could lead to a Block erase, etc) but leaves the Flash contents in the same state as if the file had originally been written in this truncated form (i.e., a subsequent append *only* requires a Page update!)
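A sketch of the second option -- zeroing the discarded tail in place -- using the same assumed helpers (and assuming the device tolerates the partial-page reprogram; see the NOP discussion later in the thread):

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 2048

void read_page(uint32_t ppage, uint8_t *buf);
void program_page(uint32_t ppage, const uint8_t *buf);

/* 'keep' = bytes of the last Page that are still part of the file. */
void truncate_scrub_tail(uint32_t last_ppage, uint32_t keep)
{
    uint8_t buf[PAGE_SIZE];

    read_page(last_ppage, buf);
    memset(buf + keep, 0x00, PAGE_SIZE - keep);  /* clobber the discarded bytes */
    program_page(last_ppage, buf);               /* 1 -> 0 only: always allowed */
}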


This can be an ongoing background task or a batch "GC" cycle that is triggered when needed. It's a performance question related closely to usage patterns and the hardware topology of the Flash array (i.e., can I be erasing one Flash chip while still *reading* -- or writing -- other Flash chips in the array? Or, does everything grind to a halt while erasing? etc)
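A sketch of such a GC pass, with hypothetical bookkeeping (victim selection, wear, and scheduling are where the real decisions live):

#include <stdint.h>

struct block_info {
    uint32_t live_pages;   /* Pages still mapped by some logical sector */
    uint32_t dirty_pages;  /* stale Pages waiting for erase */
};

uint32_t pick_victim(const struct block_info *b, uint32_t nblocks)
{
    uint32_t best = 0;
    for (uint32_t i = 1; i < nblocks; i++)
        if (b[i].dirty_pages > b[best].dirty_pages)
            best = i;                  /* most reclaimable space per erase */
    return best;
}

/* gc_one_block(victim):
 *   for each live Page in victim: copy it to an erased Page elsewhere,
 *                                 update the l2p map;
 *   erase(victim);   <-- only HERE does the stale data actually disappear */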

REGARDLESS, until the block is erased, the stale data persists in it!!

Note that scrubbing just has to make the stale data irretrievable. You can ERASE it *or* OVERWRITE IT! As you can always do a Page update with "0" for any portion of the contents, you can safely write 0's to scrub the data *before* enqueuing the Page (and its containing Block) for erasure!

So, you could implement a "scrub" operation that is visible to the application USING THE TRADITIONAL API by adopting the semantics that rewriting an existing file causes the PHYSICAL REPRESENTATION of that file to be rewritten!

Scrubbing files *selectively* becomes a matter of rewind()-ing the existing file and then write()-ing 0's for the current file length. Then, unlinking the file (so it can be GC'ed)
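A sketch of that, using nothing but stdio (and assuming the FTL/file system underneath honors the "rewrite means scrub" semantics being proposed -- which is the whole point):

#include <stdio.h>

int scrub_file(const char *path)
{
    FILE *f = fopen(path, "r+b");
    if (!f) return -1;

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);                                 /* back to the start, as described */

    char zeros[4096] = {0};
    for (long done = 0; done < len; ) {
        size_t n = (len - done) < (long)sizeof zeros ? (size_t)(len - done)
                                                     : sizeof zeros;
        if (fwrite(zeros, 1, n, f) != n) {     /* report scrub failures to the caller */
            fclose(f);
            return -1;
        }
        done += n;
    }
    fflush(f);
    fclose(f);
    return remove(path);                       /* unlink so the space can be GC'ed */
}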

Scrubbing the entire medium would require writing files until full, then doing this for each such file. (housekeeping records need special consideration with this as a goal!)

[But, you have to keep this goal in mind throughout the implementation. You can't retrofit it, as things can go wrong in parts of the process that shouldn't interfere with that goal. And, you need a way to report any failures in this (fwrite error?) so the application can sort out how it wants to react -- reread the file to see what *didn't* get scrubbed? etc.]

It seems like avoiding changes to the API would be desirable (?) Exposing low level functions opens a whole new can of worms (how do you ensure the actor isn't scribbling on a portion of the media that is currently "owned" by another application?)

I'm just not sure that you can handle all the potential errors that can creep up *in* the FTL in a way that you could recover from or intelligently inform the application.

[And, there's still no way to deal with overprovisioning!]
Reply to
Don Y

This can be very expensive and actually *increase* wear on the device! It's akin to folks who think they "need" to defragment their disks *daily*!

To recover as many blocks as possible, any data present in *partial* Blocks (where some other portion of the Block is "dirty") have to be harvested and moved elsewhere in the store. This could end up moving a *lot* of data (depending on how aggressive the implementation is and the granularity with which it manages the store). I.e., hindsight could reveal that some Page could have remained where it was indefinitely -- or, may have been deleted in the immediate future instead of having to *move* it NOW (incurring that time penalty AND the wear that it places on its "new location") just to free up its containing Block.

[What happens if you can't create a reliable *copy* of it! *gasp*]

I think it would be hard to design an effective/optimal Flash management algorithm without some idea of usage patterns and/or *informed* control "from above" (like an IT department *knowing* when defragmenting a disk will yield results in excess of its cost!)
Reply to
Don Y

I would imagine that rewriting a file of the same length generates new pages and blocks at the hardware level. Though, the algorithms may be written to allow an overwrite vs. creating new pages and blocks. Only the vendor/chipset mfg would know. An SD flash card may behave more like a HD depending upon the mass storage driver. (Originally I thought you were referring to Flash SD cards.)

Yeah, over-provisioning means that the hardware is now in control of what part of the flash is used. I imagine the flash is pretty fragmented after a few hundred MB of writes.

I've had an ongoing experiment with a Plextor 256GB SSD on an IDE adaptor. The latest GC seems to work well without the SATA SSD support, but I am over-provisioned by about 60GB.

Cheers

Reply to
Martin Riddle

If I had to *guess*, that's what I would assume! Most of the time, you are writing something different and *NOT* "0000...." so it seems most expeditious to toss the original pages and start fresh.

Ideally, one would look at the new contents for the Page and see if they are "compatible" (i.e., no 0's being turned into 1's) and, if so, overwrite the old with the (compatible) new.

In reality, this would be very expensive and seldom (?) pay off.

OTOH, it wouldn't be as hard to verify that a new page consists *entirely* of zeroes and, as such, is compatible with *any* existing Page contents -- "update in place".
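The test itself is cheap -- a sketch (a page update is "compatible" iff no bit would have to go from 0 back to 1):

#include <stdint.h>
#include <stddef.h>

int page_update_compatible(const uint8_t *olddata, const uint8_t *newdata, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (newdata[i] & ~olddata[i])   /* some bit would need 0 -> 1: not possible */
            return 0;
    return 1;                           /* all-zero new data passes against anything */
}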

Exactly. OTOH, if you're buying raw Flash chips, you're already exposed at a much lower level than most designs!

No, sorry. I was just talking about Flash, in general (i.e., as a technology). Prepackaged Flash devices already have a fair bit of hand-waving going on inside the devices.

I think it would depend on the nature of those writes compared to the granularity of the "managed units" within the store. I.e., if you're always writing block sized objects, then management is relatively trivial! :>

Yeah, the scrubbing issue means that 60GB is a "challenge". Unless you know how and when it can be *called* into play (so you can get your paws on it!), you're just shooting blind. Esp if the algorithms managing it are REALLY GOOD or REALLY BAD! (assuming they are implemented correctly!)

Reply to
Don Y

Even if flash chips are free, if you spend more than about a day writing an FTL, you'll lose money vs. going to newegg.com and buying some SSDs. If your requirements are so unique that you can't use a COTS SSD, then knock yourself out. :) I know that if you buy (or intend to buy) enough flash chips, the "name" vendors (like SanDisk) have applications support people who have probably seen these questions before. Delkin builds industrial SSDs in the USA; I haven't dealt with them but I know they exist, and perhaps they might be someone to talk to as well.

ioctls are the traditional "uh... I need some way to set that, that *doesn't* look like writing to a file" on Unix. I agree it would be a little clunky, but I can't quickly think of a better way. Things like shred(1) are implemented on top of the existing filesystem API, rather than as part of it.
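For illustration only, a sketch of what such an ioctl hook might look like -- the request code, struct, and semantics here are entirely made up; nothing like this is standard:

#include <stdint.h>
#include <sys/ioctl.h>

struct scrub_req {
    uint64_t offset;     /* start of the region within the file */
    uint64_t length;     /* bytes to scrub (0 = whole file) */
};

#define FS_IOC_SCRUB  _IOW('f', 0x42, struct scrub_req)   /* made-up request code */

int fscrub(int fd, uint64_t off, uint64_t len)
{
    struct scrub_req r = { off, len };
    return ioctl(fd, FS_IOC_SCRUB, &r);   /* file system + FTL would both have to
                                             cooperate to make this meaningful */
}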

Pretty much. One of the questions they were asking was "how do I, as a user, 'secure erase' an SSD, even with the FTL possibly working against me?"

It makes perfect sense to me, having seen (and heard of) the firmware in various SSDs and rotating drives doing odd things.

I've seen that before. At a previous employer, we got some used PCs that came from a business that was owned by one of our investors that had been closed down. There was stuff still on those hard drives that should not have been there.

This is also a good argument for putting your OS on one drive/partition and all your user data on another drive/partition, although it's not easy to make Windows completely respect this split. On Unix, if /home is on its own drive or partition, scrubbing that goes a long, long way towards removing all the user data.

Part of this is driven by the cost of SSDs. Hard drives used to be this way; they were expensive enough that taking one out of the machine would make the machine hard to sell or donate, so people wanted to wipe them without damaging them, so they could still sell the complete machine. High security users didn't care; they'd scrap the HDs and accept a lower price when selling the machine. Now, hard drives are cheap enough that many more potential customers will accept "here, you can have this computer (or buy it cheap), but you'll have to bring your own hard drive." Once SSDs get as cheap as hard drives are now, this will probably happen again; people will keep the old SSD or have fewer qualms about thermiting it to ensure security.

Matt Roberds

Reply to
mroberds

Thanks to ECCs, generation counters, compression, and whatever else the FTL may apply to the user data, the chance that this happens is so close to zero that it doesn't even make sense to think of it.

Plus, we're talking about NAND flash normally, which does not like reprogramming pages at all.

If I were making an FTL and entirely-zero pages were a frequent use-case, I would just invent a special marker in the index data to mark "this user data page is entirely zero" without allocating a flash page at all. Much like a sparse file.

I believe typical FTLs already support that when reading (at least, mine do), to make a totally-unformatted device appear all-zero to the upper layers.
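A sketch of such a marker, with a hypothetical map encoding (an unmapped page reads back as zeros the same way):

#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   2048
#define ZERO_PAGE   0xFFFFFFFEu   /* sentinel: "this logical page is all zeros" */
#define UNMAPPED    0xFFFFFFFFu

void read_phys_page(uint32_t ppage, uint8_t *buf);   /* assumed helper */

void ftl_read_sector(const uint32_t *l2p, uint32_t lsec, uint8_t *buf)
{
    uint32_t p = l2p[lsec];
    if (p == ZERO_PAGE || p == UNMAPPED)
        memset(buf, 0x00, PAGE_SIZE);   /* no flash Page backs this sector */
    else
        read_phys_page(p, buf);
}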

Stefan

Reply to
Stefan Reuther

Exactly! Or, to notice it when it is *created*, instead (look at it *once* instead of repeatedly).

That's not entirely true. For MLC technology, NOP=1. But, for SLC -- especially with support for subpages -- NOP=4 is common. (I think some devices allow NOP=8.)

As I said originally, it boils down to how fine-grained you want to manage the device (and how fine-grained your *application* requires that management!).

In the future, I suspect you will see limited forms of "reprogramming" supported on MLC devices -- with restrictions.

Keep in mind that cells in an MLC device don't really encode two *separate* bits. Rather, they encode FOUR LEVELS that tend to be *interpreted* as two bits. There is an important but subtle distinction.

If you adopt the "two bits" interpretation, then (given that 0's can't be rewritten as 1's) the possible state transitions are:

11 (erased) -> 10 -> 00
or
11 (erased) -> 01 -> 00
or
11 (erased) -> 00

(sorry, easier to write this out than try to fabricate an ASCII-art drawing!)

I.e., once you have left the erased state, you have at most one possible change that can be made to the value represented before you end up at the "00" dead end (requiring erasure for further changes).

OTOH, if you look at the cell as a four-state device, the transitions are more numerous:

11 (erased) -> 10 -> 01 -> 00
or
11 (erased) -> 01 -> 00
or
11 (erased) -> 00

I.e., you can (conceivably) change a 10 into a 01 -- not possible if you interpret them as individual bits! If you use such cells to track the location of blocks/pages/subpages in memory/blocks/pages and always expect to move them in a fixed direction... etc.

While it seems like a pittance, consider how you would implement a large counter using each of these schemes -- without erasures! I.e., you can only represent 3 states with "2 bits".
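One classic answer to that counter exercise, for comparison, is to count in unary by clearing one erased bit per increment -- no erase needed until the region is exhausted. (This is offered only as an illustration; repeatedly re-programming the same region is more comfortable on NOR, or on SLC parts with a generous NOP.)

#include <stdint.h>

#define COUNTER_BITS (2048 * 8)   /* one 2 KB page = 16384 increments per erase */

/* count = number of bits already programmed to 0 */
uint32_t counter_value(const uint8_t *region)
{
    uint32_t zeros = 0;
    for (uint32_t i = 0; i < COUNTER_BITS / 8; i++)
        for (int b = 0; b < 8; b++)
            if (!(region[i] & (1u << b)))
                zeros++;
    return zeros;
}

/* increment: find the first byte that still has a 1 bit, clear its lowest
 * set bit, and re-program just that byte/word (a 1 -> 0 change only). */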

[Of course, you still have to worry about program disturb errors. But, for a specialized controller, that can be factored into the design so it *knows* where the disturbances manifest, ahead of time, and can be proactive in their repair and/or avoidance!]

I suspect controllers will learn to exploit the actual geometries of newer MLC devices to manipulate gate voltages instead of "binary states" in the future. (consider the advantages for TLC and X4 devices already in production!)

I suspect much of this will get buried inside eMMC. OTOH, we may find more technology standardization following along similar historical trajectory of DRAM, EPROM, etc.

For example, early on, you built a DRAM controller out of SSI/MSI. Then, LSI controllers came on the scene. Then the controllers migrated

*into* the processors. (Imagine designing a DDR2 DRAM controller out of MSI today???)

Ditto EPROM going from 3 supply voltages to one. From very high programming voltages to ISP.

Perhaps, someday, FLASH *components* will be standardized enough for (third party!) vendors to offer silicon solutions to implement each of these exploits. (alongside eMMC solutions)

In the scenario I'm discussing, you can't say, for sure, how common a "scrubbed" page will be. Or, if it will be scrubbed as a whole -- or, incrementally, in parts.

The presence of an "opaque" FTL means the designer is at a loss in determining what to expect from that FTL.

Reply to
Don Y

Rather than trying to scrub entire drives these days it might be better to use full disk encryption, and when you hand on the drive to the next user, just withhold the key.

I'm not sure that there is much security difference between magnetic disks and SSDs, provided the attacker doesn't probe the chips on the SSD and provided the SSD doesn't have backdoors around the FTL, and maybe also provided you write random junk to all logical sectors before installing your OS. If the user later overwrites a logical sector that contained secret data with zeros, the SSD can't subsequently show anything other than those zeros at that logical sector, and the original data shouldn't show up at another logical sector because the drive doesn't know that you didn't want whatever different stuff (e.g. random junk) you wrote there before. I guess the only way to get at the data would be some hardware probing or firmware hack to get around the FTL.

In the case of donating a computer to untrustworthy mere mortals, I wouldn't worry about wiping the drive, provided I had always used whole disk encryption and didn't give them the key. This would save a lot of time in the donation process, though I keep computers that are much older than anyone would accept as a donation so it is more relevant to me when drives fail or get lost.

Recently I have begun to wonder how much I would trust the built-in encryption provided in SATA drives, if someone with really serious resources wanted the data. There are only a few drive manufacturers, and there are surely some people who might lean on them or offer them incentives to not make the encryption quite as good as it might otherwise be. Those same people are probably just about their biggest single customers too, so they might have leverage that way... There are probably only at most a few people in the world who go to the trouble of verifying the security (and lack of backdoors) of the encryption built into any model of drive. Had I any really interesting data, I would like my chances better with open source software full disk encryption since that might be getting some independent scrutiny. There would be nothing wrong with using both, provided you use different keys.

Chris

Reply to
Chris Jones

On a sunny day (Mon, 25 Nov 2013 23:59:42 +1100) it happened Chris Jones wrote in :

Does not take that long to do: dd if=/dev/random of=/dev/sda

**** PLEASE DO NOT DO THIS TO TEST IT ;-) ****

Or from /dev/zero if you do not like noise.....

It won't stop a professional attacker from reading your harddisk though. A rewrite always leaves some data in side-tracks.

best is to drop the thing in an active volcano, but make sure to hack the drive to pieces before that, YNN

Reply to
Jan Panteltje

Don't worry so much - it won't get very far. Reads from /dev/random will block when it runs out of entropy so you won't lose more than a sector or two of the drive before you run out of patience. Reading from /dev/urandom will go a lot faster, however.

This will work on a harddrive, but not necessarily on an SSD. In particular, many SSD controllers will compress data - and a write from /dev/zero is very compressible. This will leave a lot of old real data on the flash chips (which are difficult to read directly, but certainly not impossible). Writing from /dev/urandom will give incompressible data, but you are never guaranteed to write over the overprovisioning on the SSD - so some real data may remain on the SSD.

But if you assume the new user of the drive is a "normal" person and is not going to spend significant sums of money scraping data off the disk, then copying from /dev/urandom is fast, simple, and pretty secure.

On the other hand, a "secure erase" from the manufacturer should be faster, simpler, and just as secure. (Note the "should be" - it assumes a reasonable implementation.) "Secure erase" on an SSD should simply mark all used blocks as garbage, ready to be erased and re-used.

Some disks implement transparent encryption on the disk. "Secure erase" simply erases the encryption key, and is thus very secure and very fast. (And on an SSD, it will also mark all used blocks as garbage.)

This is a myth. Once data on a harddisk is overwritten, the old data is gone without a trace. The myth is kept alive by people selling totally useless software to overwrite disks multiple times.

Any data that is written to a sector that is later re-located will still be readable with professional disk recovery equipment. But such relocations are rare - perhaps a dozen sectors on an old, heavily used disk, and only exist because the sector cannot be read properly. So it is only an issue for the most paranoid of users.

That would work, of course, but it's a little impractical. A screwdriver and a hammer will do a perfectly good job if you need to be sure.

Reply to
David Brown

On a sunny day (Mon, 25 Nov 2013 15:04:38 +0100) it happened David Brown wrote in :

You are 100% right.

Not so sure; I did read a quite professional article many years ago, with pictures, data, and examples; they made the tracks visible too.

:-)

The whole issue with FTL (not Faster Than Light) is a very interesting one.

Another issue I have found is burning DVDs, where sectors that actually contain private data may accidentally be copied to the disc (within your movie file).

Reply to
Jan Panteltje

That would work for the "PC" example -- if the drive could still be reused with a "changed" key (i.e., previous contents obfuscated but NEW contents -- under the new key -- accepted). So, the original owner need only remember to "erase" the key from the BIOS screen prior to donation (for example).

[this assumes key is recorded in NOR Flash/BBSRAM or some more readily "eraseable" medium!]

But, for a *product* in which much/all of the product's functionality is embodied in the "software" contained on that drive, you've now rendered the entire device "useless". Imagine if you could erase the FLASH in your cell phone "just to be safe". The phone now has very little value as a paperweight!

[PC's, of course, are ROUTINELY reinitialized "from scratch" -- as damn near every MS "problem solution" is either: "reboot" or "reinstall windows"! :-/ ]

The problem is that you have no way of *knowing* that you have overwritten "all logical sectors"! You've just overwritten the Flash blocks that were currently *mapped* to the set of logical sectors! On the very next write (actually, also true of READS depending on where a particular flash block happens to be in its "read count"), an entirely new Flash block may be called in to replace the one that you are overwriting -- so the "overwritten" data has really just been "moved aside" for the time being.

[In an SSD, without probing chips, you probably are safe as the drive's controller acts as a gatekeeper to the FLASH contents. OTOH, if there are bugs in the implementation *or* ways to upgrade that controller's firmware, all bets are off]

Note that the same problem is also present with modern magnetic disks! Any drive that does bad sector management can, conceivably, take a physical sector holding something that you want to clobber and shuffle it out of the "active" sector pool -- "never" to be used (referenced) again. Replacing it, instead, with a "spare" (physical) sector.

So, if you can coax the drive's (internal) controller to reveal that original "bad" sector, its contents (or, portions thereof) can be examined -- possibly (depends on how "bad" it was).

E.g., modern disks now maintain two "defect lists", internal to the drive. These are used by the controller *in* the drive to manage bad "blocks" (sectors) in a manner transparent to the user. When a PHYSICAL block starts to throw "lots" of errors, it can be replaced with a different one at the same LOGICAL "address". The bad (physical) block is then added to the "grown" defect list (as opposed to the PERMANENT defect list created by the drive's manufacturer before the drive was sold)

If you tell the drive to *reset* the grown defect list, then those physical sectors -- ALONG WITH THEIR UNDISTURBED CONTENTS -- are remapped into the drive's logical sector space. Of course, if the sectors are truly defective (e.g., localized surface defects in the medium), then they will soon exhibit the same sorts of errors that got them placed on the GLIST to begin with, etc.

The trick is getting the drive to clear the GLIST *without* scrubbing the contents of those PHYSICAL sectors (often, the GDL can only be cleared in a FORMAT command which can also overwrite the sectors' contents).

You can't be sure you are overwriting the physical FLASH block that contains the data that you want obliterated! The controller (in the SSD) sees that you want to write new data to LOGICAL sector 123. It *knows* that it can't overwrite existing data without an intervening ERASE cycle. So, it moves the Flash block aside and pulls in a "clean" (erased) block into which your new data should be stored. Your "stale" data sits where it was, physically, until (*if ever*!) that block is erased and recycled.

[I am playing fast-and-loose with the term "block", here]

In the case of an embedded device where your *firmware* is managing these blocks (i.e., *you* are the FTL), any new firmware can provide access to the raw FLASH device(s) -- and deliberately decide not to honor the marking that "this block is awaiting erasure".

[it is fairly common for devices to be "rooted" nowadays! if you had stored the password to your secret swiss bank account in your cell phone, "erased" it using the typical features present *just* to delete unwanted data/apps, wanna bet I'd find a way to retrieve it?? :> ]

Likewise, even an SSD can be plagued with these sorts of problems. A firmware update wants to preserve *existing* data. But, that doesn't mean that it can't also leak "erased" data in the process. We would consider this a "bug" in this discussion. But, the drive could be fully operational in this configuration! It *thinks* these sectors are "ready to be used" (even though they contain stale data) and only discovers this NOT to be the case when it tries to UPDATE them (update fails, blocks are marked for erasure, new "erased" blocks -- which may also NOT be actually erased -- are called in, process repeats.... until, eventually, blocks are ACTIVELY erased after which the write() finally succeeds).

You have to remember to *extract* the key from the machine prior to donation. (Most machines with encrypted disks will "remember" the key as a convenience for you -- so you don't have to type it in on each boot). Will you remember to do this? Will your heirs even *know* to do it? :>

Are you even sure it is *strong* crypto? And, not just something to obfuscate the data for the "masses"?

HP used to have a "Secure Web Console". This was a little network box that let you talk to a serial port from the network (connect the serial port on the SWC to the "console" port on your "mainframe" and you now have a one port terminal server to access that port!). You "talk" to the device via a web page (the device has a little HTTP server built in and exposed to its network interface).

To protect the managed serial port from casual access, you have to authenticate yourself to the SWC with a password, etc. (via the web interface). Once authenticated, your keyboard/display are effectively connected to the serial port on the SWC.

Simple.

Now, think about where this is INTENDED to be employed! I.e., where "commands of privilege" will tend to be executed -- for the purpose of MAINTAINING the "mainframe"/server/etc to which the SWC is tethered.

But, the connection *to* the web server (remember, that's what you are interacting with from your keyboard) isn't "https". "Ah, but *surely* the traffic is encrypted IN SOME WAY!!"

Yes. A simple XOR of each byte with a fixed constant that doesn't EVER vary (i.e., it's the same on every SWC).

So, not only can a low sophistication eavesdropper monitor the content of this traffic ("Ah, I just saw the letters 'root' so the password will soon follow...") but he can also hijack the connection relatively easily.
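For illustration, that kind of "encryption" is undone by XORing with the same constant (the value here is made up):

#include <stddef.h>
#include <stdint.h>

#define SWC_XOR_CONST 0x5A   /* hypothetical fixed byte -- same on every unit */

void swc_obfuscate(uint8_t *buf, size_t n)   /* same call "encrypts" and "decrypts" */
{
    for (size_t i = 0; i < n; i++)
        buf[i] ^= SWC_XOR_CONST;
}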

Did you remember the name of the product? SECURE web console.

The problem with drive encryption is it takes resources. And, it encrypts *everything*! There's nothing "secret" about the executables for all the applications on my machine. Why bother encrypting and decrypting those? And, for "personal data", if I am not worried about the machine being taken FROM me, then why RISK losing access to that data (if I forget the key)? Really all you want to do is be able to KNOW that it is deleted WHEN you actually delete it!

[Drive encryption also means a partially corrupted sector is effectively a *lost* sector -- in terms of its contents]
Reply to
Don Y

That doesn't mean the flash likes reprogramming.

Even if your flash has NOP=4, reprogramming gives you higher bit error rates and thus a smaller safety margin compared to programming the page just once. At least that's what my experiments with NAND flash showed.

Plus, it's logical. There's no counter in the pages which says "hey, you programmed this page already four times, I won't let you program for a fifth time". It's just that someone discovered by experiment that programming four times has a statistically high chance to be able to give some data retention guarantees.

Mind you, data retention guarantees of NAND flash have the form of "98% of all blocks survive for 10 years if you use the right ECC". Which means that 2% of all blocks are guaranteed to die even if you use the right ECC. So you want to know that beforehand, and have some safety margin.

Stefan

Reply to
Stefan Reuther

Of course! It actually isn't particularly happy just *sitting* there, either! Each read introduces disturbances as do nearby writes, etc.

The reason it is so much more troublesome with MLC devices (and beyond) is you are playing with even fewer electrons to differentiate between "states" ("levels" of charge in the cells).

IIRC, 20nm ("current" technology?) gives you about 100 electrons to play with in a cell. (The phrase "How many angels can dance on the head of a pin?" always comes to mind when I think about things this tiny.) For an SLC technology, that's a pretty big margin (relatively speaking). Even allowing for statistical variations between cells, process variables, etc.

Move to a MLC (4 levels per cell) and you're now juggling ~30 electrons (keep in mind the statistical variations!). At TLC, you've got a bit more than a *dozen* electrons to resolve!

Every "disturbance" runs the risk of electrons slipping on/off the gate. When a dozen is all it takes to make a noticeable change, it doesn't take much of a disturbance to MAKE this change!

The "counter" is actually the number of electrons present after each disturbance. And, environmental factors (in addition to operating cycles) come into play -- along with process variation (during fab), etc.

There's no "guarantee" that a *particular* condom won't break. Variations in manufacturing will make some better/worse than others in this regard. And, how you use/abuse them probably also plays a role! :>

But that's true of everything in our designs! There's no guarantee that the right amount of charge will get injected into a DRAM cell to store a particular "value". Nor that the dielectric for the C on which it's stored is as robust as it *should* be in this region of the die after some particular history of use, etc.

Even semi manufacturers can't *guarantee* performance. They make an educated estimate based on process history, sampling, etc. and *hope* something unexpected hasn't happened AND escaped their scrutiny (leading to product returns, loss of customers, etc.)

Reply to
Don Y

It's a myth.

If there were any reliable way to get data out of some extra part of the disks, hard disk manufacturers would use it to store more data in the same space. If there were any reliable way to recover data after new data was written on top, hard disk manufacturers would be selling "multi-level" hard disks. And if there were any way of getting a bit of the old data occasionally, using expensive and time-consuming methods, then data recovery companies would be offering it as a service.

Whenever you wonder about something being true or a myth, just think about the money - who stands to make money if it is true, who stands to make money if it is a myth, and who is /actually/ making the money. In this case, it is the people selling software to overwrite the disk surface 35 times - not anyone recovering data from overwritten disks (and these guys happily recover data from disks "destroyed" by fire and other physical abuse).

I too remember reading a scientific article about this a couple of years ago. A team of experts recorded 32 bits (IIRC) onto a piece of hard disk platter. After days of work using electron microscopes and other such toys, they recovered a few of the bits - something like 5 or 6 bits out of the 32.

There have been a few cases where people have found extra data copied along with the real data, usually due to sloppy application code. MS Office was (and perhaps still is, I never use it so I can't be sure) notorious for this - deleted text would still be in the saved files.

Reply to
David Brown
