Just an aside: if your backup images are important to you, you should never compress them, especially with gzip. A single bit error in a .gz file turns the rest of it into junk. With a gzip recovery tool such as gzrecover you might get something resembling your original data, but that may still be useless for a filesystem image.
If you really have to compress, bzip2 works in independent blocks, so you lose only the block containing the error. Block-oriented backup apps like fsarchiver and partimage do likewise, and they usually have the smarts to skip parts of the filesystem that aren't in use, so you end up with less data that way too.
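If a bzip2-compressed image does get damaged, the bzip2recover tool that ships with bzip2 can split it back into its individual blocks so the intact ones can still be read. A rough sketch (file names are just examples):

    # split the damaged archive into one small .bz2 file per block
    bzip2recover backup.img.bz2
    # test each recovered block and decompress only the good ones
    for f in rec*backup.img.bz2; do
        bzip2 -t "$f" && bzip2 -dc "$f" >> recovered.img
    done
    # note: any skipped block leaves a hole, so data after it lands at the wrong offset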
So, instead of compression, I'd propose generating error correction data with par2 for your images.
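For example, with par2 from the par2cmdline package (file names are illustrative):

    # create recovery files with about 10% redundancy alongside the image
    par2 create -r10 backup.img.par2 backup.img
    # later: check the image, and repair it if bits have rotted
    par2 verify backup.img.par2
    par2 repair backup.img.par2

Even a modest redundancy percentage will repair scattered bit errors that would ruin a gzipped image.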
Your ECC hard disk does you no good when the file is corrupted DURING TRANSFER, which is the scenario under discussion. As the OP stated, there is little hope of recovery if the file is compressed; with an uncompressed (or block-compressed) file, at least you can recover the important data. Even with an ECC hard disk, RAID, or whatever scheme you choose, you can STILL EXPERIENCE BIT CORRUPTION! The ECC presumes that the data stream written to it was uncorrupted to begin with. That is far from a reasonable assumption: if you are copying files over a net connection (TCP delivery is NOT guaranteed), buffers can overflow or underflow, pipes can fail, and any number of things can happen outside the control of the destination file system. Critical data should NEVER be compressed with an algorithm that cannot be reversed in the presence of bit corruption. Period.
All of the above transfer methods have checking algorithms in place that guard against single-bit corruption.
And again, I'd take a completely garbled image over a single-bit corruption any time. Nothing is worse than an error that isn't obvious.
Currently I am using a computer without ECC memory, but after some experiences with memory errors over the past years, I am pretty sure the next one will have ECC RAM again. (Previously I always made sure there was at least parity checking.)
Pardon me, I accidentally sent that before I finished typing.
The underlying IP layer which TCP sits on top of has no guaranteed delivery. But that's beside the point: a failure can happen at multiple points and in multiple failure modes (missing data, single-bit errors, bit reversals, double-bit errors, etc.). The ECC of a hard disk plays no role in ensuring data integrity until AFTER the data has been written to disk. It cannot detect whether that data was corrupted EN ROUTE to the disk.
And w.r.t. your other remark: such checks have been added in other places as well. For example, the transfer from the motherboard to the disk used to be unchecked, but since the introduction of SATA there is also a CRC check on that link.
I am with Rob in not wanting an almost correct file.
Rather than using par2 error correction techniques, I would create the uncompressed image, then checksum it before gzipping. I can then unzip it and recheck, which validates the gzipped copy. I could go further and compare the unzipped file byte for byte against the original with a re-read, or perhaps just read the original source straight into a checksum program; in that case I don't need the intermediate uncompressed image at all.
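A minimal sketch of that workflow, with example device and file names:

    # image the card, then record a checksum of the uncompressed image
    sudo dd if=/dev/mmcblk0 of=backup.img bs=4M status=progress
    sha256sum backup.img > backup.img.sha256
    # compress (gzip replaces backup.img with backup.img.gz)
    gzip backup.img
    # validate the gzipped copy: decompress on the fly and compare hashes
    [ "$(zcat backup.img.gz | sha256sum | cut -d' ' -f1)" = \
      "$(cut -d' ' -f1 backup.img.sha256)" ] && echo OK || echo MISMATCH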
A further thought is to use tee when reading the initial file to create the checksum while creating the gz file.
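Something like this single pass, using bash process substitution (device name is an assumption):

    # tee feeds one copy of the stream to sha256sum while gzip compresses the other
    sudo dd if=/dev/mmcblk0 bs=4M | tee >(sha256sum > backup.img.sha256) | gzip > backup.img.gz
    # the recorded hash is of the *uncompressed* image (sha256sum names its input "-")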
Alternatively: use rsync to maintain uncompressed backups.
This is fast because rsync only does the minimum work needed to make the backup an exact copy of the filing system being backed up. This means that the first backup is slower than making a dd copy or using tar, but subsequent backups are a lot faster because unchanged files are not copied; deleted files are removed from the backup, and added or changed files are backed up again.
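As a rough sketch, with an example mount point and a typical set of excludes:

    # mirror the root filesystem onto the backup drive; --delete prunes files
    # removed from the source, -x keeps rsync on this one filesystem
    sudo rsync -aHAXx --delete \
        --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
        --exclude='/tmp/*' --exclude='/run/*' --exclude='/mnt/*' \
        / /mnt/backup/rootfs/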
My main backups, which are kept offline, are done this way: I back up all my systems (all Linux: house server, laptop and RPi) to a USB drive mounted on the house server and kept in a firesafe. There are two backup disks, used alternately, so there is a backup copy in the firesafe at all times. If you're just backing up an RPi, the same trick will work just as well with two or more SD cards and a USB SD card reader.
rsync is a standard Linux utility program and is part of the Raspbian distro. You can use it 'bare', but backups are easier if you run it from a bash script which mounts the SD card, runs rsync configured as you want it [*] and then unmounts the SD card.
[*] rsync has a lot of options for excluding files, etc. You may also want to use separate rsync runs for each partition on the main SD card.
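A bare-bones sketch of such a script; the device name, mount point and excludes are assumptions to adapt:

    #!/bin/bash
    # mount the backup SD card, mirror the system onto it, unmount again
    set -e
    DEV=/dev/sda1          # SD card in the USB reader
    MNT=/mnt/sdbackup

    mount "$DEV" "$MNT"
    rsync -aHAXx --delete \
        --exclude='/proc/*' --exclude='/sys/*' --exclude='/dev/*' \
        --exclude='/tmp/*' --exclude='/run/*' \
        / "$MNT/rootfs/"
    umount "$MNT"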
martin@ | Martin Gregorie
gregorie. | Essex, UK
I use rsync to back up data using a script, and as you say it is very efficient. I have a NAS which uses two 2TB drives in RAID 0, and I back this up once a day to a 2TB drive on my PC. Might sound a** about face, but the NAS is also used by other machines. I also back up online to Amazon S3 using a set of bash scripts I wrote.
But for my PC systems I use system images. I like to make a complete image and keep it intact as a historic copy, not updating it; systems only change gradually, so occasional copies are fine, and restoring from a backup doesn't need much updating anyway since it's an automatic process.

Personal files are different: updating old copies is a manual operation, and it's damn near impossible to find the incremental source info, so you need very frequent updates, preferably with a means to get back to any older version. I used SpiderOak for this for a while because they save older versions, but their Linux support was non-existent when I had an issue. Wuala was even worse. Which is why I wrote my own. If an old version isn't on my current local backup, I can find it online.