Initialization of checksums

Can anybody tell me the ideal initial value for a checksum, if there is one at all? Also, what do you think of checksums that use simple addition? I mean their reliability.

Reply to
vishal

Nothing is to be gained from any particular initial value, in the general case. Or to turn it around: if you have to worry about the initial value in any way other than possibly disallowing some degenerate cases, the checksum is flawed anyway and not worth bothering with.

Impossible to tell without knowing what kind of fault you're trying to checksum against. Checksumming without a fault model is like treatment without proper diagnosis: it'll just be useless if you're lucky, but may be disastrous if you're not.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
Reply to
Hans-Bernhard Broeker

In CRCs an initial value of 0000 and a string of bytes with the value 00 at the beginning would not allow you to detect a missing byte. So for CRCs a nonzero value like FFFF is usually used. But in a checksum over memory a missing byte is probably not a problem.
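The leading-zero effect described above can be reproduced with a small sketch. The polynomial 0x1021, the MSB-first bit order and the name `crc16_ccitt` are assumptions chosen for illustration; the post does not name a particular CRC.

```python
def crc16_ccitt(data, init):
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


payload = bytes([0x00, 0x00, 0x12, 0x34, 0x56])
truncated = payload[1:]  # the first 0x00 byte went missing

# With init 0x0000 the register stays zero while leading zeros are
# shifted in, so the lost byte leaves no trace in the result.
same_with_zero_init = crc16_ccitt(payload, 0x0000) == crc16_ccitt(truncated, 0x0000)
same_with_ffff_init = crc16_ccitt(payload, 0xFFFF) == crc16_ccitt(truncated, 0xFFFF)

print(same_with_zero_init, same_with_ffff_init)  # prints: True False
```

With a nonzero start value the register is already "moving" when the first zero byte arrives, so the number of leading zeros changes the result.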

For memory the options are (8 bit CPU, code in assembler assumed):

  • addition: compact code, fast, insecure
  • 8 bit CRC: compact code in a slow implementation; insecure, not much better than addition, only slower
  • Fletcher: a compromise in speed/code/security
  • 16 bit CRC: probably too slow, or too bulky if implemented for speed; but: good security

Fletcher is worth a look. As for the "fault model": if one has time to play, it's educational to fill an array with random numbers and flip a specified number (1 - 10) of bits, with the locations given by another random number generator. Then watch the different checksums detect or not detect the errors.

MfG JRD

Reply to
Rafael Deliano

Rafael Deliano wrote in news:40F27EA7.167A9797 @t-online.de:

So is it fair to say that with a standard 8 bit addition CRC, any packet sent has a 1/256 chance of being read as GOOD, when in fact it is BAD?

DaveC

Reply to
DaveC

Not sure what an "addition CRC" is, but I don't think it matters.

Yes. If you randomly change some of the data (including the checksum), you have a 1/256 chance of the resulting packet/checksum combination checking OK. That's what you meant, right?
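A rough way to check the 1/256 figure is to generate completely random packets and count how many happen to pass. This is only a sketch: the 8-byte packet layout (7 data bytes plus 1 checksum byte), the seed and the function names are assumptions made for the example.

```python
import random


def additive_checksum(data):
    """8-bit sum of all bytes, start value 0."""
    return sum(data) & 0xFF


def accepts(packet):
    """True if the last byte equals the additive checksum of the rest."""
    return additive_checksum(packet[:-1]) == packet[-1]


# Exhaustive view: for a fixed payload exactly one of the 256 possible
# checksum bytes is accepted.
payload = bytes(range(7))
matching = sum(accepts(payload + bytes([c])) for c in range(256))

# Statistical view: random payload and random checksum byte.
rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 51200
accepted = sum(
    accepts(bytes(rng.randrange(256) for _ in range(8))) for _ in range(trials)
)

print(matching)                 # 1
print(accepted, trials // 256)  # observed count next to the expected 200
```

The same counting argument gives 1 in 2^16 = 65536 for any 16-bit check value when the corruption is fully random.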

With a 16-bit checksum (CRC or otherwise), a randomly generated packet and checksum will check OK with a probability of 1/56536.
--
Grant Edwards                   grante             Yow!  I'm pretending I'm
                                  at               pulling in a TROUT! Am I
                               visi.com            doing it correctly??
Reply to
Grant Edwards

We all know I meant 1/65536, right?

--
Grant Edwards                   grante             Yow!  I'm a fuschia bowling
                                  at               ball somewhere in Brittany
                               visi.com
Reply to
Grant Edwards

Well I was thinking of following up then thought, no it is just a typo that we all make from time to tiem....

--
Paul Carpenter		| paul@pcserv.demon.co.uk
        Main Site
              GNU H8 & mailing list info.
             For those web sites you hate.
Reply to
Paul Carpenter

On 12 Jul 2004 21:20:28 GMT, Grant Edwards wrote in comp.arch.embedded:

I thought it was little-endian! ;)

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
Reply to
Jack Klein

A simple test I once did: a field with 8 bytes, where the last byte is the checksum. The first 7 bytes are filled with random numbers, then the checksum is calculated. 5 variants were used:

  • addition
  • XOR all bytes ( used in chipcards )
  • 8 bit CRC Poly "TM" 8Ch = Dallas TouchMemory
  • 8 bit CRC Poly "PDV" E0h
  • 8 bit CRC Poly "CRC8" 4Dh

Then 1 - 8 bits (column "errors") in the 8 byte field were flipped, with the locations given by a random number generator. The test was done 1000 times ("samples"; the counts in the results are printed in hex, so 03E8 = 1000). But when flipping 2 bits it may happen that the second flip hits the same location as the first, so the message isn't changed at all ("identical") and there is no error to be detected. This leaves the "detected" errors and, more interesting, the "missed" errors.

Not all 8 bit CRCs are created equal, it seems, but I do not see them as being much better than addition.
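The experiment described above can be approximated with the following sketch. It is not the original test program: the random generator, the seed, the function names and the CRC details (polynomial 8Ch processed LSB first with start value 0, the usual Dallas convention) are assumptions, so the counts will not match the listing exactly.

```python
import random


def add8(data):
    return sum(data) & 0xFF


def xor8(data):
    value = 0
    for byte in data:
        value ^= byte
    return value


def crc8_tm(data):
    """8-bit CRC, polynomial 0x8C, LSB first, start value 0 (assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8C if crc & 1 else crc >> 1
    return crc


def run(check, n_errors, samples=1000, seed=0):
    """Return (missed, detected, identical) for n_errors random bit flips."""
    rng = random.Random(seed)
    missed = detected = identical = 0
    for _ in range(samples):
        field = bytearray(rng.randrange(256) for _ in range(7))
        field.append(check(field))  # byte 8 is the checksum
        original = bytes(field)
        for _ in range(n_errors):
            bit = rng.randrange(64)  # may hit the same location twice
            field[bit // 8] ^= 1 << (bit % 8)
        if bytes(field) == original:
            identical += 1
        elif check(field[:7]) == field[7]:
            missed += 1
        else:
            detected += 1
    return missed, detected, identical


variants = {"add": add8, "xor": xor8, "crc8": crc8_tm}
results = {
    (name, k): run(fn, k) for name, fn in variants.items() for k in range(1, 9)
}

for (name, k), (missed, detected, identical) in results.items():
    print(f"{name:5} errors={k} missed={missed} detected={detected} identical={identical}")
```

Using the same seed for every variant means each checksum sees the same data and the same flip locations, which makes the "missed" columns directly comparable.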

MFG JRD

\ CRC TM
Errors=01 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=02 Samples=03E8 Missed=0000 Detected=03D0 Identical=0018
Errors=03 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=04 Samples=03E8 Missed=0006 Detected=03DF Identical=0003
Errors=05 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=06 Samples=03E8 Missed=0005 Detected=03E3 Identical=0000
Errors=07 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=08 Samples=03E8 Missed=0004 Detected=03E4 Identical=0000

\ CRC PDV-Bus
Errors=01 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=02 Samples=03E8 Missed=0000 Detected=03D0 Identical=0018
Errors=03 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=04 Samples=03E8 Missed=0010 Detected=03D5 Identical=0003
Errors=05 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=06 Samples=03E8 Missed=0009 Detected=03DF Identical=0000
Errors=07 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=08 Samples=03E8 Missed=0004 Detected=03E4 Identical=0000

\ CRC CRC8
Errors=01 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=02 Samples=03E8 Missed=000B Detected=03C5 Identical=0018
Errors=03 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=04 Samples=03E8 Missed=000F Detected=03D6 Identical=0003
Errors=05 Samples=03E8 Missed=0005 Detected=03E3 Identical=0000
Errors=06 Samples=03E8 Missed=000B Detected=03DD Identical=0000
Errors=07 Samples=03E8 Missed=0008 Detected=03E0 Identical=0000
Errors=08 Samples=03E8 Missed=0006 Detected=03E2 Identical=0000

\ +
Errors=01 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=02 Samples=03E8 Missed=003D Detected=0393 Identical=0018
Errors=03 Samples=03E8 Missed=0008 Detected=03E0 Identical=0000
Errors=04 Samples=03E8 Missed=0012 Detected=03D3 Identical=0003
Errors=05 Samples=03E8 Missed=0008 Detected=03E0 Identical=0000
Errors=06 Samples=03E8 Missed=0006 Detected=03E2 Identical=0000
Errors=07 Samples=03E8 Missed=0002 Detected=03E6 Identical=0000
Errors=08 Samples=03E8 Missed=000A Detected=03DE Identical=0000

\ XOR
Errors=01 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=02 Samples=03E8 Missed=0063 Detected=036D Identical=0018
Errors=03 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=04 Samples=03E8 Missed=0024 Detected=03C1 Identical=0003
Errors=05 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=06 Samples=03E8 Missed=0011 Detected=03D7 Identical=0000
Errors=07 Samples=03E8 Missed=0000 Detected=03E8 Identical=0000
Errors=08 Samples=03E8 Missed=000D Detected=03DB Identical=0000

Reply to
Rafael Deliano

I agree with the above, just want to add some...

CRCs generally guarantee detection of any error of up to three bits and of any error with an odd number of bits, if a proper generator polynomial is chosen (go with a standard one). This holds for any distribution of the errors within the checksummed area. They are also capable of detecting any burst error shorter than the length of the CRC itself (15 bits for a CRC16). This makes CRCs very good for serial communication lines, where occasional single-bit errors or burst errors are expected.

If only single-bit errors with a low probability are expected, a CRC8 will outperform an additive 8-bit checksum by far: it will never fail until you get four errors in the same message (and then fails with a probability of 1/256), while the additive checksum may already fail at two errors (with 1/256 probability). If the probability of an error in a message is P = 1/1000, the probability of an undetected error using the CRC is a constant times P raised to 4, while using the additive checksum it is the same constant times P raised to 2. The difference in detection capability in this example is about one to a million (a factor of P squared).
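The two-error case can be checked exhaustively for a short message. The sketch below flips every possible pair of bits in an 8-byte codeword and counts the pairs that slip through. The payload, the function names and the choice of the 0x8C polynomial (LSB first, start value 0) are assumptions for illustration; the zero-miss guarantee only holds up to a polynomial-dependent message length.

```python
from itertools import combinations


def add8(data):
    return sum(data) & 0xFF


def crc8_8c(data):
    """8-bit CRC, polynomial 0x8C, LSB first, start value 0 (assumed)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8C if crc & 1 else crc >> 1
    return crc


def undetected_two_bit_errors(check, payload):
    """Return (missed, total) over all 2-bit flips of payload + check byte."""
    codeword = bytes(payload) + bytes([check(payload)])
    nbits = 8 * len(codeword)
    missed = 0
    for i, j in combinations(range(nbits), 2):
        corrupted = bytearray(codeword)
        corrupted[i // 8] ^= 1 << (i % 8)
        corrupted[j // 8] ^= 1 << (j % 8)
        if check(corrupted[:-1]) == corrupted[-1]:
            missed += 1
    return missed, nbits * (nbits - 1) // 2


payload = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE])
add_missed, total = undetected_two_bit_errors(add8, payload)
crc_missed, _ = undetected_two_bit_errors(crc8_8c, payload)

print(f"additive: {add_missed} of {total} double-bit errors missed")
print(f"CRC-8:    {crc_missed} of {total} double-bit errors missed")
```

The additive checksum is fooled whenever two flips cancel in the sum, for example bit 7 flipped in two different bytes, while the CRC misses none of the pairs at this length.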

If, on the other hand, the checksum is used for memory verification, where errors usually don't distribute in this way but rather appear as larger blocks of data that are totally corrupted, the above reasoning is not very interesting. Then I would generally recommend a 32-bit Fletcher checksum because of its simplicity in implementation. In this context it is also generally better than a CRC16 (but still not quite as good as a CRC32...).

Fletcher checksum:

    int A = 0
    int B = 0

    for n = 0 to size - 1
        A = A + M[n]
        B = B + A

A and B must be allowed to overflow at 0xFFFF (16-bit wraparound). At the end A and B together hold the 32-bit checksum.
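A runnable sketch of the pseudocode above, following the post literally: bytes are added one at a time and both sums simply wrap at 16 bits. The function names are made up for the example. Note that textbook Fletcher-32 reduces modulo 65535 and works on 16-bit words, so the values produced here will not match a standard Fletcher-32 implementation.

```python
def fletcher_sums(memory):
    """Return (A, B) as in the post: running sum and sum of running sums."""
    a = 0
    b = 0
    for byte in memory:
        a = (a + byte) & 0xFFFF  # wraps at 0xFFFF like a 16-bit register
        b = (b + a) & 0xFFFF
    return a, b


def fletcher_value(memory):
    """Pack both sums into one 32-bit number (packing order is an assumption)."""
    a, b = fletcher_sums(memory)
    return (b << 16) | a


print(fletcher_sums(bytes([1, 2, 3])))  # (6, 10)
print(fletcher_sums(bytes([2, 1, 3])))  # (6, 11)
print(hex(fletcher_value(bytes([1, 2, 3]))))  # 0xa0006
```

The second call shows what B adds over plain addition: swapping two bytes leaves A unchanged but alters B, so reordered data is caught.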

/Pär

Reply to
Par

If you're talking about checksums and not CRCs, then there's no real advantage in having a particular initial value; it might as well be 1. With an 8 bit checksum, a random error has a 1 in 256 chance of going undetected. If the data is of high criticality, then I would employ a CRC-16 or more.

Ken.

+====================================+ I hate junk email. Please direct any genuine email to: kenlee at hotpop.com
Reply to
Ken Lee

As others have pointed out, the reliability of a single-byte checksum is quite limited. But if you use it in a (communication) system where a (link) failure might appear as a long string of 0x00 bytes, it would be a good idea to avoid an initial value of 0x00, since the checksum would then match all the time.

Also, if instead of a checksum you are using longitudinal parity (XORing every byte), an error condition of all 0x00 would always go undetected with an initial value of 0x00. An error condition of all 0xFF bytes would go undetected in 50 % of the cases with initial values of 0x00 or 0xFF. Thus, an initial value with only a few bits set (such as 0xA5) might be a better idea.

In addition to communication links you might get all 0x00 or all 0xFF error situations with unprogrammed EPROMS or EPROMS (or other chips) missing from their sockets, so avoiding 0x00 and 0xFF as the initial value for checksum or longitudinal parity might be a good idea. For a routine EPROM consistency check, the checksum should be enough, unless you expect high error rates e.g. due to radiation.
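The all-0x00 and all-0xFF cases above can be tried out with a short sketch. The block layout (data bytes followed by one check byte), the range of block lengths and the function names are assumptions made for the example.

```python
def additive(data, init):
    """8-bit additive checksum with a selectable start value."""
    return (init + sum(data)) & 0xFF


def parity(data, init):
    """Longitudinal parity: XOR of all bytes with a selectable start value."""
    value = init
    for byte in data:
        value ^= byte
    return value


def block_passes(check, block, init):
    """True if the last byte of the block matches the check over the rest."""
    return check(block[:-1], init) == block[-1]


zeros = bytes(16)  # e.g. a dead link delivering only 0x00
zeros_pass_add_00 = block_passes(additive, zeros, 0x00)
zeros_pass_add_a5 = block_passes(additive, zeros, 0xA5)
zeros_pass_xor_00 = block_passes(parity, zeros, 0x00)
zeros_pass_xor_a5 = block_passes(parity, zeros, 0xA5)


def ff_pass_count(init):
    """Count block lengths 2..17 where an all-0xFF block passes parity."""
    return sum(block_passes(parity, b"\xff" * n, init) for n in range(2, 18))


print(zeros_pass_add_00, zeros_pass_add_a5)  # True False
print(zeros_pass_xor_00, zeros_pass_xor_a5)  # True False
print(ff_pass_count(0x00), ff_pass_count(0xFF), ff_pass_count(0xA5))  # 8 8 0
```

Whether an all-0xFF block passes parity depends only on whether the block length is odd or even, which is where the 50 % figure comes from; with 0xA5 as the start value the parity byte can only come out as 0xA5 or 0x5A, never 0x00 or 0xFF.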

Paul

Reply to
Paul Keinanen
