CRC versus CheckSum

Hi, I have a situation where I have a byte in which to place either a checksum or a CRC. I am on an embedded system. I am using a checksum that is

BYTE create_csum(BYTE data_in, BYTE csum)
{
    csum += data_in + 3;                      /* accumulate this byte (plus 3)   */
    while ((csum == 0x00) || (csum == 0xff))  /* never let 0x00 or 0xff through  */
        csum += 3;
    return csum;
}

This function prevents 0x00 and 0xff from ever being produced as the checksum value.
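
A minimal usage sketch, assuming a zero starting seed and a simple per-packet loop (BYTE, packet[] and packet_len here are illustrative, not my actual code):

/* drive create_csum() over a whole packet; the zero seed is an assumption */
BYTE packet_csum(const BYTE *packet, int packet_len)
{
    BYTE csum = 0;
    for (int i = 0; i < packet_len; i++)
        csum = create_csum(packet[i], csum);
    return csum;   /* can never come out as 0x00 or 0xff */
}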

Our packets cannot have bytes within a packet swapped in order (the transport layer sees to that), and we are not using the CRC/checksum to fix bad packets, just to reject them. I contend that in this case the checksum is just as good as the CRC.

Any opinions?

Reply to
msg

In some sense you are right. It depends on the pattern of errors you expect to correct. CRCs tend to be optimized for detecting short burst errors such as might occur during communication. Simple checksums tend to have a more even distribution of the errors you can detect. Neither detects all errors (there are, after all, the same number of states for the sum in either case).

For communications I suspect CRC would be better under the assumption that any interference is likely to occur in short bursts.

Robert

Reply to
Robert Adsett

But the OP said he DIDN'T want/need to correct errors, just reject them.

Reply to
TT_Man

???

Who said anything about correcting?

I think you are confusing a CRC with an ECC/EDC.

Robert

Reply to
Robert Adsett

Write some test code, try it on what you consider likely error patterns with both schemes, and see.

Most error-sensing schemes try to cover errors like this:

100% Single-bit errors
 xx% Two-bit errors
 YY% Three-bit errors
 etc.

In a comms link, things like stretched bits, shifted bits and dropped bits are likely occurrences.

The sum should be designed so a stuck line gives an invalid answer, so commonly the sum complements the bytes, e.g. everything _including_ the sum adds up to 0FFH on RX, and as the data bytes increment, the checksum decrements.
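
A minimal sketch of that complemented-sum idea (the function names and the 0xFF target are illustrative assumptions, not code from the post):

#include <stdint.h>

/* Sender: pick the check byte so the byte-wise sum of the whole
   packet, check byte included, comes to 0xFF (mod 256).            */
uint8_t make_check(const uint8_t *data, int len)
{
    uint8_t sum = 0;
    for (int i = 0; i < len; i++)
        sum += data[i];
    return (uint8_t)(0xFF - sum);
}

/* Receiver: accept only if everything, check byte included, still
   sums to 0xFF.  A stuck-low line sums to 0x00 and is rejected.    */
int check_ok(const uint8_t *data, int len, uint8_t check)
{
    uint8_t sum = check;
    for (int i = 0; i < len; i++)
        sum += data[i];
    return sum == 0xFF;
}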

-jg

Reply to
Jim Granville

Also, something else I just thought of, check out the Fletcher checksum. It fits somewhere between a simple checksum and a full CRC.
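
For reference, a minimal Fletcher-16 sketch (my own illustration, not code from the thread):

#include <stdint.h>
#include <stddef.h>

/* Fletcher-16: two running sums kept modulo 255.  The second sum makes
   the result order-dependent, unlike a plain additive checksum.        */
uint16_t fletcher16(const uint8_t *data, size_t len)
{
    uint16_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (sum1 + data[i]) % 255;
        sum2 = (sum2 + sum1) % 255;
    }
    return (uint16_t)((sum2 << 8) | sum1);
}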

Robert

Reply to
Robert Adsett

For a given number of check bits, the CRC is the best function for detecting errors. A checksum is fooled by a double error; a CRC isn't. The likelihood of an undetected error is at a minimum for the CRC. No other function can do better. However, it is for you to decide whether this is important for your application.
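
To make the double-error point concrete, here is a made-up two-packet example (mine, not Vladimir's): flipping the same bit position in two different bytes, in opposite directions, leaves a plain additive checksum unchanged.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t a[4] = { 0x08, 0x10, 0x00, 0x20 };  /* original packet          */
    uint8_t b[4] = { 0x00, 0x18, 0x00, 0x20 };  /* bit 3 cleared in byte 0,
                                                   bit 3 set in byte 1      */
    uint8_t sa = 0, sb = 0;
    for (int i = 0; i < 4; i++) { sa += a[i]; sb += b[i]; }

    /* Both sums print as 0x38: this two-bit error slips past the additive
       checksum, while a typical 8-bit CRC would flag it.                   */
    printf("%02x %02x\n", sa, sb);
    return 0;
}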

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Reply to
Vladimir Vassilevsky

YOU DID........I quote

"It depends on the pattern of errors you expect to correct."

Reply to
TT_Man

That depends on the error characteristics of your transport channel. For some types of errors, a CRC will be better; for others, an additive checksum may be better. CRCs were designed specifically to address the types of errors seen in serial communications over media which suffer from burst errors.

In order to intelligently choose a checksum algorithm, you need to know the error characteristics of the medium.

--
Grant Edwards                   grante             Yow! PEGGY FLEMMING is
                                  at               stealing BASKET BALLS to
                               visi.com            feed the babies in VERMONT.
Reply to
Grant Edwards

You don't have enough information to make that assertion.

All N-bit checksums map an arbitrary-length input message into a 2^N space. A good checksum is one that maps an uncorrupted message into a different value in the output space than it does a corrupted message. If errors are completely random, then all non-degenerate checksums will perform the same: there is a 1/2^N chance that a corrupted message will be passed.

Unless you know the error distribution, you can't pick out one of the mappings as "the best function for detecting the errors". CRCs are designed for analog telecom links with very specific error distributions.

--
Grant Edwards                   grante             Yow! I once decorated my
                                  at               apartment entirely in ten
                               visi.com            foot salad forks!!
Reply to
Grant Edwards

I do have enough information: CRCs are perfect codes, checksums are not. Hence the CRC-based codes are better. Whether it is important or not is a different question.

This is not right. The chances are never exactly 1/2^N, and they are different for CRCs and checksums.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Reply to
Vladimir Vassilevsky

This isn't my area of expertise, but the paper at the link below may be germane to the discussion. Shorter version: not all CRCs are equally good. An assertion of perfection may not be supportable.

formatting link

Reply to
Rich Webb

Bollocks. Mapping randomly corrupted messages into a 2^N space results in a 1/2^N chance that the mapping for any given randomly corrupted message will be the same as for the uncorrupted message.

If a 1-bit checksum maps half of the input space to 0 and half of the input space to 1, then a randomly corrupted message will have a 50% chance of having the same checksum as the uncorrupted message. Likewise a 2-bit checksum that maps 1/4 of the input space to 00, 1/4th to 01, 1/4th to 10, and 1/4th to 11: a randomly corrupted message will have a 25% chance of having the same checksum. The math works the same no matter how many bits are in the checksum.

The catch is that for a given medium, messages aren't corrupted randomly. For a medium where errors are rare but "bursty" a CRC performs very well (and even more importantly is trivial to implement in hardware). A CRC is not, however, "perfect" and is not the best choice for every type of error distribution.
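
One way to test both positions is a small simulation. The sketch below is entirely my own (the CRC-8 polynomial 0x07, the 16-byte message and the error model are arbitrary choices): vary the per-bit error rate p and compare how often each check misses a corrupted frame.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define LEN    16        /* message length in bytes (arbitrary) */
#define TRIALS 1000000L

/* straightforward bitwise CRC-8, generator x^8 + x^2 + x + 1 (0x07) */
static uint8_t crc8(const uint8_t *d, int len)
{
    uint8_t crc = 0;
    for (int i = 0; i < len; i++) {
        crc ^= d[i];
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}

/* plain 8-bit additive checksum */
static uint8_t sum8(const uint8_t *d, int len)
{
    uint8_t s = 0;
    for (int i = 0; i < len; i++)
        s += d[i];
    return s;
}

int main(void)
{
    double p = 0.01;               /* per-bit error rate: try 0.01 and 0.5 */
    long miss_crc = 0, miss_sum = 0, corrupted = 0;
    uint8_t msg[LEN], bad[LEN];

    for (long t = 0; t < TRIALS; t++) {
        int changed = 0;
        for (int i = 0; i < LEN; i++)
            bad[i] = msg[i] = (uint8_t)rand();
        for (int i = 0; i < LEN * 8; i++)
            if ((double)rand() / RAND_MAX < p) {
                bad[i / 8] ^= (uint8_t)(1u << (i % 8));
                changed = 1;
            }
        if (!changed)
            continue;
        corrupted++;
        if (crc8(bad, LEN) == crc8(msg, LEN)) miss_crc++;
        if (sum8(bad, LEN) == sum8(msg, LEN)) miss_sum++;
    }
    printf("corrupted: %ld  missed by CRC-8: %ld  missed by sum: %ld\n",
           corrupted, miss_crc, miss_sum);
    return 0;
}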

--
Grant Edwards                   grante             Yow! Where's the Coke
                                  at               machine?  Tell me a joke!!
                               visi.com
Reply to
Grant Edwards

RTFM

The catch is that the mapping is not evenly distributed. Different numbers of codewords map to different clusters of the 2^N space, depending on the way the check bits are computed. A simple checksum is not a good way of doing that.

Yes.

No. You have ignored the mutual information and the error probability.

This is true however this is a different matter.

Do you deny the fact that Hamming codes are perfect? CRCs are a kind of Hamming code.

Here is a simple example: consider a word of 11 data bits, a 5-bit CRC and a parity bit (16 bits total).

This configuration detects all combinations of errors except those of 4, 6, 8, 10, 12 and 16 bits.

Now try to match that with any 6-bit checksum. The CRC is better than a checksum at any error probability.

It is possible to design the *best* error-detection code for any given data size and number of check bits.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Reply to
Vladimir Vassilevsky

Missed that, my typo. That should have been 'detect'.

Robert

Reply to
Robert Adsett

Checksums cannot detect byte exchanges. CRCs can. That's why checksum variations using alternating weights of 1, 2, 1, 2, etc. are sometimes used. It is also why CRCs are often initialized to 0xffff, thus making initial bursts of zeroes detectable.
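
A small sketch of that weighted variation (my own illustration; only the 1, 2, 1, 2 weighting comes from the post):

#include <stdint.h>
#include <stddef.h>

/* plain sum: swapping any two bytes leaves the result unchanged        */
uint8_t plain_sum(const uint8_t *d, size_t len)
{
    uint8_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += d[i];
    return s;
}

/* weighted sum, weights 1,2,1,2,...: a byte that moves between an even
   and an odd position changes its weight, so most byte exchanges now
   alter the result (swaps between two even or two odd positions can
   still go unnoticed).                                                  */
uint8_t weighted_sum(const uint8_t *d, size_t len)
{
    uint8_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += (uint8_t)(d[i] * ((i & 1) ? 2 : 1));
    return s;
}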

--
 Chuck F (cbfalconer at maineline dot net)
   Try the download section.
Reply to
CBFalconer

As other posters have hinted, the "best" function depends on the pattern of errors you intend to correct.

A course in coding theory would help you a great deal.

CRC's are designed to detect two specific types of errors: distributed errors and burst errors.

DISTRIBUTED ERRORS: The data with the CRC appended is called, in coding theory, a codeword. The Hamming distance is the minimum number of bit corruptions that are necessary to mutate one valid codeword into another. For the simple codes that are discussed, the Hamming distance is 2.

formatting link

There is some higher-powered math required if you want Hamming distance > 2.

Any number of errors up to one less than the Hamming distance will be detected.

BURST ERRORS: An n-bit CRC will detect any single error burst up to n bits long.

formatting link

formatting link

Now, basically, in order to show that one code is better than another code, you'd need to show that the expected value of the number of undetected errors is less with the better code. In order to show this, you'd need to know something about the types of errors that can occur.

If a communication channel is noisy, it stands to reason that distributed errors are possible, and the Hamming distance is a useful measure of goodness.

However, if errors tend to occur in specific patterns, the behavior of the code in the presence of those specific patterns needs to be evaluated.

Let me give you an example. Assume I know (because of the nature of my communication channel) that it is unlikely that two errors will occur in the same bit position in different bytes. Then simply XOR'ing all the bytes together to get an 8-bit CRC/checksum may be a great approach. What makes it great is how it matches up against the possible errors. In another application it may be less great.
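
A sketch of that XOR approach (mine, not the poster's code): each result bit is an independent parity check over one bit column, so it catches any single error per column but misses a pair of errors in the same column.

#include <stdint.h>
#include <stddef.h>

/* XOR of all bytes: bit k of the result is the parity of bit k across
   the whole message, i.e. one parity check per bit position.           */
uint8_t xor_check(const uint8_t *d, size_t len)
{
    uint8_t x = 0;
    for (size_t i = 0; i < len; i++)
        x ^= d[i];
    return x;
}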

The mathematics of CRC's is the most highly-developed I'm aware of as far as coding (but I'm not an expert). I've never seen a practical example of some communication channel where Hamming distance and burst error tolerance weren't the best metrics of goodness, and so CRC's are attractive.

In the case of a CRC vs. a checksum, I believe you'd need to make assumptions about the errors and then work from there. I believe you'll find that because of the carries in a checksum, there are certain adjacent bit errors that may escape detection there but won't with a CRC. Any CRC will be the better code under practical assumptions.

--
David T. Ashley              (dta@e3ft.com)
http://www.e3ft.com          (Consulting Home Page)
http://www.dtashley.com      (Personal Home Page)
http://gpl.e3ft.com          (GPL Publications and Projects)
Reply to
David T. Ashley

There are two different issues: the relative strength of the codes, and the expected error patterns. The latter issue is addressed by proper interleaving and grouping of the data, regardless of the method of calculating the check bits. As for the strength of the code, optimally designed CRCs are definitely better than any checksums.

The probability of an undetected error is a dot product between the probability distribution of the number of errors and the distance spectrum of the code.
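
In the usual notation (my wording, not the poster's): for a code of block length n with A_d codewords of weight d, and independent bit errors with probability p, the undetected-error probability is

    P_undet = SUM over d = 1..n of  A_d * p^d * (1 - p)^(n - d)

which matches the dot-product description above: the A_d are the distance spectrum, and p^d * (1 - p)^(n - d) weights each term by how likely a d-bit error pattern is.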

Exactly. I ran some typical numbers and here are the conclusions:

  1. For bit error rates on the order of 1% or less, the probability of an undetected error is orders of magnitude lower for a CRC than for a checksum.
  2. For bit error rates on the order of 10% or higher, the probability of an undetected error is close to 1/2^N and about equal for a checksum and a CRC.
  3. The CRC provides better error detection than the checksum at almost any BER. However, there may be a region of BER around 15% or so where the CRC actually appears to be weaker than a checksum. But the difference is very small, and a BER that high is not practical anyway.

Vladimir Vassilevsky DSP and Mixed Signal Consultant

formatting link

Reply to
Vladimir Vassilevsky

With which checksum were you comparing the CRC?

--
Grant Edwards                   grante             Yow! Maybe I should have
                                  at               asked for my Neutron Bomb
                               visi.com            in PAISLEY --
Reply to
Grant Edwards

A block of 128 bits (120 data + 8 redundancy). The redundant bits are either a bytewise arithmetic sum or an 8-bit CRC.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Reply to
Vladimir Vassilevsky
