Suppose you have some software that is 16K bytes. Is CRC16 - 16 bit crc, a reasonable way of producing a number that uniquely identifies a particular revision of this software? What are the odds of two different revisions having the same CRC? Are they any better than using a 16 bit arithmetic checksum?
While the chances of two images having the same CRC is 1:65536, after you have built a few versions the probability that you will get two the same rapidly increases.
This is similar to the classic example of shared birthdays in a school class. The probability that two children have the same birthday is
1:365 but the probability that two children out of a class of 20 share a birthday is well over 50%.
A better way is to have a const initialised data element and write a utility to autoincrement the value it stores. Add this to the build and ensure that the file is re-compiled on every build.
Based on experience, I would say that placing a check value in the software is a terrific idea, especially if you are able to calculate it and embedded it in the output file. This way, the system can verify the integrity of the rom by running a check on the memory.
Below, I have included a short assembly code snipit that I used in a project. The compiler has a macro called DATE that inserts the PCs compile time and date. With this, you compute the low and high bytes (assuming byte wide memory) and stick them in the software. Next, you calculate a negative number that subtracts out the values added by the date and time.
This idea won't work with a CRC wich is order dependant, but will work with an addition checksum, which is far easier and faster to compute during startup. While it isn't as secure as a CRC algorithm you get the benefit of having the compile time and date embedded in the rom and it is run time accessible so it can be displayed, etc. A combination of a check value like a CRC along with a specific compile date and time can give you a pretty good indication that the image is the correct one.
FIRMVER dc.b 'Software Version X.X'' dc.b 'Created on ' dc.b MONTH_MSB,MONTH_LSB dc.b '/'
CRCFIX dc.b 0-(low (MONTH_MSB+MONTH_LSB+DAY_MSB+DAY_LSB+YEAR_MSB+YEAR_LSB+HOUR_MSB+HOUR_LSB+MINUTE_MSB+MINUTE_LSB+SECOND_MSB+SECOND_LSB)) ;The CRCFIX subtracts out the month,day,year, etc for the purposes of compling a checksum /********* end snip ********/
That depends on the data block's failure modes. CRC's are particularly good for serial data transmission media susceptible to burst-mode errors. If serial transmission with burst-mode errors isn't what you're trying to detect, then a CRC might not be "less easily fooled" than another hash.
The odds of a collision between two random blocks of data are exactly the same for an N-bit CRC and for an N-bit "checksum".
That depends on the failure mode, the cost of the calculation, and the cost of a failure.
A CRC is a hash value. So is a checksum.
I presume you're talking about "one-way" hash functions like MD5 or SHA?
The "one-way" property prevents somebody from forging a message with something other thana brute-force approach. The length of the hash is to make a brute-force attack more expensive.
If you don't care about forgeries, there are probably cheaper hash functions.
--
Grant Edwards grante Yow! Finally, Zippy
at drives his 1958 RAMBLER
visi.com METROPOLITAN into the
faculty dining room.
On Thu, 06 Oct 2005 20:32:40 +1300, darkknight wrote in comp.arch.embedded:
I used to use CRC for validating images, but looking for programming errors in a flash/EPROM/ROM image has a considerably different failure mode and error pattern than serial communications, which is what CRCs are most useful for.
These days I use a Fletcher checksum, which is faster than a table loop up CRC and doesn't need space for the table, and much faster than bit banging a CRC in code. The Fletcher checksum is sensitive to the order of the data values as well as the values themselves. Swap any two (bytes, words) in the image and an ordinary checksum comes out the same, but a Fletcher checksum, like a CRC, will spot the difference.
For small blocks less than 256 bytes, like EEPROM values, we use an
8-bit Fletcher sum, for sizes up to 64K a 16-bit sum, and for larger values a 32-bit sum. The total number of bits in the sum is twice that of algorithm, so doing a 16-bit Fletcher sum on your 16K data would give you a 32-bit result, which is sensitive to the ordering as well as to the values.
Here's basic C code for a 16-bit Fletcher checksum:
while (size--) { result.sum += *src++; result.total += result.sum; }
return result; }
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
I'll take a run in a different direction. Rather than a single value that may repeat, consider other options such as time and date stamping, actual version numbers, etc., embedded in the code.
You didn't mention what compiler you are using, but many have features for putting date/time stamps in code. If you are using make files it is not too difficult to have each run of make spawn a simple batch file to either collect the system date/time or to even use a script to increment a version/build number. Anything generated by a script/executable could put the data in a file that you can include in the compile.
I think you will find this much more applicable to your stated goal than using the CRC. At the same time, if you need to check the code validity at run time, a CRC embedded in the file is a good choice.
ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here.
All logos and trade names are the property of their respective owners.