Patch fixed strings in .hex file

In one project I have many quasi-fixed strings that I'd like to keep in non-volatile memory (Flash) to avoid wasting precious RAM space.

static const char s1[] = "/my/very/long/string/of/01020304";
static const char s2[] = "/another/string/01020304";
...

The substring "01020304" is a serial number that is different for each device and is assigned during production. It always has the same length in bytes (it's a simple hex representation of a 32-bit integer).

Of course it's too difficult and slow to rebuild the firmware during production, passing the real serial number to the compiler. I think a better solution is to patch the .hex file generated by the compiler.

I'm wondering how to detect the exact positions (addresses) of serial numbers to fix.

The build system is gcc, so I could search for s1 in the elf file. Do you know of a tool that returns the address of a symbol in the elf or map file?

Could you suggest a better approach?

Reply to
pozz

In the source code, put the serial number in as "PQRXYZ" or some other distinct string of characters. Generate bin files, not hex (or convert with objcopy). Then do a simple search for the special string to find its position and replace it with the serial number using a simple Python script or your other favourite tool (awk, sed, perl, whatever).

Oh, and in the source code, don't forget to make the string "volatile".
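The search-and-replace step described above could look roughly like this in Python. This is only a sketch: the function name `patch_placeholder`, the 8-character placeholder "PQRSTUVW" and the file names in the usage comment are assumptions for illustration, not part of any existing tool. It refuses to patch unless the placeholder occurs exactly the expected number of times, which guards against an accidental match elsewhere in the image.

```python
def patch_placeholder(image: bytes, placeholder: bytes, serial: bytes,
                      expected: int = 1) -> bytes:
    """Return a copy of image with every placeholder replaced by serial.

    Raises ValueError if the lengths differ or if the placeholder does not
    occur exactly `expected` times (non-overlapping occurrences).
    """
    if len(serial) != len(placeholder):
        raise ValueError("serial must have the same length as the placeholder")
    hits = image.count(placeholder)
    if hits != expected:
        raise ValueError(f"expected {expected} placeholder(s), found {hits}")
    return image.replace(placeholder, serial)


# Hypothetical usage on a raw binary produced by objcopy -O binary:
#   with open("firmware.bin", "rb") as f:
#       image = f.read()
#   patched = patch_placeholder(image, b"PQRSTUVW", b"01020304", expected=2)
#   with open("firmware_01020304.bin", "wb") as f:
#       f.write(patched)
```

Because the replacement has the same length as the placeholder, no other address in the image moves.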

Reply to
David Brown

Generate two binaries with different substrings and then do a binary file compare to find the position.
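A minimal sketch of that comparison in Python, assuming both builds produce raw binaries of identical length. The name `diff_ranges` is made up for this example. It is best to pick two serial numbers that differ in every character (for instance "AAAAAAAA" and "BBBBBBBB"), otherwise a shared character would split one serial number into several shorter runs.

```python
def diff_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ."""
    if len(a) != len(b):
        raise ValueError("images must have the same length")
    ranges = []
    start = None  # start offset of the run of differences being tracked
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        # The images differ right up to the last byte.
        ranges.append((start, len(a)))
    return ranges
```

Each returned range should be exactly as long as the serial number; anything else suggests the two builds differ for some other reason (a timestamp, a checksum) and deserves a closer look.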

Reply to
Herbert Kleebauer

I thought about this approach, but is it really so unlikely that the same exact sequence of bytes appears somewhere else in the output?

Why?

Reply to
pozz

Assuming there's a symbol associated with the address, the link map will tell you what the address is.

Reply to
Grant Edwards

Try it and see.

If you have :

static const char s1[] = "PQRXYZ";

and your code later does, say :

const int last_digit = s1[5] - '0';

the compiler will optimise it to :

const int last_digit = '*';

i.e., it will calculate 'Z' - '0' at compile time - and if I remember my ASCII codes correctly, that matches '*'.

You will be messing with the string behind the compiler's back. Make it volatile. "volatile const" might be unusual, but it is useful in exactly this kind of circumstance.

Reply to
David Brown

Giving the symbol external linkage (removing the "static") would help with that!

Reply to
David Brown

Another - perhaps more reliable - method would be to put the string in its own section with __attribute__((section("serial_number"))), and then have a linker script entry to fix it at a specific known address.

Reply to
David Brown

One fine day pozz typed:

Extremely unlikely, especially since you use text strings and therefore you actually use 64 bits (eight ASCII characters) to represent a 32-bit number. Besides, you don't need to use an ASCII string as the placeholder, you can use any 64-bit number.

If for example your binary file is 1 MB, there is one chance in 2.2 trillion of having the same number duplicated somewhere else.
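As a rough check of that estimate, here is a small calculation. It assumes the rest of the image behaves like uniformly random, independent bytes, which real firmware does not, so it only gives an order of magnitude. `chance_match_odds` is a name made up for this sketch. Counting one candidate position per bit of a 1 MiB image gives 2^41, about 2.2 trillion; counting only byte-aligned positions, which is what a byte-wise search actually sees, gives 2^44, about 17.6 trillion.

```python
def chance_match_odds(image_bytes: int, pattern_bits: int = 64,
                      bit_aligned: bool = False) -> float:
    """Return N such that an accidental match is expected about once in N images.

    Assumes every candidate position holds independent, uniformly random
    bits, and ignores the few positions lost at the end of the image.
    """
    positions = image_bytes * 8 if bit_aligned else image_bytes
    return 2 ** pattern_bits / positions


print(chance_match_odds(2 ** 20))                    # byte-aligned positions
print(chance_match_odds(2 ** 20, bit_aligned=True))  # every bit offset
```

Either way the conclusion stands: an accidental duplicate of a 64-bit placeholder is not a practical concern.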

To prevent the compiler from optimizing the code and "obfuscating" your string. I don't think that is very likely, but it is not impossible, especially if you use a very aggressive optimization level.

Reply to
dalai lamah

The map file is simple for a human to read, but I think it's better to use a tool (readelf or objdump) that reads the ELF file directly.

That said, I haven't managed to come up with a command line for this task.
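One possible way to do this is with nm from GNU binutils, which prints one line per defined symbol in the form "address type name". The sketch below wraps it in Python. The function names are made up, and the default tool name "nm" is an assumption: a cross toolchain normally ships a prefixed version such as arm-none-eabi-nm. It also assumes a file-scope symbol whose name is unique in the image.

```python
import subprocess


def parse_nm(nm_output: str, symbol: str) -> int:
    """Return the address of `symbol` from the text printed by nm."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols print as "<hex address> <type letter> <name>";
        # undefined ones have no address field and are skipped here.
        if len(fields) == 3 and fields[2] == symbol:
            return int(fields[0], 16)
    raise KeyError(symbol)


def symbol_address(elf_path: str, symbol: str, nm: str = "nm") -> int:
    """Run nm on an ELF file and return the address of `symbol`."""
    result = subprocess.run([nm, elf_path], check=True,
                            capture_output=True, text=True)
    return parse_nm(result.stdout, symbol)
```

The result is the address the symbol has in memory. In a raw .bin file the matching offset is that address minus the load address of the image, while a .hex file carries the addresses itself.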

Reply to
pozz

On 16.01.2024 at 13:19, pozz wrote:

Last time I needed that, I hacked it up myself; at least back in 32-bit times, ELF was not that hard (but I had to do that anyway to convert ELF into something the controller could boot).

Define your memory allocations explicitly. Instead of building a binary and hacking the strings, place the strings at a fixed address and regenerate the ELF or .hex file containing them from scratch. Whether you then give the fixed addresses a name using linker magic, or just cast pointers, is a matter of taste.

Stefan

Reply to
Stefan Reuther

Actually, this sort of thing really does happen in practice. In one of my current projects, I have some data that is filled in by post-processing the binary file, and I had to use volatile accesses to read the data or the compiler would optimise based on its knowledge of the contents it saw at compile time. This is not just theoretical.

(To be fair, it is a bit more likely if - like in my case - the source file uses null characters rather than a pseudo-random string of characters.)

Reply to
David Brown

IIRC, if you're using gcc/binutils, there are ways to get even static symbols to show up in the link map (e.g. -fdata-sections), but making the symbol global is simplest.

Reply to
Grant Edwards

On 16.01.2024 at 13:19, pozz wrote:

You do not.

Instead, you set up linker scripts, linker options and/or add __attribute__((...)) to the variables' definitions to _place_ them at a predetermined, fixed, known-useful location.

And do yourself one favour: have only _one_ instance of that number in your code. Use concatenation or similar to output it where needed.

Then you can use tools like srecord or GNU binutils to stamp your desired number into that fixed location in the hex file. Professional-grade chip flashing tools for production environments can usually do that by themselves, so you don't even have to edit your "official" files.

Details will obviously vary by tool chain.
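For a raw binary image, stamping a number into a fixed location could be sketched as follows. The name `stamp_serial` and the offset used in the example are assumptions, and so is the choice of uppercase hex digits; the original post only says the serial number is the hex representation of a 32-bit integer.

```python
def stamp_serial(image: bytes, offset: int, serial: int) -> bytes:
    """Return a copy of image with the serial written as 8 ASCII hex digits.

    `offset` is the position inside the image file, not a memory address.
    """
    if not 0 <= serial <= 0xFFFFFFFF:
        raise ValueError("serial must fit in 32 bits")
    text = f"{serial:08X}".encode("ascii")  # assumed format: uppercase hex
    if offset < 0 or offset + len(text) > len(image):
        raise ValueError("serial does not fit inside the image")
    return image[:offset] + text + image[offset + len(text):]


# Example with a dummy 16-byte image and a hypothetical offset of 4:
patched = stamp_serial(bytes(16), 4, 0x01020304)
```

With a single instance at a known location, this is the only step that has to run per device.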

Reply to
Hans-Bernhard Bröker

I think scanelf from pax-utils will do it.


Reply to
Grant Edwards

libelf should help.

The requirements sound similar to "we need to patch the checksum in the vector table so that an LPC MCU will boot". It should be easy to modify a tool like that to patch serial numbers.

Yes. Placing the string in a special section via the linker script will make it easier for the patch tool to locate the string.

cu Michael

Reply to
Michael Schwingen

On 16/01/2024 19:35, Hans-Bernhard Bröker wrote:

Do you mean to choose by yourself the exact address of *each* string? And where would you put them, at the beginning, in the middle or at the end of the Flash? You need to calculate the address of the next string from the address *and length* of the previous string. It seems to me a tedious and error-prone job that could be done easily by the linker.

Patching the .hex or .bin file by replacing 8 bytes starting from a known address is simple. I would write a Python script or use one of the srecord tools.
Reply to
pozz

On 17/01/2024 08:45, pozz wrote:

The command to patch the 8 bytes at addresses 0x800-0x807 (the end of an srecord address range is exclusive, hence 0x808) with the string "01020304" would be:

srec_cat original.hex -I -E 0x800 0x808 -GEN 0x0800 0x0808 -REP_S "01020304" -O patched.hex -I

-I is for Intel hex format (input and output)

-E is to exclude the bytes to patch from the original hex

-GEN is to generate new bytes in a certain range

-REP_S is the constant string to repeat in the range

In my case I don't really need to repeat the string in the range, because the length of the string is exactly the length of the address range.
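Without srecord, the same patch could be done directly on the Intel hex text with a short script. The following is a minimal sketch, not a full Intel hex implementation: `patch_intel_hex` is a made-up name, only record types 00 (data), 02 (extended segment address) and 04 (extended linear address) are interpreted, and address wrap-around inside a segment is ignored. The checksum of every data record is recomputed as the two's complement of the sum of the record bytes.

```python
def patch_intel_hex(hex_text: str, address: int, new_bytes: bytes) -> str:
    """Return hex_text with new_bytes written at absolute address `address`.

    Raises ValueError unless every byte to patch is present in the file.
    """
    base = 0      # upper address bits set by type 02/04 records
    patched = 0   # number of bytes replaced so far
    out = []
    for line in hex_text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            out.append(line)  # keep anything that is not a record
            continue
        raw = bytearray.fromhex(line[1:])
        count, offset, rectype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        if rectype == 0x04:    # extended linear address: upper 16 bits
            base = ((raw[4] << 8) | raw[5]) << 16
        elif rectype == 0x02:  # extended segment address: paragraph * 16
            base = ((raw[4] << 8) | raw[5]) << 4
        elif rectype == 0x00:  # data record
            for i in range(count):
                idx = base + offset + i - address
                if 0 <= idx < len(new_bytes):
                    raw[4 + i] = new_bytes[idx]
                    patched += 1
            # Checksum is the last byte: two's complement of the byte sum.
            raw[-1] = (-sum(raw[:-1])) & 0xFF
        out.append(":" + raw.hex().upper())
    if patched != len(new_bytes):
        raise ValueError(f"patched {patched} of {len(new_bytes)} bytes")
    return "\n".join(out) + "\n"
```

Called as patch_intel_hex(text, 0x800, b"01020304"), it rewrites the same 8 bytes as the srec_cat command above, even if they straddle two records.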

Reply to
pozz
