Some years ago I developed a system for this. The primary operation was to scan the source text and convert all strings. One of the inputs was a list of the source files to be scanned, so each source file could be represented by a small integer, and each string within it by another. The primary input filter then converted constructs such as:
    puts("Blah blah");
into
    puts(_(n, m) /* "Blah blah" */);
where n identified the file and m the string within it. The extracted strings were collected in an auxiliary file. Let's call the original (valid C source) file a .src; from it a modified .c file and an auxiliary .d file were generated. The .c file remains readable.
There are limitations, such as arrays of strings (which I didn't bother with).
At any rate, at final link time a suitably indexed file is generated from the .d files, and a module is included that defines the _(int, int) function, which returns char *. At this point the object code is, for most purposes, language independent. An organizational problem remains, to do with removing duplicate strings and so forth, besides the action of the _() function itself.
In my case I knew that no more than about five of these strings would be active at once, and I had a maximum string length, so I made _() select the next buffer in a circular list of string buffers and read the appropriate string in from the external file.
One of the advantages is that, once done (and the process is largely automated), the indexed file can be edited and translated independently, while the intent of the code remains clear in the annotated .c file.
The filelist file is crucial: it ties the various indices back to the actual source files.
At any rate, it all worked very nicely. Removing the text from the code reduced the object code size, which was the prime motivator for the system. There was a slowdown due to the sequence of operations needed to translate the indices into an actual file read.