ARM/Linux: Is this a cross-compiler bug? (memcpy doesn't work as expected)

Steven Woody 17 years ago

Hi,

I tried two different Linux/ARM cross-compilers on the following code (see the last part of this message): one is from ELDK, the other is from our board vendor. Both produce the same error.

Below is the running output on an ARM920T board:

before: p1:
0xfc 0x01 0x12
before: p2:
0x40 0x19 0x21
after: p1:
0x40 0x01 0x02
after: p2:
0x40 0x19 0x21
!!! cp is wrong

That is, line 38 copies 3 bytes from p2 to p1, but the memcmp(p1, p2, 3) that immediately follows fails. I am sure memcpy did not do its job, because if I provide my own memcpy implementation as below, the error disappears.

void memcpy(void *dest, void *src, size_t n)
{
    uint8_t *d = (uint8_t*)dest;
    uint8_t *s = (uint8_t*)src;

    for (size_t i = 0; i < n; ++i)
        *d++ = *s++;
}

Another way to make the error go away is to compile without optimization. My test case uses -O2 to trigger the error; without it, the error does not occur.

I cannot understand what is happening here. As the output shows, my assignment of p2 (line 34) is okay, because the output printed by line 36 is correct. And if this is a bug in the runtime C library (remember that my own memcpy works), why do two cross-compilers from different sources produce the same error?

------------------------------------- the minimum sample --------------------------------------------------

1 #include <stdio.h>
2 #include <stdint.h>
3 #include <string.h>
4 #include <stddef.h>
5
6 struct Foo {
7     uint8_t x;
8     uint8_t y;
9     uint8_t z;
10     uint8_t m[3];
11 };
12
13 struct Bar
14 {
15     uint8_t m[3];
16 };
17
18 void pr(const char *title, const void *block, size_t n)
19 {
20     printf("%s\n", title);
21
22     uint8_t *p = (uint8_t*)block;
23     for (size_t i = 0; i < n; ++i)
24         printf("0x%02x ", *p++);
25
26     printf("\n");
27 }
28
29 void cp(const Foo *foo)
30 {
31     Bar bar;
32
33     Bar *p1 = &bar;
34     Bar *p2 = (Bar*)(foo->m);
35     pr("before: p1:", p1, 3);
36     pr("before: p2:", p2, 3);
37
38     memcpy(p1, p2, 3);
39     pr("after: p1:", p1, 3);
40     pr("after: p2:", p2, 3);
41
42     if (memcmp(p1, p2, 3) != 0)
43         printf("!!! cp is wrong\n");
44 }
45
46 int main()
47 {
48     Foo foo;
49     foo.x = 1;
50     foo.y = 2;
51     foo.z = 3;
52     foo.m[0] = 0x40;
53     foo.m[1] = 0x19;
54     foo.m[2] = 0x21;
55
56     cp(&foo);
57     return 0;
58 }

---------------------------------------------------------------------------------------

Frank Buss 17 years ago

I can reproduce the problem. But if I write this:

void *p1 = &bar;
void *p2 = (void*)(foo->m);

it works, at least with g++ 4.1.2 on my NAS system, for which I've changed the internal firmware to a standard Debian Linux. The interesting thing: If I'm using my own memcpy, as you described:

void mymemcpy(void* p1, void* p2, int size)
{
    unsigned char* p11 = (unsigned char*) p1;
    unsigned char* p22 = (unsigned char*) p2;
    for (int i = 0; i < size; i++)
        p11[i] = p22[i];
}

it works all the time. I guess this could be an alignment problem with memcpy. You should file a bug report, see

formatting link

for instructions. You can add my report.

Compile command: g++ -O2 -Wall test.c

System on which I've tested it:

# g++ --version
g++ (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

# cat /proc/cpuinfo
Processor       : ARM926EJ-Sid(wb) rev 0 (v5l)
BogoMIPS        : 266.24
Features        : swp half thumb fastmult edsp java
CPU implementer : 0x41
CPU architecture: 5TEJ
CPU variant     : 0x0
CPU part        : 0x926
CPU revision    : 0

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

Frank Buss 17 years ago

I guess most C cross-compilers for ARM are based on the GNU compiler, which could be the reason why the error is the same. There are other bugs in the ARM implementation; one that I could confirm has since been fixed:

formatting link

Maybe try the latest version of GCC, if you are lucky and manage to compile the compiler :-)

If I compile the code with Visual Studio 2005 for ARM (for a WindowsCE system), your code works, too (needs only one additional "typedef unsigned char uint8_t", because the compiler is not fully ANSI C99 compliant and doesn't know the header file stdint.h).

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

Nils 17 years ago

This is not strictly a bug..

You're violating the C rules of aliasing. In short: you must not modify a value or structure of type A and then read the same memory through a pointer cast from A to B. The compiler is allowed to assume that, because the types differ, the two pointers point to different memory.

I know this is common practice. However, the C-compiler can do anything it wants to in this case. Gcc got a better aliasing analysis backend recently that is more strict about the rules. Lots of old code that depended on this behaviour broke, but even more code improved in performance.. All in all that was a good change.

The way around this is to temporarily cast one of the pointers to (char *). That's the official way to tell a C compiler that the data may alias other data: char * is the wild card. Another, more drastic way would be to declare the pointer you read from as volatile, but that effectively disables any optimizations.

FYI: Declaring the pointers as void* may help as well. Also: x86 code often works simply because values have to be reloaded from memory due to the lack of registers. The compiler could optimize, but it can't because 7 registers aren't enough... So this is one of the bugs you'll first encounter (if ever) when you compile your code on ARM or MIPS or other architectures that have plenty of registers.

Nils

Wilco Dijkstra 17 years ago

There is no aliasing issue in the example. You would be right if he accessed both foo and p2 at the same time in a context where the obvious relation between the pointers is not visible (eg. if both are passed to a different function). There is no aliasing, and neither the memcpy nor the initialization of Foo is redundant (as its address is leaked), so it's a compiler bug.

Wilco

Frank Buss 17 years ago

Can you cite the chapters in the standard? My first feeling was the same, because using void* worked, but it works with Visual C for ARM, which I think does a good register optimization, too, and Wilco says, it is not an aliasing problem.

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

JSprocket 17 years ago

In what sense is that "better"? A warning, perhaps, a switch for "strict" mode, but breaking existing code is a stupid thing to do.

JS

Frank Buss 17 years ago

Looks like it is a problem with memcpy:

formatting link

but I'm not sure, if it is a compiler bug. If you specify "-Ono_memcpy" or "-fpack-struct" when compiling your code, there is no bug anymore. I guess the casting of a member of a struct to a struct pointer is dangerous in combination with the built-in memcpy functions.

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

Wilco Dijkstra 17 years ago

What the standard says is not relevant. The C standards are consistently and deliberately underspecified, unclear and are more concerned with features that were obsolete 20 years ago (like ones-complement and signed overflow) than actually specifying important features like bitfields. So compiler writers typically add their own rules, the most important of which says "thou shalt correctly compile existing code".

What would be more interesting is the generated code for the cp() function. That would clear up whether it is a bug in memcpy or inter procedural alias analysis.

Btw. The -Ono_memcpy option you mentioned in another post only works on the ARM compiler, which is not based on GCC.

Wilco

John Devereux 17 years ago

If that were always true, compilers could never improve. Breakage of *incorrectly* written code is quite common. For example non-volatile accesses to hardware, or "Optimising" of delay loops and time sensitive code.

John Devereux

CBFalconer 17 years ago

What a horrible (and dangerous) attitude. Not only that, but inaccurate also.

One of the great problems with C is the failure to catch overflow in signed integral expressions. It is not solvable, because of the necessity of supporting older code. Bitfields are unimportant, because there are adequate replacements available in the form of constants and bit manipulation operations.

One of the strengths of C is the fact that code that meets the requirements of the standard is almost always totally portable. By itself this is not the final word, but it makes it easy to isolate and emphasize what code needs modification during a port.

[mail]: Chuck F (cbfalconer at maineline dot net) [page]: Try the download section.

CBFalconer 17 years ago

The standard hasn't changed. Thus 'better' in the sense that code that violates the standard is detected. The code didn't break, it was always broken.

[mail]: Chuck F (cbfalconer at maineline dot net) [page]: Try the download section.

Wilco Dijkstra 17 years ago

Most compiler optimizations are independent of language semantics, so it's not true compilers could not improve if they had to be conservative. I know for a fact that one can beat compilers with aggressive optimizations (like GCC) by a huge margin while being very conservative.

Of course if you don't use volatile correctly then it's your own fault. Non-conforming code is not the same though. In general incorrect code fails on most compilers when optimizations are enabled, while non-conforming works on most compilers.

Wilco

Frank Buss 17 years ago

But the code of the OP demonstrates that sometimes it is difficult to meet the standard. For me it is not clear, if the code is violating the standard. Would be nice, if a compiled source code would behave exactly the same on every platform and for all optimization switches, otherwise the compiler should signal an error for ambiguities.

Is there any other standardized computer language that is more restrictively defined? Like Java: there are no dangerous pointer operations, no alignment and aliasing problems, the sizes of the basic types are fixed for every platform, the VM is simple and well defined, etc. But Java is not a standard in the way that, e.g., ECMAScript (aka JavaScript) is. C# and the upcoming C++/CLI look like interesting languages, but because of the CLI it would be difficult to use them on small microcontrollers.

Is it possible at all to define a language more restrictively and still keep it fast? If "int" were defined as 32 bits on all platforms, but the microcontroller has only 16-bit native registers, this would be slower than a C implementation.

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

Walter Banks 17 years ago

GCC may not be a good example of aggressive optimization. GCC's generic targeting is rarely competitive with a well-targeted, processor-specific compiler. I see penalties of 30% or so on the processors I regularly work with.

The biggest weakness of many C compilers is in error reporting. Good error test suites are much less common than functional testing and syntax testing test suites.

Regards,

-- Walter Banks Byte Craft Limited

snipped-for-privacy@bytecraft.com Canada

John Devereux 17 years ago

But the optimisation possible on non-volatile variables *did* break a lot of code that was out there (and still does, with every improvement to this area of optimisation). So my point stands I think: a "conservative" approach would have left all memory accesses alone, in case they were "important".

I admit I am unclear about the difference here!

John Devereux

CBFalconer 17 years ago

... snip ...

Yes. For example, ISO10206. Note that a more complex language is specified in a considerably shorter standard. Yet the language is simpler to use. I am refraining from referencing ISO 7185 because there are usage limitations there.

As referenced above. In your example, if that size of operand is required, the code writer should have specified the long type. Failure to do so is simply a mistake.

[mail]: Chuck F (cbfalconer at maineline dot net) [page]: Try the download section.

Nils 17 years ago

Want a warning for that? No problem.

gcc -Wstrict-aliasing ... does exactly what you're asking for.

Try some -Wstrict-overflow for extra fun. You may be surprised how often you depend on non portable behaviour.

Nils

Nils 17 years ago

Hm... well. I'm not sure. He's not explicitly casting from one type to another (that one would be too easy). Instead he copies via a self-written memcpy function.

However, he does his memcpy via the uint8_t type. Isn't that one a built-in type these days? I only know that the compiler is forced to assume aliasing if the data is cast to char * or volatile * something.

I don't have the slightest idea if the compiler can assume aliasing if you copy via uint8_t. It's the same kind of data-type in the end, but afaik char* is special because of the aliasing handling.

I don't have access to the standard, but I know some regular posters do have access. Maybe they can chime in and take a look at this special case..

From the gut I'd still say it's an aliasing issue. I bet it goes away if memcpy is rewritten to use char* instead of uint8_t*.

Cheers, Nils

Frank Buss 17 years ago

I think the Pascal syntax is too redundant, and you have to think all the time about where you have to write ";" and where it is not allowed. And in the ISO 10206 text many parts are implementation-defined: not only the size of integer types, but important things like module activation order. The text contains more than 50 occurrences of "implementation-defined". This doesn't look like a very restrictive definition, one that allows programs to be compiled for different implementations without modification, as is possible with Java.

But this makes programming a lot harder. For writing portable programs, you have to check each statement with the standard definition. In languages like Java you can write tests for the corner cases for one compilation and then you can be sure it works the same on every VM.

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de
