This is related to my question about interrupts in an STM32F303 processor. It turns out that the problem is in the compiler (or I'm going insane, which is never outside the realm of possibility when I'm working on embedded software).
I'm coding in C++, and I'm using a clever dodge for protecting chunks of code from getting interrupted. Basically, I have a class that protects a block of code from being interrupted. The constructor saves the interrupt state then disables interrupts, and the destructor restores interrupts.
This has been reliable for me for years, but now the destructor is not being called. I suspect that the optimizer can't make sense of it because of the asm statements, and is throwing it away.
If someone knows the proper gnu-magic to tell the optimizer not to do that, I'd appreciate it. I'm going to look in my documentation, but I want to make sure I use the right method, and don't just stumble onto something that works for now but should be deprecated, or is fragile, or whatever.
Here's the "protect a block" class:
typedef class CProtect
{
public:
    CProtect(void)
    {
        int primask_copy;
        asm("mrs %[primask_copy], primask\n\t"  // save interrupt status
            "cpsid i\n\t"                       // disable interrupts
            : [primask_copy] "=r" (primask_copy));
        _primask = primask_copy;
    }

    ~CProtect()
    {
        int primask_copy = _primask;
        // Restore interrupts to their previous value
        asm("msr primask, %[primask_copy]" : : [primask_copy] "r" (primask_copy));
    }

private:
    volatile int _primask;
} CProtect;
and here's how it's used:
{
    CProtect protect;

    // critical code goes here
}
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
I could, but here at Wescott Design Services we have a fairly hard to overcome rule that says "don't thrash the heap". My boss would kill me, which would hurt me twice because I'm my boss.
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
This works (with the optimize attribute specified for each function, and the level set at O0), but I would like some opinions on whether it is kosher. It works even when the overall optimization level is set to "O3", which is cool.
typedef class CProtect
{
public:
    CProtect(void) __attribute__ ((__optimize__ ("O0")))
    {
        int primask_copy;
        asm("mrs %[primask_copy], primask\n\t"  // save interrupt status
            "cpsid i\n\t"                       // disable interrupts
            : [primask_copy] "=r" (primask_copy));
        _primask = primask_copy;
    }

    ~CProtect() __attribute__ ((__optimize__ ("O0")))
    {
        int primask_copy = _primask;
        // Restore interrupts to their previous value
        asm("msr primask, %[primask_copy]" : : [primask_copy] "r" (primask_copy));
    }

private:
    volatile int _primask;
} CProtect;
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Sorry Tim, but my initial reaction, in a good natured way, is yuck! :-)
The code feels to me like you are trying to trick the compiler instead of solving the core problem and the proposed solution feels "fragile".
Are you sure you can't use "asm volatile" with C++ code?
I don't know if that would solve your problem but if it did, it would feel more "legitimate" to me as volatile is documented to behave in certain ways as you can see from the page I pointed you to.
Simon.
--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
Some compilers consider inline assembly as "volatile" - they view them as something scary, and make sure everything before them is completely finished before executing the secret assembly code, and basically turn off all optimisation around the inline assembly call.
gcc (and clang, and a few other compilers) is not like that - it provides ways for the programmer to tell the compiler exactly what the assembly code affects or depends on, so that it can optimise around it. This is extremely useful for some sorts of inline assembly, and it lets you make good use of processor instructions that cannot easily be expressed in C (such as a bit reverse instruction) with only the bare minimum being written in assembly. It also means you don't have to mess around with things like the "primask_copy" variable in this CProtect class - gcc understands these things, and makes copies in registers as needed.
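As a small illustration of that style, here is a sketch of a bit-reverse helper. The function name reverse_bits() and the guard macros are assumptions for the example: the asm path is taken only when building for a Cortex-M3/M4 class core (which has the RBIT instruction), and everything else gets a portable loop so the fragment also builds on a host. Note there is no "volatile" here, on purpose: the result depends only on the input, so the compiler is welcome to drop or merge calls.

```cpp
#include <cstdint>

// Reverse the bit order of a 32-bit word.
static inline uint32_t reverse_bits(uint32_t in)
{
#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
    uint32_t out;
    // The operand list tells gcc everything it needs: one register output,
    // one register input, no side effects. It picks the registers itself.
    asm("rbit %[out], %[in]" : [out] "=r" (out) : [in] "r" (in));
    return out;
#else
    // Portable fallback for off-target builds: shift bits out of "in"
    // from the bottom and into "out" from the bottom, 32 times.
    uint32_t out = 0;
    for (int i = 0; i < 32; ++i) {
        out = (out << 1) | (in & 1u);
        in >>= 1;
    }
    return out;
#endif
}
```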
The flipside is that you have to know the rules, and be very careful to apply them.
A key rule here is "volatile". A normal inline assembly statement is considered non-volatile - the compiler is free to omit it if it is dead code, and can re-order it as it finds convenient. (Inline assembly statements with no outputs, and whose inputs don't involve addresses, are considered "volatile" by default as they would be pointless if they didn't do something unknown to the compiler.) So step one is to make the inline assembly statements "volatile" so the compiler knows it has to execute them, and it has to do so in order.
The second key rule is the interaction of "volatile" accesses (either volatile reads and writes, volatile inline assembly, or calls to unknown external code) and normal accesses. C does not specify this ordering in any way. So in code like this:
int a;
volatile int v;

void foo(void)
{
    a = 0;
    v = 1;
    a++;
    v = 2;
    a++;
}
the compiler can re-arrange writes to "a" with writes to "v". It can replace all the accesses to "a" with a single "a = 2;", and it can put that before, in the middle of, or after the two volatile writes to "v".
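The same freedom is what breaks a very common pattern: bracketing an ordinary write with interrupt disable/enable. A sketch of that pattern follows. The names are hypothetical (a 64-bit global "big" and a function set_big()), and the guard macros assume a Cortex-M3/M4 build; on any other target the asm statements are compiled out so the fragment still builds.

```cpp
#include <cstdint>

// Hypothetical example data: 64 bits wide, so a 32-bit core needs
// two separate stores to write it.
uint64_t big;

#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
#define IRQ_DISABLE() asm volatile ("cpsid i")
#define IRQ_ENABLE()  asm volatile ("cpsie i")
#else
// Off-target builds: no interrupt control available, compile to nothing.
#define IRQ_DISABLE() ((void)0)
#define IRQ_ENABLE()  ((void)0)
#endif

void set_big(uint64_t value)
{
    IRQ_DISABLE();
    big = value;    // ordinary (non-volatile) write: the compiler is allowed
                    // to move it before or after either asm statement
    IRQ_ENABLE();
}
```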
This will not work, except by luck - the compiler can re-order the write to "big" with respect to the interrupt disable/enable, and therefore destroy your hopes of making an atomic write.
The way to deal with this is to make the write to "big" volatile, to add artificial volatile dependencies that enforce the order, or to use "clobbers" in the assembly statements. Clobbers can be quite sophisticated when you want to get the maximal performance (by using minimal clobbers), but the easiest and therefore safest method is to clobber "memory":
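A minimal sketch of the clobbered version of the "big" write. The variable's type, the function name, and the macro names are assumptions for illustration; the guard macros assume a Cortex-M3/M4 build, and off-target the macros fall back to an empty asm with the same clobber, which is still a compiler barrier on gcc and clang.

```cpp
#include <cstdint>

uint64_t big;   // hypothetical; 64 bits so the write takes two stores on a 32-bit core

#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
#define IRQ_DISABLE() asm volatile ("cpsid i" ::: "memory")
#define IRQ_ENABLE()  asm volatile ("cpsie i" ::: "memory")
#else
// Off-target stand-in: no instruction emitted, but the "memory" clobber
// still stops the compiler moving memory accesses across this point.
#define IRQ_DISABLE() asm volatile ("" ::: "memory")
#define IRQ_ENABLE()  asm volatile ("" ::: "memory")
#endif

void set_big(uint64_t value)
{
    IRQ_DISABLE();   // earlier memory writes must be completed before this
    big = value;     // can no longer be hoisted above or sunk below the asm
    IRQ_ENABLE();    // the write to "big" must be completed before this
}
```

Note that this fragment re-enables interrupts unconditionally, which is wrong if they were already disabled on entry. Saving and restoring PRIMASK, as the CProtect class does, avoids that.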
The memory clobber tells the compiler that the inline assembly might read or write memory in unexpected ways - all statements that logically write something to memory that appear before the inline assembly, must complete those writes. And any logical reads from memory after the inline assembly, cannot be started until after the assembly. Data from memory cannot be cached in registers across the assembly.
Once we have cleaned up the other minor issues in your class (the unnecessary "volatile" on the private member, the unnecessary typedef, the use of "int" instead of "uint32_t", and the use of reserved identifiers with leading underscores), we get this:
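A sketch of what the class looks like with those changes applied, using "asm volatile" and a "memory" clobber on both statements. The member name primask_, the deleted copy operations, the CPROTECT_ON_TARGET macro, and the off-target stand-in fake_primask are assumptions added for illustration, not part of the original class; the target test assumes a Cortex-M3/M4 build.

```cpp
#include <cstdint>

#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
#define CPROTECT_ON_TARGET 1
#else
#define CPROTECT_ON_TARGET 0
// Off-target stand-in for the PRIMASK register (host builds only).
static uint32_t fake_primask = 0;
#endif

class CProtect {
public:
    CProtect()
    {
#if CPROTECT_ON_TARGET
        // volatile: the statement must not be dropped or duplicated.
        // "memory": protected accesses may not move above this point.
        asm volatile ("mrs %[saved], primask\n\t"   // save interrupt status
                      "cpsid i"                     // disable interrupts
                      : [saved] "=r" (primask_)
                      :
                      : "memory");
#else
        primask_ = fake_primask;
        fake_primask = 1;
        asm volatile ("" ::: "memory");
#endif
    }

    ~CProtect()
    {
#if CPROTECT_ON_TARGET
        // Restore interrupts to their previous state.
        asm volatile ("msr primask, %[saved]"
                      :
                      : [saved] "r" (primask_)
                      : "memory");
#else
        asm volatile ("" ::: "memory");
        fake_primask = primask_;
#endif
    }

    // Copying a guard would restore the saved state twice.
    CProtect(const CProtect&) = delete;
    CProtect& operator=(const CProtect&) = delete;

private:
    uint32_t primask_;   // no "volatile" needed; the asm statements carry the ordering
};
```

Usage is unchanged: declare a CProtect at the top of a block and put the critical code after it. Nested guards work because each one restores whatever state it found.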
Well, there's a reason I'm tossing it out to the group for comment!
Me, too. Actually, I had been compiling at -O1, possibly because with the Cortex M3 processor set it worked at that level but not higher.
I can. I just can't use "volatile asm". See my own reply that's parallel with yours.
"asm volatile" certainly seems to fix the issue (which ended up being that the optimizer had an extraneous call to part of the constructor, not a missing call to the destructor, BTW).
The general rule is that if you think you need to reduce optimisation to make your code work, your code is wrong. Very occasionally, the compiler is broken - but that should be rare indeed.
"-Os" does most of the "-O2" optimisations, except for an emphasis on smaller size if the speed optimisation in "-O2" would expand the code significantly. (Note that you still get inlining and occasional loop unrolling - but only if the result is smaller code, or if you asked for the inlining explicitly.)
As always with optimisation flags, it keeps correct code correct - but makes it more likely that poor code (such as missing or incorrect volatiles) breaks dramatically.