Question About Sequence Points and Interrupt/Thread Safety

Assigning -1 is an established trick^H^H^H^H^Hmethod to assign the maximum possible value to an unsigned variable.

Reply to
Ike Naar

Setting an unsigned to -1 is a normal and standard way of setting it to the corresponding unsigned-whatever_MAX. Not in the least questionable.

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.
Reply to
CBFalconer

But the point is that "x = a + b + c + v" is NOT merely *an* expression. It is an expression that combines a tree of other expressions. The tree for this example is easily derived from the grammar:

     =                     Expr 1 (assignment)
    / \
  'x'  +                   Expr 2 (addition)
      / \
    'v'  +                 Expr 3 (addition)
        / \
      'c'  +               Expr 4 (addition)
          / \
        'b' 'a'

(Not including the trivial "primary expressions" that the variable references themselves represent.) Expressions 1 and 2 "refer to" a volatile object. Expressions 3 and 4, though their evaluation is required as input to 2 and 1, do not themselves refer to any volatile object.

I've just shown you how to construct such an interpretation.

Indeed, were one to try to apply the intent that you consider so obvious, there is really no consistent and reasonable way to apply it. What if the expression were instead:

x = ((a + b + c) && 0) ? 1 : v;

Now, according to your interpretation, the compiler must generate code to evaluate 'a + b + c' (in the correct order, etc.), logically AND the sum with the constant 0, compare that result with 0, and then and only then access the volatile object 'v', assigning the resulting value to 'x' (presumably including the code that would assign 1 in the event that 0 was somehow found not to be equal to 0).

I find it extremely difficult to believe that anyone would find *this* a reasonable interpretation of the standard.

On the other hand, with this expression (both v1 and v2 volatile): x = (v1 && ((a + b + c) && 0) && v2); it *would* be extremely reasonable to conclude that the standard requires an "access" to v1 (however an access is defined by the implementation), followed by an assignment of 0 to x. (But no additions, or comparisons, or code to access v2.)

Yet, use of the well-defined term would have clearly expressed what you say is the 'obvious intent'. And without that term, there is ample reason for a different interpretation.

GH

Reply to
Gil Hamilton

Richard Heathfield wrote in news:y snipped-for-privacy@bt.com:

It is not guaranteed. The standard does not require two's complement representation to be used.

Indeed, this is the primary reason for having UCHAR_MAX in the first place: so that you don't have to make the assumption that the maximum unsigned value has the same representation as signed -1.

GH

Reply to
Gil Hamilton

Agreed. The standard explicitly mentions one's complement, for example.

Isn't it the case that the value -1, cast to unsigned and regardless of representation, must become whatever represents -1 MODULO (UCHAR_MAX+1)? In other words, UCHAR_MAX? Or am I forgetting my modulos?

Jon

Reply to
Jon Kirwan

Please see 6.3.1.3 (1-2) of the C99 standard.

(1) When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged. [Doesn't apply here, included for context.]

(2) Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

So although the standard does not require two's complement representation to be used internally, conversion of negative values to unsigned types is required to behave as though they were two's complement, as far as I can tell. This might require extra code to be generated if the representations are actually different.

Thus, given

#include <stdio.h>

int main(void)
{
    signed char s = -1;
    unsigned char u1, u2;

    u1 = s;                      /* value conversion */
    u2 = *(unsigned char *)&s;   /* representation inspection */
    printf("%u %u\n", u1, u2);
    return 0;
}

I believe the first value output must be equal to UCHAR_MAX. The second value output examines the representation of `s', and need not equal UCHAR_MAX. On a one's complement machine, it would presumably be UCHAR_MAX - 1. On a sign-magnitude machine, it might be 129.

They needn't have the same *representation*, but the latter is *converted* to the former. UCHAR_MAX looks nicer than (unsigned char)-1, and also provides consistency with CHAR_MAX, etc, but I don't think it could actually have a different value.

Reply to
Nate Eldredge

Er, so what? This is about values, not about representation.

This isn't about representation.

When you try to assign to an unsigned integer type a value that is not within the range it can store, the value is "reduced" (which might mean increasing it) into the appropriate range. Thus, if you do this: unsigned char uc = -17; then what you actually get in uc is the value (UCHAR_MAX + 1 - 17). If UCHAR_MAX is 255, then you get 256 - 17, or 239. Representation has nothing to do with it. I refer you again to the above citation from the Standard.
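A minimal sketch checking that arithmetic (assuming, as in the example, an implementation where UCHAR_MAX is 255):

#include <stdio.h>

int main(void)
{
    unsigned char uc = -17;    /* -17 is reduced modulo UCHAR_MAX + 1 */

    printf("%u\n", (unsigned)uc);  /* prints 239 when UCHAR_MAX is 255 */
    return 0;
}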

--
Richard Heathfield 
Email: -http://www. +rjh@
Google users: 
"Usenet is a strange place" - dmr 29 July 1999
Reply to
Richard Heathfield

Nonsense. Representation has nothing to do with it. The result is defined by value, not by representation.

Reply to
Hans-Bernhard Bröker

The code has led to several questions in this thread - ergo, it is more than a little questionable.

I write embedded software for a living. To do that well, I aim to write code that makes sense logically, that is as clear in its meaning and self-explanatory as possible, and that is portable to different *real* embedded compilers where practical (but taking advantage of better tools if it makes sense). The C standards are not a Bible to me - *real* embedded processors and *real* embedded development tools are what count.

So for me, the line "unsigned char l_12 = -1;" has three glaring faults, and I consider it incorrect code regardless of whether it works or not.

First and most obviously, the name "l_12" is unreadable and uninformative - even something like "x_12" would be better, since it avoids using a lone lowercase "l" that looks like a "1".

Secondly, assigning a negative value to an unsigned type is mathematical and logical nonsense. I don't give a *beep* what the standards say, or even whether the code works or not - it doesn't make sense, so the code is wrong.

Thirdly, a "char" holds a character - something like 'A'. If you want to hold a number, use a number type. That's called "good typing", and is part of "good programming". It is one of C's many unfortunate design flaws that there are no fundamental types for smaller numbers, and that instead you must abuse "char". Fortunately there is "typedef", and especially <stdint.h>, which allows you to write "uint8_t".

So for an embedded programmer, correct code will be more like:

"uint8_t x_12 = 0xff;"

If you are working with code for a wider range of systems, including those that can't handle 8-bit data, and you are really writing code that is portable across such systems, you might want something like:

"uint_fast8_t x_12 = UINT_FAST8_MAX;"

Of course, if you are writing such portable code, then lines like this are the least of your worries - a great many uses of "UCHAR_MAX" and related constants are a waste of time since the rest of the code would not work if it were anything other than 0xff.
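For concreteness, a minimal sketch of both forms together (assuming a C99 <stdint.h>; UINT_FAST8_MAX is the standard maximum for uint_fast8_t):

#include <stdint.h>

int main(void)
{
    uint8_t      x_12 = 0xff;            /* exact 8-bit type, as above */
    uint_fast8_t y_12 = UINT_FAST8_MAX;  /* "at least 8 bits", width-portable */

    (void)x_12;  /* silence unused-variable warnings */
    (void)y_12;
    return 0;
}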

Reply to
David Brown

Yes, but I fail to see how that contradicts what I wrote. The standard says "any expression". It does not say, or imply, "any expression that doesn't combine a tree of other expressions".

You've reversed the order of the operands of the "+" operators. That shouldn't matter, since addition is commutative, but I think it would be clearer to write it as:

           =               Expr 1 (assignment)
          / \
        'x'  +             Expr 2 (addition)
            / \
           +  'v'          Expr 3 (addition)
          / \
         +  'c'            Expr 4 (addition)
        / \
      'a' 'b'

Expressions 1 and 2 refer to a volatile object, therefore expressions 1 and 2 must be evaluated strictly according to the rules of the abstract machine. To evaluate an expression strictly according to the rules of the abstract machine, you must evaluate any of its subexpressions strictly according to the rules of the abstract machine.

And I disagree with your reasoning.

Yes.

I agree that it's not a reasonable *requirement*, but I think it's the only possible interpretation of what the standard actually says.

[...]

I didn't use the phrase "obvious intent" in this context; I used it only to refer to other cases where the standard's wording is flawed. (For example, the standard's definition of "expression" implies that 42 is not an expression, but the obvious intent is that it is. Similarly, the standard's definition of "lvalue" implies that, assuming 42 is an expression, 42 is an lvalue, but the obvious intent is that it is not.)

Personally, I see no obvious intent here; I honestly don't know what the standard *should* say. (I've never really used "volatile", so I don't have much to say about how it should work.)

If you're dissatisfied with what the standard currently says about "volatile" (and it seems to me that you should be), then the first step in fixing it is to figure out what it *should* say; the next step is to express that in standardese.

--
Keith Thompson (The_Other_Keith) kst@mib.org  
Nokia
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
Reply to
Keith Thompson

That is not a useful criterion -- you would not be able to write a single line of C if you avoided every construct that has led to questions here!

I think you are putting too much emphasis on the constant expression. If you write code that subtracts from an unsigned integer, the result can be negative and C's "reduce modulo max + 1" rule kicks in. In fact it kicks in when you add as well, if the result is larger than the maximum value for the type. In other words, assigning -1 is no different than x -= 1 when x is zero. I think all embedded programmers will know that this "wraps round".
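As a minimal sketch of that equivalence (nothing assumed beyond the standard rule; on a typical machine both values print as 255):

#include <stdio.h>

int main(void)
{
    unsigned char x = 0;
    unsigned char y = -1;     /* -1 reduced modulo UCHAR_MAX + 1 */

    x -= 1;                   /* 0 - 1, reduced by the same rule */
    printf("%u %u\n", (unsigned)x, (unsigned)y);  /* both print UCHAR_MAX */
    return 0;
}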

My point is that I think it helps to know what is really happening: that the right hand side of x = x - 1; is often a negative number being assigned back to an unsigned variable. In that context x = -1; is not surprising.

I know that I stand little chance of persuading someone who has used "real" (in bold) three times (and who has bleeped out an expletive) but I throw it out there as food for thought!

One other minor point:

The trouble with this (and it is only a small style matter) is that the type information "leaks" over to the right hand side. It is very handy that the zero of all arithmetic types is written the same way (0), and I like the fact that the maximum value of all unsigned types is similarly available (-1). (OK, I know -1 is not really such a maximum, but I hope you see what I am getting at.)

Outside of the embedded world it can be useful to write "width"-neutral code so that, simply by changing a typedef, the code can work on a different range of values. I've even seen it used like this:

typedef struct { unsigned value:WIDTH; } integer;

if (x.value < (integer){-1}.value) ...

although I would agree if you accused this of being a bit too "tricksy".
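Fleshed out into a complete program, it might look like this (the WIDTH value is a hypothetical choice for the sketch; the compound literal (integer){-1} is C99):

#include <stdio.h>

#define WIDTH 12  /* hypothetical field width */

typedef struct { unsigned value:WIDTH; } integer;

int main(void)
{
    integer x = { 1000 };

    /* -1 converted to the WIDTH-bit field yields the field's maximum */
    if (x.value < (integer){-1}.value)
        printf("%u < %u\n", (unsigned)x.value,
               (unsigned)(integer){-1}.value);
    return 0;
}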

--
Ben.
Reply to
Ben Bacarisse

Representation is not being questioned; value is. However, 6.2.6.1 clause 3 implies that what applies to one applies to both. The value of the expression ``-1'', converted to unsigned char, is the value that yields 0 when 1 is added to it, under the rules of modulo arithmetic. UCHAR_MAX uniquely has that property.
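A minimal sketch of that uniqueness property (relying only on what the standard guarantees):

#include <assert.h>
#include <limits.h>

int main(void)
{
    unsigned char m = -1;

    /* m is the one unsigned char value that wraps to 0 when 1 is added */
    assert((unsigned char)(m + 1) == 0);
    assert(m == UCHAR_MAX);
    return 0;
}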

Are you sure - do you have any evidence to back up that claim?

Phil

--
I tried the Vista speech recognition by running the tutorial. I was 
amazed, it was awesome, recognised every word I said. Then I said the 
wrong word ... and it typed the right one. It was actually just 
detecting a sound and printing the expected word! -- pbhj on /.
Reply to
Phil Carmody

Yes, embedded programmers know that "all" processors and C compilers use two's complement wrapping arithmetic (except when they use saturating arithmetic...).

I am not doubting that in this context "x = -1" and "x = 0xff" have identical effects, nor do I doubt that (as others have pointed out) "x = -1" is commonly written in C programs even when "x" is unsigned. But it is illogical and nonsensical when read or written by a person, and it will be flagged as an error by more stringent tools such as lint, because it is assigning a value outside the range expressible by the type. Other programming languages that have stronger compile-time checking (such as Pascal or Ada) would reject it outright.

Just because the C compiler accepts the code, and C programmers write such code, and (barring compiler bugs) the generated code does what the programmer expects, does not mean it is good code!

These bolds look a little more dramatic than the asterisks did when I wrote them...

I see your point, and I understand why people use -1 in this way. I don't see a problem with the type information "leaking" in this way - you should be aware of the type of both the left hand side and right hand side of any assignment.

"Tricksy" code is often useful, but where possible such code should be hidden through macros, constants, inline functions and the like, so that the main code remains clear. You could write something like:

#define maxUInt -1

hidden away in a header file, and in the main code write:

void foo(void) { uint8_t x = maxUInt; }

That would have an identical effect as far as the compiler is concerned (and still annoy tools like lint), but at least it would make sense to human readers.

I agree that it is sometimes useful to write code with types whose size is easily changed - that is perhaps more important in embedded systems, where you often need to use types that are as small as possible for efficiency (for example, a buffer index on an AVR would be 8-bit if possible, but 16-bit if necessary). But it is seldom a hardship to define things like maximum values as symbolic constants of some kind at the same place as the actual concrete width is defined.
Reply to
David Brown

I am a bit stumped by this. My sarcasm detector went off, but I can't see your point. If your compiler does not do standard (as in the C standard) arithmetic, then you are not writing in C but something a bit like it. Are such non-conforming compilers common in the embedded world?

Agreed. I was arguing that it is good. I am happy that we just disagree about that.

I expressed it badly. You have duplication of information. The correct constant is tied to the type. Often this does not matter, but if the type might change it does matter a bit more.

I'd rather not have to check that the macro is correct. If I see -1 being assigned, I know what is happening. I'd have no objection to a comment:

uint8_t x = -1; /* max value for that unsigned type */

--
Ben.
Reply to
Ben Bacarisse

Do you feel the same way about

unsigned char uc = -1u;

?

Phil

Reply to
Phil Carmody

-1u is parsed as -(1u), not as (-1)u. So, adding the ``u'' does not change the fact that an unsigned variable is initialised with a negative number.

Reply to
Ike Naar

Let me choose my words more carefully: adding the ``u'' does not change the fact that an unsigned variable is initialised with something that has the appearance of a negative number (effectively, -(1u) is unsigned).
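A minimal demonstration of that point:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned u = -1u;  /* -(1u): unary minus applied in unsigned arithmetic */

    printf("%u %u\n", u, UINT_MAX);  /* prints UINT_MAX twice */
    return 0;
}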

Reply to
Ike Naar

Yes and no; for Ada it depends on the type definition. Ada has both unsigned types with modular arithmetic, similar to the C unsigned, and integer types with range restrictions but with non-modular arithmetic. For example, here is a small Ada program:

with Ada.Text_IO;

procedure Uns is

   type Unsigned_T is mod 256;
   -- Unsigned type with arithmetic modulo 256.
   -- The range is 0 .. 255.

   Minus_1 : constant Unsigned_T := -1;
   -- Correct; the result is 255, as in C, because
   -- the unary "-" operation is taken mod 256.

   Non_Negative : Natural;
   -- A variable of a (predefined) non-modular type
   -- with the range constraint 0 .. Integer'Last.

begin

   Ada.Text_IO.Put_Line (Unsigned_T'Image (Minus_1));
   -- Outputs 255.

   Non_Negative := -1;
   -- Compiler warns that the value -1 is not in range.
   -- Exception Constraint_Error is raised at run-time.

   Ada.Text_IO.Put_Line (Natural'Image (Non_Negative));

end Uns;

So for modular integer types Ada has made the same choice as C: the expression -1 is acceptable and yields the largest value of the type.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .
Reply to
Niklas Holsti

Sometimes saturating arithmetic is used in embedded processors, especially in DSPs (or DSP extensions to conventional processors).

It will not matter if the constant used is a symbolic constant which is also tied to the type in its definition. You write something like "indexType i = indexTypeMax", and make sure that indexTypeMax is appropriately defined depending on the size of indexType.
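A minimal sketch of that convention (indexType and indexTypeMax are the illustrative names from the post; the concrete width is an assumption):

#include <stdint.h>
#include <stdio.h>

typedef uint8_t indexType;      /* change the width here...         */
#define indexTypeMax UINT8_MAX  /* ...and the matching maximum here */

int main(void)
{
    indexType i = indexTypeMax;

    printf("%u\n", (unsigned)i);  /* 255 for the 8-bit choice */
    return 0;
}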

Never write something in a comment if it can be expressed equally well in the language itself.

You don't have to check that the macro is correct any more or less than you have to check that the type definition is correct - and the definitions should be in the same place.

Reply to
David Brown

... snip ...

And it may not work if it is 0xff. Nothing forces a byte to be 8 bits. Examine the value of CHAR_BIT. The use of -1 will reliably set the U*_MAX value.
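A minimal sketch of the difference (on the common CHAR_BIT == 8 machine both values are the same; on a hypothetical CHAR_BIT == 16 target only the -1 form would still equal UCHAR_MAX):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned char a = 0xff;  /* equals UCHAR_MAX only when CHAR_BIT == 8 */
    unsigned char b = -1;    /* equals UCHAR_MAX for any CHAR_BIT        */

    printf("CHAR_BIT=%d a=%u b=%u UCHAR_MAX=%u\n",
           CHAR_BIT, (unsigned)a, (unsigned)b, (unsigned)UCHAR_MAX);
    return 0;
}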

Reply to
CBFalconer
