The Semantics of 'volatile'
===========================

I've been meaning to get to this for a while; finally there's a suitable chunk of free time available to do so.

To explain the semantics of 'volatile', we consider several questions about the concept and how volatile variables behave, etc. The questions are:

  1. What does volatile do?
  2. What guarantees does using volatile provide? (What memory regimes must be affected by using volatile?)
  3. What limits does the Standard set on how using volatile can affect program behavior?
  4. When is it necessary to use volatile?

We will take up each question in the order above. The comments are intended to address both developers (those who write C code) and implementors (those who write C compilers and libraries).

What does volatile do?
----------------------

This question is easy to answer if we're willing to accept an answer that may seem somewhat nebulous. Volatile allows contact between execution internals, which are completely under control of the implementation, and external regimes (processes or other agents) not under control of the implementation. To provide such contact, and provide it in a well-defined way, using volatile must ensure a common model for how memory is accessed by the implementation and by the external regime(s) in question.

The subsequent answers will fill in the details around this high-level one.
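
To make the high-level answer concrete, here's the classic shape of a volatile use: polling a device status register. Everything here is invented for illustration; on real hardware STATUS would hold a fixed, implementation-documented device address, which is simulated below with an ordinary object so the sketch can run anywhere.

```c
/* Simulated device register; a real STATUS would be initialized
   to a fixed machine address rather than &fake_status_reg. */
static unsigned int fake_status_reg = 0u;
static volatile unsigned int *const STATUS = &fake_status_reg;
#define READY_BIT 0x01u

/* Poll until the device reports ready. Because *STATUS is a
   volatile-qualified access, every iteration performs a fresh load;
   without volatile the compiler could hoist the load out of the loop. */
static int wait_until_ready(void)
{
    int polls = 0;
    while ((*STATUS & READY_BIT) == 0u) {
        if (++polls == 3)
            fake_status_reg = READY_BIT;  /* stand-in for the device acting */
    }
    return polls;
}
```

Whether such a loop actually coordinates with a real device on a given platform depends on exactly the implementation-defined choices discussed in the rest of this article.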

What guarantees does using volatile provide?
--------------------------------------------

The short answer is "None." That deserves some elaboration.

Another way of asking this question is, "What memory regimes must be affected by using volatile?" Let's consider some possibilities:

  1. Accesses occur not just to registers but to process virtual memory (which might be just cache); threads running in the same process affect and are affected by these accesses.
  2. Accesses occur not just to cache but are forced out into the inter-process memory (or "RAM"); other processes running on the same CPU core affect and are affected by these accesses.
  3. Accesses occur not just to memory belonging to the one core but to memory shared by all the cores on a die; other processes running on the same CPU (but not necessarily the same core) affect and are affected by these accesses.
  4. Accesses occur not just to memory belonging to one CPU but to memory shared by all the CPUs on the motherboard; processes running on the same motherboard (even if on another CPU on that motherboard) affect and are affected by these accesses.
  5. Accesses occur not just to fast memory but also to some slower, more permanent memory (such as a "swap file"); other agents that access the "swap file" affect and are affected by these accesses.

The different examples are intended informally, and in many cases there is no distinction between several of the different layers. The point is that different choices of regime are possible (and I'm sure many readers can provide others, such as not only which memory is affected but what ordering guarantees are provided). Now the question again: which (if any) of these different regimes are /guaranteed/ to be included by a 'volatile' access?

The answer is none of the above. More specifically, the Standard leaves the choice completely up to the implementation. This specification is given in one sentence in 6.7.3 p 6, namely:

What constitutes an access to an object that has volatile-qualified type is implementation-defined.

So a volatile access could be defined as coordinating with any of the different memory regime alternatives listed above, or other, more exotic, memory regimes, or even (in the claims of some ISO committee participants) no particular other memory regimes at all (so a compiler would be free to ignore volatile completely)[*]. How extreme this range is may be open to debate, but I note that Larry Jones, for one, has stated unequivocally that the possibility of ignoring volatile completely is allowed under the proviso given above. The key point is that the Standard does not identify which memory regimes must be affected by using volatile, but leaves that decision to the implementation.

A corollary to the above is that any volatile-qualified access automatically introduces an implementation-defined aspect into a program.

[*] Possibly not counting the specific uses of 'volatile' as it pertains to setjmp/longjmp and signals that the Standard identifies, but these are side issues.

What limits are there on how volatile access can affect program behavior?
-------------------------------------------------------------------------

More precisely, the Standard says (also in 6.7.3 p 6):

An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects.

Nowhere in the Standard are any limitations stated as to what such side effects might be. Since they aren't defined, the rules of the Standard identify the consequences as "undefined behavior". Any volatile-qualified access results in undefined behavior (in the sense that the Standard uses the term).

Some people are bothered by the idea that using volatile produces undefined behavior, but there really isn't any reason to be. At some level any C statement (or variable access) might behave in ways we don't expect or want. Program execution can always be affected by peculiar hardware, or a buggy OS, or cosmic rays, or anything else outside the realm of what the implementation knows about. It's always possible that there will be unexpected changes or side effects, in the sense that they are unexpected by the implementation, whether volatile is used or not. The difference is, using volatile interacts with these external forces in a more well-defined way; if volatile is omitted, there is no guarantee as to how external forces on particular parts of the physical machine might affect (or be affected by) changes in the abstract machine.

Somewhat more succinctly: using volatile doesn't affect the semantics of the abstract machine; it admits undefined behavior by unknown external forces, which isn't any different from the non-volatile case, except that using volatile adds some (implementation-defined) requirements about how the abstract machine maps onto the physical machine in the external forces' universe. However, since the Standard mentions unknown side effects explicitly, such things seem more "expectable" when volatile is used. (volatile == Expect the unexpected?)

When is it necessary to use volatile?
-------------------------------------

In terms of pragmatics this question is the most interesting of the four. Of course, as phrased the question asked is more of a developer question; for implementors, the phrasing would be something more like "What requirements must my implementation meet to satisfy developers who are using 'volatile' as the Standard expects?"

To get some details out of the way, there are two specific cases where it's necessary to use volatile, called out explicitly in the Standard, namely setjmp/longjmp (in 7.13.2.1 p 3) and accessing static objects in a signal handler (in 7.14.1.1 p 5). If you're a developer writing code for one of these situations, either use volatile, code around it so volatile isn't needed (this can be done for setjmp), or be sure that the particular code you're writing is covered by some implementation-defined guarantees (extensions or whatever). Similarly, if you're an implementor, be sure that using volatile in the specific cases mentioned produces code that works; what this means is that the volatile-using code should behave just like it would under regular, non-exotic control structures. Of course, it's even better if the implementation can do more than the minimum, such as: define and document some additional cases for signal handling code; make variable access in setjmp functions work without having to use volatile, or give warnings for potential transgressions (or both).
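
As a concrete illustration of the setjmp case (the function and the values are invented for the sketch), a local variable modified between setjmp and longjmp needs volatile for its value to be well defined after the jump:

```c
#include <setjmp.h>

static jmp_buf env;

/* 'count' must be volatile: 7.13.2.1 p 3 says the values of
   non-volatile automatic objects modified between the setjmp call
   and the longjmp call are indeterminate after the jump. */
static int count_to_three(void)
{
    volatile int count = 0;
    if (setjmp(env) < 3) {   /* returns 0 first, then longjmp's value */
        count = count + 1;
        longjmp(env, count);
    }
    return count;            /* well defined only because of volatile */
}
```

Dropping the qualifier here is exactly the kind of transgression a helpful implementation might warn about: on an implementation that keeps 'count' in a register across the jump, the returned value would be indeterminate.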

The two specific cases are easy to identify, but of course the interesting cases are everything else! This area is one of the murkiest in C programming, and it's useful to take a moment to understand why. For implementors, there is a tension between code generation and what semantic interpretation the Standard requires, mostly because of optimization concerns. Nowhere is this tension felt more keenly than in translating 'volatile' references faithfully, because volatile exists to make actions in the abstract machine align with those occurring in the physical machine, and such alignment prevents many kinds of optimization. To appreciate the delicacy of the question, let's look at some different models for how implementations might behave.

The first model is given as an Example in 5.1.2.3 p 8:

EXAMPLE 1 An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics.

We call this the "White Box model". When using implementations that follow the White Box model, it's never necessary to use volatile (as the Standard itself points out: "The keyword volatile would then be redundant.").

At the other end of the spectrum, a "Black Box model" can be inferred based on the statements in 5.1.2.3 p 5. Consider an implementation that secretly maintains "shadow memory" for all objects in a program execution. Regular memory addresses are used for address-taking or index calculation, but any actual memory accesses would access only the shadow memory (which is at a different location), except for volatile-qualified accesses, which would load or store objects in the regular object memory (i.e., at the machine addresses produced by pointer arithmetic or the & operator, etc.). Only the implementation would know how to turn a regular address into a "shadow" object access. Under the Black Box model, volatile objects, and only volatile objects, are usable in any useful way by any activity outside of or not under control of the implementation.

At this point we might stop and say, well, let's just make a conservative assumption that the implementation is following the Black Box model, and that way we'll always be safe. The problem with this assumption is that it's too conservative; no sensible implementation would behave this way. Consider some of the ramifications:

  1. Couldn't use a debugger to examine variables (except volatile variables);
  2. Couldn't call an externally defined function written in assembly or another language, unless the function is declared with a prototype having volatile-qualified parameters (and even that case isn't completely clear, because of the rule at the end of 6.7.5.3 p 15 about how function types are compared and composited);
  3. Couldn't call ordinary OS functions like read() and write() unless the memory buffers were accessed using volatile-qualified expressions.

These "impossible" conditions never happen because no implementation is silly enough to take the Black Box model literally. Technically, it would be allowed, but no one would use it because it breaks too many deep assumptions about how a C runtime interacts with its environment.

A more realistic model is one of many "Gray Box models" such as the example implementation mentioned in 5.1.2.3 p 9:

Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. In this type of implementation, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage, as well as other implementation-defined restrictions.

Here the implementation has made a design choice that makes volatile superfluous in many cases. To get variable values to store-synchronize, we need only call an appropriate function:

extern void okey_dokey( void );
extern int v;

...
v = 49;         // storing into v is a "volatile" access
okey_dokey();
foo( v );       // this access is also "volatile"

Note that these "volatile" accesses work the way an actual volatile access does because of an implementation choice about calling functions defined in other translation units; obviously that's implementation dependent.
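
The quoted passage's point about interrupt service routines is the same one the Standard makes for signal handlers: an object shared with a handler must be a static volatile sig_atomic_t. A minimal sketch, using raise() to deliver the signal synchronously:

```c
#include <signal.h>

/* 7.14.1.1 p 5: a handler may portably write only to static objects
   of type 'volatile sig_atomic_t' (aside from calling signal itself). */
static volatile sig_atomic_t got_signal = 0;

static void handler(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Install the handler, raise the signal, and report the flag.
   raise() runs the handler before it returns. */
static int run_signal_demo(void)
{
    signal(SIGINT, handler);
    raise(SIGINT);
    return (int)got_signal;
}
```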

Let's look at one more model, of interest because it comes up in operating systems, which are especially prone to want to do things that won't work without 'volatile'. In our hypothetical kernel code, we access common blocks by surrounding the access code with mutexes, which for simplicity are granted with spin locks. Access code might look like this:

while( block_was_locked() ) { /*spin*/ }
// getting here means we have the lock
// access common block elements here
// ... and access some more
// ... and access some more
// ... and access some more
unlock_block();

Here it's understood that locking ('block_was_locked()') and unlocking ('unlock_block()') will be done using volatile, but the accesses inside the critical region of the mutex just use regular variable access, since the block access code is protected by the mutex.

If one is implementing a compiler to be used on operating system kernels, this model (only partially described, but I think the salient aspects are clear enough) is one worth considering. Of course, the discussion here is very much simplified; there are lots more considerations when designing actual operating system locking mechanisms, but the basic scheme should be evident.

Looking at a broader perspective, is it safe to assume this model holds in some unknown implementation(s) on our platforms of choice? No, of course it isn't. The behavior of volatile is implementation dependent. The model here is relevant because many kernel developers unconsciously expect their assumptions about locks and critical regions, etc., to be satisfied by using volatile in this way. An implementation would be foolish to ignore such assumptions, especially if kernel developers were known to be in the target audience.

Returning to the original question, what answers can we give?

If you're an implementor, know that the Standard offers great latitude in what volatile is required to do, but choosing any of the extreme points is likely to be a losing strategy no matter what your target audience is. Think about what other execution regime(s) your target audience wants/needs to interact with; choose an appropriate model that allows volatile to interact with those regimes in a convenient way; document that model (as 6.7.3p6 requires for this implementation-defined aspect) and follow it faithfully in producing code for volatile access. Remember that you're implementing volatile to provide access to alternative execution regimes, not just because the Standard requires it, and it should work to provide that access, conveniently and without undue mental contortions. Depending on the extent of the regimes or the size of the target audience, several different models might be given under different compiler options (if so it would help to record which model is being followed in each object file, since the different models are likely not to intermix in a constructive way).

If you're a developer, and are intent on being absolutely portable across all implementations, the only safe assumption is the Black Box model, so just make every single variable and object access volatile-qualified, and you'll be safe. More practically, however, a Gray Box model like one of the two described above probably holds for the implementation(s) you're using. Look for a description of what the safe assumptions are in the implementations' documentation, and follow that; it would also be good to let the implementors know if a suitable description isn't there or doesn't describe the requirements adequately.

Reply to
Tim Rentsch

The C Standard only addresses the single-thread program model, except for external signal processing. Interacting threads aren't addressed. Apparently Posix incorporates a version of the C Standard (I know zilch of Posix). Since it does support multiple threads, etc., that may be a better standard in which to explore those issues of volatile.

Yes, but when you interface with memory-mapped hardware or concurrent threads you are stepping outside the realm of Standard C's purview. The implementation is the appropriate level to define that support.

To me, this is another way of saying that, since the implementation can't see all the relevant accesses in the source code, it has to do reads and writes to the volatile objects when the code says to. It's also saying that the operation of the program may rely on features not expressed in Standard C, such as DMA hardware, which might not be fully known to the specific implementation.

This makes sense within the context of Standard C.

While true, I interpret the primary meaning that mechanisms not fully understood by the compiler are at work. As a programmer that addresses these features, such as hardware registers, I need to understand them, but the compiler doesn't have to.

Not true. Debuggers can have magic powers to know details that are not promised by the language. Compilers and debuggers can collude to provide the debugger all the information needed to show variable values.

This isn't promised because the C Standard doesn't discuss functions not provided to the implementation in source form, except for implied OS operations, etc., to support the standard library routines.

It would only be usable for programs that only call standard library functions or functions supplied to the implementation at the time of program translation.

This makes sense.

What are the differences between the gray box and this later one? I understand your description of typical use, but don't see the required difference in generated code.

Here's a question for the OP: what issues come up in actual implementations and use that make this an important issue? An actual implementation issue would help illuminate the various choices.

--
Thad
Reply to
Thad Smith

In Dread Ink, the Grave Hand of Tim Rentsch Did Inscribe:

I studied this recently both in C and Fortran. I have a small, embedded job that motivates this interest.

The original post is 350 lines long, and I snipped it not out of spite; the length makes it really hard to quote.

A question for OP: are you using C89 or C99?

--
Frank

Reply to
Franken Sense

No, because Posix specifies that one does not need to use volatile when doing communication between threads.

--
Erik Trulsson
ertr1013@student.uu.se
Reply to
Erik Trulsson

I once wrote an article on compiler validation for safety-critical systems. In return somebody sent me a paper they had published regarding different compilers' implementations of volatile. I forget the numbers now, but the conclusion of their paper was that most compilers don't implement it correctly anyway!

--
Regards,
Richard.
Reply to
FreeRTOS.org

If it was the same one posted here a few months ago, it started out with a very basic false assumption about what volatile *means*, casting the rest of the paper into doubt as far as I can see.

--

John Devereux
Reply to
John Devereux

...

Speaking of undue mental contortions, and without intending to slight or denigrate your well written explanation, it might be more straightforward to explain what volatile might mean to the developer, not implementor. To my knowledge, in context of C++ not C, volatile has meaning only in relation to optimizations. Specifically, the compiler is made aware that the referenced value can change from external actions. It implies acquire and release semantics, and limits reordering operations to not violate the operation order before and after the volatile access. But, you tell me... This simplistic view is as much thought as I had given the topic in quite a long while.

Reply to
MikeWhy

The OP didn't specify any conditions for using volatile. He may well be referring to the West Podunk fire safety standards. Mr Smith brought in the C standard and Posix. We can't trust the posting to c.l.c, since that has been seriously affected by trolls.

In West Podunk anything volatile is easily ignited, and requires special storage.

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
Reply to
CBFalconer

My comments were based on the most recent draft of the current (C99) standard, more specifically n1256. However I think most of what was said still applies to C89/C90, because the basic decisions about volatile were made pretty early. Of course, to be sure it would be necessary to look at the C89/C90 documents and verify that.

Reply to
Tim Rentsch

It's important to realize the "implementation-defined" clause for what constitutes a volatile access provides a loophole so large that almost no implementation of volatile can be called incorrect categorically. As long as the implementation accurately documents the decision it made here, arbitrarily lax choices are allowed (or so the claim has been made), and these implementations aren't incorrect. Example: "A volatile access is the same as any other access in the abstract machine, except for the first billionth billionth billionth second of program execution, in which case the volatile access is done as early as it possibly can be, but after all necessarily earlier accesses (in the abstract machine sense)."

Reply to
Tim Rentsch

To give the paper its due, I have looked it over, and I think its conclusions are basically right. There are some details that are either oversimplified or slightly wrong, depending on how certain statements in the Standard are read; that's a whole other big discussion, and I don't want to say any more about that right now. But I would like to repeat that the paper mentioned is, IMO, quite a good paper, and it contributes some important observations on volatile and how it is implemented.

Reply to
Tim Rentsch

My intention was to provide discussion for both developers and implementors, because I think it's important for both sets of people to understand the thinking of the other. Having said that, I concede that the comments could have been better on the developer side.

Certainly it is true that using volatile will (usually) inhibit certain optimizations, but the question is, which optimizations and under what conditions? The basic answer is that this is implementation-defined, and there is no single right answer. /Usually/ volatile implies canonical ordering of at least those accesses that are volatile-qualified, but does/should/must it imply more? Depending on implementation choice, volatile /might/ (for example) cause a write-barrier to be put into the storage pipeline, but that isn't absolutely required. Other considerations similarly.

The lack of a single model for which other memory regimes are synchronized is part of the murkiness of volatile; my main point is that each implementation must identify which memory regime(s) are coordinated with by using volatile (for that implementation).
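
A small sketch of the kind of optimization in question (the names are made up): repeated reads of a non-volatile object may be collapsed into a single load, but repeated reads of a volatile object may not.

```c
static int plain = 5;           /* ordinary object */
static volatile int vol = 5;    /* volatile-qualified object */

static int sum_two_reads(void)
{
    int a = plain + plain;  /* compiler may load 'plain' once and reuse it */
    int b = vol + vol;      /* both loads of 'vol' must be performed */
    return a + b;
}
```

Whether anything beyond this canonical access ordering is implied (write barriers, cache coherence, and so on) is exactly the implementation-defined part.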

Reply to
Tim Rentsch

And therefore Posix implicitly imposes requirements on what implementations must do to be Posix-compliant, because otherwise volatile would be necessary in such cases.

Reply to
Tim Rentsch

The Posix model is only one model. Other environments may choose to use different models. It's important to understand these possibilities also.

I think you missed the point here. The implementation-defined aspect doesn't have to do with memory-mapped hardware or concurrent threads. Typically an implementation doesn't define these things at all. It may (or may not) be aware of them, but usually it doesn't define them.

It's true that volatile-qualified accesses must occur strictly according to the abstract semantics, but it isn't this sentence that requires that. Nor does it say that the program can rely on features not specified in the Standard; in fact, what it is saying is that the Standard isn't sure /what/ can be relied on, and the implementation isn't sure either. It's /because/ the implementation can't be sure how volatile will affect program behavior that it imposes such severe requirements on program evaluation in the presence of volatile.

The key point is that the Standard doesn't place any limitations on what might happen.

Here you have again missed the point. Certainly a /cooperative/ implementation could make its information available to a debugger, but conversely an arbitrarily perverse implementation could make it arbitrarily difficult for a debugger to get out this information; at least practically impossible, even if perhaps not theoretically impossible. And that's the point: no implementation chooses to be that perverse, which means it doesn't really implement the Black Box model.

It isn't promised by the Standard, but essentially every implementation provides it, because almost no one would choose an implementation that didn't provide it.

Yes, and for that reason no one would use it (or at least almost no one).

The earlier Gray Box model synchronized on all external function calls. The "OS/kernel" Gray Box model doesn't have to synchronize on function calls, only around volatile accesses (and so it could do inter-TU optimization that would reorder function calls if there were no volatile accesses close by).

Consider the example mutex/spinlocking code. Here is a slightly more specific version:

while( v = my_process_id, v != my_process_id ) { /*spin*/ }
shared = shared + 1;
v = 0;

Here 'v' is a volatile variable (that incidentally has a magic property that makes it work in the spinlock example shown here, but that's a side issue). The variable 'shared' is not volatile.

Question: can the accesses to 'shared' be reordered so that they come before the 'while()' loop (or after the subsequent assignment to v)?

This kind of question comes up frequently in writing OS code. It's not a simple question either, because it can depend on out-of-order memory storage units in a highly parallel multi-processor system. Clearly if we're trying to do locking in the OS we care about the answer, because if the accesses to 'shared' can be reordered then the locking just won't work. We want the implementation to choose a model where the accesses to 'shared' are guaranteed to be in the same order as the statements give above. Make sense?
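
One conservative (and still implementation-dependent) way to express that intent in source is to force the protected access itself through a volatile-qualified lvalue, so that it participates in the canonical ordering of volatile accesses. A single-threaded sketch of the pattern (the lock word's magic property is assumed, as above):

```c
static volatile int v = 0;   /* hypothetical lock word */
static int shared = 0;       /* data the lock protects */

static int locked_increment(int my_process_id)
{
    /* acquire: the magic spinlock from the example above */
    while ((v = my_process_id), v != my_process_id) { /* spin */ }

    /* the cast makes this a volatile access, ordered with respect
       to the volatile lock/unlock accesses at sequence points */
    *(volatile int *)&shared += 1;

    v = 0;                   /* release */
    return shared;
}
```

Whether an implementation additionally keeps plain (non-volatile) accesses from migrating across the volatile ones is, again, its implementation-defined choice.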

Reply to
Tim Rentsch

If you want a discussion, you'll have to write more directly and less generally.

Do you mean a cache- or bus-lock? Good heavens, no. I'd throw out that compiler with yesterday's bad news. If the developer wants a cache lock, he'll have to code one explicitly. Volatile means nothing more than to limit the optimizations.

Again, I must be misunderstanding your intent. Please write more concretely.

A compiler should compile the developer's code with the least amount of surprises. Synchronization remains the purview of the hardware (read-modify-write, for example) or operating system (mutual exclusion). Volatile doesn't imply any of that. It means simply to not make assumptions about the value referenced by the variable. If I wanted the compiler to do more for me, I would write in some other language, Java or VisualBasic perhaps. Volatile, in fact, is just the opposite of exciting. It is dull and boring. It tells the compiler to not do any of that fancy shuffling around stuff.

Reply to
MikeWhy

"Necessary?" Why? Didn't you just finish telling us that `volatile' means almost nothing, in the sense that nearly all its semantics are implementation-defined? If we can't say what effects `volatile' has (and I agree that we mostly can't), I don't see how we can class those unknown effects as "necessary."

Further discussion of threading topics should probably occur on comp.programming.threads, where one can learn that `volatile' is neither necessary nor sufficient for data shared by multiple threads.

--
Eric Sosman
esosman@ieee-dot-org.invalid
Reply to
Eric Sosman

In the absence of information to the contrary, using volatile is essentially always necessary when making use of extralinguistic mechanisms. This conclusion follows from 5.1.2.3 p 5, which gives the minimum requirements that a conforming implementation must meet (and is explained in more detail in the earlier comments on the Black Box model). What I mean by 'necessary' is that, if volatile is omitted, a developer has no grounds for complaint if something doesn't work, just as there are no grounds for complaint if one expects 'sizeof(int)' to have some particular value.

I concur, except that I think the marking isn't necessary in this case. As a generic statement, an observation that volatile is neither necessary nor sufficient for thread-shared data is (IMO) quite apropos in comp.lang.c. It's only when the discussion starts being limited to particular threading models that it becomes a significant impedance mismatch for CLC.

Reply to
Tim Rentsch

Probably good advice. About the best I can offer in response is that I did what I could considering the subject matter and the time available.

What volatile means is determined almost entirely by a decision that is implementation-defined. If an implementation chooses to interpret volatile so that it supplies a cache lock, it may do so. Or a bus lock. Or no lock at all. The Standard grants an enormous amount of latitude to the implementation in this area.

The choice of "synchronized" was a poor choice; "aligned" probably would have been better.

In the absence of volatile, the only limits on how the abstract machine can map onto a physical machine are the "as if" rule and the minimum requirements on a conforming implementation as stated in 5.1.2.3. Using volatile tightens those limits, but (and here is the point), the Standard /doesn't say what physical machine model must match the abstract machine at volatile access points/. Your comment, "[volatile] means simply to not make assumptions about the value referenced by the variable", isn't completely wrong, but it's an oversimplification. The reason is, when one talks about "the" value referenced by a variable, those comments must take place in the context of a particular memory model. Since the Standard doesn't specify one, we can't really talk about what volatile does without taking that implementation-defined choice into account. A cache lock memory model is different from a bus lock memory model, etc.

Do those comments make more sense now?

Reply to
Tim Rentsch

...

Give me an example. Given a volatile qualifier, what memory context makes it reasonable and useful for the compiler to generate a cache lock where the developer didn't specifically code one? I can't think of one. If the compiler can generate opcodes to do so, the developer can also write code to do so if he had wanted one. Or are we talking something much simpler, something like flipping a bit in a R/W register, as on a baby PIC?

The point of standardizing a language is to give some assurance that the same code run through different, compliant compilers will generate functionally identical behavior. This puts a limit on just how unspecified "implementation defined" loopholes are in the language spec. Volatile limits the actions of the compiler; it doesn't grant more freedom through the implementation-defined loophole.

Reply to
MikeWhy

Oh, I never said it was reasonable, only that it's allowed.

You're right that volatile limits the actions of the compiler, but how much they are limited is determined by an implementation-defined choice, and that choice has so much leeway that we cannot in general make any guarantees about restrictions volatile imposes.

I'm not trying to advocate that the Standard adopt a particular memory model for volatile, or even that it limit the set of choices of memory model that an implementation can choose amongst. Maybe that's a good idea, maybe it isn't, I just don't know. I /do/ think what memory models each implementation supports should be described both more explicitly and more specifically, for the benefit of both developers and implementors.

Reply to
Tim Rentsch
