Verify execution speed - "cycle counting" etc?

But this isn't very persistent, and if you get a 1-in-100,000 failure, you'll probably miss it.

--
Les Cargill

If the thing you're measuring takes about as much time as a port pin toggle, there's no point using any measurement technique. In those cases, you _can_ either prove your timing correctness by cycle counting with pen and paper, or you're toast.

And anyway, a job that's worth timing to that kind of precision must involve some kind of peripheral access. Because if it didn't, it wouldn't actually matter how long it takes, as it's all behind the scenes. The shortest time really worth measuring on an embedded system is that between two _external_ events. Framing these external events by an additional pair of external events (port pin toggles) won't make enough of a difference to invalidate the result unless you're cutting it _extremely_ close.

Or, to turn this issue around: if you need that kind of timing precision, why on earth did you do it on a serial processor instead of, say, an FPGA? Seemingly hard problems are sometimes just a result of having picked the wrong tool.

And you know exactly how many there are, so you can subtract the time taken by the port pin toggles. Since that doesn't generate any doubt about the actual result, there's no problem.
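A minimal sketch of that correction. The per-toggle cost here is an assumption; in practice you'd calibrate it by measuring an empty region framed by the same two toggles, or take it from the part's datasheet:

```c
#include <stdint.h>

/* Assumed cost of one port-pin toggle, in CPU cycles; calibrate this by
 * measuring an empty region framed by the same pair of toggles. */
#define TOGGLE_CYCLES 2u

/* Subtract the known framing overhead from a raw cycle measurement. */
uint32_t corrected_cycles(uint32_t measured, uint32_t n_toggles)
{
    return measured - n_toggles * TOGGLE_CYCLES;
}
```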

--
Hans-Bernhard Bröker

Huh? Do you think that priority-based preemptive scheduling plus some schedulability analysis (say, response-time analysis with deadline-monotonic priority assignment) is "broken" in some way?

If a system is designed along those lines, changing thread priorities can surely cause deadline misses, but IMO this does not mean that the code or the design/analysis method are "broken".
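For readers who haven't met it, the response-time analysis mentioned above can be sketched in a few lines. This is the classic fixed-point recurrence; tasks are assumed sorted by descending priority, and `C[]`/`T[]` (worst-case execution times and periods) plus the example values are illustrative assumptions:

```c
#include <stdint.h>
#include <stdbool.h>

/* Response-time analysis sketch: iterate
 *   R = C[i] + sum over higher-priority j of ceil(R/T[j]) * C[j]
 * until it converges or exceeds the deadline D. Tasks 0..i-1 have
 * higher priority than task i. */
bool rta_meets_deadline(int i, const uint32_t C[], const uint32_t T[],
                        uint32_t D)
{
    uint32_t R = C[i], prev = 0;
    while (R != prev && R <= D) {
        prev = R;
        R = C[i];
        for (int j = 0; j < i; j++)                   /* higher-priority tasks */
            R += ((prev + T[j] - 1) / T[j]) * C[j];   /* ceil(prev/T[j])*C[j] */
    }
    return R <= D;
}

/* Illustrative task set: WCETs 1,2,3 with periods 4,6,10. */
static const uint32_t EX_C[] = {1, 2, 3};
static const uint32_t EX_T[] = {4, 6, 10};
```

For this task set the lowest-priority task converges to a response time of 10, so it meets a deadline of 10 but misses 9.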

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
      .      @       .

As long as the "doSomethingSlow" is compiled separately from the timing loop, the compiler can't reorganize those statements.

Any externally compiled dummy routine should prevent any optimization across calls.

Are you sure that usleep(0) does nothing?

At least on some cooperative or round-robin systems, an XYZsleep(0)-type function will run the scheduler, possibly activating the thread at the next clock tick.

--
upsidedown

doSomethingSlow();

AFAIK, if you declare startTime and slowTime as volatile that shouldn't happen -- but that's "shouldn't" in a moral sort of way, not in a "you can always expect this" sort of way. Apparently sometimes it does anyway.
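The pattern under discussion looks roughly like this. It's a self-contained sketch: in real code doSomethingSlow() would live in a separate translation unit and timer_reg would be a memory-mapped hardware timer; here both are stubbed so the example stands alone, and all names are assumptions:

```c
#include <stdint.h>

volatile uint32_t timer_reg;        /* stand-in for a hardware timer register */

void doSomethingSlow(void)          /* stub: pretend it burns 100 ticks */
{
    timer_reg += 100;
}

uint32_t timeIt(void)
{
    volatile uint32_t startTime, slowTime;
    startTime = timer_reg;
    doSomethingSlow();  /* the volatiles order the two reads, not this call */
    slowTime = timer_reg;
    return slowTime - startTime;
}
```

Note that the two volatile accesses are ordered with respect to each other, which is exactly the limited guarantee discussed in the rest of the thread.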

There's a paper floating around in the eWorld that gets very technical about what volatile's supposed to do, what compilers often fail to do under aggressive optimization, and ways that you can force correct behavior (sometimes at the cost of some execution speed).

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com

Compilers with link-time optimization (LTO) can reorganize code even then. AIUI, recent gcc versions have LTO. We should hope that LTO can be disabled with suitable compiler options.

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
      .      @       .

Actually, not even the moral sort of way really works. The key issue is that making objects volatile constrains _only_ accesses to volatile objects. It creates no extra restrictions whatsoever on non-volatile objects. The compiler would be fully allowed, in any sort of way including the moral, to move that doSomethingSlow() before the assignment to startTime, after the one to slowTime, or even to split it into three separate parts that go before, between and after them. The only thing those volatiles really prohibit is moving the access to slowTime before the one to startTime.

Making stuff volatile, if it's actually needed, will _always_ cost some execution speed. That's part of the reason it exists.

--
Hans-Bernhard Bröker

Perhaps I should have said "more execution speed than is necessary to just do what it should".

Clearly the C language needs to allow you to mark a block as volatile, to say "execute everything inside this block after any volatile stuff before it and before any volatile stuff after it, but do whatever you want inside".

volatile { do_some_stuff; }

And it needs to be simpler, too -- it's just getting too complex.

--

Tim Wescott 
Wescott Design Services 
http://www.wescottdesign.com

Absolutely not. Your dummy routine is a really good variation on what I'd said.

It will not be costless. It may cause a context switch - I'd count on it.

Right. And even with preemptive systems, it may realign you with the clock tick that's usually at 10 msec.

--
Les Cargill

Outside of an ... FPGA register that's updated asynchronously, I haven't trusted the "volatile" keyword for over a decade. There were failures, and rather than spending my employer's dollar on chasing them, I just changed direction.

Using objdump was enough. W. T. F?

--
Les Cargill

I mean that these things are inherently stochastic, and if you are betting the ranch based on what a 5"x4" screen tells you...

--
Les Cargill

Absolutely. I mean absolutely no offense, but having done that sort of thing for a long time, it finally dawned on me that it was mostly a waste of time. Add silicon; it's the only way.

We don't have to scale processing to purchase price any more.

The number of cases where this design choice is defensible gets smaller in my mind every year. I swear I cannot even remember why one would do that in the first place, and it's a common element in so many failure stories.

Granted, if a deadline miss doesn't *matter*, then by all means. But if it does, I'd replan and put specialist processors or FPGA on the job.

At least think about it. It may well be that you live in a domain where that sort of thing is inevitable. I have managed to stop doing that.

--
Les Cargill

Holy cow! So who watches the watchers?

--
Les Cargill

I suppose you mean to ask, in your quaint way, why one should trust these tools. Of course they should not be the only means of verifying real-time performance -- validation of the real behaviour in the real environment remains necessary for critical things -- but they add evidence that the code is fast enough even in very rare situations which are unlikely to happen in validation tests.

The static analysers can find performance problems early on, before the system hardware is running, and both kinds of tools are good at finding the most time-consuming parts of the SW, as an alternative to profiling.

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
      .      @       .

Assuming that the silicon is used for parallel processing (FPGA or otherwise) that is an alternative, I agree. But not for everyone.

SW in space systems. Constrained size, mass, power, clock frequency. Radiation-tolerant, big-feature chips with internal triple modular redundancy.

Even if most or all of the high-rate, high-volume processing is delegated to FPGAs, the CPU or CPUs still have to do both slower background processing and rapid responses. Preemptive scheduling and priorities are still needed, IMO, at least in my field.

What sort of systems do you work on?

Are there many others in this group who share Les' view on priorities?

--
Niklas Holsti 
Tidorum Ltd 
niklas holsti tidorum fi 
      .      @       .

I have been working with priority based pre-emptive systems for just a few decades and I still think this is the right way to handle most things. Some things to check:

1.) Can you run the system with RT priorities? The nice thing about priority-based systems is that you can _lower_ the priority of non-critical tasks, i.e. you can "sacrifice" less critical tasks.

2.) Run (at least part of the functionality) in interrupt context and do the actual processing of the queue in normal context. Suitable for bursts with low average load.

3.) Run everything in FPGA with low average load.
--
upsidedown

You can instead set the scope to trigger when a pulse width is exceeded. That will catch any such event; there is no dead time. Of course that in itself does not prove anything, but it can catch your 1-in-100,000 failure.

--

John Devereux
[...]

Depends on the hardware counter, it might rollover at other values than 2^32.
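For a counter that wraps at an arbitrary modulus, the elapsed-tick computation has to handle the wrap explicitly rather than relying on unsigned 2^32 arithmetic. A sketch, assuming at most one rollover between the two readings:

```c
#include <stdint.h>

/* Elapsed ticks between two readings of a free-running counter that
 * rolls over at 'modulus' (which need not be a power of two).
 * Assumes the counter wrapped at most once between the readings. */
uint32_t elapsed_ticks(uint32_t start, uint32_t now, uint32_t modulus)
{
    if (now >= start)
        return now - start;
    return modulus - start + now;   /* counter wrapped once */
}
```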

One has to look at the code and use barriers if needed.

Oliver

--
Oliver Betz, Munich http://oliverbetz.de/
[...]

Again a generalisation I disagree with.

I do so.

Size, cost, power, peripherals. I don't know of an FPGA with an integrated oscillator precise enough for a UART, ADCs, and low power consumption, in a tiny package, for less than 1 USD.

Oliver

--
Oliver Betz, Munich http://oliverbetz.de/

Yes, that is exactly the point I was getting at (but letting people think a little before giving them the answer). "volatile" accesses are strictly ordered (assuming there are sequence points between them - there are no order guarantees if you do something like "foo(vol1, vol2)"). But volatile accesses have no ordering requirements with respect to non-volatile accesses or any calculations.

There are various ways to enforce - or at least, partially enforce - the desired behaviour. Some have been mentioned in this thread.

If the compiler does not know anything about "doSomethingSlow()", because it was separately compiled (and you are not using LTO or other whole-program optimisation), then the compiler has to assume it will contain volatile accesses, and it cannot do any re-ordering. You can achieve a similar "zero knowledge" effect by using function pointers, although gcc can sometimes optimise function pointers too. If the function pointer itself were volatile, you would be okay.

As long as the function doSomethingSlow() is compiled as a stand-alone function, rather than being inlined, then it is very unlikely that the compiler will re-order around it - but it /could/, if it wanted to.

Most compilers have some sort of "memory barrier" - there is no standard C solution. Sometimes this is done using intrinsics or compiler extensions. In gcc, it is done using inline assembly with a "memory" clobber: "asm volatile ("" ::: "memory"); ". This tells the compiler that memory may be read or written in an unexpected (volatile) manner. It does not force a complete ordering on code, but it forces an ordering on data in memory - no memory accesses can be re-ordered across a memory barrier, or "cached" across the barrier. So a memory barrier before and after the call to doSomethingSlow() will disallow the compiler from most re-ordering of code, and will usually have low impact on the speed.
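A self-contained sketch of that barrier pattern. The stub work() stands in for doSomethingSlow(), the macro name is an assumption, and this is gcc/clang-specific as described above:

```c
#include <stdint.h>

/* gcc-style memory barrier: tells the compiler memory may be read or
 * written behind its back, so no memory access may be re-ordered or
 * cached across this point. */
#define BARRIER() __asm__ __volatile__ ("" ::: "memory")

volatile uint32_t tick;            /* stand-in timer; bumped by the stub */
static uint32_t buf[4];

void work(void)                    /* stub standing in for doSomethingSlow() */
{
    for (int i = 0; i < 4; i++)
        buf[i] = (uint32_t)i;
    tick += 4;
}

uint32_t timed_work(void)
{
    uint32_t t0 = tick;
    BARRIER();                     /* no memory traffic moves above this line */
    work();
    BARRIER();                     /* ...or below this one */
    return tick - t0;
}
```

The barriers constrain only memory accesses, not pure register-to-register computation, which is usually exactly the trade-off you want for timing measurements.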

It would be nice if there were some C standard way to deal with this sort of thing, but there is none - and there is not likely to be any. C simply does not have any concept of the timing of events, and only considers the order of /observable/ events to be relevant (i.e., volatile accesses and calls to external code).

--
David Brown
