KiCad Spice, Anyone Tried It?

You're blowing smoke at me now.

Your example was incorrect: if the port register is declared volatile, the compiler is not at liberty to reorder code across sequence points such as the semicolons at the ends of the statements. Thus the pin will go high before the 'big calculation' and low afterwards.

Other things that interrupt the program flow (e.g. interrupts and context switches) can make the interval longer, but they would most often occur during the big calculation, so it's a real measurement.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

Yeah, but your example had semicolons. Keep going, bro, you're digging a nice deep hole.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

Okay, so you don't actually have any hardware examples. A pity.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

We were talking about thermocouples and so forth, which are mildly perturbed straight lines that need only engineering accuracy. Polynomials work fine for that.
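
As a sketch, the whole evaluation is a few lines of Horner's rule in C (the function name and coefficients below are placeholders, not real ITS-90 values):

/* Evaluate a calibration polynomial c[0] + c[1]*x + ... + c[n-1]*x^(n-1)
   by Horner's rule.  Placeholder coefficients, not real ITS-90 values. */
static double poly_eval(const double *c, int n, double x)
{
    double y = c[n - 1];
    for (int i = n - 2; i >= 0; i--)
        y = y * x + c[i];
    return y;
}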

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

I'm sorry, but you are /completely/ wrong here.

In practice, as I noted, a compiler is unlikely to re-order these things unless it sees an advantage to it. But it is without doubt free to do so. Only observable behaviour has an order in C - things without observable behaviour have no order.

Here is a variation on the original code:

extern volatile unsigned char pin;

unsigned int t = 1;

void testmults(void)
{
    pin = 1;
    for (unsigned int x = 1; x < 10000; x++) {
        /* loop condition and body reconstructed; the archive truncated
           the original after the '<' in the condition */
        t *= x;
    }
    pin = 0;
}

Regarding interrupts and context switches making the interval longer: that is of course true, and useful to remember when measuring speed - but a side-track here.

Reply to
David Brown

Semicolons and other sequence points are relevant for the abstract machine. They don't impose an ordering on the generated code, as long as the final results match between the abstract machine and the real assembly at points of "observable behaviour" - and then only the orders and values matter, not timing.

This is known as the "as if" rule of C.
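
A minimal illustration (hypothetical names, not from the earlier posts):

extern volatile unsigned char pin;
extern unsigned int a, b;

void g(void)
{
    pin = 1;
    a = b * 17;  /* no observable behaviour of its own: under the
                    as-if rule it may be hoisted above "pin = 1" or
                    sunk below "pin = 0" */
    pin = 0;
}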

Yes, I'll keep going - I believe you'll understand in the end. I've posted an example that you can test for yourself, and I hope you will try it before replying.

Reply to
David Brown

That's because the compiler optimized away the whole loop. If your big_calculation() wound up doing nothing, and so was optimized away like that, then your example above would correctly show that this was the case.

Keep digging, man!

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs 
Principal Consultant 
Reply to
Phil Hobbs

It is difficult to get the kind of flexibility and efficiency that C and its compilers offer without also having these kinds of risks and complications. It's usually not hard to get this kind of thing right - it just involves making sure that your calculation depends on a volatile read (after the pin goes high) to get started, and generates a volatile write with the result (before the pin goes low). The compiler can still move unrelated unobservable calculations inside the measurement period, but that's fairly unlikely to be a big issue.
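
A minimal sketch of that pattern - the volatile objects "seed" and "result" are hypothetical stand-ins for real hardware registers:

extern volatile unsigned char pin;
extern volatile unsigned int seed;    /* volatile read: starts the work */
extern volatile unsigned int result;  /* volatile write: ends the work */

void timed_calc(void)
{
    pin = 1;                /* scope trigger high */
    unsigned int x = seed;  /* the calculation depends on this volatile
                               read, so it cannot begin earlier */
    unsigned int y = x * x + 3 * x + 7;
    result = y;             /* volatile write of the result, so the
                               calculation must complete by here */
    pin = 0;                /* scope trigger low */
}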

In general, as long as you remember that C doesn't have any timing information, and the generated code only matches your source code on observable behaviour, you'll be fine.

(I don't know of any other language, other than assembly, that guarantees more than that. Even XC on XMOS does not, IIRC - it lets you check some aspects of timing, but does not give you control of the details of execution.)

I showed in another post an example of where gcc re-arranges code around the volatile accesses:

extern volatile unsigned char pin;

unsigned int t = 1;

void testmults(void)
{
    pin = 1;
    for (unsigned int x = 1; x < 10000; x++) {
        /* loop condition and body reconstructed; the archive truncated
           the original after the '<' in the condition */
        t *= x;
    }
    pin = 0;
}

Reply to
David Brown

Look again. It combined the loop - it did not optimise away the calculation.

Please tell me you are not imagining that the C language has different rules for a "small" calculation and a "big" calculation?

Reply to
David Brown

Perhaps you think that using a function call matters:

extern volatile unsigned char pin;

unsigned int t = 1;

unsigned int big_calc(unsigned int x)
{
    unsigned int y = x * x;
    unsigned int z = x * y;
    return y * y + 2 * z + 3 * y + 4;
}

void foo(void)
{
    pin = 1;
    t = big_calc(t);
    pin = 0;
}

foo:
        mov     eax, DWORD PTR t[rip]
        mov     BYTE PTR pin[rip], 1
        mov     BYTE PTR pin[rip], 0
        mov     edx, eax
        imul    edx, eax
        lea     eax, [rdx+3+rax*2]
        imul    eax, edx
        add     eax, 4
        mov     DWORD PTR t[rip], eax
        ret

No loops are removed, and "big_calc" is written as a function call, with an externally visible definition including several statements, semicolons, and local variables. You can see the back-to-back stores for "pin = 1; pin = 0;" in the generated assembly.

Do I /still/ have to dig, or are you ready to accept that compilers can re-order statements and non-volatile accesses around volatile accesses, as long as all observable behaviour is correctly ordered?

Reply to
David Brown

The ITS polynomials are for thermocouples, not thermistors.

There is a simple equation that is usually used for thermistors, the Steinhart-Hart equation, which is plenty good, since thermistors aren't usually very accurate. We typically measure their resistance ratiometrically against a 1% resistor, and use a fairly coarse lookup table with interpolation, generated from the S-H equation. It's useful to seed a PCB with a few surface-mount thermistors, so you can snoop temperatures, close fan loops, shut down, or temperature-compensate things.
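
As a sketch of that recipe (function names are hypothetical, and the coefficients are generic 10k-NTC illustration values, not from any particular datasheet):

#include <math.h>

/* Steinhart-Hart: 1/T = A + B*ln(R) + C*ln(R)^3, with T in kelvin.
   Illustrative coefficients for a generic 10k NTC. */
static double sh_kelvin(double r_ohms)
{
    const double A = 1.129e-3;
    const double B = 2.341e-4;
    const double C = 8.775e-8;
    double lr = log(r_ohms);
    return 1.0 / (A + B * lr + C * lr * lr * lr);
}

/* Ratiometric readout: thermistor from the ADC input to ground, the
   reference resistor from the input up to the ADC reference, so the
   reference voltage cancels out of the ratio. */
static double thermistor_ohms(unsigned code, unsigned full_scale,
                              double r_ref)
{
    return r_ref * (double)code / (double)(full_scale - code);
}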

I'm about to ship a new high voltage pulse generator that has very complex voltage/duty-cycle/loading limits. Customers might drive hi-Z loads, 50 ohms, a laser stack, or a short. The simple fix was to current-limit the power supply and hang a thermistor on the 50 ohm output resistor. Let the resistor do the math for me.

RTDs have yet another equation.

The reason to use the ITS thermocouple polynomials is that they are basically the definition of temperature, and we don't want to argue with that.

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

You're fogging, which is your usual MO when anyone calls you on your BS.

In your most recent one, the return value is thrown away, which the compiler of course noticed.

These are all toys, not big_calculation()s done for some actual reason other than saving your ass.

How about a counterexample based on the code I called you on - actual embedded code with an actual output that gets used for something real?

Cheers

Phil Hobbs

Reply to
pcdhobbs

I know Fellows of big aerospace companies, software guys, who aren't sure which end of an oscilloscope goes up. They can't estimate runtimes within 10:1, so they grossly overkill on compute power.

One of them is playing with those raspberry pi things at home. I'm going to send him an oscilloscope.

There was an obvious Linux timer IRQ, always there, at 1 kHz as I recall. Things like ethernet activity added less regular timeouts from the hard loop. But the worst stackup that we saw was under 40 us, and rare.

Linux, I understand, allows a weak suggestion of what to run on which CPU, which we used, and that seemed to help. We didn't research or document this extensively, because it became obvious that we could run our state loops and such plenty often enough, and that we should do some of our math (like summing the means and squares of ADC samples, for RMS) in the FPGA.
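
(For reference, a sketch of the usual Linux mechanism, sched_setaffinity(); note that the affinity mask is actually binding, not just a suggestion:)

#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to one CPU so the fast loop is not
   migrated around by the scheduler. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set);  /* 0 = this thread */
}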

Don't know. We just scoped the gross timeouts.

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

There is a Xilinx appnote for running Linux on one ARM core and bare metal on the other. That's what we'd do if we had to. If we had more than one fast thing to run, we'd do a simple state machine. Even an RTOS has overhead.

One could toss a bunch of soft-core processors in the FPGA too, which certain parties are eager to do.

Some day we'll just run every process on its own CPU and avoid all that context switching nonsense.

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

Right. If the port were treated as a local variable, both the set and the clear would be optimized out, since they obviously don't do anything.
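
For example, with a plain (non-volatile) local, both stores are dead and a compiler will drop them (hypothetical snippet):

void f(void)
{
    unsigned char pin = 0;  /* plain local, not volatile */
    pin = 1;  /* dead store: nothing reads it, so it is removed */
    pin = 0;  /* likewise */
}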

Playing with compiler optimization options can have radical effects. We had one case where it took a lot of experimenting with a C compiler to get anywhere near as fast a runtime as my PowerBasic program. It started out 4:1 slower and finally got to about 80% as fast.

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

Yes. One trick is to force a C calculation to include the state of a pin or something external, in a way that can't be optimized out. *We* know that the pin is always low, because we soldered it to ground.

Or write an expression that is complex enough that the compiler can't figure out how to optimize it away; create a handy dirty variable.
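
A sketch of that trick (hypothetical names):

extern volatile unsigned char pin;  /* soldered to ground, but the
                                       compiler cannot know that */

unsigned int dirty_calc(unsigned int x)
{
    /* Mixing a volatile read into the arithmetic keeps the whole
       expression "dirty", so it cannot be folded to a constant. */
    return (x * x + pin) * 31u + 7u;
}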

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

Set the scope to infinite persistence, and run overnight.

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

Design is generating the unexpected. It pays better than copying.

It's my impression that coders rarely invent anything. Hardware designers often do.

Where's the electronics that you were going to post?

--
John Larkin         Highland Technology, Inc 

Science teaches us to doubt. 
Reply to
jlarkin

OK for some limited things, but an ARM + I/O + FPGA is a more general and more appealing concept to me.

Some day? *I do /precisely/ that right now*.

It is *precisely* the concept behind, and the implementation of, the XMOS xCORE processors, programmed in xC (think Occam or CSP, or the CSP concepts appearing in Rust and Go), within a standard IDE.

It works, and it is easy and *fun*, with a *much* lower learning curve than Vivado.

To kick the tyres of the concept, I had:
- one core counting the transitions on an input serial stream,
- another doing the same on a different stream,
- one core for the front panel,
- several cores communicating over USB with a PC (could have been ethernet),
- one core for a supervisory FSM coordinating all the above.

Hard realtime *guarantees*: no missed transitions, no missed comms, etc. The IDE inspects the optimised binary to calculate precise timing. None of this "measure and hope" crap :)

IDE is free, buy the processors and dev boards at DigiKey.

Reply to
Tom Gardner


Bingo! That's the idea behind the GA144 with its 144 processors, and FPGAs in general, where every assignment outside of a process is a parallel task which can have its own private processor... a set of gates and/or FFs.

I try to explain to people why FPGAs are easy to design and it often goes over their heads because of the FUD that has been spread about FPGAs... BIG! POWER HUNGRY! COMPLICATED!!! None of which is true any more than it is about processor chips.

--
  Rick C. 

  --+- Get 1,000 miles of free Supercharging 
Reply to
Ricketty C
