Hardware scripting (a better Arduino)

- bitrex
  
Tue, Jun 27, 2017 2:07 PM

As someone pointed out a while back, the Arduino toolchain seems to be very popular for developing hobbyist embedded applications for AVR/ARM, but the API and IDE are kind of a mess. The API is a bastardized subset of C++ where a lot of things don't work and which most users only use to write kludgey, procedural C-like code; you have C-like global functions such as "digitalWrite" which under the hood use 40 or 50 assembly instructions just to write a bit to a GPIO port.

I think it's certainly possible to make something better that allows someone who doesn't want to fight with the tools to get real work done.

I was thinking that since little uPs are really more like programmable hardware rather than general purpose computers, if one's looking to make a microcontroller language for the everyman it doesn't really make sense to derive it directly from a general purpose systems programming language like C/C++.

In modern computer games the graphics and physics engines are written in high-performance compiled languages like C/C++. But the whole product doesn't use those languages to define its behavior; usually all the plot, mechanics, and other stuff that makes a video game a game is fleshed out in some kind of scripting language at a much higher level of abstraction. This allows people other than rockstar systems programmers to contribute to the design and avoids needing to recompile the whole codebase every time someone decides this enemy spaceship should fire purple lasers instead of green.

I think even 8 bit processors with modern architectures are fast enough that one could adapt a similar paradigm to writing embedded apps. The high-performance stuff, like interfacing with external hardware through GPIO, can be written in C/C++, in something like the "policy-based design" paradigm: you have an abstract type of device, like "Display" or "TemperatureSensor", which defines an interface to the basic functions any device of that type should support, and then "policies" or "plug-ins" which handle the logic requirements of some particular type of TemperatureSensor from some manufacturer. If you want to change the sensor you don't rewrite the entire codebase, you just rewrite the plug-in.
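A minimal sketch of that policy-based layout, assuming two made-up sensor policies (FakeVendorA and FakeVendorB, with invented raw readings and scale factors) standing in for real driver code. Only the TemperatureSensor name comes from the post; everything else is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical policy: a part that reports tenths of a degree Celsius.
struct FakeVendorA {
    static std::int16_t read_raw() { return 253; }  // stand-in for a bus read
    static float to_celsius(std::int16_t raw) { return raw / 10.0f; }
};

// Hypothetical policy: a part that reports 0.0625 degree C per count.
struct FakeVendorB {
    static std::int16_t read_raw() { return 404; }  // stand-in for a bus read
    static float to_celsius(std::int16_t raw) { return raw * 0.0625f; }
};

// The abstract device type: the interface every temperature sensor offers.
// The policy supplies the vendor-specific details at compile time.
template <typename Policy>
class TemperatureSensor {
public:
    float celsius() const { return Policy::to_celsius(Policy::read_raw()); }
};

// Application logic is written against the interface only.
template <typename Sensor>
bool over_limit(const Sensor& sensor, float limit_c) {
    assert(limit_c > -273.15f);  // a limit below absolute zero is a bug
    return sensor.celsius() > limit_c;
}
```

Swapping the sensor then means changing one type name, for example from TemperatureSensor<FakeVendorA> to TemperatureSensor<FakeVendorB>, while over_limit stays untouched.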

And then the logic for how the external hardware interacts could be written in a very abstract stack machine language, similar to Forth or PostScript, or a programmable RPN calculator. An example stack machine implementation for Arduino I saw would blink an LED on and off like this:

13 output { 13 digitalToggle 1000 delay true } while

These scripts aren't compiled, but actually stored as plaintext strings (in compressed mnemonic form) in Flash or EEPROM and then interpreted on the fly by the stack machine.
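To make the idea concrete, here is a rough sketch of such an interpreter running on a desktop against a simulated board. It is not the implementation the post refers to: the Board struct, the extra words "dup" and "-", and the rule that a { ... } block must be followed directly by "while" are all assumptions made to keep the example short and terminating.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Simulated hardware so the sketch runs on a desktop (not a real board API).
struct Board {
    bool level[20] = {};
    bool is_output[20] = {};
    long elapsed_ms = 0;
};

class StackMachine {
public:
    explicit StackMachine(Board& board) : board_(board) {}

    // Split the plaintext script into words and interpret them in order.
    void run(const std::string& script) {
        std::vector<std::string> words;
        std::istringstream in(script);
        for (std::string w; in >> w;) words.push_back(w);
        exec(words, 0, words.size());
    }

    const std::vector<long>& stack() const { return stack_; }

private:
    long pop() {
        assert(!stack_.empty());
        long v = stack_.back();
        stack_.pop_back();
        return v;
    }

    std::size_t pop_pin() {
        long pin = pop();
        assert(pin >= 0 && pin < 20);
        return static_cast<std::size_t>(pin);
    }

    void exec(const std::vector<std::string>& w, std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) {
            const std::string& t = w[i];
            if (t == "{") {
                // Find the matching "}", allowing nested blocks.
                std::size_t depth = 1, j = i + 1;
                while (j < end && depth > 0) {
                    if (w[j] == "{") ++depth;
                    else if (w[j] == "}") --depth;
                    ++j;
                }
                assert(depth == 0 && j < end && w[j] == "while");
                // Run the body, then pop a flag; repeat while it is non-zero.
                do { exec(w, i + 1, j - 1); } while (pop() != 0);
                i = j;  // continue after "while"
            } else if (t == "output") {
                board_.is_output[pop_pin()] = true;
            } else if (t == "digitalToggle") {
                std::size_t pin = pop_pin();
                board_.level[pin] = !board_.level[pin];
            } else if (t == "delay") {
                board_.elapsed_ms += pop();  // simulated: just count the time
            } else if (t == "dup") {
                long v = pop();
                stack_.push_back(v);
                stack_.push_back(v);
            } else if (t == "-") {
                long b = pop(), a = pop();
                stack_.push_back(a - b);
            } else if (t == "true") {
                stack_.push_back(1);
            } else {
                stack_.push_back(std::stol(t));  // throws on an unknown word
            }
        }
    }

    Board& board_;
    std::vector<long> stack_;
};
```

With this sketch, the script "13 output 3 { 13 digitalToggle 100 delay 1 - dup } while" toggles pin 13 three times and then stops, whereas the original script ends its block with "true" and so loops forever.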

- Rob
  
Tue, Jun 27, 2017 4:25 PM

You have just re-invented the PLC. Maybe you should look into the existing "Arduino as a PLC" and "Open PLC" projects?

- Don Y
  
Tue, Jun 27, 2017 5:12 PM

Two obvious problems:

- strings take up lots more space than "compiled code" (you were complaining about 50 instructions to toggle a GPIO, upthread -- what would your "string equivalent" form be? "13 output true")

- how to handle the case AT RUNTIME when a *parse* of the text doesn't map to legitimate executable instructions: "13 uotput true"

It's trivial to write a little interpreter that gives you a bit of both worlds without having to develop a full-fledged compiler or task yourself with generating "really efficient" code. The interpreter can then "play back" the tokenized source code to reveal the source without having to STORE the source.
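A small sketch of that tokenize-then-play-back idea, under assumptions of my own: a four-word vocabulary borrowed from the stack-machine example upthread, one byte per word, and a 0xFF marker followed by a 16-bit number for literals. Unknown words are rejected when the text is entered, so "13 uotput true" never reaches the runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical word list; a token is simply the index into this table.
static const char* const kWords[] = {"output", "digitalToggle", "delay", "true"};
static const std::uint8_t kWordCount = sizeof(kWords) / sizeof(kWords[0]);
static const std::uint8_t kLiteral = 0xFF;  // marker: 16-bit little-endian number follows

// Convert source text to tokens. Returns false on an unknown word or a number
// that does not fit in 16 bits; the caller should then discard `out`.
bool tokenize(const std::string& source, std::vector<std::uint8_t>& out) {
    out.clear();
    std::istringstream in(source);
    for (std::string w; in >> w;) {
        bool matched = false;
        for (std::uint8_t i = 0; i < kWordCount; ++i) {
            if (w == kWords[i]) { out.push_back(i); matched = true; break; }
        }
        if (matched) continue;
        // Not a known word, so it has to be a decimal number.
        if (w.size() > 5 || w.find_first_not_of("0123456789") != std::string::npos) return false;
        unsigned long v = std::stoul(w);
        if (v > 0xFFFF) return false;
        out.push_back(kLiteral);
        out.push_back(static_cast<std::uint8_t>(v & 0xFF));
        out.push_back(static_cast<std::uint8_t>(v >> 8));
    }
    return true;
}

// Regenerate readable source from the tokens; the text itself is never stored.
std::string list_source(const std::vector<std::uint8_t>& code) {
    std::string text;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (!text.empty()) text += ' ';
        if (code[i] == kLiteral) {
            assert(i + 2 < code.size());
            unsigned value = code[i + 1] | (code[i + 2] << 8);
            text += std::to_string(value);
            i += 2;  // skip the two operand bytes
        } else {
            assert(code[i] < kWordCount);
            text += kWords[code[i]];
        }
    }
    return text;
}
```

In this encoding "13 output true" occupies 5 bytes instead of 14 characters. The listing comes back with normalized spacing, since the original whitespace is not kept.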

I wrote a little bastardized BASIC that ran in a 647180 (predated the richer SoCs available now) so you could do things like:

SPAWN 100
SPAWN 200
STOP

100 STATE = ON
101 LOAD FLASH_TIMER 300ms
    LAMP STATE
    WAIT FLASH_TIMER
    STATE = !STATE
    GOTO 101

200 PRINT "Please enter your password:"
    INPUT PASSWORD
    ...

This might "compile" down to

- Clifford Heath
  
Wed, Jun 28, 2017 1:01 AM

Me, I believe.

Yes, I agree. I think a lot of systems would be better implemented that way.

That's what the creators of Arduino should have done with their C++ library. C++ is still the right way to build the system you describe.

This is where I diverge. "Compressed mnemonic" means tokenised. It's better to replace each opcode by a direct pointer to the function that implements the opcode, aka a threaded interpreter. You use a pointer register to replace the PC, so execution is just asm("jmp p++"), and the function extracts its parameters using "*p++". Trivial to implement, and very fast on a CPU that isn't heavily pipelined.
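A portable approximation of that scheme, as a sketch only. Standard C++ cannot express the asm("jmp p++") dispatch, so this version calls each implementation through its pointer in a loop (sometimes called call threading). The opcode set, the Cell union and the simulated pins are all invented for the example.

```cpp
#include <cassert>

struct VM;
using Op = void (*)(VM&);

// A cell of threaded code: either a pointer to an implementation
// or an inline operand that the implementation fetches with *p++.
union Cell {
    Op fn;
    long value;
};
inline Cell word(Op f) { Cell c; c.fn = f; return c; }
inline Cell lit(long v) { Cell c; c.value = v; return c; }

struct VM {
    const Cell* p = nullptr;  // takes the place of the program counter
    long stack[16] = {};
    int sp = 0;
    bool running = false;
    bool pin[20] = {};        // simulated GPIO levels
    int toggles = 0;
};

// push <literal>: the operand sits in the next cell.
void op_push(VM& vm) {
    assert(vm.sp < 16);
    vm.stack[vm.sp++] = (vm.p++)->value;
}

// toggle: pop a pin number and invert that pin.
void op_toggle(VM& vm) {
    assert(vm.sp > 0);
    long n = vm.stack[--vm.sp];
    assert(n >= 0 && n < 20);
    vm.pin[n] = !vm.pin[n];
    ++vm.toggles;
}

// dec_jnz <offset>: decrement top of stack; if non-zero, jump by offset cells.
void op_dec_jnz(VM& vm) {
    long offset = (vm.p++)->value;
    assert(vm.sp > 0);
    if (--vm.stack[vm.sp - 1] != 0) vm.p += offset;
}

void op_halt(VM& vm) { vm.running = false; }

// The whole "decoder": fetch a function pointer, advance, call it.
void run(VM& vm, const Cell* program) {
    vm.p = program;
    vm.running = true;
    while (vm.running) {
        Op f = (vm.p++)->fn;
        f(vm);
    }
}

// Toggle pin 13 three times. The offset -5 goes from the cell after
// the operand (index 7) back to the second push (index 2).
static const Cell kBlinkThrice[] = {
    word(op_push), lit(3),
    word(op_push), lit(13),
    word(op_toggle),
    word(op_dec_jnz), lit(-5),
    word(op_halt),
};
```

There is no opcode lookup at all: the stored program already contains the addresses to execute, which is the property the post is describing. On a real part the table would presumably live in flash.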

Want help writing the compiler? It's pretty straightforward. Propose a sensible syntax (not Basic, and not RPN/Forth!) and I'll look at it. It would probably even integrate easily into the Arduino toolchain, emitting C++ data definitions to be compiled into flash.

Clifford Heath.

- bitrex
  
Wed, Jun 28, 2017 3:08 PM

I haven't had a chance to follow up to each reply to this thread individually today, so I'll just use the opportunity to address a couple of the objections "Rob" and "Don Y" mentioned here. The performance of the stack machine doesn't have to be particularly great, as profiling usually shows that most apps spend the vast majority of their time grinding thru the same small bit of code, perhaps only 5% of the total codebase. In an embedded app this might be, say, some signal processing or other data-crunching code, or handling and dispatching packets coming in over a network. Like in the games analogy, that area is the equivalent of the "engine" that's doing the graphics, and would certainly be implemented in C/C++ at least.

It will get its work done nearly as fast if the algorithm is simply referenced by a function pointer on a software stack and called with other stack objects popped into it. That's essentially how calling functions on local variables from procedural code works with the intrinsic processor stack defined by the uP's ISA, which a C/C++ compiler uses. On a simple architecture that lacks virtual memory, cache, TLBs, and all that stuff, a well-implemented software stack should be nearly as fast.

Much of a codebase won't be performance-critical, and using C++ for the rest of it is the proverbial sledgehammer to swat a fly.

I think part of the reason the asm generated for many Arduino API calls is so bloated is that while it is derived from C++, they made virtually no use of the template metaprogramming functionality available in its modern incarnations and so there have to be tons of overloads, virtual methods, etc. for different targets and user configuration options where the compiler has no idea what is needed and what isn't, so has to toss in the kitchen sink.

Template metaprogramming is too abstruse for newbs, but ideally all this platform-specific and configuration stuff would be templated and handled at compile time. The goal of modern C++ is to have "zero overhead abstraction" via generic programming; done correctly, most of the important stuff about the target should be known at compile time, such that you can certainly use an abstract function like "digitalWrite" in your code, but when it's compiled the compiler "knows" that it can smoosh all that abstract "overhead" code down into a single asm bit flip instruction on the proper port for the target device.
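A sketch of what such a compile-time pin could look like. The SIM_PORTB and SIM_PORTD variables are stand-ins of mine for memory-mapped port registers so that the code runs on a desktop; the Pin template and its member names are likewise hypothetical, not part of the Arduino API.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for memory-mapped port registers (assumption: on a real AVR
// these would be the hardware registers at fixed addresses).
volatile std::uint8_t SIM_PORTB = 0;
volatile std::uint8_t SIM_PORTD = 0;

// Port and bit are template parameters, so both are known at compile time
// and no pin-number lookup is left for run time.
template <volatile std::uint8_t& Port, unsigned Bit>
struct Pin {
    static_assert(Bit < 8, "an 8-bit port has bits 0..7");
    static constexpr std::uint8_t mask = static_cast<std::uint8_t>(1u << Bit);

    static void high() { Port = static_cast<std::uint8_t>(Port | mask); }
    static void low()  { Port = static_cast<std::uint8_t>(Port & ~mask); }
    static void write(bool level) { level ? high() : low(); }
    static bool read() { return (Port & mask) != 0; }
};

// On an Arduino Uno, digital pin 13 is bit 5 of port B.
using Led = Pin<SIM_PORTB, 5>;
```

Led::high() is then a read-modify-write of one constant address with a constant mask. Whether a given compiler reduces that to a single set-bit instruction depends on the target and the optimizer, so treat that as the hoped-for outcome and check the generated asm.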

Yeah, you could have each "opcode" represented by a single unsigned char, so with an RPN-kind of language the "interpreter" works pretty much the same as you describe, a program counter stepping thru the script and pushing and popping functions and data on and off the stack, and harvesting return values, as appropriate.

You can have foolproof variant types that will happily hold both function pointers and data in the same "package", on the same software stack, using a custom stack allocator designed as appropriate for the capabilities of the hardware. You can have locally scoped lambda functions generated on-the-fly at compile time and stored in flash memory that can be templated to accept any number of arguments from the stack as is required automatically.
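One way to picture the "function pointers and data in the same package" part is a small tagged value type on a fixed-size stack. This is a hand-rolled sketch with invented names (Value, Stack, add) and an arbitrary capacity of 8; a production version might use std::variant and a custom allocator instead.

```cpp
#include <cassert>
#include <cstddef>

class Stack;
using Fn = void (*)(Stack&);

// A value that holds either an integer or a function pointer,
// with a tag recording which one is currently stored.
struct Value {
    enum class Kind { Int, Func };
    Kind kind;
    union {
        long i;
        Fn f;
    };
    static Value integer(long v) { Value x; x.kind = Kind::Int; x.i = v; return x; }
    static Value function(Fn fn) { Value x; x.kind = Kind::Func; x.f = fn; return x; }
};

// Fixed-capacity stack: no heap, the size is decided at compile time.
class Stack {
public:
    void push(Value v) { assert(top_ < kCapacity); slots_[top_++] = v; }
    Value pop() { assert(top_ > 0); return slots_[--top_]; }

    // Checked accessors: the tag is verified before the union is read.
    long pop_int() {
        Value v = pop();
        assert(v.kind == Value::Kind::Int);
        return v.i;
    }

    // Pop a function and call it; it takes its own arguments from the stack.
    void call() {
        Value v = pop();
        assert(v.kind == Value::Kind::Func);
        v.f(*this);
    }

    std::size_t size() const { return top_; }

private:
    static constexpr std::size_t kCapacity = 8;
    Value slots_[kCapacity];
    std::size_t top_ = 0;
};

// An ordinary function usable as a stack word.
void add(Stack& s) {
    long b = s.pop_int();
    long a = s.pop_int();
    s.push(Value::integer(a + b));
}
```

A captureless lambda converts to the same function pointer type, so Value::function([](Stack& s) { /* ... */ }) can sit on the stack next to plain integers.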

All these things are standard stuff in the modern C++ hotness, and compile down to having not much more overhead than pure C or asm.

This project would have to be pretty low on my list of priorities, unfortunately, but I'll definitely keep the offer in mind...:-)

- Tom Gardner
  
Wed, Jun 28, 2017 3:30 PM

There is a general philosophical question: is it better to create

- a domain-specific language or

- a domain-specific library in a standard language

Without exception, all the "let's allow easy scripting" (before or after product delivery) using our own DSLanguage have speedily ended up being a disgusting mess that even the /creators/ no longer understand. I've seen two at close quarters, and narrowly avoided creating one myself.

Even if a DSLanguage isn't an unholy mess, nobody else will want to work with it, there is zero tool support, unless you create it yourself, and you'll have to train people to use and extend it.

OTOH a DSLibrary comes with all the usual tool support, people /expect/ to use DSLibraries, and the training and support are the essential minimum.

The exception to DSLanguages being universally crap is that well-defined /model/ based languages based on standard concepts can sometimes be beneficial. Classic examples are those based on FSMs and - particularly relevant in this case - ladder logic.

Stick to Forth rather than invent a manky DSLanguage.

- Clifford Heath
  
Wed, Jun 28, 2017 11:42 PM

Exactly what I've started doing. E.g. ,

The AD9851 template has six parameters which are compile-time constants, but would otherwise have to be stored in instance variables. Similarly with my port-pin templates, posted here a month or three back.

Newbs don't really need to know how to write it, just how to use it.

You do wind up with some ugly hacks (like empty/non-empty base classes in templates, if e.g. you want to make that AD9851 calibration to be adjustable at runtime). And some of those hacks become virtual functions when you don't really need them. C++ is an ugly language full of warts.

Exactly what my port pin templates do. Pity I have to rewrite every library I want to use, but it does make them faster and smaller.

Except then you get a jump-table interpreter which kills branch prediction so you tend to flush the pipeline on every instruction. That doesn't matter on an AVR, but does on Cortex. The advantage with threaded interpreters is the opcode *is* the pointer to the implementation. Not as good as JIT, but still.

Yes. It's esoteric, but doesn't have to be hard for end-users.

Clifford Heath.

- Clifford Heath
  
Wed, Jun 28, 2017 11:55 PM

You mean, *another* manky DSLanguage, like Forth.

- Les Cargill
  
Sun, Jul 2, 2017 5:50 PM

I think the constraint on project size means this is just fine. You don't want ctors()/dtors() nonsense buggering up a small micro.

There is always assembly.

That's a really big wheel to reinvent.

A video game capable computer is many, many orders of magnitude away from an Arduino. It's skateboards vs. fully loaded freight trains.

That's not even I don't know :)

This is much more trouble than it's worth. I'm working with someone right now who is taking Arduino-class devices[1] seriously for instrumentation, and the point of an Ard is that it is nimble and deterministic.

[1] as proto boards, with custom designs for production.

That's not a bad approach.

I think you'd at least want bytecode. But ...

--
Les Cargill