parser for embedded systems

chrisu 17 years ago

hello,

i need to use a parser in my embedded software. currently i am trying to integrate tinypy or lua. does anyone have experience using parsers in embedded software? the requirements for the parser are the following:

- easy to integrate in a c application, no c++

- small footprint, some kbytes in flash and ram

- fast

- execute functions from firmware with parser

- command line interface, read variables ...

please give me some hints...

chrisu

Frank Buss 17 years ago

You can try Forth. Most systems have a command line interface integrated and can be used with some kbyte of Flash and RAM. If you have a C compiler for your platform, you can start with

formatting link

, which should be as fast as other interpreters, like Lua. For many platforms you can find native implementations, too, which can be as fast as C.

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

Jon Kirwan 17 years ago

Okay. Are you looking for the ability to permit customers to execute code they write as a custom matter? Or do you just want them to be able to examine internal states, change control parameters, dump them out, and so on?

If you intend supporting a language of sorts, why? What purpose is being served here? For example, can you accept a mechanism like BASIC 'gosub/return' with global variables? Or do you have to have concepts like local variables, passed parameters, and so on? Also, what exact access is needed into the rest of your embedded application?

What precisely are your flash and RAM limits? I'm asking this because tinypy is "a minimalist implementation of python in 64k of code" and that makes me wonder what you mean by "some kbytes." Worse, it is said of lua that "under Linux, the Lua interpreter built with all standard Lua libraries takes 153K and the Lua library takes 203K" and that is with Linux sitting there for some support functions, as well! So what in the heck do you mean? Are we talking a few kbyte, say in the "4-12" range? Or in the "a few hundred kbyte would be okay" range? Maybe you are using the phrase in a way I'm unaccustomed to.

If you are truly limited well below the 64k mark set by tinypy, for example, then I don't think you are going to get either tinypy or especially lua in there. Forth is definitely something you might consider -- I have a hunch (I'm actually ignorant about today's situation as it's been more than 20 years since I even looked) there are some implementations that may be quite small and can be fitted into your application with some work. You will need to supply some interfaces into your application, I suspect, at the very least. But also if you are only talking small numbers of kbyte it sounds more to me as though this is something you are going to pony up on your own. Because general purpose stuff written by and for folks who know nothing of your application usually carries a lot of things with it that you don't need and don't want. And with space at that kind of premium it's likely you will custom-develop it. On the positive side, it's not hard at all to write a lexer/parser and execution in very small space if you keep your eye focused on exactly and only what you need to have.
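To give a feel for how little code a purpose-built command interpreter can take, here is a minimal sketch in C. It assumes each command is one line of space-separated words and that handlers are plain C functions looked up in a constant table. All names in it (`cli_execute`, `getbaud`, `setbaud`, `g_baud`) are hypothetical and not taken from any poster's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 4

/* Hypothetical firmware variable exposed through the command line. */
static long g_baud = 9600;

typedef int (*cmd_fn)(int argc, char **argv, long *result);

static int cmd_get_baud(int argc, char **argv, long *result) {
    (void)argc; (void)argv;
    *result = g_baud;
    return 0;
}

static int cmd_set_baud(int argc, char **argv, long *result) {
    if (argc != 2) return -1;            /* expects exactly one argument */
    g_baud = strtol(argv[1], NULL, 10);
    *result = g_baud;
    return 0;
}

/* The table is const, so a typical toolchain can keep it in flash. */
static const struct { const char *name; cmd_fn fn; } commands[] = {
    { "getbaud", cmd_get_baud },
    { "setbaud", cmd_set_baud },
};

/* Splits `line` in place on spaces, looks up the first word and calls
   its handler. Returns the handler's status, or -1 for an empty line,
   an unknown command, or more than MAX_ARGS words. */
int cli_execute(char *line, long *result) {
    char *argv[MAX_ARGS];
    int argc = 0;
    char *p = line;

    assert(line != NULL && result != NULL);
    for (;;) {
        while (*p == ' ') p++;           /* skip separators */
        if (*p == '\0') break;
        if (argc == MAX_ARGS) return -1;
        argv[argc++] = p;                /* start of a word */
        while (*p != '\0' && *p != ' ') p++;
        if (*p == ' ') *p++ = '\0';      /* terminate the word */
    }
    if (argc == 0) return -1;
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(argv[0], commands[i].name) == 0)
            return commands[i].fn(argc, argv, result);
    }
    return -1;
}
```

Passing a writable buffer holding `setbaud 115200` to `cli_execute` updates `g_baud`, and a later `getbaud` reads it back. RAM use is the line buffer plus a few pointers on the stack.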

Oh. I haven't even brought up the question of the processor and development tools -- whether it requires (or even supports) reading flash as data space, for example. When I see that lua can be built for Linux, I imagine in my mind a case where there is a simple, virtual memory space for the application which is loaded there by some operating system and where the code, constants, data, and so on share a simple, single address space. Your specific processor may... flummox... this ideal a bit and that may be yet something else you need to concern yourself about. For example, if you build someone else's language for your use will the constants need to be stored in flash but also then copied out into data RAM by the startup code in order for it to operate properly? (Which then means they add to both requirements.) What about initialized data? What kind of linking control do you have in your development tools and how familiar are you with the various concepts of linkers and application linking?

Of course, maybe you have 100's of kbyte of flash and many tens of kbyte of RAM and more...

Who can provide better guidance without knowing more about what you are trying to achieve? Why didn't you write more about it to start?

Jon

George Neuner 17 years ago

It depends on your platform and what you need to parse. Some examples of your intended input would help.

For regex and simple LL languages you can use re2c[1] to hand write a recursive descent parser. This is probably the tightest code you can achieve.
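As an illustration of what a hand-written recursive descent parser looks like, here is a minimal sketch in C for integer arithmetic. It does not use re2c: a few lines of hand-written character scanning stand in for the generated lexer. The grammar and all names (`eval_expr`, `parse_term`, and so on) are assumptions made for the example, and overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Grammar handled by this sketch:
     expr   := term   (('+' | '-') term)*
     term   := factor (('*' | '/') factor)*
     factor := NUMBER | '(' expr ')' | '-' factor               */

struct parser { const char *p; int error; };

static long parse_expr(struct parser *ps);

static void skip_ws(struct parser *ps) { while (*ps->p == ' ') ps->p++; }

static long parse_factor(struct parser *ps) {
    long v = 0;
    skip_ws(ps);
    if (*ps->p == '-') { ps->p++; return -parse_factor(ps); }
    if (*ps->p == '(') {
        ps->p++;
        v = parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')') ps->p++; else ps->error = 1;
        return v;
    }
    if (!isdigit((unsigned char)*ps->p)) { ps->error = 1; return 0; }
    while (isdigit((unsigned char)*ps->p)) v = v * 10 + (*ps->p++ - '0');
    return v;
}

static long parse_term(struct parser *ps) {
    long v = parse_factor(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '*' && op != '/') return v;
        ps->p++;
        long rhs = parse_factor(ps);
        if (op == '*') v *= rhs;
        else if (rhs != 0) v /= rhs;
        else ps->error = 1;              /* division by zero */
    }
}

static long parse_expr(struct parser *ps) {
    long v = parse_term(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '+' && op != '-') return v;
        ps->p++;
        long rhs = parse_term(ps);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Returns 0 and stores the value in *out, or -1 on a syntax error. */
int eval_expr(const char *text, long *out) {
    struct parser ps = { text, 0 };
    assert(text != NULL && out != NULL);
    long v = parse_expr(&ps);
    skip_ws(&ps);
    if (ps.error || *ps.p != '\0') return -1;
    *out = v;
    return 0;
}
```

There is one function per grammar rule and no tables, which is why this style tends to be compact. Nested parentheses recurse, so on a small stack the nesting depth may need a limit.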

For more complicated languages, you can use PCCTS[2] or Bison[3]. PCCTS generates recursive descent parsers, Bison generates table driven parsers. Neither is as tight as a hand written parser but they are easier to maintain if the language grammar is complex.

Bison's tables tend to be larger than the corresponding RD code from PCCTS, but Bison can parse more complicated languages (if that matters). The tables can also be compressed to save space although that results in somewhat slower parser operation.

Both PCCTS and Bison require a lexer (input reader) ... for either you can use Flex[4] or write your own with re2c. PCCTS comes with a tool called DLG which generates a recursive descent lexer (I have never used it).

I have successfully used all of these. Keep in mind, though, that these tools are not designed for embedded systems. They all require dynamic allocation and you will need to accommodate that somehow.

George

[1]

formatting link

[2]

formatting link

[3]

formatting link

[4]

formatting link

Frank Buss 17 years ago

You can use some fixed width arrays. E.g. this was a solution by me for parsing and evaluating formulas with up to 3 variables and a fixed length bytecode array:

formatting link

No dynamic allocation needed. It would be easy to enhance it with a command line interface, variable declarations and function calling, but it depends on what the OP needs it for. If he needs more, like defining his own functions within the script language, loops, conditionals etc., it might be better to use an existing language implementation, because it is not easy to develop a sound and easy to use full computer language, and a bug free implementation for it.
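The linked code is not reproduced in this thread, so the following is only a sketch of the general idea in C and not the original solution: a formula is compiled into a fixed-length bytecode array and evaluated on a fixed-size stack, with three variables. To keep it short it assumes the input is already in postfix form (for example `x 2 * y +`). The opcodes, limits and function names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CODE_MAX  32   /* maximum instructions per formula */
#define STACK_MAX 8    /* maximum evaluation stack depth   */

enum { OP_END, OP_CONST, OP_VAR, OP_ADD, OP_SUB, OP_MUL };

struct insn { unsigned char op; long arg; };
struct program { struct insn code[CODE_MAX]; size_t len; };

/* Compiles postfix text into prog. Returns 0 on success, or -1 on an
   unknown token or when the formula does not fit in CODE_MAX. */
int compile_postfix(const char *s, struct program *prog) {
    prog->len = 0;
    while (*s != '\0') {
        struct insn in = { OP_END, 0 };
        if (*s == ' ') { s++; continue; }
        if (*s >= '0' && *s <= '9') {
            in.op = OP_CONST;
            while (*s >= '0' && *s <= '9') in.arg = in.arg * 10 + (*s++ - '0');
        } else {
            switch (*s++) {
            case 'x': in.op = OP_VAR; in.arg = 0; break;
            case 'y': in.op = OP_VAR; in.arg = 1; break;
            case 'z': in.op = OP_VAR; in.arg = 2; break;
            case '+': in.op = OP_ADD; break;
            case '-': in.op = OP_SUB; break;
            case '*': in.op = OP_MUL; break;
            default:  return -1;
            }
        }
        if (prog->len == CODE_MAX) return -1;
        prog->code[prog->len++] = in;
    }
    return 0;
}

/* Runs prog with vars = {x, y, z}. Returns 0 and stores the result in
   *out, or -1 on stack overflow, underflow, or leftover operands. */
int run_program(const struct program *prog, const long vars[3], long *out) {
    long stack[STACK_MAX];
    size_t sp = 0;
    assert(prog->len <= CODE_MAX);
    for (size_t i = 0; i < prog->len; i++) {
        const struct insn *in = &prog->code[i];
        if (in->op == OP_CONST || in->op == OP_VAR) {
            if (sp == STACK_MAX) return -1;
            stack[sp++] = (in->op == OP_CONST) ? in->arg : vars[in->arg];
        } else {
            if (sp < 2) return -1;
            long b = stack[--sp];
            long a = stack[--sp];
            stack[sp++] = (in->op == OP_ADD) ? a + b
                        : (in->op == OP_SUB) ? a - b : a * b;
        }
    }
    if (sp != 1) return -1;
    *out = stack[0];
    return 0;
}
```

All storage is sized at compile time, so the worst-case RAM use is known up front. A formula that is too long or too deeply nested is rejected with an error code.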

Frank Buss, fb@frank-buss.de http://www.frank-buss.de, http://www.it4-systems.de

chrisu 17 years ago

hello,

i think re2c is what i was looking for. thanks to all for your support!

chrisu 17 years ago

hello,

my requirements are:

- code footprint up to 50k

- ram usage. in the optimal case nearly nothing. but 5k is no problem.

- modify or expand the parser without any coding - just by changing some configure and maybe generating some new code

i need to implement an ethernet terminal on an embedded system to execute simple code (hopefully no loops or conditions) and mainly call functions from my application to configure it and get some statistics at runtime. this is all i'm looking for ;) tinypy, lua or flex and bison ... are too mighty for my purpose - but re2c seems to be the right choice.

thanks for your helpful suggestions

George Neuner 17 years ago

All these tools require runtime allocations of variable sizes. The minimum possible implementation is a mark-release style heap.

Suballocation of a heap is "dynamic allocation" by definition.
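For readers unfamiliar with the term, a mark-release heap can be sketched in a few lines of C. This is a minimal illustration under assumed names and sizes (`arena_alloc`, `arena_mark`, `arena_release`, a 1 KB static buffer). It is not the allocator any of the tools above actually ship with.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 1024

/* Rounding every request up to the size of this union keeps each
   returned pointer suitably aligned for the common scalar types. */
union align_unit { long long ll; long double ld; void *p; };

static union {
    union align_unit align;              /* forces buffer alignment */
    unsigned char bytes[ARENA_SIZE];
} arena;

static size_t arena_top;                 /* offset of the next free byte */

/* Bump allocation: returns NULL when the request does not fit. */
void *arena_alloc(size_t n) {
    const size_t a = sizeof(union align_unit);
    size_t rounded = (n + a - 1) / a * a;
    if (rounded < n || rounded > ARENA_SIZE - arena_top) return NULL;
    void *p = &arena.bytes[arena_top];
    arena_top += rounded;
    return p;
}

/* Remember the current allocation level. */
size_t arena_mark(void) { return arena_top; }

/* Free everything allocated since the matching arena_mark() call. */
void arena_release(size_t mark) {
    assert(mark <= arena_top);
    arena_top = mark;
}
```

A typical pattern is to call `arena_mark()` before parsing one input line and `arena_release()` once the command has run. There is no per-block free and no fragmentation, but allocations can only be released in reverse order, as a group.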

George
