Delayed "printf"

T

Tim Wescott 9 years ago

Sometimes it's necessary to float a notion just so that someone can affirm your suspicion that yes, you really are an idiot.

On the off chance that I'm a genius, though, here goes:

I don't like putting "printf" in fast tasks for obvious reasons -- even with infinite spooling, it can take time to process a format string. In my usual context, this can result in audible sounds from motors when I print some bit of debugging information from a high-priority task. The problem can't be alleviated by spooling the output of the printf, because it's the execution time of the printf itself that's the problem.

Just how dumb is it to have "printf" queue up the format string AND the parameters into a buffer, to be printed out in some low-priority task? For a system with sufficient memory, this seems like an astonishingly sensible thing to me.

Clearly you'd have to wrap it up somehow so that if the buffer was full no printf at all would happen -- but I'm kinda liking the notion.

So -- whatcha think?

Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!

Vote

D

Don Y 9 years ago

... because the printf() is "in series" with some other "time-dependent" activity.

Of course, you couldn't rely on any side-effects of the printf() in code following the point at which you "emit" the data to be "printed". Nor can you alter the parameters until you know (for sure) that they've been captured by the followup process.

I use a crude interface to pass (unformatted) "data" to the "black boxes" (think: "flight recorder") in my efforts logging -- even in ISRs. Then, move the "smarts" into the routines that examine the BB contents.

E.g., log(blackbox, timestamp, messageID, ...)

This allows each message type to have its own notions of the parameters of interest (e.g., message ID #4 might take two doubles and an 8 character array while message ID #9 takes a long int and a float; each might pretty-up the presentation of the data with some suitable text).

Doing so lets me keep the size of each blackbox small as it doesn't need to store the data in a form suitable for human consumption -- in much the same way that "error 27" tells a user all he needs to know (via a cheat sheet).

Of course, the interface has to be value driven instead of reference (unless the referenced objects are immutable) as they won't be "resolved" (dereferenced) until much later in time.

Why the restriction? Do you value previously generated "output" more than "more recent" output? (I guess that's conceivable)

Vote

P

Paul Rubin 9 years ago

It's fine, just be sure that if there are pointer parameters (like for %s in the format string) that the objects they point to survive long enough for the logging task to print them.

Vote

T

Tim Wescott 9 years ago

That is implied by "fast task", yes.

Good point. printf returns a value that depends on what's printed, IIRC

-- you'd want a "printf_later" command, I guess.

I was thinking along the lines of capturing them. Which means parsing the string so you can capture values of numbers and contents of strings, which probably means it was a dumb idea after all -- dammit.

If you're pumping stuff into it faster than it can be printed out, something has to give, and you'd want it to give as gracefully as possible.

Tim Wescott Control systems, embedded software and circuit design I'm looking for work! See my website if you're interested http://www.wescottdesign.com

Vote

C

Clifford Heath 9 years ago

We did this with an error message system in an enterprise system written mostly in C++. I had produced a type-safe ArgList class, which allowed you to

Vote

D

Don Y 9 years ago

I was trying to draw attention to the fact that you *could* have had printf() working "in parallel" with the task.

Typically, one ignores the return value of printf().

But, you can use things like "%n" (with an associated int*) to get other "output" from a format string (folks tend to not use this much, either).

You just need to be aware of this when you finally "evaluate" the function.

No, just capture them in a lower-level form. E.g., my blackbox mechanism just captures sets of bytes. The "post processor" figures out what they are intended to mean.

Its like marshalling arguments for an IPC -- and imposing the "typing" later.

The drawback to my scheme is that it has limited flexibility and is less "intuitive" than printf() stubs. E.g., I can't arbitrarily decide that I want to preserve A, B and C -- without knowing their sizes, etc.

But, chances are, you have some (small) fixed number of printf() format strings; your code isn't *synthesizing* arbitrary strings! So, you nail those down, give each a "message ID" and a corresponding decoder "later".

Of course! In my case, I'm interested in the most recent history; I want a continuous record from "now" backwards in time -- to whatever extent the blackboxes size will accommodate. Of course, if I have emitted a lot of LONG messages (in terms of their raw forms, not their "post processed" forms), then I have fewer complete messages available. But, there are no gaps within them.

If you allow *some* printf_later()'s to produce NO output, then you risk having holes in your "printf stream".

(that may be acceptable to you)

Vote

R

Robert Wessel 9 years ago

A problem is that finding all the input parameters, and copying them someplace, may well be a considerable amount of work. *Output* parameters (%n, the return value), have already been mentioned, and pose significant additional problems.

While copying something like an integer or float is going to be fairly quick (assuming you have some place to put them already allocated), and likely rather faster than than actually formatting them, you're still going to have to do a non-trivial amount of parsing to actually

*find* those parameters. And remember you have not just the parameters, but the precision and min size can be specified as a parameter ("*"). With strings, you have a worse allocation problem, and copying the strings to a temporary area is going to be as much work as actually processing them.

So at best, you're likely to only save a modest amount of CPU time in the issuing thread formatting the output string, and quite possibly that CPU usage in the main thread may be a non-trivial amount worse. Overall CPU usage will certainly be worsened considerably.

OTOH, your real goal is to prevent the printf from blocking because it lacks output space.

A simple way to do that is to perform an snprintf to a pre-allocated, and fairly fixed size, buffer, and then pass *that* buffer to a separate thread for output. A better option may be to issue an asynchronous I/O request (although you'll have to step outside of C streams at that point). If the available buffer is too small, you have the choice of ignoring the output, or truncating it, and you have several options for dealing with a new output request from the main thread that happenes before the old buffer is disposed of (ignore, block, have multiple available output buffers).

An advantage of doing the snprintf in the main thread, is that you largely leave the semantics of printf unchanged.

You can sometimes also check (OS dependent) if there is space in the stream output buffer (__fbufsize and __fpending in *nix, for example), and use that to judge whether you're going to have room for the output (you could do a trial with snprintf and a zero length buffer to get the length required, for example). You then need to compensate for things like expansion of newlines (OS dependent, of course, not an issue in *nix*).

Vote

J

Jack 9 years ago

o:

Freescale has a tool, called Freemaster, that allow to read/write directly the RAM of a uC (it also read the elf file, so you have the name of the var iables, and not only the addresses). It's really handy and trasparent to the processor (if it can connect using the programming interface: JTAG or whatever, else a small UART driver is ne eded). Even on other processor it's usually very easy to write a small communicati on protocol to r/w RAM, and a GUI with python is done in half a day. Much better than printf.

Bye Jack

Vote

T

Tom Gardner 9 years ago

Where execution time/latency is important, I have used combinations of these techniques as appropriate:

- rapid determination that printing isn't necessary (e.g. conditional logging of non-fatal conditions)

- pass parameter by value, not reference/pointer, so it remains unchanged

- output parameters directly, to be parsed/prettyprinted by a logging process on a different machine (e.g. FSM state/event encoded as a byte, not a string)

- message boxes or FIFOs between the data source and data consumer threads, depending on relative thread speeds, data rates, and whether the data source needs to ensure message delivery

- async callbacks to the data source, indicating completion (whatever that means) of logging

Vote

J

John Devereux 9 years ago

I think it's a good idea... the format string would normally be in flash so it would only be a pointer in the buffer AFAICT.

Not quite what you asked, but relevant - an interesting video from mikeselectricstuff about using a scopes serial decode for non-intrusive real time debugging / high speed debug messaging.

John Devereux

Vote

D

David Brown 9 years ago

The problem with this approach is that snprintf is still doing most of the work that printf does - it is still a slow function.

Vote

D

David Brown 9 years ago

printf is a horrible function for such purposes - with its variable arguments, etc. Consider having delayed_print_string(), delayed_print_int(), delayed_print_hex(), etc., so that you can have a queue with of structs with a fixed format (a tag for the type to be printed, a union of the data types, etc.). It is a lot easier to keep control of a fixed set of types of a specific number, than a general variable argument set.

If you have a C11 compiler and can grok _Generic, you can wrap the whole thing in variable argument macros and get a programming interface such as:

delayed_print("Current pos: ", position, " register: 0x", hex(reg, 4), endline);

But if you do go for a printf-style function, don't forget gcc's "format" __attribute__ on your function to check that the formatting and the arguments match up - it will save much wailing and gnashing of teeth.

Vote

R

Robert Wessel 9 years ago

My point was that the work of actually saving the parameters someplace for another thread to use was also a substantial amount of work, and, depending on how general you were being (for example, actually doing that based on the variable input to a printf-like function), you might well be doing *more* work on the issuing thread on many possible printf calls.

While a front-end to printf might well be faster when parsing and saving something like "xprintf("%f-%f", f1, f2)" compared to actually executing the snprintf, it would almost certainly be *slower* than the actual snprintf handing "xprintf("%s-%s", s1, s2)".

And that's ignoring all of the synchronization issues, and any work involved in those.

Now if you had non-variable functions, where you didn't have to parse the format string to find the parameters, and you then applied the format only in the worker thread, that should help, but it's a much less general solution, and will help only modestly in many cases.

In any event I thing I was reading the OPs message as being more concerned with the possibility of the printf call blocking.

Vote

D

David Brown 9 years ago

Absolutely true. It depends on many factors, currently only know to the OP.

It would perhaps be less general, but it could well be good enough for the job. And it does not /have/ to be less general - you could build something akin to C++'s output stream system.

In fact, you could make your queue simply hold a pair with a pointer to a function, and a data member (which would be interpreted as either a value directly, a string pointer, or a pointer to a struct with more data). This would be /more/ general than a printf, while keeping the shared queue structure simple.

Of course, you still have issues with synchronisation, interleaving of messages printed from different threads or contexts, and the challenge of making sure your debug data stays alive between the queuing operation and the dequeuing operation.

That is one possible interpretation, but not the only one.

In my own code, I have made printf functions that use snprintf and then pump the output directly into a DMA buffer for a UART, skipping the layers of library calls for file I/O. In such cases, it is the formatting of the data that takes time, rather than the output of the data.

But I have also seen printf's that boil down to polled blocking UART transmission after layers of _write() functions and other internal library code - in which case it is the output part that is slow.

Vote

T

Tim Wescott 9 years ago

You assume. My real goal is to prevent printf from blocking because it is slooooooooooooooooow. Lacking output space is just an anticipated failure mode that I wish to avoid.

Tim Wescott Control systems, embedded software and circuit design I'm looking for work! See my website if you're interested http://www.wescottdesign.com

Vote

S

Stefan Reuther 9 years ago

Am 25.04.2017 um 02:42 schrieb Tim Wescott:

The first question to ask is: do you really need a full-blown 'printf'? Or do you just need something that expands '%d' and/or '%x', '%s', and that's it? Such a function can be implemented in a screenful of code, using a 'putchar' function as the back-end, which in turn writes into an asynchronous buffer.

Such a function is fast so you might get away with it. Compared to 'printf', it lacks locking and synchronisation with the rest of stdio, and it also doesn't need the bajillions of ifs which 'printf' uses to implement left/right-pad, pad-with-zero, float, thousands-separators, etc.

If you wish to keep using a 'printf'-style interface and just stash away the parameters, you'd still need to pre-parse the format string and I'd guess that doesn't save much compared to actually formatting.

I worked with such a 'printf' workalike function in the past and had no problems using it from things like audio DMA interrupts.

Another question to ask is whether you want to log essays like "foo is not recognized as an internal or external command, operable program or batch file. Try reinstalling the application if the program persists", or just brief messages like "foo: file not found". :-)

Stefan

Vote

T

Theo Markettos 9 years ago

My idea, for a system which is supposed to run code written by arbitrary users who expect printf, but is limited in message size that you don't want to send the text, is to abuse the tools for handling localisation.

In other words the user writes: printf("Hello world foo=%d, bar=%s, baz=%f blah blah blah\n", foo, bar, baz);

and the locale tools transform that into a text file containing strings like: STR42: "Hello world foo=%d, bar=%s, baz=%f blah blah blah\n"

and code like: _locale_print("STR42", foo, bar, baz);

a function which packages STR42 and binary representations of STR42 and the parameters into a packet and sends it on its way. (You might need to annotate the _locale_print with the types of the parameters somehow)

At the other end, the PC uses the text file so that when it receives STR42 in a packet it knows how to interpret the packet and prints it out (or logs it to a file for printing later in some GUI tool).

We haven't actually implemented it yet.

Theo

Vote

H

Hans-Bernhard Bröker 9 years ago

Am 25.04.2017 um 02:42 schrieb Tim Wescott:

Exactly as dumb as using printf() at all while being concerned about CPU cycles at all. I.e. the first rule of adding actual printf()s in the hot path is:

DON'T.

Using the real printf for this effectively impossible, if only because there's essentially no way to know, in the general case, what needs to be stored in that printf job queue to be able to execute it later, without parsing the entire format string first. That parse alone will likely already take longer than you can afford.

You need to use something else; if it's to resemble printf(), it'll have to be _considerably_ pared down to forbid all the impossible-to-implement features: no * widths, no * precisions, no %s format, no %n format, best no int128_t, long doubles, etc.

A much saner function whose calls can be queued up for later playback would look like this:

void deb_out(t_typeToPrint type, /* enum of allowed types */ size_t size, /* refuse if size > some limit */ void *data /* copied at call time! */ );

Based on that one might wish to make convenience (inline?) wrappers like:

void deb_out_u8(uint8_t value) { deb_out(E_TYPE_TO_PRINT_U8, 1, &value); } void deb_out_u16(uint16_t vlaue) ...

and maybe, if you have near-perfect control over who gets to use this, even some rather risky ones:

void deb_out_f_float(const char *fmt, float value) { deb_out(E_TYPE_TO_PRINT_FORMATSTRING, strlen(fmt), fmt); deb_out(E_TYPE_TO_PRINT_FLOAT, sizeof(float), &value); }

The worker thread would synthesize a call of snprintf() or similar from those two entries. Entries without a preceding E_TYPE_TO_PRINT_FORMATSTRING one are printed with a default format.

Vote

T

Tim Wescott 9 years ago

The responses so far are really making me think.

I guess the better way to frame my desire is thus:

I want some way to print free-form debug data to a serial port in such a way that anything slow (formatting and waiting on the UART) is performed in a low-priority task.

Some printf-like thing may not be the best way to do it -- but I still want to do it!!

Tim Wescott Wescott Design Services http://www.wescottdesign.com I'm looking for work -- see my website!

Vote

D

Dave Nadler 9 years ago

I recently had a similar problem. What I did was:

- Create a virtual base class with a printf pure virtual function

- derived classes add required parameters and appropriate printf

- fast/time-sensitive task creates+enqueues derived-class objects

- slow path dequeues and prints This is with FreeRTOS where elapsed time to create/enqueue is very fast (wrt my application, your mileage...).

Hope that helps! Best Regards, Dave

PS: C code would do the same with enum for type, union parameter sets, etc. Not as easy and neat as C++...

Vote

Delayed "printf"

Join the Discussion

Didn't find your answer?