Linux printf funny

- Paul Burke
posted
17 years ago

Tue, Feb 27, 2007 12:42 PM

I'm converting an application from Windows console to Linux, and the changeover has gone remarkably easily (considering that I know very little about Linux), until now.

No problem installing GCC, KDevelop, FTDI USB drivers, remarkably few changes to recompile the code... but printf fails after about half-a-dozen calls. A float value prints as "nan" (not a number, I assume, rather than what I eat with an Indian takeout). This value is computed from two int values (actually a weight and a tare reading) and multiplied by a scale factor (1.0 for the tests).

The funny thing is that I can't see anything different about the weight or the tare value between instances that print and those that fail. There is the expected one-or-two-bit wobble in the weight reading, but the values only oscillate between plus and minus one relative to the tare. Once it fails, it seems to be sticky: it doesn't recover even when the readings are identical to those before the nan.

So please, you Linux/GCC-experienced people: what absolutely basic item of knowledge am I lacking?

Paul Burke

- Stef
posted
17 years ago

Tue, Feb 27, 2007 1:13 PM

Is this a win32 console application or an old DOS application?

It does not sound like a windows/linux problem, but more like a programming error. So please show us the code.

I know I found some errors in old DOS (Turbo C) test programs when I ported them to new versions of compilers (Windows or Linux). These programs worked normally using the old compiler but failed using a newer one. All the errors were my own (beginner's) fault: uninitialized vars, wrong format specifiers, near/far pointer stuff etc.

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

- Paul Burke
posted
17 years ago

Tue, Feb 27, 2007 1:22 PM

It was W32 console, compiled with VC++6.

#include #include #include #include #include

    #include "WinTypes.h"
    #include "ftd2xx.h"
    #include "ADC.h"

    int TareWeight = 0;        // ADC reading at no-load
    double ScaleFactor = 1.0;  // Bridge scale factor

    double ReadWeight(void)
    {
        char buff[100];
        double fweight;  // Float version of weight
        int Weight;      // Weight as returned by ADC

        Weight = ADCConvert();

        fweight = (double)(Weight - TareWeight) * ScaleFactor;
        sprintf(buff, "%08x - %08x -> %9.3f", Weight, TareWeight, fweight);
        MessageBox(0, buff, "Conversion Result", 0);
        return fweight;
    }

MessageBox() is just a function that displays the strings supplied. It is a replacement for the Windows message box, left over from the program's earlier incarnation as a Windows GUI program before it got converted to console (don't ask!).

Here's the console output:

    Conversion Result 000087bf - 000087bf -> 0.000
    Conversion Result 000087bf - 000087bf -> 0.000
    Conversion Result 000087bf - 000087bf -> 0.000
    Conversion Result 000087bf - 000087bf -> 0.000
    Conversion Result 000087c0 - 000087bf -> 1.000
    Conversion Result 000087bf - 000087bf -> 0.000
    Conversion Result 000087c0 - 000087bf -> 1.000
    Conversion Result 000087bf - 000087bf -> nan
    Conversion Result 000087c0 - 000087bf -> nan
    Conversion Result 000087bf - 000087bf -> nan

"nan" then remains sticky until I restart the program. You'll notice that the int readings are much the same as when the output is correct!

I could believe that better if it didn't work for the first few readings.

Paul Burke

- Peter Dickerson
posted
17 years ago

Tue, Feb 27, 2007 1:59 PM

If I was a betting man (which I'm not) then I'd say ScaleFactor is being corrupted somewhere else. Clearly TareWeight doesn't stay 0 either. So, what changes them?

[snip]

Peter

- DJ Delorie
posted
17 years ago

Tue, Feb 27, 2007 2:26 PM

How to use the debugger ;-)

Run your app under gdb, put a hardware watchpoint on your variables, and see when they change. Or at least put a breakpoint after the printf, continue until it reads nan, and inspect all the variables to see which one is wrong (if any).

Sometimes, you can't use just printf to debug code.

If you can't use gdb to inspect the variables, write a routine to hex-dump them (i.e. cast their address to unsigned char *, dump sizeof(var) bytes), and call it from various places to try to track down what's happening.
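A minimal sketch of such a hex-dump routine, in C. The names hex_dump and DUMP_VAR are made up for illustration and do not come from the posted code. Bytes are shown in memory order, so on a little-endian x86 machine the least significant byte comes first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format the raw bytes of an object as hex digits into out, in memory order.
   Returns the number of characters written, not counting the terminating NUL. */
size_t hex_dump(const void *addr, size_t len, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)addr;
    size_t used = 0;

    /* Two hex digits per byte plus the NUL must fit in the output buffer. */
    assert(out != NULL && out_size >= 2 * len + 1);

    for (size_t i = 0; i < len; i++) {
        used += (size_t)snprintf(out + used, out_size - used,
                                 "%02x", (unsigned)p[i]);
    }
    out[used] = '\0';   /* also covers the len == 0 case */
    return used;
}

/* Convenience wrapper: print "name = bytes" for any variable to stderr. */
#define DUMP_VAR(var)                                                   \
    do {                                                                \
        char dump_buf_[2 * sizeof(var) + 1];                            \
        hex_dump(&(var), sizeof(var), dump_buf_, sizeof dump_buf_);     \
        fprintf(stderr, "%s = %s\n", #var, dump_buf_);                  \
    } while (0)
```

Calling DUMP_VAR(ScaleFactor); and DUMP_VAR(TareWeight); before and after each suspect call would show which of them changes, if either does.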

- Boudewijn Dijkstra
posted
17 years ago

Tue, Feb 27, 2007 2:28 PM

Looks like an optimization problem. The compiler is seeing a constant where it shouldn't, and/or has modified the type of a variable to double. Without seeing the code that deals with TareWeight, I'd guess this is it. Try making it volatile.

Another possibility is a bug that modifies ScaleFactor, which BTW isn't declared const.

Floating point types have the property that they can 'saturate' to NaNs, Infinities or Zeroes over successive calculations. Some compilers have a "strict math(s)" option to reduce that effect.

--

Made with Opera's revolutionary e-mail program: http://www.opera.com/mail/

- D.
posted
17 years ago

Tue, Feb 27, 2007 2:36 PM

If this is what I believe it is, you should find a way to remove it. I'm not sure it does any harm, but when trying to find arcane errors you'd better go with the safest and most standard setup.

Would the C promotions get in the way somehow? It shouldn't, but I'm still wondering about it.

Safest setup again here: printf("Conversion result: %08x - %08x -> %9.3f", Weight, TareWeight, fweight);

NaN usually results from an invalid operation such as the square root of a negative number, or a calculation involving infinity, or things like that. Which is pretty strange given your calculations...
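To make that concrete, here is a rough sketch in C, assuming IEEE 754 arithmetic (which is what GCC gives you on x86). The function names are invented for the example. With int inputs and a finite scale factor the posted expression always gives a finite result; a NaN needs something like 0/0, infinity minus infinity, or zero times an infinite scale factor.

```c
#include <assert.h>
#include <math.h>

/* The calculation from the posted ReadWeight(), with its inputs as parameters. */
double scaled_weight(int weight, int tare, double scale)
{
    return (double)(weight - tare) * scale;
}

/* 0/0 is an invalid operation. The operand is a parameter so the division
   is performed at run time when the caller passes 0.0. */
double zero_over_zero(double zero)
{
    return zero / zero;
}

/* Infinity minus infinity is also invalid and yields NaN. */
double inf_minus_inf(void)
{
    double inf = INFINITY;
    return inf - inf;
}

/* Small self-check of the cases above. */
void nan_sources_selfcheck(void)
{
    assert(!isnan(scaled_weight(0x87c0, 0x87bf, 1.0)));     /* finite inputs */
    assert(isnan(zero_over_zero(0.0)));
    assert(isnan(inf_minus_inf()));
    assert(isnan(scaled_weight(0x87bf, 0x87bf, INFINITY))); /* 0 * inf */
}
```

So a NaN coming out of that one line points at a bad ScaleFactor or at something outside the expression itself.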

You could try things such as: float ScaleFactor = 1.0; To try to avoid needing more accuracy than what a double (fweight) can represent.

Is ScaleFactor modified somehow somewhere else in the code?

And which compiler to which target are you using?

Regards, D.

- Vladimir Vassilevsky
posted
17 years ago

Tue, Feb 27, 2007 2:36 PM

If it is a multithreaded application, this can be the issue with printf() reentrancy. Many of the stdio.h functions are not reentrant by default, unless you are linking the appropriate libraries.

Vladimir Vassilevsky

DSP and Mixed Signal Design Consultant


- Paul Gotch
posted
17 years ago

Tue, Feb 27, 2007 3:24 PM

...yes, you need to pass "-pthread" to GCC, which has the effect of defining _REENTRANT and linking against libpthread.

This is documented as required on some platforms but, strangely, not for x86. I don't know whether this is a long-standing oversight or whether it really isn't needed.

-p

--
"Unix is user friendly, it's just picky about who its friends are."
 - Anonymous
--------------------------------------------------------------------

- Paul Burke
posted
17 years ago

Tue, Feb 27, 2007 3:40 PM

I've found that it doesn't matter what the ADC reading is, and changing ScaleFactor and TareWeight to constants doesn't change the behaviour. If I run it as originally shown, it does 7 conversions before GDB shows fweight as a nan (value 0x8000000000000). If I split out the calculation, so that:

    fweight = (Weight - xTareWeight);
    fweight *= ScaleFactor;

it still does 7 printfs, but if I comment out the ScaleFactor line, it does 8 before nan shows up! It also behaves like this if I replace variables ScaleFactor, Weight and TareWeight with explicit constant values.

If I change the types from double to float, it behaves the same, except that the nan value now becomes 0x400000.
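Those two values can be decoded. Assuming IEEE 754 formats (as used by GCC on x86), a double has a 52-bit fraction field and a float a 23-bit one. 0x8000000000000 is bit 51 alone and 0x400000 is bit 22 alone, that is, only the top fraction bit in each format, which marks a quiet NaN with no other payload. That is also the pattern the x86 FPU itself produces for an invalid operation, so it is at least consistent with the NaN being computed rather than with stray bytes landing in fweight. The sketch below pulls out the fraction field; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the 52-bit fraction field of an IEEE 754 double. */
uint64_t double_fraction(double d)
{
    uint64_t bits;

    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);          /* copy the bytes, no aliasing tricks */
    return bits & UINT64_C(0x000FFFFFFFFFFFFF);
}

/* Return the 23-bit fraction field of an IEEE 754 float. */
uint32_t float_fraction(float f)
{
    uint32_t bits;

    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits & UINT32_C(0x007FFFFF);
}
```

For an ordinary number such as 1.5 (binary 1.1) the same top fraction bit is the only one set, so double_fraction(1.5) is 0x8000000000000 and float_fraction(1.5f) is 0x400000; what makes the values above NaNs is the all-ones exponent that goes with them.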

Well, I've been doing C since 1983, and I've had trouble before (like forgetting #includes or screwing up the formatting commands), but never one such as this! It all seems so basic - just multiply long 1 by float 1.0, and get an error after doing it OK several times. For now, it's got me bet.

Paul Burke

- tbroberg
posted
17 years ago

Tue, Feb 27, 2007 6:06 PM

Paul, I noticed that the buffer you're sprintf'ing into is immediately adjacent to fweight on the locals list, and hence, presumably, on the stack.

Is somebody overrunning the end of the buffer?
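One cheap way to rule out the formatting call itself is to bound it. The posted format string only produces about 32 characters, well inside buff[100], so this is a sketch of a general precaution rather than a diagnosis. It assumes a C99 snprintf; the function name format_result is invented.

```c
#include <assert.h>
#include <stdio.h>

/* Format one conversion result into buf without ever writing past size bytes.
   Returns 1 if the whole message fitted, 0 if it was truncated or failed. */
int format_result(char *buf, size_t size, int weight, int tare, double fweight)
{
    int n;

    assert(buf != NULL && size > 0);

    /* snprintf writes at most size bytes including the NUL and returns the
       length the full message would have had. */
    n = snprintf(buf, size, "%08x - %08x -> %9.3f",
                 (unsigned)weight, (unsigned)tare, fweight);
    return n >= 0 && (size_t)n < size;
}
```

In ReadWeight() this would be called as format_result(buff, sizeof buff, Weight, TareWeight, fweight). The same check is worth applying to any other buffer that is filled near these variables, the replacement MessageBox() included.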

- Arlet
posted
17 years ago

Tue, Feb 27, 2007 6:26 PM

Depending on the CPU you're using, the compiler may implement constant doubles by creating a 'constant pool' in memory, and loading the values from there.

My guess is that some other part of your program is corrupting memory, and that after 7 or 8 iterations, the corruption has reached the code or variables used in this function. There's nothing wrong with the code you posted.

- tbroberg
posted
17 years ago

Tue, Feb 27, 2007 6:47 PM

...and this fits with your finding that the problem "went away" when you moved the code into a different scope. The buffers / variables would have moved, causing the overrun to smush something else instead.

- Tim.

- CBFalconer
posted
17 years ago

Tue, Feb 27, 2007 9:24 PM

Since you didn't publish any code, we have no idea. The only thing I can point out is that any operation on a NAN yields another NAN. You don't need to know anything about Linux if you stick to ISO standard C.

Cut the program down to a minimum that demonstrates the problem, and doesn't use non-standard C (you can fake inputs by using files) and publish that if the process hasn't made you solve the actual problem.
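The point about NaN propagation can be shown in a few lines of standard C, and it also suggests a way to narrow things down: record the intermediate values and look for the first one that is NaN. This is only a sketch; the names total_of and first_nan are made up, and it assumes isnan from <math.h> (C99).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Add up count values. A single NaN anywhere makes the total NaN. */
double total_of(const double *values, size_t count)
{
    double total = 0.0;

    assert(values != NULL);
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Return the index of the first NaN in values, or -1 if there is none. */
long first_nan(const double *values, size_t count)
{
    assert(values != NULL);
    for (size_t i = 0; i < count; i++) {
        if (isnan(values[i])) {
            return (long)i;
        }
    }
    return -1;
}
```

Because the NaN never goes away once it is in a calculation, the interesting spot is always the first place it appears, not the later ones.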

--
Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.