AVR ATmega644 mysterious reset?

Hello All,

I'm having a hard time with a reset of my application. The ATmega644 is used for an RFID application and handles communication + protocol and all RFID operations, including an anticollision scheme. The anticollision is implemented via recursive calls.

The application is written in CodeVisionAVR C compiler vers. 1.25.2.

I'm debugging in AVRstudio vers. 4.12 build 460 using JTAG mkII ICE.

The problem is an uncontrolled reset of the ATmega644. The reset always occurs when I load the anticollision to a maximum by making the RFID reader handle way too many tags in the read area.

Now the mysterious part is:

  • I have removed all Watchdog enabling/resetting to eliminate the possibility of a watchdog reset. I have verified that the watchdog registers are not written during execution.

  • I have disabled the "Brown-out enable" fuse.

  • I verified that the application does not accidentally jump into the undefined part of the program memory and just run on until it wraps around to 0x0000 (Reset Vector). Furthermore, all volatile memory is cleared when the problem occurs, which tells me that my problem is a reset for sure!

  • I have added a "Reset Source check" that tests the first five bits of the MCUSR register, which describe the cause of the last reset. I break program execution at 0x0000 so I can check this register first thing after a reset. But when my problem occurs the register value is always 0x00. When resetting the device in other ways, the register indicates the right reason, e.g. JTAG reset or External Reset.

One possibility could be a stack overflow caused by the recursion, but that should not generate a uC reset?

I have tried getting closer to the bug using EEPROM debug variables and the like, but the anticollision algorithm is very timing-critical, so this approach just ended up corrupting the application.

Any ideas on how I can get closer to solving this annoying bug? Or does anyone have an idea why I see this problem?

Best Regards

-- Morten M. J. Ba.Sci.EE

(this is also posted on avrfreaks.net)

Morten M Jørgensen

A stack overflow will do the most strange things.....

TT_Man

Nope, cleared memory just means that you have probably run a major portion of your startup again. If you started from the 2nd, 3rd, or 4th instruction of your startup, would the observable results be any different from a full reset? Startup usually contains code for other basic setup (such as stack position) before memory clearing. Since much of that is already done, you wouldn't notice if it had been skipped.

No, an overflow won't generate a reset, but it could still jump to your start location. All that has to happen is that the return address on the stack gets overwritten with the address of the start vector, and then on return you do something similar to a reset, missing only the HW side effects.
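To make that mechanism concrete, here is a toy model in plain C. It is not AVR code: a 32-byte array stands in for SRAM, `RESULT_ADDR` is a made-up address for a result variable, and 16-bit return addresses are pushed downward from the top of RAM. All names and sizes are assumptions for illustration only; real stack frames also hold locals and saved registers.

```c
#include <stdint.h>

#define RAM_SIZE    32u  /* toy SRAM size (assumption) */
#define RESULT_ADDR  8u  /* made-up address of a 2-byte result variable */

static uint8_t ram[RAM_SIZE];
static uint8_t sp;       /* index of the next free byte, stack grows down */

/* Model of CALL: push a 16-bit return address. */
static void push16(uint16_t addr)
{
    ram[sp--] = (uint8_t)(addr & 0xFFu);
    ram[sp--] = (uint8_t)(addr >> 8);
}

/* Model of RET: pop the return address back off the stack. */
static uint16_t pop16(void)
{
    uint16_t hi = ram[++sp];
    uint16_t lo = ram[++sp];
    return (uint16_t)((hi << 8) | lo);
}

/* Nest `calls` levels deep, store 0 in the result variable, then return
   from the innermost level. Gives the address that return jumps to. */
uint16_t innermost_return(unsigned calls)
{
    unsigned i;

    if (calls == 0u || calls > 15u)
        return 0xFFFFu;                    /* outside what the toy RAM can hold */

    sp = RAM_SIZE - 1u;
    for (i = 0; i < calls; i++)
        push16((uint16_t)(0x0100u + i));   /* fake return addresses */

    ram[RESULT_ADDR]     = 0;              /* "result = 0" */
    ram[RESULT_ADDR + 1] = 0;

    return pop16();
}
```

With these numbers, 11 nested calls still return to 0x010A, while the 12th call places its return address exactly where the result variable lives, so the return goes to 0x0000 without any reset flag being involved.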

Limit the number of tags you'll process. Sneak up on the number that starts causing a problem; it may be easier to diagnose with a minimal case. Do NOT ignore odd behaviour at quantities below that required to cause the failure: it may be an early sign of the root cause, and since you may still have a partially operating system it might be easier to diagnose. And take a good look at whatever memory usage you have on a per-tag basis. If you are using dynamic memory allocation, particularly something from the *alloc family, there is a good chance the heap and stack are colliding, and if you are then you should probably switch to something more robust.
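A rough way to put a number on the per-tag (per recursion level) memory cost is sketched below. The helper name and all three inputs are hypothetical; the real stack size, safety reserve, and bytes per recursion level have to come from your own linker/compiler settings and map file.

```c
#include <stdint.h>

/* Hypothetical helper: how many recursion levels fit into the stack,
   keeping `reserve_bytes` free for interrupts and other calls. */
uint16_t max_safe_depth(uint16_t stack_bytes,
                        uint16_t reserve_bytes,
                        uint16_t frame_bytes)
{
    if (frame_bytes == 0u || stack_bytes <= reserve_bytes)
        return 0u;                         /* nothing usable is left */

    return (uint16_t)((stack_bytes - reserve_bytes) / frame_bytes);
}
```

For example, with an assumed 1024-byte stack, 128 bytes held in reserve, and 24 bytes per level, `max_safe_depth(1024, 128, 24)` gives 37 levels. If the tag count can drive the recursion deeper than that, an overflow is to be expected.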

Robert

Robert Adsett

The source of a reset can be found in the MCUSR register. Read it on startup and then clear it. If you are not getting a 'real' reset but just a jump to the reset vector, the register will remain cleared.
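A minimal sketch of such a check, written as portable C so the decoding can be tried off-target. The `RST_*` names, the enum, and `classify_reset` are made up for this example; the bit positions are the ones I read in the ATmega644 datasheet for MCUSR (PORF, EXTRF, BORF, WDRF, JTRF in bits 0 to 4) and should be checked against your own device header.

```c
#include <stdint.h>

/* MCUSR flag masks (assumed from the datasheet, verify for your part). */
#define RST_PORF   (1u << 0)  /* power-on reset  */
#define RST_EXTRF  (1u << 1)  /* external reset  */
#define RST_BORF   (1u << 2)  /* brown-out reset */
#define RST_WDRF   (1u << 3)  /* watchdog reset  */
#define RST_JTRF   (1u << 4)  /* JTAG reset      */
#define RST_MASK   0x1Fu      /* the five flag bits */

enum reset_cause {
    RESET_NONE,       /* no flag set: not a hardware reset */
    RESET_POWER_ON,
    RESET_EXTERNAL,
    RESET_BROWN_OUT,
    RESET_WATCHDOG,
    RESET_JTAG
};

/* Decode a saved copy of MCUSR. Power-on is tested first because
   other flags can be set alongside it after power-up. */
enum reset_cause classify_reset(uint8_t mcusr)
{
    uint8_t flags = (uint8_t)(mcusr & RST_MASK);

    if (flags & RST_PORF)  return RESET_POWER_ON;
    if (flags & RST_EXTRF) return RESET_EXTERNAL;
    if (flags & RST_BORF)  return RESET_BROWN_OUT;
    if (flags & RST_WDRF)  return RESET_WATCHDOG;
    if (flags & RST_JTRF)  return RESET_JTAG;
    return RESET_NONE;
}
```

On the target, the idea would be to copy MCUSR into a variable as early as possible, write 0 to MCUSR, and pass the copy to `classify_reset`. `RESET_NONE` is the case described in this thread: execution reached the reset vector without a hardware reset.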

--
John B

Nope, that does not prove it was a reset.

Nope, a stack overflow will not generate a reset, but it can cause a return to 0000.

Try to imagine this:

You run a recursive routine that, at some point, tests something and keeps calling itself until some result variable yields 0. At some moment your stack, which also contains return addresses, grows into the data area, and your routine decides the test result is 0 and stores this value... on the stack. You have just overwritten your return address with 0; the routine ends and a return is executed to... 0. Bingo! It appears as if your processor resets, but none of the Reset Source bits is set.

Try to figure out the stack needs of your routine, implement a recursion iteration counter as a global variable, and test/check/display this variable at reset, before the RAM gets zeroed. This should give you pretty solid evidence of if and when your stack overflows.

You should ALWAYS limit the number of iterations of a recursive process.
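A minimal sketch of such a limit combined with a global depth counter. `resolve`, `MAX_DEPTH`, and the halving scheme are stand-ins invented for this example, not the actual anticollision code; on the target, the peak counter would also have to be kept out of the startup RAM clearing, which is toolchain-specific and not shown here.

```c
#include <stdint.h>

#define MAX_DEPTH 8u           /* assumed limit, derive it from your stack budget */

volatile uint8_t depth_now;    /* current nesting level */
volatile uint8_t depth_peak;   /* deepest level reached so far */

/* Stand-in for the anticollision step: split a group of `tags` in two
   until at most one tag is left. Returns the number of tags resolved,
   or -1 if the depth limit was hit. */
int resolve(uint16_t tags)
{
    int left, right, result;

    if (depth_now >= MAX_DEPTH)
        return -1;             /* refuse to recurse any deeper */

    depth_now++;
    if (depth_now > depth_peak)
        depth_peak = depth_now;

    if (tags <= 1u) {
        result = (int)tags;
    } else {
        left  = resolve((uint16_t)(tags / 2u));
        right = (left < 0) ? -1 : resolve((uint16_t)(tags - tags / 2u));
        result = (left < 0 || right < 0) ? -1 : left + right;
    }

    depth_now--;
    return result;
}
```

In this sketch `resolve(4)` succeeds and leaves `depth_peak` at 3, while `resolve(1000)` runs into the limit and returns -1 instead of growing the stack further. Inspecting `depth_peak` right after the suspicious restart shows how deep the recursion actually went.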

Meindert

Meindert Sprang
