[cross-post] nested interrupts

Dear all,

I would like to understand more about nested interrupts since it looks to me they break my code and I'm not sure why.

I have three 'items' in my software: a serial port, a fifo and a timer. The serial port is an external hardware that sends back an interrupt when it's free to send another byte, so I decided to have an interrupt service routine (ISR) that reads from the fifo and puts the byte into the serial port. With this choice the main program should start the first byte transmission from the fifo and then everything is handled by the ISR. Therefore the code looks like this:

int main() { > ... > while(1) { > write_to_fifo(data); > if (state != TRANSMITTING) { > if (read_from_fifo(&byte) == OK) { > write_to_sport(byte); > state = TRANSMITTING; > } > } > } > }

the variable 'state' is needed in order to prevent the main to write into the serial port while the ISR is already doing this. The ISR would look like the following:

> void isr_sport(int signal) { > > if (read_from_fifo(&byte) == OK) write_to_sport(byte); > else state == DONE; > }

When the fifo will be empty the ISR will change the state to DONE such that main can start a new transmission again.

This flow works perfectly as expected but now I decided to add a timer interrupt which is simply incrementing my 'system time' variable, nothing else. If the 'interrupt nesting' is disabled everything works as expected, but if it is enabled something breaks. After some bytes transmitted to the serial port the program simply runs through the main, as if the 'state' variable was not set to DONE when the ISR emptied the fifo.

My simple 'naive' model for this is that while the serial port ISR was running it was interrupted by the timer interrupt (higher priority) but once the timer ISR was completed, instead of returning to the serial port ISR it returned directly to the main, missing the instruction to change the state. At this point there's no chance for the ISR to change the 'state' variable and the main will wait forever a never changing state.

But is this 'possible'? I started looking at the assembler for the interrupt dispatcher and interrupt handling routines but still didn't manage to find a potential flaw (even though I admit I did not fully understand everything there).

Any suggestion or pointer is more than welcome. Thanks a lot,

Al

p.s.: the target is an ADSP21020 and the compiler is g21k.

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Reply to
alb
Loading thread data ...

Try 'state = DONE' instead.

Reply to
Arlet Ottens

Ok, that was a typo when posting, my apologies. But as I mentioned, the logic works for some time until the 'clash' with the timer interrupt.

Reply to
alb

i have absolutely no idea what processor this runs on, but to do nested interrupts without too much trouble, you need to have incremental priority levels for the interrupts, where only a higher priority interrupt source may interrupt another process. some processors allow you to assign the priority of your choice to each interrupt source.

of course, for each interrupt, you must fully save and restore every register that is used in the interrupt and that the "status register" or "condition code register" or whatever they call it, must be restored exactly.

then the last issue is the system stack. if you have lotsa nested interrupts, your stack (where you're saving all these registers) will get deeper and deeper. make sure that neither the stack tramples over some other in-use memory nor that some other process accidentally tramples over the stack.

other than that, everything should work transparently with no difference

*except* for time. any process that is interrupted will get back to doing its job and will pick up exactly where it left off, *except* it's at a later time than would be if it were not interrupted. sometimes, in a real-time embedded system, that delay in time kills you if the process is reading or writing I/O and the outside world is unhappy about the delay.

that's all the advice i can throw at you. it's been 2 decades since i had to worry about nested interrupts. chips have evolved/changed since then, but the principles are the same.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
Reply to
robert bristow-johnson

it is in the p.s. of the OP: ADSP21020 with g21k compiler.

In the case of the processor I'm using the priority is fixed but I have a choice to use a high priority interrupt or low priority for the timer. In this case I'm using the high priority timer interrupt.

This part is done by the library and this is where I fear something is not properly working.

I understand that the 'system stack' may be critical but here there should not be any problem since there's nothing going deeper than two interrupt calls. Maybe I should check what the interrupt call does at the assembler level and make sure that everything is in place.

No issue on delays, that's why I can even live without nested interrupt and wait for the current interrupt is served before serving the next one. The point is that I would like to understand the reason for such a strange behavior since I fear that it will byte me back later if I don't pin it down at this stage.

The processor and the toolchain I'm working with is exactly 2 decades old...

Reply to
alb

Le 01/07/2012 15:27, alb a écrit :

I haven't seen any flaw in either your code or your concept. It would be useful to have the assembly code, particularly that of the ISR and the pre and post ISR code to come up with something.

Reply to
Lanarcam

At this point you have a race between main() and isr_sport(): both will assign something to "state". Normally main() will no doubt have time to execute the assignment below, before isr_sport() is called, but if something else interrupts main() at this point, isr_sport() may reach its statement "state = DONE" first, which means that main() then overwrites DONE with TRANSMITTING in "state", and the logic fails.

This symptom can result from the race condition above.

I'm not sure how the dependence on interrupt nesting arises. Perhaps, when nesting is disabled, and the serial-port interrupt is signalled while the timer interrupt is being handled, one or a few instructions in main() are executed between the return from the timer interrupt handler and before the serial-port handler is executed. This might give main() enough time to execute its "state" assignment before isr_sport() runs.

To remove the race, move the assignment "state = TRANSMITTING" before the write_to_sport(), in main().

By the way, check that you have declared "state" as volatile.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .
Reply to
Niklas Holsti

You have not shown, if somewhere interrupts are disabled. I see one possible race condition:

if (state != TRANSMITTING) { if (read_from_fifo(&byte) == OK) { write_to_sport(byte);

--
	 isr_sport() is triggered and sets state=DONE
Reply to
Reinhardt Behm

A light went on:

Often the amount of stuff you have to save for an ISR is greater than the amount of stuff you have to save for an ordinary function call.

Most modern C compilers let you declare a function to be the target of an interrupt. The syntax isn't standardized -- for some it's "void interrupt foo()", for some it's something like "void foo() attribute ("__interrupt__")", etc. But it boils down to letting the compiler know that it's an interrupt service routine, so that it can do the extra saves & restores.

It may be that you were just lucking out without nested interrupts, and were constantly exposing yourself to trouble and the other shoe just never dropped.

So -- dig through the processor's documentation on its interrupt response, and look at the compiler's assembly output to see if it seems to be saving away enough stuff on each function call. If not, dig through the compiler's documentation to see how to tell it that an ISR is an ISR.

And finally, if you can get by without nested interrupts -- don't use them.

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
Reply to
Tim Wescott

i remember that. TIMER was the only one that could do that. and i remember being disgusted that there were some devices that we needed at a higher priority and couldn't implement it directly.

okay, what other peripherals do you have hooked up and what are their relative priorities? and, then, about how often are these different interrupts expected to hit? if you have a low-priority interrupt that is expected to hit at, say, 48 kHz, and there is a higher priority interrupt that is connected to a user control (and it has a long and complicated ISR, but one that doesn't need instant attention). you could have a situation where this naturally high-priority device is held hostage to this naturally low-priority device which is fixed in the DSP at a higher level priority.

Tim, i'm sure the stub of code that connects to the C functions (that don't need to save/restore certain "scratch" registers and that sometime return values in registers) from the interrupt vector. i imagine that code takes care of what needs to be saved and restored.

there is one thing about the 210xx in that it has an alternate register set. we used that in a sorta inverted way. we used the "alternate set" as the regular set (that most of the C code was using) and the "normal set" was used for these ISRs that didn't need to save/restore (they owned the registers they used). the reason why was because it was easier to clear the interrupt enable bit and the bits for those alternate register sets with a single AND at the beginning of the interrupt and then restore as a final OR instruction before the RTI (or whatever it's called). the only restriction was that any C code called by these interrupts (that flipped some of the alternate register sets) could not have register local variables. but none of those ISRs were there to run C code.

anyway, if you can make use of the alternate register set for all of your ISRs, and commit some registers, you might not need to do any save/restore.

well, that was about as long ago that i had been working on a SHArC. it was ADSP21062 version 0.6 . lotsa bugs in the tools and some ugly (and therefore useless) designs in the instruction set and architecture of the SHArC. the MANT instruction (i think that's what it was called), is useless since it didn't return a float. it makes it more difficult to do math and you can get a float version of a fractional number with a conversion and subtraction.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
Reply to
robert bristow-johnson

Actually, since other's pointed it out, there's a race condition in main vs. the serial ISR, where he's writing a value out to hardware and then updating status.

That's certainly a much better candidate for the problem. Mine was an "if", that one is a "when".

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
Reply to
Tim Wescott

i couldn't figure out what Niklas meant by "state". if it's a shared variable, then you get "death by shared variable". is that it?

how do you do this? you have a hierarchy of interrupt sources, each having a unique interrupt priority. there is an interrupt priority mask (this might be the nasty shared variable) that is set to the level of the interrupt being serviced. only higher interrupt levels can interrupt this ISR. perhaps an interrupt level of zero means it's the "main program" that anything can interrupt. this fits precisely what i mean by "nested".

now, that interrupt priority mask is usually set by hardware. now, if that level is not programmable and if you want a device with ostensibly lower priority to act as if it's a higher priority and interrupt an ISR that the stupid SHArC says is higher, then, i suppose you have to manually bump the mask down in the ISR you want to be lowered priority.

if you were to always bump the priority mask to, say, zero and deal with this arbitration in software with your own pseudo-mask, then there might be shared variable trouble. what happens in an ISR when the first thing you do after saving/switching necessary registers, you bump the pseudo-mask to the "correct" interrupt level, disable interrupts of lower priority, then bump the *real* mask to zero and do the interrupt? that should work, shouldn't it?

now upon completion you undo this in the same order (why not reverse order is puzzling). set the pseudo-mask to zero, then enable interrupts of lower priority in descending order. that's a problem. you might never get to your RTI instruction.

the >

i was gonna say that in the world of latched priority level and assignable interrupt priorities and of semiphores, it shouldn't be too hard to make it bullet proof.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
Reply to
robert bristow-johnson

It's the variable that Al calls 'state'. In his main loop he loads a byte into the transmitter, _then_ sets 'state' to TRANSMIT. In the ISR, he checks to see if there are any more bytes to transmit, and sets 'state' to DONE.

If he pokes the hardware with the last byte in the queue and gets an interrupt before his program gets to "state = TRANSMIT", then the ISR can set state to 'DONE', return, and have the (correct) value overwritten by the (incorrect) value 'TRANSMIT'.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
Reply to
Tim Wescott

in

hardware and

shared

a

ISR,

ISR can

overwritten by

I've had good luck with that technique, except setting and clearing the transmit-interrupt enable instead of a state variable. I don't know whether the ADSP processor permits that.

Mel.

Reply to
Mel Wilson

The fifo is shared between main and the ISR. You are not preventing the ISR triggering while inserting more data into the fifo. This would result in a race condition.

Reply to
#!

On 7/1/2012 8:19 PM, Niklas Holsti wrote: [...]

got your point about the race condition, but let me dig a little further here. At the point you mentioned the isr_sport is not running, since there's no transmission going on. The only interrupt that may kick in is the timer ISR.

If this is true then the you are still right that there's a race condition, since the isr_sport will be run *after* (lower priority) the isr_timer() and maybe set the 'state' to DONE, which will be overwritten by the main() once it is resumed, which will then wait forever.

If all this is true though then I should have this problem even without interrupt nesting.

[...]

I was indeed fascinated by the explanation, but it seems the problem is not *only* there.

In principle higher priority interrupt should vector immediately and lower priority interrupt should be handled "right after" but I should check in the assembler to really understand what is done.

It did not help, still the same signature.

why? in the event of 'state' been optimized away then it should never work, right?

Reply to
alb

On 7/1/2012 7:53 PM, Lanarcam wrote: [...]

I do not want to barf lots of assembler code in here, I'm trying to rationalize the flow and maybe post part of it. This will help me understand the logic a bit better, since I should say that I do not know this part very well.

As I understand the processor jumps to the interrupt vector table (with some latency) and then from that moment on is all taken care by the runtime library which is wrapping my ISR with pushing and popping registers on the stack.

As soon as I have dug enough into the code I will post it here.

Reply to
alb

On 7/1/2012 11:49 PM, Tim Wescott wrote: [...]

In this case this is done with an "extension" in the signal.h. The prototype is the following:

void interrupt(int signal, void (*func)(int));

where func is the service routine for the signal. I expect all the registers pushing and popping is done in there. This is how I can associate a service routine to an interrupt. Certainly I may need to look into the actual code to see what the compiler and the library are really doing there.

[...]

few lines above the previous paragraph you stated:

and I share the same fear, that is why, even if I don't need to use nested interrupt I want to understand what is going on before this problem will byte me harder down the road when going back will be more tedious and frustrating.

p.s.: apologies for reordering your post, but it helped the flow of discussion.

Reply to
alb

The point I meant was just after the statement

write_to_sport(byte);

I assumed that this statement writes "byte" to the serial port, and that (as usual for output ports) writing has the side effect of starting the transmission of the byte. (If this is not so, your code does not show how and where the serial transmission is started.)

Since the serial port is now transmitting, isr_sport() will be called when the byte is transmitted and the serial-port interrupt occurs. In this sense, main() and isr_sport() are concurrent at this point. If you don't really know how fast things go (because of other interrupts or delays that can happen), either main() or isr_sport() can reach its assignment to "state" first. That is a race.

No, the serial-port interrupt can also occur, because you started transmission by writing the byte to the serial port. *When* the interrupt happens depends on the speed of the serial port.

Without the timer interrupt, the transmission of the byte will no doubt take so long that main() has time to assign "state = TRANSMITTING" before isr_sport() is called by the interrupt, and all is well.

If the timer interrupt (or something else) delays main() for a sufficient time between write_to_sport(byte) and "state = TRANSMITTING", the serial interrupt will occur between those two statements of main(). If this leads to isr_sport() being executed between these two statements, and if isr_sport() assigns "state = DONE", the assignments to "state" happen in the wrong order.

Yep.

Maybe, maybe not. As I said, without interrupt nesting I'm guessing that when the timer-interrupt handler executes return-from-interrupt, control returns to the main() function for a couple of instructions (the 21020 has a deep pipeline with two branch-delay slots) before the pending serial-port interrupt forces control to isr_sport(). This may give main() enough time between the two ISRs to execute the assignment "state = TRANSMITTING", giving the right order of assignments.

Also, the 21020 delays interrupt responses for some multi-cycle instructions, including branch. Since the statement "state = TRANSMITTING" is the last statement in a loop body, it is quite likely that the compiler puts it in a delay slot of the "return to loop head" branch. This makes it more likely that the assigment is executed between the return from the timer-interrupt handler and the invocation of the serial-port interrupt handler, when interrupt nesting is disabled.

I don't know for sure that this is what happens, but it would explain what you see.

But for sure your code has a logical race condition at this point.

The 21020 User Manual has diagrams showing several cycles for interrupt synchronization and recognition and branching (with delay slots) to the vector. So "immediately" in some sense, yes, but not on the next clock cycle.

It won't be visible in the assembly code, only in the descriptions of the 21020 Program Sequencer HW in the AD documentation. And that description may not be very complete or precise there, either.

Hmmm... see below on ordering of assignments.

Today the compiler may be kind to you, and not make trouble. Do you want to be surprised tomorrow, after some small, unrelated change to the program?

The compiler is free to reorder writes to non-volatile variables, as long as the mathematical effect is OK. If x and y are two non-volatile variables (or if only one of them is volatile) and you write in your C code

x = 1; y = 2;

the C compiler can translate it to an instruction sequence that first assigns 2 to y and then assigns 1 to x. If x and y are both volatile, the compiler must respect the written order of assignments (and also of reads of the variables).

So if "state" is not a volatile variable, moving its assignment statement in the C source code may or may not move it in the assembly code.

What is write_to_sport()? Is is a macro or a function? If it is a macro (or a possibly in-lined function) of the form "portvar = byte", you should also declare "portvar" as volatile, to ensure that the order of assignments is as in the C source.

And of course you should also make sure that accesses to the FIFO are protected against wrongful concurrent manipulations from main() and isr_sport(), as was suggested in other posts.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .
Reply to
Niklas Holsti

[Deleted]

One thing to try is to disable interrupts just before returning from your interrupt routine. I have had used processors in the past where the actual interrupt returning code was not re-entrant. The interrupt state should be whatever it needs to be after the return from interrupt, since the return from interrupt code should restore the interrupt status to what is was when the interrupt occurred.

Regards Anton Erasmus

Reply to
Anton Erasmus

ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here. All logos and trade names are the property of their respective owners.