I have meaningful ISRs taking 400-600ns on a TMS320F2812 running at
150MHz. The entire executor engine of a dynamically reconfigurable waveform-generation state sequencer completes in under 4us, with that time depending strongly on the number of transitions allowed per state, currently limited to 4. The new TI Delfino at 300MHz could halve these times. The reason these processors can do this is that they have SRAM running at full processor speed (not in particularly generous amounts, though the Delfino again improves that greatly) and a non-cached architecture.
What is critically important is interrupt latency and interrupt latency jitter. In my application, I use assembly-language "pre-ISRs" ahead of the non-critical ISRs, which simply juggle a few interrupt enables and then very quickly re-enable interrupts. The main cause of interrupt latency jitter is that the C prologue code for each ISR takes quite a long time to complete. If the main() code gets interrupted, the interrupt latency is only about 100ns.
But if another low-priority interrupt is running when the highest-priority real-time interrupt is triggered, and I just leave it to the C compiler, I have to wait for it to save full context and run a bunch of silly little extra instructions before I get a chance to re-enable interrupts and let the high-priority one take over. That might extend the effective latency to 300ns. Hence, latency jitter.
By writing a pre-ISR in .asm, I can get interrupts re-enabled, restricted to just the more important ones, in just 3-4 instructions. The high-priority ISR can then preempt the prologue code of the low-priority ISRs, greatly reducing latency jitter. I think the last time I measured, I could guarantee less than 200ns of latency on the F2812.
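The pre-ISR shape is roughly this (a sketch only, not my production code; it assumes the critical interrupt sits in the INT1 group, and relies on the C28x auto-saving IER and setting INTM on interrupt entry):

```asm
; Sketch: pre-ISR for one low-priority interrupt on a C28x core.
; Hardware has already pushed IER (among other registers), cleared this
; interrupt's IER bit, and set INTM, so we enter with interrupts disabled.
_low_prio_pre_isr:
        MOV     IER, #0x0001        ; assumed mask: only the critical INT1 group
        CLRC    INTM                ; re-enable interrupts globally
        LB      _low_prio_isr_body  ; branch to the C-coded ISR body
```

The IRET at the end of the ISR restores the auto-saved IER, so the mask juggling undoes itself on return.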
Once interrupts are re-enabled, the C compiler is good enough at writing the actual meat-and-potatoes ISR code.
I can't wait to get my hands on the Delfino. Being able to do serious work in a few hundred nanoseconds is very much fun.