Hmmm. I don't know that I'd call that a barrel shifter. I've always had in mind the old visual image of a circle with arrows from each position to every other position. That implies that the output is exactly as many bits wide as the input.
What you're describing seems to be something else, with variable-width inputs and outputs and some other combinatorial logic.
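To put the classic picture in C terms, here is a minimal sketch of what I mean by a barrel shifter: a rotate where every bit that falls off one end comes back in at the other, so the output is the same width as the input. The function name rotl16 and the 16-bit width are just my choices for illustration, not anything from your reference.

```c
#include <stdint.h>

/* Rotate a 16-bit word left by n positions (n is taken modulo 16).
 * Hypothetical helper for illustration: output width equals input width. */
static uint16_t rotl16(uint16_t x, unsigned n)
{
    n &= 15u;                      /* only 16 distinct rotations exist */
    if (n == 0u) {
        return x;                  /* avoid a shift by the full width */
    }
    /* Widen first so the left shift cannot overflow a signed int. */
    uint32_t wide = (uint32_t)x;
    return (uint16_t)((wide << n) | (wide >> (16u - n)));
}
```

In hardware all 16 rotations are available in one pass through the mux network, which is the "arrows from each position to every other position" part.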
Does the CPU stack the registers in use and the status in that single clock, or does it use some sort of register-map switch, which would imply some limits on nesting?
Oops... found a partial answer in your reference:
"The ALU contains a duplicate bank of registers, shown in Figure 2.2 behind the primary registers. There are actually two sets of AR, AF, AX, and AY register files. Only one bank is accessible at a time. The additional bank of registers can be activated (such as during an interrupt service routine) for extremely fast context switching. A new task, like an interrupt service routine, can be executed without transferring current states to storage."
IOW, no nesting of interrupts if you want one-cycle response: with only one spare bank, a second-level interrupt has nowhere to switch to. I suppose you could get invariant timing on a Cortex interrupt if no nesting were allowed, but it would still take some cycles to stack registers.
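Here is a rough C model of why I read it that way. It is only a sketch under my own assumptions: the type and function names (alu_bank_t, alu_model_t, isr_enter, isr_exit) are hypothetical, the register widths are guessed, and I've flattened each register file to a single field. The only thing taken from the quoted text is that there are two banks and only one is accessible at a time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: field names follow the quoted text, widths are assumed. */
typedef struct {
    uint16_t ar, af, ax, ay;
} alu_bank_t;

typedef struct {
    alu_bank_t bank[2];        /* primary bank and its duplicate */
    unsigned   active;         /* index of the bank code currently sees */
    bool       shadow_in_use;  /* true while a handler owns the other bank */
} alu_model_t;

/* Interrupt entry: flip to the spare bank if it is free.
 * Returns false when a nested interrupt finds no spare bank, in which
 * case the handler would have to save state some slower way. */
static bool isr_enter(alu_model_t *m)
{
    if (m->shadow_in_use) {
        return false;
    }
    m->shadow_in_use = true;
    m->active ^= 1u;           /* no register contents are copied */
    return true;
}

/* Interrupt exit: flip back; the interrupted task's registers are untouched. */
static void isr_exit(alu_model_t *m)
{
    m->active ^= 1u;
    m->shadow_in_use = false;
}
```

The first interrupt costs nothing but the flip, and the second one gets a "no" from isr_enter, which is the nesting limit I was asking about.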
Don't some of the PIC chips do register swaps at interrupts? I have vague memories (or perhaps shadowy nightmares) from a decade or so back when I worked with one of the PIC chips.
Mark Borgerson