I think this is the crux of the problem, so let's address this first:
Neither case can execute the instructions of both interrupts at exactly the same time. Only a multicore could execute A1 and B1 simultaneously rather than serially as shown below.
The code executes in the order you wrote above (assuming one instruction per cycle). In both cases interrupt handling starts and stops at exactly the same time, so there is no difference in total interrupt latency. In both cases the instructions are executed serially, just with a different interleaving. However, any interleaving (such as A1:A2:A3:B1:B2:B3:B4:B5:B6:A4:A5:A6:A7:B7) is correct, as the interrupts are independent.
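To make the point concrete, here is a minimal sketch in Python. It is a toy model, not a description of any real CPU: it assumes two handlers, A and B, of seven instructions each, one instruction per cycle, and that each handler touches only its own private accumulator. The `run` function and the per-instruction arithmetic are made up purely for illustration.

```python
def run(schedule, length=7):
    """Execute two independent handlers in the order given by `schedule`.

    `schedule` is a string of 'A' and 'B'; each character issues the next
    instruction of that handler. One instruction per cycle is assumed.
    """
    acc = {"A": 0, "B": 0}  # private state: neither handler sees the other's
    pc = {"A": 0, "B": 0}   # per-handler instruction counter
    for handler in schedule:
        pc[handler] += 1
        # Hypothetical instruction: order-sensitive within one handler,
        # but it never reads or writes the other handler's state.
        acc[handler] = acc[handler] * 3 + pc[handler]
    assert all(count == length for count in pc.values())
    return acc, len(schedule)  # final state and total cycles


serial = "A" * 7 + "B" * 7        # A1..A7 then B1..B7
alternating = "AB" * 7            # A1:B1:A2:B2:...
mixed = "AAABBBBBBAAAAB"          # A1:A2:A3:B1:...:B6:A4:...:A7:B7

results = [run(s) for s in (serial, alternating, mixed)]
print(all(r == results[0] for r in results))
```

Every schedule takes the same 14 cycles and leaves both handlers in the same final state. Only the order in which the two instruction streams are mixed differs.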
Now where do you see a problem? If you do, please remember that just about all CPUs today execute interrupts serially without any issues, and that multithreaded CPUs do interleave instructions differently depending on circumstances (e.g. other interrupts).
Wilco