I measured how long the IAP takes to erase one 4 kB sector: 400 ms. The processor is clocked from a 14.74 MHz crystal, 1x multiplier, no MAM, and no interrupts whatsoever. Activating the PLL and MAM made no difference.
I then configured CCLK to 4x and the delay dropped to 100 ms (note: the CCLK argument to the IAP was changed accordingly). Next I changed the CCLK argument passed to the IAP to a fraction of the real value. It still worked down to around 1/50th of CCLK, where the delay dropped to around 8 ms. Using CCLK/20 gives a delay of 20 ms, which seems much more reasonable to me than 400 ms to erase a block. However, it is a workaround and I am not sure I can leave it like this. Writing 4 kB of data to the same block takes 22 ms.
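For reference, here is a minimal sketch of the kind of erase call I am talking about. It is not the actual project code. The names iap_erase_sectors, iap_cclk_khz and iap_stub are made up for illustration, and the command codes (50 = prepare sectors, 52 = erase sectors), the status code 0 = CMD_SUCCESS and the requirement that CCLK be passed in kHz are how I read the IAP chapter of the manual. The stub only stands in for the boot ROM so the argument handling can be checked on a PC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* IAP command and status codes as I read them in the LPC213x manual. */
#define IAP_CMD_PREPARE_SECTORS 50u
#define IAP_CMD_ERASE_SECTORS   52u
#define IAP_CMD_SUCCESS         0u

/* Shape of the boot ROM entry point: command words in, result words out. */
typedef void (*iap_entry_t)(uint32_t cmd[5], uint32_t result[3]);

/* The IAP commands take the system clock in kHz, not in Hz. */
static uint32_t iap_cclk_khz(uint32_t cclk_hz)
{
    return cclk_hz / 1000u;
}

/* Prepare, then erase, sectors first..last. Returns the IAP status code. */
static uint32_t iap_erase_sectors(iap_entry_t iap, uint32_t first,
                                  uint32_t last, uint32_t cclk_hz)
{
    uint32_t cmd[5] = {0};
    uint32_t result[3] = {0};

    assert(iap != 0);

    /* Sectors must be prepared (unprotected) before each erase. */
    cmd[0] = IAP_CMD_PREPARE_SECTORS;
    cmd[1] = first;
    cmd[2] = last;
    iap(cmd, result);
    if (result[0] != IAP_CMD_SUCCESS) {
        return result[0];
    }

    /* The erase command carries CCLK in kHz as its third parameter. */
    cmd[0] = IAP_CMD_ERASE_SECTORS;
    cmd[1] = first;
    cmd[2] = last;
    cmd[3] = iap_cclk_khz(cclk_hz);
    iap(cmd, result);
    return result[0];
}

/* Hypothetical host-side stand-in for the boot ROM: records the last
   command it was given and always reports success. */
static uint32_t stub_last_cmd[5];

static void iap_stub(uint32_t cmd[5], uint32_t result[3])
{
    memcpy(stub_last_cmd, cmd, sizeof stub_last_cmd);
    result[0] = IAP_CMD_SUCCESS;
}
```

On the target, the first argument would be the ROM entry point cast to iap_entry_t (0x7FFFFFF1 if I read the manual correctly) instead of iap_stub. With the 14.74 MHz crystal at 1x the value passed is 14740, and the workaround described above amounts to passing 14740/20 = 737 instead.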
I tested it with two different uCs, an LPC2136 and an LPC2138, with the very same results. I am using IAR EWB 4.20A. Boot code is 0x20B for the LPC2138 and 0x20C for the LPC2136.
What could I be missing? I carefully inspected the code (it was written by another person) and the relevant parts comply with the manual. The manual says nothing about IAP timings or about how the IAP measures time internally.
Thanks for listening and thanks in advance for any hint.