You really need to define your requirements better. "Lots of I/O" does not say anything. Do you need 8-bit, 16-bit, or 32-bit I/O, or just single control lines? How many? That makes a BIG difference. The AT91RM9200 has 208 pins and quite a bit of I/O, but you can't talk to an external bus device with
32 bits. If you need that (32), you might look at the Sharp LH79524. There are lots of good Philips parts too in the LPC2000 family. It depends on whether you want ARM7 or ARM9.
You say you don't want an FPGA (CPLD?). Well, do you want to have to add external flash and SDRAM chips? If not, then stay away from ARM9 parts and stick to ARM7 with built-in flash and SRAM.
Just saying you want 100MHz is not a reliable indicator of speed for anything. Some 50MHz parts can deliver much higher throughput than a 100MHz part. That is a very complex subject. The Philips LPC2000 60MHz parts use a 128-bit-wide access path to their internal flash that allows zero wait states at the full 60MHz. That's probably faster than a 100MHz part with wait states.
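To put rough numbers on the wait-state point, here is a back-of-the-envelope sketch. It assumes a deliberately simplified model where every memory fetch costs 1 + wait_states clock cycles, and it ignores caches, prefetch buffers, and pipelining. The function name and the 1 and 2 wait-state figures for the 100MHz part are made up for illustration, not taken from any datasheet.

```python
def effective_fetch_rate_mhz(clock_mhz, wait_states):
    """Millions of fetches per second, assuming each fetch
    costs (1 + wait_states) clock cycles (simplified model)."""
    return clock_mhz / (1 + wait_states)

# 60MHz part running from zero-wait-state internal flash
zero_ws_60 = effective_fetch_rate_mhz(60, 0)

# Hypothetical 100MHz part with 1 or 2 wait states on its memory
one_ws_100 = effective_fetch_rate_mhz(100, 1)
two_ws_100 = effective_fetch_rate_mhz(100, 2)

print(f"60MHz, 0 wait states:  {zero_ws_60:.1f} M fetches/s")
print(f"100MHz, 1 wait state:  {one_ws_100:.1f} M fetches/s")
print(f"100MHz, 2 wait states: {two_ws_100:.1f} M fetches/s")
```

Under those assumptions the 60MHz part comes out ahead of the 100MHz part as soon as the faster part needs even one wait state per fetch. Real parts are more complicated than this, so treat it as a sanity check, not a benchmark.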
There are a ton of MCUs on the market today. You need to refine your requirements in detail to select what fits your real priorities and needs.
How long does your design need to be in production? Some semi vendors are now promising only 5-7 years of availability on their new MCUs.
I've done two recent designs with ARM7, and I don't want to use anything but ARM now unless I really have to. There were over 1.3B ARM CPUs shipped last year. They are taking over the market. I have to design for a 15-year product life, and I won't touch anything but ARM now. Many proprietary CPU cores may not be around in the future. Too few in use = discontinued parts.
Chris.