I think it is /possible/ to have a cache on a CM3, but it is certainly not common.
There are three points here:
Cortex-M devices use the Thumb-2 instruction set - the aim of this is that a solid majority of instructions are 16-bit. Since these cpus are single-issue, each 32-bit fetch from flash usually delivers more than one instruction, which means you can run your cpu at a clock speed about 50-70% higher than the flash on average, assuming a 32-bit bus.
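To put rough numbers on that, here is a back-of-the-envelope sketch. The function name and the instruction mixes are my own assumptions for illustration, and the model ignores data accesses, branches and 32-bit instructions that straddle a word boundary. It only asks how fast a single-issue core can run, relative to the flash, before it consumes more than one bus word per flash access on average.

```python
def max_clock_ratio(frac_16bit, bus_bits=32):
    """CPU-to-flash clock ratio at which a single-issue core consumes,
    on average, exactly one bus word of instructions per flash access.

    frac_16bit is the (assumed) fraction of executed instructions that
    are 16-bit; the rest are taken to be 32-bit.
    """
    avg_bits = 16 * frac_16bit + 32 * (1 - frac_16bit)
    return bus_bits / avg_bits


if __name__ == "__main__":
    # Hypothetical instruction mixes, not measurements.
    for frac in (0.6, 0.7, 0.8):
        ratio = max_clock_ratio(frac)
        print(f"{frac:.0%} 16-bit -> cpu/flash clock ratio {ratio:.2f}")
```

Under these assumptions, a mix of roughly two-thirds to four-fifths 16-bit instructions gives a ratio of about 1.5 to 1.7, which is where the 50-70% figure comes from.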
There are ways to get processors going faster than flash even without a processor instruction cache. In particular, it is common for the flash units in faster microcontrollers to have a small buffer/cache in the flash module. If this is combined with wide-access flash, say 64-bit wide, you can easily get streams of instructions at cpu speed (but with a penalty for branches and calls, since the target is usually not in the buffer).
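A crude model of that arrangement, as a sketch only: every parameter below (flash width, cycles per flash access, instruction mix, taken-branch rate) is a made-up illustrative value, not a figure for any real part. I assume the flash module prefetches the next line while the current one executes, and that a taken branch or call misses the buffer and stalls for the remainder of one flash access.

```python
def flash_stream_model(width_bits=64, access_cycles=3,
                       frac_16bit=0.7, taken_branch_rate=0.1):
    """Estimate cycles per instruction for a single-issue core fed from
    wide flash with a prefetch buffer and no instruction cache.

    access_cycles is the number of cpu clocks per flash access
    (wait states + 1). All defaults are hypothetical.
    Returns (keeps_up, cpi).
    """
    # Average instruction size for the assumed Thumb-2 mix.
    avg_bits = 16 * frac_16bit + 32 * (1 - frac_16bit)
    # Bits per cpu clock the flash can deliver to sequential code.
    supply = width_bits / access_cycles
    keeps_up = supply >= avg_bits
    # Straight-line code: one instruction per clock if the flash keeps up,
    # otherwise limited by flash bandwidth.
    seq_cpi = max(1.0, avg_bits / supply)
    # Assumed penalty: a taken branch waits out the extra flash cycles.
    cpi = seq_cpi + taken_branch_rate * (access_cycles - 1)
    return keeps_up, cpi


if __name__ == "__main__":
    for width in (32, 64):
        keeps_up, cpi = flash_stream_model(width_bits=width)
        print(f"{width}-bit flash: keeps up = {keeps_up}, CPI ~ {cpi:.2f}")
```

With these numbers the 64-bit flash keeps straight-line code running at cpu speed even though each access takes three clocks, and the only cost is the branch penalty. The 32-bit flash at the same speed cannot keep up at all.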
And here is the main point - manufacturers /don't/ keep raising the cpu clock speed on the CM3. Most serious manufacturers who make fast Cortex-M microcontrollers have moved to the CM4 - some never bothered with the CM3 in the first place. They put caches (and single-precision floating point) on their faster devices.
So yes, CM3 devices /are/ low end - they are now found either in older, legacy parts (in this field, that means more than a couple of years old), or as microcontrollers in integrated chips where the cpu plays a minor role (such as a high-end ADC that happens to have a cpu integrated).