Optimizing Instruction Cache

Hi,

I'm interested in compiler optimisations that improve instruction cache behavior. For data caches there are numerous compiler optimisations, many of them operating on loops, but for instruction caches I've found only a very few via Google.

Could you give me any hints/references for (further) instruction cache optimizations? I appreciate any answers.

Best regards, Tim

Reply to
Tim Frink

I don't think there is much here. Instructions are normally executed in sequence, barring jumps. The instruction cache is some sort of buffer holding a portion of the instruction segment. This works as long as the instruction to be next executed lies within that cache. So the things that will help it include taut code and short jump distances. Procedure/function calls will obviously wipe out the cache (usually), so you want to concentrate the loops in the lowest level procedures.

You might do something by letting the cache hold several different areas, and selecting which to update under what conditions. But doing this intelligently requires program flow analysis.

--
 Chuck F (cbfalconer at maineline dot net)
   Available for consulting/temporary embedded and systems.
Reply to
CBFalconer

The whole point of using a cache is not being bothered with the optimization. If you want to optimize execution by hand, you can load the critical parts into L1 RAM and lock them there. A 4+ way cache does a good job of optimizing by itself; the code just has to avoid the unlikely situations that cause cache stalls. Thus the linker has to arrange the code so as to prevent jumps to distances close to the size of the cache.
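Placing critical code in on-chip RAM is usually done in the linker script. A hedged GNU ld sketch, where the memory regions, addresses, and the `.fast_code` section name are all hypothetical (routines would be tagged with `__attribute__((section(".fast_code")))` in the C source):

```
MEMORY
{
  SRAM  (rx) : ORIGIN = 0x10000000, LENGTH = 64K   /* fast on-chip RAM */
  FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 1M
}

SECTIONS
{
  /* Critical routines go into fast RAM and can be locked there. */
  .fast_code : { *(.fast_code) } > SRAM

  /* Everything else stays in flash. */
  .text : { *(.text*) } > FLASH
}
```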

Vladimir Vassilevsky DSP and Mixed Signal Consultant

formatting link

Reply to
Vladimir Vassilevsky

I seem to recall the DEC MIPS linker reordered basic blocks to improve code cache performance. Don't know if that's still a useful idea with today's architectures. Code is comparatively small and you have little control over access sequence (beyond basic block order). Data can be huge and all kinds of access sequence fiddling is possible. That's why you see much more work on data cache optimizations.

Here's an old reference; maybe a place to start.

Scott McFarling. Program optimization for instruction caches. Third International Symposium on Architectural Support for Programming Languages and Operating Systems, pp. 183-191, April 1989. Published as Computer Architecture News 17 (2), Operating Systems Review 23 (special issue), SIGPLAN Notices 24 (special issue).

Reply to
Gene

Here's a recent article on the topic:

formatting link

Reply to
David Brown

This optimization is in vain.

The program executes in a virtual address space, whereas the cache operates on physical addresses. So the compiler and linker can't know how the pages are going to be mapped in each particular case.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

formatting link

Reply to
Vladimir Vassilevsky

Some systems have caches that use virtual addresses, for example the LEON series of SPARC processors. The cache tags then have a context (process number) field in addition to the address field so the hardware knows which process (which virtual address space) owns a given cache line.

--
Niklas Holsti
Tidorum Ltd
Reply to
Niklas Holsti

Assuming fairly typical cache behavior:

It would be possible to position collections of routines with carefully selected page offsets so as to avoid conflict misses. Say you had ten small routines that called each other. Aligning each of them at the beginning of a 4K page would make for terrible cache thrashing with typical two- or four-way set associative i-caches. Aligning them on different 512 byte boundaries would likely allow all of them to fit in the cache.

Reply to
robertwessel2
