instruction cache or data cache


Should my processor have more instruction cache, or more data cache?

Which one should be larger for an efficient processor?

Thanks in advance, Karthik Balaguru


More of both, if you can ;)

It really depends on the ISA and the workloads you're expecting, as well as the rest of the architecture surrounding the chip. You should paint us a picture of what you already have and then you might get some better help. :)


Joe Peric

I am designing a system that must process received data efficiently and also communicate with an FPGA (shared memory) whose data is written and read by all the other processors (two other DSP processors). All these processors can exchange data and information among themselves via the FPGA (shared memory), and can perform algorithmic manipulation of the data after reading it from the FPGA or before writing it to the FPGA. There is one more controller that takes care of the auxiliary devices, and it will also communicate with the FPGA to retrieve the data for its auxiliary devices.

Karthik Balaguru


It all depends on what kind of processing is to be done.

If the algorithms require touching the data just once (like compressing, converting or encrypting a stream), a data-cache won't help much. It will be merely reduced to "prefetch" and "writebuffer" functionality. It might even hurt performance, especially when you need each data word just once, and in random order. However, if you need to touch each data word several times (but can't keep them in CPU registers in the meantime), then a correctly sized data-cache will help a lot.
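The difference between these access patterns can be seen in a toy model. The sketch below assumes a direct-mapped data cache with 64 lines of 8 words each (hypothetical figures, not taken from this thread) and compares one in-order pass over a stream, one random-order pass, and repeated passes over a small working set.

```python
import random

def hit_rate(addresses, num_lines=64, words_per_line=8):
    """Hit rate of a toy direct-mapped cache (sizes are illustrative assumptions)."""
    tags = [None] * num_lines            # one stored block number per line, None = empty
    hits = 0
    count = 0
    for addr in addresses:
        block = addr // words_per_line   # memory block containing this word
        index = block % num_lines        # direct-mapped: each block has exactly one slot
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block          # miss: the whole line is fetched
        count += 1
    return hits / count if count else 0.0

n_words = 100_000

# Stream touched once, in order: hits come only from the rest of each fetched line.
sequential_once = hit_rate(range(n_words))

# Each word touched once, in random order: lines are usually evicted before reuse.
rng = random.Random(0)
shuffled = list(range(n_words))
rng.shuffle(shuffled)
random_once = hit_rate(shuffled)

# A 256-word working set (fits in the 512-word cache) swept 100 times.
reused = hit_rate([a for _ in range(100) for a in range(256)])

print(f"sequential, once: {sequential_once:.3f}")
print(f"random, once:     {random_once:.3f}")
print(f"reused 100 times: {reused:.3f}")
```

With these assumed sizes, the in-order stream hits on 7 of every 8 words, which is purely the line-fill ("prefetch") effect; the random-order pass hits almost never; and the reused working set hits almost always after its first pass.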

If the algorithm code is very repetitive (tight loops and the like), then an instruction cache will help too. On the other hand, if you have to branch to lots of possible execution flows, then the instruction cache won't be very useful. It too can hurt in some situations.

So unless you specify your situation with more detail, our answers can't be more specific either.

Regards, Marc


The processor is yours, so it is for you to decide.

None. Ideally, the bus should run at full speed.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant

Vladimir Vassilevsky

This depends on the design point you are targeting. If the design is aiming at less than 1/2 of the maximum performance achievable with all the stops pulled out, then you can get away with a unified cache, especially if wide access to the cache is allowed (like 128 bits/cycle). This kind of width lets a single one-cycle access supply several instructions, leaving the other 3/4 of the cycles for data accesses. I once designed a pipeline around this very notion, and found that it works quite well (at less than 1/2 of all-out maximum performance).
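The 3/4 figure follows from simple arithmetic once an instruction width is chosen. The sketch below assumes fixed 32-bit instructions, one instruction consumed per cycle, a single-ported unified cache, and purely sequential fetch with no branches; none of these assumptions are stated in the post, and the function name is hypothetical.

```python
from fractions import Fraction

def data_cycle_share(fetch_bits=128, instr_bits=32, instrs_per_cycle=1):
    """Fraction of cache cycles left over for data accesses.

    Idealized: fixed-size instructions, sequential fetch, single cache port.
    """
    instrs_per_fetch = Fraction(fetch_bits, instr_bits)           # instructions per access
    fetch_share = Fraction(instrs_per_cycle) / instrs_per_fetch   # cycles spent fetching code
    return max(Fraction(0), 1 - fetch_share)

print(data_cycle_share())                # 3/4
print(data_cycle_share(fetch_bits=64))   # 1/2
print(data_cycle_share(fetch_bits=32))   # 0
```

Under these assumptions a 128-bit access delivers four instructions, so only one cycle in four is needed for instruction fetch. With a 32-bit access every cycle goes to instruction fetch, and data accesses must stall the pipeline.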

A unified cache may actually provide more usable capacity and a better range of balance points between I- and D-stream accesses. Separate I and D caches are for when one cannot solve the interference problems (which typically accrue with targets of the maximum performance possible: "damn the torpedoes, full speed ahead").

Useful additions:

If the instruction set is byte-aligned, then consider making the basic cache access byte-aligned for both instruction fetches and data fetches. This is a straight tradeoff between power and utility, and when making a unified cache it definitely helps lower the interference between the I and D streams.

This kind of cache also gets rid of self-modifying code issues, as the modified data is in the same cache as the instructions. However, it does NOT get rid of cross-modifying code issues.


If your target performance is less than 1/2 of the maximum achievable with all the tricks of the trade, then you can get by with a unified cache. This is especially true if you fetch wide items from the cache (128 bits), and doubly so if you can fetch a misaligned 128-bit item from the cache. These two things get the natural instruction fetch out of the way of the data fetches for most applications.


What am I missing here? If you can tolerate 1/2 the max performance, why are you concerned with any cache?

Everett M. Greene

Non-cache memory is so slow that one cannot obtain even 1/10 max performance without caching. So caches have to be somewhere.

What I am suggesting is that between 0.X and 1.0 of "as high a performance as you can architect" one pretty much needs a Harvard cache design, because the interference between the I and D streams is (for practical purposes) unsolvable for a unified cache.

X can be somewhere in the 0.4 to 0.6 range of performance, where a top-end Intel or AMD CPU is rated at 1.0.

Whereas under 0.X of max performance, the unified cache comes back into vogue and will end up outperforming separate I and D caches of equivalent storage space.



In general, instruction caches tend to have a higher hit rate than data caches. Data caches have to deal with less predictable data reads and writes. Also, modified data cache entries need to be written back to main memory.

-- EventStudio 4.0 -

Embedded System Modeling with Text Based Sequence Diagrams


You could simulate for your particular application. Write a simple simulator that's behaviourally close to your intended design and run something close to your intended application on it. You can parametrize the cache size (and associativity, etc.) and then try different sizes to see what works best.
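A minimal sketch of such a simulator is shown below, assuming a set-associative cache with LRU replacement and 32-byte lines. The `Cache` class, the `sweep` helper, and the synthetic trace are all hypothetical stand-ins; in practice you would feed in address traces captured from something close to your real application, with separate traces for instruction fetches and data accesses.

```python
from collections import OrderedDict

class Cache:
    """Toy set-associative cache with LRU replacement (behavioural model only)."""

    def __init__(self, size_bytes, line_bytes=32, ways=1):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.accesses = 0

    def access(self, addr):
        block = addr // self.line_bytes
        lines = self.sets[block % self.num_sets]
        self.accesses += 1
        if block in lines:
            lines.move_to_end(block)           # mark as most recently used
            self.hits += 1
        else:
            if len(lines) >= self.ways:
                lines.popitem(last=False)      # evict the least recently used line
            lines[block] = True

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0


def example_trace(passes=50):
    """Made-up workload: two 4 KiB buffers, 64 KiB apart, read alternately word by word."""
    trace = []
    for _ in range(passes):
        for offset in range(0, 4096, 4):
            trace.append(offset)               # buffer A
            trace.append(0x10000 + offset)     # buffer B
    return trace


def sweep(trace, sizes, ways_options):
    """Run the same trace through every (size, ways) combination."""
    results = {}
    for size in sizes:
        for ways in ways_options:
            cache = Cache(size, ways=ways)
            for addr in trace:
                cache.access(addr)
            results[(size, ways)] = cache.hit_rate()
    return results


results = sweep(example_trace(), sizes=[4096, 8192, 16384], ways_options=[1, 2])
for (size, ways), rate in sorted(results.items()):
    print(f"{size:6d} bytes, {ways}-way: hit rate {rate:.4f}")
```

On this contrived trace the two buffers map to the same sets, so every direct-mapped configuration thrashes regardless of size, while the 2-way configurations do not. That is the kind of effect a parameter sweep is meant to expose before the sizes are committed to hardware.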


Chris Maryan
