instruction cache or data cache

Hi,

Should my processor have more instruction cache, or should it have more data cache?

Which should be larger for an efficient processor?

Thanks in advance, Karthik Balaguru

Reply to
karthikbalaguru

More of both, if you can ;)

It really depends on the ISA and the workloads you're expecting, as well as the rest of the architecture surrounding the chip. You should paint us a picture of what you already have, and then you might get some better help. :)

Joe

Reply to
Joe Peric

I am designing a system that should do efficient processing of received data and also communicate with an FPGA (shared memory) whose data is written/read by all the other processors (two other DSP processors). All these processors can exchange data/info among themselves via the FPGA (shared memory) and can perform algorithmic manipulation of that data after reading it from the FPGA or before writing it to the FPGA. There is one more controller that takes care of the auxiliary devices; it will also communicate with the FPGA to retrieve the data for its auxiliary devices.

Karthik Balaguru

Reply to
karthikbalaguru

It all depends on what kind of processing is to be done.

If the algorithms require touching the data just once (like compressing, converting or encrypting a stream), a data-cache won't help much. It will be merely reduced to "prefetch" and "writebuffer" functionality. It might even hurt performance, especially when you need each data word just once, and in random order. However, if you need to touch each data word several times (but can't keep them in CPU registers in the meantime), then a correctly sized data-cache will help a lot.
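Marc's touch-once vs. touch-several distinction can be sketched numerically. The following is an illustrative direct-mapped data-cache model (the cache geometry and access patterns are my own assumptions, not from the thread): a streaming pass benefits only from spatial locality within a line, while a small working set scanned repeatedly hits almost every time.

```python
# Hypothetical sketch: hit rates of a small direct-mapped data cache
# under two access patterns. All sizes are illustrative only.

def hit_rate(addresses, num_lines=64, line_bytes=16):
    """Count hits in a direct-mapped cache (tag check only, fill on miss)."""
    tags = [None] * num_lines
    hits = 0
    for addr in addresses:
        line = addr // line_bytes
        index = line % num_lines
        if tags[index] == line:
            hits += 1
        else:
            tags[index] = line  # miss: fill the line
    return hits / len(addresses)

# Touch-once stream: every 4-byte word read exactly once, sequentially.
# Only spatial locality within a line helps (Marc's "prefetch" effect).
stream = [4 * i for i in range(100_000)]

# Touch-several workload: loop repeatedly over a working set that fits
# in the cache (50 lines out of 64).
working_set = [4 * i for i in range(200)]
repeated = working_set * 500

print(f"streaming hit rate: {hit_rate(stream):.4f}")
print(f"repeated  hit rate: {hit_rate(repeated):.4f}")
```

With 16-byte lines and 4-byte words, the streaming pass hits only on the 3 trailing words of each line, while the repeated scan misses only on the first pass; this is the gap a correctly sized data cache exploits.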

If the algorithm code is very repetitive (tight loops et al), then an instruction cache will help too. On the other hand, if you have to branch to lots of possible execution flows, then the instruction cache won't be very useful. It too can hurt in some situations.

So unless you specify your situation with more detail, our answers can't be more specific either.

Regards, Marc

Reply to
jetmarc

The processor is yours, so it is for you to decide.

None. Ideally, the bus should run at full speed.

Vladimir Vassilevsky DSP and Mixed Signal Design Consultant


Reply to
Vladimir Vassilevsky

This depends upon the target of the design point. If the design is shooting at less than 1/2 of the maximum performance achievable with all the stops pulled out, then you can get away with a unified cache, especially if wide access to the cache is allowed (like 128 bits/cycle). This kind of width allows several instruction accesses to take only one cycle, leaving the other 3/4 of the cycles for data accesses. I once designed a pipeline around this very notion, and found that it works quite well (at less than 1/2 of all-out max performance).
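The cycle-budget arithmetic behind that 3/4 figure can be worked through directly (numbers assumed from the post: a 128-bit cache port, 32-bit instructions, and a core consuming roughly one instruction per cycle):

```python
# Back-of-envelope sketch of the wide-fetch argument. The figures
# (128-bit port, 32-bit instructions) are taken from the post; the
# one-instruction-per-cycle consumption rate is an assumption.

port_bits = 128
insn_bits = 32
insns_per_fetch = port_bits // insn_bits   # instructions delivered per fetch cycle

# If the core consumes ~1 instruction/cycle, the fetch side needs the
# unified cache port only one cycle in insns_per_fetch:
fetch_fraction = 1 / insns_per_fetch
data_fraction = 1 - fetch_fraction         # cycles left over for data accesses

print(f"instructions per fetch: {insns_per_fetch}")
print(f"cycles free for data:   {data_fraction:.0%}")
```

One 128-bit fetch supplies four 32-bit instructions, so instruction fetch occupies only a quarter of the port's cycles, leaving the rest for the D stream.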

A unified cache may actually hold more cache and offer a better range of balance points between I- and D-stream accesses. Separate I and D caches are for when one cannot solve the interference problems (which typically accrue when targeting the maximum performance possible: "damn the torpedoes, full speed ahead").

useful additions:

If the instruction set is byte aligned, then consider making the basic cache access byte aligned for both instruction fetches and data fetches. This is a straight tradeoff between power and utility, and when making a unified cache it definitely helps lower the interference between the I and D streams.

This kind of cache also gets rid of self-modifying-code issues, as the modified data is in the same cache as the instructions. However, it does NOT get rid of cross-modifying-code issues.

Reply to
MitchAlsup

If your target performance is less than 1/2 of the max performance achievable with all the tricks of the trade, then you can get by with a unified cache. This is especially true if you fetch wide items from the cache (128 bits), and doubly so if you can fetch a misaligned 128-bit item from the cache. These two things keep natural instruction fetch out of the way of the data fetches for most applications.

Reply to
MitchAlsup

What am I missing here? If you can tolerate 1/2 the max performance, why are you concerned with any cache?

Reply to
Everett M. Greene

Non-cache memory is so slow that one cannot obtain even 1/10 max performance without caching. So caches have to be somewhere.

What I am suggesting is that between 0.X and 1.0 of "as high a performance as you can architect" one pretty much needs a Harvard cache design, because the interference between the I and D streams is (for practical purposes) unsolvable for a unified cache.

X can be somewhere in the 0.4 to 0.6 range of performance, where a top-end Intel or AMD CPU is rated at 1.0.

Whereas under 0.X of max performance, the unified cache comes back into vogue and will end up outperforming separate I and D caches of equivalent storage space.

Mitch

Reply to
MitchAlsup

In general, instruction caches tend to have higher hit rates than data caches. Data caches have to deal with less predictable read and write patterns. Also, dirty data-cache entries need to be written back to main memory.

-- EventStudio 4.0 -

Embedded System Modeling with Text Based Sequence Diagrams

Reply to
EventHelix.com

You could simulate for your particular application. Write a simple simulator that's behaviourally close to your intended design and run something close to your intended application on it. You can parametrize the cache size (and associativity, etc.) and then try different sizes to see what's best.
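A minimal sketch of such a trace-driven simulator, assuming an LRU set-associative model (the class name, parameters, and synthetic trace below are placeholders; in practice you would feed it an address trace captured from your real application):

```python
# Illustrative trace-driven cache simulator: parametrize size, line
# size and associativity, then compare miss counts across configs.

from collections import deque

class Cache:
    def __init__(self, size_bytes, line_bytes, ways):
        self.line_bytes = line_bytes
        self.sets = size_bytes // (line_bytes * ways)
        # One LRU queue of resident line tags per set; a full deque
        # with maxlen=ways evicts its least-recently-used entry on append.
        self.lru = [deque(maxlen=ways) for _ in range(self.sets)]
        self.hits = self.misses = 0

    def access(self, addr):
        line = addr // self.line_bytes
        s = self.lru[line % self.sets]
        if line in s:
            s.remove(line)   # move to most-recently-used position
            s.append(line)
            self.hits += 1
        else:
            s.append(line)   # miss: fill, evicting LRU if the set is full
            self.misses += 1

# Try a few associativities against the same synthetic trace.
trace = [(17 * i) % 4096 for i in range(50_000)]
for ways in (1, 2, 4):
    c = Cache(size_bytes=1024, line_bytes=16, ways=ways)
    for a in trace:
        c.access(a)
    print(f"{ways}-way: {c.misses} misses")
```

Swapping the synthetic trace for one logged from your DSP algorithms (and splitting it into separate I and D traces) would let you answer the original question empirically for your workload.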

Chris

Reply to
Chris Maryan
