Caches in embedded systems

Hi, I would like to know what sort of caches (I-cache and D-cache) typical embedded systems use. I know caches are avoided in real-time applications, but I am wondering, when they are used, what sort of configuration do they have: direct-mapped or associative, what line size, write-back or write-through? That also leads me to ask: in embedded systems with caches, what sort of applications are they used in? Can anyone give me some examples?

thanks Shrey

Reply to
shrey

On 3 May 2006 18:04:17 -0700, "shrey" wrote in comp.arch.embedded:


If you are asking elementary questions like this about the use of cache, how do you know they are "avoided in real time applications"? What is your source for this information?

We are shipping a product today with a board developed about 10 years ago, with a 66 MHz 486DX2 running a commercial RTOS.

Do you really think we avoided turning on the built-in cache in the 486 to decrease our worst-case processor loading from close to 80% to under 50%, and improve response times for the highest priority events?

If you really think that, think again.

Our latest generation uses an ARM9 with 16k each of on-chip I cache and D cache. Maybe you think we don't use them.

For future use, we are looking at Freescale's ARM11, which has the same size I and D cache, and also 128KB level 2 cache on chip. I guarantee you that if/when I put this part on a board, we will use that cache as well.

And of course, there are a large number of PCs and/or PC-compatibles running real-time embedded applications under VxWorks, QNX, and others, that use the on-chip cache of the x86 processor, plus in some cases external cache on the motherboard.

If you are talking about adding cache memory external to the microprocessor/microcontroller, you should be more specific.

The first spin of our 486 board 10 years ago had 128KB of external level 2 cache. It was eliminated as a cost reduction when the system went into production because by then we had verified acceptable performance to meet our requirements using only the on-chip cache.

--
Jack Klein
Home: http://JK-Technology.Com
Reply to
Jack Klein

Am I missing something? AFAIK, cache is an addressable piece of (presumably, fast) memory which we ask the MMU to use for cache functionality. If so, we may choose it for different purposes, such as buffer(s) for intensive calculations. I mean, if "embedded software" is the software which is a part of a device, it generally knows what device it runs on and what it is doing. So, it can possibly use fast memory better than a general-purpose caching algorithm built into the MMU. Again, am I missing something? Please help me here. Regards, Ark

Reply to
Ark

Yes, you're missing almost everything about cache memory.

Cache is not addressable memory, but a unit that sits on the path to addressable memory and remembers what is in recently-used locations of addressable memory so that it can be recalled faster. It usually uses *content-addressable* lookup on the memory address. Because it's not always possible to know which locations will be remembered and hence faster, cache can alter timing in ways that can cause problems in some real-time apps.
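To make that concrete, here is a rough sketch (not any particular CPU's implementation) of how a direct-mapped cache might decompose an address and check for a hit. The sizes are illustrative assumptions, not taken from any part discussed here:

    /* Sketch of a direct-mapped cache lookup.  Assumed sizes:
     * 16 KB cache, 32-byte lines -> 512 sets. */
    #include <stdint.h>
    #include <stdbool.h>

    #define LINE_SIZE 32u                        /* bytes per line */
    #define NUM_SETS  512u                       /* 16 KB / 32 B   */

    struct cache_line {
        bool     valid;
        uint32_t tag;
        uint8_t  data[LINE_SIZE];
    };

    static struct cache_line cache[NUM_SETS];

    /* Returns true on a hit; on a miss the hardware must fetch the
     * whole line from main memory, not just the requested byte. */
    static bool cache_lookup(uint32_t addr, uint8_t *out)
    {
        uint32_t offset = addr % LINE_SIZE;
        uint32_t index  = (addr / LINE_SIZE) % NUM_SETS;
        uint32_t tag    = addr / (LINE_SIZE * NUM_SETS);

        if (cache[index].valid && cache[index].tag == tag) {
            *out = cache[index].data[offset];    /* hit: fast path */
            return true;
        }
        return false;                            /* miss: slow path */
    }

The timing problem Clifford describes falls out of the last line: whether you take the fast or the slow path depends on the access history, not just on the code being run.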

Clifford Heath.

Reply to
Clifford Heath

Only if idiots are programming who don't know what volatile means. Caches are not an issue in real-time systems unless you have multiport memory that another real-time system is relying on and your system doesn't support the ability to write through the cache or 'flush out' the cache when requested.
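As a sketch of what that means in practice (the cache-maintenance routine and the flag address here are hypothetical names, standing in for whatever the vendor's library or cache control registers provide):

    /* Sketch: publishing a buffer in multiport RAM to another bus
     * master.  dcache_clean_range() is a hypothetical name for the
     * vendor's "write dirty lines back to memory" routine. */
    #include <stdint.h>

    extern void dcache_clean_range(void *addr, uint32_t len); /* hypothetical */

    /* volatile stops the compiler caching the flag in a register;
     * the address is assumed and assumed to be mapped uncached. */
    #define SHARED_FLAG (*(volatile uint32_t *)0xA0000000u)

    void publish(uint32_t *buf, uint32_t len)
    {
        dcache_clean_range(buf, len);  /* push the data out of the D-cache */
        SHARED_FLAG = 1;               /* then signal the other master     */
    }

Note the two mechanisms are separate: volatile handles what the compiler does, the explicit clean handles what the D-cache does.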

> but wondering when they are used what sort of configuration do they have

The answer is pretty simple: they are used in any system where the CPU clock runs faster than the memory. Once you pass about 40 MHz you generally see caches starting to come in on embedded systems. We have some systems that run from 1 MHz to 33 MHz with no caches, and everything from 66 MHz up to 1 GHz has caches. They are almost never direct-mapped, because the caches on most embedded systems will be 16 KB or so while the main memory could easily be 16 MB or more. Usually they have 256/512/1024-byte chunks (it all depends on the CPU and its pointer registers). Usually you associate a piece of cache to map over the main memory. A policy combining the oldest-used and least-used sections decides which cache block to overwrite. Modern caches allow block sizes of variable size.
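A rough sketch of the age-based victim selection described above, for one set of a 4-way set-associative cache. Real hardware does this with a few pseudo-LRU bits per set rather than counters and a search; this is just to show the idea:

    /* Sketch: pick the least-recently-used way to overwrite. */
    #include <stdint.h>

    #define WAYS 4u

    struct way {
        uint32_t tag;
        uint32_t last_used;   /* age stamp: lower = older */
    };

    static unsigned pick_victim(const struct way set[WAYS])
    {
        unsigned victim = 0;
        for (unsigned w = 1; w < WAYS; w++)
            if (set[w].last_used < set[victim].last_used)
                victim = w;   /* evict the oldest/least used way */
        return victim;
    }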
Reply to
DAC

Unless the cache is very badly implemented, the worst case timing occurs when the cache is disabled.

Thus, it is sufficient to verify that all high priority tasks with some definite deadlines are executed within the deadlines even when the cache is disabled.

A system usually also contains tasks that are not time critical (such as user interfaces or calculating weekly statistics), which are executed when _no_ high priority task is using the processor. Thus, the low priority tasks can progress quite slowly if there is much high priority activity.

Enabling the cache reduces _on_average_ the execution time of the high priority tasks, freeing up some additional time for the low priority tasks, which will progress more rapidly and ultimately allows the null task to run at times when there is no useful work to be done.

The net effect of enabling the cache is that the high priority tasks still complete within their deadlines, but the visible result is that the low priority background tasks execute faster.

Enabling the cache will also reduce the number of main memory accesses, thus reducing the memory and memory driver power consumption. This is usually true even if the low priority tasks can now perform more work, since the cache hit rate is usually quite high.

Paul

Reply to
Paul Keinanen

I meant, physically addressable. When you program the MMU, you tell it how to translate between virtual (application-accessible) and physical memory, whether to enable cache, and what physically addressable region of memory is dedicated to cache. So I don't think I was missing THAT; I am afraid I am missing the rationale for the blanket statement.

- Ark

Reply to
Ark

If your professor told you caches were avoided in real time applications, I suggest you get a better professor.

Reply to
Jim Stewart

The only interaction between the MMU and the cache that I can think of is that when using memory-mapped I/O, the I/O pages should not be cached, since the contents of an I/O page register can change without the cache noticing it, so there would be a cache coherence problem.
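In code that looks something like the following sketch; the device address and the page-table helper are assumptions, standing in for whatever the particular MMU and BSP provide:

    /* Sketch: memory-mapped I/O must bypass the D-cache, or reads
     * may return a stale cached copy instead of the live register.
     * UART_BASE and mmu_map_uncached() are assumed names. */
    #include <stdint.h>

    extern void mmu_map_uncached(uint32_t phys, uint32_t len); /* hypothetical */

    #define UART_BASE 0x4000C000u                 /* assumed device address */
    #define UART_STAT (*(volatile uint32_t *)(UART_BASE + 0x04))

    void uart_wait_ready(void)
    {
        mmu_map_uncached(UART_BASE, 0x1000);      /* mark the I/O page uncacheable */
        while ((UART_STAT & 0x1u) == 0)           /* each read really hits the bus */
            ;                                     /* busy-wait for the ready bit   */
    }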

Paul

Reply to
Paul Keinanen

Independent of your cache implementation, the software *can* cause really bad timing in some cases, especially for data accesses. When the software hits a large amount of memory, i.e. larger than the cache size, in (almost) random access patterns, each access causes a cache line load instead of reading the single word needed by your program. When that data word is also changed, it furthermore means that a whole updated cache line must be written back to memory when a new access is done. Of course this behaviour will not happen in general, but it can happen. Instruction cache behaviour in general will be good, unless your compiler is really broken.
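A small sketch of that worst case: striding through an array much larger than the cache, one cache line at a time, makes essentially every access a miss. The sizes are assumptions:

    /* Sketch: pathological data cache behaviour.  With the table far
     * larger than the D-cache and a stride of one cache line, each
     * iteration loads (and later writes back) a whole line just to
     * touch 4 bytes. */
    #include <stdint.h>
    #include <stddef.h>

    #define LINE_SIZE  32u                 /* assumed cache line, bytes */
    #define TABLE_SIZE (1024u * 1024u)     /* 1 MB >> a 16 KB cache     */

    static uint32_t table[TABLE_SIZE / sizeof(uint32_t)];

    void touch_all(void)
    {
        /* step one line at a time: roughly one miss per iteration */
        for (size_t i = 0; i < TABLE_SIZE / sizeof(uint32_t);
             i += LINE_SIZE / sizeof(uint32_t))
            table[i]++;                    /* read-modify-write per line */
    }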

Rob

Reply to
Rob Windgassen

Not true. Depending on system properties, such as CPU power, amount of data/code to be processed and cache size/properties, it may be a challenge for the designer.

Rob

Reply to
Rob Windgassen

A particular MMU might have that feature, but it's almost completely irrelevant to its function as an MMU.

*No physically addressable* region of (main) memory is used as a cache.

Certain regions might be cacheable and others not, on some MMUs, but that doesn't mean that the cache itself is *stored* in main memory. The whole point of cache is that main memory is too slow, so you use a smaller amount of fast memory for recently-used locations, so that you don't need to go to main memory at all (until something else needs that cache line and you have to save it).

Reply to
Clifford Heath

Even in this situation, the difference is not usually that dramatic. E.g. in a typical x86 implementation with a 32-byte cache line and 8-byte (64-bit) wide DRAMs, a cache line load requires one full RAS/CAS cycle (which includes the DRAM access time) to get the first 8 bytes and three additional CAS cycles (to get the remaining 24 bytes), which essentially activates a data selector on the DRAM.

A direct random memory access would still require the full RAS/CAS cycle to get up to 8 bytes of data. Both the cache line load and the random access read contain a single memory cell access time (which depends on the cell technology), while the cache load additionally multiplexes three times 8 bytes on the memory data bus (the time depends on the bus speed).
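For a rough illustration with assumed timings (say 60 ns for the full RAS/CAS cycle and 20 ns per follow-on CAS cycle): a single random read costs 60 ns, while a 32-byte line fill costs 60 + 3 x 20 = 120 ns. The line fill moves four times the data in only twice the time, which is why the cached worst case is not as far from the uncached case as one might guess.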

The need for immediate write-back usually occurs in direct-mapped caches, but with any associative mapping, the write-back can usually be delayed.

It should be noted that with direct access, a read and a write cycle must still be performed.

Even a single byte write to a 2..8 byte wide memory will require a read operation to get the unmodified bytes from the memory word, then the byte to be modified is replaced in the CPU, and then the full 2..8 byte wide memory word is written back, unless of course you have up to 8 separate write enable signals, one for each byte.
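The same read-modify-write sequence expressed in software terms (the memory controller does this in hardware, but the steps are identical; the bus access routines are assumed names):

    /* Sketch: writing one byte into 32-bit-wide memory that has no
     * per-byte write-enable lines. */
    #include <stdint.h>

    extern uint32_t mem_read32(uint32_t addr);             /* assumed bus ops */
    extern void     mem_write32(uint32_t addr, uint32_t v);

    void write_byte(uint32_t addr, uint8_t b)
    {
        uint32_t word  = mem_read32(addr & ~3u);           /* 1: read full word  */
        uint32_t shift = (addr & 3u) * 8;
        word &= ~(0xFFu << shift);                         /* 2: clear old byte  */
        word |=  ((uint32_t)b << shift);                   /*    merge new byte  */
        mem_write32(addr & ~3u, word);                     /* 3: write word back */
    }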

The difference between direct access and cached access is not that great as it might first appear.

In hard real time systems with firm deadlines the worst case situation must still be identified.

Paul

Reply to
Paul Keinanen

Reply to
Alan

Hi,

DMA is an integral part of all embedded systems. DMA controllers can be programmed to move data (usually large amounts of data) to and from the internal memory of processors. In such a case, if a processor does not have a cache, then during the DMA the processor has to be stalled, because the internal memory can be accessed by either the DMA or the processor at a time, but not both.

But if the processor has a cache (L1 cache), then the processor will operate out of the L1 most of the time while the DMA operates on the internal memory in parallel.
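The flip side is that the CPU's cached view of a DMA buffer can go stale. A sketch of the usual discipline, with hypothetical names for the DMA driver and cache-maintenance routines:

    /* Sketch: receiving data by DMA into internal memory while the
     * CPU runs from cache.  All three extern routines are
     * hypothetical stand-ins for the vendor's API. */
    #include <stdint.h>

    extern void dcache_invalidate_range(void *addr, uint32_t len); /* hypothetical */
    extern void dma_start_rx(void *dst, uint32_t len);             /* hypothetical */
    extern int  dma_done(void);                                    /* hypothetical */

    static uint8_t rx_buf[512];

    void receive(void)
    {
        dma_start_rx(rx_buf, sizeof rx_buf);   /* DMA fills internal RAM...    */
        while (!dma_done())
            ;                                  /* ...while the CPU runs cached */
        dcache_invalidate_range(rx_buf, sizeof rx_buf); /* drop stale lines */
        /* reads of rx_buf now come from the freshly DMA'd memory */
    }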

Regards, kvm.

Reply to
manja

NO, it is part of SOME embedded systems. This may be a larger number than 5 or 10 years ago, but there are lots of applications that do NOT use DMA in ANY form. I wonder what use DMA is in applications like:

Timer functions for heating and ventilation control
Security systems
......

--
Paul Carpenter          | paul@pcserviceselectronics.co.uk
    PC Services
Reply to
Paul Carpenter

No, it is not. There are *very many* embedded systems that do not use DMA.

--

John Devereux
Reply to
John Devereux

What I wanted to convey was: when you have DMA in the system, caches become very useful, as the processor need not be stalled during a DMA transfer.

-kvm

Reply to
manja

Does anyone know of any embedded processors that use a direct-mapped cache? A direct-mapped cache definitely offers a lower per-access energy cost, but it increases the misses in many cases. So is there any incentive to use a direct-mapped cache over an associative cache? One argument I have heard is that it saves area. Is that really significant? Also, are line sizes as small as 8 bytes of any value?
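The miss problem referred to here is conflict misses: two hot addresses that share an index evict each other. A worked sketch with assumed sizes:

    /* Sketch: conflict misses in a direct-mapped cache.  With an
     * assumed 16 KB cache and 32-byte lines there are 512 sets, so
     * any two addresses exactly 16 KB apart land in the same set. */
    #include <stdint.h>

    #define LINE_SIZE 32u
    #define NUM_SETS  512u      /* 16 KB / 32 B */

    static uint32_t set_index(uint32_t addr)
    {
        return (addr / LINE_SIZE) % NUM_SETS;
    }

    /* set_index(0x00010000) == set_index(0x00014000) == 0, so
     * alternating accesses to those two addresses miss every time in
     * a direct-mapped cache, yet coexist in a 2-way associative one. */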

shrey

Reply to
shrey

Include context, dammit. See my sig below for the means to do that with the broken google interface. DO NOT ASSUME google is usenet. READ THE URLs.

There are various forms of DMA. Some can seize control of the memory bus, others do what is known as cycle stealing. Usually DMA access is given preference, and thus can very well stall the processor. You have to look at the actual situation.

--
"If you want to post a followup via groups.google.com, don't use
 the broken "Reply" link at the bottom of the article."
Reply to
CBFalconer
