PCI and caching

Consider a Pentium PC with main memory and a PCI bus. One can plug memory-type devices into the PCI bus: things like video or ADC buffers, CPCI cards, and, I suppose, even more program-space memory.

A couple of questions:

Is there (I guess there must be) a mechanism for a Windows program to directly map a chunk of PCI-bus memory into its virtual address space? Anybody know how this works?

Does anybody know how the BIOS decides what should be cached? There's nothing in a device's PCI config registers that says "don't cache me" as far as I can tell.

I know the guy who wrote the book "PCI Bus Demystified" so I asked him; he hadn't a clue about any of this.

Thanks,

John

Reply to
John Larkin

Don't know about Windows, but on Linux you'd use mmap() to map a physical memory range, such as a PCI VGA card's frame buffer, into a process's virtual memory. Check out Linux Device Drivers for more info:

formatting link
Specifically, check out chapter 1. I guess on Windows it's the same. Look for documentation of mmap() or memmap() etc. for Windows.

Don't know, but mmap() seems to handle this. CPUs that implement caching typically have ways to prevent certain memory I/O from being cached. Usually this can only be done in supervisory mode so you really need to ask your OS to set it up for you.
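A minimal sketch of that mmap()-on-/dev/mem approach, assuming you already know the BAR's physical address and size (from lspci, say); the 0xF0000000 base and 4K size below are made up, and on many kernels /dev/mem access is restricted:

/* Map a PCI memory BAR into user space via /dev/mem (Linux).
 * The physical address and size are hypothetical; read the real
 * ones from the device's BAR.  Needs root. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define BAR_PHYS  0xF0000000UL   /* hypothetical BAR base address */
#define BAR_SIZE  0x1000UL       /* hypothetical region size      */

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);  /* O_SYNC conventionally requests an uncached mapping */
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    volatile uint32_t *regs = mmap(NULL, BAR_SIZE, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, BAR_PHYS);
    if (regs == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first register reads 0x%08x\n", (unsigned)regs[0]);

    munmap((void *)regs, BAR_SIZE);
    close(fd);
    return 0;
}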

Reply to
slebetman

I'm thinking about doing a CPCI board that would look like a smallish block of stuff in memory space (as opposed to I/O space) on the PCI bus. I was wondering if a Windows application could get at it directly, without doing a driver call for every I/O access, and how, in general, a PC decides what's cacheable and what's not. My board would deliver realtime data in the register block ('volatile' in C-speak, I think), so it must not be cached at any level.
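In C-speak, the register block might look something like this; the layout, field names and widths are hypothetical, just to make the 'volatile' point concrete:

/* Hypothetical register layout for the CPCI board.  'volatile' keeps the
 * compiler from caching values in registers or eliding accesses, but it
 * does nothing about a hardware cache -- the mapping itself must also be
 * uncached. */
#include <stdint.h>

struct board_regs {
    volatile uint32_t status;     /* realtime status word        */
    volatile uint32_t adc_data;   /* latest ADC sample           */
    volatile uint32_t control;    /* command/control register    */
    volatile uint32_t irq_ack;    /* write here to clear an IRQ  */
};

static inline uint32_t read_adc(struct board_regs *regs)
{
    return regs->adc_data;        /* each call really touches the bus */
}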

I'm guessing everything beyond the contiguous RAM space is not cached, and/or maybe anything above the 2 GB line doesn't get cached. Funny how little seems to be known about this.

Yeah, we have a couple of PowerBasic programs that run under DOS or 9x that search PCI config space for a device and then drag it down into a hole in real space, between 640K and 1M. Of course, first you have to locate an unused, uncached hole to plop it into, and that seems to be different from BIOS to BIOS. We've found systems that have unused, cached holes!

John

Reply to
John Larkin

Not really. Once upon a time I wrote some DOS programs that did this (using djgpp), but I've never done it in Windows. You may have to write device driver code to do this sort of thing. Normal application code may not have enough privilege to perform this type of mapping.

I'm not 100% sure what you mean by cached, in this instance. Do you mean cached by the CPU itself (L2 cache)? If so, I have no idea. I am pretty sure that when control initially passes to the BIOS the cache is disabled, though. At some point, the BIOS turns on the cache, but this may be very late in the boot process.

But if you are talking about accesses being cached by the North Bridge, then I would guess that PCI accesses are never cached.

I think the way this works is that memory accesses from the CPU go over the host bus to the North Bridge, and the North Bridge routes each access to PCI space or SDRAM as appropriate. If it is a PCI access, the North Bridge would never cache it, I assume.

What are you trying to do? As far as I know (which may not be very far!), you don't need to worry about this issue unless you are doing something really obscure.

I have written DOS code (again, with djgpp) which accessed PCI config space (and memory mapped areas) for reading and writing and there were no problems with caching.

HTH!

--Mac

Reply to
Mac

I'm rusty on this stuff, so be suspicious.

I/O space is a kludge left over from ISA. Best to avoid it, but I don't think it's really any different (other than not having many address bits).

I think there is a cacheable bit in the config space options.

If you have appropriate driver support, you can map some of an application's virtual address space to "memory" on your device, and it's easy to make that uncached.

With appropriate driver support, I've written diagnostics/hacks that ran in user space. Interrupts are a bit tricky; I forget the details. I think there was a magic location to read that turned off the interrupt. We probably had some API to tell the driver the address of that location so it would read it and save the answer, plus more API to get that data back (probably from that last read) and a count of how many interrupts had happened since the last time you asked.
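From user space, that kind of split might look roughly like the sketch below; the device node, ioctl code and struct are entirely made up, just to show the shape of it:

/* Hypothetical user-space side of the arrangement described above: the
 * driver services the interrupt (reads the magic location that clears
 * it) and the application asks for the saved value plus a count of
 * interrupts since the last query. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

struct irq_info {
    uint32_t last_value;   /* value the driver read to clear the IRQ */
    uint32_t count;        /* interrupts since the last query        */
};

#define MYDEV_GET_IRQ_INFO 0x4D01      /* made-up ioctl code */

int main(void)
{
    int fd = open("/dev/mydev0", O_RDWR);   /* made-up device node */
    if (fd < 0) { perror("open"); return 1; }

    struct irq_info info;
    if (ioctl(fd, MYDEV_GET_IRQ_INFO, &info) == 0)
        printf("%u interrupts, last value 0x%08x\n",
               (unsigned)info.count, (unsigned)info.last_value);

    close(fd);
    return 0;
}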

The deal with PCI is that the BIOS allocates PCI addresses. If a device asks for X bytes of address space, the region must be aligned on an X-byte boundary. Unless you are very lucky, that will result in holes. (Lucky means you filled in all the holes with chunks from other devices.)
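For reference, the sizing/alignment game looks roughly like this: software writes all-ones to a BAR and reads it back, and the bits the device hard-wires to zero reveal both the size and the (natural) alignment. The read-back value below is made up:

/* BAR sizing sketch.  The read-back value is hypothetical: 0xFFFFF008
 * would be a prefetchable memory BAR decoding 4 KB. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t readback = 0xFFFFF008;        /* hypothetical value after writing all-ones */
    uint32_t mask     = readback & ~0xFu;  /* strip the low flag bits of a memory BAR   */
    uint32_t size     = ~mask + 1;         /* 0x1000 here: 4 KB region, 4 KB aligned    */

    printf("region size = 0x%x bytes, naturally aligned on 0x%x\n", size, size);
    return 0;
}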

Reply to
Hal Murray

It wasn't bad... just had to figure out a couple of BIOS calls. It's available for anybody who's interested. We also wrote a couple of cute utilities that scan the 0..1M address space; one shows the data contents, one graphs access time vs. address. Between the two you can pretty much figure out where the BIOS has put things (like shadows of itself!) and what's cached.

We like to use DOS for our dedicated VME/PCI test programs (rackmount PC or embedded-Pentium VME card) because we can own the CPU and run pretty much in real time, even turning off or dancing around the 18 Hz clock tick in extreme cases.
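For what it's worth, "turning off the 18 Hz tick" can be as simple as masking IRQ0 at the primary 8259 PIC around the realtime loop. A djgpp-flavoured sketch, assuming inportb()/outportb() from <pc.h>; don't leave the timer masked, or the DOS time-of-day stops advancing:

/* Mask and unmask the BIOS timer tick (IRQ0) at the primary PIC. */
#include <pc.h>

#define PIC1_DATA 0x21   /* primary 8259 interrupt mask register */

void timer_tick_off(void)
{
    outportb(PIC1_DATA, inportb(PIC1_DATA) | 0x01);    /* set bit 0: mask IRQ0   */
}

void timer_tick_on(void)
{
    outportb(PIC1_DATA, inportb(PIC1_DATA) & ~0x01);   /* clear bit 0: unmask it */
}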

Well, I'm typing on, essentially, a heavily kluged 8008 CPU, which was a ghastly bad architecture to start on. How we wound up with Intel and Microsoft as the dominant computer architecture is one of the tragedies of our time. Think of how things would be if IBM had gone with the 68K and anybody but Bill.

John

Reply to
John Larkin

Not PCI config registers. Memory configuration registers. MTRRs (Memory Type Range Registers) in the processor and chipset control such things.

Memory on the PCI bus can no longer be cached (deprecated in V2.2, IIRC). Cacheable memory on the PCI bus is a mess, so most systems, even before it was yanked out of the spec, didn't support it.
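On a Linux box you can see how the BIOS/OS has programmed the MTRRs directly; a trivial sketch, assuming an x86 system that exposes /proc/mtrr:

/* Dump the MTRR setup.  Each line shows a physical range and its memory
 * type (write-back, uncachable, write-combining, ...). */
#include <stdio.h>

int main(void)
{
    char line[256];
    FILE *f = fopen("/proc/mtrr", "r");
    if (!f) { perror("/proc/mtrr"); return 1; }

    while (fgets(line, sizeof line, f))
        fputs(line, stdout);

    fclose(f);
    return 0;
}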

I *highly* recommend "PCI System Architecture" by Shanley and Anderson as an introduction/reference. MindShare has a very good series of books on busses and processors. They do an *excellent* series of courses too, if a tad expensive. Their books are available in dead-tree form or as e-books:

formatting link

--
  Keith
Reply to
Keith

Hmm. It seems like other people must have done this (or something like it) before. I am guessing that this will work fine without any special precautions.

Well, there are people out there who know this stuff, but they might not be reading here. There is a lot of information about Intel Architecture that is hard to ferret out. Most of it doesn't seem to be in any kind of real specification, either.

Anyway, I am quite sure that accesses to memory-mapped areas on the PCI bus are totally distinct from true memory accesses, in the sense that the North Bridge is well aware which area it is accessing. So I don't think you have to worry about caching there.

As for the L2 cache, I'm not sure how that is managed. Maybe it only caches instructions. But it obviously doesn't interfere with normal device driver operation, so I don't think you have to worry about it, either. As you can tell, I'm just guessing, but it seems like there are a lot of things that wouldn't work right if these types of accesses were cached by hardware.

Oh, wow. That sounds kind of hard. With djgpp and a protected mode stub, you can write real 32-bit programs for DOS. No worries about dragging stuff down below 1M. You don't get any kind of GUI, but for some tasks that is not a problem.
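A rough djgpp sketch of touching a register that lives well above 1M from a 32-bit DOS program, going through DPMI physical-address mapping and an LDT selector; the physical address is made up, and I'm going from memory on the calls, so treat it as a starting point:

/* Read one register from a high physical address under djgpp/DPMI.
 * Error handling is minimal; the BAR address is hypothetical. */
#include <dpmi.h>
#include <sys/farptr.h>
#include <stdio.h>

int main(void)
{
    __dpmi_meminfo mi;
    int sel;

    mi.address = 0xF0000000;          /* hypothetical PCI BAR base */
    mi.size    = 0x1000;
    if (__dpmi_physical_address_mapping(&mi) != 0) {
        printf("DPMI mapping failed\n");
        return 1;
    }

    sel = __dpmi_allocate_ldt_descriptors(1);
    __dpmi_set_segment_base_address(sel, mi.address);
    __dpmi_set_segment_limit(sel, mi.size - 1);

    printf("reg[0] = 0x%08lx\n", (unsigned long)_farpeekl(sel, 0));

    __dpmi_free_ldt_descriptor(sel);
    __dpmi_free_physical_address_mapping(&mi);
    return 0;
}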

The Intel Architecture and its associated DOS baggage are unbelievably arcane. It sure would be nice to dump it all and start over. ;-)

--Mac

Reply to
Mac

John, the NT DDK provides an example program that does just this. Please see this link from MS:

formatting link

The downside is that you are opening up protected hardware resources to user apps, which isn't 100% kosher.

I have seen this applied to Win2k and XP. Beats me what would be required for 9x.

Reply to
tbroberg

But only the PCI card itself knows whether its memory-space registers are suitable for caching. If it's video memory, likely they are; if it's an ADC buffer, it sure ain't. Seems to me that, if PCI space is allowed to be cached, there should be a mechanism that allows a PCI card to tell the BIOS (or OS) whether caching is a good idea for it.

John

Reply to
John Larkin

The specs, perhaps?

That doesn't change the cacheability. Caches are a processor thing and *NOT* under control of any PCI device.
--
  Keith
Reply to
Keith

Don't confuse pre-fetching with caching, and don't confuse MRM with cacheable. To be able to pre-fetch data simply requires that reading that data cause no side effects. To be able to cache data requires, in addition, that deferring (or never performing) the write-back of that data cause no side effects.

Generally the only cacheable device is generic RAM. When you output to any other device besides RAM, you expect to see that output in the real world. RAM is therefore the only safe device to cache.

For example, say you have a memory-mapped variable 'foo'. You need to send a signal to the device that 'foo' belongs to, and it needs to be a pulse. Say you code something like:

foo = 1;
delay(1);
foo = 0;

If 'foo' was cached, then the signal may or may not be generated in this case. Even with the delay statement (which prevents some C compilers from optimising away foo = 1), 'foo' may not be written back to the device if the cache never needs to evict the line holding it.

Pre-fetchable means it is safe to read the data multiple times. Cacheable means it is safe not to write the data out immediately.
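To make the pulse example concrete (the names and the crude delay are hypothetical):

/* 'volatile' stops the compiler from merging or eliding the two writes,
 * but it does nothing about a hardware write-back cache -- the mapping
 * that 'foo' points into must itself be uncached for the pulse to reach
 * the device. */
#include <stdint.h>

static void delay_short(void)
{
    for (volatile int i = 0; i < 1000; i++)   /* crude busy-wait, length arbitrary */
        ;
}

void pulse(volatile uint32_t *foo)            /* points into an UNCACHED mapping */
{
    *foo = 1;          /* leading edge  */
    delay_short();
    *foo = 0;          /* trailing edge */
}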

Reply to
slebetman

I guess the answer for Linux is hidden between

"Chapter 9 page 236 I/O Registers and Conventional Memory"

and

"Chapter 12 page 316 Accessing the I/O and Memory Spaces"

of ldd3.

yusuf

Reply to
yusufilker

Short answer: Snoops from any PCI initiator into PCI cached memory are in the purview of the spec. ;-)

Longer answer[*]: The SBO# (Snoop Back Off) and SDONE (snoop done) signals are part of the pre-PCI2.2 spec (actually, I believe in 2.2 it's recommended that they not be used and pulled high). These are used by the memory bridge to initiate retries to cached memory. SDONE indicates an access to cached memory is complete. SBO# active indicates a cached line is being accessed and the access must be terminated by the initiator and retried later.

[*] I've never used these things, so I'm not really up on the spec here. The performance is horrible, so it isn't often implemented, and even less often used.


But it *is* part of the (pre 2.2) spec. The bus must guarantee coherency in this case. The way it does it is with back-offs and retries. Now think about this with multiple bridges and initiators. It gets to be a mess.

It's there in the older versions of the spec. As I've mentioned, it's been deprecated in later versions.

--
  Keith
Reply to
Keith Williams

I'm not sure where you got this info, but it's news to me! :O

To answer the OP:

As for mapping PCI memory from Windows, there's a toolkit called TVICPCI which grants user-space access to PCI resources, which of course includes memory. They have a demo version for evaluation. You won't be able to tell Windows to use this as generic memory space (i.e. Windows won't be executing out of it), but you will be able to use it for your own data.

There's a bit in the PCI BAR that specifies whether or not the region is pre-fetchable. If set, a PCI master will know that it is allowed to use MRL (memory read line) or MRM (memory read multiple) on that address space. That doesn't necessarily mean it will.
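For reference, decoding those low BAR bits looks something like this; the read-back value is made up:

/* Decode the low bits of a BAR.
 *   bit 0    : 0 = memory space, 1 = I/O space
 *   bits 2:1 : type (00 = 32-bit, 10 = 64-bit) for memory BARs
 *   bit 3    : prefetchable
 */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t bar = 0xFEB00008;                 /* hypothetical read-back value */

    if (bar & 0x1)
        printf("I/O space BAR at 0x%08x\n", bar & ~0x3u);
    else
        printf("memory BAR at 0x%08x, %s, %s\n",
               bar & ~0xFu,
               ((bar >> 1) & 0x3) == 0x2 ? "64-bit" : "32-bit",
               (bar & 0x8) ? "prefetchable" : "not prefetchable");
    return 0;
}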

Regards, Mark

Reply to
Mark McDougall

The BIOS doesn't. The device driver (which knows the PCI card intimately) asks for a non-cached mapping when it requests a virtual mapping of the card's physical address range, if that's appropriate. If there's no reason why the region can't be cached, it doesn't.
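The Linux flavour of that driver side looks roughly like the fragment below; BAR 0 and the function name are just illustrative, and a real driver would also enable the device and claim its regions first. ioremap() on x86 gives an uncached mapping, which is what you want for a register block:

/* Fragment of a Linux PCI driver: map BAR 0 uncached. */
#include <linux/pci.h>
#include <linux/io.h>

static void __iomem *map_board_regs(struct pci_dev *pdev)
{
    resource_size_t start = pci_resource_start(pdev, 0);   /* BAR 0, illustrative */
    resource_size_t len   = pci_resource_len(pdev, 0);

    if (!start || !len)
        return NULL;

    return ioremap(start, len);     /* uncached mapping of the register block */
}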

Steve

--
Steve Schefter                     phone: +1 705 725 9999 x26
The Software Group Limited         fax:   +1 705 725 9666
642 Welham Road, Barrie, Ontario, CANADA  L4N 9A1
Web: formatting link
Reply to
steve_schefter

Which raises the question: why would the PCI spec refer to something that has nothing to do with PCI?

If a PCI memory space is marked as 'pre-fetchable' then it guarantees, among other things, that the act of pre-fetching memory has no side-effects. This means nothing more than the fact that it may be a suitable candidate for caching, if the platform supports it. In this case, a master may issue MRL (& MRM) commands.

OTOH, cache coherency (which I assume you're hinting at) is a different problem altogether, especially if you've got multiple bus masters accessing PCI memory space with their own caches. However, this is a *system* problem, and (IMHO) it's not really the PCI bus spec group's business to mandate that PCI memory is not 'cacheable' - whatever that means in each context!

In fact, there's little discussion whatsoever in the spec (that I can see) about 'caches' - which is just what I would expect.

BTW I'm quite happy to be shown the error in my reasoning!

Regards, Mark

Reply to
Mark McDougall

But we often use PCI devices under DOS, with no device driver at all. It appears to me that the BIOS locates devices in PCI config space, looks at the requested resources (in the PCI config registers), and assigns memory space, as requested, to the gadgets. Usually these addresses are really high, past 2G as I recall. So the cached/uncached situation must be resolved, somehow, before the OS boots, although it can certainly be changed by drivers later.
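A hedged sketch of that kind of config-space walk from DOS, using configuration mechanism #1 (ports 0xCF8/0xCFC) and djgpp-style port I/O; real code would also walk sub-functions and bridges, this just lists function 0 of every slot:

/* Scan PCI config space for devices (mechanism #1). */
#include <pc.h>
#include <stdint.h>
#include <stdio.h>

#define CONFIG_ADDRESS 0xCF8
#define CONFIG_DATA    0xCFC

static uint32_t pci_cfg_read32(int bus, int dev, int fn, int reg)
{
    uint32_t addr = 0x80000000u | ((uint32_t)bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC);
    outportl(CONFIG_ADDRESS, addr);
    return inportl(CONFIG_DATA);
}

int main(void)
{
    int bus, dev;

    for (bus = 0; bus < 256; bus++)
        for (dev = 0; dev < 32; dev++) {
            uint32_t id = pci_cfg_read32(bus, dev, 0, 0x00);   /* vendor/device ID */
            if ((id & 0xFFFF) != 0xFFFF)                       /* 0xFFFF = empty slot */
                printf("bus %d dev %d: vendor %04lx device %04lx\n",
                       bus, dev, (unsigned long)(id & 0xFFFF),
                       (unsigned long)(id >> 16));
        }
    return 0;
}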

John

Reply to
John Larkin

What book? The usual answer for Linux is O'Reilly's Linux Device Drivers, chapters 7, 8 and 13. The free online version can be found at:

formatting link

Reply to
slebetman

With Windows NT (2k/XP/2k3), applications can never access hardware directly. (*) You will have to write a kernel-mode device driver to control your piece of hardware. The NT kernel-mode API is very different from what you know from the user-mode Windows API. You will need the Microsoft Platform SDK, the DDK, and the Microsoft C compiler. The DDK contains samples and documentation for everything. Try to minimize the user/kernel mode switches.

(*) There are kludges like giveio.sys etc., which allow user applications to access I/O ports. Forget about these - they won't be enough for you.
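The non-cached mapping itself is typically done inside the kernel driver with MmMapIoSpace; a fragment-level sketch, where the physical address would come from the translated resources the PnP manager hands the driver, and the names are illustrative:

/* Fragment of an NT kernel-mode driver: map a BAR non-cached. */
#include <ntddk.h>

volatile ULONG *MapBoardRegisters(PHYSICAL_ADDRESS BarPhysical, SIZE_T BarLength)
{
    /* MmNonCached keeps the CPU from caching the register block */
    PVOID va = MmMapIoSpace(BarPhysical, BarLength, MmNonCached);

    return (volatile ULONG *)va;   /* NULL if the mapping failed */
}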

With kind regards,

Frank-Christian Krügel

Reply to
Frank-Christian Kruegel
