SPI Interface

We're discussing the current need for faster memories and you bring up a system from 40 years ago. What???

--

Rick
Reply to
rickman

I see a speed difference. IIRC almost all graphics cards sit on the PCI bus, which appears to run at 33 MHz.

This 'ere laptop, a not exactly new Lenovo R61i with a 1.66 GHz Core Duo chip, uses SODIMM RAM which runs at 667 MHz.

On these numbers the RAM is 20x faster than the PCI bus, which means that in the best case only one 32-bit data chunk can be DMAed across the PCI bus in the time that 20 32-bit transfers to or from RAM can occur.

--
martin@   | Martin Gregorie 
gregorie. | Essex, UK 
org       |
Reply to
Martin Gregorie

Of course it is.

No, that's not the point. The point is that, because the off-bus source or sink of DMA data is so much slower than any on-bus source, the *interval between items of DMA data becoming available for transfer* is measured in multiples of the memory bus speed and so simply doesn't interfere much with non-DMA transfers using that bus.

Example: if you're using DMA to send data to a PCI display card on a box using 667 MHz SODIMM RAM, the DMA subsystem can only move data to the PCI card in 32-bit chunks at 33,000,000 chunks/sec (the PCI bus runs at 33 MHz) while the RAM can send or receive 32-bit chunks at 667,000,000 chunks/sec (the SODIMM memory runs at 667 MHz).

IOW the RAM can send data to the DMA transfer process 20 times faster than it can forward it to the PCI card. Put another way, the ongoing DMA transfer is only stealing one RAM access in twenty from everything else that is concurrently using RAM.
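
A back-of-envelope sketch of that ratio in C, using the 33 MHz PCI and 667 MHz SODIMM figures quoted above (illustrative numbers, not a measurement of any particular machine):

/* How big a share of memory accesses a PCI-bound DMA stream consumes.
 * Figures are the ones quoted in this thread; adjust for your machine. */
#include <stdio.h>

int main(void)
{
    const double pci_word_rate = 33e6;   /* 32-bit chunks/sec over the PCI bus */
    const double ram_word_rate = 667e6;  /* 32-bit chunks/sec to or from RAM   */

    /* Each word sent to the card costs one RAM access, so the share of
     * RAM cycles consumed by the DMA stream is just the ratio of rates. */
    printf("DMA consumes %.1f%% of RAM accesses (1 in %.0f)\n",
           100.0 * pci_word_rate / ram_word_rate,
           ram_word_rate / pci_word_rate);
    return 0;
}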

--
martin@   | Martin Gregorie 
gregorie. | Essex, UK 
org       |
Reply to
Martin Gregorie

Yeah, back in the 90's! Then it was AGP and now it is PCIe with up to 31 GB/s, almost a 1000:1 speed up.

If you are going to discuss this stuff, you might try actually learning something recent about it.

--

Rick
Reply to
rickman

What does any of this have to do with the difference in performance between DMA and PIO?

--

Rick
Reply to
rickman

Correction: YOU seem to think faster memory is the answer to everything. I'm pointing out that this is a mistake. Faster memory is only needed if the speed of memory at a particular place in the machine causes a bottleneck. But, in a well-designed system this bottleneck can often be avoided in other ways such as parallel processing, and may even allow slower storage to be used in some parts of the system without affecting performance.

If you don't know what has been tried in the past and don't understand its costs and benefits then you're very likely to repeat past mistakes in the future.

--
martin@   | Martin Gregorie 
gregorie. | Essex, UK 
org       |
Reply to
Martin Gregorie

Wow! I have no idea where you are coming from.

You said this:

I disagree that this is true. Memory is still a major bottleneck in modern PCs, in particular PCs with multicore processors.

You seem to be discussing computing in general without describing any particular architecture. Sure, you can always construct a system so memory is not the bottleneck. Whatever.

--

Rick
Reply to
rickman

PCI Express these days, for GFX.

A serial link capable of up to a GByte/sec according to the wiki, if parallel serial paths are in use.

Mmm, so it's not memory mapped - just very high speed serial.

"PCIe 1.x is often quoted to support a data rate of 250 MB/s in each direction, per lane. This figure is a calculation from the physical signaling rate (2.5 Gbaud) divided by the encoding overhead (10 bits per

this is correct in terms of data bytes, more meaningful calculations are based on the usable data payload rate, which depends on the profile of the traffic, which is a function of the high-level (software) application and intermediate protocol levels."
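
The 250 MB/s per lane in that quote is just the 2.5 Gbaud line rate divided by the 8b/10b encoding overhead; a quick sketch of the arithmetic (raw figures only, before protocol overhead):

/* PCIe 1.x raw per-lane and per-link bandwidth from the signalling rate
 * and the 8b/10b encoding overhead (10 line bits per data byte). */
#include <stdio.h>

int main(void)
{
    const double line_rate     = 2.5e9;  /* bits/sec on the wire, per lane */
    const double bits_per_byte = 10.0;   /* 8b/10b encoding overhead       */

    for (int lanes = 1; lanes <= 16; lanes *= 2)
        printf("x%-2d link: %4.0f MB/s each direction (raw)\n",
               lanes, lanes * line_rate / bits_per_byte / 1e6);
    return 0;
}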

Well, that's because it doesn't happen that way, apparently. So we are both wrong.

Modern GFX is not memory mapped, and the communication is over a serial bus faster than memory.

--
Everything you read in newspapers is absolutely true, except for the  
rare story of which you happen to have first-hand knowledge. - Erwin Knoll
Reply to
The Natural Philosopher

I still have a card like that for some debugging, but those are very rare now. The first step away from that was the addition of a dedicated slot for the video card, and now with PCIe a generic slot can do that, but there usually is still one slot where a video card would preferably go (because it is much faster than the others).

Reply to
Rob

That was in the old days (like the 6809 mentioned earlier).

Today, the processor has a built-in cache and so much parallelism that it can always read instructions or data when it wants. The cache is there to reduce the external memory access rate: when it is not accessing external memory, it fetches instructions and data from its internal cache.

Reply to
Rob

A very dangerous oversimplification.

Caching most certainly does NOT help when e.g. accessing a graphic object many times larger than the cache. Nor is it a great deal of use if the processor jumps to a code segment not IN the cache.

And the smarts needed to determine what should BE in the cache are non-trivial in terms of cycles.

For, e.g., tight loops operating on local variables, cache is supreme. But for tight loops operating on very large memory objects, it ain't so good. Sure, the code is all cached, but the data can't be.

Neither is it successful when operating on large amounts of code - whether inline or subroutine.
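
A throwaway sketch of the tight-loop point in plain C: the same number of element reads, once over a working set that fits in cache and once over a buffer far larger than any cache. It only shows the shape of the effect; it is not a proper benchmark.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Sum the buffer 'passes' times; the work per element is identical in
 * both cases below, only the size of the working set differs. */
static long sum(const int *buf, size_t n, int passes)
{
    long s = 0;
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            s += buf[i];
    return s;
}

int main(void)
{
    size_t small = 4 * 1024;          /* 16 KB of ints: fits in L1 cache   */
    size_t big   = 64 * 1024 * 1024;  /* 256 MB of ints: misses constantly */
    int *a = calloc(small, sizeof *a);
    int *b = calloc(big, sizeof *b);
    if (!a || !b)
        return 1;

    clock_t t0 = clock();
    long s1 = sum(a, small, (int)(big / small)); /* same total reads...   */
    clock_t t1 = clock();
    long s2 = sum(b, big, 1);                    /* ...over a huge buffer */
    clock_t t2 = clock();

    printf("in-cache loop: %.2fs, cache-busting loop: %.2fs (sums %ld %ld)\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC, s1, s2);
    free(a);
    free(b);
    return 0;
}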

--
Everything you read in newspapers is absolutely true, except for the  
rare story of which you happen to have first-hand knowledge. - Erwin Knoll
Reply to
The Natural Philosopher

PCI video is at least 15 years old now. I don't have any left at all. After that was AGP, and then PCIe.

In fact even this 7-year-old board does not HAVE the ability to run off a PCI card, IIRC. It has to be PCIe.

"By 2010 few new motherboards had AGP slots. No new motherboard chipsets were equipped with AGP support, but motherboards continued to be produced with older chipsets with support for AGP.

Graphics processors of this period use PCI-Express, a general-purpose (not restricted to graphics) standard that supports higher data transfer rates and full-duplex. To create AGP-compatible graphics cards, those chips require an additional PCIe-to-AGP bridge-chip to convert PCIe signals to and from AGP signals. This incurs additional board costs due to the need for the additional bridge chip and for a separate AGP-designed circuit board."

(wiki)

--
Everything you read in newspapers is absolutely true, except for the  
rare story of which you happen to have first-hand knowledge. - Erwin Knoll
Reply to
The Natural Philosopher

I'm sure you know it all better than Intel and AMD!

Reply to
Rob

OK, but would anything that fast still use DMA? Sounds a bit unlikely.

There are still USB 2.0 interface cards and devices being sold which can handle block data transfers but have a max transfer rate that's still far below the bog standard PCI. Anything I've said previously will apply to them.

--
martin@   | Martin Gregorie 
gregorie. | Essex, UK 
org       |
Reply to
Martin Gregorie

This: once a DMA transfer has been started it can continue in the background with relatively little interference with bus throughput and zero impact on the OS and other running processes, apart from the interrupt it sends to tell the scheduler that the transfer has finished and that the process which requested it can be put back on the 'runnable' queue.

What you really don't seem to get is that throughput comparisons are fairly irrelevant provided only that DMA is fast enough to manage the requested transfer. Its real purpose is to handle block transfers in the background without using CPU or OS resources.
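
A self-contained toy of that flow, assuming nothing but POSIX threads: a worker thread stands in for the DMA controller and a condition variable stands in for the completion interrupt. The shape of the interaction is the point, not any real driver API.

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static char device_data[64] = "block of data arriving from the device";
static char buffer[64];

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cv = PTHREAD_COND_INITIALIZER;
static int done = 0;

/* Stand-in for the DMA controller: moves the block without the "CPU"
 * (main thread) touching it, then raises the "completion interrupt". */
static void *dma_engine(void *arg)
{
    (void)arg;
    usleep(100000);                       /* the transfer takes a while    */
    memcpy(buffer, device_data, sizeof buffer);
    pthread_mutex_lock(&lock);
    done = 1;                             /* "interrupt": mark completion  */
    pthread_cond_signal(&done_cv);        /* wake the requesting "process" */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t dma;
    pthread_create(&dma, NULL, dma_engine, NULL);  /* start the transfer   */

    pthread_mutex_lock(&lock);
    while (!done)                          /* requester sleeps; CPU is free */
        pthread_cond_wait(&done_cv, &lock);
    pthread_mutex_unlock(&lock);

    printf("transfer complete: \"%s\"\n", buffer);
    pthread_join(dma, NULL);
    return 0;
}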

--
martin@   | Martin Gregorie 
gregorie. | Essex, UK 
org       |
Reply to
Martin Gregorie

His position wasn't entirely accurate but he did indirectly refer to user context. It was you that morphed that into user mode.

--
Andrew Smallshaw 
andrews@sdf.lonestar.org
Reply to
Andrew Smallshaw

*THAT* is glaringly obvious.

I didn't say that at all: Paul Carpenter did. Get your attributions right.

--
martin@   | Martin Gregorie 
gregorie. | Essex, UK 
org       |
Reply to
Martin Gregorie

His basic understanding of sockets is wrong. Sockets can interface to different layers of the network stack. Those layers are entirely in the kernel, but the point where you connect your socket can differ.

It is not like "this is the lowest level where you can connect a socket, so everything above that must be in user-level libraries".
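
A small Linux-specific illustration of that point: the same socket() call can attach at the transport layer, the raw IP layer, or the link layer, all of which sit in the kernel. (The two raw variants need root or CAP_NET_RAW, so they may just report a permission error.)

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/if_ether.h>
#include <arpa/inet.h>
#include <unistd.h>

static void try_socket(const char *what, int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);
    if (fd < 0)
        perror(what);               /* raw sockets usually need privilege */
    else {
        printf("%s: ok (fd %d)\n", what, fd);
        close(fd);
    }
}

int main(void)
{
    try_socket("TCP, transport layer ", AF_INET,   SOCK_STREAM, 0);
    try_socket("raw IP, network layer", AF_INET,   SOCK_RAW,    IPPROTO_ICMP);
    try_socket("packet, link layer   ", AF_PACKET, SOCK_RAW,    htons(ETH_P_ALL));
    return 0;
}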

Reply to
Rob

That, but there is also the widespread misunderstanding that the CPU would not be able to continue while the DMA is operating. This is not true, because the DMA transfers at device rate (e.g. the A/D sample rate), and the memory bandwidth is a vast multiple thereof. So the DMA steals a memory cycle every 20 us, but all the other cycles remain available to other uses. The disk operates at a higher speed so it steals more cycles, but still not all of them, especially in a modern system with 128-bit memory and burst-mode transfers.
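
Putting rough numbers on the "one memory cycle every 20 us" point: a 50 kHz sample rate against memory doing hundreds of millions of accesses per second (illustrative figures, not any particular machine):

/* Share of memory bandwidth consumed by device-rate DMA. */
#include <stdio.h>

int main(void)
{
    const double sample_rate = 50e3;  /* A/D samples/sec: one DMA access per 20 us */
    const double mem_rate    = 667e6; /* memory accesses/sec (illustrative)        */

    printf("memory cycles between DMA accesses: %.0f\n", mem_rate / sample_rate);
    printf("share of memory bandwidth stolen:   %.5f%%\n",
           100.0 * sample_rate / mem_rate);
    return 0;
}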

Reply to
Rob

Don't be silly, that's where I got it from!

Amongst other places.

Pipelining is no substitute for raw memory bandwidth. If you had that, you wouldn't bother with pipelining, or indeed cache, or even registers.

Or multiple cores. ALL these things are there to give faster access to local storage than to main storage, and the flipside is that sometimes you can't use local storage and end up chewing cycles to detect that.

It's like disk caching: a huge speed increase, till you write a whole video to the disk.

The reason we are seeing multiple cores and pipelines and cache is simply that the transistors won't go any faster, so that is the only direction left.

--
Everything you read in newspapers is absolutely true, except for the  
rare story of which you happen to have first-hand knowledge. - Erwin Knoll
Reply to
The Natural Philosopher
