Real Programmers Code Drivers!

So, is there anyone here who has what it takes to improve the SPI driver on the rPi? I'm not certain it will ever suit the needs of control systems, but there is certainly a lot of room for improvement over the existing driver. I would be interested in designing a board to go with the rPi, but it would need a better driver.

--

Rick
Reply to
rickman

Since you know so much more than anybody else, why not DYOR and FOAD?

--
Everything you read in newspapers is absolutely true, except for the  
rare story of which you happen to have first-hand knowledge. - Erwin Knoll
Reply to
The Natural Philosopher

Kind of a tall order, but it's part of what the Pi was invented for.

Mel.

Reply to
Mel Wilson

Well, I'm bit-banging SPI from userland, as the kernel driver didn't exist when I was designing my app, and I haven't updated it for a long time. Bit-banging SPI from userland works fine for me.
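
For illustration, a rough sketch of the sort of loop I mean, written with wiringPi-style digitalWrite()/digitalRead() calls for brevity. The pin numbers are placeholders and SPI mode 0 is assumed, so treat it as a sketch rather than working code:

/* Rough sketch only: clock one byte out and one byte in, SPI mode 0.
 * Pin numbers are placeholders. Call wiringPiSetup() and pinMode()
 * on each pin before using this.                                     */
#include <wiringPi.h>

#define PIN_SCLK 14   /* placeholder pin numbers */
#define PIN_MOSI 12
#define PIN_MISO 13

static unsigned char bitbangByte(unsigned char out)
{
    unsigned char in = 0;

    for (int i = 7; i >= 0; i--) {
        digitalWrite(PIN_MOSI, (out >> i) & 1);  /* present next data bit     */
        digitalWrite(PIN_SCLK, HIGH);            /* rising edge: sample point */
        in = (in << 1) | digitalRead(PIN_MISO);  /* grab the slave's bit      */
        digitalWrite(PIN_SCLK, LOW);             /* falling edge              */
    }
    return in;
}

Wrap that in CE-low / CE-high and you have a transfer.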

Out of curiosity, what's wrong with it that you want to fix?

--
Andrew Gabriel 
[email address is not usable -- followup in the newsgroup]
Reply to
Andrew Gabriel

The only issue with it is the transfer start-up latency.

Data transfer is fine - once it's going. The latency affects lots of little transfers - such as reading an ADC at more than 10K samples/sec.

This table has the details:

formatting link

however, don't trust the latency columns - I'm not 100% sure they're anywhere near accurate - the code is at

formatting link

Gordon

Reply to
Gordon Henderson

I do know that I wrote a userland program to transfer audio samples straight from microphone to headphones on an AMD64 Linux station. It coped pretty well, apart from when the program started up - then it spewed out warnings about underruns and overruns. I even measured the time taken to transfer the data. Utterly trivial. It's latency that gets you - sometimes it can be several hundred microseconds before you get your audio samples. And that's with a decent onboard sound chip, which is a lot more than an A2D.

So something in the 'load and run a program' path is occupying high-level interrupts for a long old time.

--
Everything you read in newspapers is absolutely true, except for the  
rare story of which you happen to have first-hand knowledge. - Erwin Knoll
Reply to
The Natural Philosopher

The timing issues. First, I've been told the existing examples only run at a few kB/s transfer rate. More importantly, when you are using an ADC, for many apps it is important to trigger the conversion on a regular period. Someone was looking for a 48 kHz sample rate, which would have needed to be triggered at a regular time to within better than a microsecond. With any MCU not running an OS this would be child's play. If the hardware supported it, even under Linux it shouldn't be hard. I can't yet say for sure whether the hardware supports this or not.

I was reading more in the peripherals handbook and the SPI can use DMA. I haven't found a link to a timer quite yet. I also haven't found a way to trigger the ADC from a clock and then have the SPI transaction also triggered by the clock... unless the clock creates an interrupt and the SPI transfers are done using PIO. That could work at 48 kHz, I expect. It would also give the best timing accuracy.

--

Rick
Reply to
rickman

The DMA engines in the SoC can be linked to many peripherals in the chip, but the current Linux SPI kernel driver does not use the DMA engine.

(I've just checked the source).

It works like a UART with a FIFO - it copies data from memory into the FIFO, waits for the nearly-empty interrupt, lather, rinse, repeat.

I do not believe this is the cause of the latency, as I've verified that once a transfer has started it can continue at full SPI clock speed for the duration of the transfer. This non-DMA method will consume a few more CPU cycles, but I've not noticed it on the longer transfers I've been doing (to an LCD display).

Looking at the code more, it seems to me that the way the transfer gets kicked off is to enable the transmitter with no data, then use the end-of-transfer interrupt to start the very first transfer. This may well be a contributing factor to the latency when doing small transfers - there is a comment in the code to the effect that trying to fill the FIFO first before starting the transfer doesn't work.

It would not surprise me if this was some sort of limitation in the SoC - it would not be the first one I've encountered...

Link to the source:

formatting link

Do it in a kernel module and you have the best chance of succeeding. My experiments with getting an interrupt into Linux userland work well, but are limited to about 66 kHz at 100% CPU usage (so 48 kHz isn't going to leave much headroom).

Also, the trigger can be the CE line - it will go from high to low when a transfer starts. That's supposed to be used by SPI devices to reset themselves, synchronise their clock, and enable the MISO pin on the peripheral (usually wire-OR'd with other SPI devices).
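
One way to get an interrupt into userland is just a poll() on the sysfs GPIO value file - the sketch below shows the idea. The pin number is a placeholder, and the pin is assumed to have been exported already with its edge set to "falling":

/* Block until a falling edge on an already-exported sysfs GPIO.
 * "17" is a placeholder pin number.  Beforehand:
 *   echo 17 > /sys/class/gpio/export
 *   echo falling > /sys/class/gpio/gpio17/edge                       */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <poll.h>

int waitForEdge(void)
{
    char buf[8];
    int fd = open("/sys/class/gpio/gpio17/value", O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }

    (void)read(fd, buf, sizeof(buf));            /* clear any pending edge   */

    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int r = poll(&pfd, 1, -1);                   /* sleep until the edge     */

    if (r > 0) {                                 /* edge seen: re-read value */
        lseek(fd, 0, SEEK_SET);
        (void)read(fd, buf, sizeof(buf));
    }
    close(fd);
    return (r > 0) ? 0 : -1;
}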

I've forgotten what the original poster was trying to achieve now. I've also had some email in the past few days from someone trying to synchronise 6 ADCs to do some direction finding...

This is all stuff the Pi's SoC was probably never really designed to do - I suspect that half the peripherals are on the SoC because they're left over from the last one, or are some special feature that some other customer wanted, so they may have been implemented to "just work" for some other application rather than being more general-purpose for what the Pi world is looking for...

But once upon a time they said a Pi would never drive "neopixels" and now it does...

Gordon

Reply to
Gordon Henderson

Ah - I'm only reading it once every few seconds.

--
Andrew Gabriel 
[email address is not usable -- followup in the newsgroup]
Reply to
Andrew Gabriel

In this context I'm not sure what latency is, and I don't think it actually matters. What matters is that the transfers happen with a defined period. It doesn't matter so much when the first one happens, as long as the process continues with the specified interval between transfers.

I tried looking at this source, but why do people have a love affair with dim text? Comments can be colored differently without making them hard to read... lol

More important is the low-to-high transition (end of CS or CE), which is often used to clock the data into the "active" register to do something in the peripheral. But it completely depends on the device. ADCs used for signal processing (like the faster SAR devices) have a separate convert signal which can be driven by a clock. Someone who was making the rPi into an oscope did that at 10 MHz. But then he didn't have a way to sync the data reads to the clock, lol.

Same task. His posts in another forum were the impetus for me to look into this. To be honest, the more I dig the more I think I need to add an interface device that lets the ADCs look like a FIFO. It would handle all the detailed timing. The rPi would just need to keep up with the data rate.

But at this point I am curious about the combination of timer, DMA and SPI. Or maybe SPI isn't the right interface and a parallel bus is the way to go.

"Designed to do" often has little meaning in MCUs. They toss a bunch of generic stuff together to address a market position and let the developers figure out how best to use it.

--

Rick
Reply to
rickman

I don't find it hard to read, maybe your screen is too bright? Anyway you can click the "Raw" button to view it unformatted in your browser, or download it and open in your editor of choice.

Reply to
Rob Morley

It's the time from when your program says "transfer these 3 bytes over the SPI bus" (which at the same time reads 3 bytes in) to when the call returns to your program. If this time is the same as the time to send those 3 bytes over the bus, then your effective rate is halved. Or worse, if the latency is higher.

If copying a million bytes in one operation then it's not an issue - only when sending a very small number of bytes.
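
If you want to see the overhead for yourself, timing a tiny transfer through the spidev userland interface will show it. A minimal sketch, assuming /dev/spidev0.0, a 3-byte payload and a 1 MHz clock purely as examples:

/* Time one small spidev transfer to see the per-call overhead.
 * /dev/spidev0.0, the payload and the 1 MHz clock are just examples. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/spi/spidev.h>

int main(void)
{
    int fd = open("/dev/spidev0.0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    uint8_t tx[3] = { 0x01, 0x80, 0x00 }, rx[3];
    struct spi_ioc_transfer tr;
    memset(&tr, 0, sizeof(tr));
    tr.tx_buf        = (unsigned long)tx;
    tr.rx_buf        = (unsigned long)rx;
    tr.len           = sizeof(tx);
    tr.speed_hz      = 1000000;
    tr.bits_per_word = 8;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (ioctl(fd, SPI_IOC_MESSAGE(1), &tr) < 0)
        perror("SPI_IOC_MESSAGE");
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long us = (t1.tv_sec - t0.tv_sec) * 1000000L +
              (t1.tv_nsec - t0.tv_nsec) / 1000L;
    printf("3-byte transfer took %ld us (pure clocking at 1 MHz is ~24 us)\n", us);

    close(fd);
    return 0;
}

Anything much over the ~24 us of pure clocking is the start-up latency we're talking about.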

Here's a thought:

"Manually" Bit-bang SPI. Not hard.

Have 6 SPI ADCs. Connect their MOSI pins together to one gpio pin on the Pi.

Connect their clock pins together to one pin on the Pi.

Connect their CE lines together to one pin on the Pi.

Connect their MISO pins to 6 separate pins on the Pi.

So now you have 3 output pins and 6 input pins.

Make sure the clock & CE lines are the right polarity to start with. (clock low, ce high I think)

Assert the CE line (take it low), then clock out the command that tells the ADCs to start the sample. They will all get this at the same time (to within the length of the wires!)

At the same time (as your program is wiggling the clock), shift in the 6 inputs into 6 variables. One bit out, one bit in - that's how SPI works.

And hey presto you've just done 6 concurrent readings.
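
For illustration, a sketch of one such concurrent read, again with wiringPi-style calls - the pin numbers, word length and command bits are placeholders:

/* Sketch: one concurrent read of 6 SPI ADCs sharing SCLK, MOSI and CE.
 * Pin numbers, word length and the command are placeholders.         */
#include <wiringPi.h>

#define PIN_SCLK 14
#define PIN_MOSI 12
#define PIN_CE   10
static const int misoPin[6] = { 0, 1, 2, 3, 4, 5 };   /* placeholders */

static void readAll(unsigned cmd, int bits, unsigned result[6])
{
    for (int ch = 0; ch < 6; ch++)
        result[ch] = 0;

    digitalWrite(PIN_CE, LOW);                       /* start the transfer    */
    for (int i = bits - 1; i >= 0; i--) {
        digitalWrite(PIN_MOSI, (cmd >> i) & 1);      /* same command to all   */
        digitalWrite(PIN_SCLK, HIGH);
        for (int ch = 0; ch < 6; ch++)               /* one bit from each ADC */
            result[ch] = (result[ch] << 1) | digitalRead(misoPin[ch]);
        digitalWrite(PIN_SCLK, LOW);
    }
    digitalWrite(PIN_CE, HIGH);                      /* end of transfer       */
}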

The clock speed will be limited by the basic GPIO software pin wiggle speed - you're probably not going to clock it much faster than 5MHz in software.

Then you just need to accurately time the above operation for the duration, storing values every sample period (e.g. 125 microseconds for 8 kHz sampling). You could crudely sit in a loop: work out the next time using gettimeofday() and timeradd() with 125 us, do the sample, then spin on gettimeofday() until it's >= the next time (use timercmp(), I think), and repeat for the number of samples. It won't be as accurate as a hardware timer, but it might be good enough.
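
In code, that crude pacing loop might look something like the sketch below - 125 microseconds here for 8 kHz, and doSample() is just a placeholder for the concurrent read above:

/* Crude pacing loop: one sample set every 125 us (8 kHz).
 * doSample() is a placeholder for the 6-channel read above.          */
#define _DEFAULT_SOURCE          /* for timeradd()/timercmp() on glibc */
#include <sys/time.h>

void doSample(int n);            /* placeholder */

void sampleLoop(int nSamples)
{
    struct timeval now, next, period = { 0, 125 };   /* 125 us period */

    gettimeofday(&next, NULL);
    for (int n = 0; n < nSamples; n++) {
        timeradd(&next, &period, &next); /* when the next sample is due */
        doSample(n);                     /* read and store the 6 ADCs   */
        do {                             /* spin until that time        */
            gettimeofday(&now, NULL);
        } while (timercmp(&now, &next, <));
    }
}

It burns a whole core spinning, but for a few seconds of capture that may not matter.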

Then process the 6 arrays of sampled data and point an arrow in the direction...

Gordon

Reply to
Gordon Henderson

The sampling jitter caused by scheduling nondeterminacy in Linux is the problem, and it is all but unavoidable.

It's why I earlier suggested that the simplest way to get deterministic timing is to do the sampling with a "bare metal" microcontroller, then read the samples with the Pi user process.

The cost of this solution is negligibly more, and the improvement in accuracy and simplicity is priceless. ;-)

Processors are cheap--use the appropriate one for each job.

--
-michael - NadaNet 3.1 and AppleCrate II: http://home.comcast.net/~mjmahon
Reply to
Michael J. Mahon

+1

:-)

Reply to
mm0fmf

Yes, but why add hardware if it is not needed? I'm not yet convinced that the hardware in the BCM chip won't do the job. It has all the components; it could just be a matter of figuring out how to use them. Even if I add an MCU, or better an FPGA, I still have to figure out how to make it all work. Even on an MCU you can't do it all in software; the timing just won't work well enough.

--

Rick
Reply to
rickman

Are you saying the text is not a grey color in your browser? I suppose different screens show it differently, but this just looks absurd to me. Heck, I once refused to sign a contract because they wrote the "small print" in, well... small, grey print. I don't get the rationale.

I didn't see the RAW button. I'll have to download the file and open it in my text editor.

--

Rick
Reply to
rickman

It is grey on white, but it's clear and easy to read. Although obviously I'd prefer C source code to be displayed as green on black, in vi.

Reply to
Rob Morley

It all depends on how you want to spend your time--and whether any capability you might discover will be stable through software/hardware revisions.

--
-michael - NadaNet 3.1 and AppleCrate II: http://home.comcast.net/~mjmahon
Reply to
Michael J. Mahon

The comments are light grey, pre-processor directives dark grey, statements and operators black, types blue, identifiers burgundy, integer constants teal and string constants red ...

Reply to
Andy Burns

I find the light grey hard to read.

--

Rick
Reply to
rickman
