Linux serial port dropping bytes

Derek Young · 2008-03-31T21:37:54+00:00

Hi all,

I'm using an Arcom Viper PXA255 single board computer running Linux, and I'm trying to receive bytes over the built-in RS-422 serial port at 921.6 kbps. The received packets are 1500 bytes long. The UART is a 16C2850 (dual port version of the 16C850 with 128 byte FIFO).

Everything works fine at low speeds, but if the packet rate increases, I start to lose bytes.

I believe this is caused by the Linux serial port interrupt handler not responding fast enough. Some testing revealed that this problem only occurs when other peripherals (such as the flash drive or ethernet port) are active.

I searched online and found a utility called IRQtune which promises to fix exactly this sort of problem by allowing control of ISR priorities in Linux. Unfortunately, IRQtune only works for x86 processors.

Does anybody know of a simple way to modify the priorities of interrupts in Linux? I don't mind recompiling the kernel, but I don't have a lot of expertise in that sort of thing. (In fact, I would have preferred skipping the whole OS thing and implementing the entire solution on an FPGA/uController, but oh well... one doesn't always get to choose.)

Any advice would be greatly appreciated. Thanks!

Derek

--
remove the .nospam and rearrange appropriately to e-mail.


Didi 18 years ago

The hardware can do what you are after at < 10% overhead (more like 1%). The OS (or should we call it an inOS?) or any software can be written in a way to make any hardware unusable, of course.

Not long ago I used a 400 MHz MPC5200; part of what it did was to continuously (no pauses at all) update a serial DAC at approx. 16 Mbps and read 4 ADCs at another 16 Mbps (4 Mbps per ADC, that is, but going over a single 16 Mbps link). The CPU does it all with a fraction of its resources, and I actually used *no* interrupts (this was for fun/experiment). I did not try the UART at >76800 bps (had no faster port at the other end), but it had plenty of margin (and 9600 would have been plenty for the application). Of course all serial ports work simultaneously.

Now the XScale is not a PPC, but even if it were 16 times slower at the same clock rate it would still be sufficient for your 1 Mbps with plenty of margin. So clearly the software is the limiting factor.

Well, more and more people seem to fall for things like that. It seems the popularity of words like Windows and Linux, and being exposed all day to colourful websites, makes people think everything will just work no matter what - which sometimes is far from being true... :-).

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments
------------------------------------------------------


David Brown 18 years ago

In this case (a fast UART on a Linux system), you don't want to do anything with the incoming data except buffer it - processing is done in a different process/thread, as you suggest. But it's worth noting that in some embedded systems, it makes a lot of sense to do more specific handling of data during interrupt routines - interrupt handlers do not necessarily need to be as fast as possible, only as fast as necessary. If you have a system where you have better knowledge of the interrupts, the response times, and the required times, then you are free to do all the work you want during an interrupt routine.


Ulf Samuelsson 18 years ago

DMA is way superior to FIFO, since you dump to memory in the background. The AT91 (and AVR32 implementation) support a Timeout interrupt which is triggered if NO characters arrive in a certain number of bit periods.

What really limits the speed in Linux is the error handling. If you want to do proper error handling, you typically have to handle the error before the next character arrives, and this is pretty difficult in Linux, and will severely limit the speed.

Best Regards, Ulf Samuelsson This is intended to be my personal opinion which may, or may not be shared by my employer Atmel Nordic AB


Didi 18 years ago

Why would you want to do it there? Interrupts are meant to be as short as possible and do only what cannot be done outside their handlers - this is fundamental to programming. I know it can be done otherwise, and I know people get away with such a mess without direct consequences because most of the hardware nowadays is 10x to 1000+x overkill, but why want to do it so?

Dimiter


Grant Edwards 18 years ago

There are a lot of micro-controllers out there with horribly designed UARTs in them. One I've fought with recently is the one in the Samsung S3C4530. Half of the features don't work at all. Half of the stuff that does work is useless because whoever specified/designed the UART had never actually done any serial communications and only had a vague understanding of how things like UARTs and FIFOs are used. For example, it has FIFOs, but there's no way to flush them. It also has "hardware flow control", but it doesn't work in a way that can be used with any other UART on the planet.

Grant Edwards (grante at visi.com)
Yow! ... the MYSTERIANS are in here with my CORDUROY SOAP DISH!!


John Devereux 18 years ago

How about decoding SLIP or similar? If you wait until the end of the frame, you have to have double the buffer size to cope with the worst case scenario. If decoded "inline", in the irq handler, the maximum size is just that of the decoded data.
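As a rough illustration of decoding "inline", here is a minimal sketch of a SLIP receiver that un-escapes each byte as it arrives, so the buffer only ever holds decoded data. The special byte values are the standard SLIP ones (RFC 1055); the names `slip_rx_t`, `slip_rx_byte` and the `FRAME_MAX` limit are made up for this example, and the call from a real UART interrupt handler is left out because it is hardware specific.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define SLIP_END     0xC0u  /* frame delimiter */
#define SLIP_ESC     0xDBu  /* escape introducer */
#define SLIP_ESC_END 0xDCu  /* ESC followed by this means a literal 0xC0 */
#define SLIP_ESC_ESC 0xDDu  /* ESC followed by this means a literal 0xDB */

#define FRAME_MAX 64        /* hypothetical maximum decoded frame size */

typedef struct {
    uint8_t buf[FRAME_MAX]; /* holds decoded bytes only */
    size_t  len;            /* bytes decoded so far in the current frame */
    bool    in_escape;      /* previous byte was SLIP_ESC */
    bool    overflow;       /* frame exceeded FRAME_MAX; dropped at END */
} slip_rx_t;

/* Feed one received byte (intended to be called from the receive handler).
 * Returns the length of a completed frame, or 0 if no frame is ready.
 * The frame stays valid in rx->buf until the next byte is fed in. */
size_t slip_rx_byte(slip_rx_t *rx, uint8_t b)
{
    if (b == SLIP_END) {
        size_t done = rx->overflow ? 0 : rx->len;
        rx->len = 0;
        rx->in_escape = false;
        rx->overflow = false;
        return done;
    }
    if (rx->in_escape) {
        rx->in_escape = false;
        if (b == SLIP_ESC_END)
            b = SLIP_END;
        else if (b == SLIP_ESC_ESC)
            b = SLIP_ESC;
        /* any other byte after ESC is stored unchanged in this sketch */
    } else if (b == SLIP_ESC) {
        rx->in_escape = true;
        return 0;
    }
    if (rx->len < FRAME_MAX)
        rx->buf[rx->len++] = b;
    else
        rx->overflow = true;
    return 0;
}
```

There are no loops in the per-byte path, only a few branches, so the cost per interrupt stays small and bounded.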

Also protocols like Modbus need to have protocol-level decisions (e.g. about timing) done in the ISR. It doesn't work to have a generic "read block" performed by the ISR followed by decoding at the task level.

(Obviously all this depends on your definition of ISR, since no doubt it can all be done at "task level" with a good enough RTOS. But in that case the "task" is really just another type of ISR, isn't it?)

John Devereux


John Devereux 18 years ago

Everything with an "industry standard '550 uart" is horrible.

John Devereux


Didi 18 years ago

Hi John,

Actually it takes very little more than that - a few bytes - and you will not need to do it in the handler. Even a 16 byte FIFO organized in memory will allow you to do it "normally" and only queue the incoming data into the FIFO. But OK, I can see your point. I don't know SLIP, but I have done PPP and this is doable since you will not enter a loop anywhere in the handler, just make it a bit branchier. Not much more than queueing the data. It can be a valid choice, I agree - although it should be taken only if there is a good enough reason not to take the other one, e.g. you do need those 16 or so bytes, or doing so lets you squeeze out the last drop of CPU performance and you need that drop, etc.

Well no, the "task" can have a lot worse a latency than the ISR - in the example above, 16 times. Make that 256 times if you can afford a 256 byte deep queue (FIFO). Then you can spend this latency on multitasking or whatever you can use it for in the particular design.
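The "FIFO organized in memory" can be sketched as a small single-producer, single-consumer ring buffer: the ISR only queues bytes, and the task drains them whenever it gets to run. This is an illustration, not anyone's actual code; the names `rxq_t`, `rxq_put` and `rxq_get` are invented here, and it assumes a single core where `volatile` indices are enough (other systems may need memory barriers).

```c
#include <stdint.h>
#include <stdbool.h>

#define RXQ_SIZE 16u                 /* queue depth; must be a power of two */
#define RXQ_MASK (RXQ_SIZE - 1u)

typedef struct {
    volatile uint8_t  data[RXQ_SIZE];
    volatile uint16_t head;          /* written only by the ISR */
    volatile uint16_t tail;          /* written only by the task */
} rxq_t;

/* ISR side: queue one byte. Returns false (byte dropped) if the queue is full. */
bool rxq_put(rxq_t *q, uint8_t b)
{
    if ((uint16_t)(q->head - q->tail) >= RXQ_SIZE)
        return false;
    q->data[q->head & RXQ_MASK] = b;
    q->head++;
    return true;
}

/* Task side: fetch one byte. Returns false if the queue is empty. */
bool rxq_get(rxq_t *q, uint8_t *out)
{
    if (q->head == q->tail)
        return false;
    *out = q->data[q->tail & RXQ_MASK];
    q->tail++;
    return true;
}
```

With a depth of 16 the task may be up to 16 character times late before a byte is lost; raising `RXQ_SIZE` to 256 stretches that tolerance accordingly.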

Dimiter


Anton Erasmus 18 years ago

This can be surprisingly slow. On a recent project I used an STR9 ARM MCU with the onboard UARTs as well as an external Exar UART. On the Exar UART one could read the number of characters available in the RX FIFO; on the MCU UARTs one can only check for character available, or FIFO full. So with the Exar one could:

    read number of chars available
    repeat
        read char
        if buffer not full, stuff into buffer
    until chars read

This turned out to be 5x faster than with the onboard UARTs where one had to check the FIFO not empty flag every time. On a 50MHz ARM9 it took about 25us per 16 characters having to do it the way CBFalconer described it, while it only took about 5us per 16 characters where one could read how many chars were in the Rx FIFO.
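The two ways of draining the FIFO can be compared with a small model. The `sim_uart_t` structure below is a hypothetical stand-in for the real memory-mapped registers (which differ per device), and it only counts status register accesses; it does not reproduce the timings quoted above, which depend on actual bus access cost.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UART model standing in for memory-mapped registers. */
typedef struct {
    uint8_t  fifo[16];
    unsigned count;        /* bytes currently in the RX FIFO */
    unsigned rd;           /* read position */
    unsigned status_reads; /* instrumentation: status/level register accesses */
} sim_uart_t;

unsigned uart_rx_not_empty(sim_uart_t *u) { u->status_reads++; return u->count != 0; }
unsigned uart_rx_level(sim_uart_t *u)     { u->status_reads++; return u->count; }
uint8_t  uart_read_data(sim_uart_t *u)    { u->count--; return u->fifo[u->rd++ % 16u]; }

/* Style 1: check the "not empty" flag before every single byte. */
size_t drain_by_flag(sim_uart_t *u, uint8_t *dst, size_t room)
{
    size_t n = 0;
    while (n < room && uart_rx_not_empty(u))
        dst[n++] = uart_read_data(u);
    return n;
}

/* Style 2: read the fill level once, then copy that many bytes. */
size_t drain_by_count(sim_uart_t *u, uint8_t *dst, size_t room)
{
    size_t avail = uart_rx_level(u);
    if (avail > room)
        avail = room;
    for (size_t n = 0; n < avail; n++)
        dst[n] = uart_read_data(u);
    return avail;
}
```

For a full 16-byte FIFO the first style touches the status register 17 times and the second only once, which is where the saving comes from when each peripheral access is slow.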

Regards Anton Erasmus


Hans-Bernhard Bröker 18 years ago

Of course. But sometimes "what cannot be done outside" does include some, or even all the processing of that incoming data. Resources or response time constraints might not allow any other approach.

For just one example, let's consider that you're running XON/XOFF flow control on a plain old RS232 link. That would mean even an interrupt handler that would normally just stuff each byte received by the UART into some software FIFO, had better look at the actual character, too, to check if it's XOFF.
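A minimal sketch of that idea: the receive handler looks at every byte, acts on XON/XOFF immediately, and queues everything else. The control codes are the usual DC1/DC3 values; `port_t`, `rx_isr_byte` and the fixed 32-byte buffer are assumptions made for the example.

```c
#include <stdint.h>
#include <stdbool.h>

#define XON  0x11u  /* DC1: peer allows us to send again */
#define XOFF 0x13u  /* DC3: peer asks us to stop sending */

typedef struct {
    volatile bool tx_paused; /* checked by the transmit side before sending */
    uint8_t  buf[32];        /* stands in for the software receive FIFO */
    unsigned len;
} port_t;

/* Called from the receive interrupt with the byte just read from the UART. */
void rx_isr_byte(port_t *p, uint8_t b)
{
    if (b == XOFF) { p->tx_paused = true;  return; }
    if (b == XON)  { p->tx_paused = false; return; }
    if (p->len < sizeof p->buf)
        p->buf[p->len++] = b;   /* ordinary data: just queue it */
}
```

The flow control bytes never reach the data buffer, and the transmitter sees the pause request without waiting for any task to run.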


CBFalconer 18 years ago

That action is a 'handler'. Bear in mind that errors need to be tied to the appropriate point in the buffered stream.

This is fine IF you have adequate processing time left after the interrupts. You always have to keep an eye on the available throughput for the system.

[mail]: Chuck F (cbfalconer at maineline dot net) [page]: Try the download section.


CBFalconer 18 years ago

The ONLY reason for that attack is to preserve a maximum of cpu time for the 'unknown' projects. If you know precisely what the system has to do, the most efficient mechanism is desirable. It may also make the system better partitioned. For example, in the serial input scheme, nothing outside the interrupt system needs to know anything about the i/o ports, etc.


CBFalconer 18 years ago

... snip ...

Well, I am not familiar with the particular hardware the OP is using, and having started out with a ridiculous misapprehension doesn't help my reliability reputation. However, I have just been trying to point out the sort of things to consider.


Didi 18 years ago

Yes, XON/XOFF processing obviously falls in the category of things you cannot do outside of the IRQ handler in a reasonable way.

Another example along those lines, which demonstrates even further how the work within the ISR must be minimised, is related to sending the XOFF when your ISR sees that the FIFO is nearly full. Generally you cannot send the XOFF from within the receive ISR because you may have to wait almost an entire serial character time for the UART to be able to take it - this would be about 1 ms at 9600 bps. The way it is done (OK, the way I do it) is to flag the fact that XOFF is due, and make sure the transmit IRQ will be asserted as soon as possible (i.e. after the current character has gone out or immediately); then, within the imminent transmit IRQ handler the flag is detected and the first thing which is sent - overriding the output queue and possibly some Tx FIFO - is an XOFF.
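The flag-and-defer scheme can be sketched roughly as follows. This is only an illustration of the idea, not the original code: `chan_t`, the `RX_HIGH_WATER` threshold and the `tx_irq_enabled` field (standing in for the UART's transmit interrupt enable bit) are all assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

#define XOFF 0x13u
#define RX_HIGH_WATER 12u        /* hypothetical threshold for a 16-byte queue */

typedef struct {
    unsigned rx_fill;            /* bytes waiting in the software receive queue */
    volatile bool xoff_due;      /* set by the RX ISR, cleared by the TX ISR */
    bool xoff_sent;              /* avoids requesting XOFF over and over */
    volatile bool tx_irq_enabled;/* stands in for the TX interrupt enable bit */
    const uint8_t *tx_data;      /* normal output queue */
    unsigned tx_len, tx_pos;
} chan_t;

/* RX ISR: never waits for the transmitter, only raises a flag. */
void rx_isr(chan_t *c)
{
    c->rx_fill++;                /* the byte itself is queued elsewhere */
    if (c->rx_fill >= RX_HIGH_WATER && !c->xoff_sent && !c->xoff_due) {
        c->xoff_due = true;
        c->tx_irq_enabled = true;   /* make the TX interrupt fire as soon as possible */
    }
}

/* TX ISR: produces the next byte for the transmit register.
 * Returns false when there is nothing left to send. */
bool tx_isr(chan_t *c, uint8_t *out)
{
    if (c->xoff_due) {           /* XOFF jumps ahead of the output queue */
        c->xoff_due = false;
        c->xoff_sent = true;
        *out = XOFF;
        return true;
    }
    if (c->tx_pos < c->tx_len) {
        *out = c->tx_data[c->tx_pos++];
        return true;
    }
    c->tx_irq_enabled = false;
    return false;
}
```

Neither handler contains a wait loop; the worst case delay before XOFF goes out is one character time, spent outside any ISR.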

I believe this illustrates pretty well what I mean by saying:

I wish I could remember when I first did it this way - was it on a 6800 or on a 6809, I really don't know. Perhaps a 6800; I did a terminal with it around 1985, maybe it was then. There is not such a demand for XON/XOFF things nowadays; UARTs, however popular still, seem to be in the "phase out" phase - which may take a few decades, that is :-).

Dimiter


Robert Adsett 18 years ago

Of course this assumes the UART can tell you how many characters are in the FIFO. I don't remember any of the ones I've used having that capability (not to suggest they don't exist, just that it's not rare for it not to be there).

Robert

Posted via a free Usenet account from http://www.teranews.com


Didi 18 years ago

Well, the thing is, doing it this way one typically wastes more resources. Let us stick to the UART example and PPP. If you do it all within the ISR, this means you will have to add at least the following overhead for each incoming character:

-retrieve/update the current state (flag $7e seen/not etc.)

-retrieve/update the pointer where to put the character

-retrieve some CRC table pointer

-retrieve/update the CRC value itself

-save/restore some more registers needed to do the above.

If you do this on a larger block of characters (tens or hundreds), you will do the above tens or hundreds of times less.
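To make the block argument concrete, here is a sketch of a PPP-style 16-bit FCS computed over a whole block at task level. The running CRC value and the data pointer live in local variables for the entire loop, so they are loaded and stored once per block rather than once per interrupt. The constants are the standard PPP FCS-16 ones (RFC 1662); the bitwise loop is a table-free variant chosen to keep the example short, and `fcs16_block` is a name invented here.

```c
#include <stddef.h>
#include <stdint.h>

#define PPP_FCS_INIT 0xFFFFu  /* initial FCS value */
#define PPP_FCS_GOOD 0xF0B8u  /* result when data plus its FCS is checked */

/* Update a running FCS-16 over n bytes. Call repeatedly with the previous
 * return value to process a frame in several blocks. */
uint16_t fcs16_block(uint16_t fcs, const uint8_t *p, size_t n)
{
    while (n--) {
        fcs ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            fcs = (fcs & 1u) ? (uint16_t)((fcs >> 1) ^ 0x8408u)
                             : (uint16_t)(fcs >> 1);
    }
    return fcs;
}
```

A receiver would run this over everything between the flag bytes after un-escaping, including the two FCS bytes, and accept the frame if the result equals `PPP_FCS_GOOD`.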

As usual, things must be considered on a per case basis, but the general rule stands - do in the ISR only what you cannot do elsewhere. If you think doing more there will save you time or effort, think again :-). I don't claim there can be no exceptions to this rule, of course, but in the vast majority of cases this would be both the best and likely the simplest way to do it (although it takes some less straightforward/layered thinking, it is in effect the simpler way).

Dimiter


Mel Wilson 18 years ago

Particular cases can be pretty overwhelming. We have an application where serial data comes in containing a distinguished end-of-frame value, and some escapes that ensure that data bytes are never mistaken for end-of-frame markers. Looking at generated code for PIC18s or AVRs shows that:

- retrieving a state flag (e.g. in-escape/not-in-escape) is one of the cheapest things we do -- single byte fetch from a known location

- subscripted access to a data buffer is one of the most expensive things we do

So it makes sense to un-escape our data and detect frame boundaries while the incoming bytes are in our hands -- in the interrupt routine. Less time wasted on laborious address arithmetic. It helps that processing this serial input is the critical task for this application. There is not a more important task that would be kept waiting.

Mel.


David Brown 18 years ago

No, interrupts are not "meant to be as short as possible" - they are a feature of the processor, to be used as appropriate in the given system. If you have an OS handling multiple threads or processes, then normally you want your interrupts to be as short as possible. But there are many different ways to structure the software in an embedded system, and doing real work during interrupt routines is a perfectly good way to do it - as long as you are aware of the issues.

Interrupts can give you many of the benefits of a multi-threading RTOS while keeping the system as simple as possible - they let you do things in the background. I've written systems where the communication system is handled entirely within the interrupt routines - telegrams are checked as they come in, handled when the packet ends, and replies sent out again from within the interrupt routines. In fact, I've written systems where the *entire* running program is handled by interrupts - the "main loop" is nothing more than a "sleep()" function.

Clearly, you need to think about response times, sharing data, nested interrupts, and many other issues with interrupts - the more work you do during the interrupt routines, the more relevant these issues become. But interrupt routines can give you a convenient event-response structure that is fast and built into the hardware - why bother doing things indirectly (interrupt routines setting flags to be read by other threads) or using extra RTOS software if it's not actually necessary?


Didi 18 years ago

Yes they are. This is fundamental. They can be misused and made lengthier - many people do it - and up to a point this may even be reasonable.

What advantages did this approach buy you compared to polling?

Dimiter


David Brown 18 years ago

No, this is not "fundamental". The only thing "fundamental" in embedded programming is that the best solution depends on the system you are working on.

They can be used in different ways for different purposes - just because a particular tool is used in a way you are not familiar with, does not make it "misuse".

The code was smaller, faster, neater, clearer, and spent more of the time asleep.
