UART design considerations

I've implemented a simple software-based UART on a 50 MHz Parallax uC. I'm operating it at a baud rate of 2 Mbps. It talks to an FTDI FT232BM chip in a USB-serial converter. While I've done extensive testing of the transmit portion of my code, I'm still working on and refining my RX code.

Most UART implementations use an oversampling method (I've seen it called super-sampling or majority/center sampling as well) where the UART samples each bit between 8 and 16 times, and then uses that to determine the transmitted bit value. My simple UART samples just once per bit.

My pseudocode looks like this:

(interrupts are off)

wait for transition high to low     ' look for start bit
NOP delay x cycles to middle of 1st bit

for 8 bits
    sample the bit
    NOP delay rest of bit period
next

NOP delay to eat stop bit
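
In C terms, the same routine is roughly the following (a sketch only: read_rx() and delay_cycles() stand in for the real pin read and NOP delays, and the cycle counts are placeholders that ignore per-instruction overhead):

    #include <stdint.h>

    #define TO_MID_BIT0   37    /* ~1.5 bit times in 20 ns cycles; placeholder, ignores loop overhead */
    #define ONE_BIT       25    /* 500 ns bit / 20 ns cycle; placeholder, ignores sampling overhead   */

    extern int  read_rx(void);          /* hypothetical: RX pin level, 0 or 1      */
    extern void delay_cycles(int n);    /* hypothetical: burn n instruction cycles */

    /* One-sample-per-bit polled receive of a single byte. */
    uint8_t rx_byte(void)
    {
        uint8_t byte = 0;

        while (read_rx() != 0)              /* wait for the high-to-low start edge */
            ;
        delay_cycles(TO_MID_BIT0);          /* land in the middle of data bit 0    */

        for (int bit = 0; bit < 8; bit++) {
            byte |= (uint8_t)(read_rx() << bit);   /* sample once; UART sends LSB first   */
            delay_cycles(ONE_BIT);                 /* step to the middle of the next bit; */
        }                                          /* after bit 7 this eats the stop bit  */
        return byte;
    }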

What PRACTICAL problems could show up by not oversampling? My cable distances are about 3 inches. The FTDI converter uses a 6 MHz crystal, and the uC uses a 50 MHz Murata resonator.

By waiting for the transition, I'm in effect syncing to the start bit. So how much could my clock (or the FTDI's clock) really drift in 8 (or 10, if you include start/stop) bit times? Even if there is a slight change in the transmitted bit period, wouldn't it have to be a huge error in the same direction before I end up sampling in the wrong place?

I'm not arguing with the time-tested methods that UARTs use; I just don't understand them in the context of today's hardware.

Note that this is a hobby project, not a commercial or space shuttle application, so simplicity here is really the deciding factor for me.

Thanks

Keith

Reply to
Keith M

AFAIK the standard UART behavior is to sample the _three_ intervals in the middle of a bit, and use a majority vote to decide on a 0 or a 1. If your signal is clean enough it doesn't make any difference at all to sample once or thrice.

I think it's really a leftover from 300 baud modems working over phone lines -- I don't think that there are many point-to-point transmission environments where the signal is much improved by the majority-three voting. Doing a majority-16, or better yet doing an integrate-and-dump would, indeed, help, but not a majority-three, IMHO.
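
For concreteness, the three-sample vote amounts to something like this in C (a sketch; read_rx() is just a stand-in for reading the RX pin level near the bit center):

    extern int read_rx(void);   /* hypothetical: returns the RX pin level, 0 or 1 */

    /* Majority vote over three samples taken back-to-back near the bit center. */
    int sample_bit_majority3(void)
    {
        int sum = read_rx() + read_rx() + read_rx();
        return (sum >= 2) ? 1 : 0;   /* two or more highs decide a 1 */
    }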

--
Tim Wescott
Control systems and communications consulting
http://www.wescottdesign.com

Need to learn how to apply control theory in your embedded system?
"Applied Control Theory for Embedded Systems" by Tim Wescott
Elsevier/Newnes, http://www.wescottdesign.com/actfes/actfes.html
Reply to
Tim Wescott

Sure.

Majority-three doesn't sound too hard to implement. Sampling any more than just a few times probably wouldn't work: 16 * 2 Mbps = 32 MHz, which doesn't leave much time for anything else.

I don't have enough practical experience to know really what can go wrong.

My application will checksum the received data, which will be sent in relatively small groups, and it will retransmit any failed groups. It will be transmitting at 2 Mbps with no flow control, but I'm not concerned. The uC will be dedicated to the task of receiving the data and storing it. I have serial memory, so I might just store it as it's received, in real time, bit by bit.
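
Which checksum isn't specified here; a plain 8-bit additive check over each group would be one minimal option, sketched below. The receiver sums the group including the appended check byte and asks for a retransmit if the result is non-zero:

    #include <stdint.h>
    #include <stddef.h>

    /* 8-bit additive checksum over one group.  The sender appends the two's
       complement of the sum, so a good group (data + check byte) sums to zero. */
    uint8_t group_check_byte(const uint8_t *buf, size_t len)
    {
        uint8_t sum = 0;
        while (len--)
            sum += *buf++;
        return (uint8_t)(0 - sum);
    }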

Thanks for the conversation.

Keith

Reply to
Keith M

I have never seen a UART that samples a bit more than once. The reason for the 8x or 16x clock is to detect the edge of the start bit with enough accuracy to then time out to the middle of the bit. Although you would not expect much drift in 10 bit times, the application of the standard UART does not always require a crystal and so it is tolerant of about 2% variation on each end. To get that level of tolerance, you need to start pretty much in the middle of the bit time. Then in 10 bits you won't be outside of the last bit.
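
In C, that 16x scheme looks roughly like this (a sketch only: rx_sample_tick() is assumed to be called at 16x the baud rate, and read_rx() stands in for reading the pin level):

    #include <stdint.h>

    extern int read_rx(void);     /* hypothetical: returns the current RX pin level, 0 or 1 */

    static enum { IDLE, START, DATA, STOP } state = IDLE;
    static int tick, bitnum;
    static uint8_t shifter;

    /* Call at 16x the baud rate.  Returns -1 while idle/receiving,
       or the received byte (0..255) once a whole character has framed. */
    int rx_sample_tick(void)
    {
        switch (state) {
        case IDLE:
            if (read_rx() == 0) { state = START; tick = 0; }   /* edge seen, within 1/16 bit */
            break;
        case START:
            if (++tick == 8) {                        /* roughly the middle of the start bit  */
                if (read_rx() != 0) state = IDLE;     /* false start (spike): go back to idle */
                else { state = DATA; tick = 0; bitnum = 0; shifter = 0; }
            }
            break;
        case DATA:
            if (++tick == 16) {                       /* middle of the current data bit */
                tick = 0;
                shifter |= (uint8_t)(read_rx() << bitnum);   /* UART sends LSB first */
                if (++bitnum == 8) state = STOP;
            }
            break;
        case STOP:
            if (++tick == 16) {                       /* middle of the stop bit */
                state = IDLE;
                if (read_rx() == 1) return shifter;   /* framing OK: deliver the byte */
            }
            break;
        }
        return -1;
    }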

So how much resolution do you get when you "wait" for the start bit transition? What is the time of the loop? Try testing it by waiting for the start bit and then toggling an output bit. In fact, you can replace the "sample the bit" with "toggle output" to see how consistently you find the middle of the bit. A scope won't lie.

BTW, by resonator, do you mean one of those ceramic things? My understanding is that they can vary a lot with temperature and from part to part. So you might actually miss the tolerance you need to make this work correctly. But some people use the term "resonator" to mean a crystal which would be much more accurate.

I don't think you will see very much noise, just timing issues.

Rick

Reply to
rickman

The AVRs, at least, also pull multiple samples at the expected center of the start bit after edge detection, and then also at the computed center of each data bit. For example, see page 148 of the data sheet

--
Rich Webb     Norfolk, VA
Reply to
Rich Webb

You use the three-sample majority voter mentioned before to get rid of noise in most UARTs, AFAIK. Some advanced UARTs detect edges which are a little off, and will add/subtract a sample from that bit to adapt to the incoming speed.

Instead of sampling, you could just detect the time when there is an edge on RXD and then make the decision from that info.

--
Best Regards,
Ulf Samuelsson
This is intended to be my personal opinion which may,
or may not be shared by my employer Atmel Nordic AB
Reply to
Ulf Samuelsson

The 80C51 UARTs have always used x3 mid-bit samples, and the AVRs followed that lead.

It is quite common for the simplest SW UARTs to just fire at T = 1.5 bit times after the start edge and then every bit time thereafter (i.e. not bother with any oversampling).

Don't forget to start the search for the START bit edge in the MIDDLE of the stop bit.

A more thorough design will have false-start detection (i.e. oversample the start bit) to make sure a spike does not trigger a byte RX.

The new Atmel ATxmega datasheet has a good UART section; it covers the x16 and x8 modes and the clock precision required. Makes a good reference.

-jg

Reply to
-jg

From page 7 of the PDF (the MAX3100 datasheet):

"An internal clock samples data at 16 times the data rate. The start bit can occur as much as one clock cycle before it is detected, as indicated by the shaded portion. The state of the start bit is defined as the majority of the 7th, 8th, and 9th sample of the internal 16x baud clock. Subsequent bits are also majority sampled."

I must not be reading this right. Or is this just the exception?

Great question. I'm hoping to get into the down and dirty with this.

I'm at 50 MHz, so my instruction execution cycle is 20 ns. My bit period is 500 ns for a 2 Mbps baud rate. And remember this is _not_ RS232, so forget about inverting stuff.

waitforstartbit:
    SNB  receivepin        ; if receivepin is LOW, skip the next instruction
    JMP  waitforstartbit

The execution time would be 2 cycles (2 * 20 = 40 ns) if receivepin was low at the top of the code, or 4 cycles (4 * 20 = 80 ns) if it had to loop back. I _think_ this gives me a jitter of 40 ns, the difference. Does this sound right?

Even if I am slightly early or slightly late, this error won't continue to slip (given a fixed transmit bit period); I'd just end up consistently sampling slightly off-center on each bit. I.e., it's just a problem getting started, not a continuing problem.

Great idea.

I think so. CSTLS50M0X51-B0


Just so everyone knows, this UART code of mine actually functions as-is, so I'd hope there aren't any MAJOR issues with it. I am in the process of doing a code review and want to stress test it. I'm looking to see if there are major design flaws in my code, since I've never written a UART before....

Thanks!

Keith

Reply to
Keith M

Thanks for the link, Rich. That picture/description looks exactly like how Maxim is doing it with the MAX3100 UART. I posted the datasheet in another reply.

That ATmega link spells out exactly how they do their majority voting, which is good if I end up implementing it.

Keith

Reply to
Keith M

Once again, I saw Maxim doing that. That's pretty neat, pretty advanced, and pretty unlikely I'm gonna touch it with a 10-ft assembly pole. :)

Yeah, so you are saying to do edge detection to find the start bit, and then, what, just use delays as I'm doing now between the bits, and sample each bit?

I don't really know what that gains me. Since I don't plan on using interrupts for this (for simplicity's sake; there's other stuff going on in the ISR, even though interrupts are disabled), I still have to poll the edge-detect register for a value. And that polling would be subject to the same "SNB/JMP" jitter I mentioned in an earlier reply.

There are other issues, like my uC not supporting edge detection on my current port, etc.

Thanks Keith

Reply to
Keith M

Great. This is exactly what I'm doing.

Starting the search for the START bit edge in the MIDDLE of the stop bit: this is exactly what I do. Initially, I wasn't sure how to handle that. I wanted to "eat" the stop bit inside my RX routine, to keep everything consistent outside the routine and help the modularity of the component (i.e. to avoid needing to know the state of the RX data line outside the UART itself).

My particular routine reads one byte at a time. So once I've read 8 bits, I delay another bit period, which puts me in the middle of the stop bit, and then I exit. I did this to make sure there's plenty of time for the routine to see the start-bit-transition when it's called the next time, for the next byte.

Thanks for that.

Ok, question. As far as my current approach to handling false-start detection goes, what do you think about just checking the state of the RX pin again, a little later in time?

Like right now, I do this (sorry to repeat it if you've seen my other reply):

waitforstart:
    SNB  receivepin        ; if receivepin is LOW, skip the next instruction
    JMP  waitforstart

This takes a minimum of 40 ns to execute, so I could simply repeat the code right after it:

    SNB  receivepin        ; if receivepin is STILL low, it's truly a wider start bit: skip the jump
    JMP  waitforstart      ; if not, it must have been a spike; go back and wait again

That should do the trick, no? I could also delay the check by padding with NOPs if necessary. Of course, I'd have to take that into consideration when calculating how much to delay to reach the middle of the 1st bit.

Thanks Jim!

Keith

Reply to
Keith M

Yes, that is a simple 'noise filter', and you have a little spare time to burn on the start bit anyway, so it should not impact your top baud rate.

Some SW UART designs use an EDGE interrupt and a timer interrupt in a ping-pong system, but IIRC the Parallax chip is 'all SW'?

-jg

Reply to
Jim Granville

Right, I've got plenty of spare time there. Almost 1.5 bit times: 680 ns, or 34 single-cycle instructions.

Yeah, there are no real hardware interfaces like UARTs, for example.

I run the chip at 50 MHz, but it can run at 75+. And that's 50 MIPS at 50 MHz, so doing this stuff in software is easy. I've got to slow everything down with a bunch of NOPs just to get the baud rate down to 2 Mbps with my bit-banging approach.

I'm using the Parallax SX28AC/SSOP.

Keith

Reply to
Keith M

No, measuring the time of the transitions would be done by setting an interrupt or timer capture on each transition, without "sampling" in the manner you are thinking. Then the pulse widths are calculated, which give you the number of bits high or low. You also need a timer interrupt or timing loop from the start bit to mark the end of a character, since there will be no more transitions after the last low-to-high edge, and that last edge won't be the stop bit if the last data bits are high.
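
A sketch of that idea in C (the function and edge buffer are hypothetical; it assumes the edge times come from a timer capture, the line idles high, the timestamps increase, and the caller applies the 10-bit-time end-of-character timeout mentioned above):

    #include <stdint.h>

    /* Reconstruct one character from the captured transition times.
       edges[0] is the falling edge of the start bit; n is the number of
       captured edges for this character; ticks_per_bit is one bit period
       in timer ticks. */
    uint8_t decode_from_edges(const uint32_t edges[], int n, uint32_t ticks_per_bit)
    {
        uint8_t byte  = 0;
        int     level = 0;      /* line level right after the start edge */
        int     e     = 1;      /* next captured edge to consume         */

        for (int bit = 0; bit < 8; bit++) {
            /* center of data bit 'bit', measured from the start edge */
            uint32_t center = edges[0] + ticks_per_bit + ticks_per_bit / 2
                              + (uint32_t)bit * ticks_per_bit;
            while (e < n && edges[e] <= center) {   /* toggle for each edge before that center */
                level ^= 1;
                e++;
            }
            byte |= (uint8_t)(level << bit);        /* UART sends LSB first */
        }
        return byte;
    }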

I have not seen this type of implementation, but it could be useful if you have timer capture hardware, which I think you don't...

If you don't use interrupts and don't have dedicated hardware, then this is not a good approach.

Reply to
rickman

No, you are reading this right and I am learning something (or maybe relearning...). Personally, I am surprised they implement this. I would deal with noise issues in the analog domain, either by improving my SI (signal integrity) design or adding a filter (that's what the oversampling is doing: low-pass filtering). Improving the SI of the design works on all frequencies of noise, not just the higher ones.

I think so. The loop is 40 ns, so that is your resolution. At less than 10% of the bit time, this should be good enough.

Clock mismatches will cause the timing to slip more with each bit. If you sample perfectly in the middle of the start bit, a 2.5% error in the clock of each end will just be reaching a bit boundary at the end of ten bits.
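
Putting numbers on that (a quick sanity check, not from the original posts):

    #include <stdio.h>

    /* Slip budget at 2 Mbps with a 2.5% clock error at each end. */
    int main(void)
    {
        double bit_ns  = 500.0;                       /* one bit at 2 Mbps         */
        double err     = 0.025;                       /* 2.5% clock error per end  */
        double slip_ns = 2.0 * err * 10.0 * bit_ns;   /* accumulated over 10 bits  */
        printf("slip after 10 bits: %.0f ns = %.0f%% of a bit\n",
               slip_ns, 100.0 * slip_ns / bit_ns);    /* 250 ns = 50% of a bit     */
        return 0;
    }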

The specs on this part seem to be good enough for UART timing at less than 1% total accuracy over temperature.

The problem with testing is that it can't prove that a design works under all conditions because it is hard to create "all conditions". But I would say this design should be pretty easy to verify by testing.

Reply to
rickman


Once you have read 8 bits and timed to the middle of the stop bit, you only have half a bit time ***minus the accumulated error*** before the next start bit. What happens in the code between returning and calling the UART routine again? If that takes more than a half dozen instructions, I think you may have trouble.

BTW, in my earlier post where I said you should be able to verify this design by testing, I didn't consider the variation in clock speed due to manufacturing tolerances. The data sheet gave 0.5% initial tolerance. If you count this at both ends, you get 1% max difference (plus temperature effects, which you can test). This 1% over 10 bit times is 10% of a bit, or one fifth of the half bit you have between invocations of the UART code (again, not counting the temperature factor).

This is not bad. Obviously others have looked at UART construction more than I have, but what I recall is that they check again at the middle of the start bit, not just twice at the beginning (or maybe at all samples up to the middle of the start bit). What if your "spike" is two samples wide? This oversampling is really a low-pass filter and you should consider it in that context. What frequency range of noise are you trying to filter? What frequency range of noise do you **expect**? What does your code do for noise that happens just after the leading edge of the start bit? Looks to me like it only rejects low-going noise and not high-going noise (noise at the second test will reject a valid start bit). A simple analog RC filter would likely do a lot better.

I seem to recall that you are only running this signal over a very short distance. Is noise really an issue?

Reply to
rickman

In this case, assuming you can do delays precise to one instruction cycle and deterministic instruction timing, this amounts to an effective timing resolution of 1/25 of a bit (a 20 ns cycle against a 500 ns bit), better than the 16x of standard UARTs. Ignoring noise problems (which you probably don't have in your setup), you should be just fine.

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply to
Albert van der Horst
