forward error correction on ADSP21020

- alb

Fri, Mar 2, 2012 11:25 AM

Hi everyone,

in the system I am using there is an ADSP21020 connected to an FPGA which is receiving data from a serial port. The FPGA receives the serial bytes and sets an interrupt and a bit in a status register once the byte is ready in the output register (one 'start bit' and one 'stop bit'). The DSP can look at the registers simply by reading from a mapped port, and we can choose between polling the status register and using the interrupt.

Unfortunately this is just on paper. The real world is quite different, since the FPGA receiver is apparently 'losing' bits. When we send a "packet" (a sequence of bytes), what we observe with the scope is that sometimes the interrupts are not equally spaced in time and there is one byte less than what we sent. So we suspect that the receiver has started on the wrong 'start bit', hence screwing up everything.

The incidence of this error appears to depend on the length of the packet we send, which leads us to think that, due to some synchronization problem, the UART loses sync (maybe timing issues on the FPGA).

Given that we cannot change the FPGA, I came up with the idea of using some forward error correction (FEC) encoding to overcome this issue. But if my diagnosis is correct, the broken sequence of bytes is not only missing some bytes: it will certainly have its bits shifted (starting on the wrong 'start bit') with some bits inserted ('start bit' and 'stop bit' will become part of the data), and I'm not sure there is any technique that can recover such a broken sequence.

On top of that, I don't have any feeling for how much any type of FEC decoding would cost on the DSP (in terms of memory and CPU resources).

Any suggestions and/or ideas?

Al

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

- Stef

Fri, Mar 2, 2012 11:52 AM

Is this a continuous stream of bits, with no pauses between bytes? It looks like the start bit detection does not re-adjust its timing to the actual edge of the next start bit. With small differences in bitrate, this causes the receiver to fall out of sync, as you found.

Obviously, the best solution is to fix the FPGA as it is 'broken'. Is there no way to fix it or get it fixed?

Can you change the sender of the data? If so, you can set it to 2 stop bits. This can allow the receiver to re-sync on every byte. Where possible, I set my transmitters to 2 stop bits and my receivers to 1. This can prevent trouble like this, but costs a little bandwidth.

Another option would be to tweak the bitrates. It seems your sender is now a tiny bit on the fast side w.r.t. the receiver. Maybe you can slow down the clock on your sender by 1 or 2 percent? Try to get an accurate measurement of the bitrate on both sides before you do anything.
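A rough way to see why a small bitrate mismatch only hurts when the receiver does not re-sync: the sampling point drifts by the relative mismatch on every bit, and a mid-bit sample goes wrong once the accumulated drift reaches half a bit. The sketch below is an idealized model; the function names and the 1% figure are illustrative assumptions, not measurements from this system.

```python
import math


def bits_until_slip(mismatch):
    """Bit periods before a mid-bit sample drifts onto a bit edge.

    mismatch is the relative bitrate error between sender and receiver
    (0.01 means 1%). Sampling starts mid-bit, so half a bit of
    accumulated drift is tolerated.
    """
    if mismatch <= 0:
        raise ValueError("mismatch must be positive")
    return math.floor(0.5 / mismatch + 1e-9)


def chars_until_slip(mismatch, bits_per_char=10):
    """Characters survived if the receiver never re-syncs on a start bit."""
    return bits_until_slip(mismatch) // bits_per_char


def survives_with_resync(mismatch, bits_per_char=10):
    """True if the drift inside a single character stays under half a bit."""
    return mismatch * bits_per_char < 0.5
```

Under this model a 1% mismatch is harmless with a re-sync on every start bit, but without it the receiver falls out of step after about five back-to-back characters.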

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

An egghead is one who stands firmly on both feet, in mid-air, on both
sides of an issue.
		-- Homer Ferguson

- alb

Fri, Mar 2, 2012 1:03 PM

Within a "packet" there should be no pause between bytes; I will check, though. There might be a small difference in bitrate, and I would need to verify how much.

The FPGA is flying in space, together with the rest of the equipment. We cannot reprogram it; we can only replace the software in the DSP, with non-trivial effort.

We are currently investigating it. The transmitter is controlled by an 8051 and in principle we should have control over it. Your idea is to use the second stop bit to allow better synching and hopefully not lose the following start bit, correct?

We can certainly measure the transmission rate. I am not sure we can tweak the bitrates to that level. The current software on the 8051 supports several bitrates (19.2, 9.6, 4.8, 2.4 Kbaud) but I'm afraid those options are somehow hardcoded in the transmitter. Certainly it would be worth having a look.

- Stef

Fri, Mar 2, 2012 1:58 PM

Whoops, that's a real "cannot" then. Too often a statement like "cannot" is flexible after a little interrogation, guess this is not one of those cases.

But it seems you have at least a test system on the ground you can do measurements or tests on? Changing the FPGA there can at least confirm that you found the actual cause of the problem if you need to.

The extra stop bit will allow the receiver to start looking for a start bit from scratch, causing a re-sync on every byte. But the exact effect of course depends on the receiver implementation in the FPGA. A "correct" implementation should start looking for a start bit after half a stop bit or so and sync the data sampling for the real bits to the first edge of the newly detected start bit.

If you have the FPGA code or RTL and a simulator, you can set up a simulation testbench to test the effects of 1 and 2 stop bits under slightly varying bitrates.

If you only have control over the dividers in the 8051, there is nothing you can do in this area; you can only change the bitrate in large steps. Can you change the bitrate on the FPGA side? If so, depending on the receiver implementation, changing the bitrate on both sides may help. If bitrates are not exact, you may be able to reverse the speed errors at certain settings.

Tweaking the bitrate within the error margin can only be done by changing the crystal on an 8051.

Again, if you can simulate the FPGA receiver, you can take a lot of guessing out of the equation.

--
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

I am firm.  You are obstinate.  He is a pig-headed fool.
		-- Katharine Whitehorn

- Tim Wescott

Fri, Mar 2, 2012 7:38 PM

Go over the FPGA code with a fine-toothed comb -- whatever you're doing, it won't help if the FPGA doesn't support it.

What the FPGA _should_ be doing is starting a clock when it detects the leading edge of a start bit, then sampling the waveform at the middle of every bit period, then beginning a search for a new start bit at the _middle_ of the stop bit.
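As a rough illustration of that receive sequence (start a clock on the leading edge, sample mid-bit, resume the search in the middle of the stop bit), here is a minimal Python model. It is only a sketch: the 16x oversampling factor and all names are assumptions for illustration, not taken from the actual FPGA design.

```python
OVERSAMPLE = 16  # receiver clock ticks per bit (assumed for this model)


def frame(byte):
    """Line bits for one character: start, 8 data bits LSB first, stop."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def to_samples(bits, ticks_per_bit=OVERSAMPLE):
    """Expand line bits into receiver-clock samples."""
    return [b for b in bits for _ in range(ticks_per_bit)]


def receive(samples, ticks_per_bit=OVERSAMPLE):
    """Decode characters, re-syncing on every start-bit edge."""
    out = []
    i = 1
    n = len(samples)
    while i < n:
        # Search for the leading (1 -> 0) edge of a start bit.
        if not (samples[i - 1] == 1 and samples[i] == 0):
            i += 1
            continue
        mid = i + ticks_per_bit // 2        # middle of the start bit
        last = mid + 9 * ticks_per_bit      # middle of the stop bit
        if last >= n:
            break
        byte = 0
        for k in range(8):                  # sample each data bit mid-period
            byte |= samples[mid + (k + 1) * ticks_per_bit] << k
        if samples[last] == 1:              # deliver only with a valid stop bit
            out.append(byte)
        i = last + 1                        # resume the edge search mid-stop-bit
    return out
```

Because the search restarts half a stop bit early, each character is timed from its own start edge and drift cannot accumulate across a packet.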

It sounds like it either doesn't resynchronize in the middle of a packet at all, or it starts seeking the start bit at the _end_ of the preceding stop bit. In the former case adding a stop bit will just screw things up completely. In the latter case, if the DSP bit clock is slower than the FPGA the whole resynchronization thing falls apart -- but adding that extra stop bit will fix things.

And next time you go sending satellites to space, put in a mechanism to upload FPGA firmware!

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com

- Charles Bryant

Sat, Mar 3, 2012 3:32 AM

In article , alb wrote:
}in the system I am using there is an ADSP21020 connected to an FPGA
}which is receiving data from a serial port. The FPGA receives the serial
}bytes and sets an interrupt and a bit in a status register once the byte
}is ready in the output register (one 'start bit' and one 'stop bit').
}The DSP can look at the registers simply reading from a mapped port and
}we can choose either polling the status register or using the interrupt.
}
}Unfortunately this is just on paper. The real world is much more
}different since the FPGA receiver is apparently 'losing' bits.
}When we send a "packet" (a sequence of bytes) what we can observe with
}the scope it that sometimes the interrupts are not equally spaced in
}time and there is one byte less w.r.t. what we send. So we suspect that
}the receiver has started on the wrong 'start bit', hence screwing up
}everything.
}
}The incidence of this error looks like dependent on the length of the
}packet we send, leading to think that due to some synchronization
}problem the uart looses the sync (maybe timing issues on the fpga).

Try a test where you send a packet consisting of only 0xff bytes (I'll assume it's 8 bits per character). Watch the interrupts and confirm that what happens is they're normally spaced until one goes missing where you get an extra gap of about 9 bits. This is a problem with the delivery of the characters rather than one of recognising them.

Then try sending only 0x55 bytes (this gives a bit pattern of 010101010101...). If the interrupts show the same pattern, you're missing a whole character; if the pattern has an extra gap of about 2 bits, then the receiver is missing a start bit and using the next 0 that arrives as the new start bit. This is a problem of recognising the characters.
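To make the two test patterns concrete, the following sketch prints what each one looks like on the wire, assuming 8 data bits sent LSB first with one start and one stop bit. The helper names are made up for this example.

```python
def frame_bits(byte):
    """Line bits for one 8N1 character: start (0), data LSB first, stop (1)."""
    data = "".join(str((byte >> i) & 1) for i in range(8))
    return "0" + data + "1"


def packet_bits(byte, count):
    """Back-to-back characters with no idle gap between them."""
    return frame_bits(byte) * count


def falling_edges(bits):
    """Indices where the line goes from 1 to 0 (candidate start bits)."""
    return [i for i in range(1, len(bits))
            if bits[i - 1] == "1" and bits[i] == "0"]


print(packet_bits(0xFF, 2))  # 01111111110111111111
print(packet_bits(0x55, 2))  # 01010101010101010101
```

With 0xFF the only falling edges are the real start bits, so a missed one costs a whole character. With 0x55 there is another falling edge two bit times later, which is where the extra gap of about 2 bits would come from.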

If the problem is recognising characters, then most errors will cause many bytes in a packet to be wrong from the point of the error. If the problem is in delivery, then delivered bytes will be right, merely missing out the byte where the error occurred. The best solution will depend on which problem you have.

One possible solution might be to do the UART receive function in software (this depends very much on how the hardware works). By setting the baud rate on the FPGA over 10x the true speed, it sees every bit as either a 0xff or a 0x00 character. If you can react to the interrupt fast enough and read a suitable clock, you can then decode the bits in software. Of course if the FPGA is failing to deliver characters, this is no better.

Fixing a delivery problem is more tricky. It's necessary to know more about the losses. Is it certain bit patterns which are more likely to get lost, or every Nth character, or apparently at random? If random, about how often are characters lost? How big are your packets, and what sort of re-transmitting error-correction do you already have?

- alb

Mon, Mar 5, 2012 9:32 AM

Well, we do have the system on the ground and we could in principle replace the FPGA, but unfortunately when we try to place&route the original design we bump into a problem in pin assignments (I think it is a clock resource signal without the clock buffer, or something similar) and we do not know how the original designer 'tricked' the tool to produce the bitstream. No trace whatsoever in the design or in any document. Needless to say the original designer has migrated into a different world and recollects nothing but 'how difficult it was and how many problems they had'.

If there's a timing problem in the FPGA and we re-route the pinouts, I fear we will not see the same effects or, worse, will reveal some other problems somewhere else.

I should admit this project is well below the standards the other pieces of electronics conform to. Considering our production cycle, this piece should never have made it into the flight assembly!

Here it would probably be much easier to do a test with 2 stop bits directly. Running the simulation is probably more time consuming, even though it might be helpful in the long run, in case we find other oddities in the behavior of the FPGA.

No, we cannot change the receiver rate. Again, we could in principle change the crystal and see if that helps, but that certainly cannot be an option to solve it.

Here I'm only concerned that we would need to run a back-annotated simulation, otherwise we may miss some nasty timing-critical effects. Usually when I was designing the VHDL I made sure that I didn't have any timing violations, but here we cannot guarantee that, since we have a place&route issue and we do not know how it was originally solved.

- alb

Mon, Mar 5, 2012 10:44 AM

Ok, a colleague of mine went through it and indeed the start-bit logic is faulty, since it is looking for a negative transition but without the signal being synchronized with the internal clock (don't ask me how that is possible!).

Given this type of error, a 0xFF byte will be lost completely, since there is no other start bit to sync on within the byte, while in other cases the receiver may resync on a '0' bit within the byte.
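The difference between the two cases can be reproduced with a bit-level model of a receiver that needs a 1 -> 0 edge to start a character and occasionally ignores one. This is only a sketch of the behaviour described here, not the FPGA logic; `missed_edges` and the other names are invented for the example.

```python
def frame(byte):
    """Line bits for one 8N1 character: start, 8 data bits LSB first, stop."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]


def line(chars):
    """Idle level, back-to-back characters, then idle again."""
    bits = [1]
    for c in chars:
        bits += frame(c)
    return bits + [1] * 12


def receive(bits, missed_edges=()):
    """Receiver that needs a 1 -> 0 edge to start a character.

    missed_edges lists indices of falling edges the faulty detector
    ignores. Characters are delivered even when the stop bit is wrong.
    """
    out = []
    i = 1
    while i + 9 < len(bits):
        edge = bits[i - 1] == 1 and bits[i] == 0
        if not edge or i in missed_edges:
            i += 1
            continue
        byte = 0
        for k in range(8):
            byte |= bits[i + 1 + k] << k
        out.append(byte)
        i += 10  # the next candidate start bit follows the stop bit
    return out
```

For example, `receive(line([0x41, 0xFF, 0x42]), missed_edges={11})` drops the 0xFF cleanly, while the same miss on `[0x41, 0x55, 0x42]` makes the receiver lock onto a data bit and corrupts what follows.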

Adding a delay between bytes does mitigate the effect, but of course it does not solve it.

We have ~1000 FPGAs onboard and all of them are anti-fuse. The whole design and production process, which consists of several test campaigns on different quality models (Engineering, Qualification and Flight), should have ensured this level of functionality. The reason why it failed is most probably due to a poor level of quality control of the process. Just as an example we are missing test reports of the system, as well as Checksums for the FPGA firmware.

As a side note: IMO the capability to reprogram an FPGA onboard is built when your needs are changing with time, not to fix some stupid UART receiver.

- alb

Mon, Mar 5, 2012 11:09 AM

On 3/3/2012 4:32 AM, Charles Bryant wrote: [...]

Since we found that the 'start bit' logic has a problem, the 0xFF pattern has a good chance to be lost completely, since no other 'start bit' will be recognized.

The 0xFF has a good chance of being lost completely. The method you suggest may reduce the problem of recognizing bytes to the problem of delivering the bytes. Then extra encoding should be added to recover the lost bytes.

If you plot the number of failed packets [1] against the position in the packet where the problem occurred, you will see an almost linearly increasing curve; hence the probability of a problem is higher if the packet is longer.

At the moment we don't have any re-transmitting mechanism and the rate of loss is ~0.5% on a 100-byte packet. We want to exploit the 4K buffer on the transmitter side in order not to add too much overhead, but it looks like the rate of loss will be higher with bigger packets.

[1] we send a packet and echo it back and compare the values.

- Tim Wescott

Mon, Mar 5, 2012 3:07 PM

Ah well.

Testing is only the most visible and least effective of all means to insure quality. It is a net with whale-sized holes, with which you go out and attempt to catch all the minnows in the sea by making repeated passes.

And all too often, quality programs end up being blown off by management and/or design personnel as being an unnecessary expense, or a personal affront. Or the QA department gets staffed with martinets or rubber-stamp bozos or whatever. Because -- as you are seeing -- sometimes quality problems don't show up until long after everyone has gotten rewarded for doing a good job.

Humans just aren't made to build high-reliability systems, so an organization really needs to swim upstream to make it happen.

Well, time has marched on, and your needs have certainly changed.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com

- Tim Wescott

Mon, Mar 5, 2012 3:14 PM

This may be your answer -- instead of two stop bits, use a protocol that sends eight data bits but with the most significant bit always 0. This will make your life difficult when you go to unwind real 8-bit data, but it can be done.
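One way the "unwinding" could be done is to repack the byte stream into 7-bit symbols, so the most significant bit of every transmitted character is 0 and 0xFF never appears on the wire. The sketch below is just one possible packing; the function names and the explicit length argument are assumptions for illustration.

```python
def pack7(data):
    """Split 8-bit bytes into 7-bit symbols (MSB of every symbol is 0)."""
    out = []
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits held in acc
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 7:
            nbits -= 7
            out.append((acc >> nbits) & 0x7F)
        acc &= (1 << nbits) - 1
    if nbits:
        out.append((acc << (7 - nbits)) & 0x7F)  # zero-pad the last symbol
    return bytes(out)


def unpack7(symbols, length):
    """Inverse of pack7; length is the original byte count."""
    out = []
    acc = 0
    nbits = 0
    for s in symbols:
        acc = (acc << 7) | (s & 0x7F)
        nbits += 7
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1
    return bytes(out[:length])
```

Every 7 payload bytes become 8 characters, so the cost is one extra character in eight plus whatever is needed to convey the length.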

While you're at it, if the connection is two-way you might want to implement a BEC scheme, but if the failures are data-pattern dependent any correction scheme that doesn't randomize the data is going to cause you problems.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com

- Charles Bryant

Tue, Mar 6, 2012 1:51 AM

In article , alb wrote:
}On 3/3/2012 4:32 AM, Charles Bryant wrote:
.. data loss due to failure to see some start bits ...
}> One possible solution might be possible to do the UART receive function
}> in software (this depends very much on how the hardware works). By
}> setting the baud rate on the FPGA over 10x the true speed, it sees
}> every bit as either a 0xff or a 0x00 character. If you can react to
}> the interrupt fast enough and read a suitable clock, you can then
}> decode the bits in software. Of course if the FPGA is failing to
}> deliver characters, this is no better.
}
}the 0xFF has a good chance to go completely lost. The method you suggest
}may reduce the problem of recognizing bytes to the problem of delivering
}the bytes. Then extra encoding should be added to recover the loss of bytes.

If you can set the receiver clock fast enough you won't lose any bytes. For example, if you set it to 16x, and suppose the true bit-stream is 0000101101 (ASCII 'h'). Then the apparent bit-stream is

0000000000000000000000000000000000000000000000000000000000000000111111
1111111111000000000000000011111111111111111111111111111111000000000000
00001111111111111111

(wrapped for convenience). Assuming this starts after lots of '1' bits, and receiving a '0' stop bit is merely reported as a framing error and doesn't affect synchronisation, this gets interpreted as:

s........Ss........Ss........Ss........Ss........Ss........Ss........S
0000000000000000000000000000000000000000000000000000000000000000111111
__________00________00________00________00________00________00________f8

s........Ss........Ss........Ss........Ss........Ss........Ss........S
1111111111000000000000000011111111111111111111111111111111000000000000
____________________00________e0____________________________3f________00

s........Ss........S
00001111111111111111
__________f8

When you get an interrupt reporting a character, you note the time since the last such interrupt (I believe your CPU has a built-in timer which can count at high speed and which might be useful for this). Then you work out approximately what bits must have been received based on both the character and the time. Since each real bit is seen as sixteen bits, even if one is missed, this only introduces a small error, so although you don't get an exact match to any valid pattern, you're much closer to one than any other.

Specifically, if an interrupt is T bit-times since the last one, then there must have been T-10 one bits, a zero bit, the bits in the character received, and a stop bit (0 if framing error was reported, 1 otherwise).

When a 0 bit is missed, then there were T-1 ones, two zeros (the missed zero must be the first of these), the bits in the character, and the stop bit. But since sixteen of these bits make one real bit, the difference between T and T-1 is never big enough to flip a real bit.
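The reconstruction rule for the normal (no missed bit) case can be sketched as follows: each interrupt contributes T-10 ones, a start bit, the eight received bits and the stop bit, and groups of sixteen fast samples are then reduced to one real bit. The majority vote and all names here are assumptions added for illustration; the vote assumes the groups are aligned to the real start bit.

```python
def rebuild_samples(events):
    """Rebuild the fast-clock bit stream from per-interrupt records.

    events holds (t, char, framing_error) tuples, where t is the number
    of fast bit times since the previous interrupt.
    """
    out = []
    for t, char, framing_error in events:
        out.append("1" * max(t - 10, 0))                  # idle before start
        out.append("0")                                   # start bit
        out.append("".join(str((char >> i) & 1) for i in range(8)))
        out.append("0" if framing_error else "1")         # stop bit as seen
    return "".join(out)


def vote(samples, ratio=16):
    """Collapse each complete group of `ratio` samples into one real bit."""
    bits = []
    for i in range(0, len(samples) - ratio + 1, ratio):
        group = samples[i:i + ratio]
        bits.append("1" if group.count("1") * 2 > ratio else "0")
    return "".join(bits)
```

Feeding it six `(10, 0x00, True)` events followed by `(10, 0xF8, False)` reproduces the first 70 samples of the example stream (64 zeros, then 6 ones), and a single wrong sample inside a group of sixteen does not change the voted bit.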

Having said all that, you might not be able to change the receive clock without also changing the transmit clock, in which case it won't help. Similarly, if you can't time the interrupts to sufficient accuracy, it won't work.

}If you plot the number of failed packets [1] with the position in the
}packet which had the problem, you will see an almost linear increasing
}curve, hence the probability to have problems is higher if the packet is
}longer.
}
}At the moment we don't have any re-transmitting mechanism and the rate
}of loss is ~0.5% on a 100 bytes packet. We want to exploit the 4K buffer
}on the transmitter side in order not to add too much overhead, but it
}looks like the rate of loss will be higher with bigger packets.
}
}[1] we send a packet and echo it back and compare the values.

That suggests that the receiver only resynchronises in a gap. The solution suggested elsewhere of two stop bits sounds very promising (some UARTs can do 1.5 stop bits and that might be enough). Otherwise a simple re-transmission scheme tailored to the fault might be good. Here is an example:

Each packet starts and ends with FF. Unlike typical framing schemes, the start and end cannot be shared. This guarantees that an error is confined to one packet.

Other than the start and end bytes, all bytes in the packet are escaped. (e.g. byte FF becomes CB 34, CB becomes CB 00).

The last two bytes in the packet are a CRC.

The byte before the CRC is an acknowledgement number.

If a packet is at least four bytes, the first byte is the packet sequence number.

This gives a packet like this:

FF SS DD DD DD...DD AA CC CC FF
   ^^^^^^^^^^^^^^^^^^^^^^^^^ these are escaped as necessary

When the receiver gets a packet, if the CRC is bad or the sequence number is not the next expected, it ignores the packet. Otherwise it accepts the data or ack.
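A minimal sketch of this framing follows. Two details are assumptions, since the post does not state them: the escape rule is taken to be "CB followed by the byte XORed with CB", which is consistent with both examples given (FF -> CB 34, CB -> CB 00), and the CRC is taken to be the CRC-CCITT from Python's `binascii.crc_hqx` as a stand-in for whatever 16-bit CRC is actually chosen. Sequence-number checking is left to the caller.

```python
import binascii

FLAG = 0xFF  # frame delimiter
ESC = 0xCB   # escape byte


def escape(payload):
    """Replace FF and CB by a two-byte escape sequence."""
    out = bytearray()
    for b in payload:
        if b in (FLAG, ESC):
            out += bytes([ESC, b ^ ESC])
        else:
            out.append(b)
    return bytes(out)


def unescape(body):
    """Inverse of escape; raises ValueError on a trailing escape byte."""
    out = bytearray()
    it = iter(body)
    for b in it:
        if b == ESC:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("dangling escape")
            out.append(nxt ^ ESC)
        else:
            out.append(b)
    return bytes(out)


def build_packet(ack, seq=None, data=b""):
    """FF [SS DD..DD] AA CC CC FF; omit seq for an ack-only packet."""
    inner = (bytes([seq]) + bytes(data) if seq is not None else b"")
    inner += bytes([ack])
    inner += binascii.crc_hqx(inner, 0xFFFF).to_bytes(2, "big")
    return bytes([FLAG]) + escape(inner) + bytes([FLAG])


def parse_packet(frame):
    """Return (seq, data, ack), or None if the framing or CRC is bad."""
    if len(frame) < 2 or frame[0] != FLAG or frame[-1] != FLAG:
        return None
    try:
        inner = unescape(frame[1:-1])
    except ValueError:
        return None
    if len(inner) < 3:
        return None
    body, crc = inner[:-2], int.from_bytes(inner[-2:], "big")
    if binascii.crc_hqx(body, 0xFFFF) != crc:
        return None
    if len(body) >= 2:
        return body[0], body[1:-1], body[-1]
    return None, b"", body[-1]
```

With two byte values out of 256 needing an escape, random data grows by a factor of 129/128 on average.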

The transmitter sends continuously. If it has no data to send, it sends just ACKs (i.e. packets with just AA CC CC). Otherwise it sends a packet with this loop:

1) send the packet (SS DD...DD AA CC CC)
2) send ack (AA CC CC) until we have received at least X bytes and a valid packet
3) if our sent packet has been acknowledged, this one is done
4) else goto 1

Step 2 avoids the need for a timer. The value X depends on the round-trip delay. Since the AA field is at the *end* of a packet, we know when we receive a packet that it reflects the remote receiver's last packet at a time that is a fixed interval in the past. E.g. if the round-trip time is five bytes (to allow for buffering in the UART etc.), then when we send the FF framing byte we know that any AA we receive in the next five bytes could not possibly acknowledge the packet we just sent. If the remote happened to be just about to send the AA field, we might get AA CC CC FF, so any packet which ends more than 9 bytes after we sent the FF of our packet should acknowledge our packet. We would therefore use a value of about 10 to make re-transmissions as prompt as possible. (This depends on the protocol being implemented at a character-by-character level at each end. If you had a higher-level view whereby hardware was given a complete packet to send at once, the ACK no longer benefits from being at the end and the timing is more complex.)

Obviously the SS numbers in one direction have AA numbers going in the opposite direction.

The overhead of this scheme is 129/128p + 4+X for a packet size of p. It could be made lower by allowing more than one packet to be in flight at once, though that makes it more complex and costs more when a re-transmission is needed, unless you add even more complexity and have selective acknowledgements.

If you have a suitable timer available, the sending of ACKs in step 2 can be omitted (e.g. possibly saving power by not running the UART continuously).

- alb

Tue, Mar 6, 2012 10:20 AM

On 3/5/2012 4:07 PM, Tim Wescott wrote: [...]

Nice imagery, even though I don't see which other tool you would have to ensure functionality other than testing the specs. Certainly I can understand that a big design effort may help cutting down the time once you do system integration, but certainly it does not remove the need to do your testing.

Saying a program is not needed because it is too often compromised by other factors does not prove the program is not needed. On the contrary, if management does not get in the way, a quality program can certainly reduce the level of uncertainty. A quality program does not necessarily mean that it has to be enforced by the department of defense. Peer reviews and open standards may help a lot here (I understand that they might not always be applicable). And specifically in our case this is how we usually go.

Unfortunately that was not the case with the system I'm currently dealing with, and indeed it was due to an oversight by management, which was too focused on higher-priority tasks that at the time sucked in all the resources.

IMO humans have reached a level of reliability which is far beyond imagination, through standards, processes and certainly money.

Again I tend to disagree, my needs are exactly the same as 10 years ago, when the specs were laid down and we wanted to have a UART (there was no option saying "better be working").

- Tim Wescott

Tue, Mar 6, 2012 5:24 PM

You are mistaking "testing program" for "quality program". A testing program is a _part_ of a quality program, but a quality strategy that states only "test the hell out of it once it's done" is little better than "launch it and find out".

And I wasn't saying that a testing program isn't an essential part of a quality program -- I was saying that the statement "but we tested it" is, in my book, tantamount to "we painted it and polished it". If what you painted and polished is just dried up dog turds, then no matter how shiny the paint is it's still dog turds underneath.

A good quality program is one that comes in many steps, with each step having the goal that the _next_ step isn't going to find any problems.

Design reviews at every step, conducted by people who are competent and engaged, prevents far more bugs than testing finds. Unit testing, while being something called "testing", is often not included in a "testing program" that just does black-box testing on the completed system. A comprehensive _quality_ program has all of these and more, and does not equate to just testing.

Yes, and we still manage to screw up, sometimes. Like in your case.

Clocking a UART on the wrong part of the incoming serial stream is something that the designer shouldn't have done at all. Then it should have been caught in a design review before an FPGA was ever programmed. The fact that it wasn't means that many people were just not on the ball in that case. The designer got it wrong, the team who were supposed to review his work didn't, or didn't do it thoroughly enough, the _real_ quality program that's supposed to make sure that the design reviews happen correctly didn't. Then the testing of the UART functionality _by itself_ in the FPGA either wasn't thorough enough or wasn't done at all, etc., etc.

Yet here you are, with a crying need for an FPGA mod. "Perceived need", then, perhaps.

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com

- alb

Wed, Mar 7, 2012 2:16 PM

On 3/6/2012 2:51 AM, Charles Bryant wrote: [...]

Unfortunately we cannot change the receiver clock. But I'll stick with you for the sake of discussion.

This approach is nice if and only if the transmitter does not introduce extra gaps between bytes, which would make the job of taking time into account a little more complex. And the problem is still not solved, since the receiver is missing the start bit (the high-low transition).

Timing the distance between interrupts may be tricky since at 19.2 Kbaud each bit lasts 52us, hence you would need your timer to run faster than that... Considering that the fastest interrupt service routine introduces ~3.5us of overhead, it looks like you won't have much time left for the rest of the application.

I think I lost you here. Why does FF become CB 34?

This is an interesting request/acknowledge scheme, but we want to avoid to change the transmitter side to include this scheme and since the transmitter side has many other activities to perform we are not sure about the impact of this protocol in the overall scheduling. If we find we cannot avoid that certainly we will go along that path.

What we have found instead is an encoding scheme which will allow us to recover from the synchronization loss of the receiver. You have three levels: a character one, a packet one and a command one. A character is everything between a start and a stop bit, while a packet is a sequence of characters up to a maximum number of characters (fixed by the receiver software buffer). The byte encoding for the character looks like the following:

                                             ____
____| st | rs | b0 | b1 | b2 | b3 | sh | cb |
^^^^                                         ^^^^
||||                                         ++++ ---> stop bit
++++ ------------------------------------------------> start bit

where the meaning of each bit is the following:

st = 1 (sticky to '1')
rs = resynchronization bit
bn = body
sh = shift
cb = control bit

The shift bit signals to the receiver whether the character has to be right shifted by 2 before being used (as we will see, this happens when the real start bit is lost and the UART syncs on the rs bit). The control bit selects either a 'control' character or a 'data' character. A data character has 4 valid bits which will be transferred to the command level to build the 'telecommand'. A control character has 16 types available (bn encodes the meaning), of which I can think of three:

bn = '0000': BOP (begin of packet)
bn = '0001': EOP (end of packet)
bn = '1111': NUL (null character)

In the event of a NUL char the rs will be fixed to '1', in all other cases it will be '0'. This is done to have NUL = 0xFF which is needed to resynch on the correct start bit after the first desync occurred. An FF will be dropped by the receiver if a start bit has been missed already, otherwise it will be dropped by the software.

A packet will look like this:

BOP | DAT | DAT | ... | NUL | DAT | DAT | ... | NUL | ... | EOP

where the number of DATs between NULs can be chosen small enough to rule out two start-bit misses before the next NUL. The BOP and EOP will help at the packet level to control the transmission, while only DAT will be part of the command level, where we can include a length and a CRC to cross-check the integrity of the data.

We found that with a NUL every 8 DAT we can send reliably ~2000 bytes without any loss and with an error recovery of about 40% (number of bytes which had to be shifted before use).

When the receiver fails to sync on the first start bit, it has a good chance of syncing on the rs bit, hence the stop bit will be received as the shift bit. This means that the character has to be shifted before use. In the event of a NUL character we artificially introduce a long gap which will help synchronizing on a real start bit. When no start bit has been missed and the NUL character is received, the shift bit is artificially set to one, which may add an unnecessary shift operation, but the control character will still be a NUL and be discarded.
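The character-level behaviour can be sketched in a few lines of Python. This is one reading of the scheme, under these assumptions: st is transmitted first and is therefore bit 0 of the received byte (so the "right shift by 2" in wire order becomes a left shift of the numeric value), cb = 1 marks a control character (which follows from NUL being 0xFF), and the character after a slipped one starts immediately. The function names are invented for the example.

```python
ST, RS = 0x01, 0x02             # wire order is LSB first, so st is bit 0
SH, CB = 0x40, 0x80
BOP, EOP, NUL = 0x0, 0x1, 0xF   # control codes carried in b0..b3


def encode(nibble, control=False):
    """Build one character: st=1, rs=0, four body bits, sh=0, cb."""
    if control and nibble == NUL:
        return 0xFF              # NUL: rs and sh are forced to 1
    return ST | ((nibble & 0xF) << 2) | (CB if control else 0)


def as_seen_after_missed_start(char):
    """What the UART delivers if it syncs on rs instead of the start bit.

    b0..cb slide down by two places; the real stop bit (1) lands in the
    sh position and the next start bit (0) in the cb position.
    """
    return ((char >> 2) & 0x3F) | SH


def decode(raw):
    """Return (nibble, is_control) from a received character."""
    if raw & SH:                 # shift flag set: undo the slip
        raw = (raw << 2) & 0xFF
    return (raw >> 2) & 0xF, bool(raw & CB)
```

For instance, the data nibble 5 is sent as 0x15; if the start bit is missed it arrives as 0x45, and `decode` returns `(5, False)` in both cases. A NUL received normally (0xFF) takes the unnecessary shift and is still recognised as a NUL.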

This mechanism hides the complexity at the character level and packet level, while the command level remains the same as a fully functional uart. I don't have any other idea on additional control characters but lots of possibilities may arise.

At the packet level, if an EOP is lost the software will wait for the next BOP, and the number of NULs only increases the level of confidence that the packet will arrive. Critical information may be sent with heavy redundancy (to the extent of a NUL every second character), while less important commanding may be less redundant.

Of course the overhead is not negligible, but we knew we had to pay somewhere.

- Tim Wescott

Wed, Mar 7, 2012 6:47 PM

I know it sounds like an oxymoron, but that's a really elegant kludge. That you had to do it at all makes it a kludge -- but it looks like you did a good job with it within the confines of what you had to work with.

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com

- Charles Bryant

Thu, Mar 8, 2012 1:43 AM

In article , alb wrote:
}On 3/6/2012 2:51 AM, Charles Bryant wrote:
}[...]
.. running rx at much higher clock ...
}This approach is nice if and only if the transmitter does not introduce
}extra gaps in between bytes, which may make your job to take time into
}account a little bit more complex. And the problem is still not solved,
}since the receiver is missing the start bit (transition high-low).

Reading your solution, I think I may have been wrong in my assumption about how the receiver worked. I assumed that when it missed the start bit, if another zero bit arrived it would see that (i.e. level-triggered), rather than needing a 1 and subsequent 0, so indeed my solution would not work (nor would the suggestion of using two stop bits).

}Timing the distance between interrupt may be tricky since at 19.2 Kbaud
}a 52us interval is needed for each bit, hence you would need your timer
}to run faster than that...

The ADSP-21020 TCOUNT register runs at the processor clock speed, so at a typical speed of 20MHz it increments every 50ns. And it can be read in a single cycle.

} Considering that the fastest interrupt
}service routine introduce ~3.5us of overhead it looks like you won't
}have much more time to spend for the rest of the application.

Unless you're running the processor at a very slow speed you should be able to make an ISR take a lot less time than that.
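For a sense of scale, here is the arithmetic behind those figures, using the numbers quoted in this thread (20 MHz clock, 19.2 Kbaud, ~3.5us of ISR overhead). The 16x receiver clock is the example factor from the earlier oversampling suggestion and is only an assumption here, since the receiver clock cannot actually be changed.

```python
CPU_HZ = 20_000_000       # "typical" processor clock quoted above
BAUD = 19_200             # highest rate supported by the 8051 software
ISR_OVERHEAD_S = 3.5e-6   # ISR overhead quoted in the thread
OVERSAMPLE = 16           # assumed receiver speed-up factor

bit_time_s = 1 / BAUD                 # about 52.1 us per bit
ticks_per_bit = CPU_HZ / BAUD         # about 1042 timer ticks per bit
char_time_s = 10 * bit_time_s         # one interrupt per 10-bit character

# Share of CPU time spent in ISR overhead alone:
isr_load_normal = ISR_OVERHEAD_S / char_time_s
isr_load_oversampled = ISR_OVERHEAD_S / (char_time_s / OVERSAMPLE)

print(f"{bit_time_s * 1e6:.1f} us/bit, {ticks_per_bit:.0f} ticks/bit")
print(f"ISR load: {isr_load_normal:.1%} normal, "
      f"{isr_load_oversampled:.1%} at {OVERSAMPLE}x")
```

So a timer ticking every 50ns resolves a bit period with three orders of magnitude to spare, while the ISR overhead goes from under 1% of the CPU at one interrupt per character to roughly 11% if the receiver interrupted sixteen times as often.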

}The byte encoding for the character looks like the following:
}                                             ____
}____| st | rs | b0 | b1 | b2 | b3 | sh | cb |
.. rest omitted ...

That looks very good. I'm sure that theoretically it would be possible to design something with lower overhead, but if you can tolerate that amount of overhead, it is simple enough to see that it obviously works, while a more complex solution might have an obscure flaw.

- langwadt

Thu, Mar 8, 2012 3:40 AM

I'm trying real hard to understand what it is you are saying.

If the UART cannot find the edge of the start bit and then sample 8 bits correctly without more edges to resync, the baudrates would have to differ quite a bit, like several %.

-Lasse

- alb

Thu, Mar 8, 2012 3:39 PM

It depends on the size of the packet (*) that you send. For a 100-byte packet the failure rate would be ~0.5%, while with a 2 KB packet it is ~50%.
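As a quick sanity check on those two figures, the sketch below converts a packet failure rate into the per-character failure probability it implies, under the simplifying assumption (mine, not stated in the thread) that every character fails independently with the same probability.

```python
def per_char_rate(fail_rate, chars):
    """Per-character failure probability implied by a packet failure rate."""
    return 1 - (1 - fail_rate) ** (1 / chars)


def packet_fail_rate(per_char, chars):
    """Packet failure rate for a given per-character failure probability."""
    return 1 - (1 - per_char) ** chars


p_small = per_char_rate(0.005, 100)            # about 5e-5 per character
predicted_2k = packet_fail_rate(p_small, 2048)
print(f"per-character rate: {p_small:.2e}")
print(f"predicted failure rate at 2 KB: {predicted_2k:.1%}")
```

Under that assumption the 100-byte figure predicts roughly 10% failures at 2 KB, well short of ~50%. That may only mean the quoted numbers are rough, or that the per-character miss rate is not constant along the packet.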

The receiver does not resynchronize the input signal with its internal clock and the condition to have a start bit is set when the negated signal and the clocked signal are both 1.

here is a simplified snippet in vhdl:

Since the 'not input' is not synchronized with the internal clock, the start_bit ff may not have the hold time satisfied hence the miss of the start bit.

(*) a packet is a continuous stream of characters (**)
(**) a character is what is between a start and a stop bit

- alb

Thu, Mar 8, 2012 3:46 PM

On 3/7/2012 7:47 PM, Tim Wescott wrote: [...]

I agree that it is a shame we had to introduce these additional layers, but if you look at the command level everything looks the same, and all the kludge is left in the other levels where the dirty work is being done.

After all, there's always somebody doing the dirty work; luckily in this case we may have found a solution that leaves the dirt down at the bottom.

So far we have not found pitfalls, but we will post it if that is the case.

Al