Serial data decoding

Please help! I have a string of data as the result of serial to parallel hardware conversion. I need to decode this string into correct bytes. The start of a byte position is unknown. In this particular case, part of my data is as follows. Serial data string: (not rs232, straight serial data)

01001010010010010101001001001001010100100010010001001010

This was read through a shift register to get 8-bit bytes to store in memory as: 4A 49 52 49 52 24 4A

I need to decode bit pairs into corrected bits:

00 = 0
01 = 1
10 = 0
11 = Illegal

So that means every 16 bits = one 8-bit byte. Ultimately I need to find certain values within this string which confirm bit decode positions. Somewhere within this string of bits there is a pattern of 4E 4E A1, and on paper I do see it. The actual data memory size to decode is 16k.

Thanks, Ed
Reply to
Ed

Where do you see it? According to your coding rules, the encoded nibble '4' would be 00010000 and I don't see any run of 4 zeroes in the original encoded string.

--
Dan Henry
Reply to
Dan Henry

Look at where I separated it. Ignore the first 6 bits, then start.

010010 1001001001010100 - 1001001001010100 - 1000100100010010 - 10

4E = 0100 1110 = 1001001001010100

Remember that both a 10 and a 00 = 00
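A minimal sketch of the pair decoding described above, assuming the raw bits are held as an ASCII string of '0' and '1' characters. The name `decode_pairs`, the string representation and the return convention are illustrative choices, not something from the thread. It applies 00 = 0, 01 = 1, 10 = 0 and treats 11 as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode bit pairs from an ASCII '0'/'1' string, starting at raw bit 'offset'.
 * Rules from the thread: 00 -> 0, 01 -> 1, 10 -> 0, 11 -> illegal.
 * Each group of 16 raw bits becomes one byte, MSB first.
 * Returns the number of complete bytes written to 'out',
 * or -1 as soon as an illegal 11 pair is met. */
int decode_pairs(const char *bits, size_t offset,
                 unsigned char *out, size_t max_bytes)
{
    size_t len = strlen(bits);
    size_t nbytes = 0;

    assert(out != NULL);

    while (nbytes < max_bytes && offset + 16 <= len) {
        unsigned char byte = 0;
        for (int i = 0; i < 8; i++) {
            char first = bits[offset + 2 * (size_t)i];
            char second = bits[offset + 2 * (size_t)i + 1];
            if (first == '1' && second == '1')
                return -1;               /* 11 is illegal */
            /* 00 and 10 both decode to 0, 01 decodes to 1,
             * so the decoded bit is simply the second bit of the pair. */
            byte = (unsigned char)((byte << 1) | (second == '1'));
        }
        out[nbytes++] = byte;
        offset += 16;
    }
    return (int)nbytes;
}
```

Called on the sample string as originally posted with `offset` set to 6, this should give three bytes: 4E 4E 14.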

Reply to
Ed

Please don't top-post.

That last bit about dual patterns for zero is what I missed.

Anyway, what's the issue? How to code something to do a post-process search for 4E 4E A1? How to code something to sync up to 4E 4E A1 in realtime as the data streams in?

--
Dan Henry

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Reply to
Dan Henry

The decode can be done either way. Decoding it on the fly would be faster I think. But, I am able to store the data first then parse through it. Thanks for any help.
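For the on-the-fly case, one possible sketch is a 48-bit shift register that is checked after every incoming raw bit. It relies on the observation that with 00 = 0, 01 = 1, 10 = 0 the decoded bit is always the second bit of its pair. The names `sync_detector`, `sync_init`, `sync_feed` and `sync_scan` are hypothetical, and the bit source is assumed to be an ASCII '0'/'1' string purely for demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_BITS 48u
#define WINDOW_MASK 0xFFFFFFFFFFFFULL  /* low 48 bits */
#define DATA_MASK   0x555555555555ULL  /* second (data) bit of each pair */

typedef struct {
    uint64_t window;  /* last 48 raw bits, newest in bit 0 */
    uint64_t target;  /* 24-bit pattern with each bit spread into a pair */
    unsigned count;   /* raw bits seen so far, capped at 48 */
} sync_detector;

/* Prepare the detector for a 24-bit decoded pattern such as 0x4E4EA1. */
void sync_init(sync_detector *d, uint32_t pattern24)
{
    assert(d != NULL);
    d->window = 0;
    d->count = 0;
    d->target = 0;
    for (int i = 23; i >= 0; i--)
        d->target = (d->target << 2) | ((pattern24 >> i) & 1u);
}

/* Shift in one raw bit. Returns 1 when the last 48 raw bits, taken as
 * 24 pairs ending at this bit, contain no 11 pair and decode to the pattern. */
int sync_feed(sync_detector *d, int bit)
{
    d->window = ((d->window << 1) | (uint64_t)(bit & 1)) & WINDOW_MASK;
    if (d->count < WINDOW_BITS)
        d->count++;
    if (d->count < WINDOW_BITS)
        return 0;
    uint64_t pairs_11 = d->window & (d->window >> 1) & DATA_MASK;
    return pairs_11 == 0 && (d->window & DATA_MASK) == d->target;
}

/* Demo driver: index of the raw bit that completes the first match, or -1. */
long sync_scan(const char *bits, uint32_t pattern24)
{
    sync_detector d;
    sync_init(&d, pattern24);
    for (size_t i = 0; bits[i] != '\0'; i++)
        if (sync_feed(&d, bits[i] == '1'))
            return (long)i;
    return -1;
}
```

Because the test runs on every bit, both pair alignments are tried automatically. The raw bit index of a match tells you where the pair boundaries are from then on.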

Reply to
Ed

You worry about top posting and don't edit?!?

Interesting. But not an example of top posting. It's an example of reversal of an interspersed reply, used, I assume, because Q1 Q2 A1 A2 isn't "order in which people normally read text" either. A bit disingenuous, no?

Ponder why someone may already have read the question and may want to see just the answer and skip to the next message, noting that in text groups usenet is reasonably reliable these days.*

Ponder why editing down to just the highlight of the question and doing an interspersed reply might also work nicely.

Ponder whether the fact that the bottom-post "rule" lets you jump on a supposedly less 1337 individual is really a sufficient reason for its existence.

3ch

*I know several people who read with speech interfaces and -really- prefer not to have to read the same message over and over to get to the replies, in case this one is stumping you.

Reply to
colonel_hack


My newsserver is having problems, so I've resorted to posting from Google Groups.

If you can post-process it, I'd first decode the bit pairs, then shift the pattern across the resulting decoded bits. The following code should give you the gist of it. The code is not portable, assumes little-endian byte ordering, could be optimized, etc.

Do notice that of the 4E 4E A1 that you saw, I don't find the A1 by hand and neither does the program. However 4E 4E 14 is present.

-Dan Henry

#include #include

/* int find_bit_pattern(buffer, length, pattern, width)
 *
 * The function searches 'length' bits of 'buffer' bit-by-bit for the
 * 'width'-bit 'pattern' ('width' must be
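A minimal sketch of the kind of bit-by-bit search described above, not the posted code itself. The signature mirrors the comment's `find_bit_pattern(buffer, length, pattern, width)`. The MSB-first bit order, the return convention and the `width` <= 32 limit are assumptions of this sketch. It expects `buffer` to hold the already decoded bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return bit 'i' of 'buffer', counting from the MSB of buffer[0]. */
static unsigned get_bit(const unsigned char *buffer, size_t i)
{
    return (buffer[i / 8] >> (7 - i % 8)) & 1u;
}

/* Search the first 'length' bits of 'buffer' for the 'width'-bit 'pattern'.
 * Returns the bit offset of the first match, or -1 if there is none. */
long find_bit_pattern(const unsigned char *buffer, size_t length,
                      uint32_t pattern, unsigned width)
{
    assert(width >= 1 && width <= 32);

    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t window = 0;

    for (size_t i = 0; i < length; i++) {
        /* Slide the window one bit to the left and append the next bit. */
        window = ((window << 1) | get_bit(buffer, i)) & mask;
        if (i + 1 >= width && window == (pattern & mask))
            return (long)(i + 1 - width);
    }
    return -1;
}
```

Since the pair boundary is unknown, the raw stream would have to be decoded twice (starting at raw bit 0 and at raw bit 1) and each decoded buffer searched separately.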

Reply to
google

Is it correct that the bit pattern 11 is always illegal, but otherwise 10 is the preferred encoding for 0 (i.e. clear 10 cannot be encoded 01 10 and clear 00 cannot be encoded 00 00)? If that is true, then I believe you only need to look for a single special 5-bit pattern (01000) to find the incorrect bit boundary.

If the last encoded digit was a zero then the last digit of the encoded stream is also a zero and the encoding would be

00->1010
01->1001
10->0100
11->0101

starting after a one

00->0010
01->0001
10->0100
11->0101

so the legal 5-bit patterns in the encoded stream are

01010 01001 00100 00101
10010 10001 10100 10101

8 out of 32 possible patterns, so there are 24 illegal 5-bit patterns. Find one and you're off on your bit boundaries. 0 10 00 is -not- on the list but it is in a shifted position in 4E 4E A1:

4E 4E A1 encodes as

xx 01 00 10 01 01 01 00 10 01 00 10 01 01 01 00 01 00 01 00 10 10 10 01
^^ ^^ ^

so finding 01 00 01 confirms your current bit boundary while finding 10 10 00 denies it, and one or the other must be present if 4E 4E A1 is in the unencoded version.

Double check the conversions as I did them at the keyboard but with 3/4 of the 5 bit combos illegal I'll bet there's one in the sample string even if I've glitched the conversion.

3ch
Reply to
colonel_hack

. ^^ ^^ ^

The marker was wrong. 01 00 0 occurs later. So it is 01 00 01 which confirms, and 10 10 00 which denies, correct alignment.

3ch
Reply to
colonel_hack

I see I made an error in copying the 3rd sequence.

010010 1001001001010100 - 1001001001010100 - 1000100100010010 - 10

Should have been

010010 1001001001010100 - 1001001001010100 - 0100010010001001 - 10

Sorry 'bout that.
Reply to
Ed

01 01 01 01 01 is legal
10 10 10 10 10 is legal
10 00 01 is legal
11 is illegal
00 00 00 is illegal

Actual coding rules:

1 = 01
0 = 10 if following a 0 data bit
0 = 00 if following a 1 data bit
Reply to
Ed


Please don't top-post.

Anyway, the code modifications to the test raw serial input and the search pattern should be obvious. However, I'd bet that the program will output an error notification because of the back-to-back one-one (11) above, which you said is illegal. You must know something about special conditions (e.g., 1-1) that makes it a legal illegal pair then.

--
Dan Henry

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Reply to
google

Not under the encoding assumptions I listed.

How do you know when to encode 0 as 00 and when as 10? If you know that rule you can almost certainly find a pair of fairly short bit patterns to sync the bits.

Your example encoded string had no strings of more than three zeros in a row and no pairs of ones. In fact "11" was referred to as illegal, not merely "unused". When I've seen funny schemes like this it's often been to ensure a transition at least every so often to keep things in sync (e.g. RLL for a hard drive), so it seems to make some sense.

If the rules say use 10 unless it would make a 11 in the encoded stream then x0 10 00 will never occur, but 01 00 0x -will- occur if your sample 3 bytes are encoded.

So the question is: is it OK to encode 10010 as 0100000110 or not? Do you have a sample of data? Is there -ever- a 11? A 0000? If not, then I strongly suspect these or similar rules are in effect and you can sync up without even looking at the decoded data. And the sample data could occur in both the true and off-by-half-a-bit streams, so merely looking at the decoded data can fail to identify the correct stream.
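The "is there ever a 11? a 0000?" question can be checked mechanically. A small sketch, assuming the raw data is available as an ASCII '0'/'1' string (the names `longest_run` and `obeys_run_limits` are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Length of the longest run of 'symbol' ('0' or '1') in the raw string. */
size_t longest_run(const char *raw, char symbol)
{
    assert(raw != NULL);

    size_t best = 0, current = 0;
    for (size_t i = 0; raw[i] != '\0'; i++) {
        current = (raw[i] == symbol) ? current + 1 : 0;
        if (current > best)
            best = current;
    }
    return best;
}

/* 1 if the stream never contains "11" or "0000", otherwise 0. */
int obeys_run_limits(const char *raw)
{
    return longest_run(raw, '1') <= 1 && longest_run(raw, '0') <= 3;
}
```

This check does not depend on where the pair boundaries are, so it can be run over the whole 16k capture before any decoding.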

3ch
Reply to
colonel_hack

A 10010 would encode as 0100100100.
Note that both 10 and 00 = 0.

The rules stated that:

1 = 01
0 = 00 if following a 1
0 = 10 if following a 0
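The three rules can be sketched as a small encoder. The function name `encode_bits`, the ASCII '0'/'1' representation and the `prev` parameter (the data bit that came before the first one, which has to be supplied by the caller) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode a string of ASCII '0'/'1' data bits into raw bit pairs:
 *   1 -> 01
 *   0 -> 00 if the previous data bit was 1
 *   0 -> 10 if the previous data bit was 0
 * 'prev' is the data bit preceding data[0].
 * 'out' must have room for 2 * strlen(data) + 1 characters.
 * Returns the number of raw bits written. */
size_t encode_bits(const char *data, int prev, char *out, size_t out_size)
{
    size_t n = strlen(data);
    size_t o = 0;

    assert(out_size >= 2 * n + 1);

    for (size_t i = 0; i < n; i++) {
        int bit = (data[i] == '1');
        if (bit) {
            out[o++] = '0'; out[o++] = '1';
        } else if (prev) {
            out[o++] = '0'; out[o++] = '0';
        } else {
            out[o++] = '1'; out[o++] = '0';
        }
        prev = bit;
    }
    out[o] = '\0';
    return o;
}
```

With `prev` = 0, the data bits 10010 come out as 0100100100 and 01001110 (4E) comes out as 1001001001010100, matching the hand-worked values in the thread.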
Reply to
Ed

OK. I missed that you'd clarified. Sorry.

So if you have 3 bits to encode they would encode as (x means I can't tell without the previous data)

000 - x0 10 10
001 - x0 10 01
010 - x0 01 00
011 - x0 01 01
100 - x1 00 10
101 - x1 00 01
110 - x1 01 00
111 - x1 01 01

Since the unencoded three bit patterns are all the possible patterns, then the resulting patterns in the last 5 bits are the only possible ones that can occur. But there are 32 possible unrestricted 5 bit patterns. That means there are 24 that cannot occur in an encoded message. Listing and noting that four 0's in a row cannot occur:

00000 - 0000 rule
00001 - 0000 rule
00010 -
00011 - 11 rule
00100 - OK
00101 - OK
00110 - 11 rule
00111 - 11 rule
01000 -
01001 - OK
01010 - OK
01011 - 11 rule
01100 - 11 rule
01101 - 11 rule
01110 - 11 rule
01111 - 11 rule
10000 - 0000 rule
10001 - OK
10010 - OK
10011 - 11 rule
10100 - OK
10101 - OK
10110 - 11 rule
10111 - 11 rule
11000 - 11 rule
11001 - 11 rule
11010 - 11 rule
11011 - 11 rule
11100 - 11 rule
11101 - 11 rule
11110 - 11 rule
11111 - 11 rule

leaves only two that might occur in a bit-shifted stream but that cannot occur in the proper one. I found one of these in the bit-shifted version of your sample data, so you know it must exist in the "wrong" stream.
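The table can be double-checked by brute force. The sketch below encodes every 3-bit data value under the stated rules, collects the last 5 encoded bits, then looks for 5-bit values that contain neither 11 nor 0000 yet never show up. The function names are invented for this example, and each result is returned as a 32-bit mask in which bit k stands for the 5-bit value k.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one data bit given the previous data bit; returns the 2-bit pair. */
static unsigned encode_pair(unsigned bit, unsigned prev)
{
    if (bit)
        return 1u;              /* 1 -> 01 */
    return prev ? 0u : 2u;      /* 0 -> 00 after a 1, 10 after a 0 */
}

/* Mask of 5-bit values that can appear as
 * [last bit of a pair][pair][pair] in a correctly encoded stream. */
uint32_t legal_window_mask(void)
{
    uint32_t mask = 0;
    for (unsigned data = 0; data < 8; data++) {
        unsigned b0 = (data >> 2) & 1u;
        unsigned b1 = (data >> 1) & 1u;
        unsigned b2 = data & 1u;
        /* The last bit of any pair equals its data bit, so b0 itself
         * is the leading bit of the window. */
        unsigned window = (b0 << 4) | (encode_pair(b1, b0) << 2)
                        | encode_pair(b2, b1);
        mask |= (uint32_t)1 << window;
    }
    return mask;
}

/* Mask of 5-bit values not excluded by the 11 rule or the 0000 rule
 * that still never occur in the aligned position. */
uint32_t sync_candidate_mask(void)
{
    uint32_t legal = legal_window_mask();
    uint32_t candidates = 0;
    for (unsigned v = 0; v < 32; v++) {
        unsigned inv = ~v & 0x1Fu;
        int has_11 = (v & (v >> 1)) != 0;
        int has_0000 = (inv & (inv >> 1) & (inv >> 2) & (inv >> 3)) != 0;
        if (!has_11 && !has_0000 && !((legal >> v) & 1u))
            candidates |= (uint32_t)1 << v;
    }
    return candidates;
}
```

Under these rules the candidate mask should have exactly two bits set, for 00010 and 01000, which are the two unannotated rows in the list.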

I believe that unless you know quite a bit about the data (like that your sample data cannot occur due to shifting of other data), merely scanning the decoded data will not work.
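One way to act on this without decoding anything is sketched below. It assumes the strict rules hold throughout the stream, in which case 01000 can only begin on a pair boundary (01 00 0x) and the same goes for 00010 (00 01 0x). The function name `guess_pair_parity` and its return codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scan an ASCII '0'/'1' raw stream for the two telltale patterns.
 * Returns 0 if pairs appear to start on even raw bit indices,
 *         1 if they appear to start on odd indices,
 *        -1 if neither pattern was found,
 *        -2 if the evidence is contradictory. */
int guess_pair_parity(const char *raw)
{
    assert(raw != NULL);

    size_t len = strlen(raw);
    int seen[2] = { 0, 0 };

    for (size_t i = 0; i + 5 <= len; i++) {
        if (strncmp(raw + i, "01000", 5) == 0 ||
            strncmp(raw + i, "00010", 5) == 0)
            seen[i % 2] = 1;    /* pattern start marks a pair boundary */
    }
    if (seen[0] && seen[1])
        return -2;
    if (seen[0])
        return 0;
    if (seen[1])
        return 1;
    return -1;
}
```

On the sample string as originally posted this sketch reports contradictory evidence (-2). That would fit Ed's note that 10 00 01 is legal, so real data may need known exceptions masked out before the parity vote means anything.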

3ch
Reply to
colonel_hack
