Open source Verilog BCH encoder/decoder

As part of my research, I needed a BCH encoder/decoder engine. Sadly, no such thing has existed under a permissive license. Even more depressing is that many students seem to submit Verilog or VHDL engines as a project (or even research), but never release anything that is usable.

Anyway, I'm releasing a BSD-licensed Verilog BCH encoder/decoder. It offers:

* Parallel input/output
* Modular components that can be shared across multiple decoders
* Automatic selection of BCH parameters based on data size and errors to be corrected
* Specialized error locators for 1 error and 2 error codes
* Parallel or serial error polynomial generator for codes with 2 or more errors

https://github.com/russdill/bch_verilog

I'm releasing this under BSD because I'd like to see the code used as widely as possible, but I'd still like to get feedback and hopefully improvements merged back in.

As an example, a decoder for a 512 byte data block that corrects up to 12 errors with an 8 bit wide input and an 8 bit wide output currently occupies 1635 slices and operates at up to 191 MHz on a Virtex-6 LX240T-3. Such a decoder would take input for 532 clock cycles (512 data bytes, 20 ECC bytes), calculate for about 28 clock cycles, and then produce output for 512 clock cycles.
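As a rough cross-check of those cycle counts, here is a minimal Python sketch. It assumes the usual binary BCH bound of m*t parity bits, with m the smallest field order for which the shortened code fits; the function names are hypothetical and are not taken from the repository, and the 28-cycle calculation time is simply the figure quoted above.

```python
def bch_field_order(data_bits, t):
    """Smallest m with 2**m - 1 >= data_bits + m*t.

    Assumption: the code uses m*t parity bits, which is the usual
    upper bound for a binary BCH code correcting t errors.
    """
    m = 2
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return m


def decoder_cycles(data_bytes, t, bits_per_clock=8, calc_cycles=28):
    """Estimate input/calculate/output cycles for one block."""
    data_bits = data_bytes * 8
    m = bch_field_order(data_bits, t)
    ecc_bits = m * t
    # Ceiling division: partial words still cost a full clock cycle.
    data_cycles = -(-data_bits // bits_per_clock)
    ecc_cycles = -(-ecc_bits // bits_per_clock)
    return {
        "m": m,
        "ecc_bits": ecc_bits,
        "input_cycles": data_cycles + ecc_cycles,
        "calc_cycles": calc_cycles,  # taken from the post, not derived
        "output_cycles": data_cycles,
    }


if __name__ == "__main__":
    print(decoder_cycles(512, 12))
```

For 512 data bytes and T=12 this gives m=13 and 156 parity bits, which round up to 20 ECC bytes and 532 input cycles, in line with the numbers in the post.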

The code currently compiles on Icarus Verilog (latest git) and Xilinx XST/ISim (tested with 14.5).

Re: Open source Verilog BCH encoder/decoder
The link has expired. Can you share it again? I am also doing research on BCH encoders and decoders.

Thank You.

Re: Open source Verilog BCH encoder/decoder




https://www.google.com/search?as_q=bch+encoder&as_epq=&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=&as_occt=any&safe=images&tbs=&as_filetype=&as_rights=&gws_rd=ssl


Re: Open source Verilog BCH encoder/decoder
On Friday, January 2, 2015 8:19:22 PM UTC-8, snipped-for-privacy@gmail.com wrote:

Which link is expired? The only link in the post is to github, which is fine.

Re: Open source Verilog BCH encoder/decoder
On Monday, June 23, 2014 at 1:35:52 AM UTC-7, Russell Dill wrote:


Awesome code, thanks for making it available!

Simulations run great on Icarus.

However, I'm having trouble with execution time on XST.

On a Thinkpad T540P with 16 GB DDR3, a DATA_BITS=1024,T=8,BITS=8 build is still synthesizing after 12 hours (note that 1 of 4 CPUs is fully utilized, so it's not a case of continual memory swapping).

What compute capabilities did you use for DATA_BITS=4096,T=12,BITS=16 on XST? How long did it take?

I am using -loop_iteration_limit 2048 (for my case, 1024 barfs out) and -opt_level 2.


Thanks!

Re: Open source Verilog BCH encoder/decoder
On Sunday, May 10, 2015 at 9:33:19 AM UTC-7, snipped-for-privacy@gmail.com wrote:


I've pushed some updates related to corner cases and syndrome computation. Go ahead and pull, then give it another try.

The main thing that will make XST run "forever" is swapping due to lack of RAM. For synthesizing single channel decoders, I'd recommend at least 16GB. For multi-channel, 32GB.

Re: Open source Verilog BCH encoder/decoder
On Sunday, May 10, 2015 at 2:42:36 PM UTC-7, Russell Dill wrote:




Thanks, Russell.

I believe you've only changed the Makefile, yes?

What's your approximate synthesis time for sim.v with DATA_BITS=1024,T=8,BITS=8?

I'm not sure what you mean by single / multiple channels.

Cheers,

-Paul

Re: Open source Verilog BCH encoder/decoder
On Sunday, May 10, 2015 at 3:23:32 PM UTC-7, snipped-for-privacy@gmail.com wrote:

No, you can see the full list of changes here:

https://github.com/russdill/bch_verilog/compare/cfd444733f...cee257ae47



tb_sim.v and sim.v were not intended to be synthesizable.  



By multi-channel I mean running multiple decoders in parallel.

Re: Open source Verilog BCH encoder/decoder


Sorry for the imprecise question. My first pull was a week ago, so I meant since then. I believe only the Makefile has changed, but I may be wrong again :)


Sure, bad question again. I hacked your sim.v (and unfortunately kept the same name ...) to contain bch_syndrome, bch_errors_present, bch_sigma_bma_parallel and bch_error_tmec (+ hook-ups ... etc).

If I try to synthesize the whole thing with T=12, DATA_BITS=4096 I hit a wall (on a 16GB machine, 1 BCH channel only).

So I narrowed it down:

bch_syndrome, bch_errors_present and bch_sigma_bma_parallel all synthesize individually in 10 minutes or fewer and use < 2GB DDR, even if I use T=64, DATA_BITS=8192 (>5x the T and 2x the DATA_BITS of the above).

However, bch_error_tmec ALONE, with only T=12, DATA_BITS=4096, takes 1+1/2 hours and reaches 7 GB DDR utilization with a funny pattern of slow ramp-ups and sharp declines. It doesn't use up all the available DDR though, i.e., there are 5 or more GB available at all times.

T=64, DATA_BITS=8192 barfs.

I tried chien separately and got the same result as with bch_error_tmec.


Do you expect chien to be so much harder to synthesize than all the rest? From my past experience with Reed-Solomon I sort of expected Berlekamp-Massey and Chien to be of somewhat comparable complexity.


Thanks!

-Paul


Re: Open source Verilog BCH encoder/decoder
On Monday, May 11, 2015 at 6:13:58 PM UTC-7, snipped-for-privacy@gmail.com wrote:

You are looking at the dates of the commits. The commits happened a while ago, but were only recently pushed.


Here's xilinx_error_tmec with PIPELINE_STAGES=2, DATA_BITS=4096, T=12, BITS=8, REG_RATIO=8

xst:
524.43user 4.14system 8:42.02elapsed 101%CPU (0avgtext+0avgdata 13612104maxresident)k
0inputs+26104outputs (0major+4229163minor)pagefaults 0swaps

map:
11.68user 0.12system 0:11.81elapsed 100%CPU (0avgtext+0avgdata 581040maxresident)k
0inputs+184outputs (0major+155622minor)pagefaults 0swaps

So XST is using nearly 14GB. Depending on your machine's configuration, 16GB of physical memory would not have been enough. Incidentally, the decoder uses 612 slices and runs at 200MHz or more.
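For anyone comparing their own runs, the peak memory can be pulled out of that output mechanically. The sketch below assumes the default GNU time output format shown above, where the maxresident field is reported in kilobytes; the helper names are hypothetical.

```python
import re


def max_resident_kib(time_output):
    """Extract the peak resident set size (in kilobytes) from GNU time output."""
    match = re.search(r"(\d+)maxresident\)k", time_output)
    if match is None:
        raise ValueError("no maxresident field found")
    return int(match.group(1))


def kib_to_gib(kib):
    """Convert kilobytes (1024 bytes each) to GiB."""
    return kib / (1024 * 1024)


if __name__ == "__main__":
    xst = ("524.43user 4.14system 8:42.02elapsed 101%CPU "
           "(0avgtext+0avgdata 13612104maxresident)k")
    peak = max_resident_kib(xst)
    print(f"peak RSS: {peak} KiB = {kib_to_gib(peak):.2f} GiB")
```

On the xst line above this reports roughly 13 GiB of peak resident memory, which leaves little headroom on a 16GB machine once the OS and other processes are counted.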



The way I'm dynamically compiling the chien modules is giving XST a hard time. When going bit-parallel, though, you do need a lot of parallel multipliers for T=64 (some 512 of them). I've created quite a few variants to get the most out of XST, each variant with its own strengths and weaknesses. If you change the multiplier in bch_chien_expand from a parallel_standard_multiplier_const1 to a parallel_standard_multiplier, you can save some memory, but not the orders of magnitude required.
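To make the multiplier count concrete, here is a small software model of a Chien search. It is only a sketch under stated assumptions: it uses GF(2^4) with the primitive polynomial x^4 + x + 1 instead of the 13-bit field discussed in this thread, it checks one position per step, and none of the names come from the repository. Each locator coefficient is repeatedly multiplied by a fixed power of alpha, so a hardware version needs one constant multiplier per coefficient for every position it checks per clock, which is consistent with T x BITS = 64 x 8 = 512.

```python
def build_gf(m, poly):
    """Build exp/log tables for GF(2**m) from a primitive polynomial."""
    n = (1 << m) - 1
    exp = [0] * (2 * n)
    log = [0] * (1 << m)
    x = 1
    for i in range(n):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & (1 << m):
            x ^= poly
    for i in range(n, 2 * n):  # duplicate so log sums need no modulo
        exp[i] = exp[i - n]
    return exp, log, n


def gf_mul(a, b, exp, log):
    if a == 0 or b == 0:
        return 0
    return exp[log[a] + log[b]]


def locator_from_positions(positions, exp, log):
    """sigma(x) = product over error positions p of (1 + alpha**p * x)."""
    sigma = [1]
    for p in positions:
        nxt = sigma + [0]
        for i in range(1, len(nxt)):
            nxt[i] ^= gf_mul(exp[p], sigma[i - 1], exp, log)
        sigma = nxt
    return sigma


def chien_search(sigma, exp, log, n):
    """Return positions j where sigma(alpha**-j) == 0."""
    terms = list(sigma)
    found = []
    for j in range(n):
        acc = 0
        for term in terms:  # sum all terms and compare with zero
            acc ^= term
        if acc == 0:
            found.append(j)
        # One constant multiplier per coefficient: term i times alpha**-i.
        terms = [gf_mul(term, exp[(n - i) % n], exp, log)
                 for i, term in enumerate(terms)]
    return found


if __name__ == "__main__":
    exp, log, n = build_gf(4, 0b10011)
    sigma = locator_from_positions([3, 10], exp, log)
    print(chien_search(sigma, exp, log, n))
```

The inner update is the part that maps to the constant multipliers; the XOR reduction followed by the zero test is the summing stage.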

Here's an example at: PIPELINE_STAGES=1, DATA_BITS=4096, T=32, BITS=8, REG_RATIO=8

xst:
1925.94user 9.58system 31:49.83elapsed 101%CPU (0avgtext+0avgdata 28960864maxresident)k
565608inputs+50552outputs (353major+8315051minor)pagefaults 0swaps

map:
13.37user 0.15system 0:13.60elapsed 99%CPU (0avgtext+0avgdata 613264maxresident)k
2104inputs+208outputs (3major+163868minor)pagefaults 0swaps

So you can see that in this case xst is using about 29GB of memory. I originally targeted the code for around T=3 to T=16. If you want to get up to T=64, you'll likely need to figure out why XST is taking so much memory synthesizing the multipliers, likely by paring this down bit by bit.

Additionally, the number of pipeline stages in error tmec is limited to 2 right now. You might need to go higher to get 64 13-bit terms summed together and compared with zero.


Re: Open source Verilog BCH encoder/decoder
Hi Russell!
First of all, thank you very much for this amazing tool. It works wonders and is very well coded.
I do have a question and hope you can help me. For my project, I need to have a specific codeword length. My DATA_BITS size can change, as well as "T", but I do need a specific N, which is 256 bits (32 bytes). So far I cannot achieve this. For different values of DATA_BITS and T I get a codeword length of either 31 or 33 bytes, never 32. I guess this has to do with this piece you wrote in the README: "Note that the number of errors correctable for a given polynomial is sparse. The search function will choose the next highest number of correctable errors rather than trying to move to the next polynomial."

Any advice on how I can work around this?

Thanks again.
Gabriel
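One possible explanation for the 31/33 byte jump can be sketched in a few lines of Python. This is only a model, not the repository's actual parameter search: it assumes the field order m is chosen as the smallest value for which the shortened code fits in 2^m - 1 bits, and that the parity is m*t bits. Under those assumptions a codeword of exactly 256 bits cannot occur, because any total of 256 forces m >= 9, yet the same DATA_BITS and T would already fit with m = 8.

```python
def codeword_bits(data_bits, t):
    """Codeword length when m is the smallest order that fits.

    Assumptions (not taken from the repository): parity is m*t bits
    and m is the smallest value with 2**m - 1 >= data_bits + m*t.
    """
    m = 2
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return data_bits + m * t


def find_exact(n_bits, max_t=32):
    """All (data_bits, t) pairs whose codeword is exactly n_bits long."""
    return [(k, t)
            for t in range(1, max_t + 1)
            for k in range(1, n_bits)
            if codeword_bits(k, t) == n_bits]


if __name__ == "__main__":
    print(codeword_bits(184, 8))  # 23 data bytes, T=8
    print(codeword_bits(192, 8))  # 24 data bytes, T=8
    print(find_exact(256))
```

With T=8, 23 data bytes give 248 bits (31 bytes) and 24 data bytes give 264 bits (33 bytes), matching the behaviour described above, and the search for exactly 256 bits comes back empty. If this model matches what the parameter search does, the closest options would be padding a shorter codeword out to 32 bytes or forcing a larger m, if the code allows that.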
