how do I minimize the logic in this function?

- B
- Brannon
  
posted
18 years ago

Fri, Jan 13, 2006 10:22 PM

I have an async function with six bits in and eight bits out (listed below). I need to minimize the logic usage (in a Virtex2) for this function. It appears that most kmap tools will support six bits in but only one bit out. Anyone have a tool or method they would recommend to help with my problem?

I do it currently with two three bit subtracters (in LUTs, not CarryChain) and a 6bit x 16 element mux (using muxf prims) running as a lookup table. The output of one adder drives the mux S input. It just takes too much time and space.

In  Out    In  Out    In  Out    In  Out
 0    0    16   68    32  138    48    0
 1   66    17   69    33  136    49   66
 2  128    18   65    34  160    50  128
 3    0    19   68    35  138    51    0
 4   64    20   80    36  130    52   64
 5   65    21   81    37  128    53   65
 6   66    22   69    38  136    54   66
 7   64    23   80    39  130    55   64
 8  130    24   64    40  162    56  130
 9  128    25   65    41  160    57  128
10  136    26   66    42  168    58  136
11  130    27   64    43  162    59  130
12    0    28   68    44  138    60    0
13   66    29   69    45  136    61   66
14  128    30   65    46  160    62  128
15    0    31   68    47  138    63    0

Thanks for your time.

- A
- Austin Lesea
  
posted
18 years ago

Fri, Jan 13, 2006 10:42 PM

Brannon,

How about a BRAM? Simple ROM look up table, 6 bits in, 8 bits out.

Austin

- A
- Alex Freed
  
posted
18 years ago

Fri, Jan 13, 2006 10:57 PM

- B
- Brannon
  
posted
18 years ago

Fri, Jan 13, 2006 11:17 PM

Okay: let's suppose I use eight ROM64x1 prims. Is that really fewer resources and/or comparable in speed to a four-bit subtracter plugged into the select (S) input of the 6b x 16el mux with constants on all its data inputs? What resources does a ROM64x1 use?

- A
- Austin Lesea
  
posted
18 years ago

Fri, Jan 13, 2006 11:36 PM

Brannon,

I am proposing you load up a BRAM with the values you need for the addresses you have.

A simple large table lookup. Done in one cycle. It uses one BRAM block, but do you use them anyway? Do you have a spare one?

Sure it is more real estate than just about every other method, but it is fast, and simple. And if you have any unused BRAMs lying about, it is done.

Austin

- B
- Brannon
  
posted
18 years ago

Sat, Jan 14, 2006 12:11 AM

"one cycle" is the whole issue. I don't have any spare cycles. This has to be done asynchronously.

- E
- Eric Smith
  
posted
18 years ago

Sat, Jan 14, 2006 12:56 AM

So split it into eight functions that each take six bits in and produce one bit out. Then minimize each one separately.

Better yet, just write it in an HDL and let the synthesis tools take care of it. For all but the most timing-critical cases, you'll wind up with perfectly acceptable results.
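The first suggestion, splitting the table into eight single-output functions, can be tried without a K-map tool. The sketch below is not from the thread: it copies the table from the original post into a Python list and, for each output bit, lists the input values for which that bit is 1. The names `TABLE` and `minterms_for_bit` are made up for illustration. Each resulting list could then be handed to any six-input, one-output minimizer.

```python
# Truth table copied from the original post:
# list index = 6-bit input value, list entry = 8-bit output value.
TABLE = [
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
    68, 69, 65, 68, 80, 81, 69, 80, 64, 65, 66, 64, 68, 69, 65, 68,
    138, 136, 160, 138, 130, 128, 136, 130, 162, 160, 168, 162, 138, 136, 160, 138,
    0, 66, 128, 0, 64, 65, 66, 64, 130, 128, 136, 130, 0, 66, 128, 0,
]


def minterms_for_bit(table, bit):
    """Return the input values for which output bit `bit` is 1 (hypothetical helper)."""
    return [value_in for value_in, value_out in enumerate(table)
            if (value_out >> bit) & 1]


if __name__ == "__main__":
    # One single-output function per output bit, printed as a minterm list.
    for bit in range(8):
        print(f"out_val[{bit}] is 1 for in_val in {minterms_for_bit(TABLE, bit)}")
```

Some of the eight functions turn out to be very sparse, which is one reason to expect a small result from per-bit minimization or from synthesis.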

- J
- John_H
  
posted
18 years ago

Sat, Jan 14, 2006 1:16 AM

The ROM64x1 uses four 16-bit LUTs, two MUXF5s, and a MUXF6. You would have 6 bits of address with fanouts of 6-24, and your delay would be one LUT plus a MUXF5 and a MUXF6. The total resources: 16 slices.
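The slice figure can be reproduced with a little arithmetic. The sketch below assumes eight ROM64x1 primitives (one per output bit, as proposed earlier in the thread), takes the per-primitive counts from the post above, and assumes two 4-input LUTs per Virtex-II slice with LUTs as the limiting resource. The function name `rom64x1_bank` is invented for illustration.

```python
def rom64x1_bank(outputs=8, luts_per_rom=4, muxf5_per_rom=2,
                 muxf6_per_rom=1, luts_per_slice=2):
    """Estimate resources for a bank of ROM64x1 primitives (illustrative only)."""
    luts = outputs * luts_per_rom
    return {
        "luts": luts,
        "muxf5": outputs * muxf5_per_rom,
        "muxf6": outputs * muxf6_per_rom,
        # Ceiling division: assumes LUTs, not muxes, limit the slice count.
        "slices": -(-luts // luts_per_slice),
    }


if __name__ == "__main__":
    print(rom64x1_bank())
```

Under those assumptions the estimate is 32 LUTs packed into 16 slices, which agrees with the total quoted above.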

- A
- austin
  
posted
18 years ago

Sat, Jan 14, 2006 2:30 AM

Brannon,

Ah. I see. No clock.

Seems strange that somewhere in this whole design there is no clock that tells you when the data is valid, but then, it isn't something I am working on.

Even a strobe that tells you the address is valid could be used to clock the BRAM....

But, if you are doing something totally asynchronous, I will bow out immediately.

Austin

- B
- Bob Perlman
  
posted
18 years ago

Sat, Jan 14, 2006 2:39 AM

Hi -

It's easy to try out. Here's an inelegantly-written Verilog module:

module comb_function (
    // Outputs
    out_val,
    // Inputs
    in_val
);

//-----FPGA I/O

output [7:0] out_val;
input  [5:0] in_val;

reg [7:0] out_val;

// Purely combinational lookup. All 64 input values are listed,
// so every input selects an explicit output value.
always @(in_val)
    case (in_val)
         0: out_val = 0;
         1: out_val = 66;
         2: out_val = 128;
         3: out_val = 0;
         4: out_val = 64;
         5: out_val = 65;
         6: out_val = 66;
         7: out_val = 64;
         8: out_val = 130;
         9: out_val = 128;
        10: out_val = 136;
        11: out_val = 130;
        12: out_val = 0;
        13: out_val = 66;
        14: out_val = 128;
        15: out_val = 0;
        16: out_val = 68;
        17: out_val = 69;
        18: out_val = 65;
        19: out_val = 68;
        20: out_val = 80;
        21: out_val = 81;
        22: out_val = 69;
        23: out_val = 80;
        24: out_val = 64;
        25: out_val = 65;
        26: out_val = 66;
        27: out_val = 64;
        28: out_val = 68;
        29: out_val = 69;
        30: out_val = 65;
        31: out_val = 68;
        32: out_val = 138;
        33: out_val = 136;
        34: out_val = 160;
        35: out_val = 138;
        36: out_val = 130;
        37: out_val = 128;
        38: out_val = 136;
        39: out_val = 130;
        40: out_val = 162;
        41: out_val = 160;
        42: out_val = 168;
        43: out_val = 162;
        44: out_val = 138;
        45: out_val = 136;
        46: out_val = 160;
        47: out_val = 138;
        48: out_val = 0;
        49: out_val = 66;
        50: out_val = 128;
        51: out_val = 0;
        52: out_val = 64;
        53: out_val = 65;
        54: out_val = 66;
        55: out_val = 64;
        56: out_val = 130;
        57: out_val = 128;
        58: out_val = 136;
        59: out_val = 130;
        60: out_val = 0;
        61: out_val = 66;
        62: out_val = 128;
        63: out_val = 0;
    endcase

endmodule

The resource usage is:

Mapping to part: xc2v40fg256-4
LUT2   6 uses
LUT3   3 uses
LUT4  11 uses

Synplify estimates an in-to-out delay of 2.865ns, NOT including I/O buffers. Note, too, that I used -4, which should be the lowest speed grade.

Seems pretty fast and cheap.
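For a rough comparison with the ROM64x1 approach discussed earlier, the reported LUT counts can simply be added up. This is a back-of-the-envelope sketch, not part of the original post: it assumes each LUT2, LUT3, or LUT4 occupies one 4-input LUT site, and it takes the ROM figure as eight primitives at four LUTs each, per John_H's description. The variable names are illustrative.

```python
# LUT usage as reported by Synplify in the post above.
synplify_luts = {"LUT2": 6, "LUT3": 3, "LUT4": 11}

# Assumption: every LUT2/LUT3/LUT4 occupies one 4-input LUT site.
total_luts = sum(synplify_luts.values())

# Eight ROM64x1 primitives at four LUTs each (figure from earlier in the thread).
rom_luts = 8 * 4

if __name__ == "__main__":
    print(f"synthesized case statement: {total_luts} LUTs")
    print(f"eight ROM64x1 primitives:   {rom_luts} LUTs")
```

On those assumptions the synthesized version needs 20 LUTs against 32 for the ROM64x1 bank.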

Bob Perlman Cambrian Design Works

On 13 Jan 2006 14:22:20 -0800, "Brannon" wrote:

- B
- Brian Davis
  
posted
18 years ago

Tue, Jan 17, 2006 1:33 AM

Whatever you do, don't ever feed async inputs into a Virtex-II BRAM.

And never clock BRAMs, while enabled, from an unlocked DCM.

See Answer Record 21870 aka "How do I randomly clobber bits in my read only BRAM function"

or the recent thread starting here:

formatting link

Brian