Newbie question: XC3S400 Gate Count

Hi, I am an FPGA newbie. Here's my question: how do you count the gates in a Xilinx FPGA? I have an XC3S400 board, which is supposed to have 400K gates. When I synthesize the following code, the synthesizer reports that 68 LUTs were used out of a total of 7168 LUTs. By my reckoning that is about 1% of the 400K gates, or about 4000 gates, used to synthesize my code. However, my code only uses 150 "AND" gates and 142 "XOR" gates, so why does the XC3S400 take 4000 "gates" to handle it? Is there any setting or trick that needs to be applied in the synthesizer or in the code?

//===============Verilog code=======================//
module GFMul(z,y,x);
output [7:0] z;
input [7:0] y;
input [7:0] x;

reg [7:0] z;

//entry:
reg [7:0] t;
always @(y or x) begin

t[0] = (x[0]&y[0])^(x[1]&y[7])^(x[2]&y[6])^(x[3]&y[5])^(x[4]&y[4])^(x[5]&y[3])^(x[5]&y[7])
      ^(x[6]&y[2])^(x[6]&y[6])^(x[6]&y[7])^(x[7]&y[1])^(x[7]&y[5])^(x[7]&y[6])^(x[7]&y[7]);
t[1] = (x[0]&y[1])^(x[1]&y[0])^(x[2]&y[7])^(x[3]&y[6])^(x[4]&y[5])^(x[5]&y[4])
      ^(x[6]&y[3])^(x[6]&y[7])^(x[7]&y[2])^(x[7]&y[6])^(x[7]&y[7]);
t[2] = (x[0]&y[2])^(x[1]&y[1])^(x[1]&y[7])^(x[2]&y[0])^(x[2]&y[6])^(x[3]&y[5])^(x[3]&y[7])
      ^(x[4]&y[4])^(x[4]&y[6])^(x[5]&y[3])^(x[5]&y[5])^(x[5]&y[7])^(x[6]&y[2])^(x[6]&y[4])
      ^(x[6]&y[6])^(x[6]&y[7])^(x[7]&y[1])^(x[7]&y[3])^(x[7]&y[5])^(x[7]&y[6]);
t[3] = (x[0]&y[3])^(x[1]&y[2])^(x[1]&y[7])^(x[2]&y[1])^(x[2]&y[6])^(x[2]&y[7])^(x[3]&y[0])^(x[3]&y[5])
      ^(x[3]&y[6])^(x[4]&y[4])^(x[4]&y[5])^(x[4]&y[7])^(x[5]&y[3])^(x[5]&y[4])^(x[5]&y[6])^(x[5]&y[7])
      ^(x[6]&y[2])^(x[6]&y[3])^(x[6]&y[5])^(x[6]&y[6])^(x[7]&y[1])^(x[7]&y[2])^(x[7]&y[4])^(x[7]&y[5]);
t[4] = (x[0]&y[4])^(x[1]&y[3])^(x[1]&y[7])^(x[2]&y[2])^(x[2]&y[6])^(x[2]&y[7])^(x[3]&y[1])^(x[3]&y[5])
      ^(x[3]&y[6])^(x[3]&y[7])^(x[4]&y[0])^(x[4]&y[4])^(x[4]&y[5])^(x[4]&y[6])^(x[5]&y[3])^(x[5]&y[4])
      ^(x[5]&y[5])^(x[6]&y[2])^(x[6]&y[3])^(x[6]&y[4])^(x[7]&y[1])^(x[7]&y[2])^(x[7]&y[3])^(x[7]&y[7]);
t[5] = (x[0]&y[5])^(x[1]&y[4])^(x[2]&y[3])^(x[2]&y[7])^(x[3]&y[2])^(x[3]&y[6])^(x[3]&y[7])
      ^(x[4]&y[1])^(x[4]&y[5])^(x[4]&y[6])^(x[4]&y[7])^(x[5]&y[0])^(x[5]&y[4])^(x[5]&y[5])
      ^(x[5]&y[6])^(x[6]&y[3])^(x[6]&y[4])^(x[6]&y[5])^(x[7]&y[2])^(x[7]&y[3])^(x[7]&y[4]);
t[6] = (x[0]&y[6])^(x[1]&y[5])^(x[2]&y[4])^(x[3]&y[3])^(x[3]&y[7])^(x[4]&y[2])^(x[4]&y[6])
      ^(x[4]&y[7])^(x[5]&y[1])^(x[5]&y[5])^(x[5]&y[6])^(x[5]&y[7])^(x[6]&y[0])^(x[6]&y[4])
      ^(x[6]&y[5])^(x[6]&y[6])^(x[7]&y[3])^(x[7]&y[4])^(x[7]&y[5]);
t[7] = (x[0]&y[7])^(x[1]&y[6])^(x[2]&y[5])^(x[3]&y[4])^(x[4]&y[3])^(x[4]&y[7])
      ^(x[5]&y[2])^(x[5]&y[6])^(x[5]&y[7])^(x[6]&y[1])^(x[6]&y[5])^(x[6]&y[6])
      ^(x[6]&y[7])^(x[7]&y[0])^(x[7]&y[4])^(x[7]&y[5])^(x[7]&y[6]);
z = t;

end

endmodule
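
For anyone who wants to cross-check the equations, here is a minimal Python sketch of a reference model. It assumes the module multiplies two bytes in GF(2^8) reduced by x^8 + x^4 + x^3 + x^2 + 1 (0x11D). That polynomial is inferred from the product terms in the listing; the post does not state it, and the function names are purely illustrative. The sketch also counts how many x[i]&y[j] products feed each output bit.

```python
POLY = 0x11D  # assumed reduction polynomial (inferred from the listing, not stated in the post)

def gf_mul(x, y):
    """Shift-and-add multiply of two bytes in GF(2^8) modulo POLY."""
    result = 0
    for _ in range(8):
        if y & 1:
            result ^= x          # add (XOR) the current multiple of x
        y >>= 1
        x <<= 1                  # multiply x by the field element "x"
        if x & 0x100:
            x ^= POLY            # reduce when the degree reaches 8
    return result

def and_terms_per_bit():
    """For each output bit, count the x[i]&y[j] products that feed it."""
    counts = [0] * 8
    for i in range(8):
        for j in range(8):
            product = gf_mul(1 << i, 1 << j)  # x^(i+j) reduced modulo POLY
            for bit in range(8):
                if (product >> bit) & 1:
                    counts[bit] += 1
    return counts

counts = and_terms_per_bit()
total_and = sum(counts)
total_xor = sum(c - 1 for c in counts)  # n terms need n-1 two-input XORs
print(counts, total_and, total_xor)
```

Under that assumption the totals come out to 150 ANDs and 142 two-input XORs, which agrees with the figures quoted in the question.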

//===============Resource Utilization================//
Resource Usage Report for GFMul

Mapping to part: xc3s400tq144-4
LUT2 17 uses
LUT3 9 uses
LUT4 42 uses

I/O primitives: 24
IBUF 16 uses
OBUF 8 uses

I/O Register bits: 0
Register bits not including I/Os: 0 (0%)

Mapping Summary:
Total LUTs: 68 (0%)

Thanks!

Reply to
jerryzy

Because in an FPGA you don't count space in "gates", so don't try. Just use the slice/LUT/FF count.

To me the quoted gate count is more of a marketing argument. And you use many more than you anticipated because in an FPGA you don't always use all the resources inside an LC/LE to their full potential, so the unused ones are wasted.

Sylvain

Reply to
Sylvain Munaut

This nicely shows that whoever writes this reporting software really SHOULD fix the reporting, so that it is more useful than the nonsense 0% for 68 LUTs.

These are large devices, and small benchmarks are often used, so common sense (surely) would say: report % usage with a decimal point?

eg > Total LUTs: 68 (0.949%)
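
As a rough illustration of the suggestion, the Python sketch below formats the utilization with decimals and compares it with an integer percentage. The 68 and 7168 figures come from the original post. The `utilization` helper is hypothetical, and the idea that the tool truncates to a whole percent is an assumption about how "0%" arises, not something the report documents.

```python
def utilization(used, total, decimals=3):
    """Return utilization as a percentage string with the given number of decimals."""
    return f"{100.0 * used / total:.{decimals}f}%"

used_luts, total_luts = 68, 7168  # figures from the original post

# Percentage with a decimal point, as proposed
with_decimals = utilization(used_luts, total_luts)

# Integer percentage (assumed truncation), which hides small designs entirely
truncated = f"{100 * used_luts // total_luts}%"

print(f"Total LUTs: {used_luts} ({with_decimals})")
print(f"Total LUTs: {used_luts} ({truncated})")
```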

-jg

Reply to
Jim Granville

At the risk of being embroiled in the great comp.arch.fpga FPGA gates debate, I will attempt to answer this.

You are mixing up some terms here and there in your analysis. First of all, a gate is considered a 2-input basic function block (AND, OR); a D flip-flop register with clock enable and reset, for instance, would require multiple (6-8) gates to implement. A two-input XOR requires 2 inverters, 2 AND gates and 1 OR gate, or 4-5 gates depending on how you count the inverters (0.5 or 1 gate each).

The FPGA gate counts include not only the logic that can be built in the LUTs, but also the registers, the carry logic, the CLB muxes, the BlockRAMs, the clock DCMs, the IOB registers and the DSP logic (not present in the XC3S400). At the most basic level we count a "Logic Cell" as 12 gates, where a logic cell is a LUT and register combination. So simply taking the 1% LUT utilization and multiplying it by the gate count for the device is not accurate, because the device total includes a wide range of other functions that are 0% utilized in your design.

Your HDL code comprises 8 similar functions, each made of 16-19 2-input ANDs feeding a 16-19-input XOR. The 2-input ANDs are straightforward and would amount to 32-38 gates per function. The XOR functions, however, have to be cascaded as a tree of 2-input XORs, giving 15-17 2-input XORs or 60-85 gates each. In total you have 8 * (32 to 38) + 8 * (60 to 85), or 736-984 gates.
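
The hand calculation can be reproduced with a short Python sketch. The per-function ranges are the ones quoted in this post. The 2 gates per AND is inferred from "32-38 gates" for 16-19 ANDs rather than stated, and all constant and function names are illustrative.

```python
FUNCTIONS = 8             # one function per output bit
ANDS_PER_FN = (16, 19)    # 2-input ANDs per function (low, high), as quoted
XORS_PER_FN = (15, 17)    # 2-input XORs per function (low, high), as quoted
GATES_PER_AND = 2         # inferred: 16-19 ANDs were counted as 32-38 gates
GATES_PER_XOR = (4, 5)    # depends on how the inverters are counted

def hand_estimate():
    """Return the (low, high) gate estimate for the whole design."""
    low = FUNCTIONS * (ANDS_PER_FN[0] * GATES_PER_AND
                       + XORS_PER_FN[0] * GATES_PER_XOR[0])
    high = FUNCTIONS * (ANDS_PER_FN[1] * GATES_PER_AND
                        + XORS_PER_FN[1] * GATES_PER_XOR[1])
    return low, high

low, high = hand_estimate()
print(f"Hand calculated gates: {low}-{high}")
```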

You synthesized the design to 68 LUTs, which at 12 gates/LC would equal 816 gates. But you aren't using registers, so this should be discounted by 6-8 gates per logic cell. I'll pick the lower 6 to get just 6 gates per LUT, or 408 FPGA gates for your design.
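
To put the different estimates side by side, here is a small Python sketch. It contrasts the naive scaling from the original question (LUT fraction times the 400K device total) with the logic-cell method described here. The constants come from the thread; the function names are made up for illustration.

```python
GATES_PER_LOGIC_CELL = 12   # LUT plus register, per this post
REGISTER_GATES = 6          # lower end of the 6-8 gate register estimate

def naive_estimate(used_luts, total_luts, device_gates):
    """Scale the marketing gate count by the fraction of LUTs used."""
    return device_gates * used_luts / total_luts

def logic_cell_estimate(used_luts):
    """Count every used LUT as a full logic cell (LUT plus register)."""
    return used_luts * GATES_PER_LOGIC_CELL

def lut_only_estimate(used_luts):
    """Discount the unused register from each logic cell."""
    return used_luts * (GATES_PER_LOGIC_CELL - REGISTER_GATES)

naive = round(naive_estimate(68, 7168, 400_000))
print(naive, logic_cell_estimate(68), lut_only_estimate(68))
```

The naive figure comes out near 3800 because the 400K device total also covers BlockRAM, DCMs, IOB registers and other resources that this design does not touch.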

In Summary:

Hand Calculated Gates: 736-984
FPGA LUTs: 68
FPGA LUTs as gates: 408

You are getting a really efficient FPGA design in this case, as a single LUT can implement a 4-input XOR directly, which would take 15 gates to implement in an ASIC using three 2-input XORs in a tree.
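
A quick way to see why one LUT is enough: a 4-input LUT is just a 16-entry truth table, so any 4-input function fits, XOR included. The Python sketch below builds that table as a 16-bit value. The bit ordering (bit i holds the output for input pattern i) is an assumption for illustration, although for XOR the result is the same under any input ordering. The helper name is hypothetical.

```python
def lut4_init(func):
    """Build the 16-bit truth table of a 4-input function.

    Bit i of the result is the function's output for input pattern i,
    with input a as the least significant address bit (assumed ordering).
    """
    init = 0
    for i in range(16):
        a, b, c, d = i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1
        if func(a, b, c, d):
            init |= 1 << i
    return init

xor4_init = lut4_init(lambda a, b, c, d: a ^ b ^ c ^ d)

# ASIC-style cost from the post: three 2-input XORs at 5 gates each
xor_tree_gates = 3 * 5

print(f"LUT4 truth table for XOR4: 0x{xor4_init:04X}")
print(f"Equivalent 2-input XOR tree: {xor_tree_gates} gates")
```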

Ed

Reply to
Ed McGettigan

Ed, just out of curiosity, roughly how many transistors are there in an XC3S400? That is, the total in the package, including RAMs, configuration, routing, redundancy, everything?

I'm curious to know how many transistors you pack into a package compared to, say, a 55 million transistor Pentium4.

I know it doesn't actually mean anything to the user - again... just curious.

Thanks, Paul.

Reply to
Paul Marciano

Thanks a lot Ed!

Does that mean that if I want to roughly estimate the ASIC gates in a general design, I can just multiply the number of LUTs by 10-15?

Reply to
jerryzy

We counted transistors for a while internally as we thought that it was an interesting statistic. But, it's actually a very hard problem to handle as our devices are almost 100% full custom and sometimes the legs of a transistor may not be clearly defined or split between different submodules. The auto reporting functions from the CAD tools were also spitting out numbers that were way too high. We stopped doing this in detail for Virtex-4 as there was no real benefit in knowing the exact number except for bragging rights.

We had a press release for the Virtex-II Pro 2VP100 part back in 2003 that stated 430 million transistors:

formatting link

I think that we were estimating about 1 Billion transistors for the Virtex-4 LX200 parts that we are shipping now. I'm not as familiar with the Spartan-III line, but I think that you are looking at about 30-35 Million for the XC3S400 which has a configuration size of 1.7 Mbit.

Ed

Reply to
Ed McGettigan

Check out:

formatting link

And the other articles this week all on using FPGAs to prototype ASICs.

Austin

Reply to
Austin Lesea

It really depends on what your logic is actually doing in the LUT. If it's an XOR the estimate would be accurate; if it's an AND it wouldn't. We have an old XAPP on this subject that you can look through for more info:

formatting link

In my original analysis of your HDL code I didn't include any area optimization for sharing logic functions which is likely happening.

The right way to do this is to resynthesize your design to a target ASIC library and then to use the reported gate counts.

Ed

Reply to
Ed McGettigan

At the FPGA-FAQ web site, I have a page that displays a bunch of statistics about various FPGAs:

formatting link

I just ran it with the above chips that Ed mentioned, and looked at the transistor count estimates:

(Use monospaced font)

Part Number   Ed's TX count    My TX count
              ESTIMATE         ESTIMATE

XC2VP100        430,000,000    554,279,440
XC4VLX200     1,000,000,000    734,982,144
XC3S400          35,000,000     26,043,328

My estimates are based on a complex set of rules using only publicly available information, which is why they are listed as "Transistors (Wild estimate)" on my web site. Ed is much closer to the source of the information. Sometimes my estimates are higher, sometimes lower, than Ed's. Your answer is probably bounded by these values.

I 110% agree with Ed that "there is no real benefit in knowing the exact number"

Cheers, Philip

=================== Philip Freidin snipped-for-privacy@fpga-faq.org Host for

formatting link

Reply to
Philip Freidin
