Quartus-II 8.0 resource-sharing? (why inferred addsub takes 2x LUTs?)

I created a simple Verilog-2001 test-module:

`default_nettype none

module top #(
  parameter integer D_W = 16
) (
  input  wire                  sel,
  input  wire signed [D_W-1:0] ina, inb,
  output wire signed [D_W:0]   out
);

assign out = sel ? (ina - inb ) : (ina + inb);

endmodule // : top

`default_nettype wire

------------------------------------------------

When I ran this module through Xilinx Webpack 10.1, it synthesized as an "addsub" macro. To check, I tested another module which performs addition only, and both modules occupied the same number of LUTs. Furthermore, the Xilinx synthesis report even tells me:

| "Synthesizing Unit .
|     Related source file is "top.v".
|     Found 17-bit addsub for signal .
|     Summary:
|         inferred 1 Adder/Subtractor(s).
| Unit synthesized.
|
| INFO:Xst:1767 - HDL ADVISOR - Resource sharing has identified that some arithmetic operations in this design can share the same physical resources for reduced device utilization. For improved clock frequency you may try to disable resource sharing."
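The add-only module used for the comparison is not shown in the post. A minimal sketch of what such a baseline might look like, assuming the same parameter and port names as top.v with 'sel' dropped (the module name top_add is hypothetical):

```verilog
`default_nettype none

// Hypothetical add-only baseline; the original comparison module
// is not shown, so this is an assumption modeled on top.v.
module top_add #(
  parameter integer D_W = 16
) (
  input  wire signed [D_W-1:0] ina, inb,
  output wire signed [D_W:0]   out
);

// Signed operands are sign-extended to the D_W+1 bit result width.
assign out = ina + inb;

endmodule // : top_add

`default_nettype wire
```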

------------------------------------------------

But I tried the same top.v module in Altera Quartus-II. I manually turned ON 'auto resource sharing' inside the Project Analysis settings.

When I synthesize for a Cyclone-II (2C20), Quartus-II creates an adder unit, plus a bunch of separate muxes in front of port 'inb'. The LUT consumption is TWICE that of a straight adder-only module (34 vs 17).

| Resource Usage (my "addsub")
| Estimated Total logic elements                  34
|
| Total combinational functions                   34
| Logic element usage by number of LUT inputs
|     -- 4 input functions                        0
|     -- 3 input functions                        16
|     --

hlao

Yes, Altera LE-based FPGAs work like this. Pretty much the entire LUT is used as a full adder, so there is not enough left over for the optional ones' complement of the B input that a selectable subtractor needs.

This changed beginning with the Stratix-II ALM design.
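To make the "optional ones' complement of the B input" concrete: a - b equals a + ~b + 1 in two's complement, so a selectable adder/subtractor is an adder with an XOR on every B bit and 'sel' fed in as the carry-in. The sketch below writes that out by hand. The module name addsub_explicit and the internal wire names are hypothetical; the ports mirror top.v. It only illustrates the extra per-bit logic and makes no claim about how any particular synthesizer will map it.

```verilog
`default_nettype none

// Sketch only: hand-written form of the selectable adder/subtractor.
// Module and internal names are hypothetical; ports mirror top.v.
module addsub_explicit #(
  parameter integer D_W = 16
) (
  input  wire                  sel,   // 1 = subtract, 0 = add
  input  wire signed [D_W-1:0] ina, inb,
  output wire signed [D_W:0]   out
);

// Sign-extend both operands to the D_W+1 bit result width.
wire [D_W:0] a_ext = {ina[D_W-1], ina};
wire [D_W:0] b_ext = {inb[D_W-1], inb};

// Optional ones' complement of B: XOR every bit with sel.
wire [D_W:0] b_sel = b_ext ^ {(D_W+1){sel}};

// sel doubles as the carry-in: a + ~b + 1 == a - b when sel = 1,
// and a + b + 0 when sel = 0.
assign out = a_ext + b_sel + sel;

endmodule // : addsub_explicit

`default_nettype wire
```

Each result bit now depends on four signals (ina, inb, sel, and the carry from the bit below), which is the extra input the reply above says an LE in adder mode has no room for.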

But why do you care? How many selectable adder/subtractors are you going to have in your design? Even if it was a lot, Cyclone-II FPGAs have plenty of LUTs.

You would only really care if it made a timing difference. The only way to know this is to run your entire design through each vendor's tools.

There's a good chance that Altera would win this due to a less obvious feature of their 4-input LUTs: namely that the input-to-output delay depends on which pin is used. Some inputs are slow and some are fast (unlike Xilinx LUTs, whose inputs all have the same delay). The tool maps timing-critical signals to the fast inputs.

If you want something more significant to compare between X and A, compare Xilinx LUT-RAMs with Altera M4Ks available in Cyclone-II. The combinatorial read port on the LUT-RAMs (and SRL16s) is sometimes very convenient.
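The difference in read ports shows up directly in how the memory is coded. The following is a sketch only, with hypothetical module, parameter, and port names; it puts both read styles on one array purely to show the coding difference side by side. A real design would normally pick one style per memory, and what each style is mapped to depends on the tool and device.

```verilog
`default_nettype none

// Sketch only: names and sizes are hypothetical.
module ram_read_styles #(
  parameter integer A_W = 4,
  parameter integer D_W = 8
) (
  input  wire           clk,
  input  wire           we,
  input  wire [A_W-1:0] waddr, raddr,
  input  wire [D_W-1:0] din,
  output wire [D_W-1:0] dout_comb,  // follows raddr in the same cycle
  output reg  [D_W-1:0] dout_reg    // updated on the clock edge after raddr
);

reg [D_W-1:0] mem [0:(1<<A_W)-1];

always @(posedge clk) begin
  if (we)
    mem[waddr] <= din;
  // Registered read: the style a synchronous block RAM requires.
  dout_reg <= mem[raddr];
end

// Combinatorial read: no clock involved, the style a LUT-RAM
// can provide directly.
assign dout_comb = mem[raddr];

endmodule // : ram_read_styles

`default_nettype wire
```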

--
/*  jhallen@world.std.com AB1GO */                        /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
