Xilinx fast carry logic path for v5 is described in this document
Thank you, but I have no need to learn the details of the Virtex-5 carry logic at the moment. My point is that in the FPGA technologies I know, the carry chain has a much smaller delay per stage than a normal gate. This means you can do 8- to 16-bit addition in a pipeline running at nearly the maximum technology speed for FF-gate-FF; you may even need more time for multiplexing the data before or after the adder stage than for the addition itself. In ASICs I've only seen carry chains whose per-stage delay is comparable to a normal gate delay. This means that a 16-bit ripple adder won't be able to run anywhere near FF-gate-FF speed but needs something like FF-17xgate-FF. In that case the adder stage will likely dominate your pipeline frequency. You could only speed up a ripple-carry adder by placing it tightly together, but that has to be done for any timing-critical path anyway.
bye Thomas
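To put rough numbers on the FF-gate-FF versus FF-17xgate-FF comparison, here is a minimal sketch in units of one ordinary gate delay. The function name, the assumption of one extra gate delay for getting on and off the chain, and the 0.05 ratio for a fast carry stage are all illustrative assumptions, not figures from any datasheet.

```python
def ripple_stage_delay(bits, carry_per_bit=1.0, entry_exit=1.0):
    """Estimated FF-to-FF logic delay of a ripple adder, in ordinary gate delays.

    carry_per_bit: delay of one carry stage relative to an ordinary gate
                   (1.0 models the ASIC case described above).
    entry_exit:    assumed extra delay for entering the chain and producing
                   the last sum bit (hypothetical value).
    """
    return entry_exit + bits * carry_per_bit


asic = ripple_stage_delay(16)                      # carry stage as slow as a gate
fpga = ripple_stage_delay(16, carry_per_bit=0.05)  # hypothetical fast carry chain
print(asic, fpga)
```

With these assumptions the ASIC-style ripple adder costs 17 gate delays per pipeline stage, while the fast-carry version stays under 2.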
There are also different adder implementations at the synthesis tool level. For example, Design Compiler with the proper libraries supports the following adders: ripple-carry, carry-look-ahead, delay-optimized flexible parallel-prefix, Brent-Kung, conditional-sum and ripple-carry-select.
--Kim
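As a rough illustration of how two of these families differ, the sketch below models a ripple-carry adder and a parallel-prefix adder at the bit level. The prefix version uses the Kogge-Stone arrangement because it is the shortest to write down; it is not Brent-Kung, and it is not meant to represent what Design Compiler actually generates. Function names are made up for this example.

```python
def ripple_carry_add(a, b, width):
    """Bit-serial reference: the carry passes through every bit position."""
    carry, total = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def kogge_stone_add(a, b, width):
    """Parallel-prefix adder (Kogge-Stone form), no carry-in."""
    mask = (1 << width) - 1
    p = (a ^ b) & mask           # per-bit propagate
    G, P = a & b & mask, p       # running group generate / propagate
    d = 1
    while d < width:             # about log2(width) prefix levels
        G |= P & (G << d)        # combine each group with the one d bits below
        P &= P << d
        d <<= 1
    carries = (G << 1) & mask    # carry into bit i = group generate of bits 0..i-1
    return p ^ carries, (G >> (width - 1)) & 1
```

Both functions return (sum, carry_out). The ripple version needs `width` sequential carry steps, the prefix version only about log2(width) levels, at the cost of far more logic and wiring.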
Then I'm not sure what we're discussing here but I'll try one more time.
That's because the "normal gate delay" is so slow, owing to the programmable gates and routing. Normally when one is designing custom adders, a carry ripple adder is the slowest and smallest adder, against which the more sophisticated carry select, carry skip, carry lookahead etc. are judged. One can buy more speed by paying with area and/or power by using one of these architectures. The fact that in an FPGA a dedicated carry ripple adder is the fastest just shows the inefficiency of the fabric. But one gets programmability with that inefficiency, so the compromise usually works out. Actually, what the FPGAs have should be named "dedicated/hard-wired carry ripple logic & routing", as there is not much "fast" about it. What would've been fast is if they had added some carry look-ahead logic.
Within a CLB, there certainly is carry look-ahead. It is abstracted out in the user's guides as an implementation detail that is not visible to the user. Be assured however, that there is a carry look-ahead going on in the physical hardware.
Peter, if I remember correctly there is a TTL IC which has a 4-bit CLA in it. It would fit into the CLB nicely :-)
Are you thinking of the 74181 (=9341 in Fairchild parlance)? Nostalgia from 1970... It's a 4-bit ALU, but that means 22 signal pins in a 24-pin package: 4+4 operand inputs and 4 result outputs, 5 mode controls (one of them to change between logic and arithmetic), carry in, carry out, carry generate, and carry propagate, plus an A=B output thrown in for free. Hard to put into a CLB, unless you define the mode by configuration.
Peter
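For readers who have not met the generate and propagate outputs mentioned above, here is a minimal textbook model of a 4-bit carry look-ahead adder. It is only a sketch: it is not pin-accurate for the 74181 (which has mode controls and its own signal polarities), and the function name is invented for this example.

```python
def cla4(a, b, c0=0):
    """Textbook 4-bit carry look-ahead adder (not a pin-accurate 74181).

    Returns (sum, carry_out, group_generate, group_propagate).
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]     # bit generates
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]   # bit propagates
    # Every carry is a flat function of g, p and c0; none waits on another carry.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[0] & p[1] & p[2] & p[3]
    c4 = G | (P & c0)
    s = sum((p[i] ^ c) << i for i, c in enumerate((c0, c1, c2, c3)))
    return s, c4, G, P
```

The group outputs G and P are what a second-level look-ahead unit would consume to span several 4-bit slices.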
That reminds me of the time I was in one of two row boats racing to shore. I thought I could "help" the race by jumping out and pushing the boat with my swimming. Boy did we suddenly slow down!
A 4-bit CLA in a CLB (wait - is there a CLK? no, no...) is marginally helpful in only the most extreme cases. Very long adders would still need a "generate" signal from each segment in a multi-segment adder (even the 74181 is a 4-bit ALU) when the individual sum is all-1s to go along with the carry from the carry chain. If you wait for the result to decode the generate, you need to get on and off the carry chain and through two levels of logic to detect a 16-bit "generate" or double your counters with A+B and A+B+1 results to come up with a C and G at the same time. What's this gaining? The carry from your 4-bit CLA needs 2 levels of logic to generate the "lookahead" from the 4 segments. A 64 bit counter would need 4 levels of logic plus routing *or* twice the number of adders with 2 levels of logic on top of routing. This is a quick way to make things go very slow.
If you want to accelerate small adders in FPGAs, you can't do it with FPGA logic. The carry chains already have very low propagation delay, though there might be an opportunity to get on and off the carry chains more quickly with focused silicon development, perhaps compromising other performance aspects of the chip to achieve that improved on/off adder delay.
If you want to accelerate very large adders, there are methods that can provide better results than a CLA. You know about carry select, carry skip, etc., so you should know that for small adders there's no help in the FPGA.
Don't believe me? Take a splash. Watch the boat slow down. Synthesis is cheap! Or was the smiley face showing an attempt at humor that I just can't grasp?
If you're suggesting adding dedicated CLA functionality to the FPGA fabric, think of what it takes to produce the generate with the carry and aggregate the signals into the CLA structure. Do you think it could possibly be worth it for 99% of the adders in user designs?
- John_H
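The "A+B and A+B+1" idea mentioned above is the carry-select scheme. A minimal behavioral sketch follows, assuming the width is a multiple of the segment size; the function name and the 4-bit default segment are choices made for this example only.

```python
def carry_select_add(a, b, width, seg=4):
    """Carry-select adder built from `seg`-bit segments.

    Each segment computes both A+B and A+B+1 up front; the incoming
    carry only chooses between the two finished results.
    Assumes `width` is a multiple of `seg`.
    """
    mask = (1 << seg) - 1
    total, carry = 0, 0
    for lo in range(0, width, seg):
        x, y = (a >> lo) & mask, (b >> lo) & mask
        r0, r1 = x + y, x + y + 1      # both candidates, independent of the carry
        r = r1 if carry else r0        # the "select" multiplexer
        total |= (r & mask) << lo
        carry = r >> seg               # carry out of this segment
    return total, carry
```

The model makes the cost visible: every segment needs two adders plus a multiplexer, which is the doubling of hardware referred to in the post.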
Actually I was thinking of the 74182. The resemblance between it and page 193 is quite interesting, no? 4 pairs of inputs in addition to carry from previous block. The outputs need changing a little though.
The XC4000 carry logic was pretty well documented, unless they implemented it completely differently than it was documented.
I would have called it a form of carry select logic using special properties of pass transistors to minimize delay.
The internal details of the current devices are not as well documented, but still the internal carry should be faster than CLB based carry lookahead for most reasonable length adders.
-- glen
The internal carry chain structure had a wholesale change when either Virtex or Virtex-II was introduced (I don't recall which now).
The only thing that really matters is the best combination of logic, circuitry, and transistor technology that achieves the smallest incremental carry delay per bit. And everybody can easily do a static timing analysis to calculate that incremental delay. I think it is about 30 ps per bit.
Peter Alfke, Xilinx
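One simple way to extract the incremental delay per bit is to take static timing results for two adder widths and compute the slope, so that the fixed delay for getting on and off the chain cancels out. The sketch below uses hypothetical timing numbers chosen only to illustrate the arithmetic; they are not from any timing report.

```python
def carry_delay_per_bit(width_a, delay_a_ps, width_b, delay_b_ps):
    """Slope of adder delay versus width; fixed entry/exit delay cancels out."""
    return (delay_b_ps - delay_a_ps) / (width_b - width_a)


# Hypothetical timing-report numbers for a 16-bit and a 64-bit adder.
per_bit = carry_delay_per_bit(16, 1480, 64, 2920)
print(per_bit)  # 30.0 (ps per bit)
```

At about 30 ps per bit, the carry chain of a 64-bit adder would contribute roughly 64 x 30 ps = 1.92 ns.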