Do you have a question? Post it now! No Registration Necessary
 Koen Van Renterghem
December 13, 2006, 1:23 pm
Hi,
I'm trying to devise an efficient method to generate a ones' complement
sum of two 16 bit integers in a single cycle. I'm targeting a Xilinx
device and right now I'm using a 32 bit adder. Suppose I want to add x
and y :
a,b,result : std_logic_vector(15 downto 0);
x,y,result_32 : std_logic_vector(31 downto 0);
a <= x & y;
b <= y & x;
result_32 <= a + b;
result <= result_32(31 downto 16);
Is there a better way to implement this?
I'm trying to devise an efficient method to generate a ones' complement
sum of two 16 bit integers in a single cycle. I'm targeting a Xilinx
device and right now I'm using a 32 bit adder. Suppose I want to add x
and y :
a,b,result : std_logic_vector(15 downto 0);
x,y,result_32 : std_logic_vector(31 downto 0);
a <= x & y;
b <= y & x;
result_32 <= a + b;
result <= result_32(31 downto 16);
Is there a better way to implement this?
Re: Ones' complement addition
You need the endaround carry. It is pretty easy with a carry
lookahead adder to wire the carry out back to the carry in.
(snip)
You should only need a 16 bit adder with carry in/carry out.
How about:
assign = a+b+carry;
The worst case is where the carry propagates 16 bits.
(Well, the full width.) It never goes farther than that.
 glen
Re: Ones' complement addition
>
>> Hi,
>>
>> I'm trying to devise an efficient method to generate a ones' complement
>> sum of two 16 bit integers in a single cycle. I'm targeting a Xilinx
>> device and right now I'm using a 32 bit adder. Suppose I want to add x
>> and y :
>
> You need the endaround carry. It is pretty easy with a carry
> lookahead adder to wire the carry out back to the carry in.
>
> (snip)
>
> You should only need a 16 bit adder with carry in/carry out.
>
> How about:
>
> assign = a+b+carry;
>
> The worst case is where the carry propagates 16 bits.
> (Well, the full width.) It never goes farther than that.
>
>  glen
>
I previously tried this approach, but that resulted in synplify and
xilinx complaining about a combinatorial loop created by connecting the
carryout to the carryin of the adder. That is why I started using a
dedicated 16 bit adder to calculate the carry. This carry is then
applied to a second adder.
The statement you suggested will save area, but is there a proper way to
constrain it? Right now the combinatorial loop seems to be ignored in
timing analysis :
<snip>
1 logic loop found and disabled.

! Warning: The following connections close logic loops, and some paths !
! through these connections may not be analyzed. !
! !
! Signal Driver Load !
!    !
! csum_2x16_temp<1>_cyo LICE_X46Y82.COUT SLICE_X46Y87.BX !

<\snip>
>> Hi,
>>
>> I'm trying to devise an efficient method to generate a ones' complement
>> sum of two 16 bit integers in a single cycle. I'm targeting a Xilinx
>> device and right now I'm using a 32 bit adder. Suppose I want to add x
>> and y :
>
> You need the endaround carry. It is pretty easy with a carry
> lookahead adder to wire the carry out back to the carry in.
>
> (snip)
>
> You should only need a 16 bit adder with carry in/carry out.
>
> How about:
>
> assign = a+b+carry;
>
> The worst case is where the carry propagates 16 bits.
> (Well, the full width.) It never goes farther than that.
>
>  glen
>
I previously tried this approach, but that resulted in synplify and
xilinx complaining about a combinatorial loop created by connecting the
carryout to the carryin of the adder. That is why I started using a
dedicated 16 bit adder to calculate the carry. This carry is then
applied to a second adder.
The statement you suggested will save area, but is there a proper way to
constrain it? Right now the combinatorial loop seems to be ignored in
timing analysis :
<snip>
1 logic loop found and disabled.

! Warning: The following connections close logic loops, and some paths !
! through these connections may not be analyzed. !
! !
! Signal Driver Load !
!    !
! csum_2x16_temp<1>_cyo LICE_X46Y82.COUT SLICE_X46Y87.BX !

<\snip>
Re: Ones' complement addition
(snip regarding ones complement adder)
The delay will be at most one trip around the carry loop, but
the tools don't know that. It might be that there is a way
to tell the system the delay. You will have to include the
endaround carry through the routing fabric, which will depend
on the routing.
It is nice that it figures out that there is a loop, and to ignore
it. Unless it can specifically recognize ones complement adders,
I don't think there is anything else it could do. That might be
one to ask synplify.
 glen
Re: Ones' complement addition
It looks to me like there is a chance of a pulse running around the carry
loop for a multiple times, perhaps even forever.
The case of interest is a calculation that should result in an answer of
zero. Note that there are two representations of zero in one's complement
notation, all '1's and all '0's, often called negative zero and positive
zero.
If there is a carry, the answer is all '0's, or positive zero. If there is
no carry, the answer will be all '1's, or negative zero.
Now suppose my adder is half in the state with a carry, and half in the
state without a carry. The carry, and not carry will both propagate up the
carry chain to the msb and around to the lsb again, chasing each other,
and it is a race. If carry is faster than the not carry, then the final
result will be all '0's. If not carry is faster than the carry, then the
final result will be all '1's. Assuming you wait long enough, of course.
If carry propagates at exactly the same speed as the not carry, then the
circuit will never produce either of the two correct results. If you check
before the end of the race, then you will get an incorrect result, that is
something other than negative zero or positive zero.
In other words, this circuit has a metastable state when the answer is
zero. It may take forever to to produce a correct answer, if it must
decide between negative zero and positive zero.

Phil Hays (Xilinx, but posting for myself)
Phil Hays (Xilinx, but posting for myself)
Re: Ones' complement addition
(snip regarding carry and ones complement adders)
If this were really a problem I don't believe Cray would have built
ones complement machines when he was trying to build them as fast as
possible.
As far as I know (never having actually tried to build a ones complement
machine) it is more usual to use a subtractor than adder. The reason I
always thought it was done was to reduce the negative zero results.
Does a subtractor still have this effect?
Also, a Xilinx implementation will have the wraparound carry much slower
than the others.
Is real logic symmetric in propagation time for zero and one?
 glen
Re: Ones' complement addition
There is a reason why ones complement isn't done on recent computers. Even
Cray didn't use it in his later machines. For example, a google search
found me this:
http://docs.cray.com/books/0042179001/html0042179001/rvc5mrwh.html
There are solutions to this problem. From example, hold the last carry
until there is a stable next carry. The carry was faster than the sum,
as it was generated by a carry lookahead, and the sum required carry
ripple between several bits. From memory cells not refreshed since about
1975 or so, the CDC6600 did something like this. I just wrote some simple
programs on one, perhaps someone who knows more about the details will
comment.
Why would a subtractor produce more or less negative zero results?
Which doesn't help. It is still a loop.
It can be.
CMOS logic is inverting. A carry value of "zero" or "one" must be both a
physical low and a physical high at different stages of the circuit. While
the high to low transition may be faster than the low to high transition,
the next stage is the inverse, so some or all of the difference cancels
out.

Phil Hays (Xilinx, but speaking for myself)
Phil Hays (Xilinx, but speaking for myself)
Re: Ones' complement addition
Phil Hays wrote about a one's complement adder, which uses
endaround carry:
There is no way that can happen if the inputs are stable. For an nbit
one's complement adder, there can't be a carry propogation wider than n
bits.
In other words, in a 16bit adder, a carry propogation might start in
position 5, but even if it wraps it cannot cause carry propogation
past position 4.
You can prove this by exhaustion for a short word width, and then by
induction for longer word widths.
The worstcase add time for an nbit one's complement ripple carry adder
is exactly the same as the worstcase for an nbit two's complement
ripple carry adder. This is why there wasn't a problem with using one's
complement in various early computers, including the early IBM
mainframes, DEC PDP1, and CDC 6600.
If you're suggesting that there is a case in which the endaround
carry won't produce the numerically correct result (assuming
that the numerical values of "positive zero" and "negative zero" are
equal), you are mistaken.
There is no combination of operands for which that can happen.
For the sum to be zero (whether represented as "positive zero" or
"negative zero", the operands must be n and n. But if that is
true, then the bit patterns representing the operands are
bitwisecomplementary, so there will be no carries at all.
For example, suppose you add 3 and 3 with a fourbit one's complement
adder:
0011
1100

1111
There is no way for any carry to occur in such a case; the result
is "negative zero". Similarly for adding "positive zero" and
"negative zero".
If you add "positive zero" to itself, there are no carries, and the
result is "positive zero".
If you add "negative zero" to itself, you get:
1111
1111

11110 < before endaround carry

1111 < after endaround carry
Thus the final result is "negative zero".
If you still think this is possible, please give an example of operands
for which it could occur.
Eric
endaround carry:
There is no way that can happen if the inputs are stable. For an nbit
one's complement adder, there can't be a carry propogation wider than n
bits.
In other words, in a 16bit adder, a carry propogation might start in
position 5, but even if it wraps it cannot cause carry propogation
past position 4.
You can prove this by exhaustion for a short word width, and then by
induction for longer word widths.
The worstcase add time for an nbit one's complement ripple carry adder
is exactly the same as the worstcase for an nbit two's complement
ripple carry adder. This is why there wasn't a problem with using one's
complement in various early computers, including the early IBM
mainframes, DEC PDP1, and CDC 6600.
If you're suggesting that there is a case in which the endaround
carry won't produce the numerically correct result (assuming
that the numerical values of "positive zero" and "negative zero" are
equal), you are mistaken.
There is no combination of operands for which that can happen.
For the sum to be zero (whether represented as "positive zero" or
"negative zero", the operands must be n and n. But if that is
true, then the bit patterns representing the operands are
bitwisecomplementary, so there will be no carries at all.
For example, suppose you add 3 and 3 with a fourbit one's complement
adder:
0011
1100

1111
There is no way for any carry to occur in such a case; the result
is "negative zero". Similarly for adding "positive zero" and
"negative zero".
If you add "positive zero" to itself, there are no carries, and the
result is "positive zero".
If you add "negative zero" to itself, you get:
1111
1111

11110 < before endaround carry

1111 < after endaround carry
Thus the final result is "negative zero".
If you still think this is possible, please give an example of operands
for which it could occur.
Eric
Re: Ones' complement addition
Let me give an example. Note that the two inputs can be anything that are
inverses of each other. The first two cases are stable.
1111
+0000
+0 Carry in
======
1111 carry out 0
Note that the carry bits are 0000
or:
1111
+0000
+1 Carry in
======
0000 carry out 1
Note that the carry bits are 1111
With me so far?
To describe an unstable case, I'm going to show only the carry bits. The
inputs are the same as above. I'm assuming ripple carry, and that one time
step is required for propagation of carry for one bit position.
Time
0 Carry 0011
1 Carry 0110
2 Carry 1100
3 Carry 1001
4 Carry 0011
5 Carry 0110
6 Carry 1100
7 Carry 1001
8 Carry 0011
...
The sum would also not be stable.
There are multiple ways to avoid the unstable cases. A few that come to
mind with little effort:
1) Force the previous carry until the carry output is stable. This has the
side effect of making the result (positive or negative zero) depend on the
previous computation. (I think that the CDC6600 used this method)
2) If the carry look ahead shows all propagate bits are '1', then generate
a carry in = '1' regardless of the carry input or any generate bits. This
has the side effect of producing only positive zeros any add unless both
operands were negative zeros.
3) Force a carry input of '0' until the carry output is stable. This has
the side effect of never producing a positive zero output for any add
unless both operands were positive zeros. (Hand computed examples often
use this method)
4) Force a carry input of '1' until the carry output is stable. This has
the side effect of producing only positive zeros any add unless both
operands were negative zeros.

Phil Hays (Xilinx, but posting for myself)
Phil Hays (Xilinx, but posting for myself)
Re: Ones' complement addition
Phil Hays wrote about a one's complement adder, which uses endaround
carry:
A one's complement adder does not have a "carryin" input, because
there isn't s a simple one's complement "full adder" cell with
three inputs. A one's complement adder consists of a chain of
binary full adders with the carryout from the MSB routed to the
carryin of the LSB; no individual piece of that structure can
stand on its own as a one's complement adder. In particular, there
is no simple way to daisychain two one's complement adders into
a longer one's complement adder.
If you need to add two nbit one's complement numbers and an unsigned
onebit carryin, you need a threeinput nbit one's complement adder,
which is composed of *two* nbit twoinput one's complement adders:
n 
A />  1's  n 
 comp  />  1's  n
B />  adder   comp  /> result
n  >  adder 
 
n1 
zeros /

carry_in /
1
The internal structure of each 1's complement adder as shown is
an nbit binary full adder with endaround carry.
The logic of the second adder can be simplified slightly due to n1
of its inputs being hardwired to zero, but it still has to be the full
n bits wide.
If you don't build it this way, you don't get the correct result. If
you do build it this way, there isn't any way for the circuit to be
unstable (oscillate), though it is of course possible to get an
arithmetic overflow.
Eric
carry:
A one's complement adder does not have a "carryin" input, because
there isn't s a simple one's complement "full adder" cell with
three inputs. A one's complement adder consists of a chain of
binary full adders with the carryout from the MSB routed to the
carryin of the LSB; no individual piece of that structure can
stand on its own as a one's complement adder. In particular, there
is no simple way to daisychain two one's complement adders into
a longer one's complement adder.
If you need to add two nbit one's complement numbers and an unsigned
onebit carryin, you need a threeinput nbit one's complement adder,
which is composed of *two* nbit twoinput one's complement adders:
n 
A />  1's  n 
 comp  />  1's  n
B />  adder   comp  /> result
n  >  adder 
 
n1 
zeros /

carry_in /
1
The internal structure of each 1's complement adder as shown is
an nbit binary full adder with endaround carry.
The logic of the second adder can be simplified slightly due to n1
of its inputs being hardwired to zero, but it still has to be the full
n bits wide.
If you don't build it this way, you don't get the correct result. If
you do build it this way, there isn't any way for the circuit to be
unstable (oscillate), though it is of course possible to get an
arithmetic overflow.
Eric
Re: Ones' complement addition
Please excuse my poor choice of words. Replaced "carryin" with
"carrywrapped around input to LSB", which is the same as the "carryout"
or the "carrywrapped around output from the MSB" after the propagation
delay from the MSB's output to the input of the LSB. So let me try again:
An example. Note that the two inputs can be anything that are inverses of
each other. The first two cases are stable.
0111
+1000
+0 Carry wrapped around input to LSB
======
1111 carry wrapped around output from MSB = 0
Note that the carry bits are 0000
or:
0111
+1000
+1 Carry wrapped around input to LSB
======
0000 carry wrapped around output from MSB = 1
Note that the carry bits are 1111
The key point here is that there are two correct answers in ones
complement to the problem:
7  7 = ?
And these answers are positive zero and negative zero.
With me so far?
To describe an unstable case, I'm going to show only the carry bits. The
inputs are the same as above. I'm assuming ripple carry, and that one time
step is required for propagation of carry for one bit position.
Time
0 Carry 0011
1 Carry 0110
2 Carry 1100
3 Carry 1001
4 Carry 0011
5 Carry 0110
6 Carry 1100
7 Carry 1001
8 Carry 0011
...
The sum would also not be stable.
There are multiple ways to avoid the unstable cases. A few that come to
mind with little effort:
1) Force the previous carry wrap around input into the LSB until the carry
wrap around output from the MSB is stable. This has the side effect of
making the result (positive or negative zero) depend on the previous
computation. (I think that the CDC6600 used this method)
2) If the carry look ahead shows all propagate bits are '1', then generate
a carry wrap around input into the LSB = '1' regardless of the carry wrap
around output from the MSB or any generate bits. This has the side effect
of producing only positive zeros for any add unless both operands were
negative zeros.
3) Force a carry wrap around input of '0' until the carry wrap around
output is stable. This has the side effect of never producing a positive
zero output for any add unless both operands were positive zeros. (Hand
computed examples often use this method)
4) Force a carry wrap around input of '1' until the carry wrap around
output is stable. This has the side effect of producing only positive
zeros any add unless both operands were negative zeros.
Any method that must decide between two correct answers may take forever
to come to one of the two answers.

Phil Hays (Xilinx, but posting for myself)
Phil Hays (Xilinx, but posting for myself)
Re: Ones' complement addition
My apologies for misunderstanding your point.
After further consideration, I believe you are correct that the cases
where the result is numerically zero can be unstable, unless a method
to precondition at least one stage of the carry chain is used (as done
in your suggestions).
When I find a bit of spare time, I'll look into what method was used
by the PDP1. As a guess, I suspect the design probably is such that
the initial state of the carry signals is zero, in which case the
result is stable even for operands that add to zero.
Eric
Re: Ones' complement addition
This post started with the question how ones' complement is best done on
FPGAs (Xilinx)
From the discussion we learned that interconnecting the carryout and
carryin is a bad idea, as this can become unstable when A and ~A are
added. (Thanks for clarifying this!)
With your suggestions in mind I've tried to come up with a couple of
alternatives to build a 16 bit ones' complement adder, requiring a
single clock cycle adding the operands A and B :
1. Make a chain of two 16 bit two's complement adders both calculating
A+B. Use the carryout of one of the adders as carryin for the other
adder. This is like building a 32bit adder calculating A&B + B&A.
The 16 most significant bits are used as the result.
2. Use a single 16 bit two's complement adder with endaround carry. If
we put a flip flop triggered on the falling edge in the carry loopback
path we can avoid the instability problems.
On the rising edge this flip flop should be set to zero, on the falling
edge it clocks in the carryout.
3. Instantiate two adders. One calculates A+B+0 and the other A+B+1.
Then add a 2:1 multiplexer and use the carryout of A+B+0 to select the
right sum. If the carryout is zero, multiplex A+B+0 onto the output,
otherwise multiplex the result of A+B+1 to the output
I guess 1 & 2 will reach similar speeds, but 2 uses less hardware.
Option 3 seems interesting as it could be faster then the other
alternatives.
I would be happy to hear your ideas about this!
Koen.
Site Timeline
 » more of ERROR:MapLib:661
 — Next thread in » FieldProgrammable Gate Arrays
 » Maplib Error 661.
 — Previous thread in » FieldProgrammable Gate Arrays
 » FPGA sensitivities
 — Newest thread in » FieldProgrammable Gate Arrays
 » MCP41HVx1 anyone?
 — The site's Newest Thread. Posted in » Electronics Design
 » FPGA sensitivities
 — The site's Last Updated Thread. Posted in » FieldProgrammable Gate Arrays