Hi,

I'm trying to devise an efficient method to generate a ones' complement
sum of two 16 bit integers in a single cycle. I'm targeting a Xilinx
device and right now I'm using a 32 bit adder. Suppose I want to add x
and y :

x,y,result    : std_logic_vector(15 downto 0);
a,b,result_32 : std_logic_vector(31 downto 0);

a <= x & y;
b <= y & x;

result_32 <= a + b;
result  <= result_32(31 downto 16);

Is there a better way to implement this?
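The double-width trick can be cross-checked against a plain end-around-carry model; a quick Python sketch of the arithmetic (function names are mine, not from the post):

```python
def ones_complement_sum(x, y):
    # Reference model: 16-bit add, then fold the carry-out back into bit 0.
    s = x + y
    return (s + (s >> 16)) & 0xFFFF

def double_width_trick(x, y):
    # The approach above: a 32-bit add of (x & y) + (y & x), where '&' is
    # VHDL concatenation, keeping the upper 16 bits of the result.
    a = (x << 16) | y
    b = (y << 16) | x
    return ((a + b) & 0xFFFFFFFF) >> 16

# A carry out of the lower 16-bit half lands as +1 in the upper half, so
# the upper half computes x + y + carry: exactly the end-around carry.
print(hex(double_width_trick(0xFFFF, 0x0001)))  # -> 0x1
```

The two functions agree for every pair of 16-bit operands, which is why the 32-bit adder gives the right answer; the open question is only whether it is the cheapest way.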

You need the end-around carry.  It is pretty easy with a carry
lookahead adder to wire the carry out back to the carry in.

(snip)

You should only need a 16 bit adder with carry in/carry out.

assign {carry, sum} = a + b + carry;

The worst case is where the carry propagates 16 bits.
(Well, the full width.)  It never goes farther than that.

-- glen

>
>> I'm trying to devise an efficient method to generate a ones' complement
>> sum of two 16 bit integers in a single cycle.
>
> (snip)
>
> You should only need a 16 bit adder with carry in/carry out.
>
> assign {carry, sum} = a + b + carry;
>
> -- glen
>
I previously tried this approach, but that resulted in synplify and
xilinx complaining about a combinatorial loop created by connecting the
carry-out to the carry-in of the adder. That is why I started using a
dedicated 16 bit adder to calculate the carry. This carry is then used
as the carry-in of a second adder that computes the final sum.

The statement you suggested will save area, but is there a proper way to
constrain it? Right now the combinatorial loop seems to be ignored in
timing analysis :

<snip>
1 logic loop found and disabled.

----------------------------------------------------------------------
! Warning: The following connections close logic loops, and some paths !
!          through these connections may not be analyzed.              !
!                                                                      !
! --------------------------------  ----------------  ---------------- !
! csum_2x16_temp<1>_cyo             SLICE_X46Y82.COUT  SLICE_X46Y87.BX  !
----------------------------------------------------------------------
</snip>

The delay will be at most one trip around the carry loop, but
the tools don't know that.  It might be that there is a way
to tell the system the delay.  You will have to include the
end-around carry through the routing fabric, which will depend
on the routing.

It is nice that it figures out that there is a loop, and ignores
it.  Unless it can specifically recognize ones complement adders,
I don't think there is anything else it could do.  That might be
asking too much of a synthesis tool.
-- glen

It looks to me like there is a chance of a pulse running around the carry
loop multiple times, perhaps even forever.

The case of interest is a calculation that should result in an answer of
zero. Note that there are two representations of zero in one's complement
notation, all '1's and all '0's, often called negative zero and positive
zero.

If there is a carry, the answer is all '0's, or positive zero. If there is
no carry, the answer will be all '1's, or negative zero.

Now suppose my adder is half in the state with a carry, and half in the
state without a carry. The carry, and not carry will both propagate up the
carry chain to the msb and around to the lsb again, chasing each other,
and it is a race. If carry is faster than the not carry, then the final
result will be all '0's. If not carry is faster than the carry, then the
final result will be all '1's. Assuming you wait long enough, of course.
If carry propagates at exactly the same speed as the not carry, then the
circuit will never produce either of the two correct results. If you check
before the end of the race, then you will get an incorrect result, that is
something other than negative zero or positive zero.

In other words, this circuit has a metastable state when the answer is
zero. It may take forever to produce a correct answer, if it must
decide between negative zero and positive zero.

--
Phil Hays (Xilinx, but posting for myself)

(snip regarding carry and ones complement adders)

If this were really a problem I don't believe Cray would have built
ones complement machines when he was trying to build them as fast as
possible.

As far as I know (never having actually tried to build a ones complement
machine) it is more usual to use a subtractor than adder. The reason I
always thought it was done was to reduce the negative zero results.
Does a subtractor still have this effect?

Also, a Xilinx implementation will have the wraparound carry much slower
than the rest of the carry chain, since the end-around path goes through
the general routing fabric.

Is real logic symmetric in propagation time for zero and one?

-- glen

There is a reason why ones complement isn't done on recent computers. Even
Cray didn't use it in his later machines. For example, a google search
found me this:

http://docs.cray.com/books/004-2179-001/html-004-2179-001/rvc5mrwh.html

There are solutions to this problem. For example, hold the last carry
until there is a stable next carry. The carry was faster than the sum,
as it was generated by a carry lookahead, and the sum required carry
ripple between several bits. From memory cells not refreshed since about
1975 or so, the CDC6600 did something like this. I just wrote some simple
programs on one, perhaps someone who knows more about the details will
comment.

Why would a subtractor produce more or less negative zero results?

Which doesn't help. It is still a loop.

It can be.

CMOS logic is inverting. A carry value of "zero" or "one" must be both a
physical low and a physical high at different stages of the circuit. While
the high to low transition may be faster than the low to high transition,
the next stage is the inverse, so some or all of the difference cancels
out.

--
Phil Hays (Xilinx, but speaking for myself)

(snip regarding the end-around carry)

There is no way that can happen if the inputs are stable.  For an n-bit
one's complement adder, there can't be a carry propagation wider than n
bits.

In other words, in a 16-bit adder, a carry propagation might start in
position 5, but even if it wraps it cannot cause carry propagation
past position 4.

You can prove this by exhaustion for a short word width, and then by
induction for longer word widths.

The worst-case add time for an n-bit one's complement ripple carry adder
is exactly the same as the worst-case for an n-bit two's complement
ripple carry adder.  This is why there wasn't a problem with using one's
complement in various early computers, including the early IBM
mainframes, DEC PDP-1, and CDC 6600.

If you're suggesting that there is a case in which the end-around
carry won't produce the numerically correct result (assuming
that the numerical values of "positive zero" and "negative zero" are
equal), you are mistaken.

There is no combination of operands for which that can happen.

For the sum to be zero (whether represented as "positive zero" or
"negative zero"), the operands must be n and -n.  But if that is
true, then the bit patterns representing the operands are
bitwise-complementary, so there will be no carries at all.

For example, suppose you add 3 and -3 with a four-bit one's complement
adder:

0011
1100
----
1111

There is no way for any carry to occur in such a case; the result
is "negative zero".  Similarly for adding "positive zero" and
"negative zero".

If you add "positive zero" to itself, there are no carries, and the
result is "positive zero".

If you add "negative zero" to itself, you get:

1111
1111
----
11110  <- before end-around carry
----
1111  <- after end-around carry

Thus the final result is "negative zero".

If you still think this is possible, please give an example of operands
for which it could occur.
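The static arithmetic here can be checked exhaustively at 4 bits; a small Python sketch (my own code, not from the thread):

```python
def oc_add4(a, b):
    # 4-bit one's complement add with a single end-around-carry fold
    s = a + b
    return (s + (s >> 4)) & 0xF

# n + (-n): the operands are bitwise complements, so the raw sum is 1111
# with no carry out of the MSB, i.e. "negative zero":
for n in range(16):
    assert (n + (0xF ^ n)) < 0x10      # no carry out of bit 3
    assert oc_add4(n, 0xF ^ n) == 0xF  # negative zero

# negative zero plus negative zero folds back to negative zero:
assert oc_add4(0xF, 0xF) == 0xF
```

This settles the numerical result for stable inputs; the instability question raised earlier in the thread is about the dynamic behaviour of the physical carry loop, not the final values.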

Eric

Let me give an example. Note that the two inputs can be anything that are
inverses of each other. The first two cases are stable.

1111
+0000
+0 Carry in
======
1111 carry out 0

Note that the carry bits are 0000

or:

1111
+0000
+1 Carry in
======
0000 carry out 1

Note that the carry bits are 1111

With me so far?

To describe an unstable case, I'm going to show only the carry bits. The
inputs are the same as above. I'm assuming ripple carry, and that one time
step is required for propagation of carry for one bit position.

Time
0     Carry  0011
1     Carry  0110
2     Carry  1100
3     Carry  1001
4     Carry  0011
5     Carry  0110
6     Carry  1100
7     Carry  1001
8     Carry  0011
...

The sum would also not be stable.
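Because the operands are bitwise complements, every stage propagates (carry-out equals carry-in), so each time step just rotates the carry vector one position; the table above can be reproduced with a few lines of Python (my own sketch):

```python
# Complementary operands mean every ripple stage propagates its carry
# unchanged; the end-around wire feeds the MSB carry back to the LSB,
# so each time step is a one-bit rotation of the carry vector.
carry = "0011"
seen = []
for t in range(9):
    seen.append(carry)
    carry = carry[1:] + carry[0]  # rotate left by one bit position

print(seen)  # a period-4 cycle: never settles to 0000 or 1111
```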

There are multiple ways to avoid the unstable cases. A few that come to
mind with little effort:

1) Force the previous carry until the carry output is stable. This has the
side effect of making the result (positive or negative zero) depend on the
previous computation. (I think that the CDC6600 used this method)

2) If the carry look ahead shows all propagate bits are '1', then generate
a carry in = '1' regardless of the carry input or any generate bits. This
has the side effect of producing only positive zeros for any add unless
both operands were negative zeros.

3) Force a carry input of '0' until the carry output is stable. This has
the side effect of never producing a positive zero output for any add
unless both operands were positive zeros. (Hand computed examples often
use this method)

4) Force a carry input of '1' until the carry output is stable. This has
the side effect of producing only positive zeros for any add unless both
operands were negative zeros.

--
Phil Hays (Xilinx, but posting for myself)

(snip regarding the carry-in)

A one's complement adder does not have a "carry-in" input, because
there isn't a simple one's complement "full adder" cell with
three inputs.  A one's complement adder consists of a chain of
binary full adders with the carry-out from the MSB routed to the
carry-in of the LSB; no individual piece of that structure can
stand on its own as a one's complement adder.  In particular, there
is no simple way to daisy-chain two one's complement adders into a
wider one.

If you need to add two n-bit one's complement numbers and an unsigned
one-bit carry-in, you need a three-input n-bit one's complement adder,
which is composed of *two* n-bit two-input one's complement adders:

          n    ---------
A --------/-->|  1's   |    n    ---------
          n   |  comp  |----/-->|  1's   |    n
B --------/-->|  adder |        |  comp  |----/--> result
               ---------    +-->|  adder |
         n-1                |    ---------
zeros ----/------+    n     |
                 +----/-----+
carry_in --/-----+
           1

The internal structure of each 1's complement adder as shown is
an n-bit binary full adder with end-around carry.

The logic of the second adder can be simplified slightly due to n-1
of its inputs being hardwired to zero, but it still has to be the full
n bits wide.

If you don't build it this way, you don't get the correct result.  If
you do build it this way, there isn't any way for the circuit to be
unstable (oscillate), though it is of course possible to get an
arithmetic overflow.
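This two-adder composition can be modelled numerically; a Python sketch under my own naming (the hardware is of course two adder chains, not arithmetic on integers):

```python
def oc_add(a, b, n=16):
    # two-input n-bit one's complement adder: binary add + end-around carry
    mask = (1 << n) - 1
    s = a + b
    return (s + (s >> n)) & mask

def oc_add3(a, b, carry_in, n=16):
    # three-input adder as described: the second adder's other operand is
    # (n-1 zeros) & carry_in, which is numerically just carry_in
    return oc_add(oc_add(a, b, n), carry_in, n)

print(hex(oc_add3(0x0007, 0xFFF8, 1)))  # 7 + (-7) + 1 -> 0x1
```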

Eric

Please excuse my poor choice of words. Replace "carry-in" with
"carry-wrapped-around input to the LSB", which is the same as the
"carry-out" or the "carry-wrapped-around output from the MSB" after the
propagation delay from the MSB's output to the input of the LSB. So let
me try again:

An example. Note that the two inputs can be anything that are inverses of
each other. The first two cases are stable.

0111
+1000
+0 Carry wrapped around input to LSB
======
1111 carry wrapped around output from MSB = 0

Note that the carry bits are 0000

or:

0111
+1000
+1 Carry wrapped around input to LSB
======
0000 carry wrapped around output from MSB = 1

Note that the carry bits are 1111

The key point here is that there are two correct answers in ones
complement to the problem:

7 - 7 = ?

And these answers are positive zero and negative zero.

With me so far?

To describe an unstable case, I'm going to show only the carry bits. The
inputs are the same as above. I'm assuming ripple carry, and that one time
step is required for propagation of carry for one bit position.

Time
0     Carry  0011
1     Carry  0110
2     Carry  1100
3     Carry  1001
4     Carry  0011
5     Carry  0110
6     Carry  1100
7     Carry  1001
8     Carry  0011
...

The sum would also not be stable.

There are multiple ways to avoid the unstable cases. A few that come to
mind with little effort:

1) Force the previous carry wrap around input into the LSB until the carry
wrap around output from the MSB is stable. This has the side effect of
making the result (positive or negative zero) depend on the previous
computation. (I think that the CDC6600 used this method)

2) If the carry look ahead shows all propagate bits are '1', then generate
a carry wrap around input into the LSB = '1' regardless of the carry wrap
around output from the MSB or any generate bits. This has the side effect
of producing only positive zeros for any add unless both operands were
negative zeros.

3) Force a carry wrap around input of '0' until the carry wrap around
output is stable. This has the side effect of never producing a positive
zero output for any add unless both operands were positive zeros. (Hand
computed examples often use this method)

4) Force a carry wrap around input of '1' until the carry wrap around
output is stable. This has the side effect of producing only positive
zeros for any add unless both operands were negative zeros.

Any method that must decide between two correct answers may take forever
to come to one of the two answers.

--
Phil Hays (Xilinx, but posting for myself)

My apologies for misunderstanding your point.

After further consideration, I believe you are correct that the cases
where the result is numerically zero can be unstable, unless a method
to precondition at least one stage of the carry chain is used (as done
in the methods you describe).
When I find a bit of spare time, I'll look into what method was used
by the PDP-1.  As a guess, I suspect the design probably is such that
the initial state of the carry signals is zero, in which case the
result is stable even for operands that add to zero.

Eric

This thread started with the question how ones' complement addition is
best done on FPGAs (Xilinx).

From the discussion we learned that directly interconnecting the
carry-out and carry-in is a bad idea, as this can become unstable when A
and ~A are added.
With your suggestions in mind I've tried to come up with a couple of
alternatives to build a 16 bit ones' complement adder that adds the
operands A and B in a single clock cycle:

1. Make a chain of two 16 bit two's complement adders, both calculating
A+B. Use the carry-out of one adder as the carry-in of the other.
This is like building a 32 bit adder calculating A&B + B&A; the 16 most
significant bits are used as the result.

2. Use a single 16 bit two's complement adder with end-around carry. If
we put a flip flop triggered on the falling edge in the carry loopback
path we can avoid the instability problems.
On the rising edge this flip flop should be set to zero, on the falling
edge it clocks in the carry-out.

3. Instantiate two adders. One calculates A+B+0 and the other A+B+1.
Then add a 2:1 multiplexer and use the carry-out of A+B+0 to select the
right sum. If the carry-out is zero, multiplex A+B+0 onto the output,
otherwise multiplex the result of A+B+1 to the output.

I guess 1 & 2 will reach similar speeds, but 2 uses less hardware.
Option 3 seems interesting as it could be faster than the other
alternatives.
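Alternative 3 is essentially a carry-select scheme; behaviourally it can be sketched in Python (my own names, a model rather than the HDL):

```python
def oc_add_mux(a, b, n=16):
    mask = (1 << n) - 1
    s0 = a + b        # first adder: A+B+0
    s1 = a + b + 1    # second adder: A+B+1, computed in parallel
    carry = s0 >> n   # carry-out of A+B+0 drives the 2:1 multiplexer
    return (s1 & mask) if carry else (s0 & mask)

# No feedback path, so there is no combinatorial loop for the tools
# to disable, and the output is deterministic even for zero results.
print(hex(oc_add_mux(0xFFFF, 0x0001)))  # -> 0x1
```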

Koen.