Ones' complement addition

- K
- Koen Van Renterghem
  
  Contact options for registered users
posted
17 years ago

Wed, Dec 13, 2006 1:23 PM

Hi,

I'm trying to devise an efficient method to generate a ones' complement sum of two 16 bit integers in a single cycle. I'm targeting a Xilinx device and right now I'm using a 32 bit adder. Suppose I want to add x and y :

a,b,result : std_logic_vector(15 downto 0); x,y,result_32 : std_logic_vector(31 downto 0);

a

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Thu, Dec 14, 2006 7:49 AM

You need the end-around carry. It is pretty easy with a carry lookahead adder to wire the carry out back to the carry in.

(snip)

You should only need a 16 bit adder with carry in/carry out.

How about:

assign {carry,result} = a+b+carry;

The worst case is where the carry propagates 16 bits. (Well, the full width.) It never goes farther than that.

-- glen

- K
- Koen Van Renterghem
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Thu, Dec 14, 2006 2:20 PM

glen herrmannsfeldt wrote: > Koen Van Renterghem wrote: >

I previously tried this approach, but that resulted in synplify and xilinx complaining about a combinatorial loop created by connecting the carry-out to the carry-in of the adder. That is why I started using a dedicated 16 bit adder to calculate the carry. This carry is then applied to a second adder.

The statement you suggested will save area, but is there a proper way to constrain it? Right now the combinatorial loop seems to be ignored in timing analysis :

1 logic loop found and disabled.

---------------------------------------------------------------------- ! Warning: The following connections close logic loops, and some paths ! ! through these connections may not be analyzed. ! ! ! ! Signal Driver Load ! ! -------------------------------- ---------------- ---------------- ! ! csum_2x16_temp_cyo LICE_X46Y82.COUT SLICE_X46Y87.BX ! ----------------------------------------------------------------------

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Mon, Jan 8, 2007 10:18 AM

(snip regarding ones complement adder)

The delay will be at most one trip around the carry loop, but the tools don't know that. It might be that there is a way to tell the system the delay. You will have to include the end-around carry through the routing fabric, which will depend on the routing.

It is nice that it figures out that there is a loop, and to ignore it. Unless it can specifically recognize ones complement adders, I don't think there is anything else it could do. That might be one to ask synplify.

-- glen

- P
- Phil Hays
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Mon, Jan 8, 2007 3:44 PM

It looks to me like there is a chance of a pulse running around the carry loop for a multiple times, perhaps even forever.

The case of interest is a calculation that should result in an answer of zero. Note that there are two representations of zero in one's complement notation, all '1's and all '0's, often called negative zero and positive zero.

If there is a carry, the answer is all '0's, or positive zero. If there is no carry, the answer will be all '1's, or negative zero.

Now suppose my adder is half in the state with a carry, and half in the state without a carry. The carry, and not carry will both propagate up the carry chain to the msb and around to the lsb again, chasing each other, and it is a race. If carry is faster than the not carry, then the final result will be all '0's. If not carry is faster than the carry, then the final result will be all '1's. Assuming you wait long enough, of course. If carry propagates at exactly the same speed as the not carry, then the circuit will never produce either of the two correct results. If you check before the end of the race, then you will get an incorrect result, that is something other than negative zero or positive zero.

In other words, this circuit has a metastable state when the answer is zero. It may take forever to to produce a correct answer, if it must decide between negative zero and positive zero.

--
Phil Hays (Xilinx, but posting for myself)

- G
- glen herrmannsfeldt
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Fri, Jan 12, 2007 10:17 AM

(snip regarding carry and ones complement adders)

If this were really a problem I don't believe Cray would have built ones complement machines when he was trying to build them as fast as possible.

As far as I know (never having actually tried to build a ones complement machine) it is more usual to use a subtractor than adder. The reason I always thought it was done was to reduce the negative zero results. Does a subtractor still have this effect?

Also, a Xilinx implementation will have the wraparound carry much slower than the others.

Is real logic symmetric in propagation time for zero and one?

-- glen

- P
- Phil Hays
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Fri, Jan 12, 2007 5:37 PM

There is a reason why ones complement isn't done on recent computers. Even Cray didn't use it in his later machines. For example, a google search found me this:

formatting link

There are solutions to this problem. From example, hold the last carry until there is a stable next carry. The carry was faster than the sum, as it was generated by a carry lookahead, and the sum required carry ripple between several bits. From memory cells not refreshed since about

1975 or so, the CDC6600 did something like this. I just wrote some simple programs on one, perhaps someone who knows more about the details will comment.

Why would a subtractor produce more or less negative zero results?

Which doesn't help. It is still a loop.

It can be.

CMOS logic is inverting. A carry value of "zero" or "one" must be both a physical low and a physical high at different stages of the circuit. While the high to low transition may be faster than the low to high transition, the next stage is the inverse, so some or all of the difference cancels out.

--
Phil Hays (Xilinx, but speaking for myself)

- E
- Eric Smith
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Sun, Jan 14, 2007 6:22 AM

Phil Hays wrote about a one's complement adder, which uses end-around carry:

There is no way that can happen if the inputs are stable. For an n-bit one's complement adder, there can't be a carry propogation wider than n bits.

In other words, in a 16-bit adder, a carry propogation might start in position 5, but even if it wraps it cannot cause carry propogation past position 4.

You can prove this by exhaustion for a short word width, and then by induction for longer word widths.

The worst-case add time for an n-bit one's complement ripple carry adder is exactly the same as the worst-case for an n-bit two's complement ripple carry adder. This is why there wasn't a problem with using one's complement in various early computers, including the early IBM mainframes, DEC PDP-1, and CDC 6600.

If you're suggesting that there is a case in which the end-around carry won't produce the numerically correct result (assuming that the numerical values of "positive zero" and "negative zero" are equal), you are mistaken.

There is no combination of operands for which that can happen.

For the sum to be zero (whether represented as "positive zero" or "negative zero", the operands must be n and -n. But if that is true, then the bit patterns representing the operands are bitwise-complementary, so there will be no carries at all.

For example, suppose you add 3 and -3 with a four-bit one's complement adder:

0011 1100 ---- 1111

There is no way for any carry to occur in such a case; the result is "negative zero". Similarly for adding "positive zero" and "negative zero".

If you add "positive zero" to itself, there are no carries, and the result is "positive zero".

If you add "negative zero" to itself, you get:

1111 1111 ---- 11110 zero.

If you still think this is possible, please give an example of operands for which it could occur.

Eric

- P
- Phil Hays
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Sun, Jan 14, 2007 6:00 PM

Let me give an example. Note that the two inputs can be anything that are inverses of each other. The first two cases are stable.

1111 +0000 +0 Carry in ====== 1111 carry out 0

Note that the carry bits are 0000

or:

1111 +0000 +1 Carry in ====== 0000 carry out 1

Note that the carry bits are 1111

With me so far?

To describe an unstable case, I'm going to show only the carry bits. The inputs are the same as above. I'm assuming ripple carry, and that one time step is required for propagation of carry for one bit position.

Time

0 Carry 0011 1 Carry 0110 2 Carry 1100 3 Carry 1001 4 Carry 0011 5 Carry 0110 6 Carry 1100 7 Carry 1001 8 Carry 0011 ...

The sum would also not be stable.

There are multiple ways to avoid the unstable cases. A few that come to mind with little effort:

1) Force the previous carry until the carry output is stable. This has the side effect of making the result (positive or negative zero) depend on the previous computation. (I think that the CDC6600 used this method)

2) If the carry look ahead shows all propagate bits are '1', then generate a carry in = '1' regardless of the carry input or any generate bits. This has the side effect of producing only positive zeros any add unless both operands were negative zeros.

3) Force a carry input of '0' until the carry output is stable. This has the side effect of never producing a positive zero output for any add unless both operands were positive zeros. (Hand computed examples often use this method)

4) Force a carry input of '1' until the carry output is stable. This has the side effect of producing only positive zeros any add unless both operands were negative zeros.

--
Phil Hays (Xilinx, but posting for myself)

- E
- Eric Smith
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Wed, Jan 17, 2007 10:33 PM

Phil Hays wrote about a one's complement adder, which uses end-around carry:

Phil Hays wrote:

A one's complement adder does not have a "carry-in" input, because there isn't s a simple one's complement "full adder" cell with three inputs. A one's complement adder consists of a chain of binary full adders with the carry-out from the MSB routed to the carry-in of the LSB; no individual piece of that structure can stand on its own as a one's complement adder. In particular, there is no simple way to daisy-chain two one's complement adders into a longer one's complement adder.

If you need to add two n-bit one's complement numbers and an unsigned one-bit carry-in, you need a three-input n-bit one's complement adder, which is composed of *two* n-bit two-input one's complement adders:

n --------- A ---/---> | 1's | n --------- | comp | ---/------> | 1's | n B ---/---> | adder | | comp | ----/---> result n --------- ----> | adder | | --------- n-1 | zeros -----/-------------------| | carry_in -----/-------------------- 1

The internal structure of each 1's complement adder as shown is an n-bit binary full adder with end-around carry.

The logic of the second adder can be simplified slightly due to n-1 of its inputs being hardwired to zero, but it still has to be the full n bits wide.

If you don't build it this way, you don't get the correct result. If you do build it this way, there isn't any way for the circuit to be unstable (oscillate), though it is of course possible to get an arithmetic overflow.

Eric

- K
- Koen Van Renterghem
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Fri, Jan 19, 2007 9:11 AM

Eric,

I was wondering if you can provide a reference to a good book or article that deals with the proof you are mentioning here?

Regards, Koen

- P
- Phil Hays
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Sun, Jan 21, 2007 6:06 PM

Please excuse my poor choice of words. Replaced "carry-in" with "carry-wrapped around input to LSB", which is the same as the "carry-out" or the "carry-wrapped around output from the MSB" after the propagation delay from the MSB's output to the input of the LSB. So let me try again:

An example. Note that the two inputs can be anything that are inverses of each other. The first two cases are stable.

0111 +1000 +0 Carry wrapped around input to LSB ====== 1111 carry wrapped around output from MSB = 0

Note that the carry bits are 0000

or:

0111 +1000 +1 Carry wrapped around input to LSB ====== 0000 carry wrapped around output from MSB = 1

Note that the carry bits are 1111

The key point here is that there are two correct answers in ones complement to the problem:

7 - 7 = ?

And these answers are positive zero and negative zero.

With me so far?

To describe an unstable case, I'm going to show only the carry bits. The inputs are the same as above. I'm assuming ripple carry, and that one time step is required for propagation of carry for one bit position.

Time

0 Carry 0011 1 Carry 0110 2 Carry 1100 3 Carry 1001 4 Carry 0011 5 Carry 0110 6 Carry 1100 7 Carry 1001 8 Carry 0011 ...

The sum would also not be stable.

There are multiple ways to avoid the unstable cases. A few that come to mind with little effort:

1) Force the previous carry wrap around input into the LSB until the carry wrap around output from the MSB is stable. This has the side effect of making the result (positive or negative zero) depend on the previous computation. (I think that the CDC6600 used this method)

2) If the carry look ahead shows all propagate bits are '1', then generate a carry wrap around input into the LSB = '1' regardless of the carry wrap around output from the MSB or any generate bits. This has the side effect of producing only positive zeros for any add unless both operands were negative zeros.

3) Force a carry wrap around input of '0' until the carry wrap around output is stable. This has the side effect of never producing a positive zero output for any add unless both operands were positive zeros. (Hand computed examples often use this method)

4) Force a carry wrap around input of '1' until the carry wrap around output is stable. This has the side effect of producing only positive zeros any add unless both operands were negative zeros.

Any method that must decide between two correct answers may take forever to come to one of the two answers.

--
Phil Hays (Xilinx, but posting for myself)

- E
- Eric Smith
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Mon, Jan 22, 2007 11:37 PM

My apologies for misunderstanding your point.

After further consideration, I believe you are correct that the cases where the result is numerically zero can be unstable, unless a method to precondition at least one stage of the carry chain is used (as done in your suggestions).

When I find a bit of spare time, I'll look into what method was used by the PDP-1. As a guess, I suspect the design probably is such that the initial state of the carry signals is zero, in which case the result is stable even for operands that add to zero.

Eric

- K
- Koen Van Renterghem
  
  Contact options for registered users
Vote on answer
posted
17 years ago

Tue, Jan 23, 2007 9:39 AM

This post started with the question how ones' complement is best done on FPGAs (Xilinx)

From the discussion we learned that interconnecting the carry-out and carry-in is a bad idea, as this can become unstable when A and ~A are added. (Thanks for clarifying this!)

With your suggestions in mind I've tried to come up with a couple of alternatives to build a 16 bit ones' complement adder, requiring a single clock cycle adding the operands A and B :

Make a chain of two 16 bit two's complement adders both calculating A+B. Use the carry-out of one of the adders as carry-in for the other adder. This is like building a 32bit adder calculating A&B + B&A. The 16 most significant bits are used as the result.
Use a single 16 bit two's complement adder with end-around carry. If we put a flip flop triggered on the falling edge in the carry loopback path we can avoid the instability problems. On the rising edge this flip flop should be set to zero, on the falling edge it clocks in the carry-out.
Instantiate two adders. One calculates A+B+0 and the other A+B+1. Then add a 2:1 multiplexer and use the carry-out of A+B+0 to select the right sum. If the carry-out is zero, multiplex A+B+0 onto the output, otherwise multiplex the result of A+B+1 to the output

I guess 1 & 2 will reach similar speeds, but 2 uses less hardware. Option 3 seems interesting as it could be faster then the other alternatives.

I would be happy to hear your ideas about this!

Koen.