Pipelining on Multiple Clock Edges

I recall a processor implementation where the guy tried to say that one particular part of the pipeline design had a register inserted which was clocked on the negative edge. I could never see how this would positively impact anything. In fact, the setup and hold time of the register, not to mention the routing time, would add to the delay in that pipeline stage.

Was I missing something or is this ever used to advantage?

--

Rick C
rickman

Opposite edge pipe registers can be useful if your clock distribution scheme is not able to guarantee the required hold time. I've used this in early Xilinx parts that had only 4 internal clock buffers and I needed to bring in more (relatively slow) inputs using an additional clock. In those parts you could use "low skew nets" to route a clock, but even then you'd have hold time issues. In that particular design everything on the poorly routed clocks went back and forth between clock edges. That included things like counters, which would typically use a single N-wide register and feedback from their own outputs. Instead I needed two N-wide registers (one on each clock) to remove hold time in the feedback paths. Obviously this would be painful to do a whole design in, but for me it worked enough to get the data into distributed RAM for transfer to one of the internal global clock domains.
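A rough behavioral sketch of that two-register counter, in Python rather than HDL since the actual design isn't shown here. The function name, the register names and the choice of which register does the increment are all assumptions for illustration. The point is only that every path runs from a register on one edge to a register on the other edge, so no register feeds another on the same edge.

```python
def simulate_two_phase_counter(width, rising_edges):
    """Model an N-bit counter split across two registers on opposite clock edges.

    Hypothetical model: reg_pos is clocked on the rising edge and reg_neg on the
    falling edge. Neither register feeds back to itself or to another register
    on the same edge.
    """
    mask = (1 << width) - 1
    reg_pos = 0  # register clocked on the rising edge
    reg_neg = 0  # register clocked on the falling edge
    history = []
    for _ in range(rising_edges):
        # Rising edge: capture the incremented falling-edge register.
        reg_pos = (reg_neg + 1) & mask
        # Falling edge (half a period later): copy the rising-edge register.
        reg_neg = reg_pos
        history.append(reg_pos)
    return history


counts = simulate_two_phase_counter(width=3, rising_edges=10)
print(counts)
```

The count still advances once per full clock period. The cost is the second N-wide register, and each path has only about half a period to settle.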

--
Gabor

This is an issue of poor clock distribution. The guy using the opposite edge registers was saying it added a pipeline stage the same as the positive edge registers. Even if this was done for all logic on all stages it would not be the same as adding more positive edge registers because it doesn't speed up the clock. In fact the added setup and hold time of the added register slows down the circuit.

--

Rick C
rickman

I guess there could be some way that the logic going to and from that register is fast enough that it would be possible to get an extra cycle for free.

lasselangwadtchristensen

Sometimes you want a pipeline stage to work in a different clock phase from other stages. This is sometimes done to fit the write-back stage and the op fetch stage in the same clock cycle. Another example was the original MIPS R2000 and how it used the same pins for both the instruction and data caches by using a different phase for the fetch pipeline stage.

And while it is something different, see how the three stage ARM Cortex M0+ pipeline is made to look like a two stage pipeline:

formatting link

The alternative is to use a clock with twice the frequency and have enables that make some stages work on even clocks and others on odd ones.
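A minimal sketch of that alternative, as a Python tick-by-tick model. The stage names, the input sequence and the "times two" logic in stage B are made up for illustration. Only the even/odd enable scheme comes from the description above.

```python
def run_two_phase(ticks):
    """One double-rate clock; stage A is enabled on even ticks, stage B on odd ticks."""
    stage_a = 0
    stage_b = 0
    next_input = 0
    log = []
    for tick in range(ticks):
        enable_a = (tick % 2 == 0)
        enable_b = not enable_a
        if enable_a:
            # Stage A captures a new input value on even ticks.
            stage_a = next_input
            next_input += 1
            log.append((tick, "A", stage_a))
        if enable_b:
            # Stage B captures a function of stage A on odd ticks
            # (placeholder logic: multiply by two).
            stage_b = stage_a * 2
            log.append((tick, "B", stage_b))
    return log


for entry in run_two_phase(4):
    print(entry)
```

Every register sees the same clock edge, so the tools only have to time a single clock. The price is distributing a clock at twice the frequency.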

-- Jecel


I imagine it was used to transfer slack from one stage to another. Imagine it's 1976, and you have everything laid out, but then you find that you have some stage with negative slack (let's say a multiplier) followed by a stage with positive slack (let's say a mux). It's hard to move registers back into the multiplier, partly because it would increase the number of FFs, and partly because it's 1976 and you'd have to re-tape everything. So you just have the mux grab the data on the falling clock edge, transferring half a period of slack from the mux to the multiplier so the multiplier has 1.5 cycles and the mux has 0.5. Something like that.
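To put numbers on that, here is a small Python sketch of the slack arithmetic. The 10 ns period and the 13 ns / 3 ns delays are invented for the example, and register setup time and clock-to-Q delay are ignored, so treat it as back-of-envelope only.

```python
def setup_slacks(edge_times, logic_delays):
    """Slack per stage: time between consecutive register edges minus logic delay."""
    windows = [later - earlier for earlier, later in zip(edge_times, edge_times[1:])]
    return [window - delay for window, delay in zip(windows, logic_delays)]


PERIOD = 10.0          # ns, assumed clock period
DELAYS = [13.0, 3.0]   # ns, assumed worst-case delays: multiplier, then mux

# All three registers on rising edges: each stage gets one full period.
all_rising = setup_slacks([0.0, PERIOD, 2 * PERIOD], DELAYS)

# Middle register moved to a falling edge at 1.5 periods:
# the multiplier gets 1.5 periods and the mux gets 0.5.
mid_falling = setup_slacks([0.0, 1.5 * PERIOD, 2 * PERIOD], DELAYS)

print("all rising edges:   ", all_rising)
print("middle edge falling:", mid_falling)
```

With these made-up numbers the multiplier goes from 3 ns short to 2 ns of margin, and the mux drops from 7 ns of margin to 2 ns. Total latency through the two stages is still two periods.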
Kevin Neilson

I don't know if you have seen this before, but something similar is described in the book "But How Do It Know?" by J. Clark Scott:

formatting link

Someone made a video describing how it is useful for certain types of slow-clock CPUs:

formatting link

If you look, the computation takes place nearer to the positive edge, and the write operation takes place nearer to the negative edge, so that enough time elapses in between to do the work.

I've seen several designs which trigger in this way. There are also several methods described in (I believe) Lattice documentation, which show how to merge multiple clock signals together to obtain a clock signal that will dwell fire around the negative edge, and dwell fire around the positive edge for various purposes.

Thank you, Rick C. Hodgin


That implies that the minimum prop delay of the multiplier is guaranteed to be more than 1/2 clock period. Probably also a good bet in 1976. In any case this doesn't represent a pipe stage for 1/2 clock but rather for 1 1/2 clocks.
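The minimum-delay condition can be written out as a quick check. This is a sketch with assumed numbers. It assumes the multiplier is launched from a rising-edge register at t = 0, gets its next operands at t = period, and is captured on the falling edge at t = 1.5 periods. The hold_time parameter is a placeholder that defaults to zero.

```python
def min_delay_ok(period, min_logic_delay, hold_time=0.0):
    """Check that the next result cannot overwrite the current one before capture.

    New operands launch at t = period. The result of the previous operands is
    captured at t = 1.5 * period, so the fastest path through the logic must
    not deliver the new result before that edge plus the hold time.
    """
    earliest_new_arrival = period + min_logic_delay
    capture_edge = 1.5 * period
    return earliest_new_arrival > capture_edge + hold_time


print(min_delay_ok(period=10.0, min_logic_delay=6.0))  # fastest path slower than half a period
print(min_delay_ok(period=10.0, min_logic_delay=4.0))  # fastest path faster than half a period
```

The first case passes and the second fails, which is the half-period requirement: anything that makes the fastest path quicker without changing the clock can break the scheme.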

--
Gabor

Yes, it depends on minimum delays, so it's a poor design technique and would probably stop working when you shrink the die.

Kevin Neilson
