Pipelining on Multiple Clock Edges

I recall a processor implementation where the guy tried to say that one  
particular part of the pipeline design had a register inserted which was  
clocked on the negative edge.  I could never see how this would  
positively impact anything.  In fact, the setup and hold time of the  
register, not to mention the routing time, would add to the delay in  
that pipeline stage.

Was I missing something or is this ever used to advantage?

--  

Rick C

Re: Pipelining on Multiple Clock Edges
On Saturday, 5/13/2017 5:52 PM, rickman wrote:

Opposite edge pipe registers can be useful if your clock distribution
scheme is not able to guarantee the required hold time.  I've used
this in early Xilinx parts that had only 4 internal clock buffers
and I needed to bring in more (relatively slow) inputs using an
additional clock.  In those parts you could use "low skew nets" to
route a clock, but even then you'd have hold time issues.  In that
particular design everything on the poorly routed clocks went back
and forth between clock edges.  That included things like counters,
which would typically use a single N-wide register and feedback from
their own outputs.  Instead I needed two N-wide registers (one on
each clock) to remove hold time in the feedback paths.  Obviously
this would be painful to do a whole design in, but for me it worked
enough to get the data into distributed RAM for transfer to one of
the internal global clock domains.
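
A rough behavioural sketch of the two-register counter described above,
in Python rather than HDL. The function name, the width parameter and
the exact placement of the increment are assumptions made for
illustration, not the original design. The point is only that every
feedback path runs between opposite edges, so each register sees data
that has been stable for about half a period.

```python
def two_edge_counter(cycles, width=4):
    """Model a counter built from two N-wide registers on opposite edges.

    reg_pos is loaded on the rising edge from reg_neg + 1.
    reg_neg is loaded on the falling edge from reg_pos.
    No register feeds another register on the same edge.
    """
    mask = (1 << width) - 1
    reg_pos = 0
    reg_neg = 0
    counts = []
    for _ in range(cycles):
        # Rising edge: source (reg_neg) last changed half a period ago.
        reg_pos = (reg_neg + 1) & mask
        # Falling edge: source (reg_pos) last changed half a period ago.
        reg_neg = reg_pos
        counts.append(reg_neg)
    return counts


if __name__ == "__main__":
    print(two_edge_counter(5, width=2))
```

The count still advances once per full clock period, so this buys hold
margin, not speed.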

--  
Gabor

Re: Pipelining on Multiple Clock Edges
On 5/14/2017 4:14 PM, Gabor wrote:

This is an issue of poor clock distribution.  The guy using the opposite  
edge registers was saying it added a pipeline stage the same as the  
positive edge registers.  Even if this was done for all logic on all  
stages it would not be the same as adding more positive edge registers  
because it doesn't speed up the clock.  In fact the added setup and hold  
time of the added register slows down the circuit.
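
To put rough numbers on that, here is a minimal sketch of the timing
budget. It assumes a 50% duty cycle, zero clock skew, and identical
clock-to-Q and setup figures for every register; the function names and
the example delays are made up for illustration.

```python
def min_period_single_edge(t_cq, t_logic, t_su):
    """Minimum clock period for one stage between two rising-edge registers."""
    return t_cq + t_logic + t_su


def min_period_with_neg_edge_reg(t_cq, t_a, t_b, t_su):
    """Minimum period when a falling-edge register splits the stage.

    t_a is the logic delay before the inserted register, t_b the delay
    after it. Each half must fit in half a period, and each half pays
    its own clock-to-Q and setup overhead.
    """
    worst_half = t_cq + max(t_a, t_b) + t_su
    return 2 * worst_half


if __name__ == "__main__":
    # Example delays in ns (arbitrary): 8 ns of logic, split evenly 4/4.
    print(min_period_single_edge(1.0, 8.0, 0.5))
    print(min_period_with_neg_edge_reg(1.0, 4.0, 4.0, 0.5))
```

Even with the logic split perfectly in half, the second figure is
larger by one extra clock-to-Q plus one extra setup time, and an uneven
split only makes it worse.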

--  

Rick C

Re: Pipelining on Multiple Clock Edges

I guess there could be some way that the logic going to and from that
register is fast enough that it would be possible to get an extra cycle
for free.

Re: Pipelining on Multiple Clock Edges
On Saturday, May 13, 2017 at 6:52:37 PM UTC-3, rickman wrote:

Sometimes you want a pipeline stage to work in a different clock phase
from other stages. This is sometimes done to fit the write-back stage
and the operand fetch stage in the same clock cycle. Another example
was the original MIPS R2000 and how it used the same pins for both the
instruction and data caches by using a different phase for the fetch
pipeline stage.

And while it is something different, see how the three stage ARM Cortex
M0+ pipeline is made to look like a two stage pipeline:

http://microchipdeveloper.com/32arm:m0-pipeline

The alternative is to use a clock with twice the frequency and have
enables that make some stages work on even clocks and others on odd
ones.
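
A small behavioural sketch of that alternative, in Python rather than
HDL. The names (run_two_phase, reg_a, reg_b) are made up for
illustration: one fast clock, a phase bit that toggles on every tick,
and two registers that are each enabled on only one phase.

```python
def run_two_phase(inputs):
    """Simulate two registers on one 2x clock with alternating enables.

    reg_a is enabled on even ticks and stands in for the rising-edge
    stage; reg_b is enabled on odd ticks and stands in for the
    falling-edge stage. Returns a list of (tick, reg_a, reg_b).
    """
    phase = 0
    reg_a = None
    reg_b = None
    trace = []
    source = iter(inputs)
    for tick in range(2 * len(inputs)):
        if phase == 0:
            reg_a = next(source)   # even tick: first stage loads new data
        else:
            reg_b = reg_a          # odd tick: second stage takes it over
        trace.append((tick, reg_a, reg_b))
        phase ^= 1                 # toggle the enable phase every tick
    return trace


if __name__ == "__main__":
    for row in run_two_phase([1, 2]):
        print(row)
```

Everything is then clocked on a single edge of the fast clock, which
keeps the timing analysis conventional at the cost of distributing a
clock at twice the frequency.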

-- Jecel

Re: Pipelining on Multiple Clock Edges

I imagine it was used to transfer slack from one stage to another.
Imagine it's 1976, and you have everything laid out, but then you find
that you have some stage with negative slack (let's say a multiplier)
followed by a stage with positive slack (let's say a mux).  It's hard
to move registers back into the multiplier, partly because it would
increase the number of FFs, and partly because it's 1976 and you'd have
to re-tape everything.  So you just have the mux grab the data on the
falling clock edge, transferring half a period of slack from the mux to
the multiplier so the multiplier has 1.5 cycles and the mux has 0.5.
Something like that.
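
The arithmetic, as a minimal sketch. The 1.5 / 0.5 split is the one
described above; the function name and the example delays are invented,
and register overhead is ignored to keep it short.

```python
def slacks(period, t_mult, t_mux, borrow=False):
    """Return (multiplier slack, mux slack) for the two stages.

    Without borrowing, each stage gets one full period. With borrowing,
    the multiplier gets 1.5 periods and the mux gets the remaining 0.5,
    as described in the post above.
    """
    mult_budget = 1.5 * period if borrow else period
    mux_budget = 0.5 * period if borrow else period
    return mult_budget - t_mult, mux_budget - t_mux


if __name__ == "__main__":
    # Example in ns (arbitrary): 10 ns clock, 12 ns multiplier, 3 ns mux.
    print(slacks(10.0, 12.0, 3.0))               # multiplier fails timing
    print(slacks(10.0, 12.0, 3.0, borrow=True))  # both slacks positive
```

With these example numbers the multiplier goes from 2 ns short to 3 ns
of margin, while the mux drops from 7 ns of margin to 2 ns.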

Re: Pipelining on Multiple Clock Edges
On Monday, 5/15/2017 2:29 PM, Kevin Neilson wrote:

That implies that the minimum prop delay of the multiplier is
guaranteed to be more than 1/2 clock period.  Probably also a
good bet in 1976.  In any case this doesn't represent a pipe
stage for 1/2 clock but rather for 1 1/2 clocks.
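
A quick sketch of the two conditions such a 1 1/2 clock stage has to
meet. It assumes the multiplier is launched on every rising edge and
captured on a falling edge, a 50% duty cycle and no skew, with
clock-to-Q folded into the min and max delays; the function name and
the numbers are illustrative only.

```python
def check_borrowed_stage(period, t_min, t_max, t_su=0.0, t_hold=0.0):
    """Return (setup_ok, hold_ok) for a stage captured 1.5 periods later.

    setup_ok: the slowest path settles before the falling edge at
              1.5 * period.
    hold_ok:  the fastest path does not change the data until after the
              earlier falling edge at 0.5 * period, which must still see
              the previous result.
    """
    setup_ok = t_max + t_su <= 1.5 * period
    hold_ok = t_min >= 0.5 * period + t_hold
    return setup_ok, hold_ok


if __name__ == "__main__":
    # 10 ns clock, multiplier delay between 6 ns and 14 ns.
    print(check_borrowed_stage(10.0, 6.0, 14.0, t_su=0.5, t_hold=0.2))
    # Same design with a 3 ns fastest path: the min-delay check fails.
    print(check_borrowed_stage(10.0, 3.0, 14.0, t_su=0.5, t_hold=0.2))
```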

--  
Gabor

Re: Pipelining on Multiple Clock Edges
Yes, it depends on minimum delays, so it's a poor design technique and
would probably stop working when you shrink the die.

Re: Pipelining on Multiple Clock Edges
On Saturday, May 13, 2017 at 5:52:37 PM UTC-4, rickman wrote:

I don't know if you have seen this before, but something similar is
described in the book, "But How Do It Know?" by J. Clark Scott:

    https://www.amazon.com/But-How-Know-Principles-Computers/dp/0615303765

Someone made a video describing how it is useful for certain types
of slow-clock CPUs:

    https://www.youtube.com/watch?v=cNN_tTXABUA

If you look, the computation takes place nearer to the positive edge,
and the write operation takes place nearer to the negative edge, so
that enough time elapses in between to complete the work.

I've seen several designs which trigger in this way.  There are also
several methods described in (I believe) Lattice documentation, which
show how to merge multiple clock signals together to obtain a clock
signal that will dwell fire around the negative edge, and dwell fire
around the positive edge for various purposes.

Thank you,
Rick C. Hodgin
