separate high speed rules for HDL?

Hi everyone. I'm trying to find out if, at high speeds, it is necessary to clock every other register using every other clock transition. For instance, clocking every other register in a shift register using the positive clock transition and the rest use the negative clock transition. This VHDL may help explain:

I know this works at lower speeds:

if(clk'event and clk='1')then D
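A minimal sketch of the two styles, using placeholder registers A, B and C, might look something like this:

library ieee;
use ieee.std_logic_1164.all;

entity shift3 is
  port (clk  : in  std_logic;
        din  : in  std_logic;
        dout : out std_logic);
end entity shift3;

-- Style 1: all three registers clocked on the rising edge (the usual way)
architecture single_edge of shift3 is
  signal A, B, C : std_logic;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      A <= din;
      B <= A;
      C <= B;
    end if;
  end process;
  dout <= C;
end architecture single_edge;

-- Style 2: the middle register B clocked on the falling edge,
-- so that B samples roughly in the middle of A's output period
architecture both_edges of shift3 is
  signal A, B, C : std_logic;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      A <= din;
      C <= B;
    end if;
  end process;

  process (clk)
  begin
    if falling_edge(clk) then
      B <= A;
    end if;
  end process;
  dout <= C;
end architecture both_edges;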

Reply to
sketyro

How fast are you thinking about?

I have done FPGA designs that had at most two LUTs between FFs. That, and optimal routing, leads to a fairly fast design. I believe that using one clock edge works best in this case.

For most FPGA families, there is a well optimized clock tree to minimize the clock skew. If you use different clock edges, there must either be an inverter on some clock inputs, or two separate clock trees. Either seems likely to add clock skew, and limit the speed.

Also, the timing tools might have a harder time figuring out the appropriate timing. It probably won't cause much of a problem, but it is your problem to get the clock timing right.

The only advantage I see is that the clock signal runs at a lower frequency.

OK, in the olden days there were advantages. We now have nice, well designed master-slave flip-flops. Before TTL, as far as I know, much logic was done using only latches. The Earle latch allows one to generate efficient pipelines, merging two levels of logic with the latch logic. Without the advantage of a master-slave FF, using either two clock edges or, more usually, two separate clock phases, allows for nice pipelines.

If I remember correctly, the TMS9900 microprocessor uses a four phase clock. The 8088 and 8086 use a single clock input with a 33% duty cycle, and dynamic logic. (There is a minimum clock frequency of about one or two MHz; it has been some time since I thought about the exact value.) The 33% is optimal for the different path lengths on the two clock edges.

But as far as I know, there are no advantages for current FPGA families.

Now, there are DDR DRAMs which clock on both edges. The FPGA logic required to do that likely has FFs clocked on both edges. It might be that for signals going into or out of the FPGA you can go faster using both edges.

-- glen

Reply to
glen herrmannsfeldt

It works at any speed, not just at lower speeds.

About the only plausible situation where you would benefit here is if you can't double the clock frequency, either because it would exceed the device limits or because the other logic clocked by that same clock can't run that fast without a massive redesign. You'll still have to deal with trying to receive data with only half a clock period of setup if those negative-edge-triggered flip flop outputs fan out to anything other than your shift register (i.e. it's a small design niche where it may be useful).

Being in the middle of a data eye that you can do nothing about doesn't help. You have to meet the setup and hold time of the flip flops, there is no extra credit for placing the sampling clock edge in what you think may be the middle. Devices are designed to distribute free running clocks that originate at an input pin or an internal PLL output with zero skew from the perspective of the designer.

The speed would be device specific, since it would be the maximum clocking speed of that device, which you can find in the datasheet. However, that maximum speed is typically only applicable to a simple shift register; stick in any logic and the clock speed will drop.

Kevin Jennings

Reply to
KJ

This is overall not a very good idea. Even with 50% duty cycle clocks, say path A.Q to B.D has T/2 time, so you are cutting the time available for B to register by half. To make a design run faster you need to increase the source clock edge to destination clock edge time, not decrease it as you are doing here. Your options are to add multicycle paths or useful skew to increase the time available between clock edges. The former is difficult to constrain and the latter is strictly a physical design solution which doesn't apply to FPGAs.

Reply to
muzaffer.kal

... it is necessary to clock every other register using every other clock transition. For instance, clocking every other register in a shift register using the positive clock transition and the rest use the negative clock transition. This VHDL may help explain:

... the middle of "A's" data eye. Is this coding style required above some speed? If so, does anyone know how to find out what that speed is, or just tell me some general approximation?

No, you shouldn't do that; it doesn't gain anything, and if anything it'll make things slower.

In your example, using both edges, the output of A only has half as much time to get to the input of B, so it will only be able to run half as fast.

FPGAs are generally designed so the clock arrives at all FFs at the same time, so all you need to check is whether the path takes less time than a clock cycle minus setup time.

And the hold time on FFs is zero or less, so paths can be arbitrarily fast.
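In other words, for each register-to-register path the check is roughly

  Tclk2q + Tlogic + Trouting + Tsetup <= Tclk

and with the alternate-edge scheme the right-hand side drops to roughly Tclk/2 for the paths that cross between edges.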

-Lasse

Reply to
langwadt

This I did not know. I feel like it should have come to me before now. Oh well. Thanks everyone!

Reply to
sketyro

Zero (or less) hold time on FFs is not true for all FPGAs. It also does not account for finite skew between clock arrival at different FFs, even using a "global clock net".

As to the original problem that the alternate-edge clocking scheme is presumably trying to solve, there is a one-clock-cycle delay between A & C. But there are 2 t_setup and 2 t_clk2out times, since you are using an additional register on the opposite clock edge between A and C. You would be better off to perform the operations for B and C combinatorially in series between A & C, without trying to use a register for B in between them.
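As a rough sketch (f_b and f_c here are just hypothetical stand-ins for whatever combinational operations B and C perform):

-- instead of:  A -> falling-edge register B -> rising-edge register C
process (clk)
begin
  if rising_edge(clk) then
    -- both operations done combinatorially within one full clock period
    C <= f_c(f_b(A));
  end if;
end process;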

Andy

Reply to
jonesandy

Then the tools had better hide that if you want to use it for anything.

If you don't have zero hold, you can't tell whether the path between two FFs might happen to be shorter than the required hold time, and if you could, what would you do? Insert some dummy logic to add delay?

And if you can't assume the skew is effectively zero, how are you going to do a synchronous design?

-Lasse

Reply to
langwadt

(snip, someone wrote)

The FFs don't have to have zero hold time; they just have to have a hold time less than the delay of the shortest route from the previous FF.

I remember in the TTL days, with zero hold time one could wire from one output pin to an input, such as Qbar to D. That was guaranteed to work.

In the case of FPGAs, though, you have the FPGA routing fabric to go through. There will be a minimum length route.

Well, again, if the clock skew plus hold time is less than the delay of the minimum length route, you won't notice it.

For some FPGA families and tools, one can hand route at least some signals. If there was a possible route faster than skew plus hold, the data sheet should tell you about it.

-- glen

Reply to
glen herrmannsfeldt

What I mean isn't that hold and skew literally have to be zero, but that it should be such that you can design as if they were, and it is guaranteed to work.

-Lasse

Reply to
langwadt

What the OP should do is a trial fully-synchronous design, run it all the way through the tools, and see whether the Static Timing Analysis shows that it is "fast enough". If not, start adding pipelining stages in the areas that are causing a problem.

The major vendors' toolsets are all quite good at optimising for speed in fully-synchronous datapath designs (although I have had various problems with Virtex-5 parts in the past).
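For example, if a failing path has two chunks of logic in series (stage1 and stage2 here are hypothetical combinational functions), splitting it with an extra register is the usual pipelining move:

-- before: all the logic in one clock period
process (clk)
begin
  if rising_edge(clk) then
    result <= stage2(stage1(din));
  end if;
end process;

-- after: the logic split across two clock periods with an extra pipeline register
process (clk)
begin
  if rising_edge(clk) then
    mid    <= stage1(din);
    result <= stage2(mid);
  end if;
end process;

The pipelined version adds one clock of latency but shortens the worst path, which is usually the trade that Static Timing Analysis is asking for.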

Reply to
RCIngham

In the case of Microsemi PA3E devices, their place & route tool works to fix any hold time problems for you, assuming you enable that setting. I have seen several hold time violations with that setting disabled, but not with it enabled.

Andy

Reply to
jonesandy

How does that work? I mean, if you don't enable that option and get a hold violation, what can you do? You can't just start adding random logic hoping it fixes the problem.

-Lasse

Reply to
langwadt

The tool fixes hold times by just adding delay to the datapath. Xilinx tools fix hold times too. Or am I misunderstanding your question? It's on by default; I don't even know if you can turn it off.

Regards,

Mark

Reply to
Mark Curry

Or by adjusting placement to create appropriate skew in the clock arrival times at the source and destination registers.

I'm not sure why they allow the option to be disabled...

Andy

Reply to
jonesandy

Your second example will run at half the clock speed of the first example, because there is only half a clock period from output to input. The data gets through two flip-flops per clock, so both examples get about the same throughput as a first estimate. If you look at it in more detail, the second example will be slower, because clock duty cycle uncertainty now eats into your timing budget.
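For example (illustrative numbers only): with a 100 MHz clock the period is 10 ns, so the all-rising-edge version gives each register-to-register path the full 10 ns, while the alternate-edge version gives the A-to-B path only about 5 ns. If the duty cycle is only guaranteed to be, say, 45/55, that worst-case budget shrinks further to about 4.5 ns.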

Also: Use rising_edge(clk) and falling_edge(clk) for safer simulation and better readability compared to clk'event...
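For example:

  if rising_edge(clk) then   -- instead of:  if (clk'event and clk = '1') then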

Have fun,

Kolja

Reply to
Kolja Sulimma
