Virtex 4 released today

Any ideas if the V2 rocket MGTs are any better, in terms of tolerance to the lock frequency ppm window and the ability to support more standards?

Antti

Reply to
Antti Lukats

No

MicroBlaze runs at 150 MHz in a V2Pro and 185 MHz in V4. That is a 23% improvement.

Göran

J>>I think it's better that I answer that.

Reply to
Goran Bilski

Hi!

My points exactly, General Schvantzkoph. In the majority of designs 8 is more than enough. But consider a case where you have to interface with 4 different CPUs, each running in its own clock domain; you have 4 independent DDR channels (1-2 variable shifted clocks per channel), 2 ZBTs, USB and Ethernet interfaces, and 1 or 2 ChipScope Pro cores for debugging. Then you also have a system clock (which is used in 80% of the design). With all that on the same chip, you are running out of IOs, out of BRAMs, out of clocks, out of everything. That's why you need to cross the regions. On top of this, SI issues, different IO standards and voltage levels sometimes force you to use IOs in other banks. And finally, the location of other devices on the PCB that are connected to the FPGA also forces you to use IOs you normally wouldn't. Some would say use a bigger chip. Of course this could be a solution, but a very expensive one, because price is also a major factor. By using a bigger chip (with more IOs) I would have to use a 12-layer PCB instead of 10, the PCB would be bigger, and I would have to redesign the cooling solution. All this would result in a more expensive product.

Austin, it was not my intention to criticize your decisions; I just wanted to clear this matter up. And if I had ISE 6.3 installed on my machine, I wouldn't ask this; instead I would try to figure it out by myself. Anyway, when can we expect CDs with ISE 6.3 to arrive?

Regards, Igor Bizjak

Reply to
IgI

Please have a look at the new ChipSync Interface, which is available in all I/Os and takes care of the issue that you describe.

It makes the design of all source synchronous interfaces much easier. Instead of using a global clock signal for the received clock, it uses a dedicated local clock to capture the data. The clock AND the data can be delay adjusted, so that board delays and skew can be compensated.

Within the FPGA you have the option to run the FIFO of your example at "line speed" or reduce the clock frequency and increase the bit-width by using the built-in SERDES. As an example, if you have data coming into the device at 1Gbps (500 MHz DDR) you may want to set the SERDES up with a divide by 4 (the value 4 is just an example that makes sense in this case) so that you get 4 bits with every clock of 250 MHz. This can easily be transferred to a BRAM / FIFO. Only when you read the FIFO, you may want to use a global clock, as this is your "real" system clock (primary domain).
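
To make the divide-by-4 arithmetic above concrete, here is a minimal Python sketch. It is only a behavioural model of a 1:4 deserializer, not HDL and not the SERDES primitive itself. The function names and the bit ordering (first received bit ends up in the most significant position) are assumptions made for illustration.

```python
def parallel_clock_mhz(line_rate_mbps, width):
    """Clock rate on the parallel side of a 1:width deserializer."""
    return line_rate_mbps / width


def deserialize(bits, width=4):
    """Group a serial bit sequence into width-bit words.

    Assumption: the first bit received lands in the most significant
    position. Trailing bits that do not fill a whole word are dropped.
    """
    words = []
    for i in range(0, len(bits) - width + 1, width):
        word = 0
        for b in bits[i:i + width]:
            word = (word << 1) | (b & 1)
        words.append(word)
    return words


if __name__ == "__main__":
    # 500 MHz DDR is 1000 Mbps on the pin; divide by 4 for the fabric clock.
    print(parallel_clock_mhz(1000.0, 4))
    print(deserialize([1, 0, 1, 1, 0, 0, 1, 0]))
```

The wider the parallel word, the slower the clock that writes into the BRAM / FIFO, which is the trade-off described above.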

There are 2 local clocks per clock region. The number of clock regions depends on the device size - there are between 8 and 24 of these clock regions.

A detailed description of these resources can be found in the "Virtex-4 User Guide". I can not describe all the features and capabilities in this message. There are also a couple of Application Notes available that take advantage of this technology.

--Jürgen

General Schvantzk> There are lots of situations where you need multiple clocks. If you

Reply to
Juergen Fuhrmann

The Xilinx web site says 150 MHz for a -6 device. I was assuming the -7 would be a bit faster.

Cheers, JonB

Reply to
Jon Beniston

Hi,

If I remember correctly, my test was with the -7 speed grade for V2Pro. So I don't think that the website is correct. It should be a -7 device. I will contact the person who is responsible for the webpage.

Thanks for f>


Reply to
Goran Bilski

Symon,

Yes. That is a real improvement in V4 to be able to use global clock nets for something other than global clocks.

Austin

Reply to
Austin Lesea

Vic,

Thanks for clarifying that: it wasn't clear from the presentations, so I was incomplete in my answer.

Austin

Reply to
Austin Lesea

Jim,

Afraid not. Still limited in the DCM mux to the clock trees, I think it still is only 4 outputs from a single DCM.

But you could use all DCMs from the same clock, and generate a lot of phases. Don't think you could get to 32 of them, but perhaps more than

  1. Now you couldn't use them all in one place. Still need CLKFB connections from BUFGs, as well as something to get to all the inputs of all of the DCMs....

And the jitter is still there on all of them, so you would have 24+ some phases fuzzily distributed (but accurately placed over time) throughout the period.

Not sure why anyone would want to do that, but.....

Austin

Reply to
Austin Lesea

IgI,

Soon on the CD's. As I have said before, I am not in the software division, so I just do not know. We have advanced copies, and I am told I get my updates by next week, which means the builds are ready to roll, and CDs are ready to be burnt after that.

I did not take your comments as criticism. I just want to know what you use all those clocks for! If it was me, I would tell someone that having four separate processors, each with their own clock source, is not good system engineering! I would have a master clock generator with four phases, one for each processor (to spread out the RF noise, and make bypassing and SSO's easier).

Vic's corrections also help, as you may now use a global clock as a clock enable, making synchronous design easier to implement. Not sure if synthesis can take advantage of this new feature, however.

Have fun!

Austin

Reply to
Austin Lesea

Hi,

Your assertion is false. There is nothing in the PCI Express specification that requires you use a distributed reference clock. It's present in some form factors, but not all. When it's present, you might view it as a "convenience".

You can use a distributed reference clock with a PLL if you want. But you'll have to convert from 100 MHz to 125 MHz and "clean it up" so that it meets the frequency and +/- 100 ppm requirements of the Virtex-II Pro transceiver blocks.

You can also use a local 125 MHz oscillator that is +/- 100 ppm.
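
As a rough illustration of the two options above, the following Python sketch checks a candidate reference clock against the 125 MHz, +/- 100 ppm requirement. It models an ideal multiplier only; the function and constant names are made up for this example and do not come from any Xilinx tool.

```python
NOMINAL_REFCLK_HZ = 125e6  # transceiver reference frequency, per this post
LIMIT_PPM = 100.0          # +/- 100 ppm requirement, per this post


def ppm_error(actual_hz, nominal_hz):
    """Frequency error of actual_hz relative to nominal_hz, in ppm."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


def multiplied_refclk(input_hz, ratio=1.25):
    """Output of an ideal frequency multiplier (100 MHz x 1.25 = 125 MHz)."""
    return input_hz * ratio


def within_limit(actual_hz, nominal_hz=NOMINAL_REFCLK_HZ, limit_ppm=LIMIT_PPM):
    """True if the clock is inside the allowed ppm window."""
    return abs(ppm_error(actual_hz, nominal_hz)) <= limit_ppm


if __name__ == "__main__":
    for source_ppm in (0.0, 50.0, 300.0):
        source_hz = 100e6 * (1 + source_ppm * 1e-6)
        out_hz = multiplied_refclk(source_hz)
        print(source_ppm, round(ppm_error(out_hz, NOMINAL_REFCLK_HZ), 3),
              within_limit(out_hz))
```

In this idealized model the multiplier scales the frequency but leaves the relative (ppm) error of the source unchanged.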

Eric

Reply to
Eric Crabill

X.

Hm, I am actually looking for +-300ppm lock :) i.e. SATA or PCIe without spread spectrum.

PCIe also specifies +-300ppm, but I think the only way to use PCIe is with an external PLL IC, the same way Nital is doing on their boards.

It's funny that RocketIO seems to be the worst MGT of all:

1) Altera Stratix GX has a special PLL that is OK to use, i.e. Altera can handle PCIe without an external PLL
2) Lattice has a +-300ppm clock range

RocketIO has +-100ppm and cannot use a DCM for the refclock as far as I know, or does there exist a solution to use a DCM in the PCIe application?

It really interests me how the Xilinx Real PCI Express is working: there is nothing mentioned about the need for an external PLL, but without it, it cannot work??

Thanks for any replies

Antti

Reply to
Antti Lukats

Hi,

The body of this message is about Virtex-II Pro, and not about Virtex4, as the subject suggests. I don't want anybody to get confused.

The original post didn't mention compliance. You simply asserted the need for an external PLL. That is false.

Virtex-II Pro transceivers require +/- 100 ppm on the externally supplied 125 MHz reference clock. The PCI Express 1.0a spec requires components to generate and tolerate +/- 300 ppm on the unit interval (UI) which is undoubtedly derived from a reference clock.

Also note that the Virtex-II Pro transceiver uses a 125 MHz reference clock, while one implementation of PCI Express systems provides a 100 MHz reference clock. The PCI Express 1.0a physical layer specs don't make any mention of required reference clock frequency. You might find some future form factor providing an entirely different reference clock frequency -- or perhaps none at all (e.g. a cable based PCI Express system...)

The "quality" of what arrives at the Virtex-II Pro receiver and the ability of the CDR to recover the clock and the data is a separate topic not directly related to this +/- 100 ppm requirement (although I am sure the quality of the reference clock to the Virtex-II Pro does have some effect on CDR...)

Untrue.

If both use the distributed reference clock, you have a synchronous system. The transceivers will lock, and the elastic buffers in the devices on both sides of the link will remain half-full. This is because everything is running from the same clock.

If you use independent (local) reference clocks which result in bitrates close to each other (e.g. +/- 300 ppm) you have a plesiochronous system. The transceivers lock, and the elastic buffers in the devices compensate for the difference in bitrate by dropping or adding symbols in the data stream. In PCI Express, this clock compensation mechanism is part of the spec and designed to handle up to a total of 600 ppm difference (one device at -300 ppm, the other at +300 ppm).

If you're using Virtex-II Pro with a +/- 100 ppm reference clock, you'll be fine because the total difference can only reach 400 ppm. Also, for any given device, it does not matter what reference clock frequency is required, what matters is that the bittime (UI) is +/- 300 ppm.
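
The ppm budget above is simple addition, and a small Python sketch may make it easier to follow. The function names are hypothetical, and `symbols_per_slip` is just the arithmetic consequence of a given offset, not a figure taken from the spec.

```python
def worst_case_offset_ppm(tol_a_ppm, tol_b_ppm):
    """Largest rate difference: one end at +tolerance, the other at -tolerance."""
    return tol_a_ppm + tol_b_ppm


def within_budget(tol_a_ppm, tol_b_ppm, budget_ppm=600.0):
    """True if the worst-case difference fits the clock compensation budget."""
    return worst_case_offset_ppm(tol_a_ppm, tol_b_ppm) <= budget_ppm


def symbols_per_slip(offset_ppm):
    """Symbols sent before the faster end gets one whole symbol ahead."""
    return 1e6 / offset_ppm


if __name__ == "__main__":
    # Two devices at the +/- 300 ppm limit.
    print(worst_case_offset_ppm(300.0, 300.0), within_budget(300.0, 300.0))
    # A +/- 100 ppm reference on one end, +/- 300 ppm on the other.
    print(worst_case_offset_ppm(100.0, 300.0), within_budget(100.0, 300.0))
    print(symbols_per_slip(400.0))
```

The elastic buffer has to add or drop a symbol roughly once per `symbols_per_slip` symbols to keep up with the offset.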

That is correct. You need to use the BREFCLK clock inputs. And, for the reference clock, you can use:

  1. Local 125 MHz +/- 100 ppm reference clock.
  2. Distributed reference clock, multiplied by 1.25, and cleaned up to +/- 100 ppm.

I hope that makes sense. Sometimes I get dizzy reading the spec!

Eric

Reply to
Eric Crabill

Understood - but that jitter is relatively low - correct ? (did someone mention 25ps for V4 ?)

Maybe the pin synchronisers can also have independent, fine-grained control?

Hmmm? - I can think of quite a few applications where a design could go into the time domain, rather than simply be clocked. FPGA clock ceilings are relatively low (NB: that's relative to their time-precision ability!) - we are now at ~500MHz?

Some examples : Pulse duration/width measurement to

Reply to
Jim Granville

YOU CAN NOT! (not at least to 100% compliance!)

Read the PCIe spec and check MGT datasheet!

Do your work! (it's your work, not your homework!) OK, relax, maybe that is not exactly your work :)

Eric, I do not think my assumptions are false (they very seldom are)

The PCIe spec says that the clock must be within +-300ppm, i.e. in the worst case the non-Virtex end of the PCIe link has the maximum clock error allowed by the PCIe spec.

As the lock range of the V2Pro(X) MGTs (and I assume also for V4) is +-100ppm, using a local oscillator that is not locked to the PCIe reference clock as the MGT reference will not be a working solution, as the MGT is not guaranteed to get initial lock. It might, but the clock error may be outside the MGT lock range.

As far as I know, DCMs (in V2Pro at least) are not suitable for the MGT refclock, so the ***only*** thing that would ever allow the V2Pro MGT to be used for PCIe is an external PLL that is locked to the PCIe ref and does 100>125MHz (there are special chips for this purpose from ICS)

No other method (like using an external 125MHz oscillator) would yield a PCIe compliant solution.

Correct me if I am wrong!

(maybe the V4 has DCM things that are suitable for the MGT refclock; that is what I do not know but would like to know)

Antti

PS: good to see that some Xilinx person has the courage to reply to the MGT PCIe issue; my previous posting did not get any replies at all :(

Reply to
Antti Lukats

Sorry, my previous post had one quote in the wrong place; fixed here.

[SORRY] the above section was supposed to be later!! my typo!!

YOU *MUST* use external PLL is correct

the ***CAN NOT SHOULD BE HERE***

Reply to
Antti Lukats

I thought that is exactly what the new I/O phase shift feature is about....

Kolja Sulimma

Reply to
Kolja Sulimma

I always make use of all BRAMs available in a device, but because there is a new primitive IDELAY available in V4 the need for synchronization FIFOs will be reduced. This new feature will be very useful. I do have several questions regarding the usage of the ILOGIC primitive. What is the tap delay resolution? I remember a number of 80ps from the presentation, but I'm not sure I heard correctly. I believe this delay depends on the speed-grade of the device, right?

In order to use variable IDELAY, IDELAYCTRL has to be instantiated. Let's say I would like to use IDELAY for two groups of data signals, one running at 166MHz and the other at 200MHz. Do I have to instantiate two IDELAYCTRL primitives and connect REFCLK of each IDELAYCTRL to the same reference clock, or do I have to connect the 166MHz clock to one IDELAYCTRL and the 200MHz clock to the other IDELAYCTRL? Is there any correlation between the REFCLK and clocks of the incoming data signals at all, or is REFCLK completely unrelated to any other clock?

I saw there is a "Fourth-Generation Design Security" built into V4. Is an external battery still needed, or have you implemented some sort of non-volatile memory (EEPROM) inside V4 for the keys?

What's the size of bitstream file compared to the previous generation of devices?

Regards, Igor Bizjak

Reply to
IgI

IgI,

See below,

Austin

Yes, you did. 78 ps.

> I believe this delay depends on the speed-grade

Nope. It is derived from a feedback loop from the reference clock, so it never changes.

The reference clock is always 200 MHz for the delay elements, and has nothing at all to do with the speed the interface needs to run at. One ref clock for the delay is all it takes for the whole part.
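
For anyone wanting to see where a figure like 78 ps can come from, here is a small Python sketch. It assumes the delay line has 64 taps that together span one period of the 200 MHz reference clock; the tap count is an assumption to be checked against the Virtex-4 User Guide, not something stated in this thread, and the function names are purely illustrative.

```python
ASSUMED_TAPS = 64  # assumption: check the user guide for the real tap count


def tap_resolution_ps(refclk_mhz=200.0, taps=ASSUMED_TAPS):
    """Delay of one tap if `taps` taps span one reference clock period."""
    period_ps = 1e6 / refclk_mhz
    return period_ps / taps


def taps_for_delay(delay_ps, refclk_mhz=200.0, taps=ASSUMED_TAPS):
    """Nearest tap setting for a requested delay, clamped to the valid range."""
    n = round(delay_ps / tap_resolution_ps(refclk_mhz, taps))
    return max(0, min(taps - 1, n))


if __name__ == "__main__":
    print(tap_resolution_ps())     # one tap at 200 MHz with 64 taps
    print(taps_for_delay(500.0))   # tap setting closest to 500 ps
```

Under that assumption the tap size depends only on the reference clock, which is consistent with it not varying with speed grade or with the interface clock.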

> Is there any correlation between the REFCLK and clocks of

Completely unrelated, unless by chance you need a 200 MHz clock to do something else.

Triple 56-bit key DES, battery-backed key RAM, just like V2 and V2P. It is the fourth generation/technology part to have this core. Even though single and double key DES is no longer considered secure (by the federal gov't), triple DES with three differing keys is still considered safe for the time being. Next generation will require AES (in two years).

Uh, depends on the device. Generally speaking, an LX25 is ~ 2500 system gates (whatever that means) and is 7,819,520 bits long. A 2VP20 (~2000 system gates) is 8,214,560. V4 has fewer BRAM bits than Virtex II Pro in ratio to CLBs, and the LX family has no PPC or MGTs.

Or, to put it another way, we haven't done anything radical to save config bits in V4.
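
To put the two bitstream lengths quoted above side by side, here is a short Python sketch. It only converts the bit counts from this post into bytes and a ratio; the dictionary keys and function names are labels chosen for this example.

```python
# Bitstream lengths in bits, as quoted in this post.
BITSTREAM_BITS = {
    "LX25": 7_819_520,
    "2VP20": 8_214_560,
}


def size_bytes(bits):
    """Bitstream size in whole bytes."""
    return bits // 8


def size_ratio(device_a, device_b, table=BITSTREAM_BITS):
    """Bitstream size of device_a relative to device_b."""
    return table[device_a] / table[device_b]


if __name__ == "__main__":
    for name, bits in BITSTREAM_BITS.items():
        print(name, size_bytes(bits), "bytes")
    print(round(size_ratio("LX25", "2VP20"), 3))
```

The two lengths come out within a few percent of each other, in line with the remark that nothing radical was done to save config bits.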

Reply to
Austin Lesea
