1GHz FPGA counters

Are there any FPGA parts available today that can hold a 32-bit, free-running counter running at 1GHz, plus a 32-bit storage register that takes a snapshot of the count so it can be read out over a slower external interface?

The Xilinx Virtex II Pro seems to go up to about 325MHz...

Thanks.

Dave

Reply to
starfire

I doubt any FPGAs can count that fast ... not directly

- you could use 4 counters running with different clocks (shifted by 90 degrees) at 250 MHz ...

- you could use the Rocket-IO SerDes ... de-serialize your gate-signal and process the datawords at a lower frequency ..

bye, Michael

Reply to
Michael Schöberl

Short answer is no. Getting a divide-by-2 to run close to 1GHz under typical room-temperature conditions is probably doable; check with Peter A. at Xilinx.

A 32-bit count and capture with full margin requires carry logic, and so is going to be slower.

If you really want to measure time, (or create pulse widths), then FPGAs do have resources that can go under 1ns in time resolve.

-jg

Reply to
Jim Granville

I've done (tiny) Johnson counters in Virtex II Pro that would go to more than twice that speed. They could be used as a prescaler for a larger binary counter.
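For readers who have not met the idea, here is a minimal behavioural sketch in Python of a Johnson (twisted-ring) counter used as a prescaler ahead of a slower binary counter. It is a model only, not the posted design: the function names and the choice of stage count are assumptions made for illustration.

```python
def johnson_step(q):
    """One clock: shift right and feed the inverted last stage back in."""
    return [1 - q[-1]] + q[:-1]

def johnson_index(q):
    """Decode a Johnson state to its position 0 .. 2n-1 in the sequence."""
    n = len(q)
    ones = sum(q)
    return ones if q[-1] == 0 else n + (n - ones)

def prescaled_count(n_stages, clocks):
    """Fast Johnson prescaler plus a slow counter that ticks on each wrap."""
    q = [0] * n_stages
    slow = 0
    for _ in range(clocks):
        q = johnson_step(q)
        if not any(q):       # wrapped back to all zeros
            slow += 1        # the slow counter only runs at f / (2 * n_stages)
    return slow * 2 * n_stages + johnson_index(q)

print(prescaled_count(4, 100))  # 100
```

Only one flip-flop changes per clock and there is no carry chain, which is why such a prescaler can run close to the toggle rate of a single flip-flop.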

800MHz seems to be about the limit. Perhaps the OP should wait for Virtex 4.

(What about Peter Alfke's frequency counter? Didn't he claim 1GHz in V2P?)

Regards, Allan.

Reply to
Allan Herriman

Thanks for the response.

My application is for precise time-correlation readings between random input pulses, starting with a reset/sync pulse. The thought is that if a free-running counter with 1ns resolution were reset to zero on receipt of the reset/sync pulse, and a snapshot of the count were taken as each of a series of pulses is received (a separate 32-bit count value for each pulse), a precise time correlation could be made from the sync to any input and from any input to any other input. The reset/sync pulse would normally be received before allowing the counter to overflow (typically about 35ms).
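As a rough check of the arithmetic: a 32-bit counter at 1 ns per count wraps after 2^32 ns, about 4.29 s, so if the 35 ms figure is the typical interval between sync pulses there is ample headroom. The Python sketch below shows how snapshot values would turn into time differences. The names and the sample capture values are made up for illustration; modular subtraction is used so the result stays correct across a single wrap.

```python
COUNTER_BITS = 32
TICK_NS = 1  # assumed resolution: one count per nanosecond

def overflow_seconds(bits=COUNTER_BITS, tick_ns=TICK_NS):
    """Time until a free-running counter of this width wraps."""
    return (1 << bits) * tick_ns * 1e-9

def interval_ns(earlier, later, bits=COUNTER_BITS, tick_ns=TICK_NS):
    """Elapsed time between two snapshots, tolerant of one wrap."""
    return ((later - earlier) % (1 << bits)) * tick_ns

snapshots = [120, 4_500, 35_000_000]           # hypothetical capture values
print(overflow_seconds())                       # about 4.29 s
print([interval_ns(0, s) for s in snapshots])   # sync to each input
print(interval_ns(snapshots[0], snapshots[1]))  # input to input
```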

What resources are you referring to when you say FPGAs have resources that can go under 1ns in time resolve?

Dave

Reply to
starfire

A few months ago Xilinx announced that they had achieved 1 GHz performance in the lab, so it's probably a couple of years away for production devices.

Leon

Reply to
Leon Heller

Sounds like a time-domain problem...

Consider a 250MHz clock with 4 phases from a DLL/PLL: capturing on all four resolves to 1ns, but the logic only needs to toggle at 250MHz. Or use a long, simple carry chain with many capture registers: an edge can be captured to within the delay quantization, so a chain of 200 elements at 200ps each spans 40ns. This will need alternating calibrate/measure cycles, as the delays are silicon-derived and so vary with Vcc and temperature.
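A small Python sketch of the calibrate/measure idea for such a delay chain follows. It assumes the chain is captured as a thermometer code (1s up to the point the edge reached) and that calibration counts how many taps a known reference period spans. All names and numbers are illustrative assumptions, not taken from a real device.

```python
def taps_reached(thermometer):
    """Number of delay elements the edge passed: the run of leading 1s."""
    count = 0
    for bit in thermometer:
        if bit != 1:
            break
        count += 1
    return count

def calibrate_tap_ps(ref_period_ps, taps_in_ref_period):
    """Average tap delay, from how many taps a known period spans."""
    return ref_period_ps / taps_in_ref_period

def to_picoseconds(thermometer, tap_ps):
    """Convert a captured thermometer code to time, using the calibration."""
    return taps_reached(thermometer) * tap_ps

# Hypothetical numbers: a 4000 ps reference period spanning 20 taps
# calibrates to 200 ps per tap; at a different Vcc/temperature the same
# period might span 18 taps instead.
tap = calibrate_tap_ps(4000, 20)
print(tap, to_picoseconds([1, 1, 1, 0, 0], tap))
```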

Some DLLs/DCMs allow a finer phase adjustment than 4 steps, so an 8-phase clock and 8 copies of 125MHz counters/capture registers would resolve to 1ns (on each input edge). You will need to watch aperture and metastability effects across clock domains, but the x8 copy scheme would allow you to check the integrity, as all counters should be within 1 count of one another. So you might read [+1][+1][+1][Whoops][+0][+0][+0][+0], where [Whoops] is a wildly variant value that indicates the sample edge violated the [DeltaQt]+[DeltaDt] aperture time.
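The integrity check can be sketched in Python as below. The counting model is an assumption made for illustration: counter k increments at t = 8m + k ns, so it reads (t - k) // 8. The constant offset in `decode` follows from that model and would differ in a real design. A capture that is neither the base value nor base + 1 is reported as a "whoops" phase.

```python
from collections import Counter

PHASES = 8  # assumed: eight 125 MHz counters, clock phases 1 ns apart

def ideal_captures(t_ns):
    """Model of what the eight counters hold at time t_ns (no violations)."""
    return [(t_ns - k) // PHASES for k in range(PHASES)]

def decode(captures):
    """Return (time_ns, bad_phases) from one set of captured counts."""
    hist = Counter(captures)
    # Pick 'low' so that the pair {low, low + 1} covers the most captures.
    low = max(hist, key=lambda v: hist[v] + hist[v + 1])
    advanced = sum(1 for c in captures if c == low + 1)
    bad = [k for k, c in enumerate(captures) if c not in (low, low + 1)]
    # The (PHASES - 1) offset comes from the model in ideal_captures().
    return PHASES * low + advanced + (PHASES - 1), bad

caps = ideal_captures(1003)
caps[3] = 999999          # simulate one capture corrupted by an aperture violation
print(decode(caps))       # (1002, [3])
```

With one corrupted phase the decoded time is uncertain by 1 ns, since the bad register sits exactly at the +1/+0 boundary.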

As a general indication of the counter speeds/width, these are from a Lattice data sheet ( not clear if these are guaranteed, or typical )

16-bit counter   360 MHz
32-bit counter   280 MHz
64-bit counter   180 MHz

-jg

Reply to
Jim Granville

Well - there are still the RocketIOs ... You can easily reach 0.5 ns. There was a thread in March - look at Message-ID:

(with RocketIO-X you could even go further..)

bye, Michael

Reply to
Michael Schöberl

formatting link

will work better for most people, as very few news servers will hold a message from March.

Regards, Allan.

Reply to
Allan Herriman

I am surprised that no one has mentioned that you can pipeline a counter to get much higher speeds. This takes more logic, and your capture registers must also be pipelined, but you can get much higher speeds this way. Each bit of the counter has two FF outputs: one is that bit of the count and the other is the carry out to the next stage. So each bit of the counter will be one clock behind the next lower bit. It only requires a single stage of carry propagation, so longer counters do not run slower. This will run at about the same speed as a toggle FF.

  Ci-1       ----           ----
  ----------| &  |---------|D  Q|--- Ci
        +---|    |         |    |
        |    ----    clk---|>   |
        +------------+      ----
             ----    |
            |D  Q|---+-------- Bi
  clk       |    |
  ----------|>   |
             ----
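A behavioural Python model of this kind of pipelined counter may help. It is a sketch under assumptions, not the circuit as built: each stage registers both its count bit and its carry out, bit i therefore lags bit 0 by i clocks, and a "capture" has to deskew the bits by taking bit i from (width - 1 - i) clocks earlier. The function name and the history list standing in for the pipelined capture registers are illustrative.

```python
def pipelined_counter(width, cycles):
    """Return deskewed counter values, one per clock once the pipe is full."""
    b = [0] * width        # b[i]: count bit i, lagging bit 0 by i clocks
    carry = [0] * width    # carry[i]: registered carry into bit i (carry[0] unused)
    history = []           # raw skewed bit vectors, one per clock
    aligned = []
    for t in range(cycles):
        history.append(b[:])
        if t >= width - 1:
            # Deskew: take bit i from (width - 1 - i) clocks ago.
            aligned.append(sum(history[t - (width - 1 - i)][i] << i
                               for i in range(width)))
        cin = [1] + carry[1:]                       # stage 0 always counts
        new_b = [b[i] ^ cin[i] for i in range(width)]
        # Carry out of stage i is the registered AND of its carry in and its bit.
        new_carry = [0] + [b[i] & cin[i] for i in range(width - 1)]
        b, carry = new_b, new_carry
    return aligned

print(pipelined_counter(4, 12))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

Each stage sees only a single AND and a single XOR between registers, regardless of width, which matches the claim that longer counters do not run slower.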

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX
Reply to
rickman

When we experiment in the lab, we do that with (pre)production devices. So we are then weeks or a few months ahead of general availability. Peter Alfke, Xilinx Applications

Reply to
Peter Alfke

I think your safest bet is to sample the input with four staggered 250 MHz clocks, feeding four shift registers. Then differentiate the edges and move them into a common 250 MHz clock domain. Or you use 8 phases for an even safer circuit. Virtex-II can adjust the clock in 50 ps increments, and 250 MHz is reasonable, while 125 MHz is easy, and definitely guaranteed to work. You can capture the data, store the arrival time of 512 input pulses in a BlockRAM and read the data out at your convenience. The trick of using MGTs has not been proven yet, but this thread rekindled my interest... Peter Alfke, Xilinx Applications.

Reply to
Peter Alfke

Well spotted. One key to pushing FPGAs is that you can trade logic for speed, and as these devices have a ton of registers anyway, it does not matter if you use 8x or 16x the minimum possible number of registers, if it means you can get 4x the time precision. -jg

Reply to
Jim Granville

Good to hear that :) - thinking aloud... What about a design that uses all the tricks, to push time-resolve as far as possible ? ( with maybe some FPGA family splits ) :

  • Pipelined counter (rickman) - what about a pipelined Gray counter ?
  • Multiple-phase counters/captures: x4 is simplest, x8 uses more resources. What is the limit - x16 / x32 ... ? (The 'limit' would be where the routing/delay/capture uncertainty jitter meets the minimum time resolve, though, like ADCs, you can generate LSBs that need further averaging/filtering to be useful.)

With simple phased clocks, you can synchronise the sample/unknown pulse in each domain, which allows binary pipeline capture. Gray counters would need post-conversion for maths, but they do avoid the aperture effects of the capture pulse arriving, and so they open the option of fractional LSB extension via clock-capture & delay lines.
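The post-conversion mentioned for Gray counters is cheap. A minimal Python sketch of the standard binary-reflected Gray code conversions (function names are illustrative):

```python
def binary_to_gray(n):
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Inverse conversion: XOR together all right-shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

# Successive Gray values differ in exactly one bit, so a capture that lands
# on a count transition can be wrong by at most one count.
for value in range(8):
    print(value, format(binary_to_gray(value), "03b"))
```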

-jg

Reply to
Jim Granville

I'm new to the FPGA world. I've worked with DSPs for years and am now moving to FPGAs, interested in their potential for high-rate counting (high energy physics). I think the limit will be the jitter, as Jim said. Does anyone know the max rate that could be achieved with the Spartan 3 family?

-- Luis Vaccaro

Reply to
Luis Vaccaro

You could just as well use a shift register for the lower 4 bits of the counter (16 FFs), and use the pulse from the last FF to increment a (16x slower) normal counter. *Somehow*.

The most obvious limit here would be the amount of clock resources - global clock lines (I suppose you don't want to use local clocks), and to a lesser degree DCMs.

-g

Reply to
Gerd

Yes, you can use prescaler schemes, which can be shift register / Johnson counters, but that presents problems on capture.

The DCMs look to have all the nice logic, but a rather limited number of taps for this type of time-extension (pity). Their advantage is they are there already, and are easy to deploy, and if this is your prime usage, who cares if all the DCMs are used ?.

You can, of course, make a similar fine-time device using the FPGA fabric itself, but that becomes a much more process- and tool-dependent path. But it would be interesting, and may approach 100ps in resolution.

-jg

Reply to
Jim Granville

Indeed, but not that big I guess.

Emphasis on 'to a lesser degree'. Even the small devices have 4 DCMs, each of which should supply you with four clocks (0/90/..). Starting with the 2vp20 you get 8 of them, but by then you are already running out of global clock nets (8/16).

It does.

"Process dependance" is a really nice understatement *g* I'm thinking more along environmental conditions, the exact placement you choose, etc. But again, it's not much help if you can't distribute your clocks.

"Tools" are down to fpga_editor & co, too.

regards,

-g

Reply to
Gerd

Here are my thoughts for a fairly simple implementation. If I recall, the original post asked for a report of the arrival time of input pulses (let's assume rising edges) with a resolution of 1 ns.

I suggest a synchronous design running at 250 MHz (synchronous counter, transfer to BlockRAM etc) augmented with a small "prescaling" front-end. The input line gets clocked into four flip-flops in parallel, each clocked on a different quadrant of the 250 MHz clock. Using the flip-flop clock polarity option, this requires only two global lines driven by one DCM.

Now that we have captured the input edge in 4 flip-flops, we have to figure out where it was captured first. For that, we must move the four staggered signals into the same clock domain, and we should move any signal only by a quarter clock per step (to avoid excessively tight delay requirements). This takes half a dozen flip-flops, followed by a 1-of-4 decoder that defines the position of the leading edge, and is used as the two LSBs for the timer.
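The decode step can be sketched in Python as follows. This models only the logic after the four samples share one clock domain, not the quarter-clock retiming flip-flops. The function names and the stimulus helper are assumptions for illustration, and the sketch relies on at most one rising edge per 4 ns cycle.

```python
PHASES = 4  # assumed: 250 MHz clock sampled at 0/90/180/270 degrees, 1 ns apart

def first_edge(prev_last, quad):
    """Index (0..3) of the first rising edge in this cycle, or None.

    prev_last is the 270-degree sample of the previous cycle; quad holds the
    four samples of the current cycle once they are in one clock domain.
    """
    window = [prev_last] + list(quad)
    for phase in range(PHASES):
        if window[phase] == 0 and window[phase + 1] == 1:
            return phase
    return None

def timestamps_ns(cycles):
    """Turn per-cycle sample quads into 1 ns timestamps of rising edges."""
    stamps = []
    prev_last = 0  # input assumed low before the first cycle
    for coarse, quad in enumerate(cycles):
        phase = first_edge(prev_last, quad)
        if phase is not None:
            # The 250 MHz count gives the upper bits, the phase the two LSBs.
            stamps.append(coarse * PHASES + phase)
        prev_last = quad[-1]
    return stamps

def sample(rise_times_ns, width_ns, total_ns):
    """Hypothetical stimulus: 1 ns samples of pulses starting at rise_times_ns."""
    bits = [int(any(r <= t < r + width_ns for r in rise_times_ns))
            for t in range(total_ns)]
    return [tuple(bits[i:i + PHASES]) for i in range(0, total_ns, PHASES)]

print(timestamps_ns(sample([6, 21], width_ns=5, total_ns=32)))  # [6, 21]
```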

This circuit would have problems if two pulses arrive within 4 ns, but I hope that is physically impossible.

Counter trickery is really not necessary. It's all synchronous to 250 MHz. It's only the sub-one-nanosecond resolution that requires some trickery.

Peter Alfke

Reply to
Peter Alfke
