FPGA : Async FIFO, Programmable full

bijoy 19 years ago

Hi, in designing asynchronous FIFOs we have to use Gray code read/write pointers.

And while using Gray code I know how to generate full and empty flags.

But I would like to get ideas from experienced designers on how to generate programmable full or programmable empty flags.

Thanks in advance

rgds bijoy

Symon 19 years ago

Well, no, you don't _HAVE_ to. You can use ordinary binary counters just fine, as long as you design the clock domain crossing bit properly.

Perhaps you could share this with the V4 FIFO designers! :-) Sorry Peter!

I use binary counters for my FIFOs. A Google search shows that arithmetic is difficult on gray coded numbers. e.g.

formatting link

There may well be a trick other than converting to binary, let us know if you find it! HTH, Syms.

Kim Enkovaara 19 years ago

But crossing the clock domain with a binary counter is hard to do correctly; it needs some kind of handshake protocol. It's not enough to just put dual flops on each counter bit, as you can with Gray coding.

--Kim

Ben Jones 19 years ago

A useful compromise is to have counters in binary, so you can have fast arithmetic, and then convert from binary to Gray-code, cross the clock domain, and then convert back again to do the pointer comparison. While there is some latency, it is usually easier than a bi-directional handshake.
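The conversion in each direction can be sketched in a few lines. This is only a software model of the standard reflected-binary Gray code, not anyone's actual RTL; the function names `bin_to_gray` and `gray_to_bin` are made up for illustration.

```python
def bin_to_gray(b: int) -> int:
    """Binary to Gray: each Gray bit is the XOR of two adjacent binary bits."""
    return b ^ (b >> 1)


def gray_to_bin(g: int) -> int:
    """Gray to binary: each binary bit is the XOR of all Gray bits at or above it."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


if __name__ == "__main__":
    # Pointer value 5 as it would be sent across the clock domain and recovered.
    sent = bin_to_gray(5)
    print(format(sent, "03b"), gray_to_bin(sent))  # prints: 111 5
```

In hardware the binary-to-Gray direction is one XOR level per bit, while the Gray-to-binary direction is an XOR chain whose depth grows with the pointer width, which is where most of the extra latency or logic depth of this approach comes from.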

Cheers,

-Ben-

JuanC 19 years ago

You can read these papers by Cliff Cummings on asynchronous FIFO design:

formatting link

Peter Alfke 19 years ago

There is a wrong way and a right way to convert binary to Gray count values.

The wrong way hangs the (simple!) combinatorial conversion logic (XORs) on the binary outputs. That does you no good, since the Gray code will just reflect the binary transient errors.

The right way takes the D inputs of the binary counter, converts them independently to Gray, and registers them in their own flip-flops. Now you have two parallel registers, both counting in step, the first in binary, the other in Gray, and there are no funny decoding spikes.

BTW, only Gray counter outputs have the feature that you can compare two asynchronous counters for identity without transient errors. If the Gray code does not come from a counter, it might change multiple bits per transition...

Fast asynchronous FIFOs with configurable flag values are definitely not a trivial design exercise. We got it right in Virtex-5, after a silly oversight in Virtex-4.

Peter Alfke
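A cycle-level software model of the "right way" might look like the sketch below. It is only an illustration of the idea of converting the next-state (D input) value and registering it separately; the class name `BinGrayCounter` and its fields are hypothetical, and a Python model cannot show the decoding spikes themselves, only that the registered Gray value steps one bit at a time.

```python
class BinGrayCounter:
    """Two registers clocked together: one binary, one Gray (illustrative model)."""

    def __init__(self, width: int):
        self.mask = (1 << width) - 1
        self.bin_q = 0   # binary register output
        self.gray_q = 0  # Gray register output, fed from the binary D inputs

    def clock(self, enable: bool = True) -> None:
        if not enable:
            return
        next_bin = (self.bin_q + 1) & self.mask  # D inputs of the binary counter
        next_gray = next_bin ^ (next_bin >> 1)   # converted before the flip-flops
        self.bin_q = next_bin                    # both registers update on the same edge
        self.gray_q = next_gray


if __name__ == "__main__":
    c = BinGrayCounter(width=3)
    for _ in range(8):
        c.clock()
        print(c.bin_q, format(c.gray_q, "03b"))
```

Because `gray_q` is a register output rather than XORs hanging on `bin_q`, consecutive values differ in exactly one bit, including at the wrap from the maximum count back to zero.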

KJ 19 years ago

Not to put a damper on the thread but I've found that there are probably very few applications for true programmable fifo flags inside a programmable chip that can be in-system updated like an FPGA. By 'programmable flag' I'm assuming that we're talking about having some register that can be written to with some arbitrary number that will cause some associated fifo status flag to trigger when it reaches that level. This arbitrary number is in some fashion controlled by software and is not and can not be known a priori.

There are applications where I'd like a flag to go off at 'almost full' or 'almost empty' but those 'almost' conditions have always been able to be determined as hard fixed constants. Maybe those levels are things like 7/8 or 321/456 or .01 (just making up numbers here) but the point is that the level where I want a fifo flag to trigger has (for me at least) always been traceable back to known system parameters. It could be that the trigger level is some bizarro function of those system parameters but the end result is a known fixed number, so generally what I will do is code the bizarro function into the VHDL that instantiates the fifo to produce the constant that will be passed as a generic to the fifo, which uses it to determine the flag status trigger levels.

True programmable flags where some outside software comes in and actually tells the fifo what specific level to trigger at came about back when fifos were mostly implemented as discrete parts and the suppliers of these little gems of course would have virtually no market for parts that trigger at levels like 7/8 or 321/456 or .01 so instead they came up with a mechanism to allow the user to program in an arbitrary level...which they did....and always (in my experience) did so with some constant that was determined from system parameters.

For those applications that fit the cases I've described and really can have fixed constants to determine possibly bizarro trigger levels there is no need for any convoluted math or moving Gray code to binary to do math. Your Gray code counter that keeps track of the depth is a finite state machine; there is exactly one state that needs to be matched to see if it corresponds to the equivalent of being 321/456 full.

If someone does have an application that really requires true programmable flags I'd be interested to hear about it. The environment though is

- Fifo is inside a programmable part.

- That programmable part can be updated in system.

Presumably this would have to mean that the trigger levels can not be determined either empirically at design/design verification time or as a function of known system parameters but would (presumably) have to be a dynamic function of what is going on in a particular system at a particular time (i.e. something unique to the particular system at that point causes the software to go 'Woooah, I need to change that fifo trigger level from 321/456 to 7/8 (or some such)').

KJ

Kim Enkovaara 19 years ago

I have many examples from the telecom side. For example, suppose you have some universal interface card that has each port configurable to a different protocol (for example STM1/4/16). That kind of big change in speed usually needs some changes in fifo limits to get maximum efficiency. If you have, for example, 8 ports, that would need quite a lot of different FPGA images to cover static settings. Also, partial reconfiguration is in my opinion a little too difficult a way to handle this problem.

Of course you can code the static values for each protocol and use a mux to select one of them for the fifo comparison logic. But it's easier just to add one register and let the software handle the values. Also, maximum efficiency usually needs some lab testing to find the optimal fifo values. And waiting hours for a new FPGA build after each change in the limits is not fun.
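For readers wondering what the "one register" approach computes, here is a rough software sketch of a write-side programmable-full flag. It assumes the common arrangement discussed earlier in the thread (binary write pointer, read pointer arriving Gray-coded through a synchronizer, pointers one bit wider than the address); the function and parameter names are invented for this example and do not describe any particular vendor FIFO.

```python
def gray_to_bin(g: int) -> int:
    """Convert a reflected-binary Gray value back to plain binary."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def write_side_flags(wr_ptr_bin: int, rd_ptr_gray_sync: int,
                     addr_bits: int, prog_full_thresh: int) -> dict:
    """Flags as seen in the write clock domain.

    wr_ptr_bin       -- local binary write pointer (addr_bits + 1 bits wide)
    rd_ptr_gray_sync -- read pointer, Gray coded, after the synchronizer
    prog_full_thresh -- software-written threshold register (assumed name)
    """
    ptr_mask = (1 << (addr_bits + 1)) - 1
    depth = 1 << addr_bits
    rd_ptr_bin = gray_to_bin(rd_ptr_gray_sync)
    fill = (wr_ptr_bin - rd_ptr_bin) & ptr_mask  # modulo subtraction handles wrap
    return {
        "fill": fill,
        "full": fill == depth,
        "prog_full": fill >= prog_full_thresh,
    }


if __name__ == "__main__":
    # 16-deep FIFO, 14 words written, reader has consumed 2 (Gray code of 2 is 3).
    print(write_side_flags(14, 3, addr_bits=4, prog_full_thresh=12))
```

Since the synchronized read pointer can only lag the real one, the write side overestimates the fill level, so `prog_full` errs on the safe side: it may assert a little early or deassert a little late, but never the reverse.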

I have used programmable flags many times in the lab while debugging different problems in a design, and while trying to understand how different limits affect the dynamics of interconnected fifo systems between chips etc.

--Kim

RCIngham 19 years ago

Whether that is the right answer rather depends on WHY the OP "has" to use Gray code, which is still not fully established.

BTW, there is a quite interesting article on Gray Code just posted at:

formatting link

KJ 19 years ago

Having a single fifo with (in your case) 8 different flags coming out at those 8 different fill levels would accomplish the same thing. The only thing that would need muxing would be the 8 different flags, not 8 entire fifos. Software would write to a register which would select one of the 8 pre-defined constants.

At some point though the logic required to generate an arbitrary number of flags that are comparisons of the fill level to a set of fixed constants becomes greater than the cost to compare the fill level to a programmable register. The point at which you cross that threshold depends on the number of bits in the counter, but in the cases I've run across it was cheaper (in logic) and faster in performance to do the multiple comparisons and select, but that was with 4 sets of fill levels....8 choices might just push you over.

I agree that lab testing to empirically find those optimal values can be a pain when it means an FPGA rebuild but once beyond that and the optimal values determined they usually tend to stay set in concrete. Hard coding them in at that point would tend to give you less logic and better performance (again, how much depends on the counter size).

Good example.

KJ

Tommy Thorn 19 years ago

How about a different way? Couldn't one simply maintain two views (one for each clock) of the state of the FIFO, always a conservative approximation to the "true" state, and use standard handshake techniques to communicate to the writer "since our last handshake, I've dequeued X words", and vice versa?

The advantage of this (besides simplicity) is that one can pipeline the handshake arbitrarily much, only at the expense of added latency between full->non-full and empty->non-empty transitions.

Isn't this a standard technique?

Tommy

Ray Andraka 19 years ago

I've used similar techniques. Basically, an alternative to using Gray code is to maintain population counts that are separate from the read and write pointers, and then resynchronize the write event for a read-side population count and/or resynchronize the read event for a write-side population count. The things to watch are: 1) you need to make sure that read or write events can't overrun the flag in your system. This can happen if, for example, you write twice between successive read clock edges. So, yes, this is a viable alternative: it will sometimes save some logic and can result in higher speeds, but it also carries with it a responsibility for the user to make sure the event flags don't overrun.

Tommy Thorn 19 years ago

That was the idea. I admit I have never done it so I was looking for the potential pitfalls.

I'm afraid I don't follow. As writing (resp. reading) follows a local, synchronous, conservative approximation, how could it overrun? The "flag" as I understand it is a direct function of the local counter, so it would be updated for each write.

Thanks, Tommy

Tommy Thorn 19 years ago

That was the idea, although I'm suggesting communicating deltas rather than absolute values (ie, how many elements read / written respectively). It may be the same, but it seems more intuitive.

I admit I have never done it so I was looking for the potential pitfalls.

I'm not sure I follow. As writing (like reading) follows a local, synchronous, conservative approximation that would be updated for each write, what would stop me from adding logic to drop the write if I deem the FIFO full?

BTW, what were the other things to look out for? :-)

Thanks, Tommy

Ray Andraka 19 years ago

OK, for example, if the write clock is at 10 MHz and the read clock is at 5 MHz, if you wrote on 2 consecutive write clocks, then synchronizing them to the read clock domain, you'd lose one or both of the write events. In order for this set-up to work, you need to guarantee that whatever signal you send across the clock domain boundary to indicate a write event or events is infrequent enough to be reliably sensed in the other domain for every single event. If there is a possibility of writing too frequently, then you need some sort of signaling scheme that can indicate a certain number of writes occurred and update the population count accordingly. If the clocks for both sides are similar frequencies, you need at least one clock between events to get a clean crossing. This extra care to make sure you don't have events happen too quickly for reliable operation is the reason the shrink wrapped FIFO components use the gray code schemes.
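The 10 MHz / 5 MHz case can be illustrated with a toy model. This is a deliberately crude sketch: each list index stands for one write-clock cycle, the reader is assumed to sample the strobe level on every second cycle, and metastability is ignored entirely. The function name `strobes_seen_by_reader` is made up for the example.

```python
def strobes_seen_by_reader(write_strobe, ratio=2, phase=0):
    """Count how many high samples a slower reader sees.

    write_strobe -- list of 0/1, one entry per write-clock cycle
    ratio        -- write-clock cycles per read-clock cycle (2 for 10 MHz vs 5 MHz)
    phase        -- which write cycle the first read edge lines up with
    """
    return sum(write_strobe[t] for t in range(phase, len(write_strobe), ratio))


if __name__ == "__main__":
    two_writes = [0, 0, 1, 1, 0, 0, 0, 0]   # writes on two consecutive write clocks
    one_write = [0, 0, 0, 1, 0, 0]          # a single isolated write

    print(strobes_seen_by_reader(two_writes, phase=0))  # prints 1: one write lost
    print(strobes_seen_by_reader(one_write, phase=0))   # prints 0: write missed entirely
    print(strobes_seen_by_reader(one_write, phase=1))   # prints 1: seen only by luck of phase
```

Stretching the strobe so the slow side always catches it fixes the missed single write, but two back-to-back writes then merge into one event, which is why a count (or a Gray-coded pointer) has to cross the boundary rather than a bare pulse.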

Tommy Thorn 19 years ago

Thanks Ray. I'm sorry to persist, but I still can't agree.

When the writer writes the first item we can have one of several situations, including:

- The reader has yet to acknowledge some messages prior to this write. No update will be sent until it does. The writer writes another item and updates its local count. Once the reader finally acknowledges the original message, the writer sends a "2 written" message.

- The writer wasn't waiting and goes ahead with a "1 written" message. The reader updates its local count and acknowledges that it saw the "1 written" message. The writer writes again etc.

- The writer wasn't waiting and goes ahead with a "1 written" message. The reader doesn't pick up the message in time before the writer writes to the FIFO again. No worries. Once the reader finally acknowledges the "1 written", the writer can issue a new message to the reader with the delta between the writer's current count and the writer's count at the time of the last message, thus including the write that just happened.

IMHO, any scheme depending on such an assumption is broken (it's too easy to forget this assumption when the design is revised a year later). I make no assumption on the relative speed of writers and readers. I do assume that no messages will be dropped/missed, but that assumption can also be removed with more work.
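The bookkeeping behind those three cases can be sketched as a small behavioural model. It assumes, as the post does, that no message is lost and that only one message is outstanding at a time; the class `DeltaSender` and its method names are hypothetical, and the actual cross-domain signalling (request/acknowledge synchronizers) is abstracted away.

```python
class DeltaSender:
    """One side's bookkeeping for 'N events since the last message' updates."""

    def __init__(self):
        self.total = 0         # events counted locally, updated on every event
        self.reported = 0      # events already covered by messages sent so far
        self.in_flight = None  # delta currently awaiting acknowledge, if any

    def event(self, n: int = 1) -> None:
        """A local write (or read) happened; only the local count changes."""
        self.total += n

    def try_send(self):
        """Launch a new delta message if idle and there is something to report."""
        if self.in_flight is None and self.total != self.reported:
            self.in_flight = self.total - self.reported
            self.reported = self.total
            return self.in_flight
        return None

    def ack(self) -> None:
        """The other side acknowledged the outstanding message."""
        self.in_flight = None


if __name__ == "__main__":
    writer, reader_view = DeltaSender(), 0
    writer.event()
    reader_view += writer.try_send()   # "1 written"
    writer.event(); writer.event()     # two more writes before the ack arrives
    assert writer.try_send() is None   # still waiting, nothing is sent
    writer.ack()
    reader_view += writer.try_send()   # "2 written" covers both missed writes
    print(reader_view, writer.total)   # prints: 3 3
```

The receiving side's count never exceeds the sender's, so the reader's view of "words available" stays conservative regardless of the clock ratio; the cost is the latency of each handshake round trip.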

I believe that's what I suggested.

Thanks Tommy

Daniel S. 19 years ago

There is a simple way of avoiding "mysterious glitches" when using binary counters: freeze the counter in a second register in the first clock domain and do a one bit handshake across the two domains. Since the counter (copy) is frozen during the handshaking, the other clock domain's FFs will have ample setup time until the strobe propagates through the resync.

With a fully resync'd handshake, this approach does introduce up to 3x clock1 + 1.5x clock2 cycles delay between update, which is certainly not suitable for low-latency and small high-speed FIFOs, especially if there is a large difference in clock speeds. It works well as long as one can live with long status delays for the first (read) and last (write) few FIFO words.

The design is nice as long as clocks are of similar frequencies but the delay may become an issue for large read:write (and vice-versa) clock ratios. At 10:1, the status update may have a latency exceeding 40 fast clock cycles.

So, binary counters can be safely used by using frozen copies and handshaking. The caveat is higher latency than gray counters due to handshaking and register copies that add extra dependencies upon the other clock domain.
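As a rough behavioural sketch of the frozen-copy idea: the model below collapses the resynchronizer stages and the returning acknowledge into single steps, so it says nothing about the actual cycle counts quoted above. The class `FrozenCopyLink` and its fields are invented for illustration only.

```python
class FrozenCopyLink:
    """Binary counter crossing via a frozen snapshot and a one-bit handshake."""

    def __init__(self):
        self.counter = 0    # live binary counter in the source domain
        self.frozen = 0     # snapshot, held stable while a handshake is pending
        self.req = False    # one-bit request (would pass through a resynchronizer)
        self.dest_view = 0  # destination domain's copy of the count

    def source_tick(self, increment: int = 0) -> None:
        """One source-clock cycle: count, and start a handshake if idle."""
        self.counter += increment
        if not self.req:
            self.frozen = self.counter  # take a new snapshot only when idle
            self.req = True

    def dest_tick(self) -> None:
        """One destination-clock cycle: capture the snapshot if a request is seen."""
        if self.req:
            self.dest_view = self.frozen  # safe: frozen cannot change while req is set
            self.req = False              # stands in for the acknowledge path


if __name__ == "__main__":
    link = FrozenCopyLink()
    for _ in range(3):
        link.source_tick(increment=1)    # three counts; snapshot froze at the first
    link.dest_tick()
    print(link.counter, link.dest_view)  # prints: 3 1  (destination lags)
    link.source_tick()
    link.dest_tick()
    print(link.counter, link.dest_view)  # prints: 3 3  (caught up on next handshake)
```

The destination never samples a value that is in the middle of changing, but its view is stale by up to a full handshake, which is the latency penalty described above.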

As always, designers are free to pick their poison.
