DC FIFO behaviour at underflow/overflow

We all know that a FIFO should operate without getting empty or full. Does anybody have experience of what sort of output disorder one can expect when operating in the wrong state (underflow or overflow)?

I am asking because naturally one thinks of data samples getting lost when a FIFO is in this wrong state, but I am seeing a different output pattern at the final system output and am trying to find the cause. The pattern I get is an odd/even offset by some 8 samples in one case, or every 8th sample duplicated in another case. For case 1, if I realign the stream it becomes correct, so I am not actually losing samples. The system is too large and remotely tested, and there is not much room to do any tests at the moment.

I suspect a DC FIFO in the path that may be entering the wrong state (underflow/overflow). It is an Altera DC FIFO in a Stratix IV, writing on a ~368 MHz clock and reading on a ~245 MHz clock, 32 bits wide and 8 words deep.

Any thoughts appreciated.

Kaz


Reply to
kaz

I would never let a FIFO overflow or underflow. You should always stop writing to the FIFO if the full flag is set and discard the input data stream. If the empty flag is set you should not read from the FIFO; instead, output known dummy data (invariably I output all zeros).

Following this rule the behaviour of the FIFO is totally predictable.
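In VHDL terms the rule boils down to something like the sketch below (din_valid and dout are illustrative names, and the statements are assumed to sit in the architecture that instantiates the FIFO):

-- Rough sketch of the rule around a dual-clock FIFO; din_valid and dout
-- are illustrative names, not from any particular design.

-- Write side (wrclk domain): discard input samples while the FIFO is full.
wrreq <= din_valid and not wrfull;

-- Read side (rdclk domain): never pop an empty FIFO, and output known
-- dummy data (all zeros) instead. With a non-showahead FIFO, q lags rdreq
-- by one cycle, so a real design would pipeline this selection.
rdreq <= not rdempty;
dout  <= q when rdempty = '0' else (others => '0');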

Andy

Reply to
Andy Bartlett


Thanks Andy. No question that the FIFO is meant to work away from underflow or overflow. What I am asking is whether there are any known patterns that could emerge, after all, within this unpredictability. I am really asking about known symptoms of the wrong behaviour.

Kaz


Reply to
kaz

Depends how the FIFO is constructed.

If it is built as a dual-port RAM with an incrementable write pointer on the input port and an incrementable read pointer on the output port, then if you fill it to full, stop writing, and keep pulling data from the read port, it will act as a circular buffer: the data will repeat with a period equal to the FIFO length.
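To make that concrete, here is a toy version of such a FIFO core (a single-clock sketch for brevity, not Altera's dcfifo). With no underflow protection, the read pointer just keeps wrapping around the RAM, so continued reads replay the same 8 words:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Toy dual-port-RAM FIFO core (single clock for brevity; not Altera's
-- dcfifo). Without underflow protection the read pointer wraps modulo the
-- depth, so reading past empty behaves as a circular buffer.
entity toy_fifo is
  port (
    clk   : in  std_logic;
    wrreq : in  std_logic;
    rdreq : in  std_logic;
    data  : in  std_logic_vector(31 downto 0);
    q     : out std_logic_vector(31 downto 0));
end entity toy_fifo;

architecture rtl of toy_fifo is
  type ram_t is array (0 to 7) of std_logic_vector(31 downto 0);  -- 8 deep
  signal ram    : ram_t;
  signal wr_ptr : unsigned(2 downto 0) := (others => '0');
  signal rd_ptr : unsigned(2 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if wrreq = '1' then
        ram(to_integer(wr_ptr)) <= data;
        wr_ptr <= wr_ptr + 1;            -- wraps modulo 8
      end if;
      if rdreq = '1' then
        q      <= ram(to_integer(rd_ptr));
        rd_ptr <= rd_ptr + 1;            -- also wraps modulo 8: once writing
      end if;                            -- stops, reads repeat the same words
    end if;
  end process;
end architecture rtl;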

You can work out other scenarios for this architecture yourself, for sure.

Andy

Reply to
Andy Bartlett


My crucial point is: is there any way this Altera FIFO will break the stream up into another stream with the even samples ahead of the odd samples by 8 samples?

Kaz


Reply to
kaz

Let me rephrase the problem. It may not be that the presumed FIFO problem is a case of underflow/overflow; rather, it may be a timing problem, or both mixed together.

DC FIFOs protect against metastability to some degree, but a failure could still occur. The cross-domain paths are made false paths by default, understandably. So isn't this a case of intermittent loss of functionality that simply has to be accepted? The error stays for several tens of milliseconds and then disappears. Wouldn't we expect FIFOs to recover more quickly (the internal sync pipeline is set to 3)?

Kaz


Reply to
kaz

I saw a DC (dual clock) FIFO do something like that once. It was in a Xilinx part, but the design error would apply equally well to an Altera part or an ASIC.

It was part of an IP core that a client had bought. To make a 64 bit wide FIFO, the IP developer had used two 32 bit wide FIFOs in parallel. The two FIFOs had independent control circuits.

Of course, as a dual clock FIFO, one can't really make any guarantees about the depth immediately after an asynchronous reset when the clocks are running, and indeed the two halves of the FIFO would sometimes start with different depths. There was no circuit to check for this state and get them back into sync, and the end result was that, until the next reset, 32 bit chunks of data were swapped around.

Regards, Allan

Reply to
Allan Herriman

The symptoms look exactly like underflow/overflow.


According to the dcfifo help, a value of 3 is internally translated to 1, which for the very high clock rates you are using is almost certainly insufficient. Try 4.

Did you pay attention to the DELAY_RDUSEDW/DELAY_WRUSEDW parameters? Altera's default value (1) is unintuitive and, in my experience, tends to cause problems. If you rely on exact values of the rdusedw or wrusedw ports for anything non-trivial, I'd recommend setting the respective DELAY_xxUSEDW to 0. I'd also set OVERFLOW_CHECKING/UNDERFLOW_CHECKING to "OFF" and do underflow/overflow prevention in my own logic.
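For reference, those settings go into the dcfifo generic map; a sketch of an instance configured that way (the generic names are real dcfifo parameters, the instance and port signal names are placeholders):

-- Sketch of a dcfifo instance with those parameters set explicitly. The
-- generic names are real dcfifo parameters; my_fifo and the port signals
-- are placeholders.
my_fifo : dcfifo
  GENERIC MAP (
    lpm_width          => 32,
    lpm_numwords       => 8,
    lpm_widthu         => 3,
    delay_rdusedw      => 0,      -- exact, undelayed rdusedw
    delay_wrusedw      => 0,      -- exact, undelayed wrusedw
    overflow_checking  => "OFF",  -- prevent overflow in user logic instead
    underflow_checking => "OFF",  -- likewise for underflow
    rdsync_delaypipe   => 4,      -- deeper synchronizers for high clock rates
    wrsync_delaypipe   => 4
  )
  PORT MAP (
    wrclk => wrclk, wrreq => wrreq, data => data,
    rdclk => rdclk, rdreq => rdreq, q => q,
    rdempty => rdempty, wrfull => wrfull
  );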

BTW, personally, I wouldn't use Altera's 8-deep FIFOs; they don't appear to be as well tested as their deeper relatives. Or maybe it's just me.

Reply to
Michael S

Many thanks for your contributions.

The FIFO I am using is very basic: 32 bits wide, 8 words deep, no reset, 3 stage synchroniser, write and read connected directly (combinatorially) to the full/empty flags, word count not used, clocks (wr/rd) 368/245 MHz.

I am trying to get my head deeper into how a FIFO might work internally. Assuming the simplest case, I understand the write pointer is clocked by the write clock and increments on write request (counting in binary or Gray). The read pointer mirrors that on the read side. The signals that cross the clock domain are the empty/full flags (in my case, since I am not using the word counts).

What now mystifies me is that if anything went wrong, be it a flow issue or timing, wouldn't these counters just increment from wherever they had landed, implying self recovery, excluding the case where the read pointer is ahead of the write pointer (as assumed in my case, because each sample is read out correctly but misaligned)? I mean, to get an 8-sample odd/even misalignment I can only think of the pointers going crazy, or addresses arriving crazy but regular.

Kaz


Reply to
kaz



No, it does not work like that. The signals that cross clock domains are:

  • write pointer - a resynchronized variant of it is used in the rdclk clock domain to generate rdempty and rdusedw
  • read pointer - a resynchronized variant of it is used in the wrclk clock domain to generate wrfull and wrusedw
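A sketch of the read-side half of that structure, assuming Gray-coded pointers and a two-flop synchronizer (the generic textbook arrangement, not Altera's actual netlist):

library ieee;
use ieee.std_logic_1164.all;

-- Read-clock-domain half of a dual-clock FIFO: resynchronize the Gray-coded
-- write pointer and derive rdempty from it. Textbook structure, not Altera's
-- actual netlist; 4-bit pointers suit an 8-deep FIFO (one extra MSB to
-- distinguish full from empty).
entity rd_side_sync is
  port (
    rdclk       : in  std_logic;
    wr_ptr_gray : in  std_logic_vector(3 downto 0);  -- from the wrclk domain
    rd_ptr_gray : in  std_logic_vector(3 downto 0);  -- local read pointer
    rdempty     : out std_logic);
end entity rd_side_sync;

architecture rtl of rd_side_sync is
  signal meta, sync : std_logic_vector(3 downto 0);
begin
  process (rdclk)
  begin
    if rising_edge(rdclk) then
      meta <= wr_ptr_gray;   -- first flop: may sample a changing value
      sync <= meta;          -- second flop: a cycle to resolve metastability
    end if;
  end process;

  -- Empty when the local read pointer equals the delayed view of the write
  -- pointer. Only one Gray bit changes per increment, so a badly timed
  -- sample resolves to the old or the new value, never to garbage.
  rdempty <= '1' when rd_ptr_gray = sync else '0';
end architecture rtl;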

Your write clock is faster than your read clock, so, supposedly, your wrreq is gated by something besides wrfull?

Reply to
Michael S


I agree that a resynchronised variant of the write pointer will be used to generate rdempty and rdusedw in the other domain, but not for the read pointer itself, i.e. each side has its own pointer.


Yes, the input rate is controlled by a valid signal that is active in a 2/3 ratio, regularly. The read side is always active if the FIFO is not empty.

Kaz


Reply to
kaz


Of course. Each side has its own pointer plus a resynchronised copy of the other side's pointer.

Does "yes" mean *only* wrfull, or are there additional terms? If the former, where is wrdata coming from?

Can you post a representative excerpt from your design here?

Reply to
Michael S

Yes, there is an extra term. Here is an excerpt:

TX_SRX_FIFO_inst : TX_SRX_FIFO
  PORT MAP (
    data    => TX_SRX_FIFO_DATA,
    rdclk   => iCLK245,
    rdreq   => TX_SRX_FIFO_rdreq,
    wrclk   => iCLK368,
    wrreq   => TX_SRX_FIFO_wrreq,
    q       => TX_SRX_FIFO_q,
    rdempty => TX_SRX_FIFO_empty,
    wrfull  => TX_SRX_FIFO_full
  );

-- a 2-in-3 clock enable is used for TX_SRX_FIFO_wrreq
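A 2-in-3 enable of this kind might look like the following (a generic sketch, not the actual code; phase and valid_2of3 are illustrative names, and only TX_SRX_FIFO_wrreq/TX_SRX_FIFO_full appear in the excerpt above):

-- Generic sketch of a 2-in-3 write enable ("++-" pattern) ANDed with the
-- full flag as the extra term.
-- Assumes: signal phase : integer range 0 to 2 := 0;
process (iCLK368)
begin
  if rising_edge(iCLK368) then
    if phase = 2 then
      phase <= 0;
    else
      phase <= phase + 1;
    end if;
  end if;
end process;

valid_2of3        <= '1' when phase /= 2 else '0';        -- write on 2 of 3
TX_SRX_FIFO_wrreq <= valid_2of3 and not TX_SRX_FIFO_full; -- never write full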

Reply to
kaz

WOW, I reproduced the behavior that you describe (non-recovery after overflow) in functional simulation with Altera's internal simulator! I never imagined that anything like that was possible. Sounds like a bug in the implementation of dcfifo. Of course Altera will call this bug a feature and will say that as long as there was an overflow nothing could be guaranteed. Or similar bullsheet. I am writing a sequential counter and see a pattern like (64, 57, 66, 59, 68, 61, 70, 63...) on the read side. To be continued...
Reply to
Michael S

A few more observations:

  1. The problem is not limited to the 8-deep DCFIFO. A 16-deep DCFIFO could easily be forced into the same "mad" state.
  2. A single write into a full FIFO is not enough to trigger the problem. You have to write to the full FIFO 3 times in a row, which generally should never happen even in the presence of poorly prevented metastability.
  3. So, in order to force the FIFO into the "mad" state you have to do a stupid sequence on the write side. But once the FIFO is mad, it is the read side that keeps it there. Somehow, it stops correctly detecting the rdempty condition.

What would I do?

  1. I'd increase RDSYNC_DELAYPIPE/WRSYNC_DELAYPIPE to 4. It's very unlikely that the problem is here, but for such high clock frequencies the value of 3 is still wrong.
  2. I'd start looking for a race-condition type of bug, like feeding one clock domain with a vector generated in another clock domain. If you don't know all parts of the design, try looking at the TimeQuest "clock transfers" display. It could be helpful.
  3. In the longer run, I'd redesign the whole synchronization block. IMHO, a design whose maximal FIFO read throughput is exactly equal to the *nominal* write throughput is not sufficiently robust. I'd very much prefer the maximal read throughput to be at least 1% higher. Then your FIFO will be close to empty most of the time and the block as a whole will be "self-curing". As an additional benefit, you will have more predictable latency through the FIFO. Even if latency is not important for your main functionality, it's good for easier debugging.
Reply to
Michael S

Thanks so much Michael. It is great that you thought of simulating the FIFO in this mad state. I will try to reproduce that. I assume you are doing functional simulation. The interesting thing is that we have many FIFOs in our system but only this one is misbehaving.

Kaz


Reply to
kaz


I thought a bit more about it. As a result, I am taking back everything I said about Altera in post #13. Altera's dcfifo is o.k. The access pattern is just too troublesome for overflow recovery; it would cause problems for any reasonable FIFO implementation. Sorry, Altera, I was wrong.

Now I'll try to explain the problem. Immediately after an overflow, the read pointer and the write pointer are very close to each other: on one cycle the read pointer pulls ahead of the write pointer and reads the sample from 9 writes ago; on the next cycle it falls behind the write pointer and reads the very last written sample; then it pulls ahead again, and so on. This happens because the read machine sees a delayed version of the write pointer, trailing the read pointer by one or two, and therefore thinks that the FIFO is almost full, so it continues to read. The write machine, on the other hand, sees a delayed version of the read pointer, equal to the write pointer or trailing it by one, and therefore thinks that the FIFO is either empty or almost empty, so it continues to write. Since the average rate of writing is exactly equal to the rate of reading, recovery from this situation can take a long time; with a common clock source, recovery might never happen.

The solution? Ensure that overflow/underflow never happens. If you can't, at least increase the frequency of the read clock, as suggested in my previous post; a 1% increase is enough. If that is also too hard, then slightly modify the write pattern: instead of "++-++-++-++-" do "++++++++----++++++++----". That pattern will guarantee instant overflow/underflow recovery. If, for some reason, such a modification of the write pattern is impossible, then do the smaller modification "++++--++++--". This pattern is not safe, but probabilistically it should recover from overflow much faster than yours.
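Assuming the 2-in-3 enable comes from a small phase counter (as sketched earlier in the thread), the burstier pattern is the same idea with a longer period; how the data itself is buffered to match the new phasing is left aside here:

-- Sketch of the burstier enable: same 2/3 average write rate, but grouped
-- as 8 writes then 4 idle cycles, so every gap lets the FIFO drain and the
-- pointers settle after a mishap.
-- Assumes: signal phase12 : integer range 0 to 11 := 0;
process (iCLK368)
begin
  if rising_edge(iCLK368) then
    if phase12 = 11 then
      phase12 <= 0;
    else
      phase12 <= phase12 + 1;
    end if;
  end if;
end process;

valid_burst       <= '1' when phase12 < 8 else '0';       -- ++++++++----
TX_SRX_FIFO_wrreq <= valid_burst and not TX_SRX_FIFO_full;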

Good luck.

Reply to
Michael S


Last I knew, FIFOs were supposed to have an almost full and almost empty signal to avoid that problem. Maybe at 7/8 and 1/8.

If you use the almost full and almost empty, that should leave plenty of margin for such delays. Even more is needed if the signals are processed through software.

Then, only after writing is finished, flush out all the data with the actual empty flag.
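With the dcfifo that means enabling the used-words ports (the posted design does not); a minimal sketch at those 7/8 and 1/8 thresholds, with a hypothetical flushing flag for the final drain:

-- Sketch: almost-full/almost-empty at 7/8 and 1/8 of an 8-deep FIFO,
-- derived from the 3-bit wrusedw/rdusedw ports (which would have to be
-- enabled on the dcfifo instance). Assumes ieee.numeric_std is visible;
-- din_valid and flushing are hypothetical control signals.
almost_full  <= '1' when wrfull = '1' or unsigned(wrusedw) >= 7 else '0';
-- (wrusedw wraps to 0 at full for power-of-two depths, hence the wrfull term)
almost_empty <= '1' when unsigned(rdusedw) <= 1 else '0';

-- Stop writing early, read only with comfortable margin, and use the true
-- empty flag only for the final flush after writing has finished.
wrreq <= din_valid and not almost_full;
rdreq <= (not almost_empty) or (flushing and not rdempty);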

-- glen

Reply to
glen herrmannsfeldt

I'd also like to see a definition of TX_SRX_FIFO.

Reply to
Michael S

Hi Michael,

Below is the definition of the FIFO. What troubles me is that write/read are tied to full/empty respectively, so I don't see why flow problems should occur. Moreover, the write/read is protected internally as well.

Could you also please let me know whether it was a timing simulation that you did?

Thanks

LIBRARY ieee;
USE ieee.std_logic_1164.all;

LIBRARY altera_mf;
USE altera_mf.all;

ENTITY TX_SRX_FIFO IS
  PORT (
    data    : IN  STD_LOGIC_VECTOR (31 DOWNTO 0);
    rdclk   : IN  STD_LOGIC;
    rdreq   : IN  STD_LOGIC;
    wrclk   : IN  STD_LOGIC;
    wrreq   : IN  STD_LOGIC;
    q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
    rdempty : OUT STD_LOGIC;
    wrfull  : OUT STD_LOGIC
  );
END TX_SRX_FIFO;

ARCHITECTURE SYN OF tx_srx_fifo IS

SIGNAL sub_wire0 : STD_LOGIC;
SIGNAL sub_wire1 : STD_LOGIC;
SIGNAL sub_wire2 : STD_LOGIC_VECTOR (31 DOWNTO 0);

COMPONENT dcfifo
  GENERIC (
    intended_device_family : STRING;
    lpm_numwords           : NATURAL;
    lpm_showahead          : STRING;
    lpm_type               : STRING;
    lpm_width              : NATURAL;
    lpm_widthu             : NATURAL;
    overflow_checking      : STRING;
    rdsync_delaypipe       : NATURAL;
    underflow_checking     : STRING;
    use_eab                : STRING;
    wrsync_delaypipe       : NATURAL
  );
  PORT (
    wrclk   : IN  STD_LOGIC;
    rdempty : OUT STD_LOGIC;
    rdreq   : IN  STD_LOGIC;
    wrfull  : OUT STD_LOGIC;
    rdclk   : IN  STD_LOGIC;
    q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
    wrreq   : IN  STD_LOGIC;
    data    : IN  STD_LOGIC_VECTOR (31 DOWNTO 0)
  );
END COMPONENT;

BEGIN
  rdempty <= sub_wire0;
  wrfull  <= sub_wire1;
  q       <= sub_wire2(31 DOWNTO 0);

  dcfifo_component : dcfifo
  GENERIC MAP (
    intended_device_family => "Stratix IV",
    lpm_numwords           => 8,
    lpm_showahead          => "OFF",
    lpm_type               => "dcfifo",
    lpm_width              => 32,
    lpm_widthu             => 3,
    overflow_checking      => "ON",
    rdsync_delaypipe       => 5,
    underflow_checking     => "ON",
    use_eab                => "ON",
    wrsync_delaypipe       => 5
  )
  PORT MAP (
    wrclk   => wrclk,
    rdreq   => rdreq,
    rdclk   => rdclk,
    wrreq   => wrreq,
    data    => data,
    rdempty => sub_wire0,
    wrfull  => sub_wire1,
    q       => sub_wire2
  );

END SYN;

Kaz


Reply to
kaz
