Delaying a pulse train

I have to delay a pulse train by a given number of clocks on the same domain as the pulse to be delayed.

The best approach I can think of is to run a counter of sufficient width and log pulse transitions and states into a circular "pulse transition list" of sufficient depth. After waiting for the desired number of clock transitions (the delay) an output counter of the same width as the input sample counter is allowed to start counting. This counter is used to address the "pulse transition list" to generate a delayed output that matches the input.
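
For readers following along, here is a rough behavioural model of that scheme in Python (not HDL). It is only a sketch: the function name, the `depth` and `initial` parameters, and the use of unbounded integer timestamps in place of a wrapping hardware counter are all assumptions made for illustration.

```python
from collections import deque

def delay_by_edge_list(samples, delay, depth=16, initial=0):
    """Delay a 1-bit stream by `delay` clocks, storing only its transitions."""
    fifo = deque()      # the "pulse transition list": (timestamp, new_level)
    out = []
    prev = initial      # last input level seen
    level = initial     # current level of the delayed output
    for t, s in enumerate(samples):
        if s != prev:   # input edge: log its time and the new state
            if len(fifo) == depth:
                raise OverflowError("transition list full")
            fifo.append((t, s))
            prev = s
        # Output side: replay the oldest edge once it is `delay` clocks old.
        if fifo and fifo[0][0] + delay == t:
            level = fifo.popleft()[1]
        out.append(level)
    return out

pulses = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
delayed = delay_by_edge_list(pulses, delay=3)
```

The list depth only has to cover the number of edges in flight during the delay, not the number of clocks.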

Using SelectRAM memory for delay is out of the question as too much memory would be required and it is needed elsewhere in the design.

Can anyone suggest a better way to do this? The incoming pulses are relatively regular and can be of any duration, from a few clocks to hundreds.

Thanks,

-Martin

Reply to
m

Martin, have you thought of using the dual-ported BlockRAMs? Use a free-running counter to address one port, where you write the incoming data. Use another counter, appropriately offset, (or a subtractor) to read the data out on the other port.
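
A small behavioural sketch of the two-counter idea, in Python rather than HDL. The names `delay_by_ring_buffer`, `ram`, `wr` and `rd` are made up for this illustration, and the model assumes the read happens before the write on each clock and that the delay is smaller than the memory depth.

```python
def delay_by_ring_buffer(samples, delay, depth):
    """Delay a stream by `delay` clocks using a memory of `depth` words."""
    if not 0 < delay < depth:
        raise ValueError("delay must be between 1 and depth - 1")
    ram = [0] * depth   # stands in for the dual-ported memory
    wr = 0              # free-running write address counter
    out = []
    for s in samples:
        rd = (wr - delay) % depth   # read address = write address - offset
        out.append(ram[rd])         # read port: data written `delay` clocks ago
        ram[wr] = s                 # write port: store the incoming sample
        wr = (wr + 1) % depth       # counter wraps at the memory depth
    return out

pulses = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
delayed = delay_by_ring_buffer(pulses, delay=3, depth=8)
```

Every clock costs one bit of storage here, which is why the memory requirement scales with the delay rather than with the number of edges.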


Reply to
Peter Alfke

It would consume too many BRAMs. I need delays on the order of 500 us at 150 MHz. That's about 5 BRAMs (Virtex2), and I need them for other portions of the design.

It'd sure be easy though!

-M

Reply to
m

Each pulse can be "a few" to "hundreds" of clocks long, and the overall delay can be 500 us * 150 MHz = 75000 clock periods. How many pulses can be in transit at any one time? Hundreds? Only one? The transition list might be the lowest-resource approach, especially when using SRLs to maintain the list. I'm just not sure whether 16 (or 32 or 64) elements are enough for your needs. You'll need a 17-bit counter to reach 500 us at 150 MHz. It might be handy to guarantee clean operation by adding an 18th bit for pulse polarity. The number of entries you'll need is twice the number of pulses you must accommodate, since each pulse contributes two edges.
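
The numbers above can be checked with a few lines of Python. This is only a sizing sketch: the function name, the dictionary keys and the example of 8 pulses in flight are assumptions, and the brute-force figure simply divides the delay in bits by a nominal 18 Kbit (18432-bit) block.

```python
def size_edge_list(clock_mhz, delay_us, pulses_in_flight, bram_bits=18 * 1024):
    """Rough storage estimate for the transition list vs. a 1-bit delay line."""
    delay_clocks = clock_mhz * delay_us          # MHz * us = clock periods
    counter_bits = delay_clocks.bit_length()     # bits needed to count that far
    entry_bits = counter_bits + 1                # timestamp plus a polarity bit
    entries = 2 * pulses_in_flight               # two edges per pulse
    return {
        "delay_clocks": delay_clocks,
        "counter_bits": counter_bits,
        "entry_bits": entry_bits,
        "entries": entries,
        "list_bits": entries * entry_bits,
        # Brute force: one bit per clock, rounded up to whole block RAMs.
        "brute_force_brams": -(-delay_clocks // bram_bits),
    }

estimate = size_edge_list(clock_mhz=150, delay_us=500, pulses_in_flight=8)
```

With 8 pulses in flight (a made-up figure), the list comes to 16 entries of 18 bits, against 5 block RAMs per stream for the brute-force delay line.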

Reply to
John_H

Reply to
Peter Alfke

His original suggestion was to mark the edges on the input and replicate the edges to the output with a counter shifted by the needed delay, accumulating these transitions in a list. As long as the number of transitions in the delay is small (compared to 75k, at least) his suggested approach is wholly adequate.

Reply to
John_H

I have the module I described in the original post working just fine right now. So, for cases where the number of transitions during the delay period isn't excessive, yes, there is another way. I agree, though, that if the intent were to handle an arbitrary pulse train, you'd have no choice but to store it and "replay" it at a later time, which requires BRAM.

-M

Reply to
m

I think his edge-tag scheme is valid and is often used in logic analysers to get better apparent dynamic range. Of course it is more complex than a simple spinning delay buffer, but the logic/BRAM trade-off may be worth it.

-jg

Reply to
Jim Granville

What is your expected maximum edge count in that 500 us?

-jg

Reply to
Jim Granville

Huffman run-length encoding has been used successfully in early-generation fax machines. Whether that or any other compression scheme (e.g. edge detection) is good depends on the characteristics of the bitstream in question. Peter Alfke

Reply to
Peter Alfke

I should have also mentioned that there's a need to delay more than one such pulse stream (at least four). That would require about twenty 18K BRAMs...I can't use that many without going to a significantly larger device.

-M

Reply to
m

If you have the timing headroom, you might be able to compress what you have now with a simple dictionary-type lookup and dT storage. That is one more level of lookup, but the average storage per edge can drop. How many bits do you store now per edge?

-jg

Reply to
Jim Granville

Reply to
Peter Alfke

Martin,

In the end this will result in using LUTs as a long shift register or a DPRAM as a FIFO, won't it? A memory-based approach saves LUTs. I think it's best implemented using block RAMs.

Luc

Reply to
lb.edc

I think the description of your solution in your original post was rather confused.

To me your "circular pulse transition list" needs to be a FIFO. If you only record transitions, you need wide counters and a wide FIFO to handle widely spaced pulses. A very long 1-bit FIFO is the opposite extreme.

The optimum will be run length encoding with a FIFO width of something in between (assuming there is no pattern to the pulse data which a compression scheme could exploit).

Use 1 bit to indicate the data state and n bits to count the clock cycles spent in that state. For multiple streams you can add more data bits and use only one set of encoding/decoding logic and one larger FIFO.
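
A minimal Python sketch of that word format follows. The function names and the way a long run is split when the n-bit count saturates are assumptions for illustration, not something specified in this thread.

```python
def rle_encode(samples, count_bits):
    """Pack a 1-bit stream into (state, run_length) words for a FIFO."""
    max_run = (1 << count_bits) - 1   # largest run an n-bit count can hold
    words = []
    for s in samples:
        if words and words[-1][0] == s and words[-1][1] < max_run:
            state, run = words[-1]
            words[-1] = (state, run + 1)   # extend the current run
        else:
            words.append((s, 1))           # new state, or count saturated
    return words

def rle_decode(words):
    """Expand (state, run_length) words back into one sample per clock."""
    out = []
    for state, run in words:
        out.extend([state] * run)
    return out

stream = [0] * 5 + [1] * 20 + [0] * 3
words = rle_encode(stream, count_bits=3)
```

A narrow count field costs extra words on long runs, and a wide one wastes bits on short runs, which is the trade-off between the two extremes described above.
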
Reply to
nospam

A good idea, but what if you want coincident edges? Perhaps a small phase-nudge field that passes to the IO block, allowing sequential FIFO unloads but coincident IO edges?

-jg

Reply to
Jim Granville

The multi-bit data approach supports coincident edges. An event at time t presents a data value of 4'b0010 for a single pulse active. The event at time t+delta is 4'b0101, resulting in 2 pulses going active at the same time as the original pulse deasserts. No nudging required.
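
To make the example concrete, here is a small Python model in which each event stores the whole 4-bit bus value. The function names, the `initial` parameter and the sample values are illustrative assumptions.

```python
def bus_events(samples, initial=0):
    """Record (timestamp, bus_value) whenever any bit of the bus changes."""
    events = []
    prev = initial
    for t, value in enumerate(samples):
        if value != prev:
            events.append((t, value))   # one entry covers all changed bits
            prev = value
    return events

def replay(events, delay, length, initial=0):
    """Regenerate the bus, `delay` clocks late, from the event list."""
    out = []
    level = initial
    i = 0
    for t in range(length):
        if i < len(events) and events[i][0] + delay == t:
            level = events[i][1]        # every bit updates on the same clock
            i += 1
        out.append(level)
    return out

bus = [0b0000, 0b0010, 0b0010, 0b0101, 0b0101, 0b0000, 0b0000, 0b0000]
events = bus_events(bus)
delayed = replay(events, delay=2, length=len(bus))
```

The change from 0b0010 to 0b0101 is a single list entry, so the three edges it represents stay coincident on playback.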

Reply to
John_H

Yes, you are correct. I was thinking of more compressed port-pointer storage, but if you store one bit per port, then all ports can change at once. It becomes very like a dT logic analyser in playback mode.

-jg

Reply to
Jim Granville

Depends on the pulse in question.

I've implemented five different variants utilizing different techniques. The first is a "brute force" BRAM-based approach. The other variants use a priori knowledge of the pulse patterns to implement delays using counters, pulse lengths, etc.

It's a good problem. It is clear that low resource solutions are possible (and desirable) if pulses are relatively cyclic and this knowledge can be coded into the logic from the start. The case of a randomly changing pulse with a random number of transitions per unit of time is probably one that almost requires a BRAM buffer approach.

Thanks for your suggestions. I think I have a couple of low-resource solutions that work well now. Valuable BRAM resources have been preserved for the rest of the design.

-M

Reply to
m

Reply to
Peter Alfke
