Metastability pipeline causes bad juju

I need to send a couple of signals across a clock boundary (27 MHz to 36 MHz), so I put in a pipeline for each signal to take care of metastability (in Verilog):

input CLK36, RSTn;
input x1, x2, x3;
output x1_buf, x2_buf, x3_buf;

reg [3:0] pipe1, pipe2, pipe3;
reg x1_buf, x2_buf, x3_buf;

always @ (negedge RSTn or posedge CLK36)
    if (!RSTn) begin
        pipe1

[snip]
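The rest of the block is snipped above; a minimal sketch of how a shift-register synchronizer like this is typically completed (the reset values and the pipe1[3] output tap are assumptions, chosen to match the timing diagram later in the thread):

always @ (negedge RSTn or posedge CLK36)
    if (!RSTn) begin
        pipe1  <= 4'b0;  pipe2  <= 4'b0;  pipe3  <= 4'b0;
        x1_buf <= 1'b0;  x2_buf <= 1'b0;  x3_buf <= 1'b0;
    end else begin
        // shift each asynchronous input through four 36 MHz flops
        pipe1 <= {pipe1[2:0], x1};
        pipe2 <= {pipe2[2:0], x2};
        pipe3 <= {pipe3[2:0], x3};
        // the output taps the oldest stage, which has had time to settle
        x1_buf <= pipe1[3];
        x2_buf <= pipe2[3];
        x3_buf <= pipe3[3];
    end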

Reply to
Chris

> so I put in a pipeline for each signal to take care of metastability (in Verilog):
>
> [snip]
>
> [...] work -- these signals are enables to clock data from the processor into some registers, and the registers never get updated. Worse, the board sometimes draws about twice as much power as it's supposed to. If I get rid of this entire section of code and just use x1, x2, and x3 without anything to take care of the clock boundary, everything works fine, but you never know: the circuit might work only intermittently because of metastability.
>
> [snip]

It may look simple, but you may be expecting all 12 bits of data to show up at the same 36 MHz clock edge. They won't. One solution is to use one signal that goes through the metastability pipe to select *which* of two differently delayed versions of the data to use. A properly designed FIFO is the general solution to this problem.
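A related sketch, with hypothetical names (flag_27 and data_27 standing in for the 27 MHz-side flag and bus): rather than muxing two delayed copies, the minimal form sends only the one-bit flag through the pipe and captures the bus, which stopped moving cycles earlier, on the flag's settled rising edge:

module flag_capture (
    input             CLK36,
    input             flag_27,   // flag launched from the 27 MHz domain
    input      [11:0] data_27,   // bus that stopped moving cycles ago
    output reg [11:0] data_cap
);
    reg [2:0] flag_pipe = 3'b000;    // metastability pipe, 36 MHz domain
    always @(posedge CLK36) begin
        flag_pipe <= {flag_pipe[1:0], flag_27};
        // detect the rising edge on the two oldest, settled stages
        if (flag_pipe[1] & ~flag_pipe[2])
            data_cap <= data_27;
    end
endmodule

By the time the edge reaches the settled stages, the bus has been stable for several more 36 MHz cycles, so all 12 bits land together.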

Please check out Peter Alfke's TechXclusive article on crossing time domains for ideas:


Reply to
John_H

I see what you mean, but I don't think I explained my problem clearly enough.

My processor runs at 27 MHz and it needs to set a couple of registers in the FPGA that runs at 36 MHz. The processor sets the 8-bit data and 8-bit address and then a little later sets a flag high. By this time the data and address have been stable for numerous clock cycles (of both the 27 and the 36 MHz clocks), so there's no metastability problem on either of them. In order to clock them in, the processor sets another signal high, for instance x1 in the code above.

I'm not in a big hurry here, so I ran it through some registers to take care of the metastability; otherwise I might look into a dual-port FIFO or something. I should get something like this:

27 MHz  x1        000000011111111111000000000
36 MHz  pipe1[0]  00000000xx111111111xx000000
36 MHz  x1_buf    000000000000111111111111000

The data and address are stable long before x1 goes high and long after x1_buf should go low. I'm not concerned about speed; all I care about is that x1_buf has no metastability, which it shouldn't. I've done stuff like this before and it worked fine, but in this case it didn't work at all.
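For reference, a sketch of the write side this is meant to drive (addr, data, and the regs array are hypothetical names): the rising edge of x1_buf becomes a one-cycle strobe that loads the addressed register from the long-stable buses.

module reg_write (
    input        CLK36,
    input        x1_buf,      // synchronized enable from the pipe above
    input  [7:0] addr, data   // stable long before x1 ever rose
);
    reg       x1_buf_d = 1'b0;
    reg [7:0] regs [0:255];   // registers the processor is setting
    wire      wr_en = x1_buf & ~x1_buf_d;   // one-cycle rising-edge strobe
    always @(posedge CLK36) begin
        x1_buf_d <= x1_buf;
        if (wr_en)
            regs[addr] <= data;
    end
endmodule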

Reply to
Chris

> [...] I ran it through some registers to take care of the metastability, otherwise I might look into a dual-port FIFO or something.

> [...] I've done stuff like this before and it worked fine, but in this case it didn't work at all.

There is still a bit of info missing. Did you *simulate* this design? If it passes simulation and fails in the chip, then you have a real issue, or your inputs are not what you think they are. If both fail, then I guess you can find the problem more easily in the simulation.

But your metastability circuit should work fine. So I suggest you look at how the delays are affecting the rest of the circuit.

--
Rick "rickman" Collins

rick.collins@XYarius.com
Reply to
rickman

If you have your values latched and stable... If you take one synchronizing signal through a synchronizing pipe...

Check to make sure your synthesis didn't do you a "favor" by replicating one or more registers in your synchronizing pipe. I've heard of people getting multiple, sometimes conflicting copies of what should be a single resynchronized node.
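If that's what happened, one countermeasure is to pin the pipe registers down with synthesis attributes. The names below are Xilinx-style guesses (Synplify spells them syn_keep / syn_preserve), so check your own tool's documentation:

(* KEEP = "TRUE", ASYNC_REG = "TRUE" *)
reg [3:0] pipe1, pipe2, pipe3;

KEEP asks the tool not to absorb or replicate the flops, and ASYNC_REG marks them as receiving an asynchronous input so they aren't retimed.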

Reply to
John_H

No, I don't have a simulation tool -- startup company, shoestring budget, etc. I've always been a fan of the big simulator called real life anyway.

I'll look into it. I've had this sort of trouble with CPLDs before.

Reply to
Chris

> I've always been a fan of the big simulator called real life anyway.

I don't understand. You can get a *free* copy of the ModelSim simulator that will work with Xilinx tools. I thought that came with the Foundation tools, but maybe not. It does come with the free WebPack package. I really have no idea how you can expect to get an FPGA working in a reasonable time without a simulator. Even if you have no money, you can't afford to do without a simulator. :)

--
Rick "rickman" Collins

rick.collins@XYarius.com
Reply to
rickman

I wouldn't be without my simulator, but ChipScope offers a possible alternative which makes sense in certain circumstances. The simulation time is certainly reduced! At the expense of synthesis/compilation time! Cheers, Syms.

Reply to
Symon

OK, I'll play devil's advocate. Why is simulation so critical?

Why don't software people simulate their code? What's the difference between software and hardware that makes simulation so important/good for hardware?

What's the turnaround time to make a minor patch to an FPGA design, download the new bits, and do the testing in real time?

I'm not trying to say that simulation is a bad idea, just trying to understand what makes it so appropriate for the FPGA world where the NRE of a trial is time rather than the cost of a mask set.

--
The suespammers.org mail server is located in California.  So are all my
other mailboxes.  Please do not send unsolicited bulk e-mail or unsolicited
Reply to
Hal Murray

Ok, if you test a new design in a chip and it doesn't work, what do you do then? In a simulation you look at the internal signals to figure out what you did wrong. On the bench you would need to recompile your design to bring different signals out to a test point until you track down the issue.

My original point was not that it is essential to use a simulator for *every* issue of a design (although I do), but that the OP was not using one *at all*! Have you ever designed an FPGA without simulating??? If you have, I bet it was before I was working with FPGAs, most likely on the Xilinx 2k devices... ;)
--
Rick "rickman" Collins

rick.collins@XYarius.com
Reply to
rickman

It depends on how you define simulation, but I think software people actually do simulate their code. As far as I'm concerned, the debugger is analogous to a simulator in the software world, and just downloading is similar to running the program and looking at the output. When you run the program under the debugger, you get to see all the internal state in interesting ways (waveforms or register values etc.).

Also, simulation usually happens with RTL as opposed to gate-level; one wants to see the intended behaviour before the synthesizer messes with it. Of course there are times when you have to look at the gates, but similar to assembly-only debugging, it is much more difficult than source-level debugging. With RTL simulations, you see all the internal registers/wires as they are named by the designer, which may not be available after synthesis/P&R.
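For instance, a Verilog testbench can hand every one of those named registers and wires to a waveform viewer with two standard system tasks (tb here is a hypothetical testbench top):

initial begin
    $dumpfile("waves.vcd");   // VCD file any waveform viewer can read
    $dumpvars(0, tb);         // depth 0 = dump everything below tb
end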

The issue is observability and controllability. Of course, if you have the ability to dump all the internal nodes of the FPGA, look at them with a waveform viewer, suspend the execution, apply different values to pins, etc., you may not need simulation, but that requires a pretty darn good logic analyzer, internal dump support, and a good pattern generator, and it still leaves you without RTL-level access.

IMO, RTL simulation is still the best way to weed out the initial stage of bugs from your design.

Reply to
m

ModelSim's free edition isn't too great. I tried using it with my design early on and it ran into its max number of gates very quickly. The Xilinx Edition is a crippled version of the full version that still has a pretty limited FPGA size. Given $10k and the choice between a logic analyzer and ModelSim, I think it's a pretty obvious choice.

I'm not against simulating, of course, but in my experience it can take just as long to run a simulation as it does to compile and burn code into a chip. Perhaps I am a little biased from my days of working with ModelSim on SPARCstations, where it might take 20 minutes to run a simulation of medium complexity, and anything over a certain size you might as well run over your lunch break or even overnight. Back then I just got in the habit of testing/troubleshooting in hardware with a logic analyzer if the simulation was too complex. With little effort you can route all of the important nodes in the device out to your logic analyzer and change between different sets of test signals in real time by just putting in a little serial port control program or even by setting jumpers on the board.

As far as the impossibility of doing this with a "complex" design, I've got an XC2V1500 about 70% full and I've never simulated any of it. This includes a custom soft processor, Viterbi and Reed Solomon encoder, etc. It's not exactly rocket science but it's not 2k territory either.

Another thing -- as a system becomes more complex, simulations become less viable. Just ask any weatherman or climatologist. Okay, so an FPGA isn't a chaotic system but it still applies. At what point does it become such a pain to simulate all the various interfaces with sufficient fidelity that you're better off just trying it in the hardware? How often have you had a problem in the hardware and it turns out to be something you never even THOUGHT to simulate?

Reply to
Chris

> [...] Given $10k and the choice between a logic analyzer and ModelSim, I think it's a pretty obvious choice.

I guess I am not clear about this. The "max" lines-of-code limit has to do with simulation speed rather than stopping you outright. I am currently still using the free version of ModelSim even though I passed the size threshold long ago, and I really don't see a significant slowdown in my simulations. I guess if I were running very long simulations it might be an issue, but the software is still very usable.

I am not trying to argue the point with you, but how do you probe inside the FPGA with a logic analyzer? Do you recompile to bring out internal points to IO pins?

> [...] Back then I just got in the habit of testing/troubleshooting in hardware with a logic analyzer if the simulation was too complex. With little effort you can route all of the important nodes in the device out to your logic analyzer [...]

One of the things you can do in a simulation is to create a test bench that not only drives the inputs to your design, it verifies the outputs. This can relieve you of a lot of tedious manual checking of signals.
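A minimal self-checking bench sketch along those lines, with hypothetical names (sync_pipe standing in for the synchronizer module from the top of the thread): it wiggles x1 from a 27 MHz process and fails loudly if x1_buf never follows.

`timescale 1ns/1ps
module tb;
    reg  clk27 = 0, clk36 = 0, rstn = 0;
    reg  x1 = 0;
    wire x1_buf;

    always #18.5 clk27 = ~clk27;   // ~27 MHz
    always #13.9 clk36 = ~clk36;   // ~36 MHz

    sync_pipe dut (.CLK36(clk36), .RSTn(rstn),
                   .x1(x1), .x2(1'b0), .x3(1'b0), .x1_buf(x1_buf));

    initial begin
        #100 rstn = 1;
        @(posedge clk27) x1 <= 1;
        repeat (10) @(posedge clk36);
        // verify the output instead of eyeballing waveforms
        if (x1_buf !== 1'b1) $display("FAIL: x1_buf never asserted");
        else                 $display("PASS");
        $finish;
    end
endmodule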

If you are happy with your technique, fine. But you might want to revisit the simulation world and see what has changed in the last however many years.

> [...] I've got an XC2V1500 about 70% full and I've never simulated any of it.

> [...] How often have you had a problem in the hardware and it turns out to be something you never even THOUGHT to simulate?

I catch all sorts of bugs in simulation that I did not expect. I don't design my simulation purely to find bugs I think of. I design my simulations to verify that a circuit is working correctly, just like the tests I run on the hardware.

So how does *not* simulating help finding bugs that would be missed in simulation?

I am not trying to argue this point. If you are happy not simulating, fine. But you asked about a problem that in simulation would likely have been identified quickly or at least could be explored. I am just trying to point out that fact. Why isn't the logic analyzer helping you with this problem? Can't you bring these signals and data out to the IO pins? That should identify the problem.

--
Rick "rickman" Collins

rick.collins@XYarius.com
Reply to
rickman


Chris,

I won't say that simulation isn't a good thing, but I will say that Chipscope Pro(tm) sure beats using a logic analyzer.

Austin

Reply to
Austin Lesea

I'll have to look into the free ModelSim version again to see whether it would slow me down too much.

I have 16 pins dedicated to an unused expansion bus on the Xilinx that I currently have hooked up to a logic analyzer. I can select between 8 different 16-bit signals internal to the FPGA by hitting 1-8 on the serial port. Any time I add some code I hook up the relevant signals to the mux. I rarely have to recompile to get different signals out.
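A sketch of what that mux might look like, with hypothetical names (dbg_sel is the value set over the serial port, bus0..bus7 the internal 16-bit signals, dbg_pins the spare expansion-bus pins):

module dbg_mux (
    input             clk,
    input      [2:0]  dbg_sel,   // written by the serial-port control logic
    input      [15:0] bus0, bus1, bus2, bus3, bus4, bus5, bus6, bus7,
    output reg [15:0] dbg_pins   // drives the 16 spare expansion pins
);
    always @(posedge clk)
        case (dbg_sel)
            3'd0: dbg_pins <= bus0;
            3'd1: dbg_pins <= bus1;
            3'd2: dbg_pins <= bus2;
            3'd3: dbg_pins <= bus3;
            3'd4: dbg_pins <= bus4;
            3'd5: dbg_pins <= bus5;
            3'd6: dbg_pins <= bus6;
            3'd7: dbg_pins <= bus7;
        endcase
endmodule

Registering the mux output keeps the pins clean, and adding a new probe is just another case arm.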

Maybe I'll try this out; it looks like it might be worthwhile.

I'm not anti-simulation; I wish I had a good simulation tool, but I also wish I had Matlab and a few other software packages too. I don't, so I end up doing strange things like "simulating" an RS encoder using bitwise operations in Microsoft Excel.

Reply to
Chris

It sure is worthwhile. I have a system which needs approx. 100 hours to simulate 500 µs in ModelSim. Doing the same thing in ChipScope takes 500 µs and gives me the essence of the results. Yes, I'm currently using the free version of ModelSim XE, but hey!

I don't say that simulations are bad things either, but for some systems it is just not feasible on a greater scale. During those 100 hours I manage to do quite a few system implementation / ChipScope verification iterations.

Also, you can evaluate the full version of ChipScope for 60 days for free. Should be enough time to get familiar with it...

--
-----------------------------------------------
Johan Bernspång, snipped-for-privacy@xfoix.se
Research engineer, embedded systems

Totalförsvarets forskningsinstitut
Swedish Defence Research Agency

Please remove the x's in the email address if replying to me personally.
-----------------------------------------------

Reply to
Johan Bernspång

WOW! I have no idea what is going on in 100 hours. I am simulating about 800 LEs and 5 blocks of memory, plus a test bench, and it takes less than a minute per 5 µs of simulation.

--
Rick "rickman" Collins

rick.collins@XYarius.com
Reply to
rickman

Well, it's a quite large design sampling the input at 200 MHz (from an ADC): CIC filters, LP FIRs, CORDIC, BP FIRs, etc. I.e. lots of calculations for my poor ModelSim.

Earlier in the design cycle I'd been trying to avoid simulations due to the complexity of the system. But I gave it a try anyway since I was occupied with some other stuff for a few days...

--
-----------------------------------------------
Johan Bernspång, snipped-for-privacy@xfoix.se
Research engineer, embedded systems

Totalförsvarets forskningsinstitut
Swedish Defence Research Agency

Please remove the x's in the email address if replying to me personally.
-----------------------------------------------

Reply to
Johan Bernspång

Of course, you simulated each of these blocks separately to verify they worked? ;-) Cheers, Syms.

Reply to
Symon
