Xilinx ISE ver 8.2.02i is optimizing away and removing "redundant" logic - help!

Hello All,

I am writing a VHDL design for a Xilinx FPGA using ISE ver 8.2.02i (8.2 SP 2), and I'm trying to get post-map simulation working now that post-translate simulation works. I put "keep" attributes on every single signal, including those in the ports (editor keystroke macros help a lot). I also ran the following command in a command (DOS) window. It is copied from the DOS window, so just run the following three lines together, with spaces at the line breaks. Note the added -u to try to prevent logic removal, which is why I had to run this in the command window; it's not available as a setting in the map properties.

map.exe -ise ppcaesh.ise -intstyle ise -p xc4vfx12-ff668-10 -cm speed

-detail

-pr b -k 4 -c 100 -u -o user_logic_map.ncd user_logic.ngd user_logic.pcf

I have the map optimization now set for speed in my project, reflected in this command. I tried setting "no optimization" as well as all the other optimization choices. The above command made the post map files for me. Now, at least all the red is gone from Post-Map simulation and some of the bytes are right in my first section of output. I think this is due to all the "keep" attributes I added. My removed "redundant logic" list was made a little smaller. Here is the syntax for "keep":

signal mysignal : std_logic; -- declare a signal

attribute keep : string; -- you just need this once; then you can do the next line for each signal you want
attribute keep of mysignal : signal is "true";
-- this goes in the architecture section after the signal declarations and before the "begin"

(Thanks to the thread "Looking for ways to keep diagnostic signal from being optimized out (Xilinx)" here in comp.arch.fpga)

This syntax is found in the Constraints Guide, cgd.pdf.
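For context, here is a minimal sketch of where those lines sit in a complete design unit. The entity name keep_demo, the ports and the signal ab_xor are hypothetical and only for illustration; the attribute lines themselves are the ones shown above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; all names here are for illustration only.
entity keep_demo is
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity keep_demo;

architecture rtl of keep_demo is
  -- Intermediate net that synthesis might otherwise absorb into a single LUT.
  signal ab_xor : std_logic;

  -- Declare the attribute once per architecture...
  attribute keep : string;
  -- ...then apply it to each signal that should survive as a named net.
  attribute keep of ab_xor : signal is "true";
begin
  ab_xor <= a xor b;
  y      <= ab_xor xor c;
end architecture rtl;
```

The attribute only asks the tools to preserve the named net; it does not change what the logic computes.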

Here is a partial list of the removed logic: Section 5 - Removed Logic

------------------------- Optimized Block(s): TYPE BLOCK GND XST_GND VCC XST_VCC

Redundant Block(s): TYPE BLOCK LOCALBUF u0/my_sub_mod_128_0_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_10_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_12_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_11_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_13_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_14_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_15_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_16_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_17_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_18_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_19_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_1_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_20_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_21_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_22_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_23_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_24_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_25_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_26_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_27_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_28_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_29_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_2_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_30_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_31_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_3_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_4_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_5_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_6_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_7_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_8_xo1/LUT3_D_BUF LOCALBUF u0/my_sub_mod_128_9_xo1/LUT3_D_BUF LUT1 myrst_inv1 LUT1 dcnt_Msub__sub0000_xor11 INV Bus2IP_Clk_inv_INV_0 LOCALBUF Mxor__xor0019_Result1/LUT3_L_BUF LOCALBUF Mxor__xor0055_Result1/LUT3_L_BUF LOCALBUF Mxor__xor0127_Result1/LUT3_L_BUF LOCALBUF Mxor__xor0083_Result1/LUT3_L_BUF LOCALBUF user_logic_010_xo1/LUT4_L_BUF LOCALBUF user_logic_012_xo1/LUT4_L_BUF LOCALBUF user_logic_014_xo1/LUT4_L_BUF LOCALBUF user_logic_018_xo1/LUT4_L_BUF LOCALBUF user_logic_019_xo1/LUT4_L_BUF LOCALBUF user_logic_01_xo1/LUT4_L_BUF LOCALBUF user_logic_020_xo1/LUT4_L_BUF LOCALBUF 
user_logic_022_xo1/LUT4_L_BUF LOCALBUF user_logic_023_xo1/LUT4_L_BUF LOCALBUF user_logic_024_xo1/LUT4_L_BUF LOCALBUF user_logic_028_xo1/LUT4_L_BUF LOCALBUF user_logic_02_xo1/LUT4_L_BUF LOCALBUF user_logic_032_xo1/LUT4_L_BUF LOCALBUF user_logic_033_xo1/LUT4_L_BUF LOCALBUF user_logic_035_xo1/LUT4_L_BUF LOCALBUF user_logic_036_xo1/LUT4_L_BUF LOCALBUF user_logic_039_xo1/LUT4_L_BUF LOCALBUF user_logic_03_xo1/LUT4_L_BUF LOCALBUF user_logic_040_xo1/LUT4_L_BUF LOCALBUF user_logic_043_xo1/LUT4_L_BUF LOCALBUF user_logic_044_xo1/LUT4_L_BUF LOCALBUF user_logic_045_xo1/LUT4_L_BUF LOCALBUF user_logic_046_xo1/LUT4_L_BUF etc... (93 more lines)

As you can see from this list, taken from the file user_logic_map.mrp in the project directory, logic is still being removed. The optimized blocks are still removed if you set "no optimization", and the "redundant" blocks are still removed even with the -u "Do Not Remove Unused Logic" option. Could we have a "Do Not Remove Any Logic" option, and have the "no optimization" setting respected fully when it is set?

The key, I have learned, is to use the correct Xilinx VHDL style, which is different for FPGAs and ASICs. Once you follow that, you won't have any more problems. Can someone advise me on the correct syntax, based on this list of "optimized" and "redundant" logic? Meanwhile I am reading the VHDL style section of the xst.pdf manual to try some things. I already used the style that avoids latches, which really worked. I also already used the style that clocks ROMs so that they won't be optimized away as asynchronous RAMs, which did the trick in post-translate.

Best regards,

-James

Reply to
james7uw

I guess the most obvious question I would have for you first off is "Why are you bothering with this?" Let the tool do its job, which is to turn VHDL/Verilog design source files into the properly formatted bitstream needed to program the device.

Don't waste your time trying to prevent the tool from optimizing your code...trust me, even the best code written by an experienced person can be optimized to target a specific device.

I realize that doesn't address your question, just thought I'd save you what seems to me to be a fruitless exercise on your part.

KJ

Reply to
KJ

The reason I'm doing this is implied in my third paragraph: "Now, at least all the red is gone from Post-Map simulation and some of the bytes are right in my first section of output." In other words, I'm doing this because it wasn't working, according to the simulator, and this improved matters significantly but not completely. The "red" I referred to is used by my simulator, ModelSim XE III 6.1e Starter Edition, to indicate unknown output values, I think, since X's for unknowns appeared post-map. They no longer appeared after I did the above, and then some but not all of my output was actually correct.

In my first sentence I wrote "I'm trying to get post-map simulating correctly now that I have post-translate simulating correctly." In other words I'm trying to get the post-map design correct as shown by the simulator, because it isn't. I got the behavioural simulation showing correct logic and I got the post-translate simulation to show correct operation by clocking my ROMs so they wouldn't be optimized away.

Do you have any ideas to prevent "redundant logic" from being removed? I've been told the key is in using different VHDL coding style. I'm also going to look into putting "Save" on all my nets. This is a constraint, according to dev.pdf, the Development System Reference Guide, so I will look it up in the Constraints Guide, cgd.pdf. "Keep Hierarchy" might help.

I suspect this is the problem because post-translate I was having my ROMs inferred as RAMs and optimized away, and, as I wrote in my third paragraph, "My removed "redundant logic" list was made a little smaller," and operation was significantly improved, as indicated in that paragraph. I figure that restoring more removed logic might do the trick. It certainly looks like the thing to try.

Does anyone have some samples of VHDL code before and after that were interpreted as redundant and then weren't after being changed? Best regards,

-James

Reply to
james7uw

And again, I'll point out that going about the task of 'getting it working' by trying to disable synthesis optimizations is the wrong approach and will likely be wasted effort.

But that means nothing until you understand why the optimized result was 'incorrect'. Remember that optimization does not involve changing the overall function, just the implementation of that function. I have no doubt that if you were to somehow disable every possible optimization, it might be possible to have it emulate the code that you originally wrote...but I'll contend that it still won't work for you in a real device either.

While it's possible (but not terribly probable) that there is a bug in the synthesis tool, the more likely explanation is in the source code that you wrote. Synthesis to an actual FPGA/CPLD/ASIC produces an output model that is strictly std_logic/std_ulogic based...there are no 'enumerated types', 'integers', etc. Those output models also model the expected propagation delays that will exist in the actual device. That being the case, here are the things to look for and how to go about looking for them.

- Peruse the synthesis report for warnings. If it runs across code that is valid but is not well synthesized there will usually be a warning (the classic example being the latch, signal 'initialization' values being another one). Comb through those warnings and fix them.

- Peruse the timing report for timing conditions. Timing analysis produces five basic numbers: setup time (for each input relative to the clock that samples it), hold time (for each input relative to the clock that samples it), clock to output delay, propagation delay (for pure combinatorial input to output paths) and clock frequency. Now go back to the code for your testbench and make sure that you are
  - Not violating setup or hold time.
  - Not violating clock frequency.
  - Not blaming the post route model when you look at outputs going to 'X' at a time that is still within the clock to output delay or propagation delay.

- Peruse your source code for ANY usage of a data type other than std_logic or std_ulogic. Enumerated types and integers are not illegal and can easily be synthesized, but they are susceptible to misuse. The misuse comes about because in the simulation environment signals/variables of those types will get 'magically' initialized...there is typically no such magic in a real device. You can write code for a counter using type 'natural' that will simulate just fine, but when that 'natural' is translated into 'std_logic', as it must be to be synthesized, the output model will not 'work' and will sit there as an unknown value because the original code had nothing to reset it to a known value.
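To illustrate the last point, here is a minimal sketch of a counter that gives itself a known starting value instead of relying on simulator initialization. The entity name, the 8-bit width and the synchronous active-high reset are assumptions made for the example, not anything taken from the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical counter; names and width are for illustration only.
entity counter_demo is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;  -- synchronous, active high (assumption)
    count : out std_logic_vector(7 downto 0)
  );
end entity counter_demo;

architecture rtl of counter_demo is
  -- unsigned starts as all 'U' in simulation, so a missing reset shows up
  -- in the behavioural run as well as in the gate-level one.
  signal cnt : unsigned(7 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        cnt <= (others => '0');  -- explicit known starting value
      else
        cnt <= cnt + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(cnt);
end architecture rtl;
```

With a 'natural' and no reset, the behavioural simulation would quietly start counting from 0; with the reset in place, the behavioural and post-map models both start from the same known value.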

Those are the tools you need to debug your problem. Disabling optimizations is not in that list and will only lead you down a path that will result in your final design not working anyway.
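As a rough sketch of the testbench side of the timing check: one common way to stay clear of setup and hold violations is to change the inputs on the falling clock edge, so they are stable around the rising edge that samples them. The 20 ns period, the signal names and the omitted design under test are all assumptions; the real numbers have to come from your own timing report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench skeleton; names and timing values are assumptions.
entity tb_demo is
end entity tb_demo;

architecture sim of tb_demo is
  constant CLK_PERIOD : time := 20 ns;  -- assumed; must not exceed the reported clock frequency
  signal clk  : std_logic := '0';
  signal rst  : std_logic := '1';
  signal din  : std_logic := '0';
  signal done : boolean   := false;
begin
  -- Free-running clock that stops toggling once the stimulus is finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  -- The design under test would be instantiated here and connected to
  -- clk, rst and din.

  stimulus : process
  begin
    -- Hold reset for a few cycles, releasing it away from the rising edge.
    for i in 1 to 4 loop
      wait until falling_edge(clk);
    end loop;
    rst <= '0';

    -- Change inputs on the falling edge so they are stable at the rising edge.
    for i in 1 to 8 loop
      wait until falling_edge(clk);
      din <= not din;
    end loop;

    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

Outputs should likewise be checked only after the clock to output delay has elapsed, for example just before the next rising edge.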

KJ

Reply to
KJ

Hi,

  1. I agree fully with KJ who is an experienced author in this group.
  2. I never do any post-map simulation. I have finished 6-8 projects individually in Xilinx FPGAs and all of them went to market successfully.
  3. When simulating, just check that the logic design is correct; you don't have to check timing.
  4. Let the Xilinx tools determine whether the project meets timing: a. setup time; b. hold time; c. clock frequency. If Xilinx ISE reports no timing violation in these three categories, put the design in a chip, then test the board to see if there is any logic design error.

Never spend time doing post-map simulation; never spend time using DOS command lines; never spend time turning off Xilinx's optimization.

Weng

Reply to
Weng Tianxiang

Thanks for your replies, KJ, Weng. It is clear to me that disabling optimization is not the real way to fix my problem, but that using the coding style that Xilinx likes is the way. Once that is followed, no more problems will occur. I am just not looking forward to the painful process of trial and error that I have read will be required by first-time Xilinx users to get the right coding style. I have looked over the xst.pdf manual for coding style. Removing all the latches was very helpful when I did that prior to posting here, back at my translate stage.

I am using only "std_logic". I will check my synthesis report for warnings. I have no timing violations listed as of this stage: post-map. I am not at post-PAR yet. Why should I take the time to place and route when post-map simulation doesn't even work? I think doing that is for experienced users who don't have trouble with the earlier stages.

I am doing a lot of simultaneous "xor"s of different bit ranges of 128-bit "words" and using a function that uses a function (i.e., combinatorial logic) and I'm doing that simultaneously as input to signals that are then "xor"-ed. These are done after each clock cycle, when initial signals are updated. That is, when these initial signals are updated in a process at the rising edge of my clock, then I have additional signals that should just be updated because the data has changed. I'm not using any sensitivity list or any clock cycle for them. These assignments should cause signals to change, which cause the next set of signals to change, in about three steps, with ranges of bits being processed in parallel (and mixed, which is why I have to get into bit ranges). Finally, signals named "_next" are updated, then the next clock cycle is awaited at which time the original signals are updated from the "_next" signals. Based on my experience so far in which I got into trouble at the synthesize and translate stage due to not having a clock on my ROM, do you think putting clocks on everything would be the thing to try? This is the trial and error that I will have to go through.

Do you have any samples of this kind of VHDL code that Xilinx likes, that you could show me?

Best regards,

-James

Reply to
james7uw

In my experience with Xilinx and/or other FPGAs, the only kind of HDL that these tools "don't like" are non-synthesizable constructs. For e.g "a

Reply to
ankyag

Reply to
Weng Tianxiang

Sorry if my previous comment confused you. All I meant was it cannot be used as a concurrent assignment (in vhdl) or in the "assign" statement in verilog. It is okay to use it within a "process" or "always" block. In the latter case, a latch/flip-flop is inferred.
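As a generic illustration of the latch versus flip-flop distinction (this is not the exact statement under discussion, and all names are hypothetical): an incomplete assignment in an unclocked process is normally inferred as a latch, while the same assignment under a clock edge is inferred as a flip-flop.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example; names are for illustration only.
entity storage_demo is
  port (
    clk     : in  std_logic;
    en      : in  std_logic;
    d       : in  std_logic;
    q_latch : out std_logic;
    q_ff    : out std_logic
  );
end entity storage_demo;

architecture rtl of storage_demo is
begin
  -- No clock and no 'else' branch: q_latch must hold its value while
  -- en = '0', so synthesis normally infers a level-sensitive latch.
  latch_proc : process (en, d)
  begin
    if en = '1' then
      q_latch <= d;
    end if;
  end process latch_proc;

  -- Assignment only on the rising clock edge: an edge-triggered flip-flop.
  ff_proc : process (clk)
  begin
    if rising_edge(clk) then
      q_ff <= d;
    end if;
  end process ff_proc;
end architecture rtl;
```

The synthesis report usually flags the first process with a latch warning, which is one of the warnings worth combing through.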

Hope this clears the confusion, Ankur

Reply to
ankyag

Weng,

Can you clarify the 2nd one about "DOS command lines"? I'm using xilinx webpack tools under linux, operating from the command line. Actually I've built up a Makefile that invokes the commands. Is there some gotcha I need to know about? I prefer command line tools operated by "make" as opposed to IDE's.

Below's the important pieces of the Makefile. The commands I got from the pacman source build script, converted to unix make syntax. Works fine.

-Dave

XILINX=/Xilinx
NAME=main
SETUP=LD_LIBRARY_PATH=$(XILINX)/bin/lin XILINX=$(XILINX) \
	PATH=$(PATH):$(XILINX)/bin/lin

bitfile: step0 step1 step2 step3 step4 step5

step0:
	$(SETUP) xst -ifn $(NAME).scr -ofn $(NAME).srp
step1:
	$(SETUP) ngdbuild -nt on -uc $(NAME).ucf $(NAME).ngc $(NAME).ngd
step2:
	$(SETUP) map -pr b $(NAME).ngd -o $(NAME).ncd $(NAME).pcf
step3:
	$(SETUP) par -w -ol high $(NAME).ncd $(NAME).ncd $(NAME).pcf
step4:
	$(SETUP) trce -v 10 -o $(NAME).twr $(NAME).ncd $(NAME).pcf
step5:
	$(SETUP) bitgen $(NAME).ncd $(NAME).bit -w #-f $(NAME).ut
hwtest:
	sudo xc3sprog $(NAME).bit

----- main.scr contains this:

run
-ifn main.prj
-ifmt VHDL
-ofn main.ngc
-ofmt NGC
-p XC3S500E-FG320-4
-opt_mode Area
-opt_level 2

------ main.prj just lists the vhd source files.

--
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture
Reply to
David Ashley

It matters very much whether it is in a clocked process or not. If 'a

Reply to
KJ

I wouldn't necessarily recommend this. What you need to do is first get your experience up to a level where post-route simulation reveals no surprises. The way to get that experience is to do a few of these in the first place and then get a feel for where your code is not quite up to snuff.

As an example, it's possible to get all the way through the build process and have no errors or warnings and have it still not work simply because you used a 'natural' to build a counter instead of unsigned (see my previous post for more on when this can be a problem). In the hands of an experienced designer, use of 'natural' can be better than 'unsigned'; outside of those experienced hands is quite a different story.

There are also times (like ASIC designs) or contracting where post-route sim is required as a check off item that needs to be completed.

In any case, in the hands of an experienced designer doing an FPGA, post-route sim can typically be skipped as you suggest but as a general rule no it can not. In fact, in the case of the original poster of this thread, the post-route simulation is waving the big red flag indicating that there is something wrong either with his design or testbench.....that's a good thing, better to find out sooner rather than later....the problem is that instead of simply debugging to find the cause of the problem he seems to want to flip whatever build time switches are available to make the problem somehow disappear.

KJ

Reply to
KJ

Hi KJ, no, I disagree that it would generate a latch.

Actually it is combinational logic. Using one of the Xilinx tools, you can check that it generates only combinational logic. That is all.

No latch would be generated if 'a

Reply to
Weng Tianxiang

You're right, it wouldn't be a typical latch but "a

Reply to
KJ

Hi David, I never use DOS commands, since all options are accessible through the Xilinx ISE window system, so I don't know how to answer any questions about them.

Weng

Reply to
Weng Tianxiang


Aha! So your list was just describing your own way of development, it wasn't meant as advice as to how best to do development.

Thanks-- Dave

--
David Ashley                http://www.xdr.com/dash
Embedded linux, device drivers, system architecture
Reply to
David Ashley

Hi All,

What do people think of my idea from my post of Sun, Sep 10 2006 7:25 pm? I have a description of what I am doing, followed by a question:

I am trying this. Does anyone have some (simple to them) samples of VHDL code along these lines that succeed in a Xilinx FPGA?

Best regards,

-James

Reply to
james7uw

It sounds like relatively straightforward logic and registers. There are absolutely no issues with using any design language to implement this.

No clock is required unless the outputs need to be clocked for some other reason. A couple of reasons why you might want to clock the outputs would be...

- More consistent timing on when the outputs become available (i.e. the clock to output delay generally doesn't change much if the final outputs are clocked)

- No glitching on the outputs. The output of a flip flop will either change or remain the same for the entire clock cycle, whereas the output of combinatorial logic implemented inside an FPGA might glitch during the propagation delay while the new output value is being computed.

That doesn't imply that clocked outputs are 'better' or 'worse'; you just need to be aware of what will come out. As another very general statement, there is usually absolutely no need for internal signals to be clocked except to improve clock cycle performance. Since you've provided no information regarding what speed you need to run at, I'd say that there is no speed issue at present.
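A minimal sketch of the structure described earlier in the thread: registers updated on the rising clock edge, with the "_next" values computed by purely combinational concurrent assignments on bit ranges. The entity name, the ports and the particular xor mixing are hypothetical; only the overall shape is the point.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example; names and the xor mixing are for illustration only.
entity xor_stage_demo is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;  -- synchronous, active high (assumption)
    din  : in  std_logic_vector(127 downto 0);
    dout : out std_logic_vector(127 downto 0)
  );
end entity xor_stage_demo;

architecture rtl of xor_stage_demo is
  signal state      : std_logic_vector(127 downto 0);
  signal mixed      : std_logic_vector(127 downto 0);
  signal state_next : std_logic_vector(127 downto 0);
begin
  -- Combinational steps: concurrent assignments on bit ranges, no clock.
  mixed(127 downto 64) <= state(127 downto 64) xor state(63 downto 0);
  mixed(63 downto 0)   <= state(63 downto 0) xor din(63 downto 0);
  state_next           <= mixed xor din;

  -- The only clocked element: the state register, loaded from state_next.
  reg_proc : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state <= (others => '0');
      else
        state <= state_next;
      end if;
    end if;
  end process reg_proc;

  dout <= state;
end architecture rtl;
```

Only the state register is clocked here; the intermediate signals settle during the clock period, which is fine as long as the timing report shows the combinational path fits in one cycle.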

Not sure exactly what you were trying to describe but my interpretation is that is something of the form...

y
Reply to
KJ

Thanks for your reply. I was skeptical myself. I do have my equations and my VHDL code. Behavioural simulates correctly, post-synthesis and translate simulates correctly, then post-map fails simulation abysmally. About 140-150 lines reporting removed "redundant logic" were reported by the mapper, in addition to two lines indicating VCC and GND were "optimized" away (see copy of first lines of output in my first post). I suspect that it is the removed logic that is needed to make the simulation work, because when I use "keep" statements, the simulation improves greatly, but is still half wrong at an early stage. However, I don't want to use kludges, I want the tool to recognize it is all needed from the way I write the VHDL. Your sample VHDL is pretty much what I have; I just have a lot of simultaneous such lines at each timing period for different bit ranges going to different bit ranges. What could be causing everything to work fine at the first three stages and then have the post-map stage fail its simulation so badly? I certainly suspect all the removed "redundant logic" that the mapper is reporting. But how to indicate it is not really redundant, without using "keep" and "save" statements everywhere? I can't even complain that all my signals are connected (they are), because that's not the problem: the mapper is not removing "unused logic".

Best regards,

-James

Reply to
james7uw
