Hi all, I'm working on an ADS-XLX-V2PRO-DEVP20-6 (the Xilinx Virtex-II Pro development kit from Avnet, with an XC2VP20 onboard), and I was wondering whether it is possible to connect the DDR SODIMM module (the board comes with a Micron 128 MB module) to a Xilinx EDK 6.3 design using the plb_ddr or opb_ddr IP. The UCF that came with the board support package lacks any reference to a DDR clock feedback pin, so although I can connect every other element, without that pin I can't correctly clock the memory... Unfortunately, I couldn't find any design that uses DDR in EDK, so I don't know whether that pin exists (it isn't mentioned in the BSP software) or whether it really isn't possible to use the Xilinx cores with it.
A good starting point would be to contact Avnet and get the EDK Base System Builder (BSB) support files for this board. The website indicates this board is bundled with EDK, so they should have them available.
In fact, I already found the BSB files on their site, but they only offer SDRAM, SRAM, Flash, or CompactFlash options, no DDR. They are also very simple, since you can only use one of these components in the design (you can't select more than one, because the bus for them is shared and you need to manually extend the design in ISE to multiplex it... anyhow, this shouldn't be a problem, since the Linux designs in the Board Support Package use this technique). But there are no examples with the SODIMM DDR... It really seems that DDR is not supported with EDK (at least with ready-to-go cores)...
I'll try to contact Avnet, but in the meantime, if anyone has any info...
The plb_ddr and opb_ddr cores are for single DDR chips, not a DIMM, so you will need to modify them to handle multiple chips. This is actually not that difficult; I have done it myself and have been using the design for a while.
Unless later boards have corrected it, there is a design error on the Avnet boards. It turns out that when using DDR signals, the pins on the Virtex-II Pro are arranged in pairs that must share a common clock. On DDR DIMMs, the DQS signals are special and require a different clock from the other DIMM signals. However, Avnet shared these pairs between the DQS and DM signals. For example, they put DDR_DM0 on pin R22 and DDR_DQS0 on pin P22.
If you look at the Xilinx docs, these pins are named IO_L56P_7 and IO_L56N_7. Notice that they have the same L number, and differ only in the N/P designation. This indicates that these pins are a pair that must share the same DDR clock.
The solution that I used for this problem was to recognise that in my application, the mask (DM) bits would never change during a data transfer. So I let DM use the same clock as DQS, and set up the DM signals slightly early and hold them slightly longer than needed.
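As a rough illustration of that trick, here is a sketch under my own assumed signal names (burst_active, dm_value, etc. are illustrative, not signals from the actual plb_ddr core): DM is simply held stable across the whole burst plus a cycle of margin, so it never changes while DQS toggles.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dm_stretch is
  port (
    clk          : in  std_logic;                     -- clock shared with DQS
    burst_active : in  std_logic;                     -- high during the write burst
    dm_value     : in  std_logic_vector(7 downto 0);  -- desired mask bits
    ddr_dm       : out std_logic_vector(7 downto 0) );
end entity;

architecture rtl of dm_stretch is
  signal burst_active_d : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      burst_active_d <= burst_active;
      -- Hold DM one extra cycle after the burst ends; asserting
      -- burst_active a cycle before the data starts gives the early
      -- setup, so DM is stable whenever DQS actually toggles.
      if burst_active = '1' or burst_active_d = '1' then
        ddr_dm <= dm_value;
      else
        ddr_dm <= (others => '0');
      end if;
    end if;
  end process;
end architecture;
```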
You will need to simply route this feedback signal internally. That means the phase of the DCM might need to be adjusted in software. And the DIMM test bitfile that Avnet provides does indeed include the ability to determine and set the optimum DCM phase.
In practice, while I did implement the ability to alter the phase of the relevant DCM (the "ddr_clock" in the Xilinx ddr_clocks reference design), I no longer use that. I have the default startup "PHASE_SHIFT" set to "33" and never change it (the other DCMs have a "0" phase). I have used 128MB, 256MB, 512MB, and 1GB DIMMs on 2 different Avnet boards, all without adjusting the phase, and all operate perfectly.
They will need to share the same physical clock, so you need to modify it in the core. If you don't, you will get error messages, I think during place and route.
I did it in the top-level design, which in my case is outside of EDK (I am using the so-called "projnav" flow). What I did was to turn one of the clock outputs into a bidirectional pin. That is, for the DIMM there are two differential clocks, so I have:

    DDR_Clk_0   : inout std_logic;
    DDR_Clk_L_0 : out   std_logic;
    DDR_Clk_1   : out   std_logic;
    DDR_Clk_L_1 : out   std_logic;
Note that one of them is now an "inout". For that one, I instantiated a buffer:
    ddr_clk_io : IOBUF
      port map (
        I  => DDR_Clk_0_O,
        IO => DDR_Clk_0,
        O  => DDR_Clk_0_I,
        T  => DDR_Clk_0_T );
I tied the "T" pin to '0' and connected DDR_Clk_0_I to the feedback pin of the DCM. This means the feedback at least takes into account the buffer-to-pin and pin-to-buffer delays. The only thing missing is the board delay, which is relatively small and constant.
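Put together, the scheme might look like the following sketch (my own entity and signal names, not Duane's actual files; a real design would also need BUFGs on the clock paths, reset sequencing, and handling of the DCM's LOCKED output):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity ddr_clk_fb is
  port (
    sys_clk     : in    std_logic;   -- board oscillator input
    DDR_Clk_0   : inout std_logic;   -- DDR clock pin, doubling as feedback
    ddr_clk_int : out   std_logic ); -- DDR-domain clock for internal logic
end entity;

architecture rtl of ddr_clk_fb is
  signal ddr_clk_0_o, ddr_clk_0_i, clk0_unbuf : std_logic;
begin
  -- Drive the DDR clock pin and read it back, so the DCM feedback sees
  -- the buffer-to-pin and pin-to-buffer delays.
  ddr_clk_io : IOBUF
    port map ( I => ddr_clk_0_o, IO => DDR_Clk_0,
               O => ddr_clk_0_i, T => '0' );

  -- PHASE_SHIFT => 33 is the startup value that worked on these boards;
  -- it may need tuning for a different layout.
  ddr_dcm : DCM
    generic map ( CLKOUT_PHASE_SHIFT => "FIXED",
                  PHASE_SHIFT        => 33 )
    port map ( CLKIN => sys_clk,
               CLKFB => ddr_clk_0_i,
               CLK0  => clk0_unbuf,
               RST   => '0' );

  ddr_clk_0_o <= clk0_unbuf;
  ddr_clk_int <= clk0_unbuf;
end architecture;
```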
Yep, and if you use the clock routing scheme I show above, just try "33" and it will probably work fine.
Hi Duane, thank you very much again for all the info, you've been very kind, and now I understand how to proceed! I'll start preparing the design, and as soon as I'm back in the lab I'll try it on the board and let you know if it all works.
Ehh... what the heck. I can't post the files directly, since the originals are copyrighted by Xilinx. But if you know how to use diff files, then here it is:
You are free to use that in any way you want, without restriction. But of course, use at your own risk!
Note that this design has one extra feature: the DIMM controller not only has a PLB interface, it also has an external interface. The related signals are easy to spot, since almost all of them start with EXT_, EXT2IP_, or IP2EXT_. If you don't want to bother stripping these signals out, just tie the extra inputs to '0'.
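For someone who only wants the PLB path, the tie-off in the instantiation might look roughly like this fragment (the port names here are hypothetical stand-ins based on the EXT2IP_/IP2EXT_ prefixes, not the actual ports of the modified core):

```vhdl
-- Illustrative fragment only; substitute the real EXT2IP_*/IP2EXT_*
-- port names from the modified controller.
dimm_ctrl : entity work.dimm_controller
  port map (
    -- ... PLB and DDR pin connections as usual ...
    EXT2IP_WrReq  => '0',              -- unused external write request
    EXT2IP_WrData => (others => '0'),  -- unused external write data
    EXT2IP_RdReq  => '0',              -- unused external read request
    IP2EXT_RdData => open );           -- leave unused outputs open
```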
The purpose of these signals is to allow a high-bandwidth interface to the DIMM without bogging down the PLB bus. I use this to allow RocketIO access (using the Aurora core) to the DIMM. Writing via the EXT interface has had lots of testing, and while writing, the PLB bus can simultaneously read/write the DIMM. Using this interface, I can fill a 1GB DIMM from the RocketIO in about 5 seconds. Verifying the entire block of acquired data with the PPC takes considerably longer ;)
Reading via the EXT interface is a recent addition. It is somewhat more complicated and has not been thoroughly tested yet, but it seems to work. I have not verified simultaneous PPC access while reading via the EXT interface.
Oh, and while at it, I included a simple testbench in the same directory. The files bd_test.vhd and bd_test_siml.vhd are separate tests that do slightly different things. You compile one or the other into the testbench. The top level testbench file is bd_top.vhd.
The Avnet ADS-XLX-V2-DEV4000 has the same design error:
ERROR:Place:17 - The current designer locked placement of the IOBs ddr_dqs and ddr_dm makes this design unroutable due to a physical routing limitation. This device has a shared routing resource connecting the ICLK and OTCLK pins on pairs of IOBs. This restriction means that these pairs of pins must be driven by the same signal or one of the signals will be unroutable. Before continuing with this design please unlock or move one of these IOBS to a new location.
...when trying to use the plb_ddr core with the board. The board documentation says to use the old XAPP200 DDR core which does not use the DQS signals in the same way, so I guess that might work too for the V2Pro.
Thanks for the tip about the DM mask bits to get plb_ddr to work!
Hi Duane, me again. I started working on the diff file, but I think it lacks the dimm_controller.vhd implementation (which I think is your revision of ddr_controller.vhd, right?). Or am I doing something wrong? Anyhow, I'm checking whether I can modify it myself. Can you confirm this? Any chance you could check that file too?
By the way, I should mention one more subtle gotcha. The addresses to the DIMM need to be reversed, because this determines the DDR/DIMM commands.
    # These need to be reversed from the schematic labeling,
    # because Xilinx made all their VHDL models (0 to n)
    NET "DDR_Addr<0>"  LOC = "V25";
    NET "DDR_Addr<1>"  LOC = "U26";
    NET "DDR_Addr<2>"  LOC = "T28";
    NET "DDR_Addr<3>"  LOC = "T25";
    NET "DDR_Addr<4>"  LOC = "U27";
    NET "DDR_Addr<5>"  LOC = "T26";
    NET "DDR_Addr<6>"  LOC = "R27";
    NET "DDR_Addr<7>"  LOC = "R25";
    NET "DDR_Addr<8>"  LOC = "R28";
    NET "DDR_Addr<9>"  LOC = "P26";
    NET "DDR_Addr<10>" LOC = "V26";
    NET "DDR_Addr<11>" LOC = "M30";
    NET "DDR_Addr<12>" LOC = "P27";
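The underlying gotcha is just VHDL range direction: when a bus is declared (0 to n), index 0 sits at the opposite end from what the usual (n downto 0) labeling suggests. A tiny standalone illustration (my own example, not from the core):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity range_demo is end entity;

architecture sim of range_demo is
  -- Xilinx-style ascending range vs. the usual descending range:
  signal addr_up   : std_logic_vector(0 to 3)     := "1000";
  signal addr_down : std_logic_vector(3 downto 0) := "1000";
begin
  process
  begin
    -- The same literal lands on opposite ends of the two vectors:
    assert addr_up(0)   = '1';  -- leftmost bit is index 0 in (0 to 3)
    assert addr_down(0) = '0';  -- leftmost bit is index 3 in (3 downto 0)
    wait;
  end process;
end architecture;
```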
Modified the core, but it didn't work... Unfortunately, I discovered that I didn't have the Service Pack installed, so I had to modify the cores manually (two of them were older than the ones you used for the diff files...) and do some fine tuning... Tomorrow I'll ask to have SP2 installed on the lab machine and check what I can do with it installed.
OK, I got SP2 installed, patched the files, and tied the external ports to '0' to use only the PLB connection. I even imported the core into EDK, but it doesn't seem to work correctly. I'm wondering if it depends on how I clocked the DDR and the system, with the two classical DCMs both tied to 100 MHz, one for the bus and one for the DDR...
I modified the ddr_clocks reference design a little and added that diff to the same location as the other files. Notice that it is against an EDK 6.2 version of that file. Also, I found and fixed one bug in read_data_path.vhd, though it only affects the external interface.
The bd_top.vhd file shows one example of how to connect everything. You probably should run this simulation to make sure everything works, then modify it to zero out the external interface and try it again.
I also added an example system.mhs file to show how they are connected in a real system. And finally, an example system_top.vhd file, to show the top level structure of how they connect to the pins.
Hi, and thanks for the new files. I worked on it again yesterday; it still doesn't run properly, but at least now when I write a 16-bit value the system stalls (before, it couldn't write anything from 32-bit down to 8-bit: I always got back 0), so it's clear that something has changed (although I don't know if for the better :) ). I'll see if I can finally make it work!
Well, I'll admit to only using it with 32-bit values. There may very well be problems with other data widths, though I expect a fix would not be terribly difficult. That is definitely something that would be a lot easier to check in simulation than in hardware.