Can PPC in V2P reconfig the FPGA slices?


Jim Wang 21 years ago

If I have a PPC BRAM program running on a V-II Pro, can it reconfig the CLB slices? I'd like to be able to do this w/o using an external MCU. I know about PAVE SIF, but would like to avoid running VxWorks RTOS on the PPC.

I.e. if the PPC gets a new bitstream over RocketIO/Ethernet/serial, can it selectively reprogram part of the FPGA, using SelectMAP/JTAG/serial or whatever?

Thanks for any advice !

Jim

jimwang at cal dot berkeley dot edu


Symon 21 years ago

CLB

know


Sean Durkin 21 years ago

As Symon said, look for ICAP, the "Internal Configuration Access Port". xapp660-xapp662 and xapp290 should be interesting for you as well. EDK 6.X comes with an IP core connecting ICAP to the OPB, so theoretically you can reconfigure the FPGA with simple memory writes. But in real life, it's so complicated and there are so many restrictions and bugs in the tools that in most cases it's useless and not worth the trouble. It's a bit easier if you use a MicroBlaze instead of the PowerPC, since you can put the MicroBlaze pretty much anywhere you want, which makes the whole process a lot easier.

cu, Sean


Gerd 21 years ago

Uh?

It's not _that_ bad, is it?

As for the tools being buggy, if you're talking about the modular design flow, then probably everybody agrees, but otherwise you should be fine. At least the ICAP is easier to handle than the SelectMAP (see my other post :( ).

Regarding the placement of the MicroBlaze, it really takes up so much area that I wouldn't consider it _better_ than the PPC. In fact, with the PPC you only lose the couple of columns associated with that PPC (and you can area-constrain the ppc-icap-interface to those same columns), which should be less than the number of columns required for a MicroBlaze in the smaller devices (up to 2vp20 I would guess).

Anyway, if you have any specific problems with the ICAP, please ask :)

regards,

-g


Sean Durkin 21 years ago

Gerd wrote:

Yes, it is. :)

The modular design flow isn't the only problem. If you have a dozen or so signals crossing a reconfigurable module and have to start using bus macros (and not just one or two), you get a different "FATAL_ERROR" every day...

The problem is that when you use the PPC, you're pretty much fixed in where you can put what module. Plus there's the problem of board layout... in my case, I used the Virtex II-Pro Development Board from Memec, which has the SDRAM connected to the pins on "the left side" (if you look at the FPGA like Floorplanner and FPGA Editor do), whereas the PPC sits on the "right side". Now if you need to load a partial bitstream from SDRAM, you need to hook up the SDRAM on the left side to the PowerPC on the right side, so where do you put the reconfigurable module? Right, in between... which gives you the problem of crossing the reconfigurable module with 32 SDRAM data-signals, times 2 because it's bi-directional and the bus macros don't support bi-directional signals, plus the signals for the address bus, plus the byte-enables, and so on. In the end you have to cross the reconfigurable module with 80 or 90 signals, most of which are time-critical, using bus macros... and you can't use the ones from xapp290, since you want to *CROSS* a module, not connect two adjacent ones... which means you have to draw the macros yourself with FPGA Editor... which means you're going to lose what little hair you have left on your head because the damn thing keeps crashing every 5 minutes... And if you hit the "save" button once too often, it leaves you with an nmc-file you can't read, because FPGA Editor will serve you another "FATAL_ERROR"... But FPGA Editor forgets some important attributes when saving bus macros, so now you have to use XDL to convert the macro to XDL, add the attribute and convert it back to NMC... but then you can't open the macro in FPGA Editor anymore, because it will remove the attribute, even if you don't hit the "save"-button...
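To put a number on the "80 or 90 signals" above, here is a rough tally as a Python sketch. Only the 32 data bits, doubled because the bus macros are unidirectional, come from the description above. The address, bank, byte-enable and control widths are assumptions picked for illustration, not values from the Memec board.

```python
def crossing_signals(data_bits=32, address_bits=13, bank_bits=2,
                     byte_enables=4, control=5):
    """Count the nets that have to cross the reconfigurable module.

    Only data_bits=32 is taken from the post; every other default is a
    hypothetical width chosen for illustration.
    """
    # Bus macros carry one direction only, so each bidirectional data
    # line needs one net per direction.
    data = 2 * data_bits
    return data + address_bits + bank_bits + byte_enables + control


print(crossing_signals())  # total with the assumed widths
```

With these assumed widths the total lands in the range mentioned above; different SDRAM parts will shift it by a few signals either way.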

But after you finally have managed to get bus macros that support what you want to do with them, the *REAL* fun begins... Now you get more "FATAL_ERRORS", because the tools can't place the macros the way you want them to... and if they finally do, they start routing nets into the reconfigurable module area, ignoring the constraints you set... and after a while you find out that the only way to get that to work is to not let the macros end at the module border, but instead move a few slices into the module. So now you have to fire up FPGA Editor again to draw new bus macros, this time a little wider, and the fun begins again...

The easiest solution in my case was to just forget about the PPC and use the MicroBlaze instead, and put it right next to the SDRAM => no more module crossing, problem solved...

If you don't have any or very little inter-module communication, there shouldn't be any problems, but you can hardly avoid that unless you design your board to suit this kind of application in the first place.

Phew, that felt good... :)

No, ICAP is not the problem... except for the fact that still no one can tell me why you can't clock it with the same frequency as SelectMAP, even though it's supposed to be nothing other than an internal connection to the same SelectMAP interface...

But the *REAL* problem with all of this is that you can't do the *REALLY* fun stuff anyway, since a) all you can reconfigure are complete frames from top to bottom of the device and b) the architecture of the Virtex II-Pro bitstream has not been disclosed, so you don't know which bits to change to e.g. change filter coefficients or something like that... So what can you really do with it? Change RocketIO parameters (but only because Xilinx has disclosed the necessary info for this one particular case) and reconfigure entire modules, but that's only fun if you don't have any communication between modules. So what's left? Sure, you could spend a few weeks or months reverse engineering the bitstream format, and *THEN* it could be fun...

But I'd be interested to know what you guys (or anyone else) use partial reconfiguration for? From my point of view, it's a nice "gimmick", an interesting research subject, but there's no "real-life application" that justifies the extra effort...

cu, Sean


Gerd 21 years ago

Naa ;-)

Now I know where your problem lies.. however, that problem has been solved :)

First, don't use bus macros unless really necessary. For what you describe here, directed routing combined with a few placement constraints is probably an easier and more reliable solution than bus macros.

And I bet in many cases you don't even have to bother with directed routing.

[...]

It's the same for the ML300 and most other V2Pro boards I have seen, unfortunately.

Depends on the size of your module. You could put it on the right-hand side of the PPC - on a 2vp7, you can get about 1000..1200 slices there (with space left for cross-routing - read on).

If you put it in the middle, you obviously have pretty much as many slices as you want, within the limits of the chosen device of course. Now, there is a very simple trick for getting around having to build all those bus macros: divide the reconfigurable frames into parts for your module, and parts reserved for cross-routing. Now area-constrain the logic which requires cross-routing in such a way that the routes only go through the designated area (unfortunately you can't area-constrain the routing.. the whole constraints business has lots of room for improvement). So to stick to your example with the SDRAMs and assuming you are using a plb-sdram-controller, constrain the sdram-controller to the left, the plb-part of the controller to the lower left, the plb bus core to the lower right, and your cross-routing will go only through the lower part of your device. Your module obviously goes to the upper middle section.

Continue along this line of thought for a bit and you should come to a feasible solution for modules, including cross-routing (actually, from this point on there are at least two possible paths, one with static bitfiles and one with a more dynamic approach to reconfiguration; both have been implemented successfully).

Fair enough. But unless you specially design a board for reconfigurable applications, you'll most likely always end up having cross-routing in one way or another.

Well, who says you can't? Just watch the BUSY signal.

Well, yes and no. I have multiple modules on top of each other working, 'on top' like SLICE_XiY8:SLICE_XjY31 and SLICE_XiY40:SLICE_XjY71 for example. They can be reconfigured without interference - as long as you don't have SRL16s and LUT-RAMs in the columns, of course.

That's not exactly true, either. Of course one has to look a bit harder for this information, but it is actually contained to some degree in the EDK6.2 icap drivers, and there is quite some information strewn across various appnotes and answer records as well. No single source covers it all, though - agreed.
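A quick way to check that two area ranges are 'on top' of each other in this sense (same columns, disjoint rows) is sketched below in Python. The helper names and the concrete X values are made up for illustration; only the SLICE_XaYb:SLICE_XcYd range notation and the Y values come from the example above.

```python
import re


def parse_range(text):
    """Parse 'SLICE_X<a>Y<b>:SLICE_X<c>Y<d>' into (x0, y0, x1, y1)."""
    m = re.fullmatch(r"SLICE_X(\d+)Y(\d+):SLICE_X(\d+)Y(\d+)", text)
    if m is None:
        raise ValueError(f"not a slice range: {text!r}")
    x0, y0, x1, y1 = map(int, m.groups())
    # Normalise so that (x0, y0) is the lower-left corner.
    return min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)


def share_columns(a, b):
    """True if the two ranges overlap in X, i.e. sit in the same columns."""
    return a[0] <= b[2] and b[0] <= a[2]


def share_rows(a, b):
    """True if the two ranges overlap in Y."""
    return a[1] <= b[3] and b[1] <= a[3]


def stacked(a, b):
    """'On top' of each other: common columns, but no common rows."""
    return share_columns(a, b) and not share_rows(a, b)


# Hypothetical X values (20..35) substituted for Xi/Xj in the example.
lower = parse_range("SLICE_X20Y8:SLICE_X35Y31")
upper = parse_range("SLICE_X20Y40:SLICE_X35Y71")
print(stacked(lower, upper))
```

This only checks the geometry of the constraints. It says nothing about SRL16s or LUT-RAMs in the shared columns, which still have to be ruled out separately.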

It should be faster than that, with a bit of XDL hacking :)

Real life as in space applications - check out Carl Carmichael's papers about irradiation testing, there is a note about in-situ repair using partial reconfiguration (although for it to be really useful requires somewhat harsher environments than simple space).

Reconfigurable modules are a practical application - the hardest part IMHO is the communication between the reconfigurable module and the rest of the logic (say, the PPC), because that actually _does_ require bus macros.

Some people supposedly use genetic algorithms to program FPGAs, and using partial reconfiguration can significantly speed up this process (and the evaluation of the results).

There are also several useful applications which 'only' require the reconfiguration of routing, though I guess I shouldn't go into any details about these.

regards, Gerd

Credit where credit is due: the genetic thing is from a chat I had at the FPGA conference this year, although I don't remember the name of the person I talked with; also to Tobias Becker, who proved the 'static approach' mentioned above by just doing it :).

Oh, and if you happen to be at FPL in two weeks, maybe there will be a presentation of a simple case of the 'on top' modules. It could be arranged, anyway.


Gerd 21 years ago

Have to correct myself on this one. I'm not sure you'd "lose" these columns at all - I simply haven't tried reconfiguring these columns with the PPC active. Has anybody?

regards,

-g


Sean Durkin 21 years ago

Directed routing requires FPGA Editor, which I think has an Anti-Sean-mode built in, no matter if I try to draw bus macros or directed routing. :)

Right, but that wasn't enough in my case, so that wasn't an option...

How do you do that? Constrain the logic specifically to an area inside the module? And how do you prevent the tools from routing where they shouldn't?

Exactly. But as long as it's not dozens of signals, it's not really a problem.

Of course I can watch the BUSY signal, but still no one has been able to tell me *WHY* that should even be necessary if the clock I use is less than the maximum 50/66MHz I can use for SelectMAP without handshaking. It's not a problem per se, but "back in the days" it took me a while to find the little tiny footnote in one of the appnotes stating the fact...

... which again restricts you in what you can do with it... But how do you create a partial bitstream for a module spanning only half the height? In Virtex-I days you could just "cut frames in half" and put them back together pretty much any way you wanted to, so that wasn't a problem, but how do you do that in a Virtex II-Pro? I was thinking along the lines of building modules-inside-modules, and then keeping e.g. the lower "sub-module" static while changing the upper part. But when I tried it the tools didn't support that. Might be different now, I just don't have the time to try that any more.

Well, when I worked at this kind of thing, there was no EDK6.2... part of my job was to design sort of an OPB_hwicap, which became pretty useless just about when I was done, because EDK6 came out. That was some time well-spent. ;)

Didn't they stop developing XDL after ISE5? I tried it a couple of times with ISE6, and got a warning about it not supporting some of the device-specific properties, so that's where I decided to stop. I imagine you could do some really fun stuff with XDL, but only if you have the time to really get into it, which I don't...

SEU-correction was something I looked into when researching that, but at the time it all seemed more or less theoretical... there were a lot of papers where this and that was suggested and planned, but none of it (except some Xilinx Appnotes based on older architectures) was actually implemented and working, at least not on a Virtex II-Pro. Virtex I and II are different stories... And another problem is detecting the SEUs in the first place, since you can't just read back whatever you want to without the risk of locking up the PPC or corrupting your design. Ray Andraka wrote something about this awhile back, but I can't find the link right now...

In what way? The process of creating a new design should be what takes up most of the time... in all the papers I read on this subject, it either involved generating HDL-code and feeding that to the regular design tools, or using something like JBits. Either way, evolving a new generation takes quite a lot of time, and I can't imagine the few ms to load that into the FPGA to be of any significance... unless you have really huge FPGAs and a really fast way to create new designs (i.e. not using the tools from Xilinx, but manipulating the bits yourself, which still doesn't seem feasible to me with a Virtex II-Pro).
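For a feel of the "few ms", here is a back-of-the-envelope estimate in Python. It assumes the byte-wide ICAP data port of the Virtex-II Pro, one byte accepted per clock, and BUSY never asserted, so it is a best case. The 100,000-byte partial bitstream is a made-up size for illustration only.

```python
def icap_load_time_ms(bitstream_bytes, clock_hz, bytes_per_clock=1):
    """Best-case load time in milliseconds.

    Assumes the port accepts `bytes_per_clock` bytes on every clock and
    never stalls; real transfers through a bus and a driver are slower.
    """
    cycles = bitstream_bytes / bytes_per_clock
    return cycles / clock_hz * 1e3


# 100_000 bytes is a hypothetical partial bitstream size.
for mhz in (33, 50, 66):
    ms = icap_load_time_ms(100_000, mhz * 1e6)
    print(f"{mhz} MHz: {ms:.2f} ms")
```

Under these assumptions the load time stays in the low milliseconds at any of these clocks, so it only starts to matter if a new candidate can be generated about as quickly.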

What really would be interesting is to have the PPC do the genetic algorithm, and reconfigure the FPGA it sits on, giving you a black box that adapts to its environment, all on one chip. That could be fun, but doesn't seem possible at the moment...

That's one I looked into as well... interesting for all sorts of image or audio processing... just put the building blocks inside the FPGA and connect them whichever way you want to create whatever algorithm you need, without disrupting other channels. Same for network processing... Kind of like a higher-level FPGA, that offers you FFT and whatnot as a "primitive"... use the thing as a coprocessor that serves just your computational needs, whatever they might be... And the reconfiguring of routing should be extremely fast, compared to reconfiguring modules...

Even though it still interests me, I'm pretty much done with partial reconfiguration... The whole thing was my master's thesis, and that's done and over with. Now that I'm not in research anymore (or at least not primarily), I don't have the time or the application for partial reconfiguration anymore...

cu, Sean


Gerd 21 years ago

Creative use of area constraints in combination with appropriate levels of design hierarchy:

area_group module_all range=....;
inst module_0/* area_group=module_all;
area_group module_sub1 range=....;
inst module_0/sub1/* area_group=module_sub1;
inst module_0/sub1/ff_0 loc=...;

.. that's the general idea.

As for the tools and routing: you don't. Leave a buffer zone between your modules and any adjacent logic. If you can't for whatever reason, use a bit of fpga_editor tweaking to get all signals within the boundaries of your module, but you really want that buffer zone.

The footnote probably only says that the ICAP is 'specified' to 33 MHz, or something like this, right? Well, I say don't worry about it. Go with whatever you'd use with SelectMAP, and keep an eye on the busy line. OTOH, why would you want to exceed 33 MHz anyway? There's not much you gain from going beyond that.
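What "keep an eye on the busy line" means for writes can be shown with a toy model in Python. FakePort is a hypothetical stand-in, not the ICAP primitive or any driver API. The only behaviour it models is that a byte presented while BUSY is high is not accepted and has to be presented again.

```python
class FakePort:
    """Toy configuration port: ignores a write whenever BUSY is high."""

    def __init__(self, busy_pattern):
        self.busy_pattern = list(busy_pattern)  # BUSY level per clock cycle
        self.cycle = 0
        self.accepted = []  # bytes the port actually took

    def clock(self, byte):
        """Present `byte` for one clock and return the BUSY level seen."""
        if self.cycle < len(self.busy_pattern):
            busy = self.busy_pattern[self.cycle]
        else:
            busy = False  # idle once the scripted pattern runs out
        self.cycle += 1
        if not busy:
            self.accepted.append(byte)
        return busy


def write_with_busy(port, data):
    """Hold each byte on the port until a cycle where BUSY was low."""
    for byte in data:
        while port.clock(byte):
            pass  # BUSY was high: keep presenting the same byte
    return port.cycle


def write_blind(port, data):
    """One byte per clock, BUSY ignored (what goes wrong without handshaking)."""
    for byte in data:
        port.clock(byte)
    return port.cycle


pattern = [False, True, False, False]  # BUSY high on the second clock
good, bad = FakePort(pattern), FakePort(pattern)
write_with_busy(good, b"\x01\x02\x03")
write_blind(bad, b"\x01\x02\x03")
print(good.accepted, bad.accepted)
```

In the blind case the byte presented during the busy cycle is silently lost, which is the kind of failure that shows up later as a design that doesn't work after reconfiguration.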

Well, the issue with SRL16s and LUT-RAMs is a general problem with active reconfiguration and certainly beyond the scope of this thread.

1) You can still do it the same way, cutting portions of the frames and piecing them back together. Worked fine for me, at least, and dynamically as well (that is: no .bit files involved at all).

2) If you don't feel like cutting bits and bytes, go with XDL, or synthesize/map/par all possible combinations. Lots of work, but it has been done for small numbers of modules, by people without any knowledge of, or consideration for, bitstream details.
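The "cut and piece back together" idea in 1) boils down to a read-modify-write on each frame: keep what is currently in the device, and overwrite only the portion that belongs to your module. A minimal Python sketch follows, under the loud assumption that the module's part of a frame is one contiguous byte span. The offsets are hypothetical, and the real mapping from slice rows to frame bits is not given here.

```python
def splice_frame(current, new, start, stop):
    """Return `current` with bytes [start, stop) replaced from `new`.

    `current` is the frame as read back from the device, `new` is the
    same frame taken from the design containing the new module.
    `start` and `stop` are hypothetical byte offsets of the module's
    portion of the frame.
    """
    if len(current) != len(new):
        raise ValueError("frames must have equal length")
    if not 0 <= start <= stop <= len(current):
        raise ValueError("span lies outside the frame")
    return current[:start] + new[start:stop] + current[stop:]


# Toy 8-byte frames, purely to show the splice.
readback = b"AAAAAAAA"
module = b"BBBBBBBB"
print(splice_frame(readback, module, 2, 5))
```

The spliced frame is then written back as a whole, since the device only accepts complete frames.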

Except for XDL and the 'all combinations' approach, the tools won't help you with that, correct. Any piece of software (or hardware) that gives you access to raw configuration frames will help you (and there are several of those, some indeed included in EDK 6.2, others in JBits, still others... well, somewhere else).

Supposedly it's being abandoned soon, or so the rumors say. I know it's being successfully used on v2pro, regardless of the warnings.

About V2Pro, ask me again in 10 days :)

SRL16s and LUT-RAMs again, yes. Just replace them with normal FF-based equivalents, and you should be fine (not sure about the PPC, so far).

[...]

The guy I talked to created bitstreams directly without the tools. Which is where the speed of reconfiguration and recovery of results matter.

It is possible. Use a device >= 2vp7, with opb_hwicap, opb_ethernet and Linux running on it and have fun. The 7 is a bit crowded, but starting with a 2vp20 you should have plenty of free area.

regards,

-g


Sean Durkin 21 years ago

Ah, the "let's try and hope for the best"-approach. :)

Yes... but at frequencies higher than about 38MHz (that's what I tested in my particular case) it gives you invalid data, so 33MHz seems to be a reasonable, if somewhat conservative limit.

It's not about speed, it's about keeping it simple. If I have a bus clock of 50MHz, and ICAP supposedly works up to 66MHz like SelectMAP, why should I even bother with watching the BUSY signal? The thought never even crossed my mind until I ended up with strange data when reading back and with non-working designs after reconfiguration... ICAP not being able to handle that speed just was about the last place I looked when implementing this... and it didn't show up in xapp661 (or 662 or 660, can't remember which one) until AFTER I found out about it... then they released a new revision with that little footnote in it...

Sounds good... never dared to try that, though (because of all the warnings about the possibility of frying the thing when you download invalid bitstreams), and didn't have the time to.

Good to know, should I ever decide to dive into the subject again. :)

BRAMs are a problem, too, especially when they contain the program you run on your PPC. When you read back BRAM contents, the instruction fetches can get corrupted, causing your program to hang or behave abnormally, like exiting loops after a more or less random number of cycles and such. That gave me some headaches as well...

Maybe, if you have the corresponding hardware. :)

Anyway, good to get some feedback on this, some new ideas... now if I only had the time... :)

cu, Sean


Gerd 21 years ago

Hmm, you don't say at which clock edge you are sampling your data, and even with the SelectMAP you have to watch BUSY when doing any kind of readback (I assume that's what you mean by 'invalid data'), so I guess it would be normal to have problems when not doing so.

See the SelectMAP documentation about all of the meanings of BUSY during readback. I guess that would be the prime reason to bother with it, regardless of the clock. Btw. I've been using 100 MHz almost all of the time, as long as no external cables were involved, and it does work fine, at least on the 2vp7.

Are you sure it's instruction fetches that get corrupted? Because about the SRLs/LUTs the standing explanation is that readback and write access collide, not readback and read access - so I wonder if it really was the readback + read-instr.fetch combination that caused the problems, or whether it may have been write accesses (reentry points for function calls, loop counters or whatever) getting corrupted.

regards,

-g


Sean Durkin 21 years ago

It might be writes as well, good point. The thing just was that everything worked fine when debugging, but didn't when you just ran the program regularly. And as soon as I left out the BRAM portions during readback, everything was OK, so I assume it must have something to do with the readback interfering with the program execution in some way...

cu, Sean


Arun 21 years ago

Gerd wrote in message news:...

Hi there, I am working on a project in which I need to use modular design to achieve partial reconfiguration using MicroBlaze and its peripherals (as one module). I am using opb_hwicap. The way I am doing it is: making a system design in XPS, taking the design as a submodule and exporting it to ISE. In ISE I am deleting the system_stub.vhd and system.bmm files and generating an ngc for system.vhd without I/O buffers (the Xilinx option for modular design). I made a top-level file in which the system and another module are used as black boxes. The ngc of the top level is generated with I/O buffers using ISE. Then, as per the modular design rules, I copied the top-level ngc to the Initial folder and generated an ngd from it. Coming to the active module phase of the system, when I am using this

ngdbuild -p xc2v1000fg456-4 -uc top_level.ucf -modular module -active system.ngc ../../Top/Initial/top_level.ngd system.ngd

I am getting error....

ERROR:NgdBuild:559 - Cannot find active block 'system.ngc' in the design.

Annotating constraints to design from file "top_level.ucf" ...

Checking timing specifications ...
Checking expanded design ...
WARNING:NgdBuild:885 - logical block 'testled' with type 'led_submodule' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/opb_hwicap_0' with type 'opb_hwicap_0_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/rs232_p160' with type 'rs232_p160_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/mb_opb' with type 'mb_opb_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/dlmb_cntlr' with type 'dlmb_cntlr_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/dlmb' with type 'dlmb_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/ilmb' with type 'ilmb_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/lmb_bram' with type 'lmb_bram_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/microblaze_0' with type 'microblaze_0_wrapper' is unexpanded and will be presumed to be a module.
WARNING:NgdBuild:885 - logical block 'top_micro/ilmb_cntlr' with type 'ilmb_cntlr_wrapper' is unexpanded and will be presumed to be a module.

I don't understand why those wrappers are still unexpanded. Can anyone please tell me how to get rid of this error? Any help would be greatly appreciated.

regards arun


Gerd 21 years ago

'guess somebody will have to just try that one - should be simple enough. Maybe I get a chance next week.

-g
