I have read several posts here about the difficulty of getting Spartan 3 parts in small quantities.
Is it realistic to start a project using Spartan 3 (actually the XC3S50 in the VQ100 package is what I probably want to use) when I only need small quantities: about 10 samples to start, and in full production, let's say, 100 to 500 pieces per year?
I would suggest that Xilinx's past history should be taken into account here: you won't get any for a year after the announcement! There aren't any unless you have a spare million, so don't bother asking :-)
Where would you buy them? What do they say? Can you get samples now?
Are there any features on the Spartan 3 that you absolutely need? (Can you use some other chip?)
What are the costs of alternatives? What are the costs of not being able to get the chips when you need them?
How long is it going to take you to do the design? (When do you absolutely need the samples?) Can you work on the design with two plans in mind and make the choice a month or two from now?
My rule of thumb is to not design in a chip unless I have parts in hand or a distributor has stock that I'm sure I can get.
If an interesting chip has some features that would make a project a lot better (or even possible), then you have to decide if you want to stick your neck out. Do you like fighting with not-quite-debugged tools? Do you have good contacts at the vendor?
--
The suespammers.org mail server is located in California. So are all my
other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's. I hate spam.
Spartan 3s are hard to get, but they are available. I'm using XC3S400s in a new design and we were able to get sample quantities. If you are just starting the project, put your sample order in now; the lead times are long. My client waited until a week before the boards showed up, and they ended up having to buy the parts from a broker on the other side of the world. I wouldn't worry about production quantities; Xilinx claims their yields are good. The problem is that demand unexpectedly spiked, so there is a shortage this quarter.
I don't know that the Spartan 3 parts are a major step forward in FPGAs. From what I can see, the main difference is the elimination of the huge startup currents on power up. The marketing claim is that these will be much cheaper parts because of the small die. But so far, I don't think anyone has seen the results of this.
If you design in a Spartan 3 based on quoted pricing today, you are not likely to see that price drop at any time through the life cycle of the part.
--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.
Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
I don't know about that. Xilinx initially set expectations low except on price. I heard MicroBlaze only ran at 85MHz on it, compared to 120MHz or more on bigger Virtex.
But on a cpu project I am working on, I am seeing synth reports of 311MHz on sp3-5 with the latest speed file vs. 320MHz for v2pro-8, and the -7s seem to be the same speed as sp3-5, IIRC. The sp2-4s are way down at 120MHz. Seems as if it's v2 made dirt cheap (if and when we get them) with a small cut in speed. Also the LUT counts are similar to sp2 but the blockrams are 4x bigger.
I can still port back to sp2(e) with almost the same floor plan but with much smaller ram instances, although lots of 4Ks could still be more useful than an equivalent number of 16/18Ks, but the speed cut would hurt.
As an old-time VLSI guy, I couldn't imagine getting such performance on an ASIC flow without 100x the design resources.
I'd check your report files closely if I were you. If you are seeing 311MHz on a Spartan 3, something is very wrong. I suspect that your synthesizer discarded most of your design. My experience with Spartan XC3S400-4s is that they are much slower than Virtex2Ps (-5 is the V2P that I'm comparing it to). I'm able to get the Spartan 3s to meet 140MHz timing, but that is with very few logic levels between pipeline stages. I'm sure that with lots of floorplanning it would be possible to push it higher than that, but certainly not to 300MHz, especially not on something as complex as a CPU.
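For anyone wanting to sanity-check figures like these, here is a rough sketch of the clock-period arithmetic. The overhead and per-level delays are made-up placeholders, not Spartan 3 datasheet values; substitute the numbers from your own timing report.

```python
def period_ns(f_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / f_mhz


def max_logic_levels(f_mhz, overhead_ns, per_level_ns):
    """Whole logic levels that fit between two registers.

    overhead_ns  : clock-to-out plus setup time (hypothetical value)
    per_level_ns : one LUT plus its routing (hypothetical value)
    """
    budget = period_ns(f_mhz) - overhead_ns
    return max(0, int(budget // per_level_ns))


if __name__ == "__main__":
    # Frequencies quoted in this thread; the delays are assumptions.
    for f in (85, 120, 140, 311):
        levels = max_logic_levels(f, overhead_ns=1.5, per_level_ns=1.0)
        print(f"{f} MHz -> {period_ns(f):.2f} ns, {levels} levels at assumed delays")
```

The point is only that the budget per stage shrinks quickly: at 311MHz the whole period is a little over 3 ns, so a report claiming that speed implies almost no logic between registers.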
I don't know why you would not expect the XC3S parts to be faster than the XC2S parts. Certainly going with a 2x reduction in feature size (or close to it) *should* give you a huge increase in speed. In fact, they should outrun everything Xilinx makes given the feature size. But they cut a lot of corners to make the parts cheap so they don't follow the curve. So far, I have not seen the prices beat the older Spartan parts either. Sure, they are an improvement, but in this industry, improvement is normal and part of the game. But the XC3S parts seem to be just the next new chip, not anything really special.
If the XC3S parts were both faster than the Virtex line and cheaper than the older Spartan line, *that* would be something to crow about. But they are *neither* at the moment. They are just the standard improved line that combines both (more or less).
--
Rick "rickman" Collins
I know what you are saying. When I first presented my paper cpu architecture to XST, the situation looked hopeless. I backed off and built a number of test projects that only included 1 object that was pushed to the max, bringing all IOs to the pads. The synth reports are then crystal clear, even for someone with little experience of the tool before. I also look at the layout and placement to see if it looks kosher. It did. From that I had a feel for what each Xilinx device
Hopefully 4th time lucky, my girls are helping me way too much. With Google I don't know what happened for several hours; I am sure a couple of half posts are in front. Apologies. Long reply warning.
I know what you are saying. My 1st paper cpu arch when presented to XST gives me little clue where to start. I always used to work on ASICs in teams where I write Verilog & C models and someone else (far less speed/area motivated) bangs the FPGA tool. With Virtex800 exp only at
Sure, but the more pipeline stages you add, the longer the latency is for each instruction. How many cycles of latency will there be for a single add instruction? Do you intend to make sure that the number of threads is equal to this latency, so that the latency as perceived by the thread executing the instruction is 0?
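To make the thread-count question concrete, here is a small sketch of round-robin issue. It assumes the worst case where every instruction depends on the same thread's previous result, no forwarding, and a fixed issue slot per thread; none of that is confirmed for the design being discussed.

```python
def simulate(pipeline_depth, num_threads, instrs_per_thread):
    """Return the number of issue cycles needed to issue every instruction.

    Each clock belongs to one thread in strict rotation. A thread may
    issue in its slot only if its previous result is ready, which is
    assumed to take pipeline_depth cycles after issue.
    """
    ready_at = [0] * num_threads               # cycle each thread's last result lands
    remaining = [instrs_per_thread] * num_threads
    cycle = 0
    while any(remaining):
        t = cycle % num_threads                # fixed round-robin issue slot
        if remaining[t] and cycle >= ready_at[t]:
            ready_at[t] = cycle + pipeline_depth
            remaining[t] -= 1
        cycle += 1                             # an unused slot is a bubble
    return cycle


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        print(threads, "threads:", simulate(4, threads, 2), "cycles")
```

With a depth of 4, four threads issue 8 instructions in 8 cycles with no bubbles, while two threads need 6 cycles for 4 instructions. Under these assumptions the pipeline stays full only when the thread count is at least the latency.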
What's your cache / memory architecture? Handling lots of threads could be tricky.
I am sure anyone would love to get a cpu at 300MHz in FPGA but the arch will be on my terms. The code base is remarkably small v previous projects I have worked on, the Verilog is
I just posted a very long reply but the server just xxxxed it so I will write it again later offline.
Quick answer: yes, HT must match, 4 or 8 etc. Cache architecture is currently 1-way set associative (i.e. direct mapped), but more Blockrams would allow more ways. It's a question of whether the FPGA should hold lots of lite cpus or 1 monster cpu, or maybe combinations of both!
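For readers less familiar with the term, a 1-way set associative cache maps each address to exactly one line. A minimal sketch follows; the line count and line size are illustrative assumptions, not values from this design.

```python
class DirectMappedCache:
    """Tag store for a 1-way set associative (direct-mapped) cache."""

    def __init__(self, num_lines=256, line_bytes=16):
        # Both sizes are hypothetical; they are not taken from the thread.
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines   # None marks an invalid line

    def split(self, addr):
        """Return (tag, index) for a byte address."""
        line_addr = addr // self.line_bytes
        return line_addr // self.num_lines, line_addr % self.num_lines

    def access(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        tag, index = self.split(addr)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag           # only one candidate line, so evict it
        return False
```

Two addresses with the same index but different tags evict each other on every access, which is the conflict that extra ways (and so extra Blockrams) would relieve.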
In my experience, the stumbling block for custom CPUs is not so much the hardware as it is the compiler for it. I did a small microcontroller for a XC4036E design several years back that ran at 66 MHz. It was a pretty simple machine that was sort of a cross between a PIC and an RCA1802, in that it used a 16 deep register file like the 1802, and it was a Harvard architecture like the PIC. Like the 1802, the operands for the ALU were fetched from the register file and results returned to the register file. The beauty of it was that for control applications, you often did not even need any memory beyond the register file. The processor size was about 80 CLBs (translates to 80 slices in current architectures). I'm not a compiler person, so the big difficulty I had with it was the compiler.
I suspect that the difficulty for just about any home grown processor is going to be the tools to compile the code for it, although folks who are more savvy than I on the software side might argue that the high speed hardware design is the hard part.
john jaks> > > I am even tempted to max the datapath to 64b as it only
--
--Ray Andraka, P.E. President, the Andraka Consulting Group, Inc.
This is right, and John admits this in another reply. You should also add DEBUG support, as that's more important as the CPU targets bigger applications. Once you have a compiler, users will want to do more and more, and then debug becomes very important.
It depends a lot on the target use. Something that runs from a Block RAM inside the FPGA can be very small/very fast, but is probably best coded in some form of Assembler. Best example of 'Advanced Assembler Art' is Randy Hyde's HLA (High Level Assembler), but that currently targets only x86, tho I'm sure that's not hard to fix :) This HLA allows IF..THEN..ELSIF etc, and handles the labels needed, as well as giving local scope (so is a big step-up from vanilla ASM).
Half agreed, as Jan has shown any std risc cpu project can grab lcc to do the task quite quickly by messing with the emit tables. If this were just another std risc project I'd probably do same, but then it wouldn't be anywhere near 300MHz either, more like MicroBlaze.
Only hyperthreading allows max speed, but if the processes don't communicate with each other then lcc could still be used as is and ignore the HT stuff.
Some of my background is in compilers and other tools, but I never worked for anybody doing that. The lcc compiler (Hanson & Fraser) is possibly the best documented C compiler writing text book around and highly recommended, as it explains thoroughly just how horrible C really is where most C books gloss over its complexity. The complexity for me comes because I am combining essentially 3 languages together and putting in a mini OS runtime. The Transputer did it before but chose an unfriendly syntax and supported C only as an afterthought.
I will probably get through it ok but I would love to pass that part on but then that person would be knee deep in it instead.
The HW part is more fun though. The 1802 takes me back, not bad in a twisted sort of way, it certainly used very little logic, I had it under a scope at Inmos.
How much code are you writing? Would you be willing/happy to do it in assembler?
Assemblers can be pretty simple, especially if the target is raw binary loaded and running at 0 rather than something needing linkers and libraries. It also helps if the target is RISC and doesn't have messy addressing modes.
How much would a reasonably clean sample assembler help? There should be a good example from the academic world. Just type in the new opcode table.
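To give a feel for how little is involved, here is a minimal two-pass, table-driven assembler sketch. The 16-bit instruction format, the mnemonics and the opcode values are all invented for illustration; the idea is that only the table changes for a new CPU.

```python
# Hypothetical 16-bit ISA: 4-bit opcode, then either three 4-bit register
# fields ("rrr") or one 12-bit address ("a"). Swap in your own table.
OPCODES = {
    "ADD": (0x1, "rrr"),
    "SUB": (0x2, "rrr"),
    "JMP": (0x3, "a"),
    "HALT": (0xF, ""),
}


def assemble(source):
    """Return a list of 16-bit words for raw binary loaded at address 0."""
    labels, lines = {}, []
    # Pass 1: strip comments and record the address of each label.
    for raw in source.splitlines():
        text = raw.split(";")[0].strip()
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = len(lines)
        else:
            lines.append(text)
    # Pass 2: encode each instruction from the opcode table.
    words = []
    for text in lines:
        mnemonic, _, rest = text.partition(" ")
        opcode, fmt = OPCODES[mnemonic.upper()]
        args = [a.strip() for a in rest.split(",") if a.strip()]
        if len(args) != len(fmt):
            raise ValueError(f"bad operand count: {text}")
        word = opcode << 12
        if fmt == "rrr":
            for shift, reg in zip((8, 4, 0), args):
                word |= (int(reg.lstrip("rR")) & 0xF) << shift
        elif fmt == "a":
            target = args[0]
            addr = labels[target] if target in labels else int(target, 0)
            word |= addr & 0xFFF
        words.append(word)
    return words
```

A label must sit on its own line in this sketch, and there is no linker, no macros and no expression evaluation, which is about what a program running out of a single Block RAM needs.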