I am thinking of buying a new PC (a laptop, running Ubuntu).
One of the things I want to do more of in the future is FPGA design. I
currently do very simple projects, and I notice that the 4 GB of RAM
in the laptop I currently use is already an issue.
I also want to try out soft-cores in the future.
What would one see as the requirements for a hobbyist's FPGA design PC?
I guess memory is the main issue. Or not?
You can get some very powerful machines for not too much money. I would
get a decent i7 machine with lots of memory; that way it will be useful
for a long time. If you are short on money, there are some choices
available for a lot less, but get the fastest, highest-memory machine you
can afford and you won't regret it.
Although a lesser machine might do, in the long run one ends up regretting
the choice. As an example, I was recently working on the ep32, a
zero-address CPU, and found that with 64-bit Windows 10 some of the older
software used to generate programs for it would not work right. What
ended up working was running 32-bit Windows 7 in a VMware virtual
machine; then everything worked flawlessly. Had my machine not been
capable of running VMware well, the issue would never have been resolved.
My FPGA machine uses an i7 at 3.4 GHz, 24 GB of RAM, and a 512 GB SSD
for the OS and virtual partitions; it can handle anything I throw at it.
If you can get a stationary machine, do so - it will be much more
cost-effective when you need a powerful system.
Modern FPGA tools should be quite good at using multiple cores. And
they are /very/ good at using memory - get as much as you can afford.
Portables are often very limited in their memory - these days I see a
lot of laptops that come with 4, 6 or 8 GB and don't let you upgrade at
all. And laptops all have tiny little screens - even the biggest ones
are tiny. A nice big monitor at 2560x1440 is pretty cheap these days,
and a lot better to work with than a laptop screen.
He is running Ubuntu, so will have different issues (some things will be
easier, some things harder). But having plenty of memory for running
virtual machines is definitely a good idea (though I'd recommend
VirtualBox over VMware).
Don't bother with the SSD unless you have more money left over, or need
to boot the machine regularly. He is running Linux - with enough memory
(in comparison to file sizes), disk speed is of little relevance as the
OS uses memory for cache. If you are getting an SSD, though, make sure
it is at least 200 GB - below that size, most of them are quite poor.
That is the beauty of having options. I used to use VirtualBox myself,
but after a bout where VirtualBox removed my Windows 10 licenses out
of the blue, I switched to VMware.
I installed Windows 7 and used up a license, then upgraded it to Windows
10, and all was fine for a while. Then, on one of the upgrades to
VirtualBox, it removed the Windows 10 licenses, and several partitions
with Windows Server that I used for school also lost their licenses.
That was it, and away it went.
An SSD makes a huge improvement in OS booting, application startup, and
virtual partition speed, and they are cheap now - you can get a 256 GB
SSD for $50 on sale.
I'm pondering switching to Linux myself; I have 2 PCs
running Mint Linux and have not had any problems with them in years.
The same cannot be said of Windows.
I have little experience with Windows 10, and none of it using virtual
machines - so I can't help you here.
But you are right about the beauty of choice.
It makes a difference to booting - but I boot my machines perhaps once
every two months, and if it takes 2 minutes instead of 1 minute, I don't
mind. And on Linux, which is hugely more efficient than Windows in the
way it caches files, it might make a slight difference to the first time
you start an application - but not after that.
If you have a very large working set (i.e., a lot of big files that you
are reading and writing at once), so that the data can't be cached in
your ram, then an SSD will often be faster. But even then, a couple of
decent HDs in RAID (Linux supports top-class software RAID) will often
give you similar speed.
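As a minimal sketch of such a Linux software RAID setup with mdadm (the device names /dev/sdb and /dev/sdc are placeholders for your own spare disks, and this must run as root - treat the whole thing as an outline to adapt, not a recipe):

```shell
# Sketch: create a two-disk RAID0 (striped) array with Linux md.
# /dev/sdb and /dev/sdc are hypothetical spare disks - adjust to your system.
# RAID0 roughly doubles sequential throughput; use --level=1 for redundancy.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc

# Put a filesystem on the array and mount it as a fast scratch area:
mkfs.ext4 /dev/md0
mkdir -p /mnt/scratch
mount /dev/md0 /mnt/scratch

# Persist the array definition so it assembles on boot:
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
```

Note RAID0 has no redundancy at all - it is only worth considering for regenerable build data, not for anything you care about.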
Of course, an SSD is never a /bad/ thing - but if you have a choice of
an SSD or more ram, then more ram is usually the best use of your money.
It is different on a laptop, where the low power and robustness of the
SSD helps, and where you do want to reboot regularly and move ram data
onto disk for hibernation.
And on Windows, where the OS loads entire application exe's and dll's
rather than just the bits you need, and keeps re-loading the files it
has just read, then an SSD can make a significant difference. (To be
fair on Windows, Win 10 is better than previous generations at using
memory for cache.)
My main concern when it comes to FPGAs is simulation time. I spend much
more time simulating than I do creating a bit stream. Simulators are
speed crippled unless you pay for a faster version. So I'm not sure how
much the speed of your CPU matters.
Memory is always important. I have 16 GB in my laptop and tend to push
that limit just by keeping many browser windows and tabs open. So at a
minimum get 16 GB and more if you want.
Otherwise I think you will never notice the difference between the
various processor speeds, at least not as much difference as the price
would seem to indicate.
Don't sweat it. Before you spend any money on a new laptop, just get
familiar with the tools and the process. Then figure out if you want to
pay for new hardware.
Both of the statements above are wrong.
Modern FPGA tools can occupy multiple cores, but it does not translate into
significant improvements in synthesis or P&R speed.
Two cores are at best 15% faster than one core. Four cores over two -
hardly noticeable at all.
As to memory, each FPGA device requires a certain amount of memory for P&R.
When you have that much, getting more does not help. If you have less, you'd
better not start - it would be too slow. The said "certain amount" is
typically specified in the documentation of your tools.
For example, Altera Quartus 15.1 (devices that are likely to matter for hobbyists):
Cyclone IV E - 512 MB to 1.5 GB
Cyclone IV GX - 512 MB to 2 GB
Cyclone V - 6-8 GB
MAX II/MAX V - 512 MB
MAX 10 - 512 MB - 2 GB
So, >2 cores and >8 GB of RAM matter *only* if one wants to run several compilations in parallel.
In money-limited situation the choice between
a) 8 GB of RAM + HDD
b) 16 GB of RAM + 256 GB SSD
is very obvious - take b)
That assumes that you are also buying an external HDD for backups and for
rarely used stuff, but you'll want that anyway, don't you?
Mistake above. I meant to say:
a) 16 GB of RAM + HDD
b) 8 GB of RAM + 256 GB SSD
Ubuntu is OK for Xilinx Vivado.
For Altera Quartus, it is not supported. That does not mean it wouldn't work in the end, but initially it would be a pain to set up, relative to Win7 or to Red Hat-related Linux distros.
Soft cores don't add to the RAM requirements on the HW development side.
Be very careful with your choice of laptop regarding cooling. Even the very
fast ones are designed to only be very fast for short periods, although
monster gaming laptops are an obvious exception. Most of us here have
laptops, but we synthesize on the "farm" due to overheating. My fan was
running almost constantly until I stripped it down and replaced the
dried-up and useless thermal paste on the processor.
The most critical thing involved in choosing hardware for synthesis (at
least with Altera tools) is not what you expect. It's Iris Pro graphics.
And it's even better when you have Iris Pro graphics /and/ a discrete GPU.
Even though the tools don't use the GPU for compute whatsoever.
Think I'm crazy? Here's the explanation.
FPGA synthesis is very memory heavy. The in-memory dataset can be large (eg
for Cyclone V Altera recommend 6GB of memory - that probably means you want
8GB in the machine). That dataset utterly hammers the CPU cache, which is
tiny in comparison (2MB/core on many Intel parts). That means memory
latency is a big bottleneck.
Some (not all) Iris Pro parts have EDRAM, which is 32/64/128MB of DRAM in
the CPU package. It's intended to give the GPU closer memory, rather than
having to share DDR3/DDR4 with everything else. However, on some Haswell,
Broadwell and Skylake parts, the EDRAM can be used as L4 cache for the CPU.
The latency of EDRAM is about half that of DDR3, and this shows in benchmark
results - eg against a dual-socket E5-2667v2 (8 cores per socket) the
Broadwell i7-5775c (quad-core 128MB EDRAM, 6MB L3) is about twice as quick.
Against an i7-6700k the Broadwell is about 10-20% quicker (I don't have the
exact numbers here). I tweaked other parts of the Broadwell machine with
some excessively 'enthusiast' parts (DDR3-2400, NVMe, crazy cooler) which
made insignificant differences but it was the CPU choice that stood out.
Now the bit I haven't benchmarked is as follows. The L4 is relatively
small, and so having the GPU take out a chunk isn't ideal. So my theory
goes that a machine with Iris Pro graphics and any old discrete GPU will
prevent the video system using the EDRAM and so keep it all for use as L4.
Because I don't know exactly what the GPU drivers will use EDRAM for I
haven't found a good way to benchmark EDRAM contention. The two test
machines I have are using ancient Radeon X1300 and GeForce 7700 GPUs, just
to drive a basic display while keeping it off the EDRAM.
The downside to this is that EDRAM is pretty rare across Intel's product
range, particularly in desktops and servers. However it's more common in
laptops - which means that, if you can get the cooling package right, a
laptop isn't a bad option for synthesis. The other option, unless you're
willing to go to a desktop i7-5775c or i7-5675c, or a Xeon E3-1200 v4, is
the Skylake Skull Canyon NUC. Despite being thermally constrained this
clocks in about the same performance as an i7-6700k desktop with a massive
cooling solution (the NUC is also sharing EDRAM with the GPU).
(I would be very interested if there are any standardised benchmarks out
there for synthesis tools.)
Quartus is absolutely fine on Ubuntu (up to 16.04). You just have to
install a handful of libraries and a udev rule and that's it.
(There are a couple more warnings you can make go away by deleting the
supplied libraries it uses and letting it fall back to the system ones, but
I usually don't bother and just ignore them)
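As a rough sketch of what that setup amounts to (the exact 32-bit package names vary between Ubuntu releases, and the udev rule shown is the commonly used one for Altera's USB-Blaster - check both against your Quartus version's release notes; everything here is an assumption to verify, run as root):

```shell
# Sketch: typical extra setup for Quartus on Ubuntu.
# 32-bit support libraries - package names differ per Ubuntu release:
dpkg --add-architecture i386
apt-get update
apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386 libxft2:i386

# udev rule so the USB-Blaster JTAG cable works without root
# (09fb is Altera's USB vendor ID; 6001/6002/6003 cover common Blasters):
cat > /etc/udev/rules.d/51-usbblaster.rules <<'EOF'
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6001", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6002", MODE="0666"
SUBSYSTEM=="usb", ATTR{idVendor}=="09fb", ATTR{idProduct}=="6003", MODE="0666"
EOF
udevadm control --reload-rules
```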
We use it 100% on Ubuntu and never had an Ubuntu-related problem we couldn't
easily solve. (The main one is working out what Ubuntu have renamed the
packages to this time, which happens every two years).
For Quartus, there are two phases with different performance characteristics:
Synthesis is mostly RAM-bound, except when it needs to interact with the
on-disk database (which can be about 1GB in size).
IP Generation (mostly the Qsys tool) - when you ask it to generate
Verilog for a system-on-chip you built, and it produces a large number of
verilog files instantiating all the IP you need for your system. This is
disk bound because it's all about latency.
SATA SSD will make IP generation about a factor of 2 faster, NVMe perhaps
1.5x faster than SATA SSD. SATA SSD is perhaps 10% quicker for synthesis
(figures off the top of my head, I don't have hard numbers with me). NVMe
didn't make much difference to synthesis.
What really kills is network filesystems. Do not put your files on NFS, or
worse any kind of off-premises network filesystem, because you will be in for
much pain. Do not put your files in Dropbox, because you will then thrash
trying to upload them (as well as eat bandwidth for breakfast).
At the moment flash is cheap enough that I won't really consider HDD any
more except for semi-archival storage. Though beware the cheap end of the
consumer flash market where there's plenty of dross (5400rpm laptop HDDs are
pretty dire too). My current trick is buying 'prosumer' kit (eg Samsung 850
EVO) and formatting it 7-10% short. That gives the controller some margin
so that performance consistency doesn't nosedive as it gets full, but is
cheaper than buying the 'PRO' models (which are ~50% more expensive last
time I looked at 1TB). I have no numbers to back that up as yet, but my
strategy is taken from staring at too many benchmark graphs.
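A sketch of that "format it short" trick with standard Linux tools (the device name /dev/sdX is a placeholder, and the 93% end point matches the 7-10% margin mentioned above - both are assumptions to adapt; runs as root and destroys the drive's contents):

```shell
# Sketch: partition a new SSD ~7% short so the controller keeps spare area.
# /dev/sdX is a hypothetical device - triple-check the name before running.
# Best done on a brand-new drive, or after a full secure erase.
parted --script /dev/sdX mklabel gpt
parted --script /dev/sdX mkpart primary ext4 1MiB 93%
mkfs.ext4 /dev/sdX1
```

The unpartitioned tail is never written by the OS, so the controller can use it freely for wear levelling and garbage collection.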
Our Quartus measurement suggests that VirtualBox performance (Linux host on
Linux guest as it happens, but that shouldn't matter so much) is within
about 10-15% of running on the host. The same feels about right with VMware
Fusion on comparable hardware (obviously we can't run Quartus on macOS, and
I haven't tried native Linux). I haven't tested Quartus directly on Windows
of any kind.
I think eDRAM is present in all Xeon E3 v4 (Broadwell) processors.
For Xeon E3 v5 (Skylake), Intel ARK suggests the E3-1585 v5, E3-1585L v5 and E3-1565L v5, as well as a few mobile Xeons like the E3-1545M v5.
If memory latency is so crazily important, then probably even the number of installed DIMMs matters: 4 DIMMs will run slower than 2 DIMMs.
ECC also adds latency, but probably less so than 4 DIMMs.
What sort of sizes are you talking about for these files?
If you have enough ram, so that they are in your disk caches, then disk
latency will be pretty much irrelevant. When the app writes the files,
they go into cache - the actual write to disk takes place asynchronously
unless the app specifically waits for it - the delay before hitting the
disk surface does not matter. When the app reads the files, they are
already in cache and the data is returned immediately. (This is on
Linux - Windows tends to wait for files to be written out, then might
clear them from cache so that they must be re-read later.)
If these files are temporaries, then the best choice of filesystem for
them is tmpfs (again, on Linux). Even if you don't have quite enough
ram, so that the tmpfs spills into swap, it is more efficient -
structures like directories and allocation tables are entirely within
ram, and the tmpfs doesn't bother with logs, barriers, or anything else
integrity-related. More ram and a tmpfs will beat the fastest PCIe
SSD's by orders of magnitude.
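A minimal sketch of using tmpfs for such temporaries via /dev/shm, which is a tmpfs mount present on virtually every Linux system (the directory name fpga_work and the project path in the comment are just examples):

```shell
# Sketch: keep scratch files on tmpfs so reads and writes never hit the disk.
# /dev/shm is tmpfs on stock Linux; no root needed to use it.
WORK=/dev/shm/fpga_work
mkdir -p "$WORK"

# A tool can be pointed here, e.g. by symlinking a project's build dir:
#   ln -sfn "$WORK" ~/myproject/build    (path is hypothetical)

# Files written here live in RAM, spilling to swap only under memory pressure:
echo "synthesis scratch data" > "$WORK/demo.txt"
cat "$WORK/demo.txt"
```

Remember that tmpfs contents vanish at reboot, so only regenerable temporaries belong there.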
But of course an SSD is faster than an HD, as well as being more
reliable (in some ways at least), and quieter. My point is just that
you concentrate on ram first, disk second when you are wanting to have
quick handling of files that are within the size that fits in ram.
I agree with that strategy (also based on just a few numbers of my own,
and lots of "this makes sense to me" feelings). But make sure you do
your 90-95% partitioning when the disk is new and clean, or after a full
secure erase! Decent SSDs already have a certain amount of
overprovisioning, but leaving a little space unused at the end to
increase the overprovisioning can improve performance and lifetimes
under heavy write load (at the expense of reduced capacity, of course).
Thanks for that post - it was very interesting, and I will keep those
ideas in mind if I need a fast system sometime.
There is a list of the Intel devices with eDRAM at Wikipedia (which
usually has more convenient lists than anyone else):
First, thanks to everybody who replied.
This has grown into quite an interesting discussion about some of the
"underlying" issues of FPGA design.
Well, what got me thinking is the fact that my employer now has a
project with HP which includes an additional reduction in the price, but
- on the other hand - really limits my choice.
As I work about 120 km from where I live 3 days a week, I am more
interested in a laptop with a limited screen size, as I need to carry
the thing on the train and in my backpack every day.
I do have a large external screen at home.
In addition to the limited choice (e.g. all the devices in the project
have 8 GB of RAM), it has been very difficult to get additional
information about the devices (like "can you extend the memory of the
HP Envy from 8 to 16 GB, how many slots would that use, and what about
the warranty if you do that?").
Also, for some reason, I find very limited information on how well
Ubuntu runs on these devices.
(I guess "no news is good news".)
BTW, for some reason, more and more of these devices come with relatively
few USB ports.
Yesterday, I was testing my SPI-slave VHDL code on a cheap FPGA board
with an mbed board, and that required three USB ports: the FPGA board,
the mbed board and a logic analyser that sat in between them.
Needless to say, the combination of Altera Quartus, Chrome (for the
mbed board) and the logic analyser software quickly ate up all of the
4 GB of memory I have on my current machine.
I simply run ubuntu natively, not in a VM.
OK, but to put things in perspective, I am just a hobbyist. My goal is
to be able to try out one of the new RISC-V CPUs.
Perhaps I will get an Olimex iCE40HX8K board that should come out in
October and try out the "icestorm" toolchain (a completely open-source
FPGA toolchain for the iCE40HX FPGAs).
The people of the icestorm project claim their toolchain uses far fewer
resources than the commercial products; and they also claim to have
got one of the RISC-V implementations running on one of these boards.
Cheerio! Kr. Bonne.