RISC-V Support in FPGA - Page 2

Re: RISC-V Support in FPGA
On Mon, 01 May 2017 15:42:03 -0700, Kevin Neilson wrote:

Or maybe project TAHITI from Agents of S.H.I.E.L.D.

--  

Tim Wescott
Wescott Design Services
Re: RISC-V Support in FPGA
A basic RV32I (the minimal 32-bit user-mode instruction set) is very simple.
Here's one that's about 400 lines of SystemVerilog, that was designed by a
student over a few weeks as a summer project:

https://github.com/ucam-comparch/clarvi

Theo

Re: RISC-V Support in FPGA

The code looks pretty clear at first glance.  I see a lot of SystemVerilog constructs that don't look synthesizer-friendly, though.

Re: RISC-V Support in FPGA
I gave it a quick glance.  It all looks synthesizable to me.  We've used
SystemVerilog in both Vivado and Synplify, and I think the code should
work fine.  YMMV.

Regards,

Mark



Re: RISC-V Support in FPGA
A primary motivation was to teach SystemVerilog to undergrads - rather than
teach them lowest-common-denominator Verilog that's universally accepted by
tools but is pretty tedious as a learning environment.

We tested it pretty extensively with Modelsim and Intel FPGA tools; we
didn't have enough summer to put it through Xilinx or ASIC tools, but we're
happy to fix things if there are any issues.

Theo

Re: RISC-V Support in FPGA
At first glance I thought I'd seen some object-oriented stuff in there but
it was just structs.  I actually used a lot of SystemVerilog a few years
ago when I was only using Synplify, but now I write cores that have to work
in a broad range of synthesizers which sadly don't even accept many
Verilog-2005 constructs.

Re: RISC-V Support in FPGA
On 5/2/2017 5:52 PM, Kevin Neilson wrote:
I wonder what is behind that.  Much of VHDL-2008 is supported in most  
tools, at least all the good stuff.  I believe the Xilinx tools don't  
include 2008, but I haven't tried it.  Otherwise I'm told the third  
party vendors support it and the Lattice tools I've used do a nice job  
of it.

I can't understand a vendor being so behind the times.

--  

Rick C

Re: RISC-V Support in FPGA
Rick - yeah, it's pathetic.  The synthesizable subset of SystemVerilog was
actually fairly concretely defined in the SystemVerilog 3.1 draft, in 2005.
We're just now, 12 years later, really finding an acceptable solution for
FPGA designs.  To repeat myself: it's really pathetic.

Vivado seems to actually have BETTER language support for SystemVerilog than  
Synplify - believe it or not.  But this only works so far until you hit some  
sort of corner case and the tool spits out a netlist which doesn't match the  
RTL.  (We've hit too many of those issues in the past 2-3 years).  

Synplify, on the other hand, barfs on perfectly acceptable, synthesizable code
(i.e. SystemVerilog features that already have parallels in VHDL).  But
Synplify has never (for us) produced a netlist which doesn't match RTL...

Regards,

Mark

Re: RISC-V Support in FPGA
On 5/3/2017 11:22 AM, Mark Curry wrote:
Am I hearing a justification for staying with VHDL rather than learning  
Verilog as I've been intending for some time?  My understanding is that  
to write test benches like what VHDL can do it is useful to have  
SystemVerilog.  Or is this idea overblown?

--  

Rick C

Re: RISC-V Support in FPGA
Rick - I was speaking of Synthesizer support within FPGA tools only.

Simulation support depends entirely on your vendor, and is an entirely  
different beast.  We've been happy with Modelsim for all our SystemVerilog
simulations - for many years.  Can't comment much on other simulation  
vendors, and their support.  I've not used VCS, or NCSIM (or whatever  
they're now called) in many years.  Never tried Xilinx "free" simulators,
but for "free" I'd expect you'd get what you pay for.

I'll not wade any deeper into language wars - use what you're most
comfortable with.  Doesn't hurt to have experience with both.

Regards,

Mark

Re: RISC-V Support in FPGA
In my case, I mostly write for Vivado, but I have to write code which will
also work for some ASIC synthesis tools which don't like anything too
modern.  I'm not sure why; I just know I have to keep to a low common
denominator.

Anyway, and this is a different topic altogether, I've reverted to writing
very low-level code for Vivado.  I've given up the dream of parameterizable
HDL.  I do a lot of Galois Field arithmetic and I put all my
parameterization in Matlab and generate Verilog include files (mostly long
parameters) from that.  The Verilog then looks about as understandable as
assembly and I hate doing it but I have to.  It's the same thing I was
doing over ten years ago with Perl but now do with Matlab.  Often Vivado
will synthesize the high-level version with functions and nested loops, but
it is an order of magnitude slower (synthesis time) than the very low-level
version.  And sometimes it doesn't synthesize how I like.  I've just given
up on high-level synthesizable code.

Re: RISC-V Support in FPGA
(continuing a bit OT...)

Kevin,

That's unfortunate.  We've been very successful with writing parameterizable
code, even before SystemVerilog.  Heck, even before Verilog-2001.  Things like
N-tap FIRs, 2-D FIRs, FFTs, video blenders, etc., all with configurable
settings: bit widths, rounding/truncation options, etc.  I think in a previous
job I had a parameterizable Galois Field multiplier too.

I'm not sure what trouble you had with the tools.  It takes a bit more up-front
work, but pays off quite a bit in the end.  We really had no choice, given the
number of FPGAs we do, along with how many engineers support them.  Lots of
shared code was the only way to go.

If you've got something you like, then I suggest keeping it.  But for others,
I think writing parameterizable HDL isn't too much trouble - and is made
even easier with SystemVerilog.  And higher level too.

Regards,

Mark


Re: RISC-V Support in FPGA
I've just been burned too many times.  I know better now.  The last time I
made the mistake I was just making a simple PN generator (LFSR).  The only
complication was that it was highly parallel; I think I had to generate
maybe 512 bits per cycle, so it ends up being a big matrix multiplication
over GF(2).  First I made the high-level version where you could set
parameters for the width and taps and so on.  It took forever for Vivado to
crank on it.  This is just a few lines of code, mind you, and is just a
bunch of XORs.  Then I had Matlab generate an include file with the matrix
packed into a long parameter which essentially sets up XOR taps.  That was,
I think, ~20x faster, which translated into hours of synthesis time saved.
The synthesized circuit was also better for various reasons.  This is just
one example.  I also still have to instantiate primitives frequently for
various reasons.  The level of abstraction doesn't seem like it's changed
much in 15 years if you really need performance.  This doesn't really have
anything to do with the SystemVerilog constructs.  I'm just talking about
high-level code in general.  If I were allowed, I would still use modports,
structs, enums, etc.
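
As a rough illustration of the include-file approach described above, here is a minimal Python sketch (the post used Matlab; this is not that code).  It assumes a Fibonacci-style LFSR that shifts right and takes its output from bit 0.  The names lfsr_step, build_parallel_matrices, apply_rows and verilog_param, the tap set in the demo, and the packing order of the parameter are all hypothetical.  Because the LFSR is linear over GF(2), probing it with each one-hot state yields the columns of the matrices that produce nbits outputs and the next state in a single step.

```python
def lfsr_step(state, taps, width):
    """Advance a right-shifting Fibonacci LFSR one bit (assumed structure)."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1          # feedback = XOR of the tapped bits
    out = state & 1                     # output bit is the LSB
    state = (state >> 1) | (fb << (width - 1))
    return state, out


def parity(x):
    """XOR-reduce the bits of x."""
    return bin(x).count("1") & 1


def build_parallel_matrices(taps, width, nbits):
    """Return (out_rows, next_rows) as GF(2) row masks over the state bits.

    out_rows[k]  selects the state bits XORed to form output bit k.
    next_rows[i] selects the state bits XORed to form next-state bit i.
    """
    out_rows = [0] * nbits
    next_rows = [0] * width
    for j in range(width):              # probe with one-hot state e_j
        s = 1 << j
        for k in range(nbits):
            s, o = lfsr_step(s, taps, width)
            out_rows[k] |= o << j
        for i in range(width):
            next_rows[i] |= ((s >> i) & 1) << j
    return out_rows, next_rows


def apply_rows(rows, state):
    """Multiply the row-mask matrix by the state vector over GF(2)."""
    return sum(parity(r & state) << i for i, r in enumerate(rows))


def verilog_param(name, rows, width):
    """Pack rows into one long localparam, row 0 in the low bits (assumed)."""
    value = 0
    for i, r in enumerate(rows):
        value |= r << (i * width)
    total = len(rows) * width
    digits = (total + 3) // 4
    return f"localparam [{total - 1}:0] {name} = {total}'h{value:0{digits}x};"


if __name__ == "__main__":
    # Hypothetical 16-bit tap set, 32 output bits per clock.
    out_rows, _ = build_parallel_matrices([0, 2, 3, 5], 16, 32)
    print(verilog_param("PN_OUT_MATRIX", out_rows, 16))
```

Each row mask lists which state bits feed one output bit, so the HDL side only has to AND a row with the state and reduce-XOR the result.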

Re: RISC-V Support in FPGA
Ah, we did find something similar in Vivado.  For us it was a large parallel
CRC, which is pretty much functionally identical to your LFSR (big XOR trees).

We had code that calculated, basically, a shift table to calculate the CRC of a long word.
The RTL code worked fine for ISE.  But when we hit Vivado, it'd pause 10 minutes or so
over each instance (we had lots), which significantly hit our build times.

So, I changed this code to "almost" self-modifying code.  The code would by default
calculate the shift matrix using our "normal" RTL, which looked something like:
      assign H_n_o = h_pow_n( H_zero, NUM_ZEROS_MINUS_ONE );
where H_zero was a "matrix" of constants, and NUM_ZEROS_MINUS_ONE a static
parameter.  The end result is a matrix of constants as well, but "dynamically"
calculated. (Here "dynamically" means once at elaboration time, since all inputs
to the function were static).

Then we just added code to dump each unknown table entry sort-of like:
  if( ( POLY_WIDTH == 8 ) && ( NUM_ZEROS_MINUS_ONE == 7 ) && ( POLYNOMIAL == 'h2f ) )
    assign H_n_o = 'hd4eaf52e175ffba9;
  ...
  else // no table entry - use default RTL calc
    assign H_n_o = h_pow_n( H_zero, NUM_ZEROS_MINUS_ONE );

We "closed" the loop by hand.  If the "table" entry didn't exist, the tool would use the
RTL definition, and spit out the pre-calculated entry.  All done in
Verilog.  We'd insert that new table entry into our source code by hand, and continue; the next
time, the build would be quicker.
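
The elaboration-time calculation described above can also be modelled offline.  What follows is a minimal Python sketch, assuming an MSB-first CRC register in which H is the matrix for clocking in one zero bit, with the n-th power computed by repeated squaring.  The real h_pow_n, the exact meaning of NUM_ZEROS_MINUS_ONE, and the bit packing of the table constant are not given in the post, so the function names and the packing here are guesses and the output is not expected to reproduce the constant quoted above.

```python
def clock_zeros(state, poly, width, n):
    """Reference model: clock n zero bits through an MSB-first CRC register."""
    mask = (1 << width) - 1
    for _ in range(n):
        msb = (state >> (width - 1)) & 1
        state = ((state << 1) & mask) ^ (poly if msb else 0)
    return state


def crc_zero_matrix(poly, width):
    """H as a list of columns: cols[j] is the register after one zero bit
    when starting from the one-hot state e_j."""
    return [clock_zeros(1 << j, poly, width, 1) for j in range(width)]


def mat_vec(cols, v):
    """GF(2) matrix-vector product: XOR the columns selected by bits of v."""
    r, j = 0, 0
    while v:
        if v & 1:
            r ^= cols[j]
        v >>= 1
        j += 1
    return r


def mat_mul(a, b):
    """GF(2) matrix product A*B, both stored as column lists."""
    return [mat_vec(a, col) for col in b]


def mat_pow(h, n, width):
    """H**n over GF(2) by repeated squaring."""
    result = [1 << j for j in range(width)]   # identity matrix
    while n:
        if n & 1:
            result = mat_mul(result, h)
        h = mat_mul(h, h)
        n >>= 1
    return result


def table_entry(poly, width, n):
    """Pack H**n into one hex literal, column 0 in the low bits (assumed)."""
    packed = 0
    for j, col in enumerate(mat_pow(crc_zero_matrix(poly, width), n, width)):
        packed |= col << (j * width)
    digits = (width * width + 3) // 4
    return f"'h{packed:0{digits}x}"


if __name__ == "__main__":
    # Same POLY_WIDTH and POLYNOMIAL as the example entry; exponent is a guess.
    print(table_entry(0x2F, 8, 8))
```

For an 8-bit polynomial the packed matrix is 64 bits, the same size as the 16-digit constant in the table entry, which is consistent with the table holding a precomputed 8x8 matrix.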

This *workaround* was a bit of a kludge, but was the rare (really the only) exception for us
in our parameterized code.  Normally the tools just handled things fine.
And again to be clear the only thing we were working around was long synthesis times.  
The quality of results was fine in either case.

Maybe for the code you were creating the pendulum swings the other way,
and seeing things like this was more the norm than the exception.

Interesting topic, I'm glad to hear of your (and others') experiences.

Regards,

Mark



Re: RISC-V Support in FPGA
I looked up my notes for the LFSR I was referring to and one instance of
the more-abstract version took 16 min to synthesize and the less-abstract
version took less than a minute.  (And we needed many instances.)  When I
try to do something at a higher level it ends up like your experience:  I
have to do a lot of experiments to see what works and then tweak things
endlessly.  It eats up a lot of time.

Re: RISC-V Support in FPGA
On Wed, 03 May 2017 13:39:38 -0700, Kevin Neilson wrote:

I use Vivado to do GF multiplications that wide using purely behavioural
VHDL.  BTW, a straightforward behavioural implementation will *not* give
good results with a wide bus.
I believe the problem is that most tools (in particular Vivado) do a poor  
job of synthesising xor trees with a massive fanin (e.g. >> 100 bits).  
The optimisers have a poor complexity (I guess at least O(N^2), but it  
might be exponential) wrt the size of the function.

You can use all sorts of mathematical tricks to make it work without need  
to go "low level".
For example, to deal with large fanin, partition your 512 bit input into  
N slices of 512/N bits each.  Use N multipliers, one for each slice, put  
a keep (or equivalent) attribute on the outputs, then xor the outputs  
together.  This gives the same result, uses about the same number of LUTs,  
but gives the optimiser in the tool a chance to do a good job.
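
The slicing trick above rests on XOR being associative: a wide GF(2) matrix-vector product equals the XOR of the products taken over disjoint slices of the input.  A minimal Python sketch of that identity follows (the original flow is VHDL; the names gf2_matvec and sliced_matvec and the random test matrix are hypothetical, and nothing here models the keep attribute or synthesis itself).

```python
import random


def parity(x):
    """XOR-reduce the bits of x."""
    return bin(x).count("1") & 1


def gf2_matvec(rows, v):
    """Output bit i is the XOR of the input bits selected by rows[i]."""
    return sum(parity(r & v) << i for i, r in enumerate(rows))


def sliced_matvec(rows, v, width, n_slices):
    """Same product, computed as n_slices narrower products XORed together."""
    if width % n_slices:
        raise ValueError("width must be a multiple of n_slices")
    step = width // n_slices
    acc = 0
    for s in range(n_slices):
        mask = ((1 << step) - 1) << (s * step)      # input bits of this slice
        partial = gf2_matvec([r & mask for r in rows], v & mask)
        acc ^= partial                              # final XOR stage
    return acc


if __name__ == "__main__":
    rng = random.Random(0)
    rows = [rng.getrandbits(512) for _ in range(64)]   # arbitrary dense matrix
    v = rng.getrandbits(512)
    print(sliced_matvec(rows, v, 512, 8) == gf2_matvec(rows, v))
```

In hardware terms each partial result corresponds to one of the N multipliers, and the accumulation corresponds to the final XOR of their outputs.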


I use the same GF multiplier code in ISE and Quartus, too (but not on  
buses that wide).

The entire flow is in VHDL and works in any LRM-compliant tool.  It's  
parameterised, too, so I don't need to rewrite for a different bus width.


I've been using similar approaches in VHDL since the turn of the century  
and have never been burned.

YMMV.

Regards,
Allan

Re: RISC-V Support in FPGA
I used to do big GF matrix multiplications in which you could set
parameters for the field size and field generator poly, etc.  Vivado just
gets bogged down.  Now I just expand that into a GF(2) matrix in Matlab and
dump it to a parameter and all Vivado has to know how to do is XOR.
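
To make the "expand into a GF(2) matrix" step concrete, here is a minimal Python sketch for the simplest case, multiplication by one constant in GF(2^m).  It is not the poster's Matlab; the function names are hypothetical, and the field polynomial 0x11B in the demo is just the familiar AES one, chosen as an example.  Multiplying by a constant is linear over GF(2), so the matrix columns are simply the products of the constant with each power of x.

```python
def gf_mul(a, b, poly, m):
    """Multiply a and b in GF(2^m); poly includes the x^m term."""
    r = 0
    for _ in range(m):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> m:          # reduce when the degree reaches m
            a ^= poly
    return r


def const_mul_rows(c, poly, m):
    """Expand 'multiply by c' into m GF(2) row masks.

    Bit j of rows[i] is set when input bit j contributes to output bit i.
    """
    cols = [gf_mul(c, 1 << j, poly, m) for j in range(m)]
    return [
        sum(((cols[j] >> i) & 1) << j for j in range(m))
        for i in range(m)
    ]


def apply_rows(rows, x):
    """Evaluate the expanded matrix: each output bit is a plain XOR."""
    return sum((bin(r & x).count("1") & 1) << i for i, r in enumerate(rows))


if __name__ == "__main__":
    rows = const_mul_rows(0x57, 0x11B, 8)
    print([f"{r:02x}" for r in rows])   # one XOR tap mask per output bit
```

A full GF(2^m) matrix multiply expands the same way, block by block, which is why the synthesizer then only ever sees XORs.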

I also have problems with the wide XORs.  Multiplication by a big GF(2)
matrix means a wide XOR for each column.  Vivado tries to share LUTs with
common subexpressions across the columns.  Too much sharing.  That sounds
like a good thing, but it's not smart enough to know how much it's impacting
timing.  You save LUTs, but you end up with a routing mess and too many
levels of logic and you don't come close to meeting timing at all.  So then
I have to make a generate loop and put subsections of the matrix in separate
modules and use directives to prevent optimizing across boundaries.  (KEEPs
don't work.)  It's all a pain.  But then I end up with something a little
bigger but which meets timing.

I really wish there were a way to use the carry chains for wide XORs.

Re: RISC-V Support in FPGA
On Thu, 04 May 2017 10:56:56 -0700, Kevin Neilson wrote:

I thought about my historical code some more, and I realised that I did  
have some examples of behavioural GF multipliers that didn't work as well  
as the same function expressed as a bunch of wide xors.

The particular example I'm thinking of had a 128 in, 128 xor tree that  
really shouldn't be any harder to synth than a CRC.  It's a linear  
mapping stage in an SP block cipher (like AES, but not AES (which has a  
relatively weak mixing function)).

Vivado gave (IIRC) 11 or 12 levels of logic rather than the expected 3  
levels of logic.  Hmmm.  The revised source code (expressed as a bunch of  
xors) produced 4 levels of logic, and routed to speed.

BTW, I used my VHDL testbench for the original function to write out the  
VHDL for the xor tree.

  
I think that carry chains (and similar structures) became less important  
for wide functions once six input LUTs became commonplace.

The Xilinx DSP48E2 has a wide xor mode that I think can give a 96 input  
xor in a single DSP48E2 slice.  I've never tried it.

Regards,
Allan

Re: RISC-V Support in FPGA
Same here.  I have constant multiplier matrices and each has a column
weight of about 160 so I end up with a 160-input XOR for each column.
Ideally that would be log6(160)=2.8 levels.  First I have to use very
low-level code and even then Vivado shares subexpressions too much and I end
up with 6 levels unless I isolate column groups in different modules.  If I
isolate each column in its own module I can get the 3 levels.  Isolating
column groups also means they are placed as a group which reduces
wirelengths.
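
The level count above is easy to check.  A minimal Python sketch, assuming an ideal tree of 6-input LUTs with no sharing between columns (the function name lut_levels is hypothetical):

```python
import math


def lut_levels(fan_in, lut_inputs=6):
    """Levels of k-input LUTs needed to XOR fan_in signals in an ideal tree."""
    levels = 0
    signals = fan_in
    while signals > 1:
        signals = -(-signals // lut_inputs)   # ceiling division
        levels += 1
    return levels


if __name__ == "__main__":
    print(math.log(160, 6))   # about 2.83, the log6(160) figure
    print(lut_levels(160))    # 160 -> 27 -> 5 -> 1, i.e. 3 levels
```

Since a fractional level is not possible, 2.8 rounds up to the 3 levels reached when each column is isolated; the same estimate gives 3 levels for a 128-input XOR.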

Yeah, I looked into this at one point but decided against it for a few
reasons.  I thought a nice feature would be to be able to turn off the
carries in the DSP48 and then you could use them for GF multipliers.  I have
used DSP48s as GF(2) accumulators and I've used them as transposers to
extract column data from rows stored in RAMs.

Re: RISC-V Support in FPGA
Pretty small (and fast):  
https://forums.xilinx.com/t5/Xcell-Daily-Blog/1680-open-source-ISA-RISC-V-processor-cores-run-on-one-Virtex/ba-p/742731

On 04/29/2017 08:04 PM, rickman wrote: