LatticeMico32 extremely poor performance without caches

Hi

just some results for LatticeMico32:

  • no cache
  • code and data in Block RAMs

testing with software loop

sw r0,r0,0x100
bri -1

this loop executes in 28 system clock cycles!

simulation done with Xilinx ISE built-in simulator ISIM, using coregen for addsub and block RAM components.

Antti

PS: as far as I can see, Lattice is at the time of writing violating the GPL license. Or does anyone know where to download the GPL-licensed source code of the LatticeMico32 GNU toolchain?

Reply to
Antti

Hi Antti,

have you any idea why it is that slow? Branch penalty? Or is the write that slow? Does the performance improve with caches on? (I have not looked closely at Mico32 yet, maybe it is intended to only be used with caches?)

Regarding the GPL: I think it is sufficient if they clearly say that the software is GPL-licensed and if they provide you the source code on request. So they would only be violating the license if you ask them to provide the source code and they say "No". Once you have the source code, you are free to publish it yourself on a website (I am sure you will ;-)

Thomas

"Antti" wrote in message news: snipped-for-privacy@i42g2000cwa.googlegroups.com...

Reply to
Thomas Entner

You really do need to enable the caches if you want high performance; then you should be able to get near single-cycle execution (allowing for branch penalties, cache refills, etc.).

Cheers, Jon

Reply to
Jon Beniston

Jon Beniston wrote:

well, if the only memories are on-chip Block RAMs then caches should not be needed. On the Xilinx MicroBlaze LMB and PPC OCM buses, the BRAMs work like always-hit cache memories.

on LM32 all memories are on the Wishbone bus, which makes access to a BRAM-based memory block slower than access to external memory (assuming a cache hit).

LM32 does fit nicely into small XP devices like the XP3, but only without cache, so the requirement to always have caches in order to achieve any normal clocks-per-instruction ratio seems like a severely limiting factor for LM32.

OpenFire (an open-source MicroBlaze clone) would run way faster in Lattice silicon than the LM32 (if execution from on-chip memory is compared).

Antti

Reply to
Antti

Antti.

This is why it is relatively slow.

If you look through the RTL, there is some support for this. I'm sure Lattice will enable it via the GUI in a later version.

Cheers, Jon

Reply to
Jon Beniston

Jon Beniston wrote:

let's hope the local-memory interface will be made available and documented; without it (and with no cache) the performance is really bad.

I have it now running in Virtex-4 doing a maximum speed loop incrementing a register and writing it to GPIO

complete program as .COE for Xilinx coregen:

memory_initialization_radix=16;
memory_initialization_vector=
98000000,
B8000800,
34210001,
5801E000,
E3FFFFFE,
2800E000;

xor r0,r0,r0
mv r1,r0
addi r1,r1,1
sw (r0+0xE000),r1 ; this is short store to GPIO base
bi -2
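The machine words in the .COE listing can be cross-checked against the assembly. The following is a minimal Python sketch, not part of the original post. It assumes that the LM32 `bi` encoding carries a signed 26-bit word offset in bits 25:0, taken relative to the branch's own address; that is my reading of the instruction set and is not stated in the thread. The function names `bi_word_offset` and `to_coe` are hypothetical helpers.

```python
# Program words as posted in the .COE listing above.
WORDS = [0x98000000, 0xB8000800, 0x34210001, 0x5801E000, 0xE3FFFFFE, 0x2800E000]


def bi_word_offset(word):
    """Return the signed 26-bit word offset of an (assumed) LM32 'bi' encoding."""
    imm = word & 0x03FFFFFF          # low 26 bits hold the immediate
    if imm & 0x02000000:             # bit 25 set means the offset is negative
        imm -= 1 << 26
    return imm


def to_coe(words):
    """Format 32-bit words as the text of a Xilinx .COE memory init file."""
    body = ",\n".join(f"{w:08X}" for w in words)
    return ("memory_initialization_radix=16;\n"
            "memory_initialization_vector=\n" + body + ";\n")


branch_index = 4                                  # position of E3FFFFFE in WORDS
offset = bi_word_offset(WORDS[branch_index])      # signed offset in words
target_index = branch_index + offset              # word the branch jumps back to
loop_length = branch_index - target_index + 1     # instructions per loop pass

print(offset, target_index, loop_length)
print(to_coe(WORDS))
```

Under that assumption the branch returns to the `addi`, so each pass through the loop executes three instructions: `addi`, `sw` and `bi`.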

This loop emits 181 kHz on GPIO(0) at a 12 MHz system clock, so for a 100 MHz clock the maximum I/O toggle rate would be about 1.5 MHz

:(
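The arithmetic behind those figures can be sketched in a few lines of Python. This is an illustration rather than anything from the original post. It assumes that GPIO(0) carries bit 0 of the incrementing register, so one full output period spans two loop passes, and that each pass is three instructions (`addi`, `sw`, `bi`). The function names are hypothetical.

```python
def loop_cost(f_clk_hz, f_gpio0_hz, instructions_per_pass=3):
    """Estimate clock cycles per loop pass and per instruction.

    Assumes GPIO(0) is bit 0 of the counter, so it changes state once per
    pass and one full output period takes two passes.
    """
    cycles_per_period = f_clk_hz / f_gpio0_hz
    cycles_per_pass = cycles_per_period / 2
    return cycles_per_pass, cycles_per_pass / instructions_per_pass


def scaled_rate(f_gpio0_hz, f_clk_hz, f_new_clk_hz):
    """Scale the measured output frequency linearly with the system clock."""
    return f_gpio0_hz * f_new_clk_hz / f_clk_hz


per_pass, per_instr = loop_cost(12e6, 181e3)
print(f"{per_pass:.1f} cycles per pass, {per_instr:.1f} per instruction")
print(f"{scaled_rate(181e3, 12e6, 100e6) / 1e6:.2f} MHz at 100 MHz")
```

Under these assumptions the loop costs about 33 cycles per pass, or roughly 11 cycles per instruction. That is the same order as the 14 cycles per instruction implied by the 28-cycle, two-instruction loop reported at the start of the thread.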

Antti

Reply to
Antti

Can you cheat and build a system with cache and no memory, then preload the cache with the data you want?


Reply to
Hal Murray

"Hal Murray" wrote in message news:F4idnX0B6paUobzYnZ2dnUVZ snipped-for-privacy@megapath.net...

not sure. This approach works nicely for the Virtex PPC caches, but for LM32 I guess it needs a deep look into the RTL code to see whether it is a possible option or not.

currently I have the caches disabled because they use some function in a way that is not supported by ISE (and I was too lazy/busy to fix it),

and actually I wanted to have resource usage numbers for a minimal setup (e.g. no cache system) anyway.

adding direct memory is of course possible, as all the RTL is available (and supposedly has at least partial support for it), but let's see if there will be some updates to the LM32 release.

Antti

Reply to
Antti Lukats

As far as I can see, I don't think this is supported via the GUI yet, but would of course be possible if you hacked the RTL to change the cache memories and tags to be initialised with the correct data, which should be fairly straightforward to do.

However, this would not be as efficient as using the instruction ROM and data RAM that are in the RTL, as you end up wasting resources on memories for the cache tag RAMs, which aren't needed, and on all of the cache refill logic etc.

Cheers, Jon

Reply to
Jon Beniston

Jon Beniston wrote:

hm, I guess the LM32_RAM that is included when the JTAG debugger is enabled is a block RAM connected directly to the processor. As I had JTAG configured off, this module was also left out, so I only had Wishbone block RAMs left in the system. An update is needed for the custom instruction support anyway, so I guess access to on-chip memories connected directly to the CPU will also be available then.

Antti

PS: for those who want to play with LatticeMico32 on a Xilinx platform, I uploaded the ISE Project Navigator project that I used for testing. It is a rather minimal LM32 system, tested to work in Virtex-4, available @

formatting link
, download area...

Reply to
Antti

Thomas Entner wrote:

formatting link

LatticeMico32 GPL sources: no need to ask, just get them.

Antti

Reply to
Antti

Hi Antti, I am unable to get any download from the Lattice website and get the following message: "The file you have attempted to retrieve is not available at this time.

We apologize for the inconvenience.

If you c

> Thomas Entner wrote:

formatting link

Reply to
avionion

The file you download is a tar.bz2, not zip or exe. You need to extract it with:

tar xjf src.tar.bz2

Cheers, Jon

Reply to
Jon Beniston

Jon Beniston wrote:

Jon,

I think the Lattice website has at least sporadic issues with downloads. Several people are complaining about downloads not starting, so I don't think the issue is being unable to uncompress the .bz2.

Antti

Reply to
Antti

Reply to
avionion

snipped-for-privacy@gmail.com wrote:

try here,

formatting link

200MB, full set of the LM32 distro (except datasheets and GCC sources)

Antti

Reply to
Antti

Reply to
avionion

snipped-for-privacy@gmail.com wrote:

maybe their webmaster is learning at Xilinx?

Antti

Reply to
Antti

Reply to
Avion
