Are they always the same size, or does it depend on how much logic is compiled? Would a simple application use less?
Are the streams very compressible? We have done some simple run-length coding to greatly reduce the storage requirement for other FPGAs. Configs tend to have long runs of 0's.
The T20/256 claims to need 5.4 megabits. I'd like to store the fpga config and application code in a Raspberry Pi Pico, which has 2 MB of onboard flash. Storing the full config would use about a third of that, so reducing that would be useful.
I don't know about Efinix, but bitstream compression is not unusual in the industry. Bitstreams tend to have a lot of compressibility without using fancy algorithms.
It's easy enough to test. Construct a simple design and compile one instance, look at the size, then instantiate multiple instances and check the size again. There is likely a control somewhere to enable/disable compression if it's available.
So you compress for storage and expand prior to downloading? I assume this is done on the fly? Run-length encoding?
2 MB is pretty small these days. The trouble with compression is it may not provide much reduction in size as the design fills up, but then again, maybe it still does.
Efinix is the one with little dedicated routing, instead using the logic elements for routing, right? I've not done enough research to tell how large a part is needed for a given size design. I recall a line of FPGAs from Atmel that was like that. It was not so good in the end. I think, before I used any of their parts, I would want to compile a design using tools from another, conventional FPGA maker and Efinix and see how they compare.
Gowin is a lot less expensive. So far, I like them. But they are Chinese, so I may not be able to use their parts.
I'm at home and don't have access to a compiled bitstream, and this is a discussion group.
I'll get a T20 bit stream Monday or Tuesday and see what it looks like. If there are many runs of 0's, compression and decompression are very simple. Or maybe a typical stream is just shorter than the max.
I recall a Xilinx or maybe Altera stream that compressed about 3:1 with a very simple algorithm. I think I compressed runs of 0's and 1's on that one, with a PowerBasic program.
We considered fancier dictionary-based schemes, sort of like Zip, but they weren't worth the hassle.
Here's a T20 bit stream. The length seems to be constant vs functions coded, but there are enough runs of all 0's that it's probably worth compressing.
formatting link
The actual config file will be binary, not hex of course.
Gzip compresses your 2.0MB down to 105kB. The decompressor isn't tiny, but it's fairly small. The lz4 decompressor is tiny and still gets to 221kB. Possibly less if you RLE first. bz2 gets it to 76kB, and xz or lzma to 72kB.
Compression is one area where it's best to rely on work done by people who understand the theory. Some of these algorithms have a tiny decompressor, the magic is in the compressor.
Not really. If you need to send the data to Voyager 1, then every bit matters. For this work, there is no need for "best", or even "very good". Accommodation must be made for variety in the design, including growth. There is every reason to think that a design with more used elements would compress less. So if a simple compression method (such as RLE) gets the design down to 400 kB, there's not much need to get it down to 100 kB.
Compression works by finding repeatability in the data. In this case, the best compression would likely come from an algorithm that is aware of the structure of the bit stream. In other words, a custom one. Since "best" is not needed, something simple and low-effort is probably best.
The hex file consists mostly of character "0" bytes and linefeeds. Simple run length encoding would compact it a lot. It seems "7","B","D","E","F" are quite rare in these files.
The raw binary file obviously won't have the linefeeds and will be only one byte for every three in the ASCII .hex file, so about 0.7M.
Back-of-the-envelope, RLE might get you a ~20x decrease in size.
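A zero-run RLE like that is only a dozen lines. A sketch, assuming a made-up encoding where a 0x00 byte introduces a run (0x00 followed by a count of 1-255) and every other byte passes through literally; this is my own scheme, not any vendor's format:

```python
def rle_encode(data: bytes) -> bytes:
    """Encode runs of zero bytes as (0x00, count); other bytes pass through."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            n = 1
            while i + n < len(data) and data[i + n] == 0 and n < 255:
                n += 1                 # extend the run, capped at 255
            out += bytes((0, n))       # emit escape byte + run length
            i += n
        else:
            out.append(data[i])        # literal byte
            i += 1
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    """Expand the encoding above; small enough for a microcontroller."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            out += bytes(data[i + 1])  # bytes(n) is n zero bytes
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

Note this only wins when zero bytes dominate; a bitstream with many isolated zeros would actually grow slightly, since each lone zero costs two bytes.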
The right compressor and it could be made a lot smaller. If you put up the binary I'll scan that for byte entropy too.
I've never heard of storing the bit stream in an ASCII file. FPGA bit streams are binary data. But maybe I'm just not remembering. It has been a while since I mucked with it at that level.
I had a design compiled for the 3.3V core voltage version of a chip. There was also a 1.2V core voltage version, which was the same chip with the LDOs turned into bypass. Seems they use a bond wire to flip a bit in a status register the JTAG reads to distinguish the two. But the JTAG software checks, and you need a file that matches the ID value. I had to find this ID and then recompute the checksum (maybe a simple 8-bit add, rather than a CRC). I think that was an ASCII file, now that I think of it. I guess they use ASCII to make it easier to see what's what if you have to view it.
But the underlying data is the binary equivalent of the ASCII, so the 3:1 gain of turning the ASCII data into binary is not really relevant. That's more a matter of discarding the pretty-printing formatting. In fact, I'm pretty sure the Xilinx bit streams I've seen are binary. There was no translation in sending them to the chips. I expect this file is a .hex simply for purposes of sharing.
The sparsity of non-zero data in the file gives you an idea of the amount of unused resources in FPGAs. That's why they need the latest fab processes to be economical. They have much more silicon area than virtually any other device for the amount of resources actually used.
Not really what? Spouting words without meaning again, you are such an adversarial dope.
I was talking about the relative code complexity of a compressor compared to its matching decompressor. A decompressor can be tiny, which is a quality JL seemed to be concerned with.
Eggs, grandma's. And you're actually even wrong; it's predictability, not repeatability, that matters. Compressors remove whatever is predictable (using whatever kind of prediction is appropriate, not just repetition).
Why do you have to respond so adversarially? This is on you.
I know exactly what you said. That was in response to
You snipped the useful content. Ok, fine, but why argue if you don't want to actually discuss anything I've said?
I'm simply pointing out that there are many compression algorithms that are very simple and do not, in any way, require "people who understand the theory." At least, nothing beyond the basics is required.
Sure, if you say so.
Yes, technically that's correct. Give yourself a star, you bested me!
In relation to the FPGA bit stream, there are long sequences of zeros, as someone has analyzed, but also other repeating sequences, which would yield much less additional compression if recognized. Recognizing the long strings of zeros obtains the first-order compression, and likely the second-order compression as well.
Rather than finding more compression, a more important point is recognizing that this is not static data: it will be updated when the design changes. That can result in significant growth of the compressed stream, so extra space needs to be provisioned, which makes the amount of compression less advantageous. Interestingly enough, even on a part with a high logic utilization factor, there will still be lots of zero bit strings. Much of FPGA real estate is unused, even in a dense design. More of the chip is routing than logic, and the routing almost never has high utilization.
Are you done with your tantrum yet? I'm happy to discuss this rationally, if you are.
I'm thinking of designing some small boxes with a Raspberry Pi Pico as the computer. Here's the rough idea:
formatting link
The Pi has only 2 MB of flash and 256KB of sram. An uncompressed T20 binary (not hex) bit stream would use about 1/3 of the flash. Compressing that by even 2:1 would help. Looking at the hex, there are long runs of 0's, which are the obvious compression target.
There must be a zillion kids around who are already working with Pi's.
Of course we could use a separate serial flash chip connected directly to the FPGA to store the config, but that would be inelegant.
Possibly. There are a hell of a lot of nulls in the binary, ~70%. Some very long sequences of >15k nulls too. All the rest are a few hundred or less. My program sees 17 blocks of 15372 nulls in the file. (It is expecting a damaged JPEG.)
15372 = 2^2 · 3^2 · 7 · 61
Which seems to me a very odd random constant length for a block!
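FWIW, that kind of run scan is only a few lines of Python (`zero_runs` is my own throwaway helper):

```python
from collections import Counter

def zero_runs(data: bytes) -> Counter:
    """Count maximal runs of 0x00 bytes: returns {run length: occurrences}."""
    runs = Counter()
    n = 0
    for b in data:
        if b == 0:
            n += 1
        elif n:
            runs[n] += 1   # a run just ended
            n = 0
    if n:                  # run extending to end of file
        runs[n] += 1
    return runs
```

Run on the binary above, it should reproduce the 17 runs of 15372 nulls.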
I suspect that, a bit like with EPROM programmers, there are development tools around which expect a .HEX file. The binary would be much more meaningful for working out a compression strategy. FPGA isn't my thing.
It is more easily readable by a human I suppose.
Very probably.
I'm curious about the obvious walking-ones patterns in it.
The nulls I can account for as unused parts of the functionality. The length of them seems peculiar though (I expected 2^N).
My comment was about really random data. An FPGA bit stream certainly has repeated patterns. One might build a N-bit structure, a multiplier or accumulator or filter or DDS, and bit-slice blocks are very likely repeated N times.
Maybe I can find some college kid who'd like to do a project or thesis to find or code a minimal decompression algorithm for Efinix + Raspberry Pi, in exchange for some pittance.
I can imagine some dictionary-based thing where a dictionary entry is its own first occurrence in the bit file. The decompressor is basically scissors and a pot of glue.
I was thinking of just compressing runs of 0's, but there could be a few other smallish patterns that might not be horrible to stash in the decompressor dictionary. That presents the question, are there patterns that are common to *all* T20 bit streams?
That's interesting. Lots of flash and it looks like a USB C connector. We could buy a bunch and then design our own equivalent if it ever goes EOL.
We want an RTOS, USB, ethernet stack, javascript web pages, and a fair heap of application code. Maybe the standard Pi is underpowered, too much risk.
The RP2040 has a 16 KB execute-in-place cache, which executes code from the serial flash. That's not going to be super fast, but when we need number crunching we would do that in the FPGA. Many of the boxes that I'm considering will be slow and not even need an FPGA.
I don't program FPGAs any more; I have kids that learn and fight the tools. I prefer to architect products and draw schematics. I asked one of my engineers to give me an efinix config file and that hex thing was it.
Eventually we'd build the application code (compiled c) and the compressed FPGA config into a single file. The Pi has a boot mode that makes it look like a memory stick and we'd just drag the runtime file onto the Pi. That is just one very cool feature of the Pi. They seem to have done everything right.
ElectronDepot website is not affiliated with any of the manufacturers or service providers discussed here.