fast universal compression scheme and its implementation in VHDL

Hi folks,

my name is Jens and I am a student at the Technical University of Berlin. Through my course of study in microelectronics, VHDL design has become my favorite hobby. My other interests are signal processing and compression in general. Lately I purchased an FPGA evaluation board second-hand (guess where?) and I am now starting my first "private" implementations. Just to give you a short intro... ;-)

I am interested in implementing compression algorithms in VHDL on an FPGA. I want to build a data transmission system that compresses portions of the incoming data (not the whole stream, but "frames" of about 800 bytes) on the fly. In my search for a fast (i.e. real-time capable at a "desired" data rate of, let's say, 300 MHz?) "universal" compression scheme I ran into the following questions:

- Is there any free example code for compression algorithms available in VHDL, to get an overview and a first impression of the implementation complexity?

- What do you think are the most promising algorithms for my purpose (i.e. when the statistics and semantics of the input data are unknown)? My first thoughts were delta encoding, sorted RLE, LZ, ... (a small delta-encoding sketch follows this list).

- As the input data is unknown, the algorithm must be lossless, reducing redundancy (if possible) but not irrelevancy. What are the theoretical limits of "universal" compression that does not emphasize one particular statistical property (like similar byte values in succession)?

- What is meant by the keywords "systolic implementation" and "pipelining" in this particular context? I have come across them very often lately.

- What if my design achieves different compression ratios for consecutive data portions? Surely a FIFO can decouple the input and output rates, but eventually the FIFO will underflow?
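
To make the delta-encoding idea from the list above a bit more concrete, here is a minimal software sketch in C (the function names are mine, purely for illustration). On its own it saves nothing; it only turns "similar values in succession" into many small values for a later stage (RLE, Huffman, ...) to exploit:

#include <stdint.h>
#include <stddef.h>

/* Hypothetical names, illustration only.  Store the first byte as-is,
 * then the difference to the previous byte (modulo 256).  Lossless and
 * trivially invertible. */
static void delta_encode(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(in[i] - prev);
        prev = in[i];
    }
}

static void delta_decode(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        prev = (uint8_t)(prev + in[i]);
        out[i] = prev;
    }
}

In hardware this should reduce to little more than a register and a subtractor in the data path, so it ought to keep up with very high clock rates.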

Thanks in advance for your help and support; any comments, hints and help are appreciated!

Bye Jens

P.S.: I'm looking for the standard works "Sayood, Khalid: Introduction to Data Compression, Academic Press, 199x or 200x" and/or "D. Salomon: Data Compression, Springer-Verlag, New York, 200x". Are there any sources for an electronic copy (ps, pdf, etc.) or transcriptions?

Reply to
Jens Mander

Have a look at bzip2: bzip2 compresses files using the Burrows-Wheeler block sorting text compression algorithm, and Huffman coding. Compression is generally considerably better than that achieved by more conventional LZ77/LZ78-based compressors, and approaches the performance of the PPM family of statistical compressors.

It's the most effective one I have stumbled upon so far in terms of bytes saved. I have, however, not done any deep research into this ;) It can take a considerable amount of processor time, though, which makes it a good candidate for hardware implementation.
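
To make that block-sorting idea a bit more concrete, here is a rough C sketch of just the forward Burrows-Wheeler transform (names are mine, it is O(n^2 log n) and strictly a toy; bzip2 itself adds a run-length pre-pass, move-to-front and Huffman coding on top, and uses a much smarter sort):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Naive Burrows-Wheeler forward transform: sort all rotations of the
 * block and output the last column plus the "primary index" needed by
 * the inverse transform.  The block is doubled so rotations can be
 * compared with a single memcmp. */
static const unsigned char *g_buf;  /* doubled block */
static size_t g_n;                  /* block length  */

static int cmp_rot(const void *a, const void *b)
{
    size_t i = *(const size_t *)a, j = *(const size_t *)b;
    return memcmp(g_buf + i, g_buf + j, g_n);
}

static size_t bwt_forward(const unsigned char *in, unsigned char *out, size_t n)
{
    unsigned char *dbl = malloc(2 * n);
    size_t *rot = malloc(n * sizeof *rot);
    size_t primary = 0;

    memcpy(dbl, in, n);
    memcpy(dbl + n, in, n);
    for (size_t i = 0; i < n; i++) rot[i] = i;

    g_buf = dbl; g_n = n;
    qsort(rot, n, sizeof *rot, cmp_rot);

    for (size_t i = 0; i < n; i++) {
        out[i] = dbl[rot[i] + n - 1];   /* last char of rotation rot[i] */
        if (rot[i] == 0) primary = i;   /* where the original block landed */
    }
    free(dbl); free(rot);
    return primary;
}

int main(void)
{
    const unsigned char msg[] = "banana_banana";
    unsigned char out[sizeof msg - 1];
    size_t idx = bwt_forward(msg, out, sizeof msg - 1);
    printf("last column: %.*s  primary index: %zu\n",
           (int)(sizeof msg - 1), out, idx);
    return 0;
}

The transformed block tends to contain long runs of identical bytes, which is what the later stages then compress. Note that the sort works on whole blocks (bzip2 uses 100k to 900k), which is one reason it is memory-hungry compared to LZ-style schemes.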

Reply to
pbdelete

I too am interested in this, and I would be interested to see the results of Jens's findings and work in this area if he decides to pursue it and publish it.

Josh

Reply to
Joshua K Drumeller

That is in turn surpassed by 7-zip:

formatting link
They use an improved LZ77, but a PPM option is also available; it compresses more but is slower. See
formatting link

Again, source code is available.
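
For what it's worth, the core of an LZ77-style coder fits in a few lines of C: search a window of already-seen bytes for the longest match and emit (distance, length) pairs or literals. The sketch below is a brute-force toy with made-up names; real coders such as zlib or 7-zip's LZMA use hash chains or binary trees for the match search and entropy-code the tokens afterwards.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Toy greedy LZ77 matcher: brute-force search of the previous WINDOW
 * bytes for the longest match of at least MIN_MATCH bytes.  O(n * WINDOW),
 * so only meant to show the principle. */
#define WINDOW    4096
#define MIN_MATCH 3
#define MAX_MATCH 258

static void lz77_tokens(const uint8_t *in, size_t n)
{
    size_t pos = 0;
    while (pos < n) {
        size_t best_len = 0, best_dist = 0;
        size_t start = (pos > WINDOW) ? pos - WINDOW : 0;

        for (size_t cand = start; cand < pos; cand++) {
            size_t len = 0;
            while (len < MAX_MATCH && pos + len < n &&
                   in[cand + len] == in[pos + len])
                len++;
            if (len > best_len) {
                best_len = len;
                best_dist = pos - cand;
            }
        }
        if (best_len >= MIN_MATCH) {
            printf("(dist=%zu, len=%zu)\n", best_dist, best_len);
            pos += best_len;
        } else {
            printf("literal 0x%02x\n", in[pos]);
            pos++;
        }
    }
}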

Thomas

Reply to
Zak

"Jens Mander" ...

AFAIK compression on such small blocks is often inefficient: an adaptive coder barely has time to learn the statistics, and any per-block overhead is amortized over only a few hundred bytes. The rest strongly depends on the type of data.

Try to find a good and simple compression scheme for the bitstreams used to program FPGAs. Then please publish a link to your thesis and the algorithm here.

Arie de Muynck

Reply to
Arie de Muynck

Interesting project. It depends on what your emphasis is, i.e. whether it is useful to have very high savings when that is possible, even if it doesn't happen very often, or whether it is just necessary to make moderate savings as much of the time as possible. You might make very big savings on certain types of data, but the greater the extent to which you can recognise redundant patterns etc., the more the overhead increases, so that it becomes worse for data that is more random (or even already compressed).

If you have FIFOs with fixed but different rates, you have to evaluate what needs to happen if or when your data is uncompressible (one way of bounding that case is sketched a bit further down).

The obvious thing to my mind is reducing strings of all-zero (or all-one) bits or bytes, or alternating such bytes, which seem to appear a lot in wave files etc., or even strings of any identical bytes. If you take this further you end up with the usual Huffman encoding (which many of the other schemes are based on).
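
A bare-bones software version of that run-of-identical-bytes idea might look like this in C (only a sketch; the names and the escape value are made up, and in hardware it is essentially a comparator plus a run counter):

#include <stdint.h>
#include <stddef.h>

/* Minimal run-length encoder: a run of 4..258 identical bytes becomes
 * ESC, value, count-4.  A literal ESC byte is written as ESC, ESC, 0xFF
 * (count 0xFF is reserved to mean "this really was an ESC byte").
 * Worst case the output grows to 3*n bytes (every input byte an isolated
 * ESC), so the caller must size the output buffer accordingly -- which
 * is exactly the "uncompressible data" problem mentioned above. */
#define ESC 0xA5

static size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 258)
            run++;
        if (run >= 4) {
            out[o++] = ESC;
            out[o++] = in[i];
            out[o++] = (uint8_t)(run - 4);
            i += run;
        } else if (in[i] == ESC) {
            out[o++] = ESC; out[o++] = ESC; out[o++] = 0xFF;
            i++;
        } else {
            out[o++] = in[i++];
        }
    }
    return o;   /* number of output bytes */
}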

It depends how much you can do in hardware, but you could try doing many different such simple processes in parallel on the data and see which ends up with the most compression. This is probably the key to taking advantage of the hardware approach?
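
A rough sketch of that pick-the-best idea, in C for readability (all names and the 800-byte frame size are only assumptions): run the candidate filters on each frame, keep the shortest result, and prepend a one-byte method tag, with a raw copy as the fallback so a frame can never grow by more than one byte. That also bounds the worst case for the uncompressible-data / FIFO question above.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define FRAME_MAX 800
enum { METHOD_RAW = 0, METHOD_ZERO_RLE = 1 };

/* candidate: collapse runs of zero bytes into  0x00, run-length
 * (may expand on data with isolated zeros -- the selector handles that) */
static size_t cand_zero_rle(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        if (in[i] == 0x00) {
            size_t run = 1;
            while (i + run < n && in[i + run] == 0x00 && run < 255) run++;
            out[o++] = 0x00;
            out[o++] = (uint8_t)run;
            i += run;
        } else {
            out[o++] = in[i++];
        }
    }
    return o;
}

/* Encode one frame; out must hold n + 1 bytes.  Returns bytes written. */
static size_t encode_frame(const uint8_t *in, size_t n, uint8_t *out)
{
    uint8_t tmp[2 * FRAME_MAX];             /* zero-RLE can expand to 2n  */
    size_t best_len = n;                    /* raw copy is the baseline   */
    uint8_t best_method = METHOD_RAW;

    size_t len = cand_zero_rle(in, n, tmp); /* add delta, RLE, LZ, ... here,
                                               in hardware all in parallel */
    if (len < best_len) { best_len = len; best_method = METHOD_ZERO_RLE; }

    out[0] = best_method;                   /* tag tells the decoder what to undo */
    memcpy(out + 1, best_method == METHOD_RAW ? in : tmp, best_len);
    return best_len + 1;
}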

Many file storage systems use compression to reduce blocks of data (I think NTFS uses 16k blocks). I was once interested in high-level compression, even going as far as finding identical files and eliminating that redundancy.

Colin =^.^=

Reply to
colin
