lossless compression in hardware: what to do in case of uncompressibility?

- Denkedran Joe
  posted Thu, Nov 29, 2007 2:42 PM

Hi all,

I'm working on a hardware implementation (FPGA) of a lossless compression algorithm for a real-time application. The data will be fed into the system, compressed on the fly, and then transmitted further.

The average compression ratio is 3:1, so I'm gonna use some FIFOs of a certain size and start reading data out of the FIFO after a fixed startup time. The readout rate will be 1/3 of the input data rate. The size of the FIFOs is determined by the experimental variance of the mean compression ratio. Nonetheless, there are possible circumstances in which no compression can be achieved. Since the overall system does not support variable bitrates, a faster transmission is no solution here.
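To get a feel for the FIFO sizing, the fill level can be simulated block by block. This is only a sketch under assumed numbers: the function `simulate_fifo`, the block size of 300 input words and the two-block startup delay are made up for illustration and are not taken from the actual design.

```python
def simulate_fifo(compressed_sizes, drain_per_block, startup_blocks):
    """Track FIFO fill level (in words) with a fixed-rate readout.

    compressed_sizes: words written to the FIFO for each input block
    drain_per_block:  words read out during one block time (fixed rate)
    startup_blocks:   block times to wait before readout starts
    Returns (peak_fill, underflow_count).
    """
    fill = 0
    peak = 0
    underflows = 0
    for i, produced in enumerate(compressed_sizes):
        fill += produced                 # compressor output for this block
        peak = max(peak, fill)
        if i >= startup_blocks:          # readout runs only after startup
            if fill >= drain_per_block:
                fill -= drain_per_block
            else:
                underflows += 1          # transmitter would run dry
                fill = 0
    return peak, underflows


# Assumed example: 300-word input blocks, readout at 1/3 of the input rate.
steady = [100] * 20                               # every block hits 3:1
burst = [100] * 5 + [300] * 10 + [100] * 5        # 10 uncompressible blocks

print(simulate_fifo(steady, drain_per_block=100, startup_blocks=2))  # (300, 0)
print(simulate_fifo(burst, drain_per_block=100, startup_blocks=2))   # (2300, 0)
```

With a steady 3:1 ratio the fill level settles at a fixed peak. Each uncompressible block adds a net 200 words, which only drain again if later blocks compress better than 3:1, so no finite FIFO covers an arbitrarily long uncompressible run.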

So my idea was to put the question to all of you what to do in case of uncompressibility? Any ideas?

Denkedran Joe

- Boudewijn Dijkstra
  posted Mon, Dec 3, 2007 9:14 AM

On Thu, 29 Nov 2007 15:42:45 +0100, Denkedran Joe wrote:

Given that uncompressible data often resembles noise, you have to ask yourself: what would be lost?

If you can estimate the compression ratio beforehand and then split the stream into a 'hard' part and an 'easy' part, then you have a way to retain the average.

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/

- Phil Carmody
  posted Mon, Dec 3, 2007 11:28 AM

*Much* more information than if the signal was highly redundant.

Yeah, right. And if you juggle the bowling balls when crossing the rope-bridge, you'll not break the bridge.

Phil

--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration

- rickman
  posted Mon, Dec 3, 2007 5:27 PM

The message! Just because the message "resembles" noise does not mean it has no information. In fact, just the opposite. Once you have a message with no redundancy, you have a message with optimum information content and it will appear exactly like noise.

Compression takes advantage of the portion of a message that is predictable based on what you have seen previously in the message. This is the content that does not look like noise. Once you take advantage of this and recode to eliminate it, the message looks like pure noise and is no longer compressible. But it is still a unique message with information content that you need to convey.

Doesn't that require sending additional information that is part of the message? On average, this will add as much to the message as you are removing, if not more...

If you are trying to compress data without loss, you can only compress the redundant information. If the message has no redundancy, then it is not compressible and, with *any* coding scheme, will require more bandwidth than if it were not coded at all.

Think of your message as a binary number of n bits. If you want to compress it to m bits, you can identify the 2**m most often transmitted numbers and represent them with m bits. But the remaining numbers cannot be transmitted in m bits at all. If you want to send those you have to have a flag that says, "do not decode this number". Now you have to transmit all n or m bits, plus the flag bit. Since there are 2**n-2**m messages with n+1 bits and 2**m messages with m+1 bits, I think you will find the total number of bits is no less than just sending all messages with n bits. But if the messages in the m bit group are much more frequent, then you can reduce your *average* number of bits sent. If you can say you will *never* send the numbers that aren't in the m bit group, then you can compress the message losslessly in m bits.
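The counting in the paragraph above can be checked numerically. A minimal sketch, where `flagged_total_bits` and `expected_bits` are names made up here and n = 8, m = 4 are arbitrary example values:

```python
def flagged_total_bits(n, m):
    """Bits needed to send every n-bit message exactly once with a flag bit.

    2**m messages go out as flag + m bits, the remaining
    2**n - 2**m messages go out as flag + n bits.
    """
    short = 2**m * (m + 1)
    long = (2**n - 2**m) * (n + 1)
    return short + long


def expected_bits(n, m, p_short):
    """Average bits per message if a message falls in the m-bit group
    with probability p_short."""
    return p_short * (m + 1) + (1 - p_short) * (n + 1)


n, m = 8, 4
print(flagged_total_bits(n, m))             # 2240
print(2**n * n)                             # 2048, all messages sent uncoded
print(expected_bits(n, m, 2**m / 2**n))     # 8.75, uniform case, worse than 8
print(expected_bits(n, m, 0.9))             # about 5.4, skewed case, better than 8
```

The excess over plain n-bit coding is 2**n - 2**m * (n - m), which is positive for every m < n, so the flagged scheme only pays off when the m-bit group is actually sent much more often than the rest.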

- Boudewijn Dijkstra
  posted Wed, Dec 5, 2007 8:57 AM

If you could, yes. Costs put limits on the available processing power.

In the context of the OP's hardware implementation, you may be able to distribute these two streams over the available output pins without sending extra bits.

If compression saves 'a lot of' bits and flagging needs 'a few' bits, then it will not "add as much, if not more to the message than [I am] removing..." Your description below only applies to certain compression algorithms, so any conclusion derived from it may or may not apply to the general case.
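As a rough illustration of how cheap the flag can be, here is a per-block fallback sketch. Assumptions: `zlib` only stands in for the OP's unspecified algorithm, and the one-byte flag and the names `encode_block` / `decode_block` are invented for the example. A hardware design would more likely spend a single bit per block.

```python
import random
import zlib

RAW, PACKED = 0, 1   # hypothetical one-byte block flags


def encode_block(block: bytes) -> bytes:
    """Send the compressed form only if it is actually smaller."""
    packed = zlib.compress(block)
    if len(packed) < len(block):
        return bytes([PACKED]) + packed
    return bytes([RAW]) + block          # fall back to the raw block


def decode_block(frame: bytes) -> bytes:
    """Undo encode_block using the leading flag byte."""
    flag, payload = frame[0], frame[1:]
    return zlib.decompress(payload) if flag == PACKED else payload


rng = random.Random(1)                   # fixed seed, repeatable demo
noisy = bytes(rng.getrandbits(8) for _ in range(4096))
easy = b"abc" * 1000

for name, block in (("noisy", noisy), ("easy", easy)):
    frame = encode_block(block)
    assert decode_block(frame) == block  # lossless round trip
    print(name, len(block), "->", len(frame))
```

The worst case is then one flag per block on top of the raw size, so the expansion is bounded. An uncompressible block still never gets smaller than its raw size, though, and the fixed-rate link has to cope with that.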


- comp.arch.fpga
  posted Wed, Dec 5, 2007 9:46 AM

ROTFL. Did you even read it? He outlined the formal proof that I was referring to in a little more detail.

This proof shows that for ANY lossless algorithm there is an input that can't be compressed. I find it rather funny that you counter that proof by the assertion that it only applies to certain algorithms.
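For reference, the counting (pigeonhole) argument behind this can be written down in a few lines, for inputs of exactly $n$ bits:

```latex
\begin{align*}
\#\{\text{inputs of length } n \text{ bits}\} &= 2^n \\
\#\{\text{outputs shorter than } n \text{ bits}\} &= \sum_{k=0}^{n-1} 2^k = 2^n - 1
\end{align*}
```

Since $2^n > 2^n - 1$, a lossless (one-to-one) encoder must map at least one $n$-bit input to an output of $n$ bits or more, whatever the algorithm is.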

For the fun of it: Would you be so kind as to present a single example of a compression algorithm that the proof does not apply to? Could be worth a PhD if you manage.

Kolja Sulimma

- Boudewijn Dijkstra
  posted Wed, Dec 5, 2007 1:13 PM

Yes. Multiple times.

I didn't mean to counter the proof itself, only claims about the relation between compression ratio and the bandwidth needed to split a stream.

Nah. I'd rather waste my time on something else. :)


- rickman
  posted Wed, Dec 5, 2007 4:13 PM

That is irrelevant to the conversation. No one has mentioned data rates, processing power or time requirements. So there is no point in raising issues that we have no information on. But my point remains. If your algorithm can separate the true noise from signal that merely looks like noise, then you would just toss the real noise and improve the signal while compressing at the same time. That is my point and from what you have said, it requires no extra processing, but is part of your compression.

If the OP has two streams, one for compressible signal and one for uncompressible signal, then he could just send the original message over the uncompressible channel and avoid the issue of compression altogether.

Compression can only save bits in the subset of signals that actually are reducible. If you do the math you will find that if the signal is randomly distributed, any coding scheme cannot reduce the total number of bits sent. It is only when some signal patterns are more frequent than others that you can exploit the non-randomness of the signal to compress it into fewer bits.
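One way to put a number on "more frequent than others" is the zero-order Shannon entropy of the symbol frequencies. A minimal sketch, where `entropy_bits_per_symbol` is a name made up here and the two byte strings are toy inputs, not real signal data:

```python
import math
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Zero-order Shannon entropy of the byte frequencies, in bits/symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


uniform = bytes(range(256)) * 4           # every byte value equally frequent
skewed = b"\x00" * 900 + b"\x01" * 100    # one value nine times as frequent

print(entropy_bits_per_symbol(uniform))   # 8.0, nothing to gain per symbol
print(entropy_bits_per_symbol(skewed))    # about 0.469
```

The first source leaves nothing to exploit at this level; the second could in principle be coded in under half a bit per symbol on average. This only looks at symbol frequencies, so correlations between neighbouring symbols are ignored.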

I haven't said anything about algorithms, so everything I have said on this applies to *all* compression algorithms. My discussion below is intended to apply to the general case. That is why I reduce the discussion to one of compressing a large number to a smaller number.