Why did Shannon only consider sequences satisfying the same distribution when computing capacity? What happens if we are allowed to pick sequences satisfying any distribution?
These questions are motivated by the recent paper "A brief introduction on Shannon's information theory" by Chen.
This is about whether Shannon's limit can be broken or not.
If by "sequences satisfying the same distribution" you mean that Shannon put some constraint other than dimensionality on coding -- no, he did not, that assertion is incorrect.
If by that phrase you mean why did he not consider non-Gaussian noise, it's because analysis with non-Gaussian noise is difficult, and his paper was already a giant one.
Other people have investigated Shannon-type limits in the presence of non-Gaussian noise. In general, you can often do better than a superficial reading of Shannon's paper would indicate, but all of this is _already known_, _has been known for decades_, and _was not written by people looking for investors_.
There are no perpetual motion machines. If you're an investor, beware. If you're a shill, shame on you.
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
I'm looking for work -- see my website!
I read Shannon's paper again. The channel capacity limit is obtained in the following way, if I understand correctly: suppose we have an alphabet of n letters and each appears with probability p_i. Now we consider sequences of these letters in which each letter appears with the frequency specified by p_i. Now we count how many sequences sharing this same distribution we can pick such that they are distinguishable after going through the channel. Surely, each distribution gives such a number. The capacity of the channel is then found by choosing the p_i that maximize this number.
Is this correct?
If this is correct, then by "same distribution" I mean the optimal distribution p_i giving the maximum number of distinguishable sequences.
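The counting argument described above can be checked numerically for a small alphabet. The sketch below (illustrative parameters, not from the thread) counts binary sequences whose empirical frequency matches a biased distribution and compares that count with the AEP estimate 2^(nH):

```python
import math

# Biased binary alphabet -- illustrative values, not from the thread.
p = [0.8, 0.2]   # p[i] = probability of letter i
n = 20           # sequence length

# Shannon entropy in bits per symbol.
H = -sum(pi * math.log2(pi) for pi in p)

# Count sequences whose count of 1s is within n*eps of n*p[1];
# these form the "typical set" of the asymptotic equipartition property.
eps = 0.05
typical = sum(math.comb(n, k) for k in range(n + 1)
              if abs(k - n * p[1]) <= n * eps + 1e-9)

print(f"H = {H:.4f} bits/symbol")
print(f"typical sequences: {typical}, versus 2^(nH) ~ {2 ** (n * H):.0f}")
```

With these numbers the typical set has 21,489 members while 2^(nH) is roughly 2.2e4, so even at n = 20 the count is already close to the AEP estimate.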
Shannon's capacity theorem assumes that your p_i = 1/n for all i. For data that does not, you compress the data. Shannon proved that, too. Google for "Shannon" and "entropy".
--
Tim Wescott
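The entropy-as-compression-limit point is easy to sanity-check on a toy alphabet; the distributions below are illustrative, not anything from the thread:

```python
import math

def entropy(p):
    """Shannon entropy in bits per symbol (terms with p_i = 0 contribute 0)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# A four-letter alphabet with a skewed distribution compresses below the
# naive 2 bits/symbol; the uniform distribution does not.
skewed = [0.5, 0.25, 0.125, 0.125]
uniform = [0.25] * 4

print(entropy(skewed))   # 1.75 bits/symbol
print(entropy(uniform))  # 2.0 bits/symbol
```

A Huffman code for the skewed source (codeword lengths 1, 2, 3, 3) achieves the 1.75 bits/symbol figure exactly, since every probability is a power of two.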
The most important 'but' here is that the Gaussian result is soluble, and the small-noise approximation converges to that result as a limit. Gaussian equals the small-noise limit in most real cases.
Also, the central limit theorem tells us to expect a Gaussian distribution; those big displays of balls bouncing from pins are a great example.
Blessed are those that would denigrate Shannon's Theorems, for they shall be forever labeled shamans >:-} ...Jim Thompson
--
| James E.Thompson | mens |
| Analog Innovations | et |
| Analog/Mixed-Signal ASIC's and Discrete Systems | manus |
| STV, Queen Creek, AZ 85142 Skype: skypeanalog | |
| Voice:(480)460-2350 Fax: Available upon request | Brass Rat |
| E-mail Icon at http://www.analog-innovations.com | 1962 |
Thinking outside the box... producing elegant solutions.
The central limit theorem tells us to expect Gaussian, but experiment often tells us otherwise. In particular, if you add up a bazzilion teeny random variables then the result will tend to Gaussian -- unless even one of those teeny random variables has an infinite variance, in which case all of the averaging in the world isn't going to make it finite.
Doesn't make much difference at 300MHz, but it sure does at 300kHz.
--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
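Tim's caveat is easy to demonstrate in simulation. The sketch below (illustrative, with an assumed fixed seed) compares sample means of a finite-variance uniform variable with those of a Cauchy variable, whose variance is infinite:

```python
import math
import random
import statistics

random.seed(0)  # assumed fixed seed so the run is repeatable

def sample_means(draw, n, trials=300):
    """Sample means of n draws, repeated `trials` times."""
    return [sum(draw() for _ in range(n)) / n for _ in range(trials)]

def uniform_rv():
    return random.uniform(-1.0, 1.0)      # finite variance (1/3)

def cauchy_rv():
    # Standard Cauchy via the inverse CDF -- infinite variance.
    return math.tan(math.pi * (random.random() - 0.5))

# Finite variance: the spread of the sample mean shrinks like 1/sqrt(n).
u10 = statistics.stdev(sample_means(uniform_rv, 10))
u1000 = statistics.stdev(sample_means(uniform_rv, 1000))

# Infinite variance: the mean of n Cauchy draws is itself standard Cauchy,
# so no amount of averaging shrinks it.
c10 = statistics.stdev(sample_means(cauchy_rv, 10))
c1000 = statistics.stdev(sample_means(cauchy_rv, 1000))

print(f"uniform: spread of mean, n=10: {u10:.3f}   n=1000: {u1000:.3f}")
print(f"cauchy:  spread of mean, n=10: {c10:.3f}   n=1000: {c1000:.3f}")
```

The uniform spreads drop by roughly a factor of ten going from n = 10 to n = 1000; the Cauchy spreads stay large no matter how big n gets, which is the point about a single infinite-variance term ruining the averaging.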
Tim, are you talking about the data compression rate? In that case, surely we consider each probability distribution separately. However, for channel capacity, it cannot always be 1/n that achieves the capacity.
Go read Shannon's 1948 paper. Come back when you fully understand it.
--
Tim Wescott
Is that the polite way of saying, "Go away and don't come back" ?>:-} ...Jim Thompson
Only if he doesn't apply nose to grindstone and read the paper.
And by "read" I don't mean just let his eyes travel over the words.
This is explained in a bazzilion different books on communications theory. Granted, it's not something that you can understand without effort, but it's there to understand.
--
Tim Wescott
This is so arrogant. Anyway, claiming that 1/n always achieves channel capacity for any channel -- you must be kidding.
If that is the case, just tell me what is the meaning of the formula C = max_X (H(X) - H(X|Y))? Just assuming X to be the 1/n distribution would do the job, in your logic.
Maybe you are a super expert and do not want to explain such a fundamental question; in that case, just do not come back and answer in an arrogant manner. That would be appreciated!
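The asymmetric case can be checked directly. The sketch below (a Z-channel with an assumed crossover probability of 1/2, chosen for illustration) scans over all input distributions and finds that the capacity-achieving input is not uniform:

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

f = 0.5  # assumed crossover: input 1 is received as 0 with probability f

def mutual_info(q):
    """I(X;Y) in bits for P(X=1) = q over the Z-channel."""
    p_y1 = q * (1 - f)           # only input 1 can produce output 1
    return h2(p_y1) - q * h2(f)  # H(Y) - H(Y|X); input 0 adds no noise

# Maximize over ALL input distributions by brute-force scan.
best_q = max((i / 10000 for i in range(10001)), key=mutual_info)

print(f"capacity-achieving P(X=1) = {best_q:.3f}")  # 0.400, not 0.5
print(f"capacity              = {mutual_info(best_q):.4f} bits")
print(f"rate at uniform input = {mutual_info(0.5):.4f} bits")
```

At f = 1/2 the optimum is exactly P(X=1) = 0.4 and the capacity is log2(5) - 2, about 0.322 bits, while the uniform input only manages about 0.311 bits -- so the uniform input is not optimal for every channel, only for symmetric ones.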
Equal probability for each state with delta-function autocorrelation achieves maximum entropy, i.e. maximum information per bit. Any other distribution would _reduce_ channel capacity.
Cheers
Phil Hobbs
--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC
Optics, Electro-optics, Photonics, Analog Electronics
160 North State Road #203
Briarcliff Manor NY 10510
hobbs at electrooptical dot net
http://electrooptical.net
If the channel is capable of representing n distinct symbols then a data coding that maximises your surprise at seeing the next symbol is as good as you can ever get. Namely the maximum entropy solution where all symbols are equally likely to occur in your data.
What "Y" are you taking as being known a priori here?
If there is channel dependent noise then obviously the bad channels have to carry less data, but the original model was simpler.
The Shannon entropy paper is still worth reading:
formatting link
Starts with the noiseless case and then proceeds to add noise.
That is a very rude and arrogant thing to say. If he has some misunderstandings, why shouldn't he come here to discuss them? You are not the SED overlord. If you don't wish to discuss the topic with him, then I suggest that *you* not post. :-P
Thanks. 'Y' is the probability distribution at the output side of the given channel. Please look at pages 22-24 in Shannon's paper (the subscript `y` there). Of course, I know that for some channels the capacity-achieving input essentially does not depend on the channel itself; in those cases you just need to maximize the entropy of the input, just like AWGN. However, I am talking about general channels, and about whether there is any generalization of Shannon's theory.
I did not say Shannon's theory is incorrect. I am asking what happens if all probability distributions are taken into consideration. If only one probability distribution may be picked (see my first two posts for the exact meaning), Shannon's result is perfectly right. No doubt.
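For a general discrete memoryless channel, the maximization over all input distributions that defines C = max_X (H(X) - H(X|Y)) can be carried out numerically; the standard tool is the Blahut-Arimoto algorithm. A minimal sketch (the transition matrix at the bottom is an assumed Z-channel example, not anything from the thread):

```python
import math

def blahut_arimoto(P, iters=2000):
    """Capacity in bits of a discrete memoryless channel with transition
    matrix P[x][y] = P(y|x), maximized over ALL input distributions."""
    nx, ny = len(P), len(P[0])
    q = [1.0 / nx] * nx  # start from the uniform input distribution
    for _ in range(iters):
        # Output distribution induced by the current input q.
        r = [sum(q[x] * P[x][y] for x in range(nx)) for y in range(ny)]
        # w[x] = exp(D(P(.|x) || r)), the per-input information gain.
        w = [math.exp(sum(P[x][y] * math.log(P[x][y] / r[y])
                          for y in range(ny) if P[x][y] > 0))
             for x in range(nx)]
        z = sum(q[x] * w[x] for x in range(nx))
        q = [q[x] * w[x] / z for x in range(nx)]
    return math.log2(z), q  # capacity (bits) and the optimal input

# Z-channel with crossover 1/2: the optimum is NOT the uniform input.
C, q = blahut_arimoto([[1.0, 0.0], [0.5, 0.5]])
print(f"capacity = {C:.4f} bits, optimal input = [{q[0]:.3f}, {q[1]:.3f}]")
```

On this example the algorithm recovers P(X=1) close to 0.4 and C close to 0.322 bits, the known Z-channel optimum; for a symmetric channel it returns the uniform input, which is the special case being argued about above.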