Do you have a question? Post it now! No Registration Necessary
- Subject
- Posted on
- algorithm that can distinguish between a sequence of PI digits and a sequence of pseudoran...
- 10-28-2015
posted on
October 28, 2015, 1:04 am
October 28, 2015, 1:04 am
Hi,
Is it possible to distinguish between digits of PI and digits of a
pseudorandom sequence, without using any PI digit lookup table or
calculating digits of PI? For example, a 100,000 digit sequence of
consecutive digits in PI from any starting point compared to a
sequence of 100,000 pseudorandom digits.
cheers,
Jamie
Is it possible to distinguish between digits of PI and digits of a
pseudorandom sequence, without using any PI digit lookup table or
calculating digits of PI? For example, a 100,000 digit sequence of
consecutive digits in PI from any starting point compared to a
sequence of 100,000 pseudorandom digits.
cheers,
Jamie
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
I have an algorithm I made that gets something around a ~1% bias,
varying on comparison sequence length, when comparing digits of PI to
digits of my pseudorandom sequence so was just curious! :D
cheers,
Jamie
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
Try entering "convert pi to base e" in www.wolframalpha.com...
Or "convert pi to base pi/e" and even "convert pi to base pi/sqrt(2)"
Some interesting results ;-)
--
Cheers,
Chris.
Cheers,
Chris.
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 10/27/2015 7:23 PM, John Larkin wrote:
Hi,
There are definitely some periodic patterns in pi, the algorithm
I made is a compression algorithm that compresses based on periodicity,
and pi is about 1% more compressed than the pseudorandom sequence
up to the number of digits I have been able to analyze so far,
1million digit sequence length.
I tested it at many different scales and starting points in the
sequence, here are the results if interested:
The column on the left, 0, 100, 1000, 10000 etc.. is the first
element of the sequence that was compressed, and then the top row:
100, 1000, 10000 (up to 800000) is the total length of consecutive
digits that were compressed.
pseudorandom sequence results:
https://www.dropbox.com/s/cx9ht50hfwd67vo/20151027%20pseudorandom%20compressed%20sequence%20comparison%20results.png
pi sequence results:
https://www.dropbox.com/s/ts9vbd1hy5x4wa1/20151027%20PI%20compressed%20sequence%20comparison%20results.png
graph showing the percent different compression of the pi sequences
versus the pseudorandom sequences:
https://www.dropbox.com/s/y7wmelr6fhkt2h0/20151027%20PI%20vs%20pseudorandom%20compression%20comparison.png
I would like to test it on some sequences with longer lengths than
1million, ideally up to 10million at least, to see what that curve
does.
My compression algorithm can identify periodic patterns in sequences
so that is what was detected in the pi sequence, whereas the pseudorandom
sequence is the noisiest sequence I compressed so far.
If anyone has a sequence they think might be noisier (less periodic)
than the pseudorandom sequence I tested please send me a link to it
and I will compress it to see how it compares to the pseudorandom one.
Also any other sequences considered random that you think might have
some patterns are welcome :D
I will release the algorithm opensource at some point if it is useful.
cheers,
Jamie
Hi,
There are definitely some periodic patterns in pi, the algorithm
I made is a compression algorithm that compresses based on periodicity,
and pi is about 1% more compressed than the pseudorandom sequence
up to the number of digits I have been able to analyze so far,
1million digit sequence length.
I tested it at many different scales and starting points in the
sequence, here are the results if interested:
The column on the left, 0, 100, 1000, 10000 etc.. is the first
element of the sequence that was compressed, and then the top row:
100, 1000, 10000 (up to 800000) is the total length of consecutive
digits that were compressed.
pseudorandom sequence results:
https://www.dropbox.com/s/cx9ht50hfwd67vo/20151027%20pseudorandom%20compressed%20sequence%20comparison%20results.png
pi sequence results:
https://www.dropbox.com/s/ts9vbd1hy5x4wa1/20151027%20PI%20compressed%20sequence%20comparison%20results.png
graph showing the percent different compression of the pi sequences
versus the pseudorandom sequences:
https://www.dropbox.com/s/y7wmelr6fhkt2h0/20151027%20PI%20vs%20pseudorandom%20compression%20comparison.png
I would like to test it on some sequences with longer lengths than
1million, ideally up to 10million at least, to see what that curve
does.
My compression algorithm can identify periodic patterns in sequences
so that is what was detected in the pi sequence, whereas the pseudorandom
sequence is the noisiest sequence I compressed so far.
If anyone has a sequence they think might be noisier (less periodic)
than the pseudorandom sequence I tested please send me a link to it
and I will compress it to see how it compares to the pseudorandom one.
Also any other sequences considered random that you think might have
some patterns are welcome :D
I will release the algorithm opensource at some point if it is useful.
cheers,
Jamie
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 28/10/15 03:48, Jamie M wrote:
No, there are no periodic patterns in pi - at least, none found so far
(and others have tested /far/ better than you). It has not been proven
to be "normal" (meaning, in effect, that it's digits are random), but it
is strongly suspected.
Any given pseudorandom sequence will, of course, have some pattern.
And you will always expect to see some difference in the compressibility
of different sequences, even if they are random. The question is, are
those differences statistically relevant?
For example, if you toss a perfect coin 100 times, you should not be
surprised if you get 55 heads and 45 tails, and that is not an
indication of bias in the coin.
I am sure google will tell you. I believe you can easily find example
sequences to download for your tests.
No, there are no periodic patterns in pi - at least, none found so far
(and others have tested /far/ better than you). It has not been proven
to be "normal" (meaning, in effect, that it's digits are random), but it
is strongly suspected.
Any given pseudorandom sequence will, of course, have some pattern.
And you will always expect to see some difference in the compressibility
of different sequences, even if they are random. The question is, are
those differences statistically relevant?
For example, if you toss a perfect coin 100 times, you should not be
surprised if you get 55 heads and 45 tails, and that is not an
indication of bias in the coin.
I am sure google will tell you. I believe you can easily find example
sequences to download for your tests.
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
Jamie M wrote:
The patterns are nothing compared to the patterns in other irrationals.
e = 2.7 1828 1828 4590 4523
4 digits in a row repeat! And 45x290% and 45/223% (after a carry
increments the 3).
sqrt(2) = 1.4 14 213562373095 0488 01688
2 digits repeat at the start. Then 0 and 88, first with 2^2 in the
middle (0 4 88), then with 2^4 in the middle (0 16 88).
You find similar patterns in other square roots and the golden ratio
(related to sqrt of 5). But those numbers are all equal to mathematical
expressions. There is no mathematical expression equal to pi. It's
impossible for a mathematical expression to produce randomness, but pi
is defined by physical space, and physical space is intrinsically
capable of randomness.
Have you found obvious periodic patterns in pi? In the first 32 digits
there are no zeros and too many threes, but in the long run it appears
random AFAIK, and allegedly according to statistical analysis.
What kind of compression algorithm?
The patterns are nothing compared to the patterns in other irrationals.
e = 2.7 1828 1828 4590 4523
4 digits in a row repeat! And 45x290% and 45/223% (after a carry
increments the 3).
sqrt(2) = 1.4 14 213562373095 0488 01688
2 digits repeat at the start. Then 0 and 88, first with 2^2 in the
middle (0 4 88), then with 2^4 in the middle (0 16 88).
You find similar patterns in other square roots and the golden ratio
(related to sqrt of 5). But those numbers are all equal to mathematical
expressions. There is no mathematical expression equal to pi. It's
impossible for a mathematical expression to produce randomness, but pi
is defined by physical space, and physical space is intrinsically
capable of randomness.
Have you found obvious periodic patterns in pi? In the first 32 digits
there are no zeros and too many threes, but in the long run it appears
random AFAIK, and allegedly according to statistical analysis.
What kind of compression algorithm?
--
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 10/28/2015 5:24 AM, Tom Del Rosso wrote:
Hi,
The periodic pattern is a pattern apparently centered on each digit of
pi, where the periodicity decreases with possible sine wave modulation
as the surrounding digits are added to the sample.
The algorithm I used is a periodicity detecting compression algorithm.
It has no dictionary so is a bit strange, and is implemented with
recursion. I made the algorithm for basic AI research, and was testing
it on some sequences of numbers when I noticed the discrepancy in
compression between pi and pseudorandom sequences.
Here's the 32bit version of the recursive compressor code in C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
public class ChirpletInfo
{
public ChirpletInfo(int position)
{
this.Count = 1;
this.Position = position;
}
public int Count { get; private set; }
public int Position { get; private set; }
public void IncrementCount()
{
Count++;
}
}
public class ChirpletPair
{
public ChirpletPair(int v1, int v2)
{
this.Value1 = v1;
this.Value2 = v2;
}
public int Value1 { get; private set; }
public int Value2 { get; private set; }
public override bool Equals(object obj)
{
ChirpletPair p = obj as ChirpletPair;
if(p!= null)
{
return p.Value1 == this.Value1 &&
p.Value2 == this.Value2;
}
else { return false; }
}
public override int GetHashCode()
{
return Value1 * 31 + Value2 * 13;
}
public override string ToString()
{
return String.Format(",", Value1, Value2);
}
}
public class Node
{
public Node(int position, ChirpletPair pair)
{
this.Position = position;
this.Pair = pair;
}
public int Position { get; private set; }
public ChirpletPair Pair { get; private set; }
public List<int> Noise { get; private set; }
public List<int> GapOccurrence { get; private set; }
public List<Node> Children { get; private set; }
public override string ToString()
{
return String.Format("C()", Position,
Pair == null ? "null": Pair.ToString());
}
public string GetChirpletString(int depth = 0)
{
string indent = String.Concat(Enumerable.Range(0,
depth).Select(a => " ").ToList());
StringBuilder results = new StringBuilder();
results.AppendLine(indent + ToString() + GetNoiseString());
if(Children!= null)
{
foreach (var c in Children)
{
results.Append(indent + c.GetChirpletString(depth +
1));
}
}
return results.ToString();
}
public string GetNoiseString()
{
return string.Format("[]", string.Join(",", Noise));
//return null;
}
public void DeleteGapOccurrence()
{
GapOccurrence.Clear();
GapOccurrence = null;
}
public void SetNoise(IEnumerable<int> noise)
{
Noise = noise.ToList();
}
public IEnumerable<int> GetIndices()
{
yield return Position;
yield return Position + 1;
int pos = Position;
for (int i = 0; i < GapOccurrence.Count; i++)
{
yield return pos + GapOccurrence[i];
yield return pos + GapOccurrence[i] + 1;
pos += GapOccurrence[i];
}
}
public void AddGap(int gap)
{
if(GapOccurrence == null)
{
GapOccurrence = new List<int>();
}
GapOccurrence.Add(gap);
}
public void AddChild(Node n)
{
if(Children == null)
{
Children = new List<Node>();
}
this.Children.Add(n);
}
}
public class ChirpletCompressor
{
public IEnumerable<int> GetNoise(IList<Node> nodes, IList<int>
data)
{
HashSet<int> indices = new HashSet<int>();
foreach(var n in nodes)
{
foreach(int i in n.GetIndices())
{
indices.Add(i);
}
}
for(int i = 0; i<data.Count; i++)
{
if(!indices.Contains(i))
{
yield return data[i];
}
}
}
public Node Compress(IList<int> data) //call this
function to compress a single sequence
{
Node root = new Node(0, null);
Compress(root, data);
return root;
}
public void Compress(Node node, IList<int> data) //warning:
this compress function needs to be called with an empty node, if the
same node is reused for multiple calls to it the tree will build up a
mix of data
{ //calling
this function directly might be good for mixing data of 2 or more
sequences into the same tree, if the outer node is reused the tree will
build up
Dictionary<ChirpletPair, ChirpletInfo> chirpletCounts = new
Dictionary<ChirpletPair, ChirpletInfo>();
Dictionary<ChirpletPair, Node> newNodes = new
Dictionary<ChirpletPair, Node>();
for (int i = 1; i < data.Count; i++)
{
int j1 = data[i - 1];
int j2 = data[i];
if (j1 != j2)
{
ChirpletPair c = new ChirpletPair(j1, j2);
if (chirpletCounts.ContainsKey(c))
{
ChirpletInfo info = chirpletCounts[c];
info.IncrementCount();
if (info.Count == 2)
{
Node n = new Node(info.Position, c);
int gap = i - info.Position - 1;
n.AddGap(gap);
newNodes.Add(c, n);
}
else if(info.Count > 2)
{
Node n = newNodes[c];
int gap = i - n.GapOccurrence.Sum() - 1 -
info.Position;
n.AddGap(gap);
}
}
else
{
chirpletCounts.Add(c, new ChirpletInfo(i - 1));
}
}
}
var noise = GetNoise(newNodes.Values.ToList(), data);
node.SetNoise(noise);
foreach (var n in newNodes)
{
node.AddChild(n.Value);
Compress(n.Value, n.Value.GapOccurrence);
if (n.Value.Children != null &&
n.Value.Children.Count > 0)
{
// n.Value.DeleteGapOccurrence();
}
}
}
}
}
cheers,
Jamie
Hi,
The periodic pattern is a pattern apparently centered on each digit of
pi, where the periodicity decreases with possible sine wave modulation
as the surrounding digits are added to the sample.
The algorithm I used is a periodicity detecting compression algorithm.
It has no dictionary so is a bit strange, and is implemented with
recursion. I made the algorithm for basic AI research, and was testing
it on some sequences of numbers when I noticed the discrepancy in
compression between pi and pseudorandom sequences.
Here's the 32bit version of the recursive compressor code in C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
public class ChirpletInfo
{
public ChirpletInfo(int position)
{
this.Count = 1;
this.Position = position;
}
public int Count { get; private set; }
public int Position { get; private set; }
public void IncrementCount()
{
Count++;
}
}
public class ChirpletPair
{
public ChirpletPair(int v1, int v2)
{
this.Value1 = v1;
this.Value2 = v2;
}
public int Value1 { get; private set; }
public int Value2 { get; private set; }
public override bool Equals(object obj)
{
ChirpletPair p = obj as ChirpletPair;
if(p!= null)
{
return p.Value1 == this.Value1 &&
p.Value2 == this.Value2;
}
else { return false; }
}
public override int GetHashCode()
{
return Value1 * 31 + Value2 * 13;
}
public override string ToString()
{
return String.Format(",", Value1, Value2);
}
}
public class Node
{
public Node(int position, ChirpletPair pair)
{
this.Position = position;
this.Pair = pair;
}
public int Position { get; private set; }
public ChirpletPair Pair { get; private set; }
public List<int> Noise { get; private set; }
public List<int> GapOccurrence { get; private set; }
public List<Node> Children { get; private set; }
public override string ToString()
{
return String.Format("C()", Position,
Pair == null ? "null": Pair.ToString());
}
public string GetChirpletString(int depth = 0)
{
string indent = String.Concat(Enumerable.Range(0,
depth).Select(a => " ").ToList());
StringBuilder results = new StringBuilder();
results.AppendLine(indent + ToString() + GetNoiseString());
if(Children!= null)
{
foreach (var c in Children)
{
results.Append(indent + c.GetChirpletString(depth +
1));
}
}
return results.ToString();
}
public string GetNoiseString()
{
return string.Format("[]", string.Join(",", Noise));
//return null;
}
public void DeleteGapOccurrence()
{
GapOccurrence.Clear();
GapOccurrence = null;
}
public void SetNoise(IEnumerable<int> noise)
{
Noise = noise.ToList();
}
public IEnumerable<int> GetIndices()
{
yield return Position;
yield return Position + 1;
int pos = Position;
for (int i = 0; i < GapOccurrence.Count; i++)
{
yield return pos + GapOccurrence[i];
yield return pos + GapOccurrence[i] + 1;
pos += GapOccurrence[i];
}
}
public void AddGap(int gap)
{
if(GapOccurrence == null)
{
GapOccurrence = new List<int>();
}
GapOccurrence.Add(gap);
}
public void AddChild(Node n)
{
if(Children == null)
{
Children = new List<Node>();
}
this.Children.Add(n);
}
}
public class ChirpletCompressor
{
public IEnumerable<int> GetNoise(IList<Node> nodes, IList<int>
data)
{
HashSet<int> indices = new HashSet<int>();
foreach(var n in nodes)
{
foreach(int i in n.GetIndices())
{
indices.Add(i);
}
}
for(int i = 0; i<data.Count; i++)
{
if(!indices.Contains(i))
{
yield return data[i];
}
}
}
public Node Compress(IList<int> data) //call this
function to compress a single sequence
{
Node root = new Node(0, null);
Compress(root, data);
return root;
}
public void Compress(Node node, IList<int> data) //warning:
this compress function needs to be called with an empty node, if the
same node is reused for multiple calls to it the tree will build up a
mix of data
{ //calling
this function directly might be good for mixing data of 2 or more
sequences into the same tree, if the outer node is reused the tree will
build up
Dictionary<ChirpletPair, ChirpletInfo> chirpletCounts = new
Dictionary<ChirpletPair, ChirpletInfo>();
Dictionary<ChirpletPair, Node> newNodes = new
Dictionary<ChirpletPair, Node>();
for (int i = 1; i < data.Count; i++)
{
int j1 = data[i - 1];
int j2 = data[i];
if (j1 != j2)
{
ChirpletPair c = new ChirpletPair(j1, j2);
if (chirpletCounts.ContainsKey(c))
{
ChirpletInfo info = chirpletCounts[c];
info.IncrementCount();
if (info.Count == 2)
{
Node n = new Node(info.Position, c);
int gap = i - info.Position - 1;
n.AddGap(gap);
newNodes.Add(c, n);
}
else if(info.Count > 2)
{
Node n = newNodes[c];
int gap = i - n.GapOccurrence.Sum() - 1 -
info.Position;
n.AddGap(gap);
}
}
else
{
chirpletCounts.Add(c, new ChirpletInfo(i - 1));
}
}
}
var noise = GetNoise(newNodes.Values.ToList(), data);
node.SetNoise(noise);
foreach (var n in newNodes)
{
node.AddChild(n.Value);
Compress(n.Value, n.Value.GapOccurrence);
if (n.Value.Children != null &&
n.Value.Children.Count > 0)
{
// n.Value.DeleteGapOccurrence();
}
}
}
}
}
cheers,
Jamie
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 10/28/2015 5:24 AM, Tom Del Rosso wrote:
hi,
Here is a spreadsheet I made showing a version of the compression
algorithm from when I was designing it:
https://www.dropbox.com/s/m0ks8tmdk5h48um/20150329%20compression%20decompression%20with%20wavelets%20and%20chirplets%20example2.xlsx
The input test sequence is in column B, and then the steps of
compression progress through the spreadsheet to the right, and the
final compressed result is at column GD :D
It is a recursive tree compression algorithm, good to find periodicity
and fractals etc.
cheers,
Jamie
hi,
Here is a spreadsheet I made showing a version of the compression
algorithm from when I was designing it:
https://www.dropbox.com/s/m0ks8tmdk5h48um/20150329%20compression%20decompression%20with%20wavelets%20and%20chirplets%20example2.xlsx
The input test sequence is in column B, and then the steps of
compression progress through the spreadsheet to the right, and the
final compressed result is at column GD :D
It is a recursive tree compression algorithm, good to find periodicity
and fractals etc.
cheers,
Jamie
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 10/29/2015 12:57 AM, Jasen Betts wrote:
You are aware that atn() is a trig function derived from the unit
circle, right? So you are saying the same thing Tom is about it being
related to physical space.
You are aware that atn() is a trig function derived from the unit
circle, right? So you are saying the same thing Tom is about it being
related to physical space.
--
Rick
Rick
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 29/10/2015 07:05, rickman wrote:
Not true! There are many series and closed formulae for pi listed on the
Mathematica website.
http://mathworld.wolfram.com/Machin-LikeFormulas.html
My personal favourite is taking a circle as the limit of an inscribed
polygon with 2^n sides as n -> infinity.
No. Tom is wrong there are well known simple power series equal to pi.
The simplest is
atan(x) = x - x/3 + x/5 - x/7 + ...
atan(x) = sum n=0 to infinity { (-1)^n*x^(2n+1)/(2n+1) }
This is about as simple a closed form as you can get.
(and was taught in senior high school in my day)
https://en.wikipedia.org/wiki/Inverse_trigonometric_functions#Infinite_series
Convergence is terrible if you use the obvious x=1 but improves
considerably when you use Shanks approach for 1/5 and 1/239.
Shanks first used this formula to compute the first 100k digits.
pi = 4* { 4*atan(1/5) - atan(1/239) }
https://en.wikipedia.org/wiki/Approximations_of_%CF%80#Machin-like_formulae
No proof exists either way as to the normality or otherwise of pi.
I have a hunch that the BBP formulae hint that it might not be normal at
least in its base 16 representation.
Not true! There are many series and closed formulae for pi listed on the
Mathematica website.
http://mathworld.wolfram.com/Machin-LikeFormulas.html
My personal favourite is taking a circle as the limit of an inscribed
polygon with 2^n sides as n -> infinity.
No. Tom is wrong there are well known simple power series equal to pi.
The simplest is
atan(x) = x - x/3 + x/5 - x/7 + ...
atan(x) = sum n=0 to infinity { (-1)^n*x^(2n+1)/(2n+1) }
This is about as simple a closed form as you can get.
(and was taught in senior high school in my day)
https://en.wikipedia.org/wiki/Inverse_trigonometric_functions#Infinite_series
Convergence is terrible if you use the obvious x=1 but improves
considerably when you use Shanks approach for 1/5 and 1/239.
Shanks first used this formula to compute the first 100k digits.
pi = 4* { 4*atan(1/5) - atan(1/239) }
https://en.wikipedia.org/wiki/Approximations_of_%CF%80#Machin-like_formulae
No proof exists either way as to the normality or otherwise of pi.
I have a hunch that the BBP formulae hint that it might not be normal at
least in its base 16 representation.
--
Regards,
Martin Brown
Regards,
Martin Brown
Re: algorithm that can distinguish between a sequence of PI digits and a sequence of pseudorandom digits of equal length
On 29/10/15 08:05, rickman wrote:
There are all sorts of ways to define pi, and the atan function. Pi is
an irrational number, without a finite closed arithmetical
representation - but so is the square root of 2 and the golden ration
phi. You don't get real patterns in these either (though you will see
patterns in short sequences of the digits).
Mathematically, a circle is defined as a set of points at equal distance
from a single fixed point. It is not "physical", though it happens to
correspond nicely to an everyday physical image.
The definition of arctan is probably best given in terms of logarithms
rather than circles:
so pi = 2i(ln(1 - i) - ln(1 + i))
There are all sorts of ways to define pi, and the atan function. Pi is
an irrational number, without a finite closed arithmetical
representation - but so is the square root of 2 and the golden ration
phi. You don't get real patterns in these either (though you will see
patterns in short sequences of the digits).
Mathematically, a circle is defined as a set of points at equal distance
from a single fixed point. It is not "physical", though it happens to
correspond nicely to an everyday physical image.
The definition of arctan is probably best given in terms of logarithms
rather than circles:
so pi = 2i(ln(1 - i) - ln(1 + i))
Site Timeline
- » OT: More MPLAB program problems
- — Next thread in » Electronics Design
- » Phone detector - not working for 4G
- — Previous thread in » Electronics Design
- » OT: Microsoft Office 365 Outlook (online variant)
- — Newest thread in » Electronics Design
- » OT: Microsoft Office 365 Outlook (online variant)
- — The site's Newest Thread. Posted in » Electronics Design