I love that they ran with this far enough to get it working. We need a graph of the average number of bits to store an offset into pi versus size of stored data.
If you apply the same interpretation as with "gay porn" to "kiddie porn", then it would be something completely different: porn performed (acted) by kids and possibly targeted at kids. That doesn't make much sense. They are not acting, they are being abused, and people should name it as such.
> In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.
I don't get it. We simply replace one byte (the data) with another byte, or even more than that (the index in pi).
What am I missing?
Besides, why do you have to "search" pi? Why not just build a table mapping every possible 2/3/4-byte sequence (256^(2/3/4) combinations) to its corresponding position in pi, so that every subsequent compression runs much more efficiently?
BTW, it is very easy to show that a simple Huffman-code-based compression yields a better compression ratio than this method.
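For what it's worth, the lookup-table idea can be sketched in a few lines. This is only an illustration, not how πfs does it: it indexes 2-digit decimal strings rather than bytes, it generates the digits with Gibbons' unbounded spigot algorithm, and the names `pi_digits`, `first_digits`, `build_index`, `DIGITS` and `INDEX` are made up for this example.

```python
from itertools import islice


def pi_digits():
    """Yield decimal digits of pi (3, 1, 4, 1, 5, ...) via Gibbons' unbounded spigot."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            # Not enough precision yet: consume another term of the series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


def first_digits(count):
    """Return the first `count` digits of pi as a string, starting with '3'."""
    return "".join(str(d) for d in islice(pi_digits(), count))


def build_index(digits, width):
    """Map every `width`-digit string to the offset of its first occurrence."""
    table = {}
    for pos in range(len(digits) - width + 1):
        # setdefault keeps the earliest offset and ignores later repeats.
        table.setdefault(digits[pos:pos + width], pos)
    return table


# Assumption for the demo: 1000 digits and 2-digit keys.
DIGITS = first_digits(1000)
INDEX = build_index(DIGITS, 2)
```

Once the table is built, a lookup such as `INDEX["14"]` is a plain dictionary access instead of a search. It does not help with the size problem raised above, though: the stored offset still needs about as many bits as the data it replaces, and the table itself grows as 256^n for n-byte keys.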
Assuming your data is much shorter than the number of digits to search, and that repeated digits do not appear often enough to matter, the expected number of hits is just numberOfDigitsToSearch / pow(10, numberOfDigitsInData). Same idea for any other base (if you then count digits in that base, not base-10 digits, of course).
That is, the odds of finding a 6-digit datum in a million digits are fairly good. Finding longer data becomes exceedingly unlikely very fast.
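To put rough numbers on that estimate, here is a small sketch. It assumes the digits behave like independent uniform random digits (which is unproven for pi) and uses a Poisson approximation for "at least one match"; the function names `expected_hits` and `hit_probability` are invented for this example.

```python
import math


def expected_hits(data_digits, search_digits, base=10):
    """Expected number of offsets at which a fixed digit string matches."""
    positions = max(search_digits - data_digits + 1, 0)
    return positions / base ** data_digits


def hit_probability(data_digits, search_digits, base=10):
    """Approximate P(at least one match), treating offsets as independent."""
    return 1.0 - math.exp(-expected_hits(data_digits, search_digits, base))


if __name__ == "__main__":
    for length in (4, 6, 8, 12):
        p = hit_probability(length, 1_000_000)
        print(f"{length:2d} digits in 1e6 digits: p ~ {p:.6g}")
```

Under these assumptions a 6-digit string has a bit better than even odds of showing up in the first million digits, and each extra digit cuts the expected number of hits by a factor of ten.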
I think I'm asking the inverse question - instead of having a known-good datum, given the fact that Pi isn't random, I wonder what coincidentally you could stumble onto, given a broad enough heuristic to "discover" interesting sequences.
For all we know, Pi is random (i.e. normal, although we haven't been able to prove it). That would mean any sequence appears eventually, with uniform odds. Hence any ("interesting") data you'd want to store does appear at some point, and I gave the odds of any (interesting or not) digit string appearing in the first x digits.
No, the Tower of Babel.[0] With this revolutionary technology, we can keep track of information using only metadata; in the information age, such a “digital Tower of Babel” could let us attain ever-increasing heights of Knowledge, if we are not scattered as a result.
PiFS – The Data-Free Filesystem (February 20, 2021 — 1 point, 1 comment) - https://news.ycombinator.com/item?id=26208704
Πfs: Never worry about data again (October 25, 2019 — 3 points, 1 comment) - https://news.ycombinator.com/item?id=21359338
The π Filesystem for FUSE: Store Your Data in π (February 21, 2019 — 1 point, 1 comment) - https://news.ycombinator.com/item?id=19223032
pifs - Avoid disk space usage by saving your files in the digits of Pi (December 14, 2018 — 3 points, 1 comment) - https://news.ycombinator.com/item?id=18687275
πfs – A data-free filesystem (March 14, 2017 — 285 points, 105 comments) - https://news.ycombinator.com/item?id=13869691
Πfs: Stores your data in π (January 6, 2016 — 2 points, 1 comment) - https://news.ycombinator.com/item?id=10856108
Πfs: Never worry about data again (January 5, 2016 — 5 points, 1 comment) - https://news.ycombinator.com/item?id=10847693