πfs – A data-free filesystem

open-paren · on Sept 29, 2021

Previous discussions

PiFS – The Data-Free Filesystem (February 20, 2021 — 1 points, 1 comments) - https://news.ycombinator.com/item?id=26208704

Πfs: Never worry about data again (October 25, 2019 — 3 points, 1 comments) - https://news.ycombinator.com/item?id=21359338

The π Filesystem for FUSE: Store Your Data in π (February 21, 2019 — 1 points, 1 comments) - https://news.ycombinator.com/item?id=19223032

pifs - Avoid disk space usage by saving your files in the digits of Pi (December 14, 2018 — 3 points, 1 comments) - https://news.ycombinator.com/item?id=18687275

πfs – A data-free filesystem (March 14, 2017 — 285 points, 105 comments) - https://news.ycombinator.com/item?id=13869691

Πfs: Stores your data in π (January 6, 2016 — 2 points, 1 comments) - https://news.ycombinator.com/item?id=10856108

Πfs: Never worry about data again (January 5, 2016 — 5 points, 1 comments) - https://news.ycombinator.com/item?id=10847693

netflixandkill · on Sept 29, 2021

I love that they ran with this far enough to get it working. We need a graph of the average number of bits to store an offset into pi versus size of stored data.

floren · on Sept 29, 2021

Well, based on this sentence:

> In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.

I'd say best-case scenario, you're looking at 1:1 offset storage size vs. stored data size :)

k__ · on Sept 29, 2021

How would the location search slow down with increased block size?

And is the algorithm to do so faster on a quantum computer?

netflixandkill · on Sept 30, 2021

That's way worse than 1:1 unless all integers between 0 and 255 occur in the first 256 digits of pi, which I'm 99.pi% sure is not the case.
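
To put a rough number on the offset-size question above, here is a minimal sketch in Python. It assumes a byte is matched as two consecutive hexadecimal digits of the fractional part of π, starting at any digit offset, which may not be how πfs itself indexes π. The helper names `pi_hex_digits` and `first_offsets` are made up for this illustration, and the digits are computed with Machin's formula purely for convenience.

```python
def pi_hex_digits(n, guard=8):
    """Return the first n hex digits of the fractional part of pi as a string."""
    scale = 1 << (4 * (n + guard))  # fixed-point scale with a few guard digits

    def arctan_inv(x):
        # scale * arctan(1/x) via the alternating Taylor series, integers only
        total = term = scale // x
        x2, k, sign = x * x, 1, -1
        while term:
            term //= x2
            total += sign * (term // (2 * k + 1))
            sign, k = -sign, k + 1
        return total

    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inv(5) - 4 * arctan_inv(239)
    frac = pi_scaled - 3 * scale  # drop the integer part (3)
    return format(frac, "x").zfill(n + guard)[:n]


def first_offsets(hex_digits):
    """Map each byte value to the first digit offset where it appears."""
    offsets = {}
    for i in range(len(hex_digits) - 1):
        offsets.setdefault(int(hex_digits[i:i + 2], 16), i)
    return offsets


if __name__ == "__main__":
    digits = pi_hex_digits(4096)
    offsets = first_offsets(digits)
    worst = max(offsets.values())
    print("byte values found:", len(offsets), "of 256")
    print("largest first offset:", worst, "->", worst.bit_length(), "bits")
```

Under this model, two different byte values cannot start at the same offset, so the largest first offset is at least 255 and needs at least 8 bits. The hex expansion begins 243f6a8885..., where the pair 88 already appears at two adjacent offsets, so the first 256 offsets cannot contain all 256 byte values.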

noxer · on Sept 29, 2021

Wait until people find out all the CSAM is stored on there. They can't ban π soon enough. It's worse than bitcoin. /s

Stampo00 · on Sept 30, 2021

I had to look up the acronym. Are we not allowed to say "kiddie porn" on here?

account-5 · on Sept 30, 2021

Child sexual abuse material because that's what it is: sexual abuse. Calling it porn sends the message that it's legitimate, it isn't.

noxer · on Oct 1, 2021

If you apply the same interpretation as with "gay porn" to "kiddie porn", then it would be something completely different. It would be porn performed (acted) by kids and possibly targeted at kids. That doesn't make much sense. They are not acting, they are abused, and people should name it like that.

seanw444 · on Sept 30, 2021

It's the politically-correct version now, apparently. Hadn't heard it before the Apple phone-scanner debacle.

pastrami_panda · on Sept 29, 2021

Love it, the tone of the readme is amazing.

Protostome · on Oct 1, 2021

> In this implementation, to maximise performance, we consider each individual byte of the file separately, and look it up in π.

I don't get it. We simply replace one byte (the data) with another byte, or even more than that (the index in pi). What am I missing?

Besides, why do you have to "search" pi? Why not just make a table mapping all possible 2/3/4-byte sequences (256^(2/3/4) combinations) to their corresponding positions in pi, so that every subsequent compression runs much more efficiently?

BTW, it is very easy to show that simple Huffman-code-based compression yields a better compression ratio than this method.
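
A small sketch of why such a table cannot save space, using only a counting argument and no digits of π. The function name `table_stats` is hypothetical. It assumes fixed-length blocks and one stored offset per block.

```python
def table_stats(block_bytes):
    """Return (entries, min_offset_bits) for a table covering every block."""
    entries = 256 ** block_bytes
    # Two different blocks of the same length cannot start at the same
    # offset, so the table needs `entries` distinct offsets. The largest
    # one is therefore at least entries - 1, which takes this many bits.
    min_offset_bits = (entries - 1).bit_length()
    return entries, min_offset_bits


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        entries, bits = table_stats(k)
        print(f"{k}-byte blocks: {entries} entries, "
              f"largest offset needs >= {bits} bits for {8 * k} bits of data")
```

In the worst case, the offset is at least as wide as the block it replaces, whatever block size is chosen. The table also grows by a factor of 256 with each extra byte.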

_tom_ · on Sept 30, 2021

Looks like "NFS" in the hacker news font.

redconfetti · on Sept 29, 2021

I wonder if this concept can be used for Πcoin.

gfodor · on Sept 29, 2021

I wonder how hard it would be to find offsets into pi that contain surprisingly legit bit sequences.

nickdothutton · on Sept 29, 2021

This. Given a few million digits, I wonder what the hit rate is.

MauranKilom · on Sept 29, 2021

Assuming your data is much shorter than the number of digits to search, and that repeated digits do not appear often enough to matter, the expected number of hits is just numberOfDigitsToSearch / pow(10, numberOfDigitsInData). Same idea for any other base (if you then count digits of that base, not base 10 digits, of course).

That is, odds of finding a 6 digit datum in a million digits are fairly good. Finding longer data becomes exceedingly unlikely very fast.
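
A rough sketch of this estimate, assuming the digits behave like independent, uniformly random digits (which is unproven for π). The function names are hypothetical, and the probability uses a Poisson approximation that treats the candidate start positions as independent.

```python
import math


def expected_hits(data_digits, search_digits, base=10):
    """Expected number of occurrences of one fixed digit string."""
    positions = max(search_digits - data_digits + 1, 0)  # possible start offsets
    return positions / base ** data_digits


def hit_probability(data_digits, search_digits, base=10):
    """Approximate chance of at least one occurrence (Poisson approximation)."""
    return 1.0 - math.exp(-expected_hits(data_digits, search_digits, base))


if __name__ == "__main__":
    for length in (4, 6, 8, 10):
        p = hit_probability(length, 1_000_000)
        print(f"{length}-digit datum in 1,000,000 digits: p ~ {p:.4f}")
```

For a 6-digit datum in a million digits, the expected count is about 1 and the chance of at least one hit is about 63%. Each extra digit divides the expected count by 10, which is why longer data quickly becomes hopeless.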

gfodor · on Sept 30, 2021

I think I'm asking the inverse question - instead of having a known-good datum, given the fact that Pi isn't random, I wonder what coincidentally you could stumble onto, given a broad enough heuristic to "discover" interesting sequences.

MauranKilom · on Sept 30, 2021

For all we know, Pi is random (i.e. normal, although we haven't been able to prove it). That would mean any sequence appears eventually, with uniform odds. Hence any ("interesting") data you'd want to store does appear at some point, and I gave the odds of any (interesting or not) digit string appearing in the first x digits.

betwixthewires · on Sept 29, 2021

This is seriously a very interesting concept. Sounds like the Tower of Babel but somehow much more useful for its obvious purpose.

generalizations · on Sept 29, 2021

Do you mean the Library of Babel?

wizzwizz4 · on Sept 29, 2021

No, the Tower of Babel.[0] With this revolutionary technology, we can keep track of information using only metadata; in the information age, such a “digital Tower of Babel” could let us attain ever-increasing heights of Knowledge, if we are not scattered as a result.

[0]: https://xkcd.com/496/

MauranKilom · on Sept 29, 2021

I don't get how the internet secretary thing relates to the tower of babel... Did you mean https://xkcd.com/2421/?

wizzwizz4 · on Sept 30, 2021

“You mean the fifth?”

“No, the third.”

betwixthewires · on Sept 30, 2021

einpoklum · on Sept 29, 2021

The last commit was made 5 years ago.

tmountain · on Sept 29, 2021

I doubt Pi has changed much…

pindab0ter · on Sept 29, 2021

To be fair, this is Hacker News.

jrootabega · on Sept 30, 2021

pi was abandoned in favor of pi 2

seanw444 · on Sept 30, 2021

All hail Tau