# Pi storage

Let me share my worst “best idea ever” moment. Sometime during my undergraduate I thought I had solved all the world’s problems. You see, on this fateful day, my hard drive was full. I hate it when my hard drive fills up, it means I have to go and get rid of some of my stuff. I hate getting rid of my stuff. But what can someone do? And then it hit me, I had the bright idea:

What if we didn’t have to

storethings, what if we could justcomputefiles whenever we wanted them back?

Sounds like an awesome idea, right? I know. But how could we compute our files? Well, as you may know pi is conjectured to be a normal number, meaning its digits are probably random. We also know that it is irrational, meaning pi never ends…. Since its digits are random, and they never end, in principle any sequence you could ever imagine should show up in pi eventually. In fact there is a nifty website here that will let you search for arbitrary strings (using a 5-bit format) in first 4 billion digits, for example “alemi” seems to show up at around digit 3149096356. So in principle, I could send you just an index, and a length, and you could compute the resulting file. But wait you cry, isn’t computing digits of pi hard, don’t people work really hard to compute pi farther and farther? Hold on I claim, first of all, I’m imagining a future where computation is cheap. Secondly, there is a really neat algorithm, the BBP algorithm, that enables you to compute the kth binary digit of pi without knowing any of the preceding digits. In other words, in principle if you wanted to know the 4 billionth digit of pi, you can compute it without having to first compute the first 4 billion other digits. Cool, this is beginning to sound like a really good idea. What’s the catch? Perhaps you’ve already gotten a taste of it. Let’s try to estimate just how far along in pi we would have to look before our message of interest shows up. Let’s assume we have written our file in binary, and are computing pi in binary e.g.

- 00100100 00111111 01101010 10001000 10000101 10100011 00001000 11010011

etc. So, if the sequence is random, there is a 1/2 chance that at any
point we get the right starting bit of our file, and then a 1/2 chance
we get the next one, etc. So the chance that we would create our file if
we were randomly flipping coins would be

alemi -> 0000101100001010110101001 with the index 3149096356 -> 10111011101100110110010110100100

which is longer in binary! As an aside, you may have felt uncomfortable
with my estimation for how long we have to wait to see our message, and
you would be right. Just because all N digits I draw at a time don’t
match up doesn’t mean that the second half isn’t useful. For instance if
I was looking for 010, lets say some of the digits are 101,010. While
both of those sequences didn’t match, if I was looking at every digit at
a time, I would have found a match. And you’d be right. Smarter people
than I have
computed just how long you should have to wait, and end up with the
better estimation

## Comments !