Friday, April 23, 2010

Survival Strategies Talk

Last week I gave a talk on my math research to a group of undergrads and faculty at Augsburg College.  University of St. Thomas in St. Paul put me up a couple of nights so I could do so.  Here are the notes and presentation.  The latter will change as I add stuff to it--I use the Prezi as a mindmap for the paper I'm writing.

The blocks in the text are where to click next on the presentation, but it's approximate.  I don't have a YouTube version yet.

My topic this afternoon is survival.  We’ll use math to talk about philosophy.  Both are valuable. █ I created this from data from the Wall Street Journal.  █.  According to their data, math and philosophy majors show the greatest gain in salaries. █ Liebnitz knew something about both of these subjects.  █ █.

So, by survival strategies, I don’t mean buy a cabin in the woods and stock up on poodles and rifle shells█.  Mathematicians never study anything that interesting. █

For some reason, it’s easier to talk about something surviving than it is to talk about it being alive. So let’s go with that.  The data on your computer’s hard drive █ may survive a system crash, even though it was never  alive.

Let’s take that example.  How many of you have lost data that’s important to you?  It’s not much fun.

What are the chances of this happening?  Let’s say that the probability that your data survives a year is some number p. █ Then the chance of surviving two years is p squared (assuming it doesn’t degrade over time) and so on. At the nth year we have p to the n power.  What kind of graph does p to the n have?  █

So no matter how good it is, the hard drive has probability zero of surviving forever.

If a fixed survival probability p is a problem we have to figure out how to increase it.  So the more general formula, where the survival probabilities can change from year to year is here.  █  The small p is the annual probability, and the big P is the probability that our subject will get that old.  The big pi means multiply.  So lower case p sub 5 is the chance that it will survive year five, given it’s already survived years one through four, whereas capital P sub five is the total probability that we get through year five.

In order live forever, I guess it’s obvious that we want to the probabilities to go up rather than remain constant.  How fast do they need to go up?

There’s a nice form we can use to describe increasing probabilities.  Here it is—a double exponential. █  If we multiply a sequence of annual probabilities like this, the geometric series formula helps us out,  █ so we can simplify total probability big Pn easily and see that as n goes to infinity, the chance of our subject surviving is p to the one over one minus b.

This is the best we can hope for, by the way—a limit greater than zero.  100% survival guarantee is not in the cards.  If in practice, these numbers are low for advanced civilizations, we wouldn’t expect to see them roaming around the galaxy, for example

█Here are some graphs of survival probabilities for the double exponential with different bs.  We want b to be small.

It’s interesting to compare the double exponential formula to actual census data. █  Here, the bars on the left are US populations of certain ages, and the bars on the right are the double exponential probabilities with b set to 1.08.  We have to have b less than one to get a shot at indefinite survival. Some people, like Ray Kurzweil think humans will be able to live forever, so this 1.08 figure would make a nice benchmark.

So…back to hard drives.  What do we do in practice to increase the odds of survival? █  Backups. █ How many backups do you need in order for your data to live forever?  The number of backups can’t be fixed.  Is that clear?  It has to grow.  How fast does it have to grow?  What if you added one every year?  Is that enough? Do we need exponential growth? [poll]█

The way to figure this out is first assume the backups are independent, which is an approximation at best.  But it lets us multiply the chances of failure (we’ll call these q) to get the chance of total failure.

The linear grown model might look like this. █   In year one we have a hard drive.  In year two we add a USB stick.  In year three we add a SIM card, then a DVD, then hard copy.  If any one of these fails it gets replaced.  The only way for total failure is if all backups fail during a year.  We can bound that from below █ and show that this method is successful.  So linear growth is enough.  This graph █ compares this backup scheme with the double exponential curve we saw a minute ago.  The curves are different but have the same limit.

You don’t need exponential growth. Of course real independence is impossible.  If we think about ecologies of biological life, and how these manage to survive, they combine exponential growth with variance in design to create multiple copies that develop independence from one another.  It would be a cool undergrad research project to figure out how to visualize this.  █

So, my friends, an ECOLOGICAL solution █ is a very powerful one to the survival problem.  How long has life existed continuously on this planet?  A few billion years, right?  █ No single organism has lived that long, but the ecological solution to the problem has worked for that long.

Before we congratulate ourselves, however, let’s think for a moment.  Are all of the things that we would like to survive copyable?  What about the Mona Lisa?  Is a reproduction just as good?  What about yourself?  A nation?  Making copies isn’t always an option.

So what about individual or singular survival? █ Here’s where it gets really interesting.  Assume we have some subject that we can’t █ create an ecology out of. █  We have to increase the probabilities of survival directly. So this is simple-sounding.  First, figure out what possible actions are the best ones for us, then do that.  █  We could say this is the “be smart” strategy.  █  Now I chose a fitness place as an example for a reason.  How many of you consider yourselves intelligent human beings?  I don’t need to look—everybody raised their hand, right?  Okay, how many of you believe that exercise is essential to good health?  How many of those who do actually exercise?

My point is that knowing what to do and doing it are different.  We’ll come back to that.  First, let’s focus on knowing what to do.

█A word from our sponsors.  Seriously, eat your fiber while you’re young.  Go to wikipedia and look up diverticulitis.  If you forget everything else I say today, remember that.

█ Well, we have some clues.  We call it science.  Frankly it’s amazing.  As a civilization, we are exploring every nook and cranny—and every crook and nanny too—of our environment, to find out how it works.  We are doing VIRTUALLY what evolution does physically, but much much faster.

Neils Bohr said that predictions are hard, especially about the future.  Whereas an ecology can just try out stuff physically and throw away the mistakes as the consequence of natural selection, our intelligent survival machine can only make virtual mistakes.  This presents a problem we’ll call induction risk. █

What’s the next element of this sequence?  We can guess it’s a zero because that’s the simplest explanation.  But is the simplest explanation always right?

What does a turkey think before the holidays? █  Probably—man, life is good.  Eating and sleeping all day.  It’s like a resort.  Then comes Thanksgiving.

We have two clear and opposite points of view from classical philosophy. █  Epicurus—don’t throw away possible solutions.  █  Occam’s Razor—keep the simplest explanation.  More modern versions include updating a priori probabilities█, and ones based on Kolmogorov complexity theory█.  But in my opinion these are engineering hacks to the basic problem that Hume█ identified:  telling the future isn’t 100% possible.  So we’re left with trial and error█ and imperfect models of the world. We cannot look at past experience and predict the future with certainty, and this induction risk █ means that even when we think we know what to do, we may be wrong.  Some one said that in theory, theory and practice are the same.  In practice they’re not.

Okay, well, blah, philosophy this, blah, blah.  Where’s the computer stuff?  █  As a critical thinker, what do you do when you don’t understand something complicated?  Look at simple examples.  Make a model that only takes into account a few key characteristics.  Let’s do that.

So we’re going to create an ecology of algorithms—computer programs that have to try to stay alive in an environment by predicting what comes next.  You can think of these critters as an analogy to a physical ecology like found in the wild, or as a virtual ecology of possible solutions in an intelligent being’s imagination.  Either way, the goal is to inductively find the solution to a simple problem.  Or else.

Here’s how it works.  █ It’s a game like rock paper scissors.  The environment can play a 0, a 1, or some other positive integer.  The critter can do the same, with the results listed on the chart.

The critter does not get to see what the environment plays until after it’s made its own choice, so it has to rely on what it thinks the environment will do based on past experience.  In other words, induction.

The critter can self-destruct by playing a zero.  Think of a computer program crashing.  That’s what this is—a fatal internal problem.  This is the halt code.

Or it can play a one and participate in the environment—think of this as eating something it found.  If it’s harmful, it dies.  If it’s good it survives, and otherwise it just waits to play again.

Or it can just observe by playing something other than a zero or one.

Here’s an example. █ People sometimes debate whether fire can be considered alive or not because it eats and breathes and reproduces and whatnot.  We can sidestep that and look at what a fire’s survival strategy is.  It’s pretty simple—it burns anything it can, right?  There’s no much subtlety.  Look at the environment and the critter and tell me how long it will survive.  Until the first environmental zero.

So if a fire is alive, its survival complexity is very low. It’s a constant, which is the least complex thing there is.

Another example.  █What would you guess the environment is doing here?  We’ll see in a minute how  the simulation does with this.

Now the goal of a critter is to accumulate an infinite number of ones.  It can’t just sit there and passively watch without participating in order to qualify for survival.  there’s a lot of interesting things we can do at this point.  We can classify environments by how hard they are to survive.  If there’s NO infinite subsequence of 1s in the environmental plays, then it’s not survivable at all.  We can say more than that by talking about the computational complexity of survivable algorithms, and whether or not they’re even reachable by a critter based on inductive reasoning.  These are fascinating questions.  But let’s get to the sim.

█ I created a simple programming language out of an existing Turing-complete one called—well, the name is rude.  If you’re interested I’ll tell you after.  I wanted the language to have no syntax other than symbol order—so every string of commands is an acceptable program.  And I make the symbols mostly symmetrical so they look like bugs when strung together.

This is implemented in perl, and you can have the code if you want it.  Here’s my first attempt. █  I set up a repeating environment 012012… and wrote a program to solve it.  The numbers are the state vector for the critter—numbers it can store for its own use.  Only one of these is getting used here, so I figured it would be an opportunity to improve on my design by evolving toward a smaller state vector.  I cranked up the starvation rate to help it along.  But what happened is that it found this solution pretty quickly. █  It’s like “dude, just look at the old environment and add one.”  So, okay, I felt a little stupid.

█ There are a lot of parameters and tweeks I won’t go into. The language █ has and interpreter change as I try different things out, but I keep a catalog of all of them.  Basically, successful critters get to reproduce, and their offspring mutate according to a parameter.

[short demo]

So one thing to look at is how these virtual ecologies compare to real ones.  █  What you see here is a set of three scripted runs.  Each run is a hundred individual simulations averaged.  The severity of the environment increases to see what effect this has on population size and diversity.  Basically the critters have to learn to count to three.  They start in an entirely benign environment—an endless buffet that our “burn baby burn” strategy would work fine in.  The only risk is self-halting.  Then it transitions to a 012 cycle, so the critters have to time the ones to survive.

There’s a lot going on here, but this is population in the middle.  Transitioning from the buffet causes the population to crash.  The lines at the bottom show how many critters died from eating the wrong thing versus self-halting.  And the lines at the top give diversity.

This is what we would expect to see.  As the environment becomes more severe, successful critters take over, which works, but limits the diversity.  Changing to a different environment once it has specialized is almost guaranteed to make the population crash and burn.

█The statistical statement of this is Fisher’s Fundamental Theorem, generalized by Price.  I haven’t formally computed these parameters yet—that would be a good undergrad research project.

Here’s another one, where three populations are given increasingly hard environments.  Population is at the bottom, and diversity at the top.  There’s more going on here I’ll get to in a minute.  Here are some of the solutions from these three hundred runs.  █

Not one of these looks at the old environment.  The command for that is the greater than sign—it looks like an antenna.  No antennas on any of these critters—they all survive by counting out the environment, synchronizing with it.  In fact, it’s very hard to breed critters of the cause-effect sort.  I have more to say about that, but you’ve had your computational fix now—let’s zoom back to the induction risk problem.

You’ll note that part of what I was looking at in those graphs was a diagnosis of how often self-halts happen.  The importance of this depends on whether or not these singular intelligent survival machines are really computational engines or not.

For the sake of argument, I’m going to assume something called the  █Church-Turing Thesis applies here, and wave my hands.   See me waving?

█ If our survival machine is an intelligent robot or a bureaucratic civilization that runs on computable rules, then we have to consider self-halting.

Everybody probably knows about Turing’s Halting problem.  Basically, you can’t know in general if a machine will give you the blue-screen of death. █  Before you dismiss this as something that only happens in windows, let’s look at a real world example.

█ In the New York Times, Paul Krugman writes this.    So basically a government self-destructed because of one bad decision that locked up the system.

More ominous are the accounts of how close we came to nuclear war in the 20th century.  If we consider human civilization as a whole, I think it’s fair to say pulling that trigger is a self-halt.  █ You can read more theory about this sort of thing from Peter Suber.

The key to this for me is this assumption. █ It goes like this:

1. We have to learn about the environment to learn how to reduce induction risk.
2. As part of that, we learn how to change our physical form
3. Our behavior derives from our physical form
4. Therefore, we can and must be able to self-modify

█Note that with drugs and genetic engineering we can already begin to do this.

Gregory Chaitin█ has this number he calls omega, which is the average halting probability for a random computational function.  We’re interested to know what one minus omega is—that would be our average chance of survival from self-halting.  But Chaitin proved that we can’t know this probability.  So there’s some unknown survival tax, if you will, because of the self-halt risk.

Recall that the two requirements for a singular intelligence to survive is that it learns what to do to increase chances, and then does it.  This second part shouldn’t be taken for granted.

Evolution has partially solved this problem with emotional loadings—we get hungry when we don’t eat, and we have a strong survival instinct.  This genetic knowledge is different from our real-time apparatus.  As an example, imagine that someone tells you a cobra is loose in your tent.  You lay down anyway and take a nap.  Sounds hard to do, right?  Now imagine falling asleep while driving on the interstate.  People do that all the time.  What’s the difference?  We might crudely say the first is physical and the second is virtual.  For a self-altering critter, everything becomes virtual.  This is related to something called Morevec’s Paradox, that we don’t have time to delve into.

I think the usual assumption is that an artificial intelligence would behave and think like we do.  So we program it to be hungry when it needs power, and will then plug itself into the wall.  I think what would more likely happen is it would just say to itself, I don’t like being hungry and turn itself off, rather like Claude Shannon’s so-called “ultimate machine,” whose only function was to turn itself off if someone turned it on.

One interesting question is this—is there a way to understand computation in a topological sense █ so that we could imagine a surface with attractors of what we might call computational orbits as these processes self-modify.  Are there attractors where the processes avoid self-halt under robust environmental conditions?  I think it’s fair to say that’s a hard question.

I’ve started taking a look at what happens in the ecologies generated by my critters.  Here’s one.  █  The graphs show self-halting frequency over time for runs of critters with increasingly difficult environments.  The general pattern is clear—the more difficult the environment, the more the self-halts.  This is because there is more churn in the population—more deaths and births, more critters born with self-halting tendencies.  But what’s interesting is that as the population solves the problem posed by the environment, the frequency drops.

This would not be surprising in a static environment, but this one has a twist.  █

Let’s start over. █ Imagine you’re working for NASA and you have a space probe that forgot all its programming.  You have to transmit the code back up to it.  Unfortunately, its communications protocols are compromised and there is no error checking or error correction on the other end.  Your program, however is allowed to self-modify.  How do you ensure that the program that runs on the probe is uncorrupted?

We know that when things move through time, they tend to get messy. █  Formally, this is called the second law of thermodynamics.  When our hypothetical intelligent survival machine moves through time, it gets messed up too.  One would think that if it’s to survive, it has to deal with that.  Let’s see what the implications are.

In the simulation, I created a random state-change matrix that flipped program symbols according to the rule generated.  Here’s an example.  █  The greater than sign gets changed to the left bracket 91% of the time between generations of critters for every critter in the population.  You might think this would present a challenge to survival.  If you’ve done any programming, you know that randomly changing code is unlikely to have a good outcome.

But although no individual can survive this, an ecology can. █ And we can look at the transitions as we increase this entropy.

Looking at individual runs rather than averages is instructive.  Here’s without random symbol swapping.  █  Almost all of them survived.  And these are the populations that had to deal with it. █  It’s a much different picture.

So if an ecology can survive, can an intelligence?

Let me suggest one way around our space probe problem.    First, the real world isn’t exactly like scenario I described.  The real world isn’t a single processor running a single program, but a massively parallel “computer”.  So even if we can’t create an ecology of copies of ourselves, we could send copies in parallel.  Here’s the difference. █  Now instead of having to have the whole thing survive intact you only need on of several copies intact.  This is like the control systems on an airplane—redundancy.

In any of these, no matter what sort of error correction might be used, there are two interesting questions these modules have to be able answer.

Am I me?  Or was I corrupted in transit?  In this case, the best thing in the modular case is a self-destruct.

The second question is are you me?  If we’re trying to maintain state—an identity—then replacing fellow defects with good copies is essential.

Can you think of any big complex thing that works this way?

Bodies, companies,…

We could go further and compartmentalize.  Now I only need one copy per module to survive.

This leads to two interesting criteria for survival in the face of entropy:  an AM I ME test and an ARE YOU ME test.  We see both of these in compartmentalized biological organisms in the form of programmed cell death and the immune system.

The enduring features of the universe are many and varied rather than singular and intelligent.  Think about that.

[Note: I ran out of time.  The full paper will have more details.]