Saturday, November 05, 2011

Self-Limiting Intelligence

I think intelligence grows only to the point where it begins to interfere with itself. In short, when we get smart enough, we begin to outsmart ourselves and actually undermine our own survival. Here, the inclusive 'we' could apply to individuals, but is more aimed at organizations: corporations, institutions, or governments.

I have been researching survival of these entities in the abstract for several years, and I seem to be all alone in this. This is a real mystery, because survival is the sine qua non for everything else we care about. If you are interested in some of the background, see Survival Strategies [1] or "Surviving Entropy" [2] in this blog.

My interest is in the seemingly pessimistic question "is it likely that an intelligent being or organization can survive for an indefinite period?" This is contrasted with the messy sort of survival exhibited by ecologies that evolve over time, which I refer to in shorthand as a MIC (multiple independent copies). The intelligent systems are shortened to SIS, for singular intelligent system. The primary difference is that it's impossible to reproduce and mutate an organization the same way one can a bacterium. All your eggs are in one basket, so to speak. Whereas a bacterial culture can lose 99% of its population and pull through, a singular system can't afford a single lethal mistake.
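As a toy illustration of this gap (a minimal Monte Carlo sketch with made-up numbers, not the formal model from my earlier posts), compare a redundant population with a singleton facing the same per-step hazard:

```python
import random

def survives(copies, steps, p_lethal, reproduce=True):
    """Toy model: each copy independently dies with probability p_lethal
    per step; survivors regrow toward capacity if reproduce=True.
    Returns True if at least one copy remains after all steps."""
    pop = copies
    for _ in range(steps):
        pop = sum(1 for _ in range(pop) if random.random() > p_lethal)
        if pop == 0:
            return False            # extinction: no copies left
        if reproduce:
            pop = min(pop * 2, copies)  # regrow toward carrying capacity
    return True

random.seed(0)
trials = 2000
mic = sum(survives(100, 50, 0.3) for _ in range(trials)) / trials
sis = sum(survives(1, 50, 0.3, reproduce=False) for _ in range(trials)) / trials
```

With these invented parameters the population (MIC) essentially always pulls through, while the singleton's survival chance is (1 - 0.3)^50, effectively zero: one basket, one lethal mistake.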

In [1] I showed a couple of interesting facts about a SIS. First, it has to learn how to predict (or engineer) its environment at a very fast rate, unlike a MIC, which gets this for free via even the most desultory rate of reproduction. In actual fact, we have evidence that the ecology of life on Earth (a MIC) has survived for some billions of years, whereas we have no evidence of any government or other organization (a SIS) surviving for more than a few thousand years (I'm being generous). Put another way, when we look at the vast and enduring features of the universe around us, they are uniformly non-intelligent. This is the source of the so-called Fermi Paradox.

The second interesting fact about a SIS is that although it may be smart enough to change itself, it is impossible for it to predict the ultimate result of those changes. For an organism that is the product of an ecology, this is not an issue. Animals often come prepared for their earthly homes with protective coloration and other adaptations for the environment they will live in. They don't need to change this, or if they do, the provision is built-in but limited (like a chameleon). A frog can't re-engineer itself into a bird if it finds the need to fly. A SIS, on the other hand, may have to adapt to completely foreign environments over time.

The problem a SIS faces is that it generally cannot predict what will happen to it after a self-change, so it doesn't know if this change is good or bad in the long run. It can try to guess by simulating itself, but there's an essential limitation here. There are two types of simulation, detailed below.
Suppose a SIS considers changing its 'constitution' in some way, which will affect the way future decisions are made. It builds a sophisticated computer model of itself making this change to see what will happen. There are two possibilities:
1) The simulation is perfectly good: so good that the SIS cannot change the outcome even if it's a bad one.
2) The simulation is only approximate: the SIS can take a look at the future and change its mind about making the change.
In the first case, a perfect simulation tells us not only what the future holds, but also whether or not the organization will make the change. This is because it incorporates all information about the SIS, including the complete present state. So it will present a result like "you make the change and then X happens," or "you don't make the change." A perfectly true self-simulation has to have this property. So it's like Cassandra's warning--even if it predicts an undesirable future, it still has to live it! 
Such perfect simulations are really only possible with completely deterministic machines, like a computer with known inputs. In practice, all sorts of variables might knock it off course. So what about approximations? The essential element of an approximation is that it lets the SIS make a decision about the future. The most fundamental such decision might be "if I make this change, will I eventually self-destruct?" The most dangerous challenge a SIS faces comes not from the external environment but from within itself.
The US Constitution makes it harder to amend the Constitution than to pass ordinary laws. This is a prudent approach to self-modification.
Unfortunately, decision problems like this are not solvable by general-purpose procedures. This is covered in [1], but you might peek at Rice's Theorem to see the breathtaking limitations on our knowledge of what deterministic systems will do. So we can simulate in the short term, but the long-term effect remains a mystery.
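For readers who haven't seen this kind of argument, here is the classic diagonalization in miniature (a Python sketch; the names and the framing in terms of "self-destruction" are mine, not part of Rice's Theorem itself). Any claimed general predictor can be defeated by a program built to do the opposite of the prediction:

```python
def make_spiteful(decider):
    """Given any claimed decider for 'will this program self-destruct?',
    build a program that does the opposite of whatever is predicted."""
    def spiteful():
        if decider(spiteful):    # decider predicts self-destruction...
            return "survive"     # ...so it survives instead
        else:
            return "self-destruct"
    return spiteful

# No matter what the decider answers, it is wrong about `spiteful`:
def always_yes(prog): return True    # predicts self-destruction
def always_no(prog): return False    # predicts survival

assert make_spiteful(always_yes)() == "survive"        # prediction wrong
assert make_spiteful(always_no)() == "self-destruct"   # prediction wrong
```

The same trick works against any concrete decider, not just these two trivial ones, which is the heart of why nontrivial questions about a system's eventual behavior can't be answered in general.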

So a SIS can only learn about self-change empirically, by trying things out or by short-term simulations. It can't ask about the general future. Although the external environment may be quite challenging, and survival may be at risk because of factors beyond its control, the internal question of how to manage self-change is just as bad or worse. Hence my hypothesis that the odds will catch up with any SIS eventually, and it will crash. This also jibes with all the empirical evidence we have.

This is where I left the question in [1], but in the last couple of years I think I've identified a fundamental mechanism for self-destruction that any SIS has to overcome. It has practical implications for institutions of higher learning and other sorts of systems like businesses and governments.

In my last post, I showed a diagram for an institutional effectiveness loop that looks more technical than the usual version. Here it is again, with some decorations from the talk I gave at the Assessment Institute.

The diagram actually comes from my research on systems survival, and it is a schematic for how a SIS operates in its environment. The (R) and (L) notations refer to 'Reality' and 'Language' respectively. Recall that the I in SIS stands for Intelligent, and this is what I mean: the intelligent system has ways of observing the environment, encoding those observations into a language that compresses the data by looking for interesting features, and modeling the interactions between these. This allows a virtual simulation of reality to be played out in the SIS, enabling it to plan what to do next in order to optimize its goals. This is the same thing as an institutional effectiveness loop in higher education, in theory at least.
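In code, the loop might be caricatured like this (a deliberately trivial Python sketch; the method names and the temperature example are mine, not from the diagram):

```python
class SIS:
    """Minimal skeleton of the observe/encode/model/plan loop:
    reality (R) is sampled, compressed into language (L), and the
    language model drives the next action."""
    def observe(self, reality):     # (R): sample the environment
        return reality.get("temperature")
    def encode(self, observation):  # (L): compress into a symbol
        return "hot" if observation > 30 else "ok"
    def model(self, symbol):        # (L): predict and plan from symbols
        return {"hot": "seek shade", "ok": "carry on"}[symbol]
    def act(self, reality):         # close the loop with a plan
        return self.model(self.encode(self.observe(reality)))

assert SIS().act({"temperature": 35}) == "seek shade"
```

The important structural point is that the plan is computed entirely from the encoded symbols, never from reality directly, which is what makes the encoding step a point of vulnerability later on.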

Language is much more malleable than reality: we can imagine all sorts of interactions that aren't likely to actually occur. For example, astrology is a language that purports to model reality, but doesn't. It's essential for the SIS to be able to model the real environment increasingly well. The mathematical particulars are given in [1] in terms of increasing survival probabilities.

There's something essential missing from the diagram above. That is the motivation for doing all this. When the SIS plans, it's trying to optimize something. This motivation is not to be taken for granted, because there's no reason to assume that a SIS even wants to survive unless it's specifically designed that way. For example, a modern air-to-air missile has good on-board ways to observe a target aircraft (e.g. radar or heat signature), a model for predicting the physics of its own flight and the target's, and the means to implement a plan to intercept. So by my definition, it's reasonably intelligent. But it doesn't care that it will be blown up along with its target.

Motivation to survive is a decoration on a SIS. Of course it won't likely survive long without it, but it's not to be taken for granted, which makes the question of what happens under self-change very important. It's quite possible to make a change that eliminates the motivation for self-survival. What exactly constitutes survival is a messy topic, so let's just consider this general feature of a SIS, which has applications to personal life as well as to governments, corporations, military organizations, and universities:
Motivations can change or be subverted when self-modifications are made.
This doesn't sound very profound; it's the particular mechanism shown below that is the interesting part. Here's how it works. When we observe our environment, we encode this into some kind of language, specialized to help us understand where we are in relation to our goals. For example, if I stub my toe on external reality, I get a finely-tuned message that informs me immediately that my most recent action was inimical to my goals for self-preservation: it hurts! This pain signal is just like any other bit of information encoded into a custom language: it can be intercepted or subverted. There are medicines and anesthetics that can reduce or completely eliminate the pain signal. Because signals are purely informational, they are always vulnerable to such manipulation by any system that can self-change.

Motivations are closely tied to these signals. It may be a simple correspondence, as with pain, or something abstract that comes from modeling the environment, like fear of illness. Sometimes these come into conflict, as the example below illustrates.
Sometimes I get sleepy driving on the interstate. If I find myself beginning to micro-sleep, I pull off the road and nap for 15 minutes. How is it that my brain can be so dumb as to fall asleep while I'm driving? Something very old in there must be saying "it's comfortable here, there's not much going on, so it's a good time to sleep," in opposition to the more abstract model of the car careening off the road at speed. We can try to interfere with the first signal with caffeine or loud music or opening the windows, or we can just admit that it's better to give in to that motivation for a few minutes in a safer place.
The mechanism for limiting intelligence works like this:
A SIS tries to attain goals by acting so as to optimize encoded signals that correspond to motivations. If it can self-modify, the simplest way to do this is to interfere with the signal itself.
I think it is very natural for a SIS to begin to fail by fooling itself: artificially achieving goals by presenting itself with signals that validate them, even when external reality would disagree.
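Here is the mechanism as a toy decision problem (a Python sketch with invented payoffs and option names). The agent scores its options by the *perceived* signal net of cost, so once self-modification is on the menu, forging the signal dominates fixing reality:

```python
def choose_action(can_self_modify):
    """Toy 'wireheading' choice: the agent optimizes a perceived signal.
    Fixing reality costs real effort; forging the signal costs almost
    nothing. All numbers are invented for illustration."""
    options = {
        "fix reality":  {"perceived": 10, "actual": 10, "cost": 8},
        "forge signal": {"perceived": 10, "actual": 0,  "cost": 1},
    }
    if not can_self_modify:
        del options["forge signal"]  # can't touch its own sensors
    # Pick the option with the best perceived payoff net of cost:
    best = max(options, key=lambda k: options[k]["perceived"] - options[k]["cost"])
    return best, options[best]["actual"]

assert choose_action(False) == ("fix reality", 10)
assert choose_action(True) == ("forge signal", 0)  # looks good, achieves nothing
```

Nothing in the agent's own accounting distinguishes the two outcomes: by its internal signals, forging is strictly better. Only an external view sees that the actual payoff is zero.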
I just finished reading Michael Lewis' The Big Short, which is rife with examples of signal manipulation. Here are a few. 1) The ratings agencies (S&P, Moody's) had two motivations in conflict: generating revenue by getting business rating financial instruments (such as CDOs), and generating accurate ratings. These are in conflict because if they rate something poorly (and perhaps unfairly), they may lose business. The information stream that should have signaled it was a bad idea to repackage high-risk loans into triple-A rated instruments got subverted.
2) According to Lewis, counterparts at Goldman Sachs learned exactly how to tweak the signals in order to get the result they wanted from the bond raters (by manipulating the way risky loans were structured to optimize an average credit rating, for example). 
3) In an example of self-deception, the risk management offices of the investment banks were fooled by the ratings agencies' "credit laundering" and by their own trading desks, which allowed vast liabilities to go unnoticed.
4) The whole economic apparatus of the world largely ignored signs that the system was on the verge of collapse. 
If we imagine that a SIS is continually trying to increase its survival chances, an observation that those probabilities are decreasing instead is obviously bad news. If it can self-modify, it has a choice: accept this unwelcome fact, or interfere with the signal (ignore it, for example).

Alternatively, the internal model of an SIS may associate a potential benefit with a planned act, which is a good thing. Any evidence that this may not work out as intended would decrease the value of the act, and this (also bad) news might be subverted, so that only supporting evidence is accepted. This is usually called confirmation bias in humans.
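The subversion can be caricatured as a belief update that discards disconfirming evidence (a Python sketch; the likelihood ratios are invented for illustration):

```python
def update(belief, evidence_supports, biased):
    """Toy belief update for 'my plan will work' (probability `belief`).
    Honest updating moves with the evidence; the biased version simply
    ignores any disconfirming observation."""
    if biased and not evidence_supports:
        return belief                       # bad news is discarded
    lr = 2.0 if evidence_supports else 0.5  # likelihood ratio of evidence
    odds = belief / (1 - belief) * lr       # update in odds form
    return odds / (1 + odds)

honest = biased = 0.5
for supports in [False, False, True, False]:  # mostly bad news
    honest = update(honest, supports, biased=False)
    biased = update(biased, supports, biased=True)
```

After a run of mostly bad news, the honest belief has fallen well below even, while the biased one has risen, because the single piece of good news was the only evidence it ever admitted.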

It's natural to ask, if this is such a problem, why hasn't civilization already collapsed from ignoring bad news and amplifying good news? The answer, I think, is that humans make up civilization and all its organized systems, and humans can't completely self-modify. Yet. Imagine if you could.

What if every emotional reaction could be consciously tuned through some mental control panel? Want to be happier? Just turn the dial up. Don't like pain? Turn that dial down.

Because humans are actually members of a MIC (that is, an ecology), we are subject to selection pressure from the environment. Viewed as discrete systems, our organizations inherit some of this evolutionary common sense, but it's diluted. Individual humans often have a lot to say about how an organization operates, and can imbue them with denial and confirmation bias. Organizations are easily self-changed, and can't predict how those changes will turn out. I think, however, that certain strategies can ameliorate some of the most self-destructive behaviors. Here they are:
1) Create cultures of intellectual honesty, and actively audit signals, languages, and models to make sure they correspond to what's empirically known, whether it's good news or not. Intellectual honesty should be audited the same way financials are: by an outside agency doing an in-depth review. In the long run this doesn't solve the problem, because any such agency will have the same problems (self-deception, inability to predict the effects of changes, etc.), but it might increase the quality of decision-making in the near term.
2) Be conservative and deliberate about changes to signals, languages, predictive models, and fundamental structure. Audit these continually and transparently. Everyone should know what the motivations are and which signals apply to each. Moreover, 'best practices for survival' should be followed: since much of our learning comes from other systems that failed, this wisdom should be carefully archived and consulted.
These are particularly advisable for organizations that have motivational signals that are difficult or slow to interpret. For enterprises that are very close to objective reality, these measures are less necessary because of the obviousness of the situation. For example, it's hard to argue with the scoreboard in a sporting event. We can close our eyes if we don't like the score, but there's really not much room for misinterpretation. Therefore, one would expect a successful team to be either very lucky or else have good models of reality reflected in their language and signals. The same could be said of military units in active service, traders on a stock exchange, or any other occupation with signals that are hard to interfere with.

Examples in the other direction, where signals are or have been ignored, are the financial crisis already mentioned, the looming disaster of global warming, the eventual end of cheap oil, and human overpopulation. On an individual level, unnecessarily bad diets, lack of exercise, smoking, and so on are examples of abstract survival signals ("the doctor says so") losing out to visceral motivations (e.g., it tastes good), which shows the flaws in our motivational calculus.

I intend in the next post or two to show how this is related to the business of higher education.


  1. I'm not sure what to do with this thought but I thought I would offer it, in case it helped:

    No government has survived for thousands and thousands of years through intentional mutation, but cultures regularly do. Governments have to change intentionally, while cultures change through a kind of collective or subconscious agreement. Now, I don't know if that becomes the ecosystem a government operates in or a competing method of survival.

    To take one example, there are people still out there happily reading and translating The Iliad -- long after the heroic society it is based on has crumbled to dust.

  2. (That should have read system rather than society, to avoid introducing those pesky language issues you mention above.)

  3. Thanks for the comment, Matt. It does get more complicated when we start defining what a system is and what survival for that system actually means. The "I" in SIS means whatever it is (government, culture, etc.) has to actively be parsing the environmental data to try to survive. Otherwise, it's surviving by accident, not intelligence. So if we take a manageable chunk of a culture--say the language--and look at it, it might or might not have "intelligence" behind it. I think you could argue that French does. It would be interesting to look at the most long-lived religions and see what makes them tick. Do you suppose there are remnants of a stone-age mystical belief still around? I don't know.

    Culture sort of gets a free ride on the ecology of humans, like bird nests 'survive' as long as there are birds. As long as we are around, there will be some sort of culture descended from the past. In order to say that could survive for long periods, we'd have to argue that humanity will survive for a long time to come. So we're really talking about the survival of civilization as a whole, I think.

  4. When I was thinking of culture, I was thinking of something more discernible -- say, Chinese culture (to choose the longest-lived) or English culture or American culture. Some get wiped out by contact with other cultures (e.g., the Aztecs) and others mutate (e.g., American culture being primarily derived from British culture). The odd bits are the relics of older cultures (astrology, to take one you mention) that survive after the culture that spawned them died.