Thursday, August 27, 2009

Amplification Amplification

There's something seductive about self-reference--a topic delightfully expounded in Douglas Hofstadter's book Metamagical Themas. Self-reference shook the foundations of math with Russell's Paradox and then Gödel's incompleteness weirdness. But that doesn't stop the fun--by no means! The latest, coolest application I've seen is the RepRap, which is a machine that can build other things, including itself. I suppose that if the conditions are right it will create a whole ecology of replicator replicators. That's what we are too, of course, just squishier and less prone to rust.

This gratuitous introduction is by way of explaining the title. I've spent some time thinking about Zog's Lemma about the amplification of error, and wondering if there's actually anything there. I think there may be, even in the cold light of the fluorescent sun. I will therefore try to amplify on my prior post.

The idea is this:
As economic value increases, for weak predictors the error of prediction can increase over time.
I opened up WinEdit to try a mathematical formulation, but I haven't gotten too far with that yet. The formality may get in the way anyway. We're considering the prediction of some future valuation. The examples will make it clear, but think Prediction = Actual + Error. The last term could be negative. Consider a couple of scenarios.
1. Imagine pulling a dollar bill from your wallet or handbag. How much is it worth, in dollars? One, right? Well, if you got a fiver by mistake, it's worth five bucks. You're not likely to be convinced by a huckster that your fiver is only really worth one and fork it over for four quarters.
Here we have a situation where predictive error is very low and economic value is low too. The actual value of the dollar is realized when you buy something with it. Let's suppose that a fast food burger is a dollar. If you get a single, it was a dollar. If you get five, it had Lincoln on it.
2. Now imagine that you can't tell the difference between a one dollar bill and a five, or a twenty, or a hundred... You have to depend on "experts" for this. The problem is real because you don't want to walk to the burger stand with a hundred dollar bill; you're just not that hungry. This is not entirely hypothetical: US bills are all the same size, so if you're blind this is the way it is. Euros come in different sizes.
Now the accuracy of the predictor has dropped because you can't be sure whom to trust. There's a chance someone will lie to you when you show them the bill and try to take advantage of you.

Both of these scenarios involve relatively low stakes. The competition for your one dollar or five is not as great as if it were a million, right? What happens when we now consider situations where the economic value of the prediction is high?
3. Airplanes are pretty well understood--just watch this. Predicting what will happen under lots of conditions is achievable. Still, things go wrong once in a while. I think that we can agree that this is high stakes--no need to put a dollar figure on a successful landing.
In this situation, what has happened is that the error has gotten smaller over time. That is, by assiduously investigating every prediction failure, we learn how to make better predictions. This is just how science works, of course.

Now for the weird part. Try to imagine situations where the economic value of the predictor is high, but the accuracy isn't so great. What happens now?
4. You want to plant your crops as soon as possible for economic reasons. The shaman says that the gods have decreed that the last freeze will come late because of some indiscretion within the tribe.
This is deadly serious--if your crop gets frosted over, you may not last the next winter through. But the predictor isn't a very good one. What are the dynamics? In this case, the hapless farmer doesn't have many options. Without an anachronistic scientific approach, there is no way to reduce the error. But that's not the end of the story.

The shaman has a vested interest in maintaining credibility, no? If it emerges that he's full of buffalo cakes, the tribe may put him out on his ear. Since P = A + E, and (despite his best efforts) he has no way to influence the actual outcome A, nor can he improve his predictions P, his only margin for improvement is E. But how can that be? True, he (or she) can't actually improve the error, but he can try to make it appear so with mystical mumbo-jumbo of sufficient impressiveness. So his best bet is to create the best fakery he can--double and redouble the pageantry and behavior that's so far out of the norm that it MUST be profound. A successful faker can then get by without paying any attention to the real problem of prediction, and can just get better and better at pretending to.

The sticking point of this equation--what we need actual math for--is whether Herr Shaman could completely abandon accuracy. Maybe he has some residual knowledge passed down about the seasons, the moon, etc. Can the error rate actually naturally increase in such a situation? Only if a unit of effort spent on new and improved humbug pays off better than the same effort spent maintaining the better predictor.
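Here's a toy version of that trade-off in Python, just to make the condition concrete. Everything in it is my own assumption rather than a worked-out model: the shaman has one unit of effort per season, each unit spent on real forecasting knocks accuracy_gain off the real error, each unit spent on pageantry knocks humbug_gain off the error the tribe perceives, and both payoffs are linear. The function names are made up for the illustration.

```python
def shaman_outcome(base_error, accuracy_gain, humbug_gain, humbug_share):
    """Return (real_error, perceived_error) for one season.

    humbug_share is the fraction of effort spent on pageantry (0.0 to 1.0);
    the rest goes into actually maintaining the predictor.
    All quantities are hypothetical, linear stand-ins.
    """
    accuracy_share = 1.0 - humbug_share
    # Real forecasting effort is the only thing that reduces the true error.
    real_error = base_error - accuracy_gain * accuracy_share
    # Pageantry hides error from the tribe without changing it.
    perceived_error = real_error - humbug_gain * humbug_share
    return real_error, perceived_error


def best_humbug_share(base_error, accuracy_gain, humbug_gain, steps=10):
    """Pick the effort split that minimizes the error the tribe perceives."""
    shares = [i / steps for i in range(steps + 1)]
    return min(
        shares,
        key=lambda s: shaman_outcome(base_error, accuracy_gain, humbug_gain, s)[1],
    )


if __name__ == "__main__":
    # Humbug pays better than accuracy: all effort goes to fakery.
    share = best_humbug_share(10.0, accuracy_gain=2.0, humbug_gain=3.0)
    print(share, shaman_outcome(10.0, 2.0, 3.0, share))
    # Accuracy pays better: the shaman keeps working on the real problem.
    share = best_humbug_share(10.0, accuracy_gain=3.0, humbug_gain=2.0)
    print(share, shaman_outcome(10.0, 3.0, 2.0, share))
```

Because the payoffs are linear, the answer is all-or-nothing: whichever kind of effort pays more gets everything. A less crude model with diminishing returns would presumably give a mix, which is where the actual math would come in.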

But what if we can affect the error ourselves?

5. Odysseus wants to sneak his men past the furious cyclops Polyphemus, whom he has just blinded. He and his followers ride underneath sheep so as to fool the giant as he checks who exits by touch. (Homer I ain't.)
The predictor is the cyclops in this high stakes assessment. Odysseus (not being Circe) can't change the fact that his men are men, so the A is immutable. He can only affect the prediction by increasing the error, which he does in fine style. Notice the contrast between this and the previous example. In the previous one, disguising the size of E was done for economic value. Here, E is actually increased as much as possible for the same reason. Odysseus spends a lot of time increasing E, actually.

In common parlance, this is called cheating or manipulation, and it's common even in low-stakes situations as we all know.
But what about the amplification of error I advertised? Glad you asked.

Just imagine a recursive situation like number five above, where there are repeated rounds. Each time, Odysseus tries to sneak by the mutilated son of Poseidon, and each time the cyclops tries to detect him. There are lots of situations like this: they evolve. In biology it's sometimes called a Red Queen Race, after a bit from Through the Looking-Glass about running as fast as you can just to stay in place. So we have a curious effect where the predictor P is precariously balanced against the error E, like a tug of war: not moving much perhaps, but with great forces involved.
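The repeated rounds can be sketched in a few lines of Python. This is a cartoon under my own assumptions, not a model of anything real: each round the deceiver adds a fixed deception_gain to the error E, the detector removes a fixed detection_gain, and E can't drop below zero. The function and parameter names are invented for the sketch.

```python
def red_queen(rounds, deception_gain, detection_gain, error=0.0):
    """Track the error E across repeated deceiver-vs-detector rounds.

    Each round the deceiver pushes E up and the detector pulls it down.
    Returns the list of E values, starting with the initial error.
    Fixed per-round gains are a hypothetical simplification.
    """
    history = [error]
    for _ in range(rounds):
        # Net movement is the difference of two possibly large efforts.
        error = max(0.0, error + deception_gain - detection_gain)
        history.append(error)
    return history


if __name__ == "__main__":
    # Evenly matched sides: lots of effort, no movement in E.
    print(red_queen(5, deception_gain=1.0, detection_gain=1.0))
```

With equal gains the error sits still no matter how large the gains are, which is the tug of war. The two gains are separate parameters, so an uneven contest can be tried as well.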

But there's no reason to assume the playing field is fair. What if the conditions are better for creating E than for eliminating it? That might be the case for the SAT. The chart below was clipped and pasted together from one in Inside Higher Ed yesterday. It lists year-over-year improvements in SAT score averages by income range.

It shows convincingly that more money gets you higher scores. Exactly why that's true is debatable, but remember this is an increase for one year. One hypothesis is that students of wealthier families have more access to test preparation and can pay to take the test more times, and probably have parents who enable all this as well as pay for it. Does all this effort increase the teen's actual ability to succeed in college (let's call it A)? Or is it responding to the economic value of the predictor (the SAT score) by increasing E instead?

If it's the latter case, then we have a pretty snapshot of an imperfect estimator demonstrating prediction error on the increase because of economic forces. This would imply that the psychometricians who create and score the SAT haven't found a way to counter the increased error that's hypothesized. Zog's moment in the sun?
