Sunday, November 21, 2010

Collatz Ecologies

I mentioned the Collatz Conjecture in "On Design" as an example of the qualitative difference between simulation and inverse problem solving. In this article I want to use it for another purpose: to show how structure emerges out of iteration. Specifically, I want to create a very simple model of Darwinian evolution and demonstrate with simulations and mathematical proof that patterns emerge naturally. In a later post I will talk more about what is significant about this, but here's the preview: when stable patterns emerge in some iterated system, it's possible to build new systems on top of the old ones. Moreover, these new systems can be seen as independent of the old ones. The full discussion on that can wait. This is the fun part.

The iteration at the heart of the conjecture is a single branching formula that acts on (usually positive) integers:
$N_{new}:=\left\{ {N_{old}/2\mbox{ if even}\atop (3N_{old}+1)/2\mbox{ if odd}}\right.$
I have used the more compact form of the formula that goes ahead and divides by two in the odd case, since 3N+1 will always give an even number when N is odd. As an example, starting with 8, we get the sequence 8 -> 4 -> 2 -> 1, since all but the last is even (8 is a power of 2). A more interesting example is 3 -> 5 -> 8 -> 4 -> 2 -> 1. The unproven conjecture is that any starting number eventually ends up at one. If the conjecture is not true, then for some starting N, either the sequence grows without bound, or it forms a repeating loop. The Wikipedia page has a nice summary of what is known.
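For readers who want to follow along, here is a minimal sketch of the iterator in Python (the script linked at the bottom of this post is in Perl; this is just an illustrative translation, with function names of my own choosing):

    def collatz_step(n):
        """One step of the compact iterator: n/2 if even, (3n+1)/2 if odd."""
        return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

    def trajectory(n):
        """Follow a starting number until it reaches 1 (assuming the conjecture holds)."""
        path = [n]
        while n != 1:
            n = collatz_step(n)
            path.append(n)
        return path

    print(trajectory(8))   # [8, 4, 2, 1]
    print(trajectory(3))   # [3, 5, 8, 4, 2, 1]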

Artificial Life is the use of computer simulations to understand biological-like behavior. Conway's Game of Life is one of the best known. For more on the general topic see the Wiki on ALife. For our purposes here, I want to use the Collatz iterator to construct a population that is subject to Darwinian evolution. For that we need these components:

  1. An initial state from which to begin the simulation. In practice this will be an individual species, identified with an odd integer 3,5,7...
  2. A simulation of change over time. This will be the Collatz iterator acting on the numerical species.
  3. A fitness function. If a species "evolves to" 1 via the iteration, it is eliminated from the population. The Collatz Conjecture in this context is that all species eventually go extinct.
  4. Reproduction and variation. For any odd numbered species, when it is transformed by the iterator, an imperfect copy is also created. The copy is the species N-2, where N is the new number of the original after iteration. For example, when 11 -> (3*11+1)/2 = 17, the species 17 - 2 = 15 is also added to the population.
  5. A carrying capacity C. When the population comprises C species already, reproduction is paused until a vacancy opens up through some species going extinct.
For my purposes I don't care how many copies of an individual species exist, since they will all share the same deterministic fate. It also helps keep the computation cost down. So if 17 already exists in the population, and another 17 comes along, I don't create two copies of 17.
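Here is a minimal sketch of one generation of the simulation in Python, under the rules above. The exact moment at which the capacity check is applied (and the overflow cap mentioned in the note further down) are details where this sketch may differ from the Perl script linked at the bottom:

    def step_ecology(species, capacity=100):
        """Advance a set of species one generation: apply the Collatz iterator,
        drop anything that reaches 1, and add the N-2 mutant copy for odd
        parents while the population is below the carrying capacity."""
        next_gen = set()
        for n in species:
            if n == 1:
                continue                      # reached 1: extinct
            m = n // 2 if n % 2 == 0 else (3 * n + 1) // 2
            next_gen.add(m)
            if n % 2 == 1 and len(next_gen) < capacity:
                next_gen.add(m - 2)           # imperfect copy of the offspring
        return next_gen

    eco = {9}
    for generation in range(20):
        eco = step_ecology(eco)
        print(generation + 1, len(eco))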

Initial Results. Alife sims are just plain fun to play with. I include my code at the bottom of this post in case you want to try it yourself. For what follows, the carrying capacity is set to 100. Here are the first seven starting species charted over 20 generations.

We see exponential growth capped at C except for N=3 and N=5 (they overlap), which form the line at the bottom. What's going on there?

A little investigation shows that the population {2,3,4,5,6,8} forms a closed "ecosystem" that is generated from starting with 3 or 5 (or 6 if we include evens). Moreover, this system follows a stable pattern that will go on forever, showing that there exist Collatz ecologies that never go extinct. In fact, if any other N-ecology ever generates 3 or 5, it will create this pattern as well, and so will also live forever. This gives us:
Conjecture 1: All odd N-ecologies with N > 1 survive forever for sufficiently large C
I hedged a bit there, but I don't think C needs to be very large at all. Another question, which we can frame affirmatively as a conjecture, is:
Conjecture 2: There is only one bounded ecosystem for odd N > 1.
Here bounded means that the largest species encountered doesn't grow beyond a ceiling. It looks like the other graphs are bounded too, because they run into C, but this is only true for the number of species at any one time. The individual species identifications continue to grow. We'll look at that next.

The graph below shows the population of N=9 after 16 generations. It has been capped by C, and so the number of species will stay about 100, but which ones those are will continue to change. The individual species numbers are shown by the heights of the bars. The x-axis is unimportant--it just arranges the species from smallest to largest. Each bar is a distinct species. The four highest ones are {1893,1895,1896,1898}.





Obviously there's a lot of structure here. If we plot the populations of all 16 generations together, we see a consistent pattern over time. Each generation has its own color:

The quartet we saw as the top plateau for the population seems to be an enduring structure. Moreover, this pattern is visible in the other populations created from N = 9, 13, and 25, for example:


The graph shows the 16th generation of each, when N=9 and 25 are just hitting the carrying capacity. The quartet and plateau patterns are obvious in all three.

A note about the above graphs. I was constantly fiddling with the program, running different numbers of generations and messing around with how reproduction gets handled. The final version of the program kills any number that attempts to grow beyond a billion, in order to prevent integer overflows. So if you run the program at the bottom and compare it to these you'll see minor differences at the upper end. The graphs above were generated without that constraint, but no overflows actually happened in the data represented.

The Quartets are not hard to explain. Whenever N begets another two odd numbers by the action of the iterator and mutator/reproducer, a quartet will be formed. And because the "two odds in a row" means that the starting number is multiplied by essentially (3/2)(3/2) = 2.25, it and its three new kinfolk will leapfrog over the competition. Suppose for some odd N we get another odd number (plus new variation). This will happen when $(3N+1)/2 = 2k+1$ for some $k=1,2,...$. Solving for $N$ gives $N=(4k+1)/3$. This fraction is only an integer when $k$ is of the form $k=3m+2, m=0,1,...$ Plugging all this back in, we find that $N=4m+3, m=0,1,...$, so $N=3,7,11,15,19,...$ all have the property that they are odd numbers that also yield new odd numbers under iteration. Concretely, $N=4m+3$ iterates to the odd number $6m+5$ with mutant copy $6m+3$, and iterating that pair in turn produces $9m+8$ (copy $9m+6$) and $9m+5$ (copy $9m+3$). So the quartet, in terms of $m$, is $9m+8, 9m+6, 9m+5, 9m+3$, which is exactly the pattern we observe in the simulations; the quartet {1893, 1895, 1896, 1898} above is the case $m=210$.
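The derivation is easy to check numerically. A quick sketch that applies the iterate-and-copy rule twice to $N=4m+3$ and compares the result against the predicted quartet:

    def step(n):
        return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

    def offspring(n):
        """For an odd species n, return (iterated value, its N-2 mutant copy)."""
        child = step(n)
        return child, child - 2

    for m in range(10):
        n = 4 * m + 3
        a, b = offspring(n)                        # 6m+5 and 6m+3, both odd
        quartet = sorted(offspring(a) + offspring(b))
        assert quartet == [9*m + 3, 9*m + 5, 9*m + 6, 9*m + 8]
        print(n, "->", quartet)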

Plotting the population of N=7 over 100 generations against a log scale shows how the shape of the ecology changes over time. The lines of dots sloping up represent the constant log(3/2) growth of the odds. The dots seem to pretty much cover all the integers (not all at once, but eventually). That gives us a third question:
Conjecture 3: Except for N = 1, 3, or 5, the ecology for odd N eventually reaches any given positive integer.
As supporting evidence (certainly not proof!), if we look at N=7,9,...99 and locate the missing integers after 1000 generations, we see that every one has swept up the integers through 250.

The horizontal axis is N (odd numbers from 7 to 99), and the vertical axis locates the missing value: each dot shows an integer that was missed by the N-ecology after 1000 generations. The distribution of these misses, inclusive of all the populations above, shows some structure:

Is there some crazy number out there that isn't reachable by a given N-ecology? Does the carrying capacity change this one way or the other, compared to straight exponential growth?

Behavior at the point of capacity shows another emergent pattern. We see this in the number of species in an N-ecology after it faces the limitation of C, and new species are created much more slowly. We would expect that larger Ns have lower extinction rates once population size C is reached. This is because in order for a species to die, it has to land on a power of two, after which it zips to 1 and goes extinct. Powers of two are distributed more sparsely (on a linear scale) as N increases. Of course, there is only one "2" in the population at any given time, so the actual extinction rate due to the 2 -> 1 evolution is either zero or one each generation. The graph below shows 1000 generations, averaging the odd Ns from 7 to 99, giving the demand for new population, the actual number created, and the extinction rate. The first of these would be expected to be half the population, since the odds get reproduced, and indeed we see the line bounces around 50. The extinction events come from 2->1 and from the overflow limit, when a number tries to grow beyond a billion.
The red line is more interesting. It shows the actual number of slots that opened up for new population. If we subtract out the ones we can account for due to 2->1 or overflow, we get a constant average of about 5.4. This must be the rate at which two species turn into one species due to the action of the iterator. In other words
$(3N_1 + 1)/2 = N_2 / 2$ or $3N_1 + 1 = N_2$. 
How likely is it that $N_2 \mod 3 = 1$? Should be one third, but we have to remember that $N_1$ is odd and $N_2$ is even, so for positive integers $n_1$ and $n_2$ we have $3(2n_1+1)+1 = 2n_2$, which gives us $3n_1 + 2 = n_2$. That doesn't change the odds. So if all integers were in the population at the same time, we'd expect a third of them to join up after iteration, decreasing the new generation's size by one for each such pair. Given that we have to account for an actual average loss of 5.4, this seems to imply that at any given time, the ecology is 16% saturated (5.4/33 ≈ 16%). At 100% saturation, all integers are present, and one third will disappear due to collisions. If this analysis is right, is this saturation level really constant over time? It seems unlikely.
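One way to check that accounting directly (a sketch; it counts, in a single population snapshot, the odd/even pairs that will merge on the next step because $3N_1 + 1 = N_2$):

    def count_collisions(species):
        """Count odd species n whose partner 3n+1 is also present; each such
        pair maps to the same number (3n+1)/2 on the next iteration."""
        s = set(species)
        return sum(1 for n in s if n % 2 == 1 and 3 * n + 1 in s)

Averaging this count over the generations of a long run should come out near the 5.4 figure if the collision account above is right.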

Reference. I discovered by Googling around that Hiroki Sayama had a similar idea to use the Collatz sequence to study artificial life. His version is completely different from mine, and you can find it here.

The code. [download perl script from snipplr]


Edit: I made a couple of minor edits to two sentences after posting.

Saturday, November 20, 2010

The Long View

In "On Design," I gave three versions of designing for an outcome: soft, forward, and inverse, in increasing degree of difficulty. The question "how difficult is inverse design?" is of utmost importance when we consider complex systems. As a real example, consider how the government of the United States is "designed." By this I mean the way laws and policies are created and enforced. Because of conflicting goals of different constituents and the inherent difficulty of the project, there is no complete empirical language to describe a state of affairs, let alone do a forward simulation to see the status of the nation in, say, three years. And even if we did have such a language, we would be limited to simulation and prediction of only the "easiest" parameters. And even these would be subject to the whims of entropy, a subject I'll take up later.

I submit that individual governments, as well as companies, universities, and militaries use soft design with bits and pieces of forward and inverse design thrown in (for example trying to forecast near term economic conditions to help determine monetary policy).

Government in the general sense has been "designed" by the process of being tossed into the blender of fate, to be tested by real events. See the following video of the history of Europe to see what I mean.

Natural selection would seem to be at work here, weeding out the worst designs. But it's not Darwinian because the countries change rapidly over time with the population and minds of leaders. One truly spectacularly bad idea (like invading Russia, apparently) can bring down a whole nation. So what we are left with is a very temporary list of "least bad designs." Of course, many other factors are important, such as geography, natural resources, and so on. Even so, the people who live there still have to make use of such advantages. If Switzerland abandoned its natural mountain fortress and invaded Russia, it likely wouldn't end well.

Darwinian evolution is different from this national evolution. In the former, good solutions can be remembered and reused through genes or any other information passed from generation to generation. Diversity is created through recombination, mutation, population isolation, and so on. Darwinian evolution comes with an empirical language that we partly understand. To make a metaphor of it, "programs" are written in phenotypes and these "are computed by" the laws of physics and chemistry using the design and environment as "inputs." The fact that scientists can discover this language and use it to make predictions should be appreciated for the miracle that it is: we are witnesses to a dynamic but understandable problem-solving machine of enormous scope that has worked spectacularly well at producing viable designs with only an empirical language. Evolution does not use predictive techniques (that is, anticipating that a critter will need wings and therefore building them--for a dramatic example of this, see this video). But there do exist creatures who do use forward and inverse design to plan their day. If you throw a ball at a target, you're predicting. If you go to the fridge to get food, you're using the inverse technique: starting with the outcome (get food) and working back to the solution. But this still isn't good enough.

Here's the rub: forward design isn't enough to guarantee any more than short term outcomes, and our ability to do inverse thinking is very limited. Let me pose a problem:
What present actions will lead to [insert subject]  existing in a healthy state 100 years from now? 1000 years from now? 10,000 years from now?
The question is posed as an inverse design problem, starting with the goal and asking what needs to be done to achieve that goal. If we had a good language with which to describe the states, we could at least imagine an evolutionary approach, shown in the diagram below, with the red dot being the desired goal.


We're asking "where do we need to be NOW in order to end up where we want to be AFTER?" The forward design approach is to simulate lots of "befores" and see where they end up "after," and choose the best solution we can find. If we are lucky, we can use a Darwinian approach, combining partially successful solutions or tweaking "near misses" to home in on something better. This depends on the sort of problem we're trying to solve, and specifically whether or not it is continuous in the right way. Sometimes being close isn't any good--those "a miss is as good as a mile" problems. All of this highlights the importance of the empirical language, which must have rules precise and reliable enough to allow this kind of prediction and analysis. Clearly in the case of governments, companies, universities, or even our own selves, this is not possible.

With only forward design techniques and lacking a good empirical language, we can still solve the problem with massive brute force: by actually creating a host of alternatives and seeing what happens in real time. A computer game company could do this, for example. Rather than spending a lot of money to find the bugs in its game, it could just begin selling it with the knowledge that the game will be reproduced on many different systems and display many kinds of problems. With this data in hand, it can begin to debug. This seems to be a real strategy. This sort of solution obviously won't work for a government, although in a democracy we have a non-parallel version: swapping out one set of leadership for another routinely, to see what works best (in theory at least).

Paying it Backward. What would it look like if we had all the tools to solve the inverse problem? We could pose it precisely, work forward simulations like the one pictured above, and we would have a huge advantage, shown in the graphic below.
Again, the red dot is where we've decided we want to be after a while. The inverse solver shows us all of our possible starting places in the now. There are multiple ones if we have not completely specified the eventual outcome, which tells us what opportunities we have now to optimize other things than the one we were thinking about when we posed the problem.
Example: An artillery officer is given the task of attacking a distant building. The locations of his guns and the target are fixed. Using ballistics equations we can work backwards to show all the possible solutions: e.g., a high arcing shell like a mortar, or a flat trajectory. This decision will determine how much powder is used.
The inverse solver illuminates "free" parameters and allows us to customize our solution.
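To make the artillery example concrete, here is a minimal sketch under the textbook no-drag model, in which the range is $R = v^2 \sin(2\theta)/g$. Inverting for the launch angle $\theta$ at a fixed muzzle speed gives the two firing solutions mentioned above:

    import math

    def launch_angles(target_range, speed, g=9.81):
        """Invert R = v^2 sin(2*theta)/g for theta. Returns the flat and the
        high-arc solutions in degrees, or None if the target is out of reach."""
        s = g * target_range / speed ** 2
        if s > 1:
            return None
        low = 0.5 * math.asin(s)
        return math.degrees(low), math.degrees(math.pi / 2 - low)

    print(launch_angles(target_range=1000, speed=150))   # roughly (12.9, 77.1)

Here the muzzle speed (the powder charge) is held fixed and the angle is the unknown; one could just as easily fix the angle and invert for the speed. That choice is exactly the kind of "free" parameter the inverse view exposes.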

Conclusions. For the real-world messy problems we face in complex human organizations, it's fair to say that we are not very good at long-term planning. In some cases it may be impossible because the forward simulators simply don't exist--there are no reliable cause and effect mechanisms. Trying to predict the stock market might fall into that category. In other areas, short-term goals are given preference over long term goals. This is understandable for at least two reasons. First, the further out the prediction, the more likely it is to be wrong. Second, as individual humans our ability to affect events is confined to a narrow window (e.g. a term for a legislator), and most of us want positive feedback now, not 1000 years from now.

As a very real example of our collective difficulty with long-term planning, consider the issue of human-caused climate change. With the terminology given in these two posts, it's easy to dissect the arguments:
  • Empirical language. The basic facts about temperature change and CO2 levels are accepted by the scientific community, but still debated as a political matter. 
  • Forward design: Cause-and-effect mechanisms are challenged in the political discourse, so computer simulations are called into question. Unaccounted-for causes are conjured to explain away data that is (to some extent) agreed on.
  • Inverse design: Prescriptions about what to do now to affect the future climate are attacked as being too detrimental to the present, and/or useless.
The question of what happens to the climate is scientific--this is the arena where we are best at design. If we can't understand and act on such threats intelligently, it's very hard to make the argument that we have any long-term planning ability collectively. Note that I'm not neutral on this particular question. The evidence is overwhelming that the risk to our descendants is very high. But go read the experts at RealClimate.org.

Higher Education. This isn't a climate change blog; what does this analysis have to do with your day to day job in the academy? Everything. From the oracle at Delphi: "Know Thyself." What functional aspects of your institution are understood? How many are understood well enough to make predictions? How many are understood well enough to make inverse predictions?

We work backwards all the time. Suppose budgetary concerns have pushed up freshman enrollment targets, so the admissions office is tasked with bringing in 1000 new students for the fall. This is posed as an inverse problem, and with the standard empirical language of admissions we can build basic predictors. This is the "admissions funnel" prospects->applicants->accepted->enrolled (this is the simplified version). We usually have historical data on the conversion rates between these stages. If the universe is kind, we can use Darwinian methods--keep what works, assuming what works last time will work again, tweak to see if we can make it better. If there's a significant discontinuity (suppose state grants suddenly dried up, or we have a new competitor) the old solutions may not work anymore.
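As a toy version of that inverse calculation, using the funnel language with made-up conversion rates (not real admissions data):

    def prospects_needed(target_enrolled, rates):
        """Work the admissions funnel backwards by dividing the enrollment
        target by each stage's historical conversion rate."""
        n = target_enrolled
        for stage, rate in reversed(rates):
            n /= rate
            print(f"{stage}: need about {n:,.0f} entering this stage")
        return n

    # hypothetical conversion rates between successive funnel stages
    funnel = [("prospects -> applicants", 0.15),
              ("applicants -> accepted",  0.60),
              ("accepted -> enrolled",    0.35)]

    prospects_needed(1000, funnel)   # ~2,857 accepted, ~4,762 applicants, ~31,746 prospects

If the historical rates hold, the funnel converts the inverse question (how to end up with 1000 enrolled) into concrete forward targets at each earlier stage.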

We can and should work hard to create an empirical language and use it to simulate our futures, trying to end up in a good one. This isn't enough.

Short term optimization leads to long-term optimization sometimes, but not as a rule. Think of it as a maze you're trying to escape. If at every junction you choose a path that takes you closer to the opposite wall of the maze, you may make smart moves in the short term only to discover there's no exit there: a long-term failure.
(original image by Tiger Pixel)
Even though we can't solve the inverse problem entirely, we may find that we have enough empirical vocabulary to make some important decisions. What do we want the demographics of the student body to look like in 10 years? Answering this question about the future puts constraints on how we operate now, even if they are fuzzy and inexact. Forget about solving the problem exactly--that's impossible--and think about what constraints are imposed in the big picture. Do we want a strong research program, or a large endowment, a national reputation for X? More idealistically, what do we want to be able to say about our alumni in a decade or two? How much does their success matter, and what sort of success are we talking about? Work backwards. I'll finish with an example.

Example: Student Success. If we focus on a goal for our ultimate accomplishment: the education of students, what does this reveal? How exactly do we want our students to benefit from their education? Some possibilities:

  • Happiness with life
  • Being good citizens
  • Being successful financially
  • Being loyal to their alma mater
  • Getting a job right out of school
The overwhelming narrative in the public discourse is that we want students to be "globally competitive" and "get well-paying jobs." But that's a means to some end. What is the ultimate aim? On a national scale, the answer might be better national security and a stronger economy. For an institution, the answer might be that we want our products strongly identified with our brand, or that we want them to donate lots of money in the annual campaign. I think it's important to start there and ask "why do we care?" You could follow that up with lots of activities, like telling the students themselves why you care, if that's appropriate.  

Suppose that we care because we want the institution to be able to rely on an international body of alumni who will contribute back in money, connections, expertise, and other intangibles. This will enable the university to grow by presenting global options that may not be apparent now, but also establish an exponentially-growing revenue stream through an expanding endowment as successful alumni give large gifts back to the institution. Working backwards, we might identify several tracks to success:
  • The state department track--prepare students for high-level international government positions
  • The military track--ditto for the military
  • The global entrepreneur track--help them achieve independence
  • The big corporate track--give students the skills to compete and succeed within vast multi-nationals
  • The wildcard track--for those students who don't fit the mold, are intelligent and creative, but don't want to be entrepreneurs or work in a cubicle. This could include scientists, philosophers, and artists of all stripes.
If we keep working backwards, even without exact solutions, we can make some good guesses as to the curriculum each track needs, and the type of faculty mentors we need. This neatly sidesteps the drift toward vocational education that the public narrative implies, and gives the institution a raison d'être. There's nothing wrong with telling students "we want you to succeed so that you'll help us succeed." This sort of pseudo-altruism is what keeps the population going, after all. Thanks, mom and dad.

Think Backwards. Short term forward planning is like beer: it's obviously a good idea at the time, but watch out for the hangover.

Thursday, November 18, 2010

On Design

A couple of days ago I had occasion to think about the difference between finding solutions randomly or by design. You might not think that random searching is the best way to find something, but this is essentially what has produced the stunning variety of life on Earth over the last three and a half billion years, especially in the last 600 million or so. This evolutionary approach is characterized by massive trial and error (usually in parallel) with success, or fitness, ascertained by the system. Unfit results are thrown away. Fit results are tinkered with randomly to create new experiments.

Evolutionary trial and error is a good way to solve a complex problem, as I tell my students. Yesterday we did a big review of all our integration techniques in Calculus 2, and one of the problems was $\int\sin x\cos x\,dx$. The official way to do this is by substitution, but someone yelled out to use integration by parts. I always try to follow these suggestions to the bitter end, even if I know they'll fail, because I want them to get used to the idea of trying things that don't work: it's at the heart of the creative/inductive approach to solving problems, the way evolution does. Even if your calculus skills are rusty, you can appreciate the following sequence.

Following the student's suggestion, we get $\int\sin x\cos xdx=\sin^{2}x-\int\sin x\cos xdx$

Now notice something cool--the original problem reappeared on the right side. If we move it to the other side to double up and then divide by two we get the solution:

$\int\sin x\cos xdx=\frac{\sin^{2}x}{2}+C$

In the process, the students discovered another useful trick for solving integrals. It's so cute they wanted to try it on the other problems (none of which worked).

Trial and error leads us down all sorts of unexpected pathways. Organizing this idea to work on computers is done routinely, including genetic programming and simulated annealing. More speculatively, artificial life simulations attempt to study biological-type interactions between agents in a simplified computational setting.
The original Tierra artificial life simulation shown above. Image courtesy of Wikipedia. (I have my own version of this that I use to research survival of complex systems. You can find the original code here.)

So what about design? The word is used for different purposes, but in this case I don't mean art design, like creating a new logo for your company, but designing for a function. Abstractly, imagine we want to create a design D of some object or process so that we obtain some attribute A. It might be a bridge that can support 100 tons at once, or a liberal arts curriculum designed to create open-minded graduates. There's a clear task at hand and a clear goal at the end. I propose that for the purest form of design to occur, and to guarantee a successful outcome, we need the following positivist ingredients:

  1. Language. Definitions of the task and the outcome in unambiguous terms and in a common "empirical" language. By that I mean that we can take precise descriptions in this language and actually implement them without room for error. Think of specifications on a mechanical gear, for example.
  2. Forward Analysis or Simulation. A way to deterministically take the specification D and ascertain whether A applies. For example, we could take the plans for the bridge, with materials and all specifications considered, and determine if it can support the desired weight load. With 1 and 2 together we can use evolutionary methods to find solutions. Perhaps not the overall best solution, but it's a good way to search.
  3. Backward Analysis or Inverse Prediction. Here we take the attribute and work backwards to get the design. This is much faster than trial and error using forward analysis, but it requires a much deeper knowledge of the subject.
As a simple example, if I want an exponential function to model the growth rate of some phenomenon I'm studying, I can specify the problem precisely.
  1. Description. At time 0 the population is 20 units. At time 10 it is 35 units. Find the exponential function that exactly fits this data.
  2. Simulation. I can solve the forward problem for any candidate solution. If someone proposes $P=10e^{.3t}$, we can just let t=10 and compute the population to be about 201 at time 10--not close to the observed value.
  3. Inverse Prediction. Because this problem is well-understood, I can solve the backwards problem and design a solution: $P=20e^{.056t}$. We can go back to the forward analysis and simulate the population at time 10 to be modeled as 35.01.... We can get as close as precision dictates to the measurements. (A short numerical sketch of both directions follows this list.)
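Here is that numerical sketch, assuming the model $P = P_0 e^{kt}$ from the description:

    import math

    P0, P_target, t1 = 20.0, 35.0, 10.0

    # inverse prediction: solve 35 = 20 * exp(10k) for the growth rate k
    k = math.log(P_target / P0) / t1       # about 0.056

    # forward simulation: plug the designed k back in and check the fit
    print(k, P0 * math.exp(k * t1))        # ~0.05596  35.0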
Backward analysis or inverse prediction (my terms) is very powerful. If I don't have enough information to determine a design, the inverter will tell me that D isn't unique given the constraints. This inverse predictor grants us a deep understanding of the structure of the problem. It's also very hard to come by.

For convenience, let me attempt definitions that solidify the foregoing discussion and delineate the different ways of going from specification to design.
  • Soft Design. The language is not empirical, so outcomes are not certain and cannot be reliably simulated; design proceeds as a D -> A evolutionary approach based on experience or preference. Example: designing a new letterhead for your company.
  • Forward Design. This requires an empirical language that permits simulation of D -> A. An evolutionary approach guided by experience homes in on a reasonable solution. Example: designing a lawn irrigation system.
  • Inverse Design. Successfully determines A -> D through proven deterministic rules that completely model the attribute A and design constraints. Example: designing a computer algorithm to invert a matrix.

Finding inverses such that we can compute A->D is hard. There is no general way of doing it--this is guaranteed by Rice's Theorem--so solving the Inverse Design problem is itself a Forward Design problem. What?

Let me try that again: in subjects where we don't already have a workable theory, the only way to find one is with an empirical language and trial and error. Of course, if there are connections to known theories like physics, we can get a good head start. But one doesn't have to find complicated examples. Here's a simple one:
The Collatz Conjecture is easy to state. Take any positive whole number, like 34. If it's even, divide by two. Otherwise multiply by three and add one. Repeat this process. The conjecture is that you always end up back at one. No one knows if it is true. If you take a moment to look at the Wikipedia page linked above you'll find that an amazing number of mathematical tools have been tried on it to no avail. This is evidence of an evolutionary Forward Design process that's trying to crack the problem. Assuming it's solved one day, it will create a body of theory that broadens the scope of the Inverse Design problems we can solve.
Most problem domains do not enjoy the complete ability to do inverse prediction. I would go further and say that most of the design work done on practical problems is soft or forward design, with bits of inverse design thrown in here and there for the most well-understood and well-defined problem types. 

"Think like the designer" is something I often tell my daughter when she's working with a computer program or trying to find a water fountain in a building, or otherwise in the middle of someone else's design. What were they trying to achieve, and what constraints did they face? This is an informal attempt at inverse thinking on the fly. Sometimes it works: water fountains are often near the bathrooms because it saves on plumbing.

Applying this to learning outcomes assessment is straightforward. We usually don't have good definitions of what it is we're trying to achieve, so most course design, assessment design, program design, curriculum design, and so on is soft. In cases where, e.g., standardized testing attempts to pin down the definition of "learning," there is not only the problem of the validity of those tests, but also the top-down mentality that goes with it. It seems to me that there is an assumption that good definitions are enough to grant us the ability to do inverse design. This is obviously absurd: we have good measurements of stock market data, but that doesn't guarantee we can pick winners.

Even if standardized tests are a great definition of learning, one would still have to apply a forward design process, employing trial and error, to find best practices. In higher education, there are many obstacles to this, and because of demographic differences, comparisons across institutions are not automatically valid. It's a hard problem. For my part, I'm not satisfied that the empirical language is developed well enough to even dream of an inverse design. The creation of a good empirical language over time goes hand in hand with the solution to the forward design problem, which is itself a forward design problem...

Next: "The Long View"

Tuesday, November 16, 2010

Rongness

Wolfgang Pauli famously had little tolerance for error. The Wiki page on him puts it like this:
Regarding physics, Pauli was famously a perfectionist. This extended not just to his own work, but also to the work of his colleagues. As a result, he became known in the physics community as the "conscience of physics," the critic to whom his colleagues were accountable. He could be scathing in his dismissal of any theory he found lacking, often labelling it ganz falsch, utterly wrong.
However, this was not his most severe criticism, which he reserved for theories or theses so unclearly presented as to be untestable or unevaluatable and, thus, not properly belonging within the realm of science, even though posing as such. They were worse than wrong because they could not be proven wrong. Famously, he once said of such an unclear paper: Das ist nicht nur nicht richtig, es ist nicht einmal falsch! "Not only is it not right, it's not even wrong!"
This "not even wrong" quote shows up all over, and is even the title of a book about string theory:


The publisher plays with the idea by typesetting "wrong" backwards on the cover. I have a different take on this that I used in my (unpublished) novella Canman:

The System is never rong.
--System Documentation v1.0

It looks ironic because it appears to be a typo, but "rong" is deliberately spelled like that. Literally, "rong" is "not even wrong." It's pronounced with a long 'o' sound, so that you can distinguish the two words.

The original idea was that if you use the right approach but make a mistake you can get the wrong answer. But if you use a method that can't possibly work, you might accidentally get the correct answer once in a while, but it's still rong. Astronomy may give you the wrong distance to a remote galaxy, but astrology will lead you to rong conclusions. The idea that the Sun orbits the Earth is wrong, but the idea that Apollo pulls it around in a chariot is rong.

I think this is a useful term because it grounds any subsequent discussion. That is, it identifies the particulars of the argument as the issue (potential wrongness), or the whole methodology as the issue (potential rongness). 

Of course, this opens the door for more meta-levels of error. One could propose "roong" to mean "not even rong," and "rooong" to mean "not even roong," and philosophers could then debate the degree of rongness of a particular idea.

Sunday, November 14, 2010

Identity Rental

Identity theft sounds bad, but does it have to be? What if someone stole access to your bank account and started depositing money there, making wise investments, and never taking a penny out? The point is, however, that the "theft" part doesn't refer to what is done after the assumption of identity, but the mere fact of donning it. I have a Facebook page mostly so someone else can't set one up for me (well, they could, but it would at least compete with the real one).

But sometimes we don't want to fill our own shoes. Wouldn't you like someone else to assume your identity for the purpose of paying your rent or house payment? That would be handy. Heck, imagine if you had a clone who could go to work for you. Essentially, this happens all the time: anytime something is done "in the name of" someone else, it's identity lending. So if the president sends a minion to your office with a "Simon Says" letter, the former's identity is lent to the latter. Identity has economic value.

In the academy one of the cardinal sins (in addition to voting against your own motion) is to let someone else assume your identity and speak for you. This is because the pecking order of scholarship depends on your own, presumably authentic, record. I have chosen a complicated way to describe plagiarism, but there's a reason for it. Consider the following quote from this Nov. 12th article in The Chronicle entitled "The Shadow Scholar":
I've written toward a master's degree in cognitive psychology, a Ph.D. in sociology, and a handful of postgraduate credits in international diplomacy. I've worked on bachelor's degrees in hospitality, business administration, and accounting. I've written for courses in history, cinema, labor relations, pharmacology, theology, sports management, maritime security, airline services, sustainability, municipal budgeting, marketing, philosophy, ethics, Eastern religion, postmodern architecture, anthropology, literature, and public administration. I've attended three dozen online universities. I've completed 12 graduate theses of 50 pages or more. All for someone else.
This is intended to outrage profs of all stripes, I'm sure. Underneath all the fuss is an assumption about the nature of the physical universe--that past performance predicts future performance. If a student buys a paper and passes it off as their own, it advertises a level of performance that doesn't really exist, and so if we give that student a diploma, the predictive value of that script is diminished. From the student's perspective, this may be a good investment, if at some ethical cost.

The inductive premise (i.e. that the paper predicts actual performance) fails on at least two grounds. First, the fact that a student has the resources to buy a paper now doesn't mean the same will hold true later. This is weak, however. Supply would quickly meet demand if there were actually a lot of call for such things post-graduation, and the cost would drop. In other words, if one were required to write term papers for one's occupation, these could probably be outsourced to India or something, for cheap. The other objection is more serious--the connection between writing a thesis and being able to think is real, and it's quite possible that the nice essay that was bought from a service does not reflect the student's ability to communicate or think. So the university may mistakenly graduate a student who can't write well or can't put thoughts in order.

What the academy is attempting (and apparently failing) to do is fix an identity to a track record of success. This is a very simplistic notion to begin with, and probably needs an overhaul. The government is interested too: see my post on "FUD." Consider this quote from the article:
[T]he lazy rich student will know exactly what he wants. He is poised for a life of paying others and telling them what to do. Indeed, he is acquiring all the skills he needs to stay on top.
Let's follow that to the logical conclusion. Suppose the lazy rich person (now: LaRP) can afford to outsource everything in his life. Heck, you don't have to be rich: people already outsource all kinds of things in normal life. See "Enlightened Outsourcing" or Tim Ferris's book The 4-Hour Work Week. But suppose that you could outsource everything you do at your job, and still turn a profit based on your salary. This wouldn't consume your whole day because other people are doing most of the job. So you could hold down more than  one job, maybe at the same company. This violates most preconceptions about what it means "to work somewhere," but whatever--it's a new century. As long as the job gets done, what difference does it make?

This sounds cynical, and certainly this attitude wouldn't find many takers within the academy, but imagine a Track B. Instead of actually doing your homework and taking tests and whatnot, you manage to get it all accomplished by third parties. The bar would probably have to be higher, since if you compare to what a Track A ("honest") student has to do to accomplish all this alone, the Track B student has more time on his or her hands. So double the work load. Or triple it. And don't hide the fact that everything is outsourced--the point is not that the Track B student can do differential equations, but that they can get it done by someone else. How many railroad ties did J.P. Morgan lay himself? Not many, I bet.

That won't happen. But it should be easy enough to detect exceptional performance from students. Something like FACS scores could be used to ascertain a baseline of thinking and communication skills, and partial and complete work could be compared against that benchmark. This actually happened to me once. Our whole senior high school class was bussed off to Rend Lake College, about an hour from my hometown of Pinckneyville, Illinois. We spent the day taking tests on various subjects, and I ended up getting the high score in math and tied for the high score in physics (trusting 30-year-old memories here, so caveat emptor). But the word quickly circulated that the other guy who did well on the physics test (from a different school) was not very good at academics. He had just sat next to the proctor, who happened to foolishly leave the test key sitting there. Whether this is true or not, I don't know, but it illustrates the finely-honed human ability to judge ability and compare it against performance. This is why women sometimes look at my shoes when I'm out with my wife: they see us together and think she's too good for me, so they look down to see if the explanation can be found in expensive footwear--a signal of wealth. That may be more than you wanted to know. :-)

More: here's another post on identity-for-sale.

Update: If you thought this issue was simple, check out this article in TechDirt: 200 Students Admit To 'Cheating' On Exam... But Bigger Question Is If It Was Really Cheating Or Studying

Wednesday, November 10, 2010

SACS: QEP Change

Last week at the NCICU meeting hosted at Guilford College, we were treated to a very nice lunch. (You can see where my priorities are.) We also heard presentations on finances, institutional effectiveness, QEP, and substantive change, as these pertain to our regional accreditation (SACS). Only those in the SACS region are likely to find the following significant.

I was particularly interested in the topic of creating and assessing a quality enhancement plan (QEP). Most of what I heard aligned with the notes I took from the annual meeting, but there was one new twist related to us by SACS VP Mike Johnson, viz., a change in the wording in the Principles of Accreditation that reflects a broadening of what a QEP can be. Note that what follows is my unofficial interpretation, and I welcome corrections.

Here's the old version of the description of process, from the original Principles, pages 9-10.

The language to pay attention to is the part about "issues directly related to improving student learning." From the same document, the actual section 2.12 from the requirements:
Compare to the 2010 version of each. The first thing to notice is that the new version has more text.



The new language in both references includes an option that wasn't there before. In addition to a focus on learning outcomes, it's possible to focus on the environment supporting student learning. My interpretation of the discussion at the meeting is that this also allows for some flexibility with regard to assessment, given that an environment is generally affective, not specific like, say, a writing tutorial.

The specific example mentioned was that of a first-year experience, which may have many components working together to enhance student success. Because it's diffuse, the assessment of particular learning outcomes, with a "before and after" benchmark approach may not be reasonable. This seems like a good thing, because it will allow for more creative options with regard to both the construction and assessment of the QEP.

Sources: See the SACS website at www.sacscoc.org for full documentation. Also, there is an email list for SACS-related questions, which you can subscribe to here.

Update: In addition to the changes noted above, there is a new comprehensive standard under institutional effectiveness that applies to the QEP. My understanding is that this allows reviewers to find an institution out of compliance with a comprehensive standard rather than a core requirement. The effect of the former can be remediated through  the usual report process, but a core requirement failure would be much more severe. So this gives institutions the same chance to fix things that other comprehensive standard failures do, which wasn't the case before. Here's the text:

It's interesting what this focuses on. Capability to execute, broad involvement, and assessment are on the list, but nothing about an appropriate focus on learning outcomes. Apparently if you get that wrong, it's still a deadly sin.

Saturday, November 06, 2010

SAT-W: Does Size Matter?

This ABC News story tells of 14-year-old Milo Beckman's conclusion that longer SAT essays lead to higher scores. Interestingly, the College Board's response dances around the issue but doesn't deny it:
Our own research involving the test scores of more than 150,000 students admitted to more than 100 colleges and universities shows that, of all the sections of the SAT, the writing section is the most predictive of college success, and we encourage all students to work on their writing skills throughout their high school careers.
It would have been easy to say that with such-and-such alpha level, controlling for blah and blah, the length of the essay is not a significant predictor of the total score. But they didn't. It is interesting that the writing score is supposed to be the most predictive. It's not hard to find the validity report on the College Board SAT page. Here are the gross averages from the report.
The average high school GPA is much higher than I would have expected. Notice too that the first year college GPA is .63 less. The obvious question here is whether this difference has increased over time due to grade inflation. An ACT report "Are High School Grades Inflated?" compares 1991 to 2003 high school grades and answers in the affirmative:
Due to grade inflation and other subjective factors, postsecondary institutions cannot be certain that high school grades always accurately depict the abilities of their applicants and entering first-year students. Because of this, they may find it difficult to make admissions decisions or course placement decisions with a sufficient level of confidence based on high school GPA alone.
This is somewhat self-serving. Colleges don't need to really predict the abilities of their applicants--that's not how admissions works. What we do is try to get the best ones we can for the net revenue we need. That is, a rank is sufficient, and even with grade inflation, high school grades still do that. In the end, SAT and ACT are used to rank students for admissions decisions too.

The grade inflation is remarkable. Let's take a look at this graph from the ACT report.

Two things are obvious--first, the relationship between grades and the standardized test is almost linear. Especially in the 1991 graph, there's very little average information added by the test (by which I mean the deviation from a straight line is very small). Second, there's an inflation-induced compression effect as the high achievers get crammed up against the high end of the grading scale, reducing its power to discern.

Recall from above that the average SAT-taker's high school GPA was 3.6 in the most recent report, and look where that falls on the scale above. We could guess that the "bend" in the graph is getting worse over time, and probably represents a nonlinear relationship between grades, standardized tests, and college grades. If you have a linear predictor (e.g. for determining merit aid), it would be good to back-test it to see where the residual error is.

First Year College GPA is related to the other variables through a correlation, taken below from the SAT report.

We're really interested in R-squared, the percentage of variance explained. In the very best case, if we take the biggest number on the chart and square it, we get about 36%. That is, the other two-thirds of first-year performance is left unexplained by these predictors. Indeed, the SAT-W correlation is larger than that of the other two SAT components (even combined). Now why might that be? Is SAT-W introducing a qualitatively different predictor?

To pursue that thought, suppose Mr. Beckman is right, and the SAT-W is heavily influenced by how long the essay is. This leads to an interesting conjecture. Perhaps what is happening is that the SAT-W is actually picking up a non-cognitive trait. That is, perhaps in addition to assessing how well students write, it also assesses how long they stick to a task: their work ethic, so to speak. If so, I wonder if this is intentional. The College Board has a whole project dealing with non-cognitives, so it's certainly in the air (see this article in Inside Higher Ed).

My guess is that they figured out the weights in reverse, starting with a bunch of essays and trying out different measures to see which ones are the best predictors. And length was one that came up as significant. It's not an entirely crazy idea.

You can read the essays. I did not know this until I saw the College Board response to the ABC article:
Many people do not realize that colleges and universities also have the ability to download and review each applicant's SAT essay. In other words, colleges are not just receiving a composite writing section score, they can actually download and read the student's essay if they choose to do so.
So theoretically, you could do your own study. Count the words and correlate.
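For instance, here is a sketch of that do-it-yourself study, assuming you have assembled the word counts and reported essay scores yourself (the numbers below are made up purely for illustration):

    from statistics import correlation    # Python 3.10+

    # hypothetical sample: word count of each downloaded essay and its score
    word_counts  = [210, 340, 150, 420, 390, 260, 180, 470]
    essay_scores = [  6,   9,   4,  11,  10,   7,   5,  12]

    print(correlation(word_counts, essay_scores))   # Pearson r for the sample

A strongly positive r on a decent sample would be the do-it-yourself version of Beckman's finding.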

Finally, I have to say that the ABC News site is an example of an awful way to present information. It makes my head hurt just to look at the barrage of advertisements that are seemingly designed to prevent all but the most determined visitors from actually reading the article. I outlined the actual text of the article in the screenshot below.

The first thing you get is a video advertisement in the box. If you want to read the whole article, you have to click through four pages of this stuff. I didn't make it that far. Maybe it's a test of my stick-to-it-ness, and there's a reward at the end for those with the non-cogs to complete the odyssey. If so, I flunked.

Update: Here's a 2005 NYT article about SAT writing and length, thanks to Redditor punymouse1, who also contributed "Fooling the College Board."

Friday, November 05, 2010

EduPunk and the Matthew Effect

Inside Higher Ed today has a piece on "The Rise of Edupunk." I didn't find much new in the article, except that perhaps mainstream institutions are beginning to pay attention, but this struck a chord: Quoting Teresa Sullivan, president of the University of Virginia,
In a bow to the “Edupunks,” Sullivan explained that Virginia is incorporating student habits into its pedagogy.
In "Edupunk and the New Ecology" I proposed three elements necessary to student success outside the academic walls:
  • Access to quality learning materials
  • A learning community
  • Motivation to succeed
I find it interesting that the conversation is moving beyond technology and specifically focusing (at least in this instance) on non-cognitives. As a case study, we might consider Matthew Peterson, whose story is summed up in MITnews from Nov. 3:
In his junior year at Klein Oak High School in Spring, Texas, Mat Peterson — now an MIT freshman — was struggling with his physics course. A friend of his recommended that he look at MIT OpenCourseWare, where Peterson turned to Walter Lewin's videos and found the help he needed.
Matthew was motivated enough to use MIT's Open Course Ware. Here's the branding from the website:


No registration required. I would suggest that young people today (rightly or wrongly) expect intellectual property to be public property: songs and movies are free to download, information on any topic is at your fingertips, and there is impatience with any barrier that gets in the way. And in fact, the actual knowledge that professors design to deliver in curricula falls into that category. Think about how archaic the idea is of lugging around a Biology textbook. And paying $150 for it when everything you need to know is already free online seems ridiculous. A week ago I wrote about the Fat Middle, and textbook publishers are a good example of one ripe for revolution. But it's not just textbooks. Lectures too. The whole traditional delivery of static classroom experience is already freely available for many subjects. The effect on MIT itself has been significant. In "MIT OpenCourseWare: A Decade of Global Benefit," Shigeru Miyagawa writes:
Over the past 10 years, OCW has moved from a bold experiment to an integral part of MIT. Currently, more than 93% of undergraduates and 82% of graduate students say they use the site as a supplement to their course material or to study beyond their formal coursework. Eighty-four percent of faculty members use the site for advising, course materials creation, and personal learning. More than half of MIT alumni report using the site as well, keeping up with developments in their field, revisiting the materials of favorite professors, and exploring new topics. Open publication of course materials has become an ordinary element of scholarly activity for MIT faculty, and the ubiquitous availability of that curriculum to our own community has become the everyday reality of teaching and learning at MIT. 
So how does this play out over the next decade?

The Matthew Effect wasn't named for Mr. Peterson--that's just a happy coincidence. Some time ago, in assessing our general education goals (not my current institution), I found indications that "higher ability" students learn faster. The graph below is taken from Assessing the Elephant. It shows assessed writing ability for a cohort over six semesters, controlled for survivorship. The cohort is split into high and low GPA groups. It's not surprising that the high GPA group was judged on average to have better writing skills, but it was surprising to see that the amount of change was so different.


This jibes with my own classroom experience, however, and I would venture to guess that (as Gladwell poses in Outliers) talent and motivation go hand in hand. In fact, we have evidence for that. In a recent post, "Assessing Writing," I showed that if we can just get the low GPA students into the writing lab for help, we can significantly improve their writing over four years. If this is true, then their ability is constrained by activity and engagement and not solely by innate talent.

I had not seen the Wikipedia article on the Matthew Effect before, but it's interesting:
[E]arly success in acquiring reading skills usually leads to later successes in reading as the learner grows, while failing to learn to read before the third or fourth year of schooling may be indicative of life-long problems in learning new skills. This is because children who fall behind in reading, read less, increasing the gap between them and their peers. Later, when students need to "read to learn" (where before they were learning to read) their reading difficulty creates difficulty in most other subjects. In this way they fall further and further behind in school, dropping out at a much higher rate than their peers.
If we apply this same idea to the Edupunk model, what we might expect is that self-starters, confident students, and those with enough knowledge and skill to begin self-education, will flourish like Matthew Peterson.  On the other hand, a student who struggles in school and as a result doesn't like it much, seems unlikely to be in a position to benefit from the OCW or other free resources. This is a recipe for an increasing divergence between intellectual haves and have-nots.

A Modest Proposal is to begin early to teach students how to access and use self-serve education. Honestly, this has to start in the home--the teachers can't do it all. The article "Home Libraries Provide Huge Educational Advantage" from Miller-McCune points to research linking home libraries to educational achievement.

“Home library size has a very substantial effect on educational attainment, even adjusting for parents’ education, father’s occupational status and other family background characteristics,” reports the study, recently published in the journal Research in Social Stratification and Mobility. “Growing up in a home with 500 books would propel a child 3.2 years further in education, on average, than would growing up in a similar home with few or no books.”
(Here's Science Daily on the same topic.) How might it work? Math instruction would be very different from the traditional approach. Instead of telling students that the equation for a line is y = mx +b, they'd be tasked to go out and find it. Or figure out that the formula is needed at all. There are many "meta" levels of questioning. If we built this kind of inquiry into the curriculum from the start, we'd avoid the "catch-up" information literacy training we try to do in college.

Note that I don't propose this as a panacea or some great new idea that could revolutionize education. There is enough snake oil on sale already. But complex problems require evolutionary solutions, and this seems like something that ought to be tried out. The proof is in the actual success or failure of the attempt.

Mass culture in the United States works against us. The irony of the information age is that although deep veins of accumulated knowledge are there to be mined, most informational interactions are expected to be brief and shallow. Television is an extreme example, but it seems pervasive to the point of oppressiveness. And so we come back to non-cognitives. It's like the problem of building self-discipline: a chicken and egg dependency.  (If you procrastinate on learning how to not procrastinate...) It isn't really about learning. It's about wanting to learn. So I leave you with the problem at the heart of the edupunk Matthew Effect:

How do people learn to want to learn if they don't want to?