## Thursday, November 18, 2010

### On Design

A couple of days ago I had occasion to think about the difference between finding solutions randomly and finding them by design. You might not think that random searching is the best way to find something, but this is essentially what has produced the stunning variety of life on Earth over the last three and a half billion years, especially in the last 600 million or so. This evolutionary approach is characterized by massive trial and error (usually in parallel) with success, or fitness, ascertained by the system. Unfit results are thrown away. Fit results are tinkered with randomly to create new experiments.

Evolutionary trial and error is a good way to solve a complex problem, as I tell my students. Yesterday we did a big review of all our integration techniques in Calculus 2, and one of the problems was $\int\sin x\cos xdx$. The official way to do this is by substitution, but someone yelled out to use integration by parts. I always try to follow these suggestions to the bitter end, even if I know they'll fail, because I want them to get used to the idea of trying things that don't work: it's at the heart of the creative/inductive approach to solving problems, the way evolution does. Even if your calculus skills are rusty, you can appreciate the following sequence.

Following the student's suggestion, we get $\int\sin x\cos xdx=\sin^{2}x-\int\sin x\cos xdx$

Now notice something cool--the original problem reappeared on the right side. If we move it to the other side to double up and then divide by two we get the solution:

$\int\sin x\cos xdx=\frac{\sin^{2}x}{2}+C$
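
Spelled out, the steps look roughly like this. The labels $u$, $dv$, and $I$ are shorthand introduced here for illustration; they are the usual integration-by-parts choices, not something recorded from the class.

$$
\begin{aligned}
u &= \sin x, \quad dv = \cos x\,dx \;\Rightarrow\; du = \cos x\,dx, \quad v = \sin x \\
I &= \int \sin x\cos x\,dx = uv - \int v\,du = \sin^{2}x - I \\
2I &= \sin^{2}x \\
I &= \frac{\sin^{2}x}{2} + C
\end{aligned}
$$

Differentiating checks the result: $\frac{d}{dx}\left(\frac{\sin^{2}x}{2}\right)=\sin x\cos x$.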

In the process, the students discovered another useful trick for solving integrals. It's so cute they wanted to try it on the other problems (none of which worked).

Trial and error leads us down all sorts of unexpected pathways. Computers do this kind of search routinely, with techniques such as genetic programming and simulated annealing. More speculatively, artificial life simulations attempt to study biological-type interactions between agents in a simplified computational setting.

[Image: the original Tierra artificial life simulation. Image courtesy of Wikipedia.] (I have my own version of this that I use to research survival of complex systems. You can find the original code here.)
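
To make the "throw away the unfit, tinker with the fit" loop concrete, here is a minimal Python sketch. It is not Tierra and not any particular genetic programming system; the function names, the bit-string toy problem, and the population sizes are all assumptions chosen for illustration.

```python
import random

def evolve(fitness, seed, mutate, generations=200, population=20, rng=None):
    """Keep the fittest candidates each generation and mutate them at random."""
    rng = rng or random.Random(0)
    pool = [seed]
    for _ in range(generations):
        # Tinker randomly with the survivors to create new experiments.
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        # Score everything, then throw away all but the fittest quarter.
        pool = sorted(pool + children, key=fitness, reverse=True)[: population // 4]
    return pool[0]

# Toy problem (hypothetical): evolve a tuple of 20 bits toward all ones.
def count_ones(bits):
    return sum(bits)

def flip_one_bit(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]

best = evolve(count_ones, seed=(0,) * 20, mutate=flip_one_bit)
print(best, count_ones(best))
```

Because survivors are carried over into the next generation, the best fitness never goes down; nothing in the loop knows anything about the problem except how to score a candidate.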

So what about design? The word is used for different purposes, but in this case I don't mean art design, like creating a new logo for your company, but designing for a function. Abstractly, imagine we want to create a design D of some object or process so that we obtain some attribute A. It might be a bridge that can support 100 tons at once, or a liberal arts curriculum designed to create open-minded graduates. There's a clear task at hand and a clear goal at the end. I propose that for the purest form of design to occur, and to guarantee a successful outcome, we need the following positivist ingredients:

1. Language. Definitions of the task and the outcome in unambiguous terms and in a common "empirical" language. By that I mean that we can take precise descriptions in this language and actually implement them without room for error. Think of specifications on a mechanical gear, for example.
2. Forward Analysis or Simulation. A way to deterministically take the specification D and ascertain whether A applies or not. For example, we could take the plans for the bridge, with materials and all specifications considered, and determine if it can support the desired weight load. With 1 and 2 together we can use evolutionary methods to find solutions. Perhaps not the overall best solution, but it's a good way to search.
3. Backward Analysis or Inverse Prediction. Here we take the attribute and work backwards to get the design. This is much faster than trial and error using forward analysis, but it requires a much deeper knowledge of the subject.

As a simple example, if I want an exponential function to model the growth rate of some phenomenon I'm studying, I can specify the problem precisely.

1. Description. At time 0 the population is 20 units. At time 10 it is 35 units. Find the exponential function that exactly fits this data.
2. Simulation. I can solve the forward problem for any candidate solution. If someone proposes $P=10e^{.3t}$, we can just let t=10 and compute the population to be about 201 at time 10--not close to the observed value.
3. Inverse Prediction. Because this problem is well-understood, I can solve the backwards problem and design a solution: $P=20e^{.056t}$. Going back to the forward analysis, we can simulate the population at time 10 and get 35.01.... We can get as close to the measurements as precision dictates.

Backward analysis or inverse prediction (my terms) is very powerful. If we don't have enough information to determine a design, the inverter will tell us that D isn't unique given the constraints. This inverse predictor grants us a deep understanding of the structure of the problem. It's also very hard to come by.
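
The exponential example can be sketched in a few lines of Python, with forward analysis, inverse prediction, and a crude trial-and-error search side by side. The function names, the search range for the growth rate, and the number of trials are illustrative assumptions; the inverse step simply solves $35=20e^{10k}$ for $k=\ln(35/20)/10$.

```python
import math
import random

def simulate(p0, k, t):
    """Forward analysis: population predicted by P = p0 * e^(k t)."""
    return p0 * math.exp(k * t)

def inverse_design(p0, t1, p1):
    """Inverse prediction: solve p1 = p0 * e^(k t1) for the growth rate k."""
    return math.log(p1 / p0) / t1

def forward_design(p0, t1, p1, trials=10_000, rng=None):
    """Trial and error: guess k at random, keep the guess with the smallest error."""
    rng = rng or random.Random(0)
    best_k, best_err = 0.0, abs(simulate(p0, 0.0, t1) - p1)
    for _ in range(trials):
        k = rng.uniform(0.0, 1.0)  # assumed search range for k
        err = abs(simulate(p0, k, t1) - p1)
        if err < best_err:
            best_k, best_err = k, err
    return best_k

print(round(simulate(10, 0.3, 10)))   # the rejected proposal: 201
k = inverse_design(20, 10, 35)
print(round(k, 3))                    # 0.056
print(forward_design(20, 10, 35))     # random search lands near the same k
```

The inverse takes one line and one evaluation; the random search needs thousands of forward simulations to get only approximately the same answer, which is the trade-off described above.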

For convenience, let me attempt definitions that solidify the foregoing discussion and delineate the different ways of going from specification to design.
• Soft Design. The language is not empirical, and outcomes are uncertain and cannot be reliably simulated. This is generally a D -> A evolutionary approach based on experience or preference. Example: designing a new letterhead for your company.
• Forward Design. This requires an empirical language that permits simulation of D -> A. An evolutionary approach guided by experience homes in on a reasonable solution. Example: designing a lawn irrigation system.
• Inverse Design. Successfully determines A -> D through proven deterministic rules that completely model the attribute A and design constraints. Example: designing a computer algorithm to invert a matrix.

Finding inverses such that we can compute A->D is hard. There is no general way of doing it--this is guaranteed by Rice's Theorem--so solving the Inverse Design problem is itself a Forward Design problem. What?

Let me try that again: in subjects where we don't already have a workable theory, the only way to find one is with an empirical language and trial and error. Of course, if there are connections to known theories like physics, we can get a good head start. But one doesn't have to find complicated examples. Here's a simple one:

The Collatz Conjecture is easy to state. Take any positive whole number, like 34. If it's even, divide by two. Otherwise multiply by three and add one. Repeat this process. The conjecture is that you always end up at one. No one knows if it is true. If you take a moment to look at the Wikipedia page linked above you'll find that an amazing number of mathematical tools have been tried on it to no avail. This is evidence of an evolutionary Forward Design process that's trying to crack the problem. Assuming it's solved one day, it will create a body of theory that broadens the scope of the Inverse Design problems we can solve.
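
The rule itself is a forward simulation that fits in a few lines of Python. This is only a sketch: the function name and the `max_steps` cutoff are assumptions added here, and the cutoff exists precisely because nobody has proved the loop always terminates.

```python
def collatz_steps(n, max_steps=10_000):
    """Count steps for n to reach 1, or return None if max_steps runs out."""
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        # Even: divide by two. Odd: multiply by three and add one.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(34))  # 13
```

Running it on any number you like is easy; what is missing is the inverse theory that would tell us, without running it, that every starting number reaches one.
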
Most problem domains do not enjoy the complete ability to do inverse prediction. I would go further and say that most of the design work done on practical problems is soft or forward design, with bits of inverse design thrown in here and there for the most well-understood and well-defined problem types.

"Think like the designer" is something I often tell my daughter when she's working with a computer program or trying to find a water fountain in a building, or otherwise in the middle of someone else's design. What were they trying to achieve, and what constraints did they face? This is an informal attempt at inverse thinking on the fly. Sometimes it works: water fountains are often near the bathrooms because it saves on plumbing.

Applying this to learning outcomes assessment is straightforward. We usually don't have good definitions of what it is we're trying to achieve, so most course design, assessment design, program design, curriculum design, and so on is soft. In cases where, e.g., standardized testing attempts to pin down the definition of "learning," there is not only the problem of the validity of those tests, but also the top-down mentality that goes with it. It seems to me that there is an assumption that good definitions are enough to grant us the ability to do inverse design. This is obviously absurd: we have good measurements of stock market data, but that doesn't guarantee we can pick winners.

Even if standardized tests are a great definition of learning, one would still have to apply a forward design process, employing trial and error, to find best practices. In higher education, there are many obstacles to this, and because of demographic differences, comparisons across institutions are not automatically valid. It's a hard problem. For my part, I'm not satisfied that the empirical language is developed well enough to even dream of an inverse design. The creation of a good empirical language over time goes hand in hand with the solution to the forward design problem, which is itself a forward design problem...

Next: "The Long View"