Sunday, November 01, 2009

What is Learning?

Hope your Halloween was a good one! I went as a Psycho Metrician. A friend back in Hartsville is re-instituting the Halloween story tradition I mentioned here, but it's this coming weekend, so I have a few days to get my creative gears turning and write a short story. I've been assigned to write about my historian friend and colleague Bob. He currently has a paradoxical problem I'll have to work into the story somehow. It goes like this:

Bob is publishing a book using Lulu. The ISBN has to appear in two places: once on the copyright page inside the book, and once on the back cover with a bar code. The problem is that if he submits a new version to change the first number, it constitutes a new edition, and a new ISBN is assigned, which automatically goes on the back. This means the two ISBNs can never match. Obviously, there has to be some way around this or nothing would ever get published. If one were to stubbornly keep putting the previous ISBN inside the book, generating a new mismatched one on the back, and repeat the process periodically, the result would be a clock: a periodic process matched to an aperiodic one. This was one of Einstein's insights, or so I heard in a seminar once.

The solution involves memory: create a new ISBN, write it on a wee scrap of paper, use editing software to page to the copyright leaf, and replace the ISBN there with the new one.

Time can resolve paradoxes; it's really quite amazing. Take for example the classic logic paradox:
This statement is false.
If you try to determine if it's true or not, you'll quickly get a headache. If it's true, it implies it's false, and conversely. This may not seem significant, but when Bertrand Russell discovered a similar paradox having to do with the loose definition of "set" that was being used at the time, it led Gottlob Frege to add an addendum to his Grundgesetze der Arithmetik just as it was going to press, stating (Logicomix, p 171):
Hardly anything more unfortunate can befall a scientific writer, than to have one of the foundations of his edifice shaken after the work is finished. [...] The collapse of one of my laws, to which Mr. Russell's paradox leads, seems to undermine not only the foundations of my Arithmetic but the only possible foundations of Arithmetic as such.
Ordinary logic does not incorporate time. The two statements "I am a child," and "I am not a child" cannot both be true. Yet in common experience, they are both true of each of us at different points in life.

In programming, a common idiom is
x = !x
which means "assign to x the value of its logical opposite." True becomes false and conversely. What is a terrible problem for static logic is a useful tool for processes. If regularly applied, it's just like the pendulum on a clock swinging back and forth: imagine one side as True and the other False. This is an example of feedback, and we can extend this idea to create digital circuits like flip-flops and gates, which are used to create computer memory. A schematic of one is shown below. The symbol meanings aren't of importance for our discussion here, just the connections. Look at how the wires are connected between components, reading from left to right.

(image from Wikipedia)

Notice how the outputs from the logic units feed back into the inputs of their neighbors? You can see those connections clearly where the lines make Xs. Viewed statically, they would be potential logical contradictions, but because we have time at our disposal, the feedback becomes a means of controlling the state of this logical arrangement: we can switch it at will between holding a True or a False value until we need it later. This is a memory element that can hold a single bit of information. It continually "reminds itself" of the value it's supposed to remember, and if we change it, the logical contradiction gets washed out as it switches over to the new value to remember. So not only does time allow us to resolve logical contradictions; time and logic together give us memory. Naturally, the biological version in our skulls doesn't run on semiconductors, but the paradox resolution has to exist in some fashion or else mental states would never change.
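To make the idea concrete, here is a minimal Python sketch of two cross-coupled NOR gates, one common way to build such a latch. The function and variable names are mine, not taken from the schematic, and updating both gates in lockstep "ticks" is a simplification of real gate delays.

# A sketch of an SR latch: two NOR gates, each fed by the other's output.
def nor(a, b):
    return not (a or b)

def step(s, r, q, q_bar):
    # One tick of time: each gate recomputes from the other gate's previous output.
    return nor(r, q_bar), nor(s, q)

q, q_bar = False, True        # assume some arbitrary starting state

# Pulse the "set" input for a couple of ticks, then let go.
for _ in range(2):
    q, q_bar = step(s=True, r=False, q=q, q_bar=q_bar)
for _ in range(3):
    q, q_bar = step(s=False, r=False, q=q, q_bar=q_bar)   # no inputs: it holds
print(q)   # True -- the latch keeps reminding itself of the bit

# Pulse "reset", then let go again.
for _ in range(2):
    q, q_bar = step(s=False, r=True, q=q, q_bar=q_bar)
for _ in range(3):
    q, q_bar = step(s=False, r=False, q=q, q_bar=q_bar)
print(q)   # False -- the remembered bit has been switched

Between the two pulses no input is applied at all, yet the value survives: the feedback loop keeps re-deriving it, tick after tick.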

Beyond memory, any finite deductive process can be created using these tools, and as we shall see later, inductive processes too, with the addition of another ingredient. Learning to think deductively and then inductively in a domain of interest ought to be on our list of learning objectives.

How do we define learning? This line of thought originated when I got my copy of Learning that Lasts, by Marcia Mentkowski & Associates. It's about the Alverno College experience, and I've been looking forward to reading it. Thinking about the assessment of learning outcomes is part of my administrative role, and I come to the task as an outsider, from math and computer science rather than educational psychology. The basic problem with the subject of assessing learning outcomes is what seems to be a chasm between theory and practice. There is no end of books and articles on assessing outcomes, but actual real, live, public examples of program- or course-level "closing the loop" are rare. I got permission to publish ours at Coker (here)--as imperfect as they are--in the hope that others would do the same. If you search the web, you find lots of plans, but few completed examples. That doesn't mean they don't exist; it just means they aren't generally published, which is a shame.

Given the integration of learning and assessment at Alverno, I figured the book would have lots of examples. Oddly, though, it seems to be mostly theory. I confess I haven't finished it yet, but I leafed through it looking for data and analysis. There are tables with statistics in the appendices, and some stats sprinkled throughout the text, but it's not what I expected. I'll reserve final judgment until I've waded through it, of course.

But to the point: what definition of learning is given in Learning that Lasts? This is the crux of the matter if we want to discuss learning outcomes, so I dived into the section on page 7 eager to find some clarity. Alas. First, I'm told that
Learning is both process and outcome, often interwoven.
I do get the fact that "learning" can be a verb or a gerund (or participle, I suppose), but I'm really only interested in the process of learning and how we might affect or effect it. Mixing up verbs and nouns and saying they're interwoven is just confusing. A process requires time, and an outcome is a state frozen in time--they're incompatible dimensionally just like, well, verbs and nouns. The definition continues to go downhill for a while. Parts make sense, like:
We [...] recogniz[e] that any observations of what is being learned are often inseparable from how one understands knowledge...
Yes, agreed. That's why we need a good definition.
..., its epistemology,...
What epistemology? The first clause covers ordinary epistemology in "how one understands knowledge," so do we mean the epistemology of how we understand knowledge? How we understand how we understand knowledge? Given that how we understand knowledge is a part of our knowledge, it would seem to be covered already.
...and their connection to meaning systems.
The non-parallel structure gets in the way, and I'm not sure what 'their' refers to--probably the observations, since that's the only plural. I'm not sure what a meaning system is, either. Is this a structuralist thing? I found a paper (here) that uses the term extensively, although no definition is given there either, and Google Scholar was no help. In any event, observations are connected to meaning systems, I assume. Does that mean that I, as the observer, will necessarily be subjective? Or does it mean that the learner's meaning system affects what we observe? As I'm trying to puzzle that out, I'm told that
These are often contradictory.
At this point I feel like I'm reading Sartre's Being and Nothingness. After this there is a kind of literature review of what other education researchers think learning is. It's a real potpourri. The Wikipedia page on learning does a better job of providing an organized overview from psychology. But neither source actually comes out and says that learning is a physical change over time.

Google Books pointed me to something close. In The Science of Learning, Joseph Pear defines it on page 12 as
a dependency of current behavior on the environment as a function of prior interaction between sensory-motor activity and the environment.
Here, learning is both noun and verb. The dependency is a state of whatever was previously learned having an influence on future behavior, and the "prior interaction" is the verbish part. So we could turn it around and say that
Learning is an interaction between sensory-motor activity and the environment that causes current behavior to depend on the current environment.
This definition leads to interesting conclusions, such as:
  1. Understanding interactions between learner and environment is essential to understanding learning.
  2. Because "interactions" happen continuously, the general topic of learning is too broad to be of much use in the classroom. Better to focus on a narrowly-defined type of interaction in order to make useful observations within some domain like teaching multiplication.
  3. Because "teaching" and "assessing" are both interactions, they are really only distinguished in how we view them externally--to the learner they are both learning experiences that will influence future behavior.
And so on. I'm not proposing that this is a great definition of learning, just trying to show that any precise definition can be useful for structuring inquiry. Vague lists of attributes like "constructive, cumulative, self-regulated, goal oriented, situated, and collaborative" (page 7 of Learning that Lasts, quoting Erik De Corte) aren't useful to me. The authors of that book do give a reason for not really defining the focus of their work (page 6):
We advance the multiple, diverse ways of knowing about learning that undergird learning that lasts, primarily because these modes of inquiry support the trend toward multi-disciplinary and collaborative work across the liberal arts and professions and across the roles that make up various levels of practice. This emphasis on diverse means and methods is connected to our concerns for a viably transformed higher education that focuses on student learning and learning-centered educational programs.
I'm all for definitions, means, and methods that are particular to different disciplines or across them, but that's not a substitute for a foundation to build from. The style of writing, which you can get a sense of from the quotes, feels like a random walk, as if watching a bee flit from flower to flower buzzing in fits and starts. If you blink, you've lost the target insect, but don't worry because there are lots more, all wandering around doing their work according to undecipherable plans.

For contrast, let me give an example of good definitions, profound questions, and deep answers that can come from finely structured inquiry. I don't mean to propose that this is possible in all cases, and probably learning in its most general sense is not amenable to this approach. Nevertheless, it's well worth the trip.

The question is an important one for any learner, and it centers on the tension between two competing ways of summarizing knowledge. My source is An Introduction to Kolmogorov Complexity and Its Applications by Li and Vitanyi, Chapter 5 of the second edition.
Principle of Multiple Explanations (Epicurus): If more than one theory is consistent with the observations, keep all theories.

Occam's Razor Principle: Among the theories that are consistent with the observed phenomena, one should select the simplest theory.
As an example, consider the sequence of numbers {1,2,3,4}. Occam's razor might prompt us to say that this is simply the first four positive integers, and the next few will be {5,6,7}. But Epicurus would advise us not to be so hasty--the numbers might be the first four digits of Bob's history book ISBN, in which case there is no predicting what the next ones will be.
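To see the tension in miniature, here is a small Python sketch (the function names are my own invention) of two "theories" that both reproduce {1,2,3,4} exactly but disagree about what comes next.

# Two hypothetical "theories" that both explain the observed data {1, 2, 3, 4}.
def simple(n):
    return n                       # Occam's favorite: the n-th positive integer

def baroque(n):
    # Also fits the data exactly, but takes longer to state.
    return n + (n - 1) * (n - 2) * (n - 3) * (n - 4)

print([simple(n) for n in range(1, 5)])    # [1, 2, 3, 4]
print([baroque(n) for n in range(1, 5)])   # [1, 2, 3, 4]
print(simple(5), baroque(5))               # 5 versus 29: the theories now disagree

Epicurus would keep both hypotheses until more data arrives; Occam would bet on the one with the shorter description.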

Without good definitions, one could debate endlessly the merits of these two principles. Part of the difficulty comes when trying to define what "simplest explanation" means. Isn't it simpler to explain the motion of the sun by assuming that it circles the Earth?

It takes considerable mathematical machinery to attack this problem, but it has been done. It involves complexity theory, probability, and computation. If you're inclined, get a copy of the book and read it for yourself. I'll just give the overview presented on page 319:
It is widely believed that the better a theory compresses the data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This is the basis of the "Occam's razor" paradigm about "simplicity." Making these ideas rigorous involves the length of the shortest effective description of the theory: its Kolmogorov complexity. [...] This train of thought will lead us to a rigorous mathematical relation between data compression and learning.
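Kolmogorov complexity itself is uncomputable, but an ordinary compressor gives a rough feel for the idea. In the Python sketch below (the strings and the use of zlib are my own illustration, not the book's), a patterned string compresses far better than a patternless one:

# Compressed length as a crude stand-in for description length.
import zlib, random

regular = "1234" * 250                         # a highly patterned string, 1000 characters
random.seed(0)
noisy = "".join(random.choice("0123456789") for _ in range(1000))   # no obvious pattern

print(len(zlib.compress(regular.encode())))   # small: a short "theory" explains it
print(len(zlib.compress(noisy.encode())))     # large: little structure to exploit

The patterned data admits a short theory ("repeat 1234"), so it compresses well; the noisy data does not, and resists compression.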
I make a big deal in this blog about the difference between learning deductive reasoning and learning inductive reasoning--the latter being where the production of new knowledge comes from. The analysis described in the paragraph above enables us to find characteristics of inductive reasoning. On page 323, the authors explain the philosophical puzzler involved and introduce the role of probability (Bayes' rule):
The philosopher D. Hume (1711-1776) argued that true induction is impossible because we can only reach conclusions by using known data and methods. Therefore, the conclusion is logically already contained in the start configuration. Consequently, the only form of induction possible is deduction. [...]

R.J. Solomonoff's inductive method [...] may give a rigorous and satisfactory solution to this old problem in philosophy.

Essentially, combining the ideas of Epicurus, Occam, Bayes, and modern computability theory, Solomonoff has successfully invented a "perfect" theory of induction. It incorporates Epicurus's multiple explanations idea, since no hypothesis that is still consistent with the data will be eliminated. It incorporates Occam's simplest explanation idea since the hypotheses with low Kolmogorov complexity are more probable. The inductive reasoning is performed by means of the mathematically sound rule of Bayes.
Bayes' rule is a technique from probability theory that allows you to take into account "priors," or knowledge we already have about the situation. If we were trying to predict stock prices, we'd look at the time-series history of price moves--this information is then assumed to be given. If you see an expression like P(x|y), the vertical bar means "given" the information found in y. Your estimate of the probability of getting a good meal at a restaurant might be affected by reading the reviews, which would be written P( good meal | reviews ). This sounds a lot like the definition of learning from Pear earlier: current behavior affected by past events.
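For a concrete (and entirely made-up) version of the restaurant example, here is the arithmetic of Bayes' rule in Python; every number below is an assumption chosen for illustration only.

# P(good meal | good reviews) via Bayes' rule, with invented numbers.
p_good = 0.5                        # prior: half of restaurants serve a good meal
p_reviews_given_good = 0.8          # good restaurants usually have good reviews
p_reviews_given_bad = 0.3           # bad ones sometimes do too

p_reviews = (p_reviews_given_good * p_good
             + p_reviews_given_bad * (1 - p_good))
p_good_given_reviews = p_reviews_given_good * p_good / p_reviews
print(round(p_good_given_reviews, 3))   # about 0.727

Reading the reviews--the prior interaction with the environment--moves the estimate from 0.5 to about 0.73, which is the changed "current behavior" in Pear's terms.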

Conclusion. Time, logic, and probability are the abstract ingredients discussed here, bookending this article; together they can be used to build models of deductive and inductive processes. Real learners in the physical world are bound by logic, but the actual squishy mechanisms of learned behavior are not yet well understood. That need not prevent us from trying out tight definitions to see where they lead. I'll continue to dig up definitions from other disciplines to see how they compare. If you have a good one, please send it to me.
