## Tuesday, April 28, 2009

### Part Seven: Measurement, Smeasurement

Why Assessment is Hard: [Part one] [Part two] [Part three] [Part four] [Part five] [Part six]

In outcomes assessment we use the word 'measure' as a matter of course. Our task today is to make sense of this language.

Measurement is a word laden with meaning. It means more than assessment or judgment or rating. Consider the following statements.
• I picked some strawberries today. We measured them to be very tasty!
• I measured the kids before they went to bed--they were all happy.
• We went to the art museum, and measured the artists' creativity.
To me these sound quite odd. On the other hand, we might easily say:
• I measured the bag of potatoes. It was five pounds.
• I measured three cups of flour for the bread.
• When the builder measured the door, he discovered it was crooked.
Common language distinguishes between a subjective, perhaps casual, assessment and a more rigorous, objective, and verifiable one. Objectivity and reliability might be said to be the hallmarks of measurement, but there's a lot more to it than that.

If we say we can measure something, we evoke a certain kind of image--a child's growth over time marked off on the closet wall perhaps. Because we reduce complex information to a single scalar, for convenience we usually choose some standard amount as a reference. We aren't required to create this unit of measurement, but any type of measurement should allow this possibility. Hence we have pounds and inches and so forth.

Despite all the language about measuring learning, there are no units. At least I've never seen any proposed. So I will take it upon myself to do that here: let's agree to call a unit of learning an Aha. So we can speak of Stanislav learning 3 Ahas per semester on average if we want. Of course, we need to define what an Aha actually is. I have come to this backwards, defining a unit without a procedure to measure the phenomenon. What might be the procedure for measuring learning?

Because of the objectivity and reliability criteria for real measurement, things like standardized tests come to mind. Good! We can measure Ahas by standardized test. Of course, these instruments aren't really objective (they are complex things, created by people who are influenced by culture, fad, and so forth) nor truly reliable (you can't test the same student twice, as Heraclitus might say). But if we wave our hands enough, we can imagine those problems away.

Still, there is a substantial problem before us. We can't put all knowledge of everything on this test, so what particular kinds of questions are there to be? We run smack into the question: what kind of learning? Unlike length, of which there is only one type, or weight, or energy, or speed, there are multiple types of learning: learning to read, learning to jump rope, learning to keep quiet in committee meetings so you don't get volunteered for something. If an Aha is to be meaningful, we have to be specific about what kind of learning it is. But each type is different and needs its own unit. We could coordinate the language to paper over this difficulty, just as we have one kind of ounce for liquid and another kind of ounce for weight. But this is not recommended since it creates the illusion of sameness. Undeterred, we might propose different units for different types of learning: Reading-Aha, Jump-Rope-Aha, Committee-Aha, etc.

How specific do we need to be? Reading, for example, is not really a single skill. I'm no expert, but there are questions about vocabulary, recognition of letters and words (dyslexia might be an issue), pronunciation, understanding of grammar, and so forth. So reading itself is just a kind of general topic, more or less like height and weight are "physical dimensions." In the same way that it would be silly to average someone's height and weight to produce a "size" unit, we don't want to mix the important dimensions of reading into one fuzzy grab bag and then have the audacity to call this a unit of measure. Where does this devolution stop? What is the bottom level--the basic building block of learning--that we can assign a unit to with confidence?

There may be an answer to that question. If you've read my opinions about assessing thinking on this blog, you'll know I find "critical thinking" too hard to define, and prefer the dichotomy of "analytical/deductive" and "creative/inductive" because those can be defined in a relatively precise (algorithmic) way. A couple of research papers tie electrical brain activity to creative thinking exercises. Science Daily has articles here and here. This is a topic I want to come back to later, but for now consider the point that neurological research may eventually have the ability to distinguish measurable differences in brain activity and potentially provide a physical basis for studying learning.

There are tremendous difficulties with this project, even if there is an identified physical connection. That's because brains are apparently networks of complex interactions, and by nature high-dimensional. It's going to be very hard to squash all those dimensions into one without sacrificing something important.
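The height-and-weight "size" unit from earlier makes a small test case for what gets lost in the squashing. Here is a minimal sketch in Python, using two invented people and made-up numbers (the names, the data, and the helper functions are all hypothetical, purely for illustration). It averages height and weight into one "size" score twice, once in metric units and once in imperial, and compares who comes out "bigger."

```python
# Hypothetical data, invented for illustration: (height in cm, weight in kg).
people = {
    "Ana": (180.0, 60.0),
    "Ben": (160.0, 75.0),
}

CM_PER_INCH = 2.54
LB_PER_KG = 2.20462


def size_metric(height_cm, weight_kg):
    """Average centimeters and kilograms into one 'size' number."""
    return (height_cm + weight_kg) / 2


def size_imperial(height_cm, weight_kg):
    """Same averaging, after converting to inches and pounds."""
    height_in = height_cm / CM_PER_INCH
    weight_lb = weight_kg * LB_PER_KG
    return (height_in + weight_lb) / 2


def rank(score):
    """Return names ordered from largest 'size' to smallest."""
    return sorted(people, key=lambda name: score(*people[name]), reverse=True)


metric_ranking = rank(size_metric)
imperial_ranking = rank(size_imperial)

print("metric:  ", metric_ranking)
print("imperial:", imperial_ranking)
```

With these made-up numbers, the metric version ranks Ana first and the imperial version ranks Ben first. Nothing about the two people changed, only the units. A real unit of measure survives a change of scale; a grab-bag average of different dimensions does not.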

Note that none of these issues prevents us from talking about learning as if it were a real thing. It's meaningful if I say "Tatianna learned how to checkmate using only a rook and a king." Most of language is not about measurable quantities. We can make very general comparisons without being precise about it. Shakespeare:

Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date . . .

("Sonnet 18," 1–4)

The amazing thing about language is that we converge on meanings without external definitions or units of measure. Meaning seems to evolve so that there is enough correlation between what you understand to be the case and what I understand that we can effectively communicate. This facility is so good that I think we can easily make false logical leaps. I would put it like this:
Normal subjective communication is not an inferior version of some idealized measurement.
We should not assume that, just because we can effectively talk about love, understanding, compassion, or learning, those things can be measured. Failing a real definition of an "atom of learning" and a commensurate unit, we shouldn't use the word "measurement." But if learning assessments aren't measurements, what are they? I'll try to tackle that question next time.

Next: Part Eight