Saturday, May 08, 2010

Pizza, Dinosaurs, and Critical Thinking

Today at the supermarket with my daughter Epsilon, I sized up the frozen pizzas.  See the photo below.

I asked Epsilon to look at the pizza boxes and analyze them.  She surprised me by observing that the images all had fresh veggies or pictures of plants in the background, suggesting freshness.  Yeah, yeah, I said, but what about that one THERE--Amy's Pizza (Pesto).  That slice they show is disproportionately large, I told her.  See?  They cut an extra big piece to make you think you get more than you really do.

She looked.  Pursed her lips, and held out her fingers to make a crude caliper to measure the distance from the center of the cut to the edge of the pie.  Then she rotated it around carefully, tracing the edge of crust.  No, she said, that's the right size.

She was right.  My conspiracy theory was hot air.

We want our graduates to be able to do that sort of thing, right?  We might call it critical thinking, although the term makes me break out in hives.  Richard Arum and Josipa Roksa have a book coming out on that topic called Academically Adrift: Limited Learning on College Campuses, claiming (according to this NYU blog post) that a large number of college students show "no significant gains" in this trait.  The research is based on the Collegiate Learning Assessment (CLA), which I've written about before.

I haven't seen the book itself, only the blog post I cited and the blurb at Amazon.com, which includes this:
Almost everyone strives to go [to college], but almost no one asks the fundamental question posed by Academically Adrift: are undergraduates really learning anything once they get there? For a large proportion of students, Richard Arum and Josipa Roksa’s answer to that question is a definitive no.
Assuming this is an accurate description of the book, that's a pretty bold claim.  The description qualifies it a bit by naming the actual skills addressed: critical thinking, complex reasoning, and writing.  These are left undefined, which makes one wonder what the difference between critical thinking and complex reasoning is, how complex "complex" has to be, and how we would measure complexity.  But this is the usual practice, repeated here: use terms that the general readership will associate with something, and then switch to technical results (statistical parameters derived from tests, for example) without drawing attention to the difference between the two.

From the blog review:
A forthcoming book on what students actually learn in college [...] questions whether students at the nation’s colleges and universities are in fact acquiring the skills necessary to compete in the global marketplace. Economists, sociologists, and educators agree that future labor markets will demand sophisticated critical thinking, problem solving, and writing skills from workers, the very skills that a college education is presumed to develop in students.
You can begin to see the shift in meaning in the passage above, establishing some credibility for critical thinking (whatever that is).  In other contexts, this is the setup for a joke.  At a birthday party in Germany once, I saw a humorous skit in which one woman wanted to buy a hundred eggs from another, and they talked while the eggs were counted:
One, two... how old is Hans now?  He must be a teenager.
Oh yes, he's thirteen!
Really?  Fourteen, fifteen...  And his sister is at university now, isn't she?
She's already twenty-two, can you believe it?
My, my.  Twenty-three, twenty-four...
It's funny because the meaning of the numbers shifts between the number of eggs supposedly in the basket and whatever the topic of the conversation is.  The basket feels awfully light at the end.

An analogous ratcheting of meaning escalates the argument in the paragraph quoted from the blog post above.  The sleight of hand is hard to see, and it isn't revealed until later on.  The heart of the matter is the list "sophisticated critical thinking, problem solving, and writing skills."  These are left undefined, so they serve as placeholders for what comes later, when CLA scores are used to pass judgment on them.  Taken at face value, general problem solving means a host of complex activities: solving differential equations, synthesizing organic molecules, designing logic circuits, inductively finding theories to fit facts, and many, many other tasks essential to running a civilization.  Think about how big the domain is for sophisticated critical thinking, problem solving, and writing skills.  It's vast.  By comparison, the CLA is very limited.  The sleight of hand is to substitute the one for the other.

Second quote:
Unlike standardized tests such as the GRE or GMAT, the CLA is not a multiple choice test. Instead, the multi-part test is a holistic assessment based on open-ended prompts. “Performance Task” section prompts students with an imagined “real world” scenario, and provides contextual documents that provide evidence and data. The students are asked to assess and synthesize the data and to construct an argument based on their analysis.
The CLA is a standardized test too, in that it's scored systematically to increase reliability. As such, it's of relatively low complexity, and therefore can be gamed with prep courses. Note that "real world" is in scare quotes.  This may be because the problems can't literally be considered 'real world' under the circumstances: it is a timed, probably low-stakes test with items that are obviously made for the instrument.  Real world scenarios are more like this:
You are an executive at a multinational oil company that has just caused a huge environmental disaster off the coast of the United States.  What should you do?
The post continues with the charge:
The results of Arum and Roksa’s research are troubling: 45 percent of the students in their sample demonstrated no significant gains on the CLA between their freshman and sophomore years of college.
After eight months of college classes, the CLA didn't detect an effect for almost half the students.  This sounds serious.  But is the CLA even supposed to be valid on a per-student basis?  My understanding is that different students may not even see the same questions, and that the instrument is for comparing institutions.  The CLA website supports this:
This signaling quality of the CLA allows institutions to benchmark where they stand and how much progress their students have made relative to the progress of students at other colleges.

It's unfair to ask too much technical detail of a blog post, but it's not encouraging to read that:

The average student’s performance on the CLA rose by only 7 percentile points, suggesting that college curricula and instructional strategies are not developing students’ higher order thinking skills.
How can the average performance move at all in terms of percentiles?  The bottom 10% are the bottom 10%.  It may not be the same group before and after, but there's always going to be a bottom 10%; the range doesn't stretch from 0-100% to 7-107% in some Lake Wobegon effect.  Does the author mean percentage points?  Percent of what?  How do we put 7% in context?  Furthermore, how do we establish that the beginning level of student "critical thinking" is or is not sufficient to meet the hypothetical global challenge?  As written, this is simply meaningless gobble-dee-goop.
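Part of the problem is that "percentile points" for an average has at least two readings that behave very differently.  Here is a minimal sketch of my own (invented numbers, not the CLA's actual scoring method): percentiles recomputed within the same cohort pin the average near 50 no matter how much anyone learns, while percentiles computed against a fixed freshman norm group can move with real gains.  The post never says which reading is intended.

```python
# Two readings of "the average student rose 7 percentile points."
# A sketch with invented numbers, not the CLA's actual methodology.
import random

random.seed(0)

def percentile_in(score, reference):
    """Percent of the reference group scoring at or below `score`."""
    return 100.0 * sum(r <= score for r in reference) / len(reference)

freshmen   = [random.gauss(1000, 150) for _ in range(2000)]   # hypothetical scale
sophomores = [s + random.gauss(30, 80) for s in freshmen]     # assume a real average gain

# Reading 1: percentiles recomputed within the sophomore cohort.
within = sum(percentile_in(s, sophomores) for s in sophomores) / len(sophomores)

# Reading 2: sophomore scores ranked against the fixed freshman distribution.
vs_norm = sum(percentile_in(s, freshmen) for s in sophomores) / len(sophomores)

print(f"mean percentile within own cohort:  {within:.1f}")   # pinned near 50
print(f"mean percentile vs. freshman norms: {vs_norm:.1f}")  # moves with the gain
```

But this doesn't stand in the way of reaching conclusions: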
In some ways, Arum says, the results of the study are not surprising. Previous research has shown that the average student spends only 13 hours per week studying, far less time than is spent on social pursuits. Likewise, faculty incentives within higher education (promotion, tenure) are aligned with research pursuits, rather than the quality of undergraduate instruction. According to Arum, this misalignment of goals results in far less attention to teaching and learning than is necessary to cultivate higher order skills in students.
This is all very neat as a presentation, which we can dissect into:
  1. Implied problem: we need to be globally competitive
  2. Elaboration: competitiveness requires critical thinking skills (etc)
  3. CLA measures critical thinking skills (bait and switch)
  4. Change in scores is too low
  5. Colleges are not educating students appropriately
  6. ???
  7. This is caused by two factors: lazy students and unmotivated professors
The sixth step is left as unknown, and I assume more detail is found in the book itself.  Establishing a cause requires work.

The conclusion of the piece is written as a quote attributed to Dr. Arum:
In the future, U.S. higher education will increasingly be held accountable for demonstrating measurable improvement in undergraduate learning.
This isn't entirely logical, since students (in the exposition) bear part of the blame.  Don't we need to hold them accountable too?

It's ironic that a piece about critical thinking has such flimsy foundations.  We can blame the blog format for leaving out information--that's to be expected.  But the basic problem is the disingenuous switch between general and specific about the definition of critical thinking.  That can't be papered over, given the breadth of the argument.

There's another article about assessing critical thinking that came across my desk this week: "Assessing Critical Thinking in STEM and Beyond."  It's about the CAT--an instrument similar to the CLA in that it uses free-form responses.  The big difference, as far as I can tell, is that faculty score the CAT.  It isn't outsourced.  Involving faculty in the assessment process is important.  Especially when felines are involved.

Update:  See "Berry Stein on the CAT" for Dr. Stein's comments that address some of the points I make below.

It's fun to treat the respective articles as source material for a critical thinking exercise.  One point of comparison is the description of what critical thinking actually is.  Recall that this is the hole in the logical exposition of the first article.  From the second:
There is little question that as a result of an increasingly technological and information driven society the ability to think critically has become a cornerstone to both workplace development and effective educational programs.
This is supported with citations that underline how important critical thinking is.  Critical thinking isn't actually defined, but later on a table shows what the test is intended to assess:
  • Separate factual information from inferences that might be used to interpret those facts.
  • Identify inappropriate conclusions.
  • Understand the limitations of correlational data.
  • Identify evidence that might support or contradict a hypothesis.
  • Identify new information that is needed to draw conclusions.
  • Separate relevant from irrelevant information when solving a problem.
  • Learn and understand complex relationships in an unfamiliar domain.
  • Interpret numerical relationships in graphs and separate those relationships from inferences.
  • Use mathematical skills in the context of solving a larger real world problem.
  • Analyze and integrate information from separate sources to solve a complex problem.
  • Recognize how new information might change the solution to a problem.
  • Communicate critical analyses and problem solutions effectively.
This is similar to the CLA (see a description of CLA item scoring here), and the items were tested for face validity by asking faculty whether they thought the items were representative of critical thinking.  The answer was yes (for the most part).  But even if the list comprises different kinds of critical thinking, that doesn't rule out other kinds.  Maybe there are lots and lots of other kinds, so many that generalization from the list is invalid.  In other words, just as with the CLA, good performance on the test might demonstrate ability in some kinds of critical thinking, but that does not mean the test illuminates general critical thinking skill (that may not even be possible).  Both of these papers turn the logic around, going from [my test scores show critical thinking] to [critical thinking always shows up on my test].  This is the logical fallacy called affirming the consequent.  In other words, an inappropriate conclusion.
 
So while it is a wonderful idea to provide guidance and resources for integrating the types of thinking skills assessed in either test into coursework, I don't see what value lies in attempting to generalize the findings to some "critical thinking" parameter that is probably so general as to be meaningless.  Why is that even needed?  It's a farce that couldn't be maintained for long in any case.  Consider this quote from the CAT paper:
Virtually every business or industry position that involves responsibility and action in the face of uncertainty would benefit if the people filling that position obtained a high level of the ability to think critically.
What skills are most useful if you're dealing with uncertainty?  I would say that a solid understanding of probability, and in particular Bayes' Theorem, is essential.  A solid understanding of how computers work, and the kind of logical thinking required to debug programs, is also valuable in many domains, like figuring out why the stock market fell out of a clear blue sky.  Much of the time, uncertainty means considering branching possibilities, and the systematic investigation of that kind of structure lies within computer science.  There are hints at these types of thinking in the bullet point list above (the one about correlation, for example), but the notion of conditional probability isn't directly addressed.  In fact, it's interesting to note that in the CAT piece, the face validity survey revealed that:
The question with the lowest overall support (81.2%) involved using a mathematical calculation that was needed to help solve a complex real-world problem.
The point here is that there is already a mismatch between the quote about the need for critical thinking (to address uncertainty) and the skills listed that are supposed to assess it.  This is a symptom of the over-generalization.  Another example is the reference in the earlier quote to a "technological and information driven society," with no obvious correspondence in the skills list.  How about understanding how computers and algorithms work?  Data mining?  Everyone should know how to use a pivot table to disaggregate data dimensions, read and create graphs (including log and log-log scales), and so on.  And while we're at it, what about non-cognitive skills like identifying confirmation bias in your own conclusions?

The effects of the unwarranted generalization are gross distortions of the value of education.  The CLA takes this to absurd lengths, and sets the stage for a "create a problem, sell the solution" business model, which I've mentioned before.  The CAT seems much more useful, since faculty actually see and evaluate the results, and the scope seems broader (NB: I've not seen either instrument in its entirety).

The odd thing about the CAT is that the main reason for using a standardized instrument is to be able to compare results to other institutions', but that probably isn't reliable without outsourcing the scoring.  I would broaden the scope of the CAT and, rather than see it as a test instrument, view it as a starting point for customizing a general education curriculum through structured inquiry with faculty.  I don't mean deciding all over again what structure and what courses are required (shudder), but rather deciding what kinds of thinking we expect students to demonstrate.  I don't know of a similar program, and the CAT could be a wonderful launch pad for one.

But thinking, critically or otherwise, is broader than arguments and problems.  Thinking is what shapes who we are as human beings, and I believe that a liberal education should include context as well as process.  The other day Epsilon and I walked to the store.  I pointed down to the divisions of the sidewalk as we stepped over them, and told her each one is a hundred million years.  Here's where the planet formed, about 4 billion years ago.  Point to where you think humans came along.  So we walked and talked.  We walked over 35 divisions, almost around the corner, before getting to the Cambrian Explosion and multi-cellular life.  We had a little celebration there.  We paused to mourn the world's loss during the Permian Event, and finally arrived at the last two concrete rectangles, where everything she was interested in happened: dinosaurs, lasting a whole slab, the K-T gap that ended their career and launched ours, and that last inch or so where humans made an appearance.
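Just for fun, here is the same walk in a few lines of code.  It's only a sketch: the dates are rounded, the slab length is a guess, and the four-billion-year start is the round number we used on the walk rather than the textbook age of the Earth.

```python
# The sidewalk scale model: one concrete slab = 100 million years.
# Dates are rounded; slab length is assumed to be 1.5 m.
START_MYA = 4000          # "planet forms," millions of years ago (round number)
SLAB_MYR  = 100           # millions of years per sidewalk division
SLAB_M    = 1.5           # assumed length of one slab, meters

events_mya = {
    "Cambrian Explosion":        540,
    "Permian extinction":        250,
    "first dinosaurs":           230,
    "K-T extinction":             66,
    "first humans (genus Homo)": 2.5,
}

for name, mya in events_mya.items():
    slabs_in = (START_MYA - mya) / SLAB_MYR
    meters_from_end = mya / SLAB_MYR * SLAB_M
    print(f"{name:>26}: {slabs_in:5.1f} slabs into the walk, "
          f"{meters_from_end:6.2f} m from the end")
```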

This sort of knowledge isn't good for making arguments in board rooms or figuring out how to stop oil spills.  But having a context is no less important.  To know that the carbon atoms we borrow for a while--that then cycle on to be replaced in the combinatorial dance that animates us--were coughed out by ancient dying stars and drifted through space for unimaginable stretches of time before playing their current role.  To understand the bizarre nature of the physical world at the micro and macro level, to understand the limits of logic itself.  These represent important milestones in our evolution as humans, and they simultaneously exalt and humble us.  Do they prepare one for the global marketplace?  I can't prove it.

To be more concrete about some big holes in both the CLA and CAT lists, let me give some examples of other thinking skills that are, in my opinion, just as important.

1. Understanding exponential growth
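No claim that this belongs on anyone's test as written, but here is a minimal sketch of the idea, using an arbitrary 7% growth rate:

```python
# Why constant-percentage growth outruns linear intuition.
# The 7% rate is arbitrary; it could be energy use, debt, or an invasive species.
import math

rate = 0.07
doubling_time = math.log(2) / math.log(1 + rate)
print(f"At {rate:.0%} per year, the quantity doubles about every {doubling_time:.1f} years.")

quantity = 1.0
for year in range(0, 51, 10):
    print(f"year {year:2d}: {quantity:5.1f} times the starting amount")
    quantity *= (1 + rate) ** 10
```

Ten-year steps look tame at first and then run away, which is exactly the part intuition misses.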


2. Understanding conditional probabilities
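For instance, here is the classic base-rate calculation (a textbook-style example with invented numbers, not something from either paper): a screening test that looks very accurate can still produce mostly false positives when the condition is rare.

```python
# Bayes' Theorem and the base-rate surprise. All numbers are invented.
prevalence     = 0.01   # P(condition)
sensitivity    = 0.99   # P(positive | condition)
false_positive = 0.05   # P(positive | no condition)

p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
p_condition_given_positive = sensitivity * prevalence / p_positive

print(f"P(condition | positive test) = {p_condition_given_positive:.1%}")
# Roughly 17%: most positives are false positives, despite an "accurate" test.
```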


3. Understanding The Prisoner's Dilemma and Tragedy of the Commons
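A sketch of the one-shot dilemma, using one common textbook payoff matrix (years in prison, so lower is better):

```python
# Defection is each prisoner's best reply no matter what the other does,
# yet mutual defection leaves both worse off than mutual cooperation.
PAYOFF = {  # (my move, their move) -> (my years, their years)
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"):    (3, 0),
    ("defect",    "cooperate"): (0, 3),
    ("defect",    "defect"):    (2, 2),
}

for their_move in ("cooperate", "defect"):
    best = min(("cooperate", "defect"),
               key=lambda my_move: PAYOFF[(my_move, their_move)][0])
    print(f"If the other prisoner will {their_move}, my best reply is to {best}.")

print("Mutual defection", PAYOFF[("defect", "defect")],
      "is worse for both than mutual cooperation", PAYOFF[("cooperate", "cooperate")])
```

The Tragedy of the Commons is the same structure with more players: each individually rational choice degrades the shared resource.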

4. Understanding the consequences of evolution through natural selection
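Again just a sketch, with an arbitrary 2% reproductive advantage and starting frequency: small, consistent differences compound over generations the same way the interest rate above does.

```python
# A variant with a small reproductive advantage takes over the population.
# The advantage and starting frequency are invented for illustration.
advantage = 0.02      # variant leaves 2% more offspring per generation
freq = 0.01           # variant starts in 1% of the population

for generation in range(0, 501, 100):
    print(f"generation {generation:3d}: variant frequency {freq:.1%}")
    for _ in range(100):
        # standard two-type selection update
        freq = freq * (1 + advantage) / (freq * (1 + advantage) + (1 - freq))
```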



In summary, the premise of the CLA, the disingenuous generalization of results, and the strong charges alluded to in the blog post and the Amazon description of Academically Adrift do not leave a good first impression.  The CAT, seen through the eyes of the authors of "Assessing Critical Thinking...", still seems to over-reach, but it is more interesting to someone who actually needs to get meaningful assessment done.
