Tuesday, November 10, 2009

Numbers and Names

Words have meaning. This is true even if the words are not formally defined; people could talk to each other before dictionaries came around. The facility to speak and understand is so fluid in fully-functioning humans that we underestimate how difficult it is (see Moravec's Paradox). Undoubtedly there was strong evolutionary bias toward creating this ease of communication, as opposed to making our brains facile with long division, for example.

Because words are powerful, they get hijacked. There is economic value attached to the effect of certain words, like "new and improved," and so they get put to work like blue-collar workers marching off to punch in. Sometimes this is manipulative or cynical, as brilliantly illustrated by Orwell in 1984. In the Russian Revolution, Bolsheviks were pitted against Mensheviks, names stemming from a narrow vote. The former word means "majority" and the latter "minority." Imagine if your political faction were saddled with the second name...

In outcomes assessment, and in reporting psychometrics generally, common words are often introduced sloppily. I've addressed the big one, "measurement," elsewhere. Another rich source of this kind of error is studies that use factor analysis. I came across a good example, "A Look across Four Years at the Disposition toward Critical Thinking Among Undergraduate Students" by Giancarlo and Facione, while browsing Insight Assessment's research page. This company produces the Critical Thinking Dispositions survey I blogged about here. I don't mean to be critical of the authors, but rather to highlight a practice that seems to be endorsed by most who write about such things. The study itself is interesting, giving a before-and-after look at undergraduates as assessed by the survey. They introduce the topic of dispositions thus:
Any conceptualization of critical thinking that focuses exclusively on cognitive skills is incomplete. A more comprehensive view of CT must include the acknowledgement of a characterological component, often referred to as a disposition, to describe a person’s inclination to use critical thinking when faced with problems to solve, ideas to evaluate, or decisions to make. Attitudes, values, and inclinations are dimensions of personality that influence human behavior.
Notice the implication that personality comes in dimensions. Dimensions are by definition independent of one another, and as we shall see, the idea is that we can take a linear combination of these pieces to assemble a whole disposition. This is an enthymeme without which the rest of the analysis cannot proceed, but it's a big leap of faith. As such, it ought (in the research community) to be spelled out explicitly. The notion that attitudes and values and inclinations together create some kind of vector space is so wild that you'd think caution would be advised. If the implication is that these dimensions really are orthogonal (completely independent of one another), it's ridiculous on the face of it. What does it mean to have a very small amount of "attitude" but lots of "inclinations?"

Most things are not linear. If I'm talking softly, you may only hear bits and pieces of what I say. Increasing the volume will enable you to hear me clearly within a range, but we wouldn't be so bold as to say "talking twice as loud makes you understand me twice as well." We use linearity not because things are linear but because it makes the analysis easy. In small ranges, it often makes sense to approximate non-linear phenomena with linear models, but one has to be careful about reaching conclusions.
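To make the point about small ranges concrete, here is a minimal sketch. It assumes, purely for illustration, that understanding follows a logistic curve in volume; the function names and the curve itself are hypothetical and not taken from any study. The straight line is the tangent at the middle of the curve.

```python
import math

def intelligibility(volume):
    """Hypothetical logistic curve: fraction of speech understood at a given volume."""
    return 1.0 / (1.0 + math.exp(-(volume - 5.0)))

def linear_model(volume, center=5.0):
    """Tangent-line (linear) approximation of intelligibility() at `center`."""
    f0 = intelligibility(center)
    slope = f0 * (1.0 - f0)  # derivative of the logistic function at `center`
    return f0 + slope * (volume - center)

def max_error(lo, hi, steps=100):
    """Largest gap between the linear model and the curve on [lo, hi]."""
    points = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(linear_model(v) - intelligibility(v)) for v in points)

small_range_error = max_error(4.5, 5.5)   # close to where the line was fitted
wide_range_error = max_error(0.0, 10.0)   # far from where the line was fitted

print(f"max error on [4.5, 5.5]: {small_range_error:.4f}")
print(f"max error on [0, 10]:    {wide_range_error:.4f}")
```

Near the center the line tracks the curve closely. Over the wide range it predicts fractions below zero and above one, which is the kind of conclusion one has to be careful about.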

In the article, the assumption is that the disposition to think critically is the linear combination of a few component dimensions. These are listed:
Factor analysis of the CCTDI reveals seven distinct elements. In their positive manifestation, these seven bipolar characterological attributes are named truthseeking, open-mindedness, analyticity, systematicity, critical thinking (CT) self-confidence, inquisitiveness, and maturity of judgment.
Notice the passive voice "are named." Are named by whom? Here's the process: The survey is administered and the results recorded in a matrix by student and item. A correlation matrix is computed to see what goes with what. Then a factor analysis (or singular value decomposition, in math terms) is performed, which factors the matrix into orthogonal dimensions. To understand this, it helps to look at an animation of a simple case. If the dimensions have different "sizes" (axes of the ellipse in the animation), then a more-or-less unique factorization results. If the dimensions are close to the same size, it's hard to make that case. Each dimension is defined by survey items and associated coefficients. It supposedly tells us something about the structure of the results. Note that orthogonal means the same thing it did earlier: completely independent. You can have zero of one factor and lots of another, and this needs to make sense in your interpretation.

So: we do some number crunching and find associations between items. These are collected together and named. We could call them vector1, vector2, and so on, but that wouldn't be very impressive. So we call them "openmindedness", "attentiveness," and use words that already have meanings.
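The process described above can be sketched in a few lines. This is only an illustration under stated assumptions: the survey responses are synthetic (generated from two made-up latent traits), the variable names are hypothetical, and the decomposition follows this post's description of factor analysis as a singular value decomposition of the correlation matrix. Dedicated factor analysis routines differ in detail, for example by rotating the factors afterward.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students, n_items = 200, 6

# Synthetic data (assumption): two hidden traits, each driving three items.
latent = rng.normal(size=(n_students, 2))
loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
noise = 0.5 * rng.normal(size=(n_students, n_items))
responses = latent @ loadings + noise  # matrix by student and item

# Correlation matrix between items: what goes with what.
corr = np.corrcoef(responses, rowvar=False)

# Decompose into orthogonal dimensions, ordered by "size".
U, sizes, Vt = np.linalg.svd(corr)

# The rows of Vt are mutually orthogonal unit vectors.
gram = Vt @ Vt.T

# The output carries no names, only item coefficients.
for i, (size, vec) in enumerate(zip(sizes, Vt), start=1):
    print(f"vector{i}: size={size:.2f} coefficients={np.round(vec, 2)}")
```

Nothing in the output suggests a label. Any name attached to vector1 or vector2 comes from a person reading the coefficients, not from the arithmetic.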

It's not even clear what the claim actually is. Is it that we humans perceive critical thinking dispositions as a linear combination of some fundamental types of observation, presumably presented to us in whole form by our perceptive apparatus? Or is it that in reality, our brains are wired in such a way that dispositions are generated as linear combinations?

It would be relatively easy to test the first case using analysis of language, like the brilliant techniques I wrote about in "High Five." I don't see any evidence that this sort of thing is done routinely. Instead, researchers eyeball the items that are associated with the dimensions that pop out and give them imaginative names. They may or may not be the same names that you and I would give them, and may or may not correspond to actual descriptions that someone on the street would use to describe the test subject.

I hope you can see the sleight of hand by now. In the case of this particular article, the authors go one step further, by describing in detail--in plain English--what the dimensions are (I have bolded what was underlined in the original):
The Truthseeking scale on the CCTDI measures intellectual honesty, the courageous desire for best knowledge in any situation, the inclination to ask challenging questions and to follow the reasons and evidence wherever they lead. Openmindedness measures tolerance for new ideas and divergent views. Analyticity measures alertness to potential difficulties and being alert to the need to intervene by the use of reason and evidence to solve problems. Systematicity measures the inclination to be organized, focused, diligent, and persevering in inquiry. Critical Thinking Self-Confidence measures trust in one’s own reasoning and in one’s ability to guide others to make reasoned decisions. Inquisitiveness measures intellectual curiosity and the intention to learn things even if their immediate application is not apparent. Maturity of Judgment measures judiciousness, which inclines one to see the complexity in problems and to desire prudent and timely decision making, even in uncertain conditions (Facione, et al., 1995).
These descriptions would serve suitably for ordinary definitions of ordinary terms (without the use of "measurement"), but no evidence is presented that the ordinary meanings of all these words correspond in any way to the factor analysis results, other than that someone decided to give the dimensions these names. The final touch is claiming that we "measure" these elements of personality with precision:
For each of the seven scales a person’s score on the CCTDI may range from a minimum of 10 points to a maximum of 60 points. Scores are interpreted utilizing the following guidelines. A score of 40 points or higher indicates a positive inclination or affirmation of the characteristic; a score of 30 or less indicates opposition, disinclination or hostility toward that same characteristic. A score in the range of 31-39 points indicates ambiguity or ambivalence toward the characteristic.
All of this strikes me as absurd. It's not that surveys can't be useful. To the contrary, they undoubtedly can give us some insights about student habits of mind. But to suppose that we can slice and dice said behaviors with this precision is overreaching, particularly in the use of ordinary language to create credibility without proof that these associations are strong enough to withstand challenge.

This practice is unfortunately common. The NSSE reports include dimensions like this, for example.


  1. It seems that the authors just did what personality theorists do. They came up with their own "Little 7" interpretation that stands along in some sort of relation to the Big 5 (that you blogged about earlier), the Big 3 and other Big n dimensional dissections of personality. They didn't dwell on how they named their factors because their in-discipline colleagues will recognize and accept their method. Is it bad that psychometricians don't necessarily imply orthogonality when they talk of dimensions? The mathematrician cringes, but the NAP most likely nods and moves on.

    Imprecise language causes me heartburn too, but I think in this instance we may be the Mensheviks.

  2. Well, the big five research was purely text-based, so its validity derives from the honing evolution of language. This is a whole lot different from dreaming up a label for a basis vector on a matrix decomposition of item responses. The question for the researchers to answer is how to validate the names they give the dimensions. Factor analysis dimensions are by definition orthogonal--no wiggle room there. It's not just mathematicians that cringe--take a look at The Philosophical Foundations of Neuroscience for a more cogent critique than mine.