Wednesday, June 08, 2011

Policy Questions Raised at AALHE

I flew back yesterday from Lexington and the first AALHE conference. I found it very stimulating. I put faces to names from the ASSESS list server, which was delightful.

In the opening plenary, Trudy Banta gave us a broad perspective on the evolution of measurement and accountability, pointing out the weaknesses of value-added derivations and standardized tests in particular, and suggesting authentic assessment (e.g., portfolios and their analyses) as useful alternatives.

One point is particularly compelling to me. Trudy mentioned the pedigree of the SAT, and it's not hard to imagine the many hours and dollars that have gone into fine-tuning this test. These are smart people working with a will for a long time toward a well-defined purpose: predicting how well high school students will do in college. In my own experience as an IR person, the SAT does add some predictive strength to linear models, but not much once high school GPA is considered--a handful of percentage points of R^2 at best. At my present institution, it's virtually worthless as a predictor of first-year grades, which also points to the known biases of the test.
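The incremental-R^2 comparison I'm describing can be sketched like this. The data below are simulated, and the coefficients in the simulation are illustrative assumptions, not results from any real institution; the point is just the mechanics of comparing a model with HS GPA alone against one that adds the SAT:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulate applicants: a latent "preparedness" drives both predictors
# (assumed relationships, for illustration only).
ability = rng.normal(0, 1, n)
hs_gpa = np.clip(3.0 + 0.5 * ability + rng.normal(0, 0.3, n), 0, 4)
sat = np.clip(1000 + 150 * ability + rng.normal(0, 120, n), 400, 1600)
fy_gpa = np.clip(2.8 + 0.6 * (hs_gpa - 3.0) + 0.0005 * (sat - 1000)
                 + rng.normal(0, 0.4, n), 0, 4)

def r_squared(X, y):
    """In-sample R^2 of an OLS fit, with an intercept column added."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_hs = r_squared(hs_gpa, fy_gpa)
r2_both = r_squared(np.column_stack([hs_gpa, sat]), fy_gpa)
print(f"HS GPA alone: R^2 = {r2_hs:.3f}")
print(f"HS GPA + SAT: R^2 = {r2_both:.3f}")
print(f"SAT increment: {r2_both - r2_hs:.3f}")
```

The "handful of percentage points" claim is just the last line: the difference between the two R^2 values, not the SAT's R^2 on its own.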

In short, there is some usefulness to the SAT, but it may not warrant all the trouble and expense. And of course some schools are now SAT-optional. I've written before about how, as a market signal, SAT overprices some students and overlooks others, creating the opportunity to use other (e.g. non-cognitive) indicators to find good students.

Trudy's comments did not go this far, but it's not hard to connect the dots. It's an important point: if all this effort yields so little result, maybe we're doing it wrong. The alternative is to admit that maybe this is the best we can do, and that our ways of knowing will just never be much good.

It should be noted that the plenary was planned as a kind of debate with two viewpoints, but because of a cancellation by Mark Chun, only one side was presented. So, in defense of big standardized tests, here are some advantages:

  • They can have good reliability, so they seem to measure something.
  • Because of this, they can create self-sustaining corporations that provide a standardized service to higher education, so there are fixed points of comparison. 
  • Even if the validity isn't stellar, some information is better than none, right?

The conversation is much more subtle than "down with standardized tests!" But you may have noticed that fine distinctions aren't part of the public debate on the quality of higher education. It would be great if Mark and Trudy and other experts introduced a public discussion like this on a message board or something--somewhere that the complexities of the issues could be fully explored and commented on over time.

[Update, see my "Apology and Addendum" that goes with this section.]

Bookending the conference, the closing plenary was given by David Paris, who also offered a historical and political overview of where we are now. He is the executive director of the New Leadership Alliance for Student Learning and Accountability. You can sign up for email updates on their website. Paraphrasing David, the Obama administration talks nicer, but wants the same things as the Bush administration. And what that seems to be is "accountability." One important take-away for me is that we (higher education) have a window of opportunity that will soon close. That is, we can "do it to ourselves" or "have it done to us." The 'it' is amorphous, but centers on accountability for the trade-off between cost and learning. In other words, all this public noise about how bad college is will create changes, and we should try to shape them.

It's hard to disagree with that. One approach is the Alliance's Institutional Certification Program, which you can learn about here. I'd like to see some examples of how it works in practice, but on the face of it, it seems like a great idea, similar in some aspects to the Bologna 'centering' process in Europe. I would call it a kind of horizontal professionalization, complementary to the 'vertical' type provided by discipline-based accreditations and professional organizations. Overall, it seems like a serious and well-reasoned approach, and I look forward to finding out more about it.

There was a good Q&A afterwards, and I suggested that higher education needs access to data pertaining to what happens to students after graduation, and that the federal bureaucracy sits on a gold mine of such information. This didn't have a chance to become a real discussion because of the format, but my interpretation of David's response was that the student-level record project that was floated and shot down (with help from higher education lobbyists) would have solved that problem. So we are complicit in causing the problem.

I don't buy this. First of all, what we need most right now is a deep historical understanding of how higher education has affected lives after school (graduation or not), for individual institutions if not individual majors. I understand that the data are imperfect and that there are privacy issues to be solved. But I think 9/11 shows that privacy concerns can be resolved rather quickly if the motivation is there. Specifically, it should be possible to link some combination of student loan records, FAFSAs, Pell grant records, tax records, federal employee records (including military), and so on to show where students took instruction, what their demographics were, and what happened to them afterwards. There are plenty of other places to find data, I'm sure, like state system databases.

The payoff for this is potentially enormous. Suppose we (higher education) can agree with the politicos that employment is a good outcome of a college education, and even talk about the details like what kind, how much pay, where geographically, and so on. Then we could look at what kinds of schools have what kinds of effects on what kinds of students. Is a biology program at my (private) university worth the cost premium over the public one down the road? What about access? Which schools are economic engines by taking low-income inputs and (eventually) producing high-income outputs?

The payoff is so great that I think that if the government really can't find a way to do this, we should figure out how to do it ourselves. 

What we have now is survey data showing that college generally pays for itself, though the price is going up. This stands in opposition to the cries of critics who say students aren't learning much, if anything.

Here's my analogy. It's imperfect, but has the advantage of being vivid. Students are like movie scripts coming to our production companies. We have our specialties: niche art films or mass-market gloss, and so on. Each script has an individual experience, no matter what our organization is. They get cast differently (different instructors), and we usually do our best to make a good fit. Or maybe we just get the cheapest actors we can find and hope for the best. We always have to rewrite the script to some extent, and our mission--the hope that gets buried in the Sisyphean rolling of semesters up and down the calendar--is that the screenplay comes to life and realizes its potential.

It takes a lot of money to make this happen, and it comes from investors who are increasingly grumpy. They say the movies are no good. They're too expensive and nobody likes them. Film critics like Arum and Roksa used data from a large number of scripts in production to claim that a large portion weren't being significantly improved in the rewriting stage.

But we don't have any real information about what the audience thinks. We can analyze the heck out of our own productions, and do six-sigma and professionalize all we want, but until we understand what the audience thinks, we don't really know if we're doing any good. Maybe all those films get shown once, and then end up in a storage shed. 

Maybe some types of films shouldn't be made anymore. Maybe Forensic Dolphin Psychoanalysis shouldn't even be a major because there's no audience for it. 

The recent explosion of for-profits makes the problem more urgent to solve. Most industries have to depend on their products working. We don't have much direct evidence one way or the other in the kind of detail we need to make decisions. It's starting to happen, for example with student loan default rates getting attention. But we can go a lot further than that.

Imagine if we could sit down with the investors and agree on the long-term goals, have metrics that more or less tell us how well we're doing, and plan together how to get there. Are the goals strictly economic? Or do we want people to pursue happiness and bear arms? Even if it is just economics, it presents an opportunity to really understand how, say, liberal arts education matters in job mobility, lifetime income, and so on. Institutions could say "yes, you'll get a fine job, but in addition you'll have a fulfilling career." Or whatever their mission is. Instead of talking past each other about standardized assessments and authentic assessments, we could figure out how to work backwards from the real goals to real assessments that matter at the strategic level and then add institutional flavors that matter to us. That would be an exciting and productive conversation.

The alternative is grim. If we only focus on what goes on inside our production studios, the future of the nation is at the mercy of every critic with a theory or an agenda. I'm not sure which is worse: theories or agendas. Some will want to break down every step of movie making into a reductionist flow-chart, and create spreadsheets to show the rate of script-to-casting time or use biometrics to calculate charisma factors of the actors. There's no bottom to this because although movie scripts are only 120 pages, the individual brains of our students have perhaps 100 trillion synapses each. If each one is, say, eight bits of information, that's about four billion times the complexity of a movie script. Each one. Others will work backwards from agendas to create policies that make no sense in context.

Even if we think we've solved the problem and get universal agreement that our learning outcomes are being achieved at the university, how do we really know what the effect is after graduation unless we measure that? Maybe they're learning the wrong things. Maybe some small college has methods that work twice as well as ours. Maybe we can reduce costs and increase quality. The ingredients are wonderful: a diverse ecology of isolated experiments. We just can't see where the lab results are recorded in order to make conclusions.

Yes, we need individual student records. But we can't wait for that. The problem is too important. Maybe tax records and FAFSAs can't be mashed up for political or technical reasons (but do you really think Mark Zuckerberg couldn't figure this out in an afternoon?). If that's the case, then we have to find another way to measure historical and ongoing long-term outputs in such detail that it can inform institutional decisions.

What we're doing at present is creating our own dramatic screenplay: an epic version of "No Exit."


  1. I have two comments:

    1). When I correct for non-linearities in ACT scores, the ACT's predictive value relative to HS GPA spikes, especially for populations with below-average ACT scores. Meanwhile, HS GPAs are mostly linear with respect to the vast majority of outcome measures that I examine. I can't say anything about SAT scores, because I don't work with them much, but I suspect the worthiness of ACT scores is under-appreciated.

    2). I can't underscore enough the importance of expansive student-record tracking systems. I know that they are difficult to set up, manage, and maintain, but the potential value these systems have is simply enormous.

  2. Reuben--I'd like to know more about how you do the ACT correction. Non-linear with respect to what? Do you create separate linear models for different ranges?

  3. I find non-linearities between ACT scores and most variables that I examine (grades, graduation rates, self-reported income, probability of receiving a Pell grant, etc.). I do find that the relationship between ACT scores and retention rates is basically linear at our institution. I also find that it is linear for 4-year graduation rates (but not for 6-year graduation rates).

    If you can figure out the functional form, then you can simply plug the correct form into any regression model (e.g., cubic, quadratic, exponential, etc.). Sometimes, though, the relationship has no recognizable functional form (which usually means it is non-monotonic). In this case, estimating multiple linear models for different ranges is certainly a valid way to correct for the non-linearities.

  4. Thanks--that's interesting.
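The correction described in comment 3 can be sketched as follows. The data are simulated, and the curvature (an outcome that flattens out at high ACT scores) is a made-up assumption chosen only to show the mechanics: add a polynomial term and compare the fit against the plain linear model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1500

# Simulated ACT scores (1-36) and an outcome that flattens at the high
# end -- an assumed non-linearity, for illustration only.
act = np.clip(rng.normal(22, 4.5, n).round(), 1, 36)
outcome = 1.5 + 0.9 * np.log(act) + rng.normal(0, 0.15, n)

def fit_r2(X, y):
    """In-sample R^2 of an OLS fit, with an intercept column added."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - ((y - X @ beta).var() / y.var())

r2_linear = fit_r2(act, outcome)
r2_quadratic = fit_r2(np.column_stack([act, act**2]), outcome)
print(f"linear:    R^2 = {r2_linear:.3f}")
print(f"quadratic: R^2 = {r2_quadratic:.3f}")
```

Because the quadratic model nests the linear one, its in-sample R^2 can never be lower; the question is whether the improvement is large enough to matter, which is what the commenter reports seeing for below-average ACT populations. The piecewise alternative mentioned in the thread amounts to calling `fit_r2` separately on score ranges such as `act < 20` and `act >= 20`.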