Saturday, November 06, 2010

SAT-W: Does Size Matter?

This ABC News story tells of 14-year-old Milo Beckman's conclusion that longer SAT essays lead to higher scores. Interestingly, the College Board's response dances around the issue but doesn't deny it:
Our own research involving the test scores of more than 150,000 students admitted to more than 100 colleges and universities shows that, of all the sections of the SAT, the writing section is the most predictive of college success, and we encourage all students to work on their writing skills throughout their high school careers.
It would have been easy to say that with such-and-such alpha level, controlling for blah and blah, the length of the essay is not a significant predictor of the total score. But they didn't. It is interesting that the writing score is supposed to be the most predictive. It's not hard to find the validity report on the College Board SAT page. Here are the gross averages from the report.
The average high school GPA is much higher than I would have expected. Notice too that the first year college GPA is .63 less. The obvious question here is whether this difference has increased over time due to grade inflation. An ACT report "Are High School Grades Inflated?" compares 1991 to 2003 high school grades and answers in the affirmative:
Due to grade inflation and other subjective factors, postsecondary institutions cannot be certain that high school grades always accurately depict the abilities of their applicants and entering first-year students. Because of this, they may find it difficult to make admissions decisions or course placement decisions with a sufficient level of confidence based on high school GPA alone.
This is somewhat self-serving. Colleges don't really need to predict the abilities of their applicants--that's not how admissions works. What we actually do is try to get the best students we can for the net revenue we need. That is, a rank ordering is sufficient, and even with grade inflation, high school grades still provide one. In the end, the SAT and ACT are used to rank students for admissions decisions too.

The grade inflation is remarkable. Let's take a look at this graph from the ACT report.

Two things are obvious--first, the relationship between grades and the standardized test is almost linear. Especially in the 1991 graph, the test adds very little average information (by which I mean the deviation from a straight line is very small). Second, there's an inflation-induced compression effect: the high achievers get crammed up against the top of the grading scale, which reduces its power to discriminate among them.

Recall from above that the average SAT-taker's high school GPA was 3.6 in the most recent report, and look where that falls on the scale above. We could guess that the "bend" in the graph is getting worse over time, and probably represents a nonlinear relationship between grades, standardized tests, and college grades. If you have a linear predictor (e.g. for determining merit aid), it would be good to back-test it to see where the residual error is.
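For example, here's a minimal back-test sketch in Python. The file name (admits.csv), the column names (hs_gpa, sat_total, fygpa), and the GPA bands are all hypothetical stand-ins, not anything from the College Board or ACT reports; the point is just to look for systematic residual error by high school GPA band.

```python
# A minimal back-test sketch, assuming a hypothetical file admits.csv with
# columns hs_gpa, sat_total, and fygpa (first-year college GPA).
import numpy as np
import pandas as pd

df = pd.read_csv("admits.csv")

# Fit the same kind of linear predictor a merit-aid formula might use.
X = np.column_stack([np.ones(len(df)), df["hs_gpa"], df["sat_total"]])
coef, *_ = np.linalg.lstsq(X, df["fygpa"], rcond=None)
residuals = df["fygpa"] - X @ coef

# Average residual by high school GPA band: a systematic drift in the top
# band would be the compression/nonlinearity effect showing up.
bands = pd.cut(df["hs_gpa"], bins=[0, 3.0, 3.25, 3.5, 3.75, 4.0])
print(residuals.groupby(bands).agg(["mean", "std", "count"]))
```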

First Year College GPA is related to the other variables through a correlation, taken below from the SAT report.

We're really interested in R-squared, the percentage of variance explained. In the very best case, taking the biggest number on the chart and squaring it gives about 36%. That is, the other two thirds of the variance in first-year performance is left unexplained by these predictors. Indeed, the SAT-W correlation is larger than those of the other two SAT components (even combined). Now why might that be? Is SAT-W introducing a qualitatively different predictor?
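For concreteness, here's the arithmetic, with 0.6 used as a stand-in for the largest correlation on the chart:

```python
# The arithmetic behind "about 36%": square the largest correlation on the
# chart (roughly 0.6 here) to get the share of first-year GPA variance explained.
r = 0.60                       # stand-in for the best single correlation
r_squared = r ** 2
print(f"explained:   {r_squared:.0%}")      # ~36%
print(f"unexplained: {1 - r_squared:.0%}")  # ~64%, about two thirds
```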

To pursue that thought, suppose Mr. Beckman is right, and the SAT-W is heavily influenced by how long the essay is. This leads to an interesting conjecture. Perhaps what is happening is that the SAT-W is actually picking up a non-cognitive trait. That is, perhaps in addition to assessing how well students write, it also assesses how long they stick to a task: their work ethic, so to speak. If so, I wonder if this is intentional. The College Board has a whole project dealing with non-cognitives, so it's certainly in the air (see this article in Inside Higher Ed).

My guess is that they figured out the weights in reverse, starting with a bunch of scored essays and trying out different measures to see which ones best predicted the scores. Length was one that came up as significant. It's not an entirely crazy idea.
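Here's a sketch of what I mean by fitting weights in reverse; the feature set and function names are invented for illustration, not anything the College Board has published.

```python
# A sketch of fitting weights in reverse: start with human-scored essays,
# compute candidate features, and see which ones carry the prediction.
import numpy as np

FEATURE_NAMES = ["intercept", "word count", "avg word length", "sentence count"]

def features(text):
    words = text.split()
    return [
        len(words),                                           # essay length
        sum(len(w) for w in words) / max(len(words), 1),      # average word length
        text.count(".") + text.count("!") + text.count("?"),  # rough sentence count
    ]

def fit_weights(essays, scores):
    """Least-squares weights mapping essay features to human scores."""
    X = np.array([[1.0, *features(e)] for e in essays])  # intercept + features
    y = np.asarray(scores, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(FEATURE_NAMES, w))
```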

You can read the essays. I did not know this until I saw the College Board response to the ABC article:
Many people do not realize that colleges and universities also have the ability to download and review each applicant's SAT essay. In other words, colleges are not just receiving a composite writing section score, they can actually download and read the student's essay if they choose to do so.
So theoretically, you could do your own study. Count the words and correlate.
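A minimal version of that study, assuming you've assembled the downloaded essays and their scores into two parallel lists (the names here are hypothetical):

```python
# Count the words and correlate: Pearson r between essay length and the
# reported essay score. Both arguments are data you'd assemble yourself.
import numpy as np

def length_score_correlation(essays, essay_scores):
    """Correlation between word count and essay score for parallel lists."""
    word_counts = [len(e.split()) for e in essays]
    return float(np.corrcoef(word_counts, essay_scores)[0, 1])
```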

Finally, I have to say that the ABC News site is an example of an awful way to present information. It makes my head hurt just to look at the barrage of advertisements that are seemingly designed to prevent all but the most determined visitors from actually reading the article. I outlined the actual text of the article in the screenshot below.

The first thing you get is a video advertisement in the box. If you want to read the whole article, you have to click through four pages of this stuff. I didn't make it that far. Maybe it's a test of my stick-to-it-ness, and there's a reward at the end for those with the non-cogs to complete the odyssey. If so, I flunked.

Update: Here's a 2005 NYT article about SAT writing and length, thanks to Redditor punymouse1, who also contributed "Fooling the College Board."
