I don't propose to do a review of either of these papers here, but one quote struck me from the latter. It should first be noted that the authors strike a cautious note about the nature of such research in the introduction on page 2:
[O]ur position is that future efforts by survey researchers should: (a) clarify the basis for claims about “effect sizes”; (b) develop better measures of teachers’ knowledge, skill, and classroom activities; and (c) take care in making causal inferences from nonexperimental data.
The point from the paper that struck me was this quote from page six:
Two important findings have emerged from these analyses. One is that only a small percentage of variance in rates of achievement growth lies among students. In cross-classified random effects models that include all of the control variables listed in endnote 4, for example, about 27-28% of the reliable variance in reading growth lies among students (depending on the cohort), with about 13-19% of the reliable variance in mathematics growth lying among students. An important implication of these findings is that the “true score” differences among students in academic growth are quite small [...]
Let's think about that for a moment. The variation in "learning" (as numerically squashed into an average of a standardized test result, I think) is mostly not due to the variation among students. This struck me as absurd at first, but then I realized it's just a statement about how variable students are: to what degree the phenotypes sitting in our classrooms differ in their respective abilities to learn integral calculus (in my case).
As a reality-check, I pulled up my FACS database and classified students by their first semester college GPA and looked at their average trajectory in writing, as assessed by the faculty. Here it is.
The top line is 3.0+ students, then comes 2.0-2.99 students and so on over eight semesters, showing average writing scores. The improvement semester by semester does seem pretty constant, regardless of how "talented" the students are. Note that this particular graph isn't controlled for survivorship.
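For readers who want to try the same reality-check on their own data, here is a minimal sketch of the grouping step in Python. It is not the FACS database query itself: the record layout `(first_semester_gpa, semester, writing_score)`, the function names, the band edges below 2.0, and the sample rows are all assumptions made up for illustration. Like the graph, it averages only over students who are present in a given semester, so it does not control for survivorship either.

```python
from collections import defaultdict
from statistics import mean


def gpa_band(gpa):
    """Map a first-semester GPA to a coarse band (edges below 2.0 are assumed)."""
    if gpa >= 3.0:
        return "3.0+"
    if gpa >= 2.0:
        return "2.0-2.99"
    if gpa >= 1.0:
        return "1.0-1.99"
    return "below 1.0"


def average_trajectories(records):
    """Average writing score per semester within each GPA band.

    records: iterable of (first_semester_gpa, semester, writing_score).
    Returns {band: {semester: mean_score}}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for gpa, semester, score in records:
        buckets[gpa_band(gpa)][semester].append(score)
    return {
        band: {sem: mean(scores) for sem, scores in sorted(by_sem.items())}
        for band, by_sem in buckets.items()
    }


# Made-up rows, purely illustrative (not real FACS data).
sample = [
    (3.4, 1, 2.0), (3.4, 2, 2.5),
    (3.1, 1, 3.0), (3.1, 2, 3.0),
    (2.5, 1, 1.0), (2.5, 2, 2.0),
]
trajectories = average_trajectories(sample)
```

Plotting each band's values in `trajectories` against semester gives one line per band, which is the shape of chart described above.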
The numbers hide most of the real information, however. Is the improvement of the lowest group really comparable to that of the uppermost? There are different skills involved in teaching fast learners versus slow ones (and speed of learning depends considerably on how hard the students work, a non-cognitive factor). If one substitutes standardized test scores for actual learning, this problem can only get worse.
No, I still don't like averages. Here's the non-parametric chart for the 3.0+ group over eight semesters.
The blue portion of the bar is the proportion of these students who receive the highest rating, and so forth. The red ones are "remedial" ratings.
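The numbers behind a chart like this are just the share of each rating within each semester, with no averaging of the ratings themselves. A minimal sketch, assuming records of the form `(semester, rating_label)`; the function name, the middle label "adequate", and the sample rows are hypothetical, not the actual FACS scale or data:

```python
from collections import Counter, defaultdict


def rating_proportions(records):
    """Share of each rating label within each semester.

    records: iterable of (semester, rating_label).
    Returns {semester: {rating_label: proportion}}.
    """
    counts = defaultdict(Counter)
    for semester, rating in records:
        counts[semester][rating] += 1
    result = {}
    for semester in sorted(counts):
        total = sum(counts[semester].values())
        result[semester] = {r: n / total for r, n in counts[semester].items()}
    return result


# Made-up ratings, purely illustrative.
sample_ratings = [
    (1, "remedial"), (1, "remedial"), (1, "adequate"), (1, "highest"),
    (2, "remedial"), (2, "adequate"), (2, "highest"), (2, "highest"),
]
proportions = rating_proportions(sample_ratings)
```

Each semester's proportions sum to one, so they can be drawn as a single stacked bar per semester. Because nothing is averaged, the ordinal ratings never have to be treated as if they were interval-scaled numbers.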
Update: the effect size of differences between students in my FACS scores is small, but it seems to be real, especially if you look at particular treatments like that of the writing lab in an earlier article. As a first approximation, perhaps student abilities don't matter as to how much they learn, but I strongly suspect that conclusion is vulnerable on a number of fronts. See my more recent article on Edupunk and the Matthew Effect.