Monday, July 26, 2010

Course Evaluation Ranks

This is a follow-up to "Online Evals." In the spring we changed from paper student evaluations (using ETS's SIR II) done in class to electronic online evaluations (also SIR II). Some of these were done in class, with students bringing in laptops. Most were not, although we don't have an exact count.

One of the concerns expressed by faculty beforehand was that the scores would change. The previous post shows that the distribution of course averages definitely changed, from one skewed toward the positive end to a more symmetrical one. It occurred to me that perhaps the most important question from a faculty point of view is what happens to the relative rankings of instructors when this happens. To find out, we computed the absolute difference |paper rank - online rank| for each faculty member who was evaluated in both semesters, then averaged those differences, using the single item on the evaluation that's used for administrative purposes (yeah, I know....). There were 82 faculty in the sample, so the greatest possible change for any one person was 81 ranks. The average for paper-paper semesters was 17, and for paper-electronic it was 20 or 24, depending on which semester we used. The standard deviations were around 1.8, but the distribution is not normal, as you can see in the histogram below, which shows absolute rank change.
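For anyone who wants to try the same comparison on their own data, here is a minimal sketch of the rank-change calculation in Python. The function names and the five sample scores are made up for illustration; they are not our data, and giving tied scores the average of their ranks is an assumption rather than a description of exactly what we did.

```python
from statistics import mean

def ranks(scores):
    """Return 1-based ranks (1 = highest score); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Find the end of the run of scores tied with the one at position pos.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result

def mean_abs_rank_change(scores_a, scores_b):
    """Average of |rank in semester A - rank in semester B| over the same instructors."""
    ranks_a, ranks_b = ranks(scores_a), ranks(scores_b)
    return mean(abs(a - b) for a, b in zip(ranks_a, ranks_b))

# Hypothetical scores on the single administrative item, one entry per instructor,
# listed in the same instructor order for both semesters.
paper = [4.6, 4.2, 3.9, 4.8, 4.0]
online = [4.1, 4.3, 3.5, 4.4, 3.8]

print(mean_abs_rank_change(paper, online))
```

The same function applied to two paper semesters gives the paper-paper baseline to compare against.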

Generally, paper-paper performs better, meaning that the ranks are more stable than paper-online.


  1. Anonymous, 7:48 AM

    I have no problem with your observation that "paper-paper performs better," but I fail to see why that means "that the ranks are more stable than paper-online."

    Perhaps taking a course eval on paper, in the classroom, encourages a student, whether by tacit intimidation or otherwise, to score a faculty member more highly than they deserve. A student away from the classroom situation, by contrast, can take more time to reflect on their experience and offer a more honest assessment of the learning experience.

  2. I just meant that the variation is higher for paper-online differences, which is not an improvement.

    From personal experience, I think there is something to your idea about why the difference occurs. If you saw the distribution in the first post, you saw how dramatic the difference is. It would be nice to know what's going on.