Tuesday, May 11, 2010

Assessing Your Program-Level Assessment Plan

...is the title of this IDEA paper (pdf).  It's a great introduction, suitable for sending out to chairs, program coordinators, and teaching faculty.  It describes the difference between program effectiveness and student learning, and elements of each that could be assessed. 

Note: there's actually much more in this article than I have time to write about today.  Below are some of the salient points, but take a look for yourself--it's good.

What's particularly good:  It's a very clear exposition of the big picture of assessment--well written, attractively presented, and fully formed for distribution as preparation for a chat with people who will be doing assessment.

There is good advice about constructing outcomes that speak to the whole range of performance:

[P]rogram-level student learning outcomes should be appropriate for students graduating from the program.
I often suggest using a scale of performance that matches the students' careers: pre-college through graduate, for example.  This is how our Faculty Assessment of Core Skills works, and it's very effective.

There is a description on page three about the need to have common goals, with the example of writing assessment used for illustration:
[T]here might be a program-level writing outcome, but instead of having a shared understanding among faculty of what constitutes competent college-level writing, each faculty member is allowed to define — and measure — the outcome in their own way. Grammar and mechanics might be the prevailing criteria for some faculty, while style and voice might be the most important elements for other faculty. Still other faculty might look at a paper’s organization and use of sources.
This passage makes a case, or at least implies one, for common rubrics, scoring methods, and standardization.  This is all good, but let me add a cautionary note: the recommendation could be confusing because it conflates two different things.

First, it can be very useful to let faculty say what they really think without rubrics or imposed agreements filtering their view.  This is appropriate for big fuzzy outcomes like thinking and communications skills.  However, as the paper points out, it is not useful for giving detailed feedback or identifying what to improve.  For that we could use detailed outcomes feedback tied to some assignment. Let's say it's a writing sample, and we're interested in style, content, audience, and correctness.  Each of these should be assessed individually--that's the detailed feedback we need.  Those ratings are going to be subjective, but there should be enough inter-rater reliability if they are suitably defined.  The problem is trying to aggregate these into something called "writing."  This is a very common error that causes all kinds of problems.  Let me clarify by example:
We give students a score in "sports" by averaging their scores from participating in baseball, table tennis, and swimming. 
Each of the sub-scores in the example might be meaningful, but the big average is not.  It might be correlated with something (we might call it athleticism), but it's useless for trying to identify improvements to pitching or backstroke.  So why do it?  My advice: every time you feel the urge to aggregate (e.g. by averaging), go do some push-ups instead.  Everyone will be happier.
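To see concretely what gets lost, here's a minimal sketch using the writing traits from the example above; the rubric scale and the individual scores are invented purely for illustration:

# Hypothetical rubric scores (1-4 scale) for two students on a writing sample.
# The trait names come from the example above; the numbers are made up.
students = {
    "Student A": {"style": 4, "content": 4, "audience": 1, "correctness": 1},
    "Student B": {"style": 2, "content": 3, "audience": 3, "correctness": 2},
}

for name, traits in students.items():
    average = sum(traits.values()) / len(traits)
    print(name, traits, "-> aggregate 'writing' score:", average)

# Both students average 2.5, but they need completely different help.
# The aggregate erases exactly the detail a program could act on.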

Curriculum mapping is covered, with a simple schematic.  I like it.  I like to add to the map where things are being assessed, too.  It's not necessary to assess everything all the time, and it's nice to have the big assessments in particular (e.g. the senior paper) marked.  You could even hyperlink to reports and results on the actual map.
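As a purely hypothetical illustration, a curriculum map with assessment points and report links might be represented as a simple data structure like the sketch below; the course numbers, outcome names, and URLs are all invented for the example:

# A sketch of a curriculum map with assessment points marked.
# Course numbers, outcome names, and report URLs are hypothetical.
curriculum_map = {
    "written communication": [
        {"course": "ENG 101", "level": "introduced"},
        {"course": "HIS 210", "level": "reinforced"},
        {"course": "SR 499 (senior paper)", "level": "mastered",
         "assessed": True, "report": "https://example.edu/reports/writing-2010"},
    ],
    "deductive thinking": [
        {"course": "PHI 110", "level": "introduced"},
        {"course": "MAT 220", "level": "reinforced",
         "assessed": True, "report": "https://example.edu/reports/logic-2010"},
    ],
}

# List only the points on the map where assessment actually happens,
# along with the link to the corresponding report.
for outcome, courses in curriculum_map.items():
    for c in courses:
        if c.get("assessed"):
            print(f"{outcome}: assessed in {c['course']} -> {c['report']}")

The point is only that the map itself can carry the "where is this assessed, and where are the results" information; it certainly doesn't need to live in code.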

A Cautionary Note:  There's no escaping that there are different philosophies about how to do assessment, or what the role of the administration in assessment is. The paper leans toward the formalization/bureaucratic end of this spectrum.  This is probably the mainstream view, but it's not my own approach.  For example, the paper recommends establishing a language of assessment, suggesting the terms "objectives, outcomes, competencies, dispositions, goals, indicators, measures, tools, and methods."  I suppose those are important, but I'd rather have faculty establish a vocabulary that relates directly to the learning outcomes themselves: things like "deductive thinking," or "evolution of creativity," or whatever it is they're trying to teach.

Finally, a note about data and results.  This is my first impression, and I may change it after mulling this over.  But I think the article goes a bit too far down the pseudo-science route.  This is again a mainstream idea, but I don't see it working well in practice.   Here are the recommended uses of data from the paper:
1. Does the data represent an identifiable trend?
2. Does the data represent an acceptable level of achievement?
3. Does the data surprise you?
The simple analogy to this is monitoring the oil in your vehicle over time.  You can monitor the level, detect leaks, see how dirty it is, and watch for other unusual conditions.  The remedies are straightforward.
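For what it's worth, the oil-check version of those three questions is easy enough to mechanize. Here is a minimal sketch applied to a made-up series of yearly scores; the numbers, the benchmark, and the two-standard-deviation cutoff for "surprise" are all assumptions for illustration, not anything the paper prescribes:

# Hypothetical yearly program averages and an assumed benchmark.
scores = [151.2, 150.8, 152.1, 149.9, 151.5, 150.4]
benchmark = 150.0

n = len(scores)
xbar = (n - 1) / 2
ybar = sum(scores) / n

# 1. Identifiable trend: least-squares slope per year.
slope = sum((i - xbar) * (y - ybar) for i, y in enumerate(scores)) / \
        sum((i - xbar) ** 2 for i in range(n))

# 2. Acceptable level: compare the latest value to the benchmark.
acceptable = scores[-1] >= benchmark

# 3. Surprise: flag any year more than two standard deviations from the mean.
sd = (sum((y - ybar) ** 2 for y in scores) / n) ** 0.5
surprises = [i for i, y in enumerate(scores) if abs(y - ybar) > 2 * sd]

print(f"slope per year: {slope:+.2f}, acceptable: {acceptable}, surprise years: {surprises}")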

On the other hand, real-life assessment is not like that.  Last week I attended a presentation by faculty wrapping up their assessments for the year.  They had a graph of Major Field Test averages (a standardized, discipline-specific test from ETS) going back ten or twelve years.  It looked sort of like this:
[Graph of Major Field Test averages over roughly the past decade, not reproduced here.]

In my experience, the best opportunity for finding improvements based on assessments is while the faculty are still looking at the results of student work.  Example:
A dance student gives her final project performance.  During the performance, a group of three faculty (one from outside the discipline) make notes on a standardized review sheet according to their established system.  Later, they review the student's artistic statement, portfolio of work, and notes from the performance to identify strengths and weaknesses.  They can easily determine whether these are likely due to the curriculum or to the particular student.  After doing a dozen of these reviews, patterns will be evident all by themselves.  You won't need a graph to tell you that the lighting was bad on all the performances, or that the writing in the statements was just not acceptable, and so on.  There is no need to try to identify every aspect, nail it down with a rubric, and plot it on graph paper.  I think this view (the pickle-factory management perspective) simply doesn't trust faculty with the most important part of the process.  But I think I'm in the minority on this.  
There's more in this vein--about using tracking systems to manage all the data that's going to roll in, for example.

These distractions, for me, do not diminish the nice presentation.  You might just want to be prepared to answer a few practical questions, and have a game plan in mind for small programs that aren't going to produce meaningful graphs and charts and 8x10 color glossy pictures of learning outcomes.
