Saturday, April 15, 2006

SACS 3.3.1

The Commission on Colleges is reviewing its Principles of Accreditation. Below is a comment I sent them on section 3.3.1, which deals with institutional effectiveness. I was motivated by my experiences during our recent accreditation, where I encountered different interpretations of what IE is, and how it should be done.

I’m writing to suggest that 3.3.1 be clarified either in the statement of the section itself or in secondary documentation.

The language in both the old and new drafts is similar:

"The institution identifies expected outcomes, assesses whether it achieves these outcomes, and provides evidence of improvement based on analysis of the results..."

The problem is with the "evidence of improvement" part. The requirement seems very reasonable until you actually try to put it into practice, at which point a sticky logical problem arises. The assessment loop described in the statement sounds like this:

1. Set a goal

2. Measure attainment of the goal and do some analysis

3. Provide evidence that the goal is now being attained better

The problem is that there is a logical gap between 2 and 3. In my conversations with others, there seems to be wide divergence of opinion about what this part of the standard means.

The missing steps that are implied between 2 and 3 are inserted below:

1. Set a goal.

2. Measure attainment of the goal and do some analysis.

* Take actions that may lead to improvement.

** Measure goal attainment again, and compare to the first time you did it.

3. Provide evidence that the goal is now being attained better.

Example:

1. Goal: reduce attrition

2. Measurement: currently the 5-year grad rate is 60%

* Take action: raise admission standards

** Measure again: now the 5-year rate is 62%

3. Evidence: before and after rates constitute evidence of improvement

Notes: without the step marked *, there is no guarantee that the unit is doing anything but measuring. A unit could simply measure a lot of things, identify a couple that randomly improved, and report that as evidence of compliance with 3.3.1. The quick simulation below illustrates how easily that can happen.
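To make that concrete, here is a minimal sketch in Python. The number of metrics, the underlying rate, and the noise level are all invented for illustration; the point is only that, with no real change and no action taken, roughly half of any set of measurements will drift upward from one year to the next by chance alone.

```python
# Illustrative sketch (hypothetical numbers): a unit tracks several metrics,
# nothing actually changes, yet some metrics still look "improved" year over year.
import random

random.seed(1)

N_METRICS = 20      # hypothetical number of metrics a unit tracks
TRUE_VALUE = 0.60   # the true underlying rate for every metric (no real change)
NOISE = 0.03        # assumed year-to-year measurement noise

improved = 0
for _ in range(N_METRICS):
    year1 = TRUE_VALUE + random.gauss(0, NOISE)
    year2 = TRUE_VALUE + random.gauss(0, NOISE)
    if year2 > year1:
        improved += 1

# Typically about half the metrics go up purely by chance, so a report that
# cherry-picks the ones that rose would look like compliance with 3.3.1
# without any action having been taken.
print(f"{improved} of {N_METRICS} metrics 'improved' with no action taken")
```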

An emphasis on taking action intended to result in improvement is lacking in the current version of 3.3.1. This has led (in my opinion) to strange interpretations of the standard. For example, a consultant with a degree in Institutional Research, who had been a member of a SACS visiting team, described to me an IE ‘success’ story, recounted below as I remember it.

At a certain college, there is a required freshman course on a particular subject. Upon graduation, the students were tested on retained knowledge in that subject, and the results turned out to be disappointingly low. The action the administration took was to teach the course in the junior year instead of the freshman year. Since the students had less time to forget the material, test scores went up, and everyone celebrated.

If you accept my version at face value, it illustrates my point. Is it an appropriate action to shorten the time between study and assessment in order to improve scores? Maybe so, but it seems like there ought to be more to consider. Why were the students taking the course? Did they perhaps need the knowledge in their freshman year? If the goal is simply to ensure that graduates have memorized some facts, then maybe the action was the correct one. The point is that without any emphasis on what actions are taken and why, it’s possible to conform to the letter of 3.3.1 without actually accomplishing anything.

Of course, anyone familiar with the theory and practice of institutional effectiveness will understand that 3.3.1 implies that there are necessary actions between steps 2 and 3. But I would like to convince you that step 3 should actually be replaced in the statement of 3.3.1 by a statement about action:

Proposed version:

The institution identifies expected outcomes, assesses whether it achieves these outcomes, and provides evidence of actions intended to produce improvements. These actions should be based on analysis of the results...

This is a more accurate statement of what most IE activities actually are. There are several related points:

  1. There is never a guarantee of action producing favorable results. That is, you can have the best assessment plan in the world, the best intentions, and reasonable actions, and still fail to produce improvements through no fault of your own. Many things are exceedingly hard to predict (like the stock market, the price of oil, or the weather, all of which affect institutions). A properly functioning IE system will still make many mistakes because of that. A realistic 3.3.1 would take this into account.

  2. As it stands in both the old and new versions of 3.3.1, the modifying phrase “based on analysis of the results” is ambiguous. Does it mean that the evidence is based on analysis of the results, or that the improvement is? That is, which reading is intended:
    1. “The institution identifies expected outcomes, assesses whether it achieves these outcomes, and provides evidence of [improvement based on analysis of the results]…”,

      In this case, the improvements must be based on analysis of results, implying that the actions leading to improvement were evidence-based. Analysis led to action.

    2. “The institution identifies expected outcomes, assesses whether it achieves these outcomes, and provides evidence [based on analysis of the results] of improvement…”

      Here it’s the evidence, rather than any improvement, which is based on analysis of results.

      This interpretation implies that the data demonstrate improvement.

To someone who doesn’t actually have to put this into practice, this may seem like counting the number of angels who can sit on a pin. But for the practitioner there is a crucial philosophical difference between the two interpretations, one that must eventually be confronted. It boils down to these two different assessment loops:

(a) First interpretation (action-based):
1. Set a goal
2. Measure attainment of the goal
3. Take reasonable actions, based on what you’ve learned, to try and improve the situation

(b) Second interpretation (evidence-based):
1. Set a goal
2. Measure attainment of the goal
3. Prove from analysis of the measurements that you’ve improved the situation

As I’ve already pointed out, the second interpretation leaves out some steps: taking action and measuring goal attainment a second time. I am arguing that it is impractical in many situations. Unfortunately, in conversations with others in the business, it seems to be the more common reading. At the very least, this should be clarified.

Let me refer to the first, action-based, interpretation as AB, and the second, evidence-based as EB, for convenience.

While EB may sound reasonable as a policy, it rules out many popular kinds of assessment. For example, many institutions rely on surveys for formative assessments. Under EB, one would have to do a scientifically valid before-and-after survey to show compliance. This is difficult in practice, since surveys are usually not done in a rigorous scientific manner. In general, surveys are better at suggesting what actions might be useful than at precisely identifying goal attainment. In fact, most subjective, formative assessments like focus groups, suggestion boxes, and the like would be invalid for most uses under the EB interpretation of 3.3.1, because they cannot give reliable before-and-after comparisons with statistical accuracy (at least under the conditions in which IR offices normally operate). The rough calculation below suggests why.
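As a back-of-the-envelope sketch (the sample size and response rate here are hypothetical, not taken from any actual survey), consider the sampling error alone on a typical campus survey:

```python
# Rough sketch of why small survey samples rarely support before-and-after claims.
# The respondent count and satisfaction rate are assumed values for illustration.
import math

n = 200   # hypothetical number of survey respondents
p = 0.60  # hypothetical proportion answering "satisfied"

# Approximate 95% margin of error for a proportion (normal approximation)
margin = 1.96 * math.sqrt(p * (1 - p) / n)
print(f"95% margin of error: +/- {margin:.1%}")  # roughly +/- 6.8 points
```

A year-over-year change of a few points falls well inside that margin, so asking two informal surveys to prove improvement is asking more than they can deliver.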

Despite this, practitioners and reviewers of IE programs often see this as a valid assessment loop:

1. Set a goal.

2. Measure aspects of goal achievement by surveying constituents.

3. Analyze survey results to identify an action to take.

4. Take action.

Notice, however, that this is not EB, since there is no evidence that any improvements are made, unless you take the leap of faith that the actions taken in response to the survey improve goal attainment. Without direct evidence on that point, this is a reach. On the other hand, it is clearly AB.

Strict adherence to EB still leaves plenty of summative assessment techniques, usually a calculation like the number of graduates, the discount rate, standardized test scores, etc. With these, one can obtain before-and-after comparisons that are sometimes meaningful.

Of course, one can use both kinds of assessment.

In an ideal world, the assessment loop would look like this:

  1. Set a goal.
  2. Measure goal attainment with a summative assessment.
  3. Identify possible strategies with a formative assessment.
  4. Take actions based on analysis of the formative assessment.
  5. Measure the goal attainment again with the summative assessment.
  6. Show evidence of improvement, based on comparison of the summative assessments.

I’ll refer to this as the AB+EB model. This is probably the best approach for those goals for which a summative assessment is available. It is not, however, what is described in 3.3.1, which omits any reference to taking action.

It must be admitted, however, that even pursuing AB+EB is still no guarantee of success. Just because we set a goal doesn’t mean that goal is even theoretically attainable, let alone that we will find a way to attain it practically.

Viewed in this light, the requirement in 3.3.1 of evidence of improvement seems too strong for many situations. If the goal is simply completion of a task, and by completing it we improve some situation, then that goal trivially satisfies 3.3.1 as soon as the task is completed. But for more complex issues, like retention or student learning outcomes, what is really called for is an active process of trying things out to see what works. Judging what works is often subjective.

There are many situations where AB+EB is impractical. For example, learning outcomes are often measured in subjective ways that are not amenable to summative assessment. Our Art program has a very good sophomore review process, during which the faculty interview students and review their work and progress. This is highly individualized, and the actions taken are customized accordingly. As an assessment process it works very well, but it could not be standardized to the point where the EB component would be possible.

The misplaced emphasis on EB creates a climate in which excessive standardized testing for learning outcomes is favored over more effective subjective assessments. This is unfortunate, because the validity of such standardized instruments is often questionable.

As an example, at a round-table discussion at the annual SACS meeting, a group of IR directors was discussing how to measure general education outcomes. Several were of the opinion that you could just take a standardized test off the shelf, one that had already been proven valid, and use it. This readiness to accept validity from a third party belies the very meaning of validity: “Does this test give meaningful answers to the questions I’m asking?” A test-maker isn’t the one asking the questions and setting the goals. Without a very thorough review of these issues, an artificial summative measure is likely to appear to give meaning where there actually is none.

To summarize: demonstration of action is more important and more practical than demonstration of success via a summative assessment. The AB model should be the standard described in 3.3.1, instead of the ambiguous language that can be interpreted as AB or EB (but probably EB).

One final note: I think Trudy Banta is right when she calls for more scholarship of assessment. The community would benefit from more published research, widely disseminated, on these topics. In order for higher education to adequately respond to calls for accountability (such as from the Commission on the Future of Higher Education), we need wide, intelligent, practical discourse on the subject of what constitutes effectiveness.