Tuesday, March 15, 2011

Improving the Principles of Accreditation

If you are in the SACS/CoC region, you know all about the Principles of Accreditation, the document that outlines accreditation standards, and which every institution must use to report compliance every ten years and (with fewer sections) every five years in between.  Until March 31, the Commission is accepting comments that will inform a review of said document. This is an excellent opportunity to make your voice heard in what is, after all, a peer-review process.

I have some observations I will post here for comment before sending them off to the Commission, in order to see if others agree or can suggest better approaches. I will restrict my comments to the institutional effectiveness (IE) sections. So here goes.

Note: CR = Core Requirement, CS = Comprehensive Standard, and FR = Federal Requirement

CR 2.10  The institution provides student support programs, services, and activities consistent with its mission that promote student learning and enhance the development of its students. (Student Support Services)
Although this isn't formally in the IE sections, it has a requirement that student support services "promote student learning and enhance the development of its students." This is a clear IE requirement, and it is exceptional in that no other functional area, including academic programs, is required to pass this level of detailed IE review as a core requirement. Taken literally, an institution can be sanctioned severely for not assessing learning in student support services, which seems out of line with the more strategic-level requirements that make up the CR sections. I would suggest moving the IE language into 3.3.1, quoted below in its current version:
3.3.1 The institution identifies expected outcomes, assesses the extent to which it achieves these outcomes, and provides evidence of improvement based on analysis of the results in each of the following areas: (Institutional Effectiveness)
3.3.1.1 educational programs, to include student learning outcomes
3.3.1.2 administrative support services
3.3.1.3 educational support services
3.3.1.4 research within its educational mission, if appropriate
3.3.1.5 community/public service within its educational mission, if appropriate
Note that "student support services" doesn't appear. There are administrative and educational support services, but not student support services. The nomenclature needs to be cleaned up so we know exactly what we're talking about. One approach is to assume that any service that has learning outcomes is an educational support service, and therefore the modification could be as follows:
  1. End the statement of CR 2.10 after "mission," omitting the IE component.
  2. Change 3.3.1.3 to read "educational support services, to include student learning outcomes."

CS 3.5.1  The institution identifies college-level general education competencies and the extent to which graduates have attained them. (College-level competencies)
This is the general education assessment requirement. Note that it doesn't appear in 3.3.1 unless the institution defines general education as a program. On the other hand, 3.5.1 is NOT an IE requirement--there is no statement about using results to improve, just that you assess the extent to which students meet competencies. This is a knotty puzzle, so let me take it one part at a time.

First, the phrase "the extent to which" is ambiguous. Does it mean measured against an absolute standard or a relative one? This is by no means splitting hairs. If it means the former, then the institution MUST define an acceptable level of competency for each outcome, and presumably report the percentage of graduates who meet that standard. If it means a relative "extent to which," then simply reporting raw scores on a standardized test against national norms would work (a small sketch of the difference follows the two examples below).
Example (absolute): Graduates will score 85% or more on the Comprehensive Brain Test. In 2010, 51% of graduates met this standard.

Example (relative): In 2010, graduates averaged 3.1 on the Comprehensive Brain Test, versus a national average of 2.9.
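
To make the two readings concrete, here is a minimal sketch in Python of how each would be reported. The test name, scores, standard, and national norm are all invented for illustration; nothing here comes from the Principles themselves.

```python
# Hypothetical scores on the "Comprehensive Brain Test" for one graduating class.
# All numbers are invented for illustration.
scores = [2.4, 3.6, 3.1, 2.8, 3.9, 3.3, 2.7, 3.5]

# Absolute reading: the institution sets its own standard and reports
# the percentage of graduates who meet it.
standard = 3.0
pct_meeting = 100 * sum(s >= standard for s in scores) / len(scores)
print(f"Absolute: {pct_meeting:.0f}% of graduates met the standard of {standard}")

# Relative reading: the institution reports its mean against a national norm.
national_norm = 2.9
mean_score = sum(scores) / len(scores)
print(f"Relative: graduates averaged {mean_score:.2f} vs. a national norm of {national_norm}")
```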
The standard is also silent about the complexities of sampling graduates. Are assessments from all graduates to be included? I would assume not, since such a requirement generally isn't applied to IE processes because of its impracticality.

My sense of this is that we should allow institutions to define success in either absolute or relative terms, as best suits them, and include this standard with the other 3.3.1 sections so that it formally becomes part of IE. This would resolve the ambiguity about whether or not general education is a program, and require that general education assessments actually be used for improvements. It would also broaden the scope to include students generally, not just graduates, who may have been out of general education courses for two years by the time they finish.

The modification could simply be to:
  1. Add: CS 3.3.1.6 general education, to include learning outcomes
  2. Delete: CS 3.5.1

Finally, let's look at 3.3.1.1 itself. The first issue is subtle. It concerns the meaning of the language "provides evidence of improvement based on analysis of the results". This can mean two different things, and I've seen it interpreted both ways, causing confusion.

The first interpretation is that you have to demonstrate that improvement actually happened. This is a very high standard. It means things like benchmarking before changes are made, and then assessing the same way later on to see what impact occurred. When I hear speakers talk about QEP assessment, this is generally assumed, but it leaks over into the other IE areas too. Any time you take a difference between two measurements, you amplify the relative error--this is a basic fact from numerical analysis (a rough simulation of this appears after the list below). So the assessments have to be very good: objective (for reliability), numerous (for a small standard error), and scalar (so you can subtract and still have meaning). Also, pre-post tests are the only method that bears any resemblance to the scientific method. That is, you can't survey two different populations to compare unless you think you can explain all the variance between the populations (read Academically Adrift to see how problematic that is even for educational researchers). My objections to this are:
  1. No small program could ever meet this standard; the N will never be big enough.
  2. Many subjective assessments are very valuable, but they are useless under this interpretation.
  3. We don't yet have the technology to create scientific scalar indices of things like "complex reasoning" or other fuzzy goals, despite the testing companies' sales literature.[1]
  4. Random sampling is often impossible, which introduces biases that are probably not well understood.
  5. Pre-post testing is quite limited, and severely restricts the kinds of useful assessments we might employ.
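
To put a number on objection 1 and on the point about differences amplifying relative error, here is a back-of-the-envelope simulation. The program size, score spread, and 0.10 "true" gain are all assumptions I made up for the sketch; it is not drawn from any SACS guidance.

```python
import random

random.seed(1)

# Assumed scenario: a true improvement of 0.10 on some scale, measured pre and
# post with the sampling noise you would expect from a small program.
true_pre, true_post = 3.00, 3.10
n_students = 20                          # a small program
student_sd = 0.6                         # spread of individual scores
sem = student_sd / n_students ** 0.5     # standard error of each year's mean

def measured_mean(true_mean):
    # One year's observed mean = true mean plus sampling noise.
    return true_mean + random.gauss(0, sem)

# Simulate many pre/post comparisons and see how often the "gain" looks negative.
diffs = [measured_mean(true_post) - measured_mean(true_pre) for _ in range(10000)]
looks_like_decline = sum(d < 0 for d in diffs) / len(diffs)
print(f"True gain: 0.10. In about {looks_like_decline:.0%} of simulated comparisons "
      "the measured 'improvement' is actually a decline.")
```

With twenty students, the noise in the pre/post difference is roughly twice the size of the true gain, so a sizable fraction of perfectly honest assessments will appear to show no improvement at all.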
The other interpretation is simply that we use analysis of assessment data to take actions that could reasonably be expected to improve things, without having to prove that they did. This is the standard I've seen most widely applied, except perhaps for QEP impact. Arguably the higher standard should apply to the QEP, but that's beyond the scope of my comments here.

In practical terms, the second interpretation is the more useful one. It has to be borne in mind that assessment programs are mostly implemented by teaching faculty, who are probably not educational researchers by training and who, in my experience, tend to freeze into a sort of helplessness if they think they are expected to track learning outcomes the way a stock ticker tracks equity prices. I blogged about this a while back. Too much emphasis on proof of improvement is paralyzing and counterproductive. On the other hand, free-ranging discussions that include the meaning of results, subjective impressions from course instructors, and other information that speaks to the learning outcome under consideration are a gold mine of opportunities to make changes for the better.

The most powerful argument against the strict (first) interpretation, however, is that there is simply no way to guarantee improvement on some index unless (1) one cheats somehow, manipulating the index, or (2) the index reflects some goal that is so obviously easy to improve that it's trivial. Either way the meaningfulness of the program vanishes, and we are left with many programs that are out of compliance (not showing improvement) or in compliance in name only (showing fake improvement or showing trivial improvement). 

My recommendation is to clarify the meaning of the language to read (bold emphasizes the change):
CS 3.3.1 The institution identifies expected outcomes, assesses the extent to which it achieves these outcomes, and takes actions based on analysis of the results in each of the following areas: (Institutional Effectiveness)
It should be obvious that the actions are intended to effect positive change, since this is rather the whole point of IE.

There is another issue with 3.3.1 that deserves attention. It concerns goals or outcomes that are not about student learning. From the interpretations of 3.3.1 I've seen from reviewers and practitioners, the methods that apply to learning outcomes leak over into administrative areas, and the expectations for compliance are tilted out of plumb. For example, there seems to be an expectation that every unit do surveys and have goals related to them, and moreover that this is both necessary and sufficient.

Part of the problem may be the language "the extent to which," which almost insists on a scalar quantity. But in fact, the most important goals for a given unit's effectiveness may have little to do with surveys and may not naturally be scalar.

One example I saw takes issue with a compliance report that presented "action steps" as goals. An action step might be "Approval of architectural drawings of the new library by 5/1/11." This sort of thing shows up all over Gantt charts:

[Gantt chart example; image courtesy of Wikipedia]

This method of tracking complicated goals to completion is very common and very effective. The bars may or may not represent progress toward completion--they are simply timelines. So, for example, an OK stamp from the city engineer on your electrical plans is not a "percent to completion" item--it's either done or not. In other words, it's Boolean.

Reviewers can be allergic to Boolean outcomes like:
  • Complete the library building on time and on budget.
  • Implement the new MS-Social Work program by Fall 2012.
  • Gain approval via the substantive change process for a fully online program in Dance by 2013 (good luck!).
  • Maintain a balanced budget every fiscal year.
For some reason there seems to be a bias against this kind of goal, and I can't figure out why. These are obviously important operational items, key to the effectiveness of their respective units. Yet a unit that includes these but has no satisfaction survey may be cited, while a unit with a satisfaction survey and none of these may sail through.
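
As a hedged illustration only (the goals, fields, and targets below are invented, not taken from any compliance report), here is how Boolean and scalar goals might sit side by side in a unit's effectiveness record, each assessed in a manner appropriate to the outcome:

```python
# Illustrative record of one unit's goals for the year. Boolean goals are
# judged done/not done; scalar goals are judged against a target value.
goals = [
    {"goal": "Complete the library building on time and on budget",
     "kind": "boolean", "achieved": True},
    {"goal": "Implement the new MS-Social Work program by Fall 2012",
     "kind": "boolean", "achieved": False},
    {"goal": "Mean satisfaction with advising (1-5 scale)",
     "kind": "scalar", "target": 4.0, "observed": 3.6},
]

for g in goals:
    if g["kind"] == "boolean":
        status = "met" if g["achieved"] else "not met"
    else:
        status = "met" if g["observed"] >= g["target"] else "not met"
    print(f'{g["goal"]}: {status}')
```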

It may be that I'm making a mountain out of a termite hill here, but I think the language of "extent to which" could be changed to make it clearer that Boolean objectives are sometimes the natural way to express effectiveness goals.

For example, 3.3.1 could read (change bolded):
CS 3.3.1 The institution identifies expected outcomes, assesses success in a manner appropriate to each outcome, and takes actions based on analysis of the results in each of the following areas: (Institutional Effectiveness)
I know the non-parallel structure will offend the grammarians, but someone more expert can consider that conundrum.

One final nit-pick is that outcomes may not really be expected, but simply striven for--aspirational outcomes, in other words. If they are expected, then by nature they are less ambitious than they might otherwise be. I also moved the oddly dangling "in each of the following areas" to the front, where it belongs, so my final version is this:
CS 3.3.1 In each of the following areas, the institution identifies aspirational outcomes, assesses success in a manner appropriate to each outcome, and takes actions based on analysis of the results: (Institutional Effectiveness)

[1] I've written much about assessing complex outcomes before, and this is not the page to rehash that issue. The short version is that learning happens in brains, and unless we understand how brains change when we learn, we are not able to speak about causes and effects as they relate to the physical world. See this article about London taxicab drivers for a study that links learning to physiological changes in the brain. I don't mean to imply that assessing complex outcomes is useless, just that we should be modest about our conclusions. It's called complex for a reason.

2 comments:

  1. Anonymous, 12:05 PM

    2.10 I am personally happy with the level of fuzzy overlap between 2.10 and 3.3.1. As I read it, the existing language of 3.3.1 covers everything, including student support services; but 2.10 is a core requirement because institutions seeking initial accreditation MUST show that they provide such services and that at least some of them, in a general way, "promote student learning."
    3.5.1 Your suggestions here reflect my puzzlement about why this standard does not include the same language as 3.3.1 requiring a continuous improvement process. I do not think your proposed changes would be workable, though. Here's why: section 3.5 governs undergraduate education only. Many institutions accredited by SACS (med schools, seminaries) do not provide undergraduate education, and thus are exempted from the requirements of 3.5. If you move general ed to 3.3.1 then it applies to ALL institutions. I would vote instead to simply alter the language of 3.5.1 to replicate the language of 3.3.1.
    3.3.1 "improvement versus attempted improvement"--I applaud your comments and would be much happier personally with "takes actions" or even "takes actions and assesses the effectiveness of those actions." I am not sure the current state of educational politics will tolerate this, however.
    3.3.1 Boolean goals. I certainly understand this dilemma, and have fought it out locally. Boolean goals DO fit many administrative units. I'd vote for making your change, and for probably including a discussion of this issue in SACS training for IE reviewers.

  2. Thanks for the comments. I see your point about 3.5.1. I wonder if it would suffice to add "if appropriate" as with 3.3.1.4 and 3.3.1.5. That's not perfect either, but I think it would be a very odd thing for an undergrad institution to try to make the case that it didn't have an interest in general education.
