[Comic. Photo credits: (first panel) Garrettc; (second and fourth panels) Kaptain Kobold; (third panel) Josh Bancroft; (fifth and last panels) bjornmeansbear. This comic may be distributed under Creative Commons.]
Strategy 1. [P]eople are much more upset by prospective losses than they are pleased by equivalent gains.
Here the idea is to frame opportunities in the negative. Compare:
We have an opportunity to become known for our liberal arts studies program.
with
If we don't act, we'll sacrifice a chance to be the first ones to do this.
Maybe this is why using the "wesayso" stick of accreditation is so effective: do this or else you may have to go job hunting.
Strategy 2. Volume matters
This is the idea (which I seem to recall reading in Machiavelli's The Prince) of spreading out good news and administering bad news all at once. It's no coincidence that it's usual practice for the White House to dump bad news on a Friday afternoon, when it's less likely to be smeared out over a whole week's media frenzy.
The myth of assessment: if we only had better tests, it would be obvious how to improve teaching.
This shows up early in the piece (emphasis added):
To improve the success rates of students who are unprepared for college-level work, community colleges must develop richer forms of student-learning assessment, analyze the data to discover best teaching practices, and get faculty members more involved in the assessment process[.]
Although this isn't lit-crit gobbledygook like "monological imperatives," it's arguably more misleading because of the image of simplicity that's conjured up--the idea that good analysis of test results will show us how to teach better. It's actually a lot more complicated than that. The article goes on to be more specific, describing the results of a paper "Toward Informative Assessment and a Culture of Evidence" by Lloyd Bond:
[M]easures should be expanded to include more informative assessments such as value-added tests, common exams across course sections, and recordings of students reasoning their way through problem sets[.]
Value-added tests (such as pre-/post-testing) may show areas of improvement, but not why they improved. For that you'd need controlled experiments across sections using different methods, and even then you only have a correlation, not a cause. Same with common exams. Transcripts of student reasoning could be good fodder for an intelligent discussion about what goes right or wrong, but can't by themselves identify better teaching methods.
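To make the pre-/post-testing point concrete, here is a minimal sketch (the section names and scores are invented, not from the article or the Bond paper) of how a value-added comparison might be computed. It can show where gains happened, but nothing in the arithmetic says why.

```python
# Hypothetical pre-/post-test scores by course section (0-100 scale).
# The gain is a "value-added" style measure: it shows where improvement
# happened, but says nothing about which teaching practice caused it.
sections = {
    "MAT-090-A": {"pre": [42, 55, 38, 61, 47], "post": [58, 70, 51, 72, 60]},
    "MAT-090-B": {"pre": [40, 52, 45, 59, 50], "post": [49, 60, 50, 66, 57]},
}

def mean(xs):
    return sum(xs) / len(xs)

for name, scores in sections.items():
    gain = mean(scores["post"]) - mean(scores["pre"])
    print(f"{name}: average gain of {gain:.1f} points")
```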
From the beginning of the project, the Carnegie team stressed the importance of having rich and reliable evidence—evidence of classroom performance, evidence of student understanding of content, evidence of larger trends toward progress to transfer level courses—to inform faculty discussion, innovation, collaboration and experimentation. Because teaching and learning in the classroom has been a central focus of the Carnegie Foundation’s work, our intent was to heighten the sensitivity of individual instructors, departments, and the larger institution generally to how systematically collected information about student learning can help them improve learning and instruction in a rational, incremental, and coherent way.
The tests themselves provide rough guideposts for the learning landscape. It's the intelligent minds that review such data that lead to possible improvements (from page 3):
[T]he development, scoring, and discussion of common examinations by a group of faculty is an enormously effective impetus to pedagogical innovation and improvement.
The process described is effective not because the exam showed the way, but because a dialogue among professionals sparks innovation. I've made the point before that when solving very difficult problems, the most robust approach is evolutionary--try something reasonable and see what happens. This report emphasizes that the "see what happens" part does not even rely on perfect data:
To summarize, encouraging a culture of evidence and inquiry does not require a program of tightly controlled, randomized educational experiments. The intent of SPECC was rather to spur the pedagogical and curricular imagination of participating faculty, foster a spirit of experimentation, strengthen capacity to generate and learn from data and evidence[...]
The important part is not the test, but what you do with the results. This is the opposite of the conclusion one would reach from reading the quotes in the review article, which immediately devolves to the Myth.
[W]e have used several terms: pre-collegiate, developmental, remedial, and basic skills, recognizing that these are not synonymous and that, for better or worse, each brings its own history and values.
I suggest that we add to the list "differently-learned."
One of the hallmarks of a senior graduate student is that he or she knows the types of tasks that require permission and those that don't.
Others include tenacity, flexibility, and interpersonal skills. Here Dr. Azuma writes:
Computer Science majors are not, in general, known for their interpersonal skills. [...] [Y]our success in graduate school and beyond depends a great deal upon your ability to build and maintain interpersonal relationships with your adviser, your committee, your research and support staff and your fellow students. [...] I did make a serious effort to learn and practice interpersonal skills, and those were crucial to my graduate student career and my current industrial research position.
He then cites "Organizations: The Soft and Gushy Side" by Kerry J. Patterson, published in the Fall 1991 issue of The Bent, which contains the nugget I want to highlight:
To determine performance rankings, we would place in front of a senior manager the names of the 10-50 people within his or her organization. Each name would be typed neatly in the middle of a three-by-five card. After asking the manager to rank the employees from top to bottom, the managers would then go through a card sort. Typically the executive would sort the names into three or four piles and then re-sort each pile again. Whatever the strategy, the exercise usually took only minutes. Just like that, the individual in charge of the professionals in question was able to rank, from top to bottom, as many as 50 people. It rarely took more than three minutes and a couple of head scratches and grunts. Three minutes. Although politics may appear ambiguous to those on the receiving end, those at the top were able to judge performance with crystal clarity.
This is what actual assessment looks like. It happens all the time, formally and informally. Our supervisors, colleagues, neighbors, and friends constantly assess us just as we do them--it's the nature of living in a tribe. Are these impressions valid? If there's enough feedback through inter-rater reliability, then this fact alone will create validity. For example, if a co-worker has few social skills, such that he becomes the butt of jokes at the office, this very fact probably makes it less likely he'll be promoted or enjoy the reciprocation of favors that makes teamwork effective. The implicit agreement about such characteristics is powerful.
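As a toy illustration of that inter-rater point (the names and rankings below are invented), two supervisors' card-sort rankings can be compared with a rank correlation; high agreement across raters is the kind of feedback that lends these quick judgments whatever validity they have.

```python
# Two hypothetical supervisors rank the same five employees (1 = best).
# Spearman's rank correlation measures how closely their card sorts agree.
rater_a = {"Ann": 1, "Raj": 2, "Lee": 3, "Kim": 4, "Bo": 5}
rater_b = {"Ann": 2, "Raj": 1, "Lee": 3, "Kim": 5, "Bo": 4}

def spearman(r1, r2):
    names = list(r1)
    n = len(names)
    d2 = sum((r1[name] - r2[name]) ** 2 for name in names)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(f"Spearman rho = {spearman(rater_a, rater_b):.2f}")  # 0.80 for these rankings
```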
Once Mischel began analyzing the results, he noticed that low delayers, the children who [couldn't wait], seemed more likely to have behavioral problems, both in school and at home. They got lower S.A.T. scores. They struggled in stressful situations, often had trouble paying attention, and found it difficult to maintain friendships. The child who could wait fifteen minutes had an S.A.T. score that was, on average, two hundred and ten points higher than that of the kid who could wait only thirty seconds.
I had just finished reading this when my 11-year-old daughter came in and begged to go to the bookstore--she'd gotten a gift certificate for her birthday.
"But they're almost closed," I said, "you'll only have 15 minutes by the time we get there."She didn't even think about it. We went right away. As it turns out, the store closed later than I thought, and we had an hour to shop, but it was jolting to have this conversation right after reading the New Yorker piece. Why isn't this kind of thing--including assessment--part of the curriculum? Take a look at the AAC&U piece about trends in general education and see if you can spot non-cognitives anywhere. There are experiences that would probably lead to non-cognitive development (internships, for example), but they aren't addressed directly. I think it's time we seriously considered non-academic factors that are important to success and begin to talk about them.
"I want to go anyway," she said.
"How about this," I proposed, "We go tomorrow instead, and you can have two hours if you want."
20 years from now: Consumer Reports will be assessing the quality of BA degrees, right along side washing machines and flying-mobiles. Parents will ask, "why should we pay 3 times the cost when Consumer Reports says that there is only a 2% increase in quality?!"
Here, a second assumption compounds the first, viz., that college ratings easily translate into worth in dollars. I scratch my head over this sort of thing, which also came out of the Spellings Commission. If what we really care about is dollars, then why not just focus on salary histories of graduates? The US government already publishes volumes of reports on such things as the average salary of an engineering graduate. Why not add one more dimension, so that the school issuing the diploma can be identified?
Wealthy institutions, such as the small elite liberal arts colleges which charge over $50,000 in comprehensive fees, and private elite universities know that keeping prices high is the surest way to attract the wealthiest customers who will also become future donors. This is always the motivating factor at my institution, where the president is always public about staying among the elite by charging high tuition and by regularly raising tuition above 6%. "we have to remain at the mean of our peers" is the justification.
I remember exactly this kind of conversation with institutional researchers at a round table discussion a couple of years ago. One elite college was raising rates dramatically year after year to "catch up" to the competition. Is it worth it? Is the Harvard experience worth more because of the contacts you will make? You bet it is. How is that going to be measured with a standardized rating system?
Sidebar to whiney profs. Should you ever need brain surgery (or a car, or HDTV, etc.), let us know, we’ll hook you up with someone who shares your beliefs; i.e., outcomes and impact can’t be measured (except by you when you decide how well your students are doing) and inputs are sufficient to assess quality.
I couldn't find any assessment advice at InterEd.com, but the company advertises itself as a higher ed consulting business (apparently specializing in adult ed). Taking the comment at face value, it asks us whiney profs to understand that clear outcomes are determinable and desirable when employing a neurologist or when buying durable goods. That's certainly true enough. The implication is that this should also be true of educational products. "Steve's" addendum to Mr. Tucker's comment extends this dubious logic. If you look closely, you can see the flailing arms.
[E]ven those who insist that what Higher Ed does cannot possibly be assessed still seem to want to have doctors who are licensed, lawyers who've passed the bar, and plumbers who've demonstrated their competence.
This is a common confusion, I think, and the reason for highlighting these embarrassing passages. Passing a bar exam and demonstrating competence are two different things. Would you rather fly with a pilot who's passed all the fill-in-the-bubble tests, or one who's landed at your airport successfully many times? I'm quite sure all of the Enron accountants were fully certified, and that malpractice is committed by doctors who passed their boards. I'm sure there are incompetent lawyers who've passed the bar. A plumber is competent if he or she makes enough money at it to stay in the plumbing business. A test result may have some correlation with demonstrated competence, but they are not the same thing.
20 years from now: Consumer Reports will be assessing the quality of BA degrees, right along side washing machines and flying-mobiles. Parents will ask, "why should we pay 3 times the cost when Consumer Reports says that there is only a 2% increase in quality?!"
This is one of those statements that it makes no sense even to dissect; if you believe it to be true, then no amount of argument is going to change your mind. This sort of epistemological black hole seems common in American society. If you haven't read it, take a look at Susan Jacoby's The Age of American Unreason--an excellent book on the declining rationality of the body politic.
Assumption One: The best way to improve student performance and close achievement gaps is to establish rigorous content standards and a core curriculum for all schools—preferably on a national basis.
Proponents of hard assessment ought to be informed by other kinds of "measurement," like the ratings of companies and their bond issues by large, well-paid firms like Standard and Poor's. Their project mainly involves crunching numbers to see what the prognosis is--much more quantitative than assessing learning. You'd think the success rate would be pretty good. I think the recent financial mess indicates clearly that such ratings are not very good.
Assumption Two: Standardized-test scores are an accurate measure of student learning and should be used to determine promotion and graduation.
Duval [County public school] students passed 80 percent of their AP courses last year with a "C" or better. But only 23 percent of the national AP exams, taken near the end of those courses, were passed.
The national exam pass rate for public schools was 56 percent.
In other words, students successfully complete a course that is essentially a preparation for a standardized test, and then fail said test. The College Board, which reviewed the results, blames the effect on under-prepared students taking the courses combined with inexperienced teachers. When I read this, I thought it was perhaps a good opportunity to look for evidence of a phenomenon I've said should exist: that standardized tests are more valid for low-complexity subjects of study. Here, complexity is meant in the computational sense (search for the word in my blog for lots more on the subject). If we assume all things are equal (dubious, but I have no choice), then preparation courses for lower-complexity subjects ought to be more effective than for higher-complexity subjects. This would manifest itself in the correlation between passing courses and passing the corresponding AP exams. This is all possible because the statistics for the school district in question are posted online.
In pure complexity terms, math is low complexity and languages are higher complexity. This is easy to see--math is a foreign language with little new vocabulary and a few rules. Spoken languages have massive vocabularies and many, often arbitrary-seeming, rules of grammar. So if my theory is any good, it ought to be the case that math courses can prepare a student better than language courses for a standardized test, all else being equal. Also, the overlap in students between the two courses is probably pretty good, since college-bound seniors will be taking both a foreign language and math. Of course, even learning a foreign language is mostly committing deductive processes to memory, and hence not of the highest complexity (inductive processes would be). So this is a contest between low complexity and, shall we say, medium complexity.
Here are the results:
I debated whether or not to include both calculus courses (clearly, more advanced students are in the BC section). I also assumed that the three languages are equally complex, although in practice Spanish dominates. If a student passed a calculus course, he or she had a 48% chance of passing the AP subject test. For languages (the more complex subject), only 39% of those who passed the course went on to succeed on the exam.
Does this prove anything? Not really--there are too many uncontrolled variables. But it's still fun to push this as far as it can go. If the complexity-to-difficulty relationship holds, we would expect that the subject with the worst test/course pass ratio would be the most complex subject. Of course, sample size plays a role, so let's agree (before I look at the numbers) that there had to be at least N=50 to qualify. For all tests combined, the average test/course ratio was 29%. Anything lower than that would indicate lower-than-average preparation for the test (and higher complexity, maybe). The least effective (or most complex) course was a three-way tie, with a 16% conditional probability of passing the AP test, given that the course had been passed. The three subjects were World History, Human Geography, and Micro Economics. These each had enrollments in the hundreds to thousands.
Are these subjects the hardest to test because of complexity? It's easy to guess that the first two might be, cluttered with endless facts and fuzzy theories. Micro Economics is much more like chemistry or physics, one would think. Chemistry scored a low 22%, but Physics B was 58%.
This was fun, but it's hard to really make the case that complexity is the driving force here. It does give me some ideas, however, about comparing difficulty (course pass rate) versus complexity (course to test pass ratio). Meanwhile, I still have the placement test project to try out...
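For anyone who wants to reproduce the two numbers I'm comparing from posted statistics like Duval's, here is a rough sketch under my own assumptions (the subject counts below are invented, chosen only to land near the percentages quoted above): difficulty as the course pass rate, and the complexity proxy as the test/course ratio, with a minimum-sample cutoff like the N=50 rule.

```python
# Hypothetical enrollment and pass counts per AP subject (not the Duval data).
# ratio = P(passed AP exam | passed the course), the "test/course" ratio.
subjects = {
    # subject: (enrolled, passed_course, passed_exam)
    "Calculus AB":   (400, 300, 144),
    "Spanish Lang":  (350, 280, 109),
    "World History": (900, 600,  96),
}

MIN_N = 50  # ignore tiny samples, as in the N=50 rule above

for name, (enrolled, passed_course, passed_exam) in subjects.items():
    if passed_course < MIN_N:
        continue
    difficulty = passed_course / enrolled  # course pass rate
    ratio = passed_exam / passed_course    # chance of passing the exam, given a course pass
    print(f"{name}: course pass rate {difficulty:.0%}, test/course ratio {ratio:.0%}")
```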
[Update: here's an interesting article about the effectiveness of the calculus AP test]
[T]raditional scoring, which treats students' responses as absolute (effectively a 0 and 1 based probability distribution), begs the question: Is a student's knowledge black and white? How can a student express belief in the likelihood that an alternative may be correct? Further, how can a student's ability to carry out a process be traced and evaluated? Addressing these questions requires going beyond traditional multiple-choice testing techniques.
A couple of days ago I wrote about Ed Nuhfer's knowledge surveys, which use a survey to approximate students' confidence in the subject material. Jody's idea extends this to a testing environment. Obviously there are differences between surveys and tests. One might expect students to be honest about their confidence in a survey, or perhaps underestimate it slightly, because they may see it as affecting the review and the test itself. On a test, a student has nothing obvious to gain by admitting uncertainty. That changes if "near-misses" are partially rewarded. This is like partial credit on a pencil-and-paper test. But how can one indicate such subtleties on a multiple-choice test?
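One way to let a multiple-choice answer carry that kind of partial information, sketched here under my own assumptions rather than anything from Jody's piece, is to have students spread probability across the options and score the response with a rescaled Brier-style rule, so honest hedging earns more than a confident wrong answer.

```python
# Score a probabilistic multiple-choice response with a rescaled Brier rule:
# 1.0 for full confidence in the correct option, 0.0 for full confidence in a
# wrong one, and something in between for hedged answers.
def brier_score(probs, correct):
    """probs: dict mapping option -> probability (summing to 1)."""
    return 1 - 0.5 * sum((p - (1.0 if o == correct else 0.0)) ** 2
                         for o, p in probs.items())

confident_right = {"A": 1.0, "B": 0.0, "C": 0.0}
hedged          = {"A": 0.6, "B": 0.3, "C": 0.1}
confident_wrong = {"A": 0.0, "B": 1.0, "C": 0.0}

for label, response in [("confident and right", confident_right),
                        ("hedged", hedged),
                        ("confident and wrong", confident_wrong)]:
    print(f"{label}: {brier_score(response, 'A'):.2f}")
```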
Dr. Ed Nuhfer (California State University - Channel Islands) has worked extensively in the development of Knowledge Surveys that cause the faculty member to develop clear, chronological expected student learning outcomes and then conduct pre- and post-tests on student confidence to demonstrate these skills.
The idea is to find out if students think they know how to successfully answer items on a test. It didn't take long to track down research papers on this idea: "The Knowledge Survey: A Tool for All Reasons" by Ed Nuhfer and "Knowledge Surveys: What Do Students Bring to and Take from a Class?" by Delores Knipp.
Handbook of Research on Assessment Technologies, Methods, and Applications in Higher Education
ISBN: 978-1-60566-667-9; 500 pp; May 2009
Published under the imprint Information Science Reference (formerly Idea Group Reference)
http://www.igi-global.com/reference/details.asp?id=34254
Edited by: Christopher S. Schreiner, University of Guam, Guam
DESCRIPTION
Educational institutions across the globe have begun to place value on the technology of assessment instruments as they reflect what is valued in learning and deemed worthy of measurement.
The Handbook of Research on Assessment Technologies, Methods, and Applications in Higher Education combines in-depth, multi-disciplinary research in learning assessment to provide a fresh look at its impact on academic life. A significant reference source for practitioners, academicians, and researchers in related fields, this Handbook of Research contains not only technological assessments, but also technologies and assumptions about assessment and learning involving race, cultural diversity, and creativity.
****************************************
"There is a stunning range of inquiry in this well-edited book, which certainly exceeds the bounds of the usual handbook in so far as it is always readable and stimulating. Every department chair and assessment coordinator needs a copy, but so do faculty members seeking to get aboard the assessment train that has already left the station. Bravo to IGI Global and the editor for gathering such exceptional essays and articles under one cover!"
- Dr. Michel Pharand, The Disraeli Project, Queen's University, Canada
****************************************
TOPICS COVERED
Assessment applications and initiatives
Assessment technologies and instruments
Collaborations for writing program assessment
Communication workshops and e-portfolios
Creativity assessment in higher education
Effective technologies to assess student learning
Faculty-focused environment for assessment
Instructional delivery formats
Method development for assessing a diversity goal
Multi-tier design assessment
Reporting race and ethnicity in international assessment
Technology of writing assessment and racial validity
For more information about Handbook of Research on Assessment Technologies, Methods, and Applications in Higher Education, you can view the title information sheet at http://www.igi-global.com/downloads/pdf/34254.pdf. To view the Table of Contents and a complete list of contributors online go to http://www.igi-global.com/reference/details.asp?ID=34254&v=tableOfContents. You can also view the first chapter of the publication at http://www.igi-global.com/downloads/excerpts/34254.pdf.
ABOUT THE EDITOR
Christopher S. Schreiner is Professor of English and Chair of the Division of English and Applied Linguistics at the University of Guam. Before teaching on Guam, he was Professor of Literature at Fukuoka Women’s University in Japan, and Professor of Integrated Arts and Sciences at Hiroshima University. He has coordinated assessment for the Division of English and Applied Linguistics in preparation for the WASC visit, and authored the summary assessment report for the grant-funded Project HATSA in the College of Liberal Arts and Social Sciences at the University of Guam. One of his recent articles, “Scanners and Readers: Digital Literacy and the Experience of Reading” appeared in the IGI Global book, Technology and Diversity in Higher Education (2007).