<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-20035359</id><updated>2012-01-28T20:52:01.320-05:00</updated><category term='gre'/><category term='learning outcomes'/><category term='phones'/><category term='assessment'/><category term='news'/><category term='books'/><category term='collaboration'/><category term='accountability'/><category term='positivism'/><category term='measurement'/><category term='CAAP'/><category term='strategy'/><category term='analytics'/><category term='motivation'/><category term='meta-cognition'/><category term='bell curve'/><category term='personality'/><category term='tuition'/><category term='spam'/><category term='rss'/><category term='geotagging enrollment map yahoo.pipes'/><category term='reliability'/><category term='online resources'/><category term='email'/><category term='academic freedom'/><category term='mashup'/><category term='probability'/><category term='talent'/><category term='self-appraisal'/><category term='halloween'/><category term='dimensions'/><category term='higher education'/><category term='reading'/><category term='course statistics'/><category term='visualization'/><category term='formative'/><category term='vocation'/><category term='UMIFA'/><category term='soccer'/><category term='ACT'/><category term='dawkins'/><category term='LSAT'/><category term='student loans'/><category term='sci-fi'/><category term='dilbert'/><category term='brain'/><category term='backchannel'/><category term='memory'/><category term='philosophy'/><category term='literacy'/><category term='ideas'/><category term='faculty development'/><category term='porfolio'/><category 
term='employment'/><category term='prezi'/><category term='incentives'/><category term='college board'/><category term='rubrics'/><category term='creative'/><category term='epistemology'/><category term='standardized testing cla validity'/><category term='motorcycles'/><category term='lecture'/><category term='knowledge surveys'/><category term='college rankings'/><category term='institutional research'/><category term='bandwidth'/><category term='kolmogorov'/><category term='CAT'/><category term='SACS'/><category term='marketing'/><category term='mba'/><category term='design'/><category term='etherpad'/><category term='for-profit'/><category term='statistics'/><category term='meetings'/><category term='attrition'/><category term='Gömböc'/><category term='correlation'/><category term='error'/><category term='bureaucracy'/><category term='conferences'/><category term='certiport'/><category term='IUPUI'/><category term='QEP'/><category term='education'/><category term='tragedy of the commons'/><category term='summative'/><category term='technology'/><category term='course evaluation'/><category term='FUD'/><category term='majors'/><category term='perl'/><category term='CLA'/><category term='search engine'/><category term='Donald Michie'/><category term='standardized tests'/><category term='michigan state university'/><category term='seneca'/><category term='ROTC'/><category term='accreditation'/><category term='risk'/><category term='application'/><category term='leadership'/><category term='artificial life'/><category term='assessment higher education conference NCSU'/><category term='Dostoevsky'/><category term='strategic planning'/><category term='creativity'/><category term='zimmer'/><category term='dialogue'/><category term='brainstorming'/><category term='PELL'/><category term='word cloud'/><category term='whisky'/><category term='aalhe'/><category term='planning'/><category term='bread'/><category term='retention'/><category term='ratings'/><category 
term='mindmap'/><category term='e-learning'/><category term='standardization'/><category term='learning'/><category term='inductive'/><category term='artificial intelligence'/><category term='Mankell'/><category term='Principles of Accreditation'/><category term='repository'/><category term='teaching'/><category term='degrees'/><category term='outcomes assessment'/><category term='stakeholder'/><category term='knowledge'/><category term='online education'/><category term='math'/><category term='ebooks'/><category term='vodcasting'/><category term='rubric'/><category term='effectiveness'/><category term='anti-intellectualism'/><category term='assessment higher education'/><category term='meeting'/><category term='general eduation'/><category term='income'/><category term='bubble'/><category term='degree'/><category term='ets'/><category term='publishing'/><category term='AACU'/><category term='arxiv.org'/><category term='business school'/><category term='class logistics'/><category term='discount rate'/><category term='straighterline'/><category term='self-publishing'/><category term='averages'/><category term='academic fiction'/><category term='loans'/><category term='words'/><category term='identity'/><category term='twitter'/><category term='credentials'/><category term='standards'/><category term='machiavelli'/><category term='NSSE'/><category term='yahoo.pipes'/><category term='highered'/><category term='data compression'/><category term='noncognitive'/><category term='monologue'/><category term='writing'/><category term='assessment institute'/><category term='2020'/><category term='genes'/><category term='deductive'/><category term='university'/><category term='pottery'/><category term='blue hat syndrome'/><category term='eportfolio'/><category term='LEAP'/><category term='data mining'/><category term='SQL'/><category term='funny'/><category term='colleges'/><category term='web'/><category term='encoding'/><category term='recruiting'/><category 
term='unit'/><category term='liberal arts'/><category term='dimension'/><category term='comic'/><category term='google trends'/><category term='IQ'/><category term='college ratings'/><category term='projects'/><category term='game theory'/><category term='open source'/><category term='co-curricular'/><category term='Clemson'/><category term='US News'/><category term='survival'/><category term='presentation'/><category term='trends'/><category term='personal learning environment'/><category term='information literacy'/><category term='psychology'/><category term='cost'/><category term='bloom&apos;s taxonomy'/><category term='intelligence'/><category term='grading'/><category term='web 2.0'/><category term='k-12'/><category term='ANOVA'/><category term='endowments'/><category term='percents'/><category term='student evaluations'/><category term='value-added'/><category term='craigslist'/><category term='squid overlords'/><category term='dunning-kruger'/><category term='abandoned'/><category term='multiple-choice'/><category term='committees'/><category term='value added'/><category term='blogs'/><category term='Cantor'/><category term='reporting'/><category term='notes'/><category term='humor'/><category term='simulation'/><category term='Bologna Club'/><category term='market research'/><category term='reports'/><category term='geotagging'/><category term='logic'/><category term='paradox'/><category term='WoW'/><category term='open courses'/><category term='stoics'/><category term='economy'/><category term='college'/><category term='edupunk'/><category term='grades'/><category term='financial aid'/><category term='WGU'/><category term='Tom Zane'/><category term='difficulty'/><category term='learning outcomes example'/><category term='decisions'/><category term='salary'/><category term='vodcast'/><category term='prerequisites'/><category term='classroom'/><category term='Rodin'/><category term='software'/><category term='textbooks'/><category 
term='reference'/><category term='mapp'/><category term='higher ed'/><category term='dropbox'/><category term='quality'/><category term='library portal'/><category term='expertise'/><category term='grit'/><category term='validity'/><category term='testing'/><category term='crowdsourcing'/><category term='architecture'/><category term='disposition'/><category term='self-assessment'/><category term='recursion'/><category term='Turing'/><category term='simplicity'/><category term='pricing'/><category term='rules'/><category term='proxy'/><category term='ideology'/><category term='irony'/><category term='academically adrift'/><category term='workflow'/><category term='ignorance'/><category term='grade inflation'/><category term='IT'/><category term='NCLB'/><category term='critical thinking'/><category term='yammer'/><category term='map'/><category term='directory'/><category term='CIRP'/><category term='factor analysis'/><category term='Cologne'/><category term='complexity'/><category term='banking'/><category term='climate'/><category term='evolution'/><category term='department of education'/><category term='Moravec&apos;s paradox'/><category term='problem solving'/><category term='Bologna process'/><category term='SPSS'/><category term='enrollment'/><category term='portfolio'/><category term='waypoint'/><category term='induction'/><category term='cms'/><category term='cheating'/><category term='analysis'/><category term='Hamming'/><category term='polling'/><category term='peer review'/><category term='normal distribution'/><category term='internet'/><category term='German'/><category term='chat'/><category term='forms'/><category term='happiness'/><category term='football'/><category term='recruitment'/><category term='AP test'/><category term='prediction'/><category term='social groups'/><category term='3.3.1'/><category term='CLA NSSE value-added SAT'/><category term='science'/><category term='thinking'/><category term='grants'/><category term='financial aid 
popcorn institutional research'/><category term='powerpoint'/><category term='meme'/><category term='admininstration'/><category term='SAT'/><category term='recession'/><category term='economies of scale'/><category term='GPA'/><category term='linguistics'/><category term='institutional_research'/><category term='budget'/><category term='translation'/><category term='tenure'/><category term='HEOA'/><category term='transfers'/><category term='Zog&apos;s Lemma'/><category term='farming'/><category term='universities'/><category term='games'/><category term='communication'/><category term='syracuse university'/><category term='institutional effectiveness'/><category term='admissions'/><category term='TLT'/><category term='chart'/><category term='beans'/><category term='economics'/><category term='first-generation'/><category term='web2.0'/><category term='matrix'/><category term='food'/><category term='analytical'/><category term='Westwood College'/><category term='vsa'/><category term='surveys'/><category term='religion'/><category term='FACS'/><category term='IE'/><category term='public policy'/><category term='publication'/><category term='collatz'/><category term='standardized test'/><category term='high point university'/><category term='harvesting feedback'/><category term='fiction'/><category term='open education'/><category term='UPMIFA'/><title type='text'>Higher Ed/</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://highered.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' 
href='http://www.blogger.com/feeds/20035359/posts/default?start-index=101&amp;max-results=100'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>393</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-20035359.post-4593372040283465438</id><published>2012-01-28T20:51:00.001-05:00</published><updated>2012-01-28T20:52:01.345-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='FACS'/><category scheme='http://www.blogger.com/atom/ns#' term='QEP'/><category scheme='http://www.blogger.com/atom/ns#' term='SACS'/><title type='text'>Assessing a QEP</title><content type='html'>On Wednesday, &lt;a href="http://www.guilford.edu/"&gt;Guilford College&lt;/a&gt; hosted a &lt;a href="http://www.ncicu.org/"&gt;NCICU&lt;/a&gt; meeting about &lt;a href="http://sacscoc.org/"&gt;SACSCOC&lt;/a&gt; accreditation. I had volunteered to do a very short introduction to my experience with the Quality Enhancement Plan (QEP) at &lt;a href="http://www.coker.edu/"&gt;Coker College&lt;/a&gt;, since I had seen the thing from inception to impact report. 
I got permission from Coker to release the report publicly, so here it is:&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://zzascape.com/CokerCollegeImpactReport.pdf"&gt;Download Coker College's QEP Impact Report&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The whole fifth year report passed with no recommendations, and the letter said nice things about the QEP, so it's reasonable to assume that it's an acceptable exemplar to use in guiding your own report.&lt;br /&gt;&lt;br /&gt;Assessment of the QEP program is an important part of the impact report, and this is a good place to record how that worked. The QEP at Coker was about improving writing effectiveness in students, and we tried several ways of assessing success. Only one of these really worked, so I will describe them in enough detail so you don't repeat my mistakes. Unless you just feel compelled to.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Portfolio Review.&lt;/b&gt;&lt;br /&gt;I hand-built a web-based document repository (see "&lt;a href="http://highered.blogspot.com/2009/06/dropbox-idea.html"&gt;The Dropbox Idea&lt;/a&gt;" for details) to capture student writing. After enough samples were accumulated, I spent a whole day randomly sampling students in four categories: first year/fourth year vs day/evening. There were 30 of each, for 120 students. Then I sampled three writing samples from each to create a student portfolio. There was some back and forth because some students didn't have three samples at that point. I used a box cutter to redact student names, just like I imagine the CIA does. Each portfolio got an ID number that would allow me to look up who it was.&amp;nbsp;The Composition coordinator created a rubric for rating the samples, and one Saturday we brought in faculty,&amp;nbsp;administrators, adjuncts, and a high school English teacher to rate the portfolios. We spent a good part of the day applying the rubric to the papers, and many of the papers were rated three times. 
All were rated at least twice by different raters.&lt;br /&gt;&lt;br /&gt;The results were disappointing. There were some faint indications of trends, but mostly it was noise, and not useful for steering a writing program. In retrospect, there were two conceptual problems. First, the papers we were looking at were not standardized. It's hard to compare a business plan to a short story. Second, the rubrics were not used in the assignments, but conjured later when we wanted to assess. For rubrics to be effective, they must be integrated as fully as possible into the construction of the assignment.&lt;br /&gt;&lt;br /&gt;So this was a lot of work for a dud of a report, most of which is probably my fault.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Pre-post Test&lt;/b&gt;&lt;br /&gt;One of the administrators decided we should reuse a writing placement test, for which we already had data, as a measure of writing gain by giving it again as a post-test after students took the ENG 101 class.&amp;nbsp;The assignment was to find and correct errors in sample sentences.&amp;nbsp;The English instructors told us it wouldn't work, and it didn't. More noise.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Discipline-Specific Rubrics&lt;/b&gt;&lt;br /&gt;We did, in fact, learn something from the rubric fiasco. We allowed programs to create their own rubrics, which could be applied to assignments in the repository. So an instructor could look at a work, pull up the custom rubric, and rate it right there and then. Since the prof knew the assignment, this seemed like a way to get more meaningful results. I think this would have worked, but by the time we got all the footwork done, the QEP program was a couple of years under way. I left the college before it was possible to do a large-scale analysis of the results that were in the database. 
In summary: good idea, executed too late.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Direct Observation by Faculty&lt;/b&gt;&lt;br /&gt;Back in 2001, when I got the job of being SACSCOC liaison, I got a copy of the&lt;i&gt;&amp;nbsp;&lt;/i&gt;brand new &lt;i&gt;Principles&lt;/i&gt;&amp;nbsp;and started reading. The more I read, the more I was terrified. And nothing frightened me more than CS 3.5.1, the standard on general education. I didn't know at the time that the standard said one thing, but everyone interpreted it in a completely different way (it was written as a minimum standard requirement, but everyone looked for continuous improvement). So I was one of those people you see at the annual meeting who look like they are on potent narcotics, drifting around with a dazed look at the&amp;nbsp;enormousness&amp;nbsp;of the challenge. (Note: I think they should hand out mood rings at the annual meeting so you can see how stressed someone is before you talk to them.)&lt;br /&gt;&lt;br /&gt;In an act of desperation, I led an effort to create what we now call the Faculty Assessment of Core Skills (FACS), which is nothing more than subjective faculty ratings of liberal arts skills demonstrated by students in their classes. The skills included writing effectiveness. At the end of the semester, each instructor was supposed to give a subjective rating to each student taught for observed skills on the list. You can read all about this in the &lt;i&gt;&lt;a href="http://zzascape.com/elephant.pdf"&gt;Assessing the Elephant&lt;/a&gt;&lt;/i&gt; manuscript, or &lt;a href="http://highered.blogspot.com/search?q=FACS"&gt;on this blog&lt;/a&gt;, or in one of the three &lt;a href="http://www.zzascape.com/Resume.rtf"&gt;books&lt;/a&gt; I wrote chapters for on the subject.&lt;br /&gt;&lt;br /&gt;Because we had started the FACS before the QEP, we had baseline data, plus data for every semester during the project's life. Thousands and thousands of data points about student writing abilities. 
When we started the FACS I didn't have much hope for it--it was a "Hail Mary" pass at CS 3.5.1. But as it turns out, it was exactly what we needed. We were able to show that FACS scores improved faster for students who had used the writing lab than for students who hadn't. Moreover, this effect was sensitive to the overall ability of the student, as judged by high school grades. &amp;nbsp;See "&lt;a href="http://highered.blogspot.com/2010/10/assessing-writing.html"&gt;Assessing Writing&lt;/a&gt;" for the details.&lt;br /&gt;&lt;br /&gt;I have given many talks about the FACS over the years, and get interesting reactions. One pair of psychologists seemed amazed that anything so&amp;nbsp;blatantly&amp;nbsp;subjective could be useful for anything at all, but they were very nice about it. When I post FACS results on the ASSESS-L listserv, you can hear the crickets chirping afterwards. I guess it doesn't seem dignified because it doesn't have a reductionist pedigree.&lt;br /&gt;&lt;br /&gt;So I was shocked at the NCICU meeting, when SACSCOC Vice President &lt;a href="http://www.sacscoc.org/SSheeley.asp"&gt;Steve Sheeley&lt;/a&gt; said things like (my notes, probably not his exact words) "Professors' opinions as professionals are more important than standardized tests," and "Professors know what students are good at and what they are not good at."&lt;br /&gt;&lt;br /&gt;The reason for my reaction is that when one hears official statements about assessment, it's almost always emphasized that it has to be suitably scientific. "Proven valid and reliable" is a standard formula, and certainly "measurable"&amp;nbsp;figures (see "&lt;a href="http://highered.blogspot.com/2009/04/part-seven-measurement-smeasurement.html"&gt;Measurement Smeasurement&lt;/a&gt;" for my opinion on that). However it is stated, there isn't much room for something as touchy-feely as subjective opinions of course instructors. 
I do give good arguments for both validity and reliability in &lt;i&gt;Assessing the Elephant&lt;/i&gt;, but FACS is never going to look like a psychometrician's version of assessment. So it was a shock and a very pleasant surprise to hear a note of common sense in the assessment symphony. I think when Steve made that remark, he assumed that this special knowledge professors&amp;nbsp;acquire&amp;nbsp;after working with students was simply&amp;nbsp;inaccessible&amp;nbsp;as assessment data. But it's not, and by now Coker has many thousands of data points over more than a decade to prove it. And it turned out to be the key to showing the QEP actually worked.&lt;br /&gt;&lt;br /&gt;I have implemented the FACS at JCSU, and created a cool dashboard for it. I showed this off at the meeting, and you can download a sample of it &lt;a href="http://zzascape.com/facs.html"&gt;here&lt;/a&gt; if you want. The real one is interactive so you can disaggregate&amp;nbsp;the data down to the level you want to look at, even generating &lt;a href="http://highered.blogspot.com/2011/01/individual-facs-reports.html"&gt;individual student reports&lt;/a&gt; for advisors. Setting up and running the FACS is trivial. It costs no money, takes no time, and you get rich data back that can be used for all kinds of things. 
Everyone should do this as a first, most basic, method of assessment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4593372040283465438?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4593372040283465438/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2012/01/assessing-qep.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4593372040283465438'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4593372040283465438'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2012/01/assessing-qep.html' title='Assessing a QEP'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6907136053006081607</id><published>2012-01-25T06:22:00.000-05:00</published><updated>2012-01-25T06:25:03.205-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='epistemology'/><category scheme='http://www.blogger.com/atom/ns#' term='meta-cognition'/><title type='text'>Closed and Open Thinking</title><content type='html'>Most readers will know &lt;a href="http://en.wikipedia.org/wiki/Occam%27s_razor"&gt;William of Occam's principle&lt;/a&gt; about not multiplying entities unnecessarily. It's commonly thought of as "the simplest explanation is the best explanation." 
I learned about a countervailing principle in Arora and Barak's &lt;a href="http://www.amazon.com/Computational-Complexity-Approach-Sanjeev-Arora/dp/0521424267/ref=sr_1_3?s=books&amp;amp;ie=UTF8&amp;amp;qid=1327339465&amp;amp;sr=1-3"&gt;Computational Complexity: A Modern Approach&lt;/a&gt;. It's even older than the venerable Mr. Occam, dating back to the&amp;nbsp;Epicureans, and it states that we should not abandon any explanation that is&amp;nbsp;consistent&amp;nbsp;with the facts. I have mentioned this before, but I had an interesting thought at lunch today: what if this tension between efficiency and open-mindedness is at the heart of the Dunning-Kruger&amp;nbsp;effect? In case you've missed that bit of news, here's the introduction from the &lt;a href="http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect"&gt;Wikipedia entry&lt;/a&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The &lt;b&gt;Dunning–Kruger effect&lt;/b&gt; is a &lt;a href="http://en.wikipedia.org/wiki/Cognitive_bias" title="Cognitive bias"&gt;cognitive bias&lt;/a&gt; in which unskilled people make poor decisions and reach erroneous conclusions, but their incompetence denies them the &lt;a class="mw-redirect" href="http://en.wikipedia.org/wiki/Metacognitive" title="Metacognitive"&gt;metacognitive&lt;/a&gt; ability to recognize their mistakes.&lt;sup class="reference" id="cite_ref-morris_0-0"&gt;&lt;a href="http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect#cite_note-morris-0"&gt;[1]&lt;/a&gt;&lt;/sup&gt; The unskilled therefore suffer from &lt;a href="http://en.wikipedia.org/wiki/Illusory_superiority" title="Illusory superiority"&gt;illusory superiority&lt;/a&gt;, rating their ability as above average, much higher than it actually is, while the highly skilled underrate their own abilities, suffering from illusory inferiority.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;Actual competence may weaken self-confidence, as competent individuals may falsely assume that 
others have an equivalent understanding. As Kruger and Dunning conclude, "the miscalibration of the incompetent stems from an error about the self, whereas the miscalibration of the highly competent stems from an error about others" (p. 1127).&lt;sup class="reference" id="cite_ref-Kruger_1-0"&gt;&lt;a href="http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect#cite_note-Kruger-1"&gt;[2]&lt;/a&gt;&lt;/sup&gt; The effect is about paradoxical defects in cognitive ability, both in oneself and as one compares oneself to others.&lt;/blockquote&gt;This just puts some research behind what Bertrand Russell is quoted as having said:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span class="st"&gt;The trouble with the world is that the stupid are cocksure and the intelligent are full of doubt.&lt;/span&gt;&lt;/blockquote&gt;So what we have is two epistemologies, and we shouldn't be hasty to choose one as better than the other, despite the obvious bias of the quotes above.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Method 1 (Closed). &lt;/b&gt;Obtain a small amount of evidence, and create the most restrictive explanation that fits the facts. Subsequent facts that come to the surface do not affect the conclusion.&lt;br /&gt;&lt;br /&gt;William of Occam would probably sue me for defamation if he were around to read this. I have intentionally restated his principle in a very narrow sense in order to contrast it with:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Method 2 (Open).&lt;/b&gt; Continually gather information and create increasingly complex explanations that account for all the observations. Although the current explanation may be the simplest one that fits the facts, no explanation is ever final--all the others that are&amp;nbsp;consistent&amp;nbsp;with facts are kept in reserve.&lt;br /&gt;&lt;br /&gt;I have given the methods intuitive names for convenience (closed vs open), not as prejudgments. 
The closed method will be the better one in situations where observations can be explained simply. This may be because the underlying cause and effect relationship is of low complexity, or perhaps that the variance in observed characteristics is small. &amp;nbsp;"All dogs have four legs" would be an example of the latter. "Stuff falls when you drop it" applies to the former.&lt;br /&gt;&lt;br /&gt;The most basic structure of language is a verb applied to a noun, which is a model for the closed epistemology. "Birds fly," "Fire burns," and so on, are summaries of real world observations that can be arrived at accurately from just a few examples and without much error. It's an easy conjecture that these simple relationships became so integral to understanding that exceptions were met with challenge. Such as: "If an&amp;nbsp;ostrich&amp;nbsp;doesn't fly, then it can't be a bird." This is what school children encounter when they learn that a whale isn't a fish. The language we use rather gracelessly allows these exceptions in the form of&amp;nbsp;conjunctive&amp;nbsp;appendices, but this is clearly a hack. I will suggest below that a formal language is required to overcome that difficulty (for example, expressions of formal logic, which defines a consistent way of using "or" and "and," and allows unlimited nesting of exceptions, so that &lt;i&gt;any &lt;/i&gt;true/false relationship can be expressed unambiguously).&lt;br /&gt;&lt;br /&gt;Quickly assembling a set of closed rules for a new environment seems like a good idea. It's a fast best-guess approach to finding useful cause and effect relationships.&lt;br /&gt;&lt;br /&gt;Of course, the closed method is not suitable for doing science. Kuhn's &lt;i&gt;&lt;a href="http://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions"&gt;The Structure of Scientific Revolutions&lt;/a&gt;&lt;/i&gt;&amp;nbsp;suggests that closed outlooks solidify at any level of complexity, and require some bashing to break up. 
An example would be the certainty (due to Aristotle) that celestial bodies move in perfect circles. This is like Gould's idea of "&lt;a href="http://en.wikipedia.org/wiki/Punctuated_equilibrium"&gt;punctuated equilibrium&lt;/a&gt;" in biological evolution. I graphed the associated relationship between predictability and complexity recently in "&lt;a href="http://highered.blogspot.com/2011/12/randomness-and-prediction.html"&gt;Randomness and Prediction&lt;/a&gt;."&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The question is when to use the open versus closed approach. &amp;nbsp;&lt;/b&gt;Historically, I think the closed approach may have had a blanket "explanation" in the form of mystical associations of cause and effect, which provides a putative low-complexity relationship. "Joe got struck by lightning because he displeased the weather god" has the appearance of an explanation, except that it's not actually predictive. It takes a dedicated effort to discover that fact, however. For that we need an open method.&lt;br /&gt;&lt;br /&gt;The disadvantages of the open method make a long list. First, it's energy intensive--you have to continually be making observations, comparing what you see to what you think you &lt;i&gt;should &lt;/i&gt;see (e.g. three-legged cat), and updating the ever-growing explanation. &amp;nbsp;It also takes more energy to use or communicate the current explanation, and as soon as you do, it's out of date again.&lt;br /&gt;&lt;br /&gt;These are not fatal flaws, but ones to be considered. For some phenomena, this is probably how we naturally reason, if in a limited way. 
For example, our memory and minds do something like Bayesian reasoning (updating the probability of an event based on how frequently we encounter it), although our on-board system has been shown to be deeply flawed (see &lt;a href="http://en.wikipedia.org/wiki/Daniel_Kahneman"&gt;Daniel Kahneman&lt;/a&gt;'s &lt;a href="http://www.amazon.com/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374275637"&gt;recent book&lt;/a&gt;, for this and a lot more).&lt;br /&gt;&lt;br /&gt;Perhaps the open process needs a kind of empirical 'clean-up' to be really useful. Elegant explanations generally only work with clean data. That is, if you want to discover Newtonian mechanics, it's unlikely that you can do this with just your eyes and ears. When Galileo began measuring the "drop" times on an inclined plane, he was onto something.&lt;br /&gt;&lt;br /&gt;In addition to a solid empirical methodology, an open method also needs a way to reduce the size of an explanation while retaining its predictive power. In my graphs in "Randomness and Prediction," I plotted&amp;nbsp;predictability&amp;nbsp;versus complexity, not size. It works like this.&lt;br /&gt;&lt;br /&gt;Suppose I have an observed relationship that I have cataloged like this: (1,2), (2,4), (3,8), (4,16), where this might be thought of as a cause and effect. A one 'causes' a two, and so on. Because my empirical methods are sound, I trust that there's not too much error in the observed values. As the list grows by using the open method, I have a better and better 'explanation' of past events and a better and better predictor of future ones (fine print about the inductive hypothesis goes here...). But the list will become too&amp;nbsp;unwieldy&amp;nbsp;to remember, communicate, or use effectively, as the observations accumulate. What I need is a kind of data compression to reduce the list to a&amp;nbsp;manageable&amp;nbsp;size. If I do this correctly, the explanation doesn't change, nor does the complexity, but the size does. 
I can reduce it to effect = 2^cause if I have the idea of an exponential function. We might call this data reduction the creation of a formal theory.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusions&lt;/b&gt;&lt;br /&gt;I started by wondering whether the existence of people who don't know things, and further don't know that they don't know them, could be attributed to one of the two epistemologies mentioned at the beginning. I think the argument above shows that the two barriers of empiricism and abstract thinking needed to effectively use an open method may be too formidable for a lot of people. For one thing, it's not hard to get by using closed systems, and it may require formal education in scientific method and meta-cognition to effectively use open systems.&lt;br /&gt;&lt;br /&gt;One final note appropriate to the calendar in the US: it's a lot easier to communicate closed explanations than open ones. Even with data compression, "things fall" is less complex than Newton's laws. So in a debate made of sound bites from political candidates, the closed epistemology wins. It's easier, it's comfortable to the listener--the whole construct of English is built to 'hack' a closed way of thinking by adding a few&amp;nbsp;contingencies&amp;nbsp;("Cats have four legs, but I once saw one with three.")--and the explanations take up less time to say. You have to expand "Drill!" into "Drill, baby, drill!" 
to make it &lt;i&gt;bigger&lt;/i&gt;&amp;nbsp;because the basic message can be summed up in one word, and that may seem too short for some audiences as a serious thought.&lt;br /&gt;&lt;br /&gt;This is just another reason why we should be deliberate about teaching science and meta-cognition in school, not as alien ways of thinking that only people in white coats use at work, but as the mode of thinking that differentiates us from the other mammals, and might allow us someday to collectively make good decisions.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6907136053006081607?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6907136053006081607/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2012/01/closed-and-open-thinking.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6907136053006081607'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6907136053006081607'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2012/01/closed-and-open-thinking.html' title='Closed and Open Thinking'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-4604593354543171682</id><published>2012-01-22T07:50:00.003-05:00</published><updated>2012-01-22T07:50:35.358-05:00</updated><title type='text'>Assorted Links</title><content 
type='html'>&lt;b&gt;You can file &lt;a href="http://www.sightmap.com/"&gt;www.sightmap.com&lt;/a&gt; under "novel data representation."&lt;/b&gt; It's a heat map overlay of Google Maps that shows the most popular spots for taking photos, using the upload site &lt;a href="http://www.panoramio.com/"&gt;Panoramio&lt;/a&gt; as the source.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-zXaRltl3-io/Txv__XdUr0I/AAAAAAAAAgQ/KAk9a0RlgyM/s1600/spain.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="268" src="http://2.bp.blogspot.com/-zXaRltl3-io/Txv__XdUr0I/AAAAAAAAAgQ/KAk9a0RlgyM/s320/spain.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;This could be good fodder for a student research project. The only disappointment for me was not being able to zoom all the way down to street level resolution.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;There's a new journal for those interested in the intersection of empiricism and computer science&lt;/b&gt;, in the spirit of Wolfram's &lt;i&gt;&lt;a href="http://www.wolframscience.com/"&gt;A New Kind of Science&lt;/a&gt;&lt;/i&gt;. 
EPJ.org's new "&lt;a href="http://www.epjdatascience.com/"&gt;Data Science&lt;/a&gt;" title seeks to address these challenges:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;li&gt;how to extract meaningful data from systems with ever increasing complexity&lt;/li&gt;&lt;li&gt;&lt;i&gt;&amp;nbsp;&lt;/i&gt;how to analyse them in a way that allows new insights&lt;/li&gt;&lt;li&gt;&lt;i&gt;&amp;nbsp;&lt;/i&gt;how to generate data that is needed but not yet available&lt;/li&gt;&lt;li&gt;&lt;i&gt;&amp;nbsp;&lt;/i&gt;how to find new empirical laws, or more fundamental theories, concerning how any natural or artificial (complex) systems work&lt;/li&gt;&lt;/blockquote&gt;&lt;br /&gt;Now I have one less excuse for not organizing my &lt;a href="http://highered.blogspot.com/2010/04/surviving-entropy.html"&gt;research notes&lt;/a&gt; into actual articles. While we're at it, here's a &lt;a href="http://jeffhuang.com/best_paper_awards.html"&gt;list of "Best Paper" awards in computer science&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Game theory is a fascinating and powerful set of ideas.&lt;/b&gt;&amp;nbsp; Ever notice at the baggage carousel in the airport how everyone crowds up as close as they can, which means no one can see anything? If everyone took three steps back, the whole group would benefit. Paradoxes like these are the subject matter of this field, which draws on mathematics and economics. There's a site that maps out the field in an easily accessible format. 
It's even easy to remember: &lt;a href="http://gametheory101.com/"&gt;GameTheory101.com&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;While browsing for study tips for my daughter Epsilon, I found &lt;a href="http://calnewport.com/blog/category/patterns-of-success-for-students/"&gt;Study Hacks&lt;/a&gt;&lt;/b&gt;, with this bit of non-cognitive wisdom, originally quoted from a Reddit discussion thread:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The people who fail to graduate from MIT, fail because they come in, encounter problems that are harder than anything they’ve had to do before, and not knowing how to look for help or how to go about wrestling those problems, burn out.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;strong&gt;The students who are successful, by contrast, look at that challenge, wrestle with feelings of inadequacy and stupidity, and then begin to take steps hiking that mountain,&amp;nbsp;&lt;/strong&gt;knowing that bruised pride is a small price to pay for getting to see the view from the top. They ask for help, they acknowledge their inadequacies.&amp;nbsp;&lt;strong&gt;They don’t blame their lack of intelligence, they blame their lack of motivation.&lt;/strong&gt;&lt;/blockquote&gt;&lt;br /&gt;Check out &lt;a href="http://www.erez.com/"&gt;this guy's portfolio&lt;/a&gt; as a case study.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;From University of Portland comes a fascinating case study&lt;/b&gt; "&lt;a href="http://faculty.up.edu/lulay/failure/vasacasestudy.pdf"&gt;Why the Vasa Sunk: 10 Lessons Learned&lt;/a&gt;." From the introduction:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Around 4:00 PM on August 10th, 1628 the warship Vasa set sail in Stockholm harbor on&amp;nbsp;its maiden voyage as the newest ship in the Royal Swedish Navy. &amp;nbsp;After sailing about&amp;nbsp;1300 meters, a light gust of wind caused the Vasa to heel over on its side. 
Water poured&amp;nbsp;in through the gun portals and the ship sank with a loss of 53 lives.&amp;nbsp;&lt;/blockquote&gt;&lt;br /&gt;The rest is a case study in how not to manage a complex project. As &lt;a href="http://www.goodreads.com/author/quotes/15038.Ashleigh_Brilliant"&gt;Ashleigh Brilliant&lt;/a&gt; wrote, "It could be that the purpose of your life is only to serve as a warning to others."&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Finally, a more positive spin on leadership&lt;/b&gt; from &lt;i&gt;The Atlantic&lt;/i&gt;: "&lt;a href="http://www.theatlantic.com/health/archive/2012/01/study-of-the-day-humble-leaders-are-better-liked-and-more-effective/250687/"&gt;Humble Leaders are More Liked and More Effective&lt;/a&gt;." Take it with a grain of salt (it's a small study), but be proud of your humility.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4604593354543171682?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4604593354543171682/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2012/01/assorted-links.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4604593354543171682'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4604593354543171682'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2012/01/assorted-links.html' title='Assorted Links'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' 
src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-zXaRltl3-io/Txv__XdUr0I/AAAAAAAAAgQ/KAk9a0RlgyM/s72-c/spain.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1332222355088412767</id><published>2012-01-19T20:19:00.000-05:00</published><updated>2012-01-20T09:01:32.385-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='testing'/><title type='text'>An Index for Test Accuracy</title><content type='html'>This post is an overdue follow-up to "&lt;a href="http://highered.blogspot.com/2011/12/randomness-and-prediction.html"&gt;Randomness and Prediction&lt;/a&gt;," which takes up the question of how we should judge the quality of a test. There are many kinds of tests, but for the moment I'm only interested in ones that are supposed to predict future performance. Since education is in the preparation business, the measure of success should be "did we prepare the student?" If that question can be answered&amp;nbsp;satisfactorily with a yes or no, this feedback can be used to determine the accuracy of tests that are supposed to predict this outcome.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an example, I used the College Board's &lt;a href="http://professionals.collegeboard.com/profdownload/pdf/RN-30.pdf"&gt;SAT benchmarks (pdf)&lt;/a&gt;&amp;nbsp;, in which a test taken during high school years is used to predict first year college grades. The benchmark study is interesting because it is one of the few examples of test-makers who actually check the accuracy of their instruments and report that information publicly. You can find my first thoughts on this in "&lt;a href="http://highered.blogspot.com/2011/09/sat-error-rates.html"&gt;SAT Error Rates&lt;/a&gt;." 
The source material mainly consists of Table 1a on page 3 of the College Board report:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-u3zlu5wzYk8/TxgYUV2nx0I/AAAAAAAAAfo/jd8DVBPtI6Y/s1600/SAT.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="309" src="http://2.bp.blogspot.com/-u3zlu5wzYk8/TxgYUV2nx0I/AAAAAAAAAfo/jd8DVBPtI6Y/s320/SAT.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can use this to see the power of the SAT to predict first-year college grades at any cut-off score on the table. If we pick 1200, for example, we can see that 73% of the students we admit will have a first-year grade average of 2.7 or above. In other words, a 73% true positive rate and a 27% false positive rate. Because we are helpfully given the number of samples in each bin (the N column), we can also calculate the false negative and true negative rates for the test. Just multiply N by the percentage of students with FGPA &amp;gt; 2.7 to find the number of students in that bin who were successful in their first year (by that definition), and subtract that from N to get the number who were not. 
The graph below shows this visually.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-i6wqx_dIVqk/Txgd1R8lwzI/AAAAAAAAAfw/ySFQEwc2WkQ/s1600/sat2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="261" src="http://1.bp.blogspot.com/-i6wqx_dIVqk/Txgd1R8lwzI/AAAAAAAAAfw/ySFQEwc2WkQ/s400/sat2.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The two graphs look roughly like normal distributions with means about 150 SAT points apart. This is all quite interesting, but for my purposes here I just want to pull one number from this: the total percentage of students with FGPA &amp;gt; 2.7, which we can get by summing up all the heights on the blue line and dividing by the total of all samples. This turns out to be 59%.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The College Board's benchmark has 65% accuracy. In other words:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;If a student's SAT score exceeds the benchmark, there is a 65% chance they will have FGPA &amp;gt; 2.7&lt;/li&gt;&lt;li&gt;Of all students, 59% will have FGPA &amp;gt; 2.7&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;The difference between these numbers is not large: .65 - .59 = 6%. Using the benchmark to select "winners", we can do six percent better than just randomly sampling. If all we care about is the percentage of "good" students we get, that's the end of the story. But there's another dimension: the rate of unfair rejections, or false negatives.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we randomly sample whom we accept, then 59% of those we reject would have had FGPA &amp;gt; 2.7 (assuming this is the rate for the whole population). 
Since it's unfair to reject qualified candidates, we might call 1-.59 = 41% the &lt;i&gt;fairness&lt;/i&gt;&amp;nbsp;of the method of selection. Another name for fairness is the true negative rate. I plotted it against the accuracy (true positive rate) in the previous article. Here it is again.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-nETOdq0Shnk/ToO5UUQuwII/AAAAAAAAAbU/7XjtowJgd4E/s1600/SAT3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="263" src="http://2.bp.blogspot.com/-nETOdq0Shnk/ToO5UUQuwII/AAAAAAAAAbU/7XjtowJgd4E/s400/SAT3.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;The blue line is accuracy, and the red line is fairness. They meet at 65%. So we can see that although using the SAT benchmark is only six percent more accurate than random sampling, it is .65 - .41 = 24% more fair. How do we make sense of how good this is?&lt;br /&gt;&lt;br /&gt;One overall measure of test predictive power is the average rate of correct predictions, taking into account both true positives and true negatives. We might call that the "correctness rate" of the cut-off benchmark. Where the lines cross in the graph above, the rates for true positives and true negatives are both 65%, so the correctness rate is also 65%. 
In general, the formula for the&amp;nbsp;correctness rate c at a given cut-off benchmark is:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;c = (number of actual positives that meet the benchmark + number of actual negatives that do not meet the benchmark) / (total number of all observations)&lt;/blockquote&gt;Below is a graph that adds the correctness rate to the accuracy and fairness plots.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-ZqfCdbTbUNw/TxlVuku-2mI/AAAAAAAAAgA/1YHQ5odJX6E/s1600/sat5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="282" src="http://4.bp.blogspot.com/-ZqfCdbTbUNw/TxlVuku-2mI/AAAAAAAAAgA/1YHQ5odJX6E/s400/sat5.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The correctness rate potentially solves the problem of considering accuracy and fairness separately. It does not, however, give us an absolute measure to compare the quality of tests with. This is because the fraction of actual positives in the population can vary, making detection easier or more difficult. If we are interested in comparing different tests over different kinds of detection environments, we need something different. In the next section we will derive an index to try to address this problem.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A Comparative Index&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;In general, there is not a good way to turn results about predictability into results about complexity. However, using ideas from computational complexity, I stumbled upon a transformation that gives us another way to think about the predictive power of a test.&lt;br /&gt;&lt;br /&gt;In order to proceed, imagine an even better version of the test. In this fantasy, a proportion p of the test benchmark results come back marked with an asterisk.&amp;nbsp;Imagine that this notation means that the result is &lt;i&gt;known to be true&lt;/i&gt;. 
The unmarked ones have no guarantee--some will be correct and some not. In this way we imagine separating out the good and useful work of the test into the p group, whereas the rest is just random guessing.&lt;br /&gt;&lt;br /&gt;It's just like a multiple choice test. Some answers you know you know, and others you guess at. By working backwards we can find that "known true" fraction:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Correctness rate = (fraction known correct) + (fraction not known correct)*(rate of correct responses with random sampling)&lt;/blockquote&gt;Using the numbers from the SAT benchmark in the previous section gives us:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;.65 = p + (1- p) * .59&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;p = (.65 - .59)/(1-.59)&lt;/blockquote&gt;The fraction that would have to be "known true" is p = 14.6%. The advantage of this transformation is that we have a single number that is easy to visualize, and takes the context into account. If you wanted to explain it to someone, it would go like this:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The SAT benchmark prediction is like having a perfect understanding of 14.6% of test-takers and guessing at the rest.&lt;/blockquote&gt;The graph below shows the linear relationship between average test accuracy, the larger of the percent of positives or negatives in the population (the "guess rate"), and the index p--the equivalent proportion of "perfect understanding" outcomes.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/--qubfn43ryk/TxlX9gUbxaI/AAAAAAAAAgI/CfScibBhOo4/s1600/sat6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="273" src="http://1.bp.blogspot.com/--qubfn43ryk/TxlX9gUbxaI/AAAAAAAAAgI/CfScibBhOo4/s400/sat6.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;The "guess rate" is just the bigger of the fraction of negatives or positives in the 
population. If there are more positives, then without more information, you would guess that any randomly chosen outcome would be positive. If there are more negatives, the best guess (without any other information) is that the outcome would be negative.&amp;nbsp;In formulas, we will call this guess rate "r."&amp;nbsp;For the SAT example, the real positive rate is 59%, so r = .59. If the real positive rate had been 45%, we'd use r = 1 - .45 = .55.&lt;br /&gt;&lt;br /&gt;As an example to illustrate the graph above, if the actual positives and negatives are evenly split at r = 50%, then a test that can predict with 80% correctness has the equivalent "perfect understanding" index of 60%. But if the proportion of positives is r = 70% instead of 50%, the index drops to 33%. It's reasonable to say that even though the correctness rate is the same, the first test is almost twice as good as the second one.&lt;br /&gt;&lt;br /&gt;Note that if the guess rate equals the test accuracy, the test explains exactly nothing, which is as it should be.&lt;br /&gt;&lt;br /&gt;Here's a general formula for computing the index p, which is the proportion of "perfect understanding" test results. The other two variables are c = the test's average correct classification rate, and r = the larger of the proportions of negative or positive actual outcomes. In the SAT example, 59% were successful according to the FGPA criterion, so r = .59. If it had been 45% successful, then we'd use r = 1-.45 = .55. 
&amp;nbsp;Given these inputs, we have a simple formula for the index p:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;p = (c &amp;nbsp;- r)/(1 - r)&lt;/blockquote&gt;On the last graph, p is the height of the line, c is the bottom axis, and four values of r (guess rate) are given, one for each curve as noted on the legend.&lt;br /&gt;&lt;br /&gt;(Note: edited 1/20/2012 for clarity)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1332222355088412767?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1332222355088412767/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2012/01/index-for-test-accuracy.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1332222355088412767'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1332222355088412767'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2012/01/index-for-test-accuracy.html' title='An Index for Test Accuracy'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-u3zlu5wzYk8/TxgYUV2nx0I/AAAAAAAAAfo/jd8DVBPtI6Y/s72-c/SAT.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6493243583140016267</id><published>2011-12-16T07:39:00.000-05:00</published><updated>2011-12-17T11:07:13.912-05:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='analysis'/><category scheme='http://www.blogger.com/atom/ns#' term='correlation'/><category scheme='http://www.blogger.com/atom/ns#' term='data mining'/><title type='text'>Free Hypothesis-Generating Software</title><content type='html'>In the last year there have been announcements of two free software packages that use machine learning techniques to mine data for relationships. The resulting mathematical formulas can be used to form hypotheses about the underlying phenomena (i.e. whatever the data represents).&lt;br /&gt;&lt;br /&gt;The first one I have mentioned before. It's &lt;a href="http://creativemachines.cornell.edu/eureqa"&gt;Eureqa&lt;/a&gt; from Cornell, which uses &lt;a href="http://en.wikipedia.org/wiki/Symbolic_Regression"&gt;symbolic regression&lt;/a&gt;. There is an example on the Eureqa site that poses this sample problem:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;This page describes an illustrative run of genetic programming in which the goal is to automatically create a computer program whose output is equal to the values of the quadratic polynomial&amp;nbsp;&lt;/span&gt;&lt;i style="font-size: 16px;"&gt;x&lt;/i&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;+&lt;/span&gt;&lt;i style="font-size: 16px;"&gt;x&lt;/i&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;+1 in the range from –1 to +1. That is, the goal is to automatically create a computer program that matches certain numerical data. 
This process is sometimes called&amp;nbsp;&lt;/span&gt;&lt;i style="font-size: 16px;"&gt;system identification&lt;/i&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;&amp;nbsp;or&amp;nbsp;&lt;/span&gt;&lt;i style="font-size: 16px;"&gt;symbolic regression&lt;/i&gt;&lt;span class="Apple-style-span" style="font-size: 16px;"&gt;.&lt;/span&gt;&lt;/blockquote&gt;The program proceeds as an evolutionary search. The graph pictured below is a schematic of the way the topology of the evolved "critters" is formed.&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://www.genetic-programming.com/BBB3664gen1.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="150" src="http://www.genetic-programming.com/BBB3664gen1.gif" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;A family tree of mathematical functions. (Image Source: geneticprogramming.com)&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;There is a limitation to genetic programming that is also a threat to any intelligent endeavor: the problem may not be&amp;nbsp;amenable&amp;nbsp;to evolutionary strategies. There are some problems where the only way to solve them is exhaustive search. Only if the solution space is "smooth" in the sense that good solutions are "near" almost-good solutions is the genetic approach going to find solutions faster than exhaustive search. On a philosophical note, modern successes with physical sciences suggest that the universe is kind to us in this regard. 
The "unreasonable effectiveness" of mathematics (the title of &lt;a href="http://en.wikipedia.org/wiki/The_Unreasonable_Effectiveness_of_Mathematics_in_the_Natural_Sciences"&gt;an article by Eugene Wigner&lt;/a&gt;) in producing formulas that model real world physics is a hopeful sign that we may be able to decode the external environment so that we can predict it before it kills us. (The internal organization of complex systems is another matter, and there's not much success to look to there.) Note, however, that even here formulas have not really been evolutionary, but revolutionary. Newton's laws of motion are derivable from Einstein's relativity, but not vice versa. The "minor tweak" approach doesn't work very often, Einstein's &lt;a href="http://en.wikipedia.org/wiki/Cosmological_constant"&gt;Cosmological Constant&lt;/a&gt; notwithstanding.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The second data miner is aptly called MINE&lt;/b&gt;, and comes from the Broad Institute of Harvard and MIT. You can read about it on their site &lt;a href="http://www.broadinstitute.org/news/3784"&gt;broadinstitute.org&lt;/a&gt;. The actual program is hosted at &lt;a href="http://exploredata.net/"&gt;exploredata.net&lt;/a&gt;, where you can download a Java implementation with an R interface. Here's a description from the site:&lt;/div&gt;&lt;blockquote class="tr_bq"&gt;One way of beginning to explore a many-dimensional dataset is to calculate some measure of dependence for each pair of variables, rank the pairs by their scores, and examine the top-scoring pairs. 
For this strategy to work, the statistic used to measure dependence should have the following two heuristic properties.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Generality: &lt;/b&gt;with sufficient sample size the statistic should capture a wide range of interesting associations, not limited to specific function types (such as linear, exponential, or periodic), or even to all functional relationships.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Equitability:&lt;/b&gt; the statistic should give similar scores to equally noisy relationships of different types. For instance, a linear relationship with an R2 of 0.80 should receive approximately the same score as a sinusoidal relationship with an R2 of 0.80.&lt;/blockquote&gt;&lt;div&gt;It's interesting that this is a generalized approach to my &lt;a href="http://highered.blogspot.com/2011/12/x-raying-survey-data.html"&gt;correlation mapper software&lt;/a&gt;, the difference being that I have only considered linear relationships. For survey data, it's probably not useful to look beyond linear relationships, but I look forward to trying the package out to see what pops up. It looks easy to install and run, and I can plug it into my Perl script to automatically produce output that complements my existing methods. A project for Christmas break, which is coming up fast.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update:&lt;/b&gt; I came across &lt;a href="http://www.realclimate.org/index.php/archives/2011/12/curve-fitting-and-natural-cycles-the-best-part/"&gt;an article at RealClimate.org&lt;/a&gt; that illustrates the danger of models without explanations. Providing a correlation between items, or a more sophisticated pattern based on Fourier analysis or the like, isn't a substitute for a credible explanatory mechanism. 
Take a look at the article and comments for more.&lt;br /&gt;&lt;br /&gt;By coincidence, I am reading Emanuel Derman's book &lt;i&gt;&lt;a href="http://www.amazon.com/Models-Behaving-Badly-Confusing-Illusion-Reality-Disaster/dp/1439164983/ref=sr_1_1?s=books&amp;amp;ie=UTF8&amp;amp;qid=1324071889&amp;amp;sr=1-1"&gt;Models.Behaving.Badly: Why Confusing Illusion with Reality Can Lead to Disaster, on Wall Street and in Life&lt;/a&gt;. &lt;/i&gt;It has technical parts, which I find quite interesting, and more philosophical parts that leave me scratching my head. The last chapter, which I haven't read yet, advertises "How to cope with the inadequacies of models, via ethics and pragmatism." Stay tuned...&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update 2:&lt;/b&gt; You can read technical information about MINE in &lt;a href="http://www.mediafire.com/?4xc57v4jfmdc1k3"&gt;this article&lt;/a&gt; and &lt;a href="http://www.sciencemag.org/content/suppl/2011/12/14/334.6062.1518.DC1/Reshef.SOM.pdf"&gt;supplementary material&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6493243583140016267?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6493243583140016267/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/12/free-hypothesis-generating-software.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6493243583140016267'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6493243583140016267'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/12/free-hypothesis-generating-software.html' title='Free Hypothesis-Generating 
Software'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-8412969158595690951</id><published>2011-12-10T10:16:00.001-05:00</published><updated>2011-12-10T11:38:50.055-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='prediction'/><title type='text'>Randomness and Prediction</title><content type='html'>I saw a question at &lt;a href="http://stackoverflow.com/"&gt;stackoverflow.com&lt;/a&gt; asking why computer programs can't produce true random numbers. I can't locate the exact page now, but &lt;a href="http://stackoverflow.com/questions/632873/why-is-it-hard-for-a-program-to-generate-random-numbers"&gt;here's a similar one&lt;/a&gt;. The question spooked around in my head all day, despite my head-down work to catch up on paperwork after being away at the SACSCOC meeting (&lt;a href="https://twitter.com/#!/search/realtime/%23sacscoc"&gt;see the tweets&lt;/a&gt;). After coming home, I finally gave in and wrote some notes down on the topic. It has applications to assessment, believe it or not.&lt;br /&gt;&lt;br /&gt;According to complexity theory, "random" means infinitely complex. Complexity is the size of the shortest perfect description of, for example, a list of numbers. If we are given an infinitely long list of numbers like&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;1, 1, 1, 1, 1, ....&lt;/blockquote&gt;it's easy to see that it has very low complexity. We can describe the sequence as "all ones." &amp;nbsp;Similarly, the powers of two make a low-complexity sequence, as does any other simple arithmetic sequence. 
We could create a more complicated computer program that tries to produce numbers that are as "mixed-up" as possible--this is what pseudo-random number generators do, but if we have access to the program (i.e. the description of the sequence), we could perfectly predict the numbers in the sequence. It's hard to call that random.&lt;br /&gt;&lt;br /&gt;Truly random numbers (as far as we know) come from real-world&amp;nbsp;phenomena&amp;nbsp;like radioactive decay. You can have a certain amount of this so-called "entropy" for free from internet sources like &lt;a href="http://www.fourmilab.ch/hotbits/"&gt;Hotbits&lt;/a&gt;. I use such services for my artificial life experiments (insert maniacal laugh here). Real randomness is a valuable commodity, and I'm constantly running over my limit for what I can get for free from these sites. Here's a description from their site of where the numbers come from:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;cite style="background-color: white; text-align: justify;"&gt;HotBits&lt;/cite&gt;&lt;span class="Apple-style-span" style="background-color: white;"&gt;&amp;nbsp;is an Internet resource that brings&amp;nbsp;&lt;/span&gt;&lt;em style="background-color: white; text-align: justify;"&gt;genuine&lt;/em&gt;&lt;span class="Apple-style-span" style="background-color: white; text-align: justify;"&gt;&amp;nbsp;random numbers, generated by a process fundamentally governed by the inherent uncertainty in the quantum mechanical laws of nature, directly to your computer in a variety of forms.&amp;nbsp;&lt;/span&gt;&lt;cite style="background-color: white; text-align: justify;"&gt;HotBits&lt;/cite&gt;&lt;span class="Apple-style-span" style="background-color: white; text-align: justify;"&gt;&amp;nbsp;are generated by timing successive pairs of radioactive decays detected by a Geiger-Müller tube interfaced to a computer.&lt;/span&gt;&lt;/blockquote&gt;What would be involved if you wanted to predict a sequence of such numbers (which will come in 
binary as ones and zeros)? As far as we know, radioactive decay is not predictable from knowing the physical state of the system (see &lt;a href="http://en.wikipedia.org/wiki/Bell's_theorem"&gt;Bell's Theorem&lt;/a&gt;&amp;nbsp;for more on such things).&lt;br /&gt;&lt;br /&gt;Even in a mechanical system such as a spinning basket of ping-pong balls, like those used for selecting the winning numbers in lotteries, a complete description of the system that is sufficient to allow you to predict which balls will emerge to declare the winner would be a very long set of formulas and data. In other words, even if it's not infinitely complex, it's very, very complex.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;But what if we want partial credit?&lt;/b&gt;&amp;nbsp;This was the big idea that had half my brain working all day. What if we are content to predict some fraction of the sequence, and not every single output? (Like "lossy" compression of image files versus exact compressors for executables.) For example, if I flip a coin over and over, and I confidently "predict" for each flip that it will come up heads, I will be right about half the time (in fact, &lt;i&gt;any&lt;/i&gt;&amp;nbsp;predictor I use is going to be right half the time). So even with the simplest possible predictor, I can get 50% accuracy.&lt;br /&gt;&lt;br /&gt;Imagine that we have an infinitely complex binary sequence S-INF that comes from some real source like radioactive decay. We write a program to do the following:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;For the first 999,999 bits, we output a 1 each time. For the one-millionth bit we output the first S-INF bit, which is random. Then we repeat this process forever, with very long strings of 1s followed by a random one or zero. Call this sequence S-1&lt;/blockquote&gt;It should be clear that a perfect predictor of S-1 is impossible with a finite program. It's still infinitely complex because of the interjection of bits from S-INF. 
But on the other hand, we can accurately predict the sequence a large portion of the time. If we just predict that the output will be 1 every time, we can only be wrong on the one random bit in each million, and that bit is a 0 half the time, so we'll be wrong once in every two million bits, on average.&lt;br /&gt;&lt;br /&gt;There is a big difference between randomness and predictability. If we take this another step, we could imagine making a picture of prediction-difficulty for a given sequence. An example from history may make this clearer.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Before Galileo, many people presumably thought that heavy objects fell faster than lighter objects. This is a simple predictor of a physical system. It works fine with feathers and rocks, but it gives the wrong answer for rocks of different weights. Galileo and then Newton added more description (formulas, units, measurements) that allowed much better predictions. These turned out to be insufficient for very large-scale cases, and Einstein added even more description (more math) to create a more complex way of predicting gravitational effects. We know that even relativity is incomplete, however, because it doesn't work on very small scales, so theorists are trying ideas like string theory to find an even more complex predictor that will work in more instances. This process of scientific discovery increases the complexity of the description and, as it does so, increases the accuracy of predictions.&amp;nbsp;&lt;/blockquote&gt;Perfect prediction in the real world can be very, very complex, or even infinitely complex. That means that there isn't enough time or space to do the job perfectly. As someone else has noted (Arthur C. Clarke?), some systems are so complex that the only way to "predict" what happens is to watch the system itself. But even very complex systems may be predictable with high probability, as we have seen. What is the relationship between the complexity of our best predictor and the probability of a correct prediction? 
This will depend on the system we are predicting--there are many possibilities. Below are some graphs to illustrate predictors of a binary sequence. Recall that a well-chosen constant "predictor" can't do worse than 50%.&lt;br /&gt;&lt;br /&gt;The optimistic case is pictured below. This is embodied in the philosophy of progress--as long as we keep working hard, creating more elaborate (and accurate) formulas, the results will come in the form of better and better predictors.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-luFV_SXbPrw/TuOFQlnGZYI/AAAAAAAAAfA/Q1a8ZoW29c0/s1600/predict1.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="296" src="http://1.bp.blogspot.com/-luFV_SXbPrw/TuOFQlnGZYI/AAAAAAAAAfA/Q1a8ZoW29c0/s400/predict1.PNG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&amp;nbsp;The worst case is shown below. No matter how hard we try, we can't do better than guessing. This is the case with radioactive decay (as far as anyone knows).&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-cPuwE5cOdl8/TuOFShSYA3I/AAAAAAAAAfI/6TK-cNfXry8/s1600/predict2.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="296" src="http://2.bp.blogspot.com/-cPuwE5cOdl8/TuOFShSYA3I/AAAAAAAAAfI/6TK-cNfXry8/s400/predict2.PNG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;The graph below is more like the actual progress of science as a "punctuated equilibrium." There are increasingly large complexity deserts, where no improvement is seen. 
Compare the relatively few scientists who led to Newton's revolution or the efforts of Einstein and his collaborators to the massive undertaking that is string theory (and its competition, like loop quantum gravity).&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-d8oQ3CJJ9Xw/TuOFUFkQ9aI/AAAAAAAAAfQ/ybfw0h4NPaU/s1600/predict3.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="296" src="http://2.bp.blogspot.com/-d8oQ3CJJ9Xw/TuOFUFkQ9aI/AAAAAAAAAfQ/ybfw0h4NPaU/s400/predict3.PNG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;Note that merely increasing the complexity of a predictor is easy. The hard part is figuring out how to increase prediction rates. You can always make a formula or description more complex, but doing so doesn't guarantee that the predictions are any better. Generally speaking, there is no computable (that is, systematic or deterministic) method for automatically finding optimal predictors for a given complexity level. You might think that you could just try every single program of a given complexity level and proceed by exhausting the possibilities, but you run into the &lt;a href="http://en.wikipedia.org/wiki/Halting_problem"&gt;Halting Problem&lt;/a&gt;. There &lt;i&gt;are&lt;/i&gt; practical ways to tackle the problem though. This is a topic from the part of computer science called&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Machine_learning"&gt;machine learning&lt;/a&gt;. A new tool that appeared this year from Cornell University is &lt;a href="http://creativemachines.cornell.edu/eureqa"&gt;Eureqa&lt;/a&gt;, a program for finding formulas to fit patterns in data sets using an evolutionary approach.&lt;br /&gt;&lt;br /&gt;Next time I will apply this idea to testing and outcomes assessment. 
It's very cool.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-8412969158595690951?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/8412969158595690951/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/12/randomness-and-prediction.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8412969158595690951'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8412969158595690951'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/12/randomness-and-prediction.html' title='Randomness and Prediction'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-luFV_SXbPrw/TuOFQlnGZYI/AAAAAAAAAfA/Q1a8ZoW29c0/s72-c/predict1.PNG' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-3248350814319602725</id><published>2011-12-09T10:15:00.001-05:00</published><updated>2011-12-09T14:39:44.982-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='surveys'/><category scheme='http://www.blogger.com/atom/ns#' term='correlation'/><title type='text'>X-Raying Survey Data</title><content type='html'>I continue to develop and use the software I patched together to look at correlates (or covariates) within 
large scalar or ordinal data sets like surveys. I have gotten requests from several institutions in and out of higher ed to do these. A couple of interesting graphs that resulted are shown below, with permission of the owners of the data, who shall remain anonymous. Both of these are &lt;a href="http://www.heri.ucla.edu/"&gt;HERI&lt;/a&gt; surveys. I have found the HERI surveys the most revealing, partly because they discriminate so well between different dimensions. Some other surveys seem to produce (in the data sets I've seen) big globs of correlated items that are hard to get meaning from.&lt;br /&gt;&lt;br /&gt;First the &lt;a href="http://www.heri.ucla.edu/cirpoverview.php"&gt;CIRP Freshman Survey&lt;/a&gt; at a private college. It neatly divides up the survey respondents into clusters. Rich urban kids negatively correlated to working class or middle class kids, athletes, the religious, and the environmentally-conscious all show up clearly. I've labeled the optional questions with an approximation of the prompt.[&lt;a href="http://zzascape.com/COV20-1.JPG"&gt;Download full-sized graph&lt;/a&gt;]&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-BPPQkkX6jmw/TuIneCOa0gI/AAAAAAAAAew/Hkk54n_pNow/s1600/COV20-1.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://3.bp.blogspot.com/-BPPQkkX6jmw/TuIneCOa0gI/AAAAAAAAAew/Hkk54n_pNow/s400/COV20-1.JPG" width="385" /&gt;&lt;/a&gt;&lt;/div&gt;Next is the &lt;a href="http://www.heri.ucla.edu/yfcyoverview.php"&gt;Your First Year College Survey&lt;/a&gt; at a different private college. I find the link between texting in class and recommending the school to others particularly interesting. That's at the bottom. 
Red lines are negative correlations.[&lt;a href="http://zzascape.com/graph.6.png"&gt;Download full-sized graph&lt;/a&gt;]&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-rMRtcNziIZw/TuIn8lEEDfI/AAAAAAAAAe4/KUzJ9hHS96E/s1600/graph.6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="355" src="http://1.bp.blogspot.com/-rMRtcNziIZw/TuIn8lEEDfI/AAAAAAAAAe4/KUzJ9hHS96E/s400/graph.6.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-3248350814319602725?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/3248350814319602725/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/12/x-raying-survey-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3248350814319602725'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3248350814319602725'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/12/x-raying-survey-data.html' title='X-Raying Survey Data'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-BPPQkkX6jmw/TuIneCOa0gI/AAAAAAAAAew/Hkk54n_pNow/s72-c/COV20-1.JPG' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-3497399969010935838</id><published>2011-12-09T07:07:00.001-05:00</published><updated>2011-12-09T07:25:41.620-05:00</updated><title type='text'>Higher Ed's 1%</title><content type='html'>I found this &lt;a href="http://chronicle.com/article/Graphic-How-Presidents-Pay/129981/"&gt;chart at the Chronicle&lt;/a&gt; very interesting. It shows the ratio of presidential pay to average professor pay. For example, at Stevenson University, it's 16 to 1.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-iGjAhbTJ9VA/TuH6l_Ll8YI/AAAAAAAAAeg/AbDRuYMJFNQ/s1600/pay.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="233" src="http://3.bp.blogspot.com/-iGjAhbTJ9VA/TuH6l_Ll8YI/AAAAAAAAAeg/AbDRuYMJFNQ/s320/pay.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;My rule of thumb is that the quality of an institution is ranked by Instructional Costs/FTE. 
Here are some selected institutions from the right side of the chart, along with their "&lt;a href="http://highered.blogspot.com/2009/09/zzas-best-liberal-arts-schools.html"&gt;Z-scores&lt;/a&gt;", courtesy of &lt;a href="http://www.collegeresults.org/search1ba.aspx?institutionid=164988,175980,111948,164173,130794,194824,221999"&gt;CollegeResultsOnline&lt;/a&gt; (2009 data).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-Kuv-QhFwRSk/TuH8egBNPsI/AAAAAAAAAeo/NO64vcZASXY/s1600/pay2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-Kuv-QhFwRSk/TuH8egBNPsI/AAAAAAAAAeo/NO64vcZASXY/s1600/pay2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-3497399969010935838?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/3497399969010935838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/12/higher-eds-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3497399969010935838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3497399969010935838'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/12/higher-eds-1.html' title='Higher Ed&apos;s 1%'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' 
src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-iGjAhbTJ9VA/TuH6l_Ll8YI/AAAAAAAAAeg/AbDRuYMJFNQ/s72-c/pay.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-7807193689166820757</id><published>2011-12-02T07:22:00.001-05:00</published><updated>2011-12-02T09:06:23.546-05:00</updated><title type='text'>Rubrics as Dialogue</title><content type='html'>In the last few articles beginning with "&lt;a href="http://highered.blogspot.com/2011/11/end-of-preparation.html"&gt;The End of Preparation&lt;/a&gt;", I have contrasted two epistemologies. One proceeds by definition, which I called monological, and the other emerges from dialogue. These are distinct and equally useful ways of understanding the world. We could display the discipline of physics as a very successful monological system. It allows scientists and engineers to model physical systems and design systems that will work as intended. It allows us to understand what the sun is, and where the atoms in our bodies came from, all as part of a broad model that uses consistent monological language. This successful union of reality, language, and model is what makes the physical sciences so powerful. With such a language we can think precisely about complexity, for example, which is the size of the description of some physical system. A jar of marbles being shaken is complex because each marble's position and velocity are different, requiring a long list of physical attributes. If they are all at rest in a square shape, the description in physical language can be compressed, and the state is less complex. Physicists would refer to this as high entropy versus low entropy.&lt;br /&gt;&lt;br /&gt;By comparison, the language and ways of knowing that create popular culture are dialogical. 
There are no rules set down about what new words (or memes, if you prefer) will arise, and no deterministic rules that could be applied to predict cultural evolution. A stock exchange has some elements of monologism (precise definitions regarding financial transactions, for example), but the evolution of prices is dialogical--unpredictable&amp;nbsp;consensus&amp;nbsp;between buyers and sellers.&lt;br /&gt;&lt;br /&gt;One of the characteristics that&amp;nbsp;distinguishes&amp;nbsp;a monological language from a dialogical one is that in the former case, the names can be arbitrary. What matters is their relationship in the model that's used for understanding the world. For example, electricity comes in Volts and&amp;nbsp;Amperes, and its power is measured in Watts. These are names of scientists, as are the Ohm, Henry, and Farad, terms that refer to electrical properties of circuit elements. If they were dialogical names, they would more likely be "Zap," "Spark," and "Shock" or something similarly descriptive. This is because in a dialogue, it's an asset for words to be descriptive--you don't have to waste extra time saying what it is you meant. By contrast, it's enough to know that V=IR when calculating voltage in a circuit. It doesn't matter whether we call it Volts or Zaps.&lt;br /&gt;&lt;br /&gt;It's a trope to poke fun at academics who speak in high-falutin' language just to say something ordinary. When Sheldon in &lt;i&gt;Big Bang Theory&lt;/i&gt;&amp;nbsp;gets stuck on a climbing wall, he says "I feel somewhat like an inverse tangent function that's approaching an&amp;nbsp;asymptote," which is then reinforced by his desperate follow-up "What part of 'an inverse tangent function approaching an asymptote' did you not understand?" [&lt;a href="http://www.youtube.com/watch?v=gI028-a63DI"&gt;video clip&lt;/a&gt;] &amp;nbsp;Some might argue that some academic disciplines that are inherently more dialogical use language that's unnecessarily opaque. 
This point was&amp;nbsp;publicly&amp;nbsp;made in the "&lt;a href="http://en.wikipedia.org/wiki/Sokal_affair"&gt;Sokal Affair&lt;/a&gt;," where a scientist submitted a jargon-laden meaningless paper to a humanities journal as a hoax, and it was published.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Using Rubrics for Assessment&lt;/b&gt;&lt;br /&gt;In order to connect these ideas to the assessment practice of using rubrics, let me first review what they are.&lt;br /&gt;&lt;br /&gt;The term "rubric" in learning outcomes assessment means a matrix that indexes competencies versus accomplishment levels. For example, a rubric for rating student essays might include a "correctness" competency, which is probably one of several on the rubric. There would be a scale attached to correctness, which might be Poor, Average, Good, Excellent (PAGE), or one tied to a development sequence like "Beginning" through "Mastering." In our Faculty Assessment of Core Skills survey, we use Developmental, Fresh/Soph, Jr/Sr, Graduate to relate the scale to the expectations of faculty.&lt;br /&gt;&lt;br /&gt;A rubric alone is not enough to do much good. A fully-developed process using rubrics might go something like this, starting with developing your own.&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Define a learning objective in ordinary academic language. "Students who graduate from Comp 101 should be able to write a standard essay that uses appropriate voice, is addressed to the target audience, is effective in communicating its content, and is free from errors."&lt;/li&gt;&lt;li&gt;The competencies identified in the outcomes statement are clear: voice, audience, content, and correctness. These define the rows of the rubric matrix.&lt;/li&gt;&lt;li&gt;Decide on a scale and language to go with it, e.g. PAGE.&lt;/li&gt;&lt;li&gt;Describe the levels of each competency in language that is helpful to students. 
It's better to be positive than negative--that is, define what you want, not what you don't want when possible. There are many resources on constructing rubrics you can consult. The &lt;a href="http://www.aacu.org/value/rubrics/index_p.cfm?CFID=36339890&amp;amp;CFTOKEN=74306965"&gt;AAC&amp;amp;U's VALUE rubrics&lt;/a&gt; are examples to refer to.&lt;/li&gt;&lt;li&gt;The rubric should be used in creating assignments, and distributed with the assignment, so the student is clear about expectations. Use of rubrics in grading varies--it's not necessary to tie an assessment to a grade, but there are some obvious advantages if you do.&lt;/li&gt;&lt;li&gt;Rating the assignment that was designed with the rubric in mind should not be a challenge. If it's essential to have reliable results, then multiple raters can be used, and training sessions can reduce some of the variability in rating. Nevertheless, it's not an exact process.&lt;/li&gt;&lt;li&gt;Over time you create a library of samples to show students (and raters) what constitutes each achievement level.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Note that the way to do this is NOT to take some rubric someone else has created and apply it to assignments that were not created with the rubric in mind. That's how I did it the first time, and wasted everyone's time.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Rubrics as Constructed Language&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The learning objectives that rubrics are employed to assess are often complex, so even though an attempt is made to define the levels of accomplishment, these descriptions are in ordinary language. That is, there's no formal deductive structure or accompanying model that deterministically generates output ratings from inputs. Instead, the ratings rely on the judgment of professionals, who are free to disagree with one another. 
If your attitude is that there is one true rating that all raters must eventually agree on, you're likely to be frustrated. One problem is that the competencies, like content and correctness in the example, are not independent. If there are too many spelling and grammar mistakes on a paper to gain any sort of comprehension, content, style, voice, and so on are also going to be degraded. One rater of writing samples I remember was adamant that a single spelling mistake implied that all other ratings would be lowered as well.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So using rubrics is dialogical, but by the way of a nice compromise. The power in rubrics comes from restraining the language we use to describe student work, according to a public set of definitions. Even though these are not rigorous, they are still extremely useful in focusing attention on the issues that are deemed important. In addition, rubrics create a common language in the learning domain. It's important for students to know not just content, but also how professionals critique content, and rubrics are a way to do that. They can be used for self-reflection or peer review to reinforce the use of that language.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The advantage of generating useful language is one reason I only use a PAGE scale as a last resort. Terms like poor, average, and so on are too generic, and too easily made relative. An excellent freshman paper and an excellent senior paper should not be the same thing, right? Bad choices in these terms early on can have long-term consequences when you want to do a longitudinal analysis of ratings.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There is a tendency among some to view rubric rating as a more monological process, but I can't see how this can be supported for most learning outcomes. 
In my opinion, they are most useful in creating a common language to employ in teaching, to rein in the vast lexicon that might naturally be used and focus on the elements that we agree are the most important. This has positive benefits for everyone concerned.&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-7807193689166820757?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/7807193689166820757/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/12/rubrics-as-dialogue.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7807193689166820757'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7807193689166820757'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/12/rubrics-as-dialogue.html' title='Rubrics as Dialogue'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-4270514913955772499</id><published>2011-12-01T07:34:00.001-05:00</published><updated>2011-12-01T08:00:14.217-05:00</updated><title type='text'>Chef's Salad</title><content type='html'>Here's another serving of link salad. 
The articles referenced connect to recent topics of discussion.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;a href="http://www.heri.ucla.edu/"&gt;HERI&lt;/a&gt; just released a &lt;a href="http://heri.ucla.edu/DARCU/CompletingCollege2011.pdf"&gt;report on college graduation rates&lt;/a&gt;.&lt;/b&gt; They give details on regressions to predict completion, and provide the correct classification rates for each. Here's an example:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-XnAJaj4fVAs/Ttd1Z1j2gDI/AAAAAAAAAeY/bfUUNkblIKU/s1600/grad1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="237" src="http://1.bp.blogspot.com/-XnAJaj4fVAs/Ttd1Z1j2gDI/AAAAAAAAAeY/bfUUNkblIKU/s400/grad1.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;Note that SAT scores don't add any information once high school GPA is accounted for. The correct classification rate can be compared to the rate of graduates. For example, if a test correctly predicts a coin flip 50% of the time, on the face of it this isn't very impressive. But it's actually more complicated than that. I have a kind of complexity theory approach to this sketched out on scrap paper, and will write about that later. In this case, the rate of four-year graduation from page 7 of the paper is 38.9%, so a correct classification rate of 68.3% can be compared to the strategy of predicting that &lt;i&gt;no student&lt;/i&gt; will graduate, which is correct 61.1% of the time (100% minus 38.9%). Even by this crude comparison, the predictor looks useful.&lt;br /&gt;&lt;br /&gt;HERI provides an associated &lt;a href="http://www.heri.ucla.edu/GradRateCalculator.php"&gt;calculator&lt;/a&gt; that lets you try out different scenarios related to graduation. 
Very cool.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The &lt;a href="http://www.virginiaassessment.org/"&gt;Virginia Assessment Group&lt;/a&gt; just published &lt;/b&gt;their &lt;a href="http://www.virginiaassessment.org/Final_RPA_winter2011.pdf"&gt;new edition of Research &amp;amp; Practice in Assessment&lt;/a&gt;. I hope to be able to read it on the plane tomorrow, on the way to the annual SACS-COC meeting in Orlando. Last time I was there I got to see a shuttle launch, which was amazing. Not likely this time.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;a href="http://coursekit.com/"&gt;Coursekit&lt;/a&gt; is a new learning management system&lt;/b&gt; that wants to be more like social media. This relates to the topic of connecting professional portfolios to a social network. I learned about Coursekit in this &lt;a href="http://chronicle.com/blogs/wiredcampus/new-course-management-software-promises-facebook-like-experience/34488?utm_source=feedburner&amp;amp;utm_medium=feed&amp;amp;utm_campaign=Feed%3A+chronicle%2Fwiredcampus+%28The+Chronicle%3A+Wired+Campus%29"&gt;Wired Campus article&lt;/a&gt; in &lt;i&gt;The Chronicle&lt;/i&gt;. Even more intriguing to me is &lt;a href="http://chronicle.com/blogs/wiredcampus/creating-new-academic-networks-with-commons-in-a-box/34453?utm_source=feedburner&amp;amp;utm_medium=feed&amp;amp;utm_campaign=Feed%3A+chronicle%2Fwiredcampus+%28The+Chronicle%3A+Wired+Campus%29"&gt;Commons in a Box&lt;/a&gt;, a separate open source project to create professional networks. 
Quoting from the article in &lt;i&gt;The Chronicle&lt;/i&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Educational groups, scholarly associations, and other nonprofit   organizations will be able to leverage the Commons in a Box to give   their members a space in which to present themselves as scholars to the   public, to share their work, to locate and communicate with peers, and   to engage in collaborative scholarship.&lt;/blockquote&gt;The original source is the &lt;a href="http://news.commons.gc.cuny.edu/2011/11/22/the-cuny-academic-commons-announces-the-commons-in-a-box-project/"&gt;CUNY Academic Commons&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4270514913955772499?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4270514913955772499/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/12/chefs-salad.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4270514913955772499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4270514913955772499'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/12/chefs-salad.html' title='Chef&apos;s Salad'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-XnAJaj4fVAs/Ttd1Z1j2gDI/AAAAAAAAAeY/bfUUNkblIKU/s72-c/grad1.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1995535033817241324</id><published>2011-11-28T06:40:00.001-05:00</published><updated>2011-11-28T07:07:22.634-05:00</updated><title type='text'>Link Salad</title><content type='html'>A Monday's worth of interesting education-related links:&lt;br /&gt;&lt;br /&gt;On non-cognitives, we have two articles from the &lt;i&gt;Boston Globe&lt;/i&gt;. The first is "&lt;a href="http://www.bostonglobe.com/ideas/2011/11/20/how-college-prep-killing-high-school/94mGUe6o9InIEuO9oMhnzJ/story.html"&gt;How College Prep is Killing High School&lt;/a&gt;":&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;A number of economists, including Nobel economist James Heckman, have documented the need for noncognitive or so-called soft skills in the labor market, such as motivation, perseverance, risk aversion, self-esteem, and self-control. &lt;/blockquote&gt;The second is "&lt;a href="http://www.bostonglobe.com/lifestyle/health-wellness/2011/11/07/how-willpower-works/XlOvEG4FipvZ8vM8VUNBpK/story.html"&gt;How Willpower Works&lt;/a&gt;":&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;In dozens of studies conducted over the past 25 years, Baumeister has found that taking on specific habits - like brushing your teeth with the opposite hand you’d normally use - can increase levels of self-control. In a phone interview, he likened willpower to a muscle: “If you exercise it, you can make it stronger. 
There’s nothing magical about it.’’&lt;/blockquote&gt;Then there is the less optimistic offering from the &lt;i&gt;New York Times&lt;/i&gt; "&lt;a href="http://www.nytimes.com/2011/11/27/magazine/changing-rules-for-success.html?_r=4&amp;amp;pagewanted=all"&gt;The Dwindling Power of a College Degree&lt;/a&gt;," which contains a warning for all of us:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;A general guideline these days is that people are rewarded when they can do things that take trained judgment and skill — things, in other words, that can’t be done by computers or lower-wage workers in other countries.&lt;/blockquote&gt;&lt;i&gt;The Wall Street Journal&lt;/i&gt; has a &lt;a href="http://graphicsweb.wsj.com/documents/NILF1111/#term="&gt;scorecard of career salaries&lt;/a&gt; by degree, in case you're keeping score. The highest 75th percentile salary goes to math and computer science combined. Compare it to math education:&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-G7UTJfD0H5Y/TtN1wNYIunI/AAAAAAAAAeQ/bIeGgUdfOmo/s1600/career.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="51" src="http://2.bp.blogspot.com/-G7UTJfD0H5Y/TtN1wNYIunI/AAAAAAAAAeQ/bIeGgUdfOmo/s640/career.png" width="640" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;A partial listing of the WSJ salary/major list found &lt;a href="http://graphicsweb.wsj.com/documents/NILF1111/#term="&gt;here&lt;/a&gt;.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;The quote in the &lt;i&gt;New York Times&lt;/i&gt; article about computers replacing us is especially interesting when juxtaposed to the ambitious research plan described in "&lt;a 
href="http://www.physorg.com/news/2011-11-language-science.html"&gt;Mining the Language of Science&lt;/a&gt;," from &lt;i&gt;Physorg.com&lt;/i&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Scientists are developing a computer that can read vast amounts of scientific literature, make connections between facts and develop hypotheses.&lt;/blockquote&gt;Stanford University is offering a &lt;a href="http://www.ml-class.org/"&gt;free online course&lt;/a&gt; on machine learning if you want to learn how to make a computer smarter than yourself (&lt;a href="http://en.wikipedia.org/wiki/Arthur_Samuel"&gt;true story&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&amp;nbsp;To round out that topic, here are two articles on the limits of human understanding. First from &lt;i&gt;Physorg.com&lt;/i&gt; again is "&lt;a href="http://www.physorg.com/news/2011-08-people-biased-creative-ideas.html"&gt;People are Biased against Creative Ideas, Studies Find&lt;/a&gt;," including these findings:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Creative ideas are by definition novel, and novelty can trigger feelings of uncertainty that make most people uncomfortable.&amp;nbsp;&lt;/li&gt;&lt;li&gt;&amp;nbsp;People dismiss creative ideas in favor of ideas that are purely practical -- tried and true.&amp;nbsp;&lt;/li&gt;&lt;li&gt;&amp;nbsp;Objective evidence shoring up the validity of a creative proposal does not motivate people to accept it.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Anti-creativity bias is so subtle that people are unaware of it, which can interfere with their ability to recognize a creative idea.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;The second article, from &lt;i&gt;SciGuru&lt;/i&gt;, is "&lt;a href="http://www.sciguru.com/newsitem/11361/Ignorance-bliss-when-it-comes-challenging-social-issues"&gt;Ignorance is bliss when it comes to challenging social issues&lt;/a&gt;."&lt;/div&gt;&lt;blockquote class="tr_bq"&gt;The less people know about important complex issues such as the economy, energy consumption and the 
environment, the more they want to avoid becoming well-informed, according to new research published by the American Psychological Association. And the more urgent the issue, the more people want to remain unaware [...]&lt;/blockquote&gt;&lt;div&gt;This illustrates the mechanism I described in "&lt;a href="http://highered.blogspot.com/2011/11/self-limiting-intelligence.html"&gt;Self-limiting Intelligence&lt;/a&gt;." &amp;nbsp;You can test yourself on these last two points. Here's a &lt;a href="http://www.businessinsider.com/this-28-year-old-is-making-sure-credit-cards-wont-exist-in-the-next-few-years-2011-11?page=2"&gt;creative idea&lt;/a&gt; from &lt;i&gt;Business Insider&lt;/i&gt;, and a &lt;a href="http://www.economist.com/node/21540259"&gt;challenging social issue&lt;/a&gt; from &lt;i&gt;The Economist&lt;/i&gt;. Good luck!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1995535033817241324?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1995535033817241324/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/link-salad.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1995535033817241324'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1995535033817241324'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/link-salad.html' title='Link Salad'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' 
src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-G7UTJfD0H5Y/TtN1wNYIunI/AAAAAAAAAeQ/bIeGgUdfOmo/s72-c/career.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-7490647412243231287</id><published>2011-11-23T08:18:00.001-05:00</published><updated>2011-11-26T14:24:03.771-05:00</updated><title type='text'>Assessments, Signals, and Relevance</title><content type='html'>In "&lt;a href="http://highered.blogspot.com/2011/11/tests-and-dialogues.html"&gt;Tests and Dialogues&lt;/a&gt;" I promised to address the use of rubrics, which I will get to. But before I do, I want to extend the ideas presented in the last few articles. By coincidence, my daughter provided an example the same day I wrote the article.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;My daughter Epsilon had a math test and a French test yesterday, so naturally I asked how they went. She had spent quite some time reviewing (with my imperfect help in the French class), and said that the tests were easy except that she forgot what the degree of a polynomial is (ugh!). She said she was able to guess at some things she didn't know, which made my eyebrows rise. Guess? Sure, she says, it's almost all multiple choice. &amp;nbsp;Here I began to sputter. &lt;i&gt;What??&lt;/i&gt; Algebra and &lt;i&gt;French&lt;/i&gt;...multiple choice? Yes, says she, it's because of the EOCs. That would be the local name for the monological "End of Course" state tests. Since the EOCs are multiple choice, and there is so much weight put on them, it makes economic sense to optimize all testing to resemble "the ones that matter." She's 14, and this is old hat to her by now.&lt;/blockquote&gt;The wrong assessments plus a factory mentality optimizes local relevance at the cost of global irrelevance. 
David Kammler wrote a&amp;nbsp;marvelous&amp;nbsp;parable in this vein, "&lt;a href="http://highered.blogspot.com/2009/01/well-intentioned-commissar.html"&gt;The Well Intentioned Commissar&lt;/a&gt;." Achieving goals can be inherently very complex. When we try to grasp their workings by simplifying cause and effect (e.g. in order to manage like a factory), we can lose important information. This is detrimental when optimizing the simplified problem is not the same as optimizing the original problem. The impact is not merely academic. I read a story in&amp;nbsp;&lt;i&gt;The Economist&lt;/i&gt; years ago that went like this:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;A state in the US was spending more on road repair than it thought reasonable, and sought to make the situation more equitable by passing cost onto the owners of the heavy trucks that were doing the most damage. So they instituted an axle fee--the more axles the truck had, the higher the cost to the truck owner to use the roads. This was a simple approximation: the heavier the truck, the more axles. What could go wrong? The outcome was that truckers, not being stupid, started using trucks that carried just as much weight, but on fewer axles. This increased the ground pressure of the trucks (same weight over less area) and damaged the roads even more than before. In this case they didn't merely optimize irrelevance, but actually exacerbated the problem they were trying to fix.&lt;/blockquote&gt;Even being irrelevant has an associated opportunity cost. The time spent learning how to game multiple choice tests could be better spent. 
We can only imagine what the long term cost is, when students finally figure out that real problems don't come with a built-in 20% chance of guessing the right answer.&lt;br /&gt;&lt;br /&gt;It is not a coincidence that all of this ties together with the idea in "&lt;a href="http://highered.blogspot.com/2011/11/self-limiting-intelligence.html"&gt;Self-Limiting Intelligence&lt;/a&gt;," where the problem I tried to illuminate about intelligent systems is that self-change easily turns into self-deception. My last couple of articles have mostly ignored the fact that there are powerful motivations lurking behind official definitions. Here's an &lt;a href="http://thenewspaper.com/news/36/3636.asp"&gt;example of how motivation subverts definitions from &lt;i&gt;theNewspaper.com&lt;/i&gt;&lt;/a&gt;, which calls itself "a journal of the politics of driving."&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Automated ticketing vendor American Traffic Solutions (ATS) filed suit Tuesday against Knoxville, Tennessee for its failure to issue tickets for turning right on a red light -- and that is costing the company a lot of money. A state law took effect in July banning the controversial turning tickets, but the Arizona-based firm contends the law should not apply to their legal agreement with the city, which anticipated the bulk of the money to come from this type of tickets. &lt;/blockquote&gt;If this seems silly, here's a more &lt;a href="http://www.infowars.com/judges-took-bribes-to-send-children-to-privately-owned-juvenile-detention-centers/"&gt;disturbing example&lt;/a&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Judge Mark A. Ciavarella and former Senior Judge Michael T. 
Conahan are accused of taking $2.6 million for sending children to two [correctional] facilities owned by Pittsburgh businessman Greg Zappala.&lt;/blockquote&gt;Privatizing a corrections facility created an economic value for criminal offenders, which increased the supply of such through more application of the monological standard by judges. This is what juries are there to prevent--provide a dialogical check.&lt;br /&gt;&lt;br /&gt;In higher education, the definition of educational success adopted by the policymakers is "enrollment," and "graduation," for which the state pays plenty. See my previous article "&lt;a href="http://highered.blogspot.com/2010/06/flipping-colleges-for-profit.html"&gt;Flipping Colleges for Profit&lt;/a&gt;," for how that turns out in the hands of private investors who seek to maximize dollars/student. We maximize enrollment and (perhaps) graduation at very large monetary expense to the taxpayer in grants and loans to students who &lt;a href="http://www2.ed.gov/offices/OSFAP/defaultmanagement/instrates.html"&gt;default at high rates&lt;/a&gt;. This counterproductive effect is evidence of an over-simplified index of success.&lt;br /&gt;&lt;br /&gt;It would be understandable if you took away from the discussion so far that monological = bad and dialogical = good, but that's not the case. Systems &lt;i&gt;have&lt;/i&gt;&amp;nbsp;to function monologically most of the time. I base this on a simple argument:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Systems of all kinds exist. Some work, some don't. The ones that survive do so in part because they are motivated to survive. The actions a system takes to survive can be ultimately reduced to binary "do this/don't do that" decisions. Motivations drive those decisions based on information from the internal and external environment. This reduction of complex data into a simple binary decision we might call an assessment, and the result is a 'signal.' 
&lt;i&gt;Pain when you stub your toe is a "don't do that" signal corresponding to the implicit motivation to avoid bodily harm.&amp;nbsp;&lt;/i&gt;&lt;/blockquote&gt;If we put together all these signals, they comprise a language. If it works, it models the environment and allows the system to survive (eat this, don't eat that). The diagram that goes with a motivation-driven decision loop is the one I discussed in "&lt;a href="http://highered.blogspot.com/2011/11/self-limiting-intelligence.html"&gt;Self-Limiting Intelligence.&lt;/a&gt;"&lt;br /&gt;&lt;br /&gt;The point is not that monological motivation-driven signals are bad for us, it is that we have to use the &lt;i&gt;right ones&lt;/i&gt; if we want to succeed. Sometimes dialogues get turned into signals, as in a plebiscite or anywhere else where public opinion matters. Marketing is another example, but in reverse--working from a motivation to try to affect dialogue so someone can sell more soap. In those cases, lots of energy is spent in trying to affect conversations. The BBC's recent article "&lt;a href="http://www.bbc.co.uk/news/technology-15869683"&gt;Fake forum comments are 'eroding' trust in the web&lt;/a&gt;" is an example.&lt;br /&gt;&lt;br /&gt;Part of the decision about what assessment to use to create signals should be driven by the consideration that weighing it down with economic value will probably degrade the quality. This problem is ubiquitous. It includes counterfeiting, cheating, and corruption of all sorts. It even shows up in natural selection, as Darwin figured out, in &lt;a href="http://en.wikipedia.org/wiki/Sexual_selection"&gt;sexual selection&lt;/a&gt;--explaining peacock feathers, for example. It is perhaps embodied in the advice "you have to fake it to make it."&lt;br /&gt;&lt;br /&gt;In order to make a decision, a system has to process a potentially infinite amount of data for a few clues as to what will accomplish its goal (fulfilling motivations). 
This assessment&amp;nbsp;is a massive data compression that, if it's done well, describes in signal-language the important elements of the environment relative to motivation.&lt;br /&gt;&lt;br /&gt;An example will illustrate the point. First, let's look at the role of complexity and assessment. I took the photos below at the &lt;a href="http://www.capefearserpentarium.com/"&gt;Cape Fear Serpentarium&lt;/a&gt; in Wilmington, North Carolina. (It's a fantastic place to visit, along with Fort Fisher and the nearby aquarium if you're in the area.) This is the &lt;a href="http://en.wikipedia.org/wiki/Bitis_gabonica"&gt;Gaboon Viper&lt;/a&gt;, and I first read about it in a zoo--I think in Columbia, South Carolina. It's a big, slow snake that prefers to sit and wait for lunch to walk or hop by. If you're a small animal, the picture below shows your perspective:&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-3OvCz0N1CMQ/Ts-y0U4EIHI/AAAAAAAAAd4/90DvLsS9ngg/s1600/snake2.JPG" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="240" src="http://4.bp.blogspot.com/-3OvCz0N1CMQ/Ts-y0U4EIHI/AAAAAAAAAd4/90DvLsS9ngg/s320/snake2.JPG" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;The Gaboon Viper presents a high-complexity look to prey.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&amp;nbsp;In its natural environment, the snake's colors perfectly fit into the surrounding forest floor. The complex patterns of light and dark break up the shape of the form, so that a rodent is unlikely to correctly assess this information and form the signal SNAKE! 
The snake presents a high complexity visual presentation to the world, and is rewarded for this concealment by a reduced probability that a rodent will correctly assess the situation.&lt;br /&gt;&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; margin-left: 1em; text-align: right;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-ExtWDFVHxFU/Ts-yuNKl4BI/AAAAAAAAAdw/7bRJz5uqw_0/s1600/snake1.JPG" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="240" src="http://1.bp.blogspot.com/-ExtWDFVHxFU/Ts-yuNKl4BI/AAAAAAAAAdw/7bRJz5uqw_0/s320/snake1.JPG" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;And a low-complexity "here I am!" to large beasts.&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;However, the snake has another problem. It's got a great lifestyle, sitting around waiting for dinner to walk by, but there are also large hooved beasts that are far too large to eat and impossible to get out of the way of when they come ambling by. It's a good thing to be hidden from small critters that one might consume, but quite another to be hidden from a huge monster that might step on you and break your spine! So the presentation for a viewer looking &lt;i&gt;down &lt;/i&gt;on the serpent needs adjustment. Instead of concealment, it wants to create an instant assessment in the bovine brain of SNAKE! As you can see in the photo on the right, this is accomplished (via natural selection, of course) by white stripes that look like the center of a highway. &amp;nbsp;I looked for research that actually demonstrates that cows can see these snakes better than ones without such coloration, but didn't come up with anything. So treat this as informed speculation rather than fact, unless someone can point me to an authoritative source. 
But the effect is real. In the family of large cats, some females have white dots on the backs of their ears so they can present a low-complexity "follow me" sign to their kittens in low light. Military vehicles do something similar so they don't run into one another in the dark, but also don't make good targets.&lt;br /&gt;&lt;br /&gt;To continue the example:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Suppose you are going out for an afternoon hike in a tropical jungle. You check into the matter and see that there are deadly poisonous snakes in the area. Fortunately, there is a guy whose job it is to monitor the jungle nearby for such threats, and post a sign at the trail head with a warning when appropriate. &amp;nbsp;You may rightfully be dubious. There is a very large tract of undeveloped forest out there, and how plausible is it that this guy--who may be the governor's favorite nephew, for all you know--could have checked for every possible snake? So you are not&amp;nbsp;reassured&amp;nbsp;when you see a big green NO SNAKES TODAY THANK YOU sign nailed to a tree. In effect, you've rejected the data reducing assessment and have decided to create your own signals. That is, the final assessment for "is there a snake here?" remains pending. &amp;nbsp;You carefully watch where you step, tying up a large part of your mind to continually test the environment against your snake-matching perception. This is a lot of work and quite stressful, so you give it up and go to lunch instead.&lt;/blockquote&gt;The example shows the trade-off in an early or late assessment into a signal. Early signals make subsequent decisions easier. That's why most businesses like stability.&lt;br /&gt;&lt;br /&gt;So the big question for a complex system is when to do the assessment for a given motivation. 
There are symmetrical arguments for and against early decisions based on limited data:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Early assessment from data to signal:&lt;/b&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Pro:&lt;/b&gt; If we assess early, we reap the economic benefit similar to mass production. All decisions that depended on the first one can now go about their business. &lt;i&gt;Mandating that 21 is the legal age to drink&amp;nbsp;alcohol creates a simple environment for liquor stores, as opposed to, say, administering an on-the-spot test for "responsible drinking" for each customer.&lt;/i&gt;&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Con:&lt;/b&gt; Creating the signal greatly simplifies the actual state of the world. If subsequent decisions need detailed information, it won't be available. Worse, the signal may be wrong entirely.&lt;br /&gt;&lt;i&gt;&amp;nbsp;"Housing prices never fall" was an early assessment that led to a lot of unfortunate consequences.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Con:&lt;/b&gt; Signal manipulation for economic benefit (what we would call corruption in a government) can cause a wide-spread disconnect from reality. &lt;i&gt;Adopting unproven test scores as the measure of educational success creates a false economy and doesn't reflect the actual goal.&lt;/i&gt;&lt;/blockquote&gt;&lt;b&gt;Late assessment from data to signal:&lt;/b&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Pro:&lt;/b&gt; Information isn't lost before it's needed. &lt;i&gt;The example of walking in the jungle illustrates this.&lt;/i&gt;&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Pro:&lt;/b&gt; Local corruptions of signals have only local effects. &lt;i&gt;From a virus's point of view, its ever-changing protein coat means that it can't be intercepted easily. 
On the other side, those who have to decide on a vaccine have a difficult decision to make about which variants to target.&lt;/i&gt;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Con:&lt;/b&gt; It's more costly because decisions are made individually instead of in mass production style. &lt;i&gt;Every state has a different set of paperwork for allowing truckers to use the roads, which impedes commerce. Conversely, internet sales are aided by not having to worry about every locality's rules.&lt;/i&gt;&lt;/blockquote&gt;There's no one right answer. The cost for deferring assessment can be very high. The railroad owners finally created time zones to solve the problem of every town running on a different clock. A common currency is obviously good for everyone. Standardized traffic laws are a boon. Formalized ownership of property is a prerequisite for a modern society. All of these involve monological definitions that are somewhat based on early assessment of evidence and are somewhat arbitrary. In some cases, it can be &lt;i&gt;completely&lt;/i&gt; arbitrary and still immensely helpful (e.g. which side of the road we drive on). Imagine what a mess it would be if we had to survey the dialogical landscape every morning to see whether most people were driving on the right or left, stopping at green lights or red, and make our adjustments accordingly.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Applying this to Higher Education&lt;/b&gt;&lt;br /&gt;Education at all levels has to process so many students that good organization is essential. So it has to be a system, and there are going to be a lot of semi-arbitrary decisions made just to provide a workable system language. 
There are many, many of these, and those of us who labor within the system take it for granted that these things exist:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Courses&lt;/li&gt;&lt;li&gt;Grades&lt;/li&gt;&lt;li&gt;Credit-hours&lt;/li&gt;&lt;li&gt;Grade levels or course level designations&lt;/li&gt;&lt;li&gt;Set times for instruction (e.g. one hour lecture, which is really fifty minutes)&lt;/li&gt;&lt;li&gt;Set curricula&lt;/li&gt;&lt;li&gt;Degrees and diplomas&lt;/li&gt;&lt;/ul&gt;None of these have much directly to do with learning. The pressure to define what a credit hour means for online courses shows the rigidity of the system. It's worth a moment to look at one of these in detail, so let's follow that train of thought into the dialogical tunnel. Compare the Carnegie fifty-minute block class with how we naturally learn.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-1gK2iM1pTkE/TtDwwfr452I/AAAAAAAAAeA/yWZJPdHrY3M/s1600/Khan1.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="312" src="http://3.bp.blogspot.com/-1gK2iM1pTkE/TtDwwfr452I/AAAAAAAAAeA/yWZJPdHrY3M/s400/Khan1.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;The &lt;a href="http://www.khanacademy.org/"&gt;Khan Academy&lt;/a&gt;&amp;nbsp;comprises&amp;nbsp;a large group of tutorials on YouTube started by Sal Khan (who quit managing a hedge fund to do this), and now operated by an impressive &lt;a href="http://www.khanacademy.org/about/the-team"&gt;team&lt;/a&gt;. The home page claims that over 87 million lessons have been delivered, with more than 2700 videos on offer covering a range of academic subjects and levels. The &lt;a href="http://www.youtube.com/watch?v=gHTH6PKfpMc"&gt;one shown&lt;/a&gt; in the picture is about long division. You can see that it's a little less than 10 minutes long. Why ten minutes? Why not fifty minutes? 
Some are longer and some shorter, depending on the needs of the subject.&lt;br /&gt;&lt;br /&gt;The curriculum in the Khan Academy is not a set of courses with prerequisites. It's much more natural than that, using a "knowledge map" to show connections between the ideas taught in the videos. Here's a sample:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-ChwlFfoI4TE/TtEl5UA1AkI/AAAAAAAAAeI/MCR3dwjo8xc/s1600/Khan2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="356" src="http://2.bp.blogspot.com/-ChwlFfoI4TE/TtEl5UA1AkI/AAAAAAAAAeI/MCR3dwjo8xc/s640/Khan2.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;There are suggestions for prerequisites, but they are &lt;i&gt;per topic&lt;/i&gt;, not per course. Each of these has associated problems to be solved, challenges, and badges to earn. There are individualized feedback reports, and the ability for coaches to be involved.&lt;br /&gt;&lt;br /&gt;I recently signed up for a &lt;a href="http://www.ml-class.org/"&gt;free course on Machine Learning&lt;/a&gt; taught out of Stanford University. The lectures were online with a built-in quiz in each one. The videos are of varying length, but none close to fifty minutes. I skipped stuff I was already familiar with and browsed topics I didn't know as much about. Once I had to go back to the introductory material to clarify a point, had my "aha!" moment, and then forged ahead. To me this is a very natural way to learn. It's not at all systematic. 
I would never have had the time to sit through a traditional lecture course on the subject, but with the browsing ability I have with the online presentation, I can choose what I want off the menu and maximize the use of my time.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What's the point?&amp;nbsp;&lt;/b&gt;&lt;br /&gt;If we create systems that make early decisions for learners so that we can make the logistics work, we save time in administration but suffer opportunity cost for each student. This "early assessment" model leads to preemptive decisions solidified into policy and practice: courses, semesters, grades, and so on.&lt;br /&gt;&lt;br /&gt;A "late assessment" version would be to provide just-in-time instruction for each student so that it could be used in some authentic learning situation. By authentic I mean anything that doesn't cause the student to think "when will I ever use this?" &amp;nbsp;For example, individual or group projects that require learning the content, but perhaps only a piece at a time on demand.&lt;br /&gt;&lt;br /&gt;The early collapse of complexity into a simple bureaucratic language includes the factory-like quality checks that occur along the way in the form of (increasingly standardized) tests. This early assessment presents problems for consumers of the product: employers and the graduates themselves, and society as a whole. The educational system doesn't give much information about what the students have learned other than these formalized assessments. It's like the NO SNAKES TODAY THANK YOU sign.&lt;br /&gt;&lt;br /&gt;Before the Internet, late assessment methods would have been too expensive to use for the whole system. Not anymore. We have the opportunity to adopt a new model whereby we coach students on how to learn for themselves. So much the better if this learning is in the context of some interesting project that the student can show off to peers and (if desired) the world. 
The creation of a rich portfolio along the way allows employers or anyone else with access to make their own assessments.&lt;br /&gt;&lt;br /&gt;I do not think that the massive higher education system will change this radically anytime soon. Nor are we ready to make such a change. Something like the Khan Academy for every academic discipline would be a massive undertaking. Textbook publishers, if they are forward-thinking, may lead the way. Imagine an online "textbook" that was actually a web of interrelated ideas mapped out transparently, with video lectures and automated problem solvers associated. This would free up what I once called the low-bandwidth part of the course and allow for more creative official class time. Effectively, it offloads all the monologue to out-of-class time and lets you get to the dialogue directly.&lt;br /&gt;&lt;br /&gt;There is an opportunity for programs here and there to begin to experiment with this transition. As the examples in my previous articles show, this is already happening. 
In the most optimistic case, this prevents further solidification of the factory mentality in higher education by showing valid alternatives.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-7490647412243231287?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/7490647412243231287/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/assessments-signals-and-relevance.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7490647412243231287'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7490647412243231287'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/assessments-signals-and-relevance.html' title='Assessments, Signals, and Relevance'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-3OvCz0N1CMQ/Ts-y0U4EIHI/AAAAAAAAAd4/90DvLsS9ngg/s72-c/snake2.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-7237854963038597695</id><published>2011-11-22T07:13:00.001-05:00</published><updated>2011-11-26T14:25:17.345-05:00</updated><title type='text'>Tests and Dialogues</title><content type='html'>In "&lt;a href="http://highered.blogspot.com/2011/11/end-of-preparation.html"&gt;The End of Preparation&lt;/a&gt;" I argued that standardized 
tests, as they exist now, are not well suited to the task of correctly classifying the quality of the partial products we call students. Certainly the tests give us information beyond mere guessing, but the accuracy (judging from the SAT) is not high enough to support a factory-like production model. I pointed out that test makers do not usually even attempt to ascertain what the accuracy rate is. Instead we get validity reports that use a variety of associations. If we brought that idea back to the factory line, it would look something like this.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Announcing the new Auto Wiper Assessment (AWA). It is designed to test the ability of an auto to wipe water off the windshield. Its validity has been determined by high negative correlations with crash rates of autos on rainy days and low correlation on sunny days.&amp;nbsp;&lt;/blockquote&gt;On a real assembly line, the question would be as simple as &lt;i&gt;Does it work now?&lt;/i&gt;&amp;nbsp;and &lt;i&gt;Are the parts reliable enough to keep it working?&lt;/i&gt;&amp;nbsp; Both of these can be tested with high precision. And of course, we can throw water on the windshield to directly observe whether the apparatus functions as intended. Direct observation of the functional structure of learning is not possible without brain scanners. Even then, we wouldn't really know what we are looking at--the science isn't there yet. What we do know is fascinating, like the &lt;a href="http://video.nationalgeographic.com/video/player/science/health-human-body-sci/human-body/london-taxi-sci.html"&gt;London Taxi Cab study&lt;/a&gt;, but we're a long way from understanding brains the way we understand windshield wipers.&lt;br /&gt;&lt;br /&gt;Validity becomes a chicken-and-egg problem. Suppose our actual outcome is "critical thinking and complex reasoning," to pick one from &lt;i&gt;Academically Adrift&lt;/i&gt;. 
There are tests that supposedly tell us how capable students are at this, but how do we know how good the tests are? If there were already a really good way to check, we wouldn't need the test! In practice, the test-makers get away with waving their hands and pointing to correlations and factor analyses, like the Auto Wiper Assessment example above. This is obviously not a substitute for actually knowing, and it's impossible to calculate the accuracy rate from the kind of validity studies that are currently done. The SAT, as I mentioned, is an exception. This is because it &lt;i&gt;does&lt;/i&gt;&amp;nbsp;try to predict something measurable: college grades.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;This is not a great situation.&lt;/b&gt; How do we know if the test makers are selling flim-flam? In practice, I think tests have to "look good enough" to pass casual inspection, and they can amount to neo-phrenology without anyone ever knowing. How else can the vast amount of money being spent on standardized tests be explained? I'd be happy to be wrong if someone can point me to validity studies that show test classification error rates similar to the SAT's. A &lt;a href="http://en.wikipedia.org/wiki/Receiver_operating_characteristic"&gt;ROC graph&lt;/a&gt; would be nice.&lt;br /&gt;&lt;br /&gt;The argument might be that since reductionist definitions are not practical, and there really is no way to know whether a test works except through indirect indications like correlations, this is the best we can do. But it isn't. In order to support that claim, let me develop the idea by contrasting two sorts of epistemology. It's essential to the argument and also worth the exposition for its own sake. When I first encountered these ideas, they changed the way I see the world.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Monological Knowing&lt;/b&gt;&lt;br /&gt;Sometimes we know something via a simple causal mechanism: an inarguable definition. 
For example, when the home plate umpire calls a strike in a baseball game, that's what it is. It doesn't matter if the replay on television shows that the pitch was actually out of the strike zone. &amp;nbsp;Any argument about that will be in a different space--perhaps a meta-discussion about the nature of how such calls should be made. But within the game, as veteran umpire Bill Klem is quoted as saying, "It ain't nothin till I call it!"&lt;br /&gt;&lt;br /&gt;Monological definitions are generally associated with some obvious sign. An umpire jerking his clenched fist after a pitch means it was a strike. Sometimes the definitions come down to chance, as with a jury trial. In the legal system, you are guilty if the jury finds you guilty, which has only indirectly to do with whether or not you committed a crime. The unequivocal sign of your guilt is a verdict from the jury. Other examples include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Course grades, defining 'A' Student, 'B' Student etc.&lt;/li&gt;&lt;li&gt;Time on the clock at a basketball or football game, which corresponds only roughly to shared perception of time passing (perceived time doesn't stop during a time-out, but monological time can).&lt;/li&gt;&lt;li&gt;Pernicious examples of classifying a person's race, e.g. leading up to the Rwandan genocide. You are what it says you are on your documents.&lt;/li&gt;&lt;/ul&gt;Sometimes the assignments are random or arbitrary. Sometimes a single person gets to decide the classification, as with course grades. There is sometimes pressure from administrators to create easily understood algorithms for computing grades in order to handle grade appeals, but instructors usually have wide latitude in assigning what amounts to the monological achievement level of the student.&lt;br /&gt;&lt;br /&gt;I got bumped from a flight one time, and came away from the gate with the knowledge that I was "confirmed" on the next flight. That didn't mean what I thought it did, however. 
According to the airline's (monological) definition, "confirmed" means that the airline knows you are in the airport waiting, so you're a sure seat if they have an extra. It does &lt;i&gt;not&lt;/i&gt;&amp;nbsp;mean that such a seat is guaranteed for you.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Dialogical Knowing&lt;/b&gt;&lt;br /&gt;This might be more properly called polyphonic, but for the sake of parallelism, allow me the indulgence. In contrast to a monological handing down of definitions from some source, dialogical knowledge has these characteristics:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;It comes from multiple sources&lt;/li&gt;&lt;li&gt;There isn't universal agreement about it (definitions are not binding if they exist)&lt;/li&gt;&lt;li&gt;It's subjective&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Whereas there is a master copy of what a kilogram is in a&amp;nbsp;controlled&amp;nbsp;chamber in France, there is no such thing for the concept of "heavy." A load you are carrying will feel heavier after an hour than at the beginning of the hour. Furthermore, we can disagree about the heaviness. This is messy and imperfect, but very flexible because no definitions are needed. Anyone can create a dialogical concept, and it gets to compete with all the others in an ecology where the most fit survive. This fact is what prevents loose shared understanding from devolving too far into nonsense as a whole. There's plenty of nonsense (like fortune-telling), but we can communicate in a shared language very effectively even in the absence of formal definitions.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If I tell you that I liked the movie &lt;i&gt;Kung Fu Panda&lt;/i&gt;, you know what I mean. There are movies you like too, and you probably assume I feel about this movie the way you feel about those in some vague sense. You may disagree, but that's not a barrier to understanding. 
We could have a complex conversation about what constitutes a "good" movie, which doesn't have a final, monological answer. In &lt;i&gt;&lt;a href="http://zzascape.com/elephant.pdf"&gt;Assessing the Elephant&lt;/a&gt;&lt;/i&gt;&amp;nbsp;I compared this to the parable of the blind men inspecting an elephant, each sharing their own perspective. I used this as a metaphor for assessing general education outcomes, which are generally broad and hard to define monologically.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Tension between Monologue and Dialogue&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Parallel to the tension between accountability and improvement in outcomes assessment, there is a tension between monological and dialogical knowledge in any system. The demand for locked-down monological approaches is the natural consequence of being part of a system, which, as I described last time, needs to manage fuzziness and uncertainty in order to function. That's why we have monological definitions for what it means to be an adult, or "legally drunk." It makes systematization possible. Much of the time, this entails replacing a hard dialogical question ("what is an adult?") by a simple monological definition ("anyone 21 years or older"). In ordinary conversation we may switch these meanings without noticing, but sometimes the tension is obvious.&lt;br /&gt;&lt;br /&gt;The question "which candidate will do the best job in office?" gets answered by "which candidate got the most votes?" It replaces an intractable question with one that can be answered systematically in a reasonable amount of time. Of course it's an approximation of unknown validity. Monologically, the system decides on the "best" candidate, but the dialogical split on the issue can be 49% vs 51%.&lt;br /&gt;&lt;br /&gt;Someone put together a page describing the relationship between monological Starbucks definitions of drink sizes and the shared understanding of small, medium, large. 
The site, which you can find &lt;a href="http://itre.cis.upenn.edu/~myl/languagelog/archives/001677.html"&gt;here&lt;/a&gt;, is a perfect foil for this discussion. I find it hysterically funny. Here's a bit of it:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The first problem is that Starbucks is right, in a sense. I've established that asking for a "small coffee" gets you the 12-ounce size; "medium" or "medium-sized" gets you 16 ounces; and "large" gets you a 20 ounce cup. However, in absolute rather than relative terms, this is nuts. A "cup" is technically 8 ounces, and in the case of coffee, a nominal "cup" seems to be 6 ounces, as indicated by the calibrations on the water reservoirs of coffee makers, [...]&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;When a referee makes a bad call in a sports event, the crowd reacts negatively. The dialogical "fact" doesn't agree with the monological one, which is seen as artificial and not reflecting the reality of shared experience.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It may be appalling, but it makes sense that the Oxford English Dictionary now includes the word "nucular" as a synonym for "nuclear." This is the embodiment of a philosophy that the dictionary should reflect the dialogical use of language, not some monological official version.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-29cw_jIgxfo/TsvEx9TUxQI/AAAAAAAAAdo/McdkdojV6co/s1600/nucular.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="271" src="http://4.bp.blogspot.com/-29cw_jIgxfo/TsvEx9TUxQI/AAAAAAAAAdo/McdkdojV6co/s320/nucular.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In assessment, it's quite natural to fall victim to the tension between these two kinds of knowledge. 
As noted, tests of learning almost never come with warning labels that say &lt;i&gt;This test gives the wrong answer 35% of the time.&lt;/i&gt; The test doesn't have any other monological ways of knowing to compete with, other than possibly other similar tests, so by default &lt;b&gt;the test becomes the monological definition of the learning outcome&lt;/b&gt;. Because it replaces a hard question ("how well can our students think?") with an easily systematized one ("what was the test score?") it's attractive to anyone who has to watch dials and turn knobs in the system. In the classroom, however, the test may or may not have anything to do with the shared dialogical knowledge--that messy, subjective, imperfect consensus about how well students are really performing.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A Proposal to Bridge the Gap&lt;/b&gt;&lt;br /&gt;Until we better understand how brains work, it's not realistic to hope for a physiology-based monological definition of learning to emerge to compete with testing. However, it would be very interesting to see how well tests align with the shared conception of expert observers. This doesn't seem to be a standard part of validity testing in education, and I'm not sure why. It's in everyone's best interests to align the two. &lt;br /&gt;&lt;br /&gt;There is a brilliant history of this kind of research in psychology, culminating in the definition of the Big Five personality traits, which you can read about &lt;a href="http://en.wikipedia.org/wiki/Big_Five_personality_traits"&gt;here&lt;/a&gt;. From Wikipedia, here is the kernel of the idea:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Sir Francis Galton was the first scientist to recognize what is now known as the Lexical Hypothesis. This is the idea that the most salient and socially relevant personality differences in people’s lives will eventually become encoded into language. 
The hypothesis further suggests that by sampling language, it is possible to derive a comprehensive taxonomy of human personality traits.&lt;/blockquote&gt;Subjective assessments have a bad reputation in education, but the lexical hypothesis was shown to be workable in practice. It's not astounding that dialogical language has meaning, but it doesn't seem fashionable to admit it. &lt;br /&gt;&lt;br /&gt;Given all this, it's obvious that we should at least try to understand the resemblance between monological tests of "critical thinking and complex reasoning" or "effective writing" and the dialogical equivalent. It's simple and inexpensive to do this if one already has test results. All that's required is to ask people who have had the opportunity to observe students what they think. Any way it turns out, the results will be interesting.&lt;br /&gt;&lt;br /&gt;Suppose the test results align very well with dialogical perceptions. That's great--we can use either tests or subjective surveys as we prefer.&lt;br /&gt;&lt;br /&gt;If the two &lt;i&gt;don't&lt;/i&gt; align, then we have to ask who's more likely to be correct. In this case the tests lose out because of a simple fact: test scores don't matter in the real world. What does matter are the subjective impressions of those who employ our graduates or otherwise interact with them professionally. In the world beyond the academy, it's common shared perceptions that are the metric of success, and it won't do any good to point to your test scores. In fact, there is a certain &lt;i&gt;schadenfreude&lt;/i&gt; in disproving credentials, as in watching videos of graduates from Ivy U who don't know why the seasons change. It isn't just Missouri: we're a &lt;i&gt;show me&lt;/i&gt; society.&lt;br /&gt;&lt;br /&gt;You'll notice that either way, the test results are largely unneeded. This illuminates why they are being used. Self-reported dialogical assessments depend on trust. 
In theory, tests can be administered in an adversarial environment. This restates Peter Ewell's quote in my previous article. This is a recipe for optimizing irrelevance. In &lt;i&gt;Assessing the Elephant&lt;/i&gt;, I called this a degenerate assessment loop and gave this example:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;A software developer found that there were too many bugs in its products, so it began a new system of rewards. Programmers would be paid a bonus for every software bug they identified and fixed. The number of bugs found skyrocketed. The champagne was quickly put back on ice, however, when the company realized that the new policy had motivated programmers to create more bugs so that they could “find” them.&lt;/blockquote&gt;Similar "degenerate" strategies find their way into educational practices because of the economic value placed on monological simplifications used in low-trust settings. We read about them in the paper sometimes.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Surveying the Dialogical Landscape&lt;/b&gt;&lt;br /&gt;I have implemented surveys at two institutions to gather faculty ratings of student learning outcomes. I have many thousands of data points, but no standardized test scores to compare them to, so I can't check the alignment as I described above. The reliability of these ratings is: about a 50% probability of exact match on a four-point scale for the same student, same semester, same learning outcome, with different instructors. I've already written extensively about that, for example &lt;a href="http://zzascape.com/elephant.pdf"&gt;here&lt;/a&gt; and &lt;a href="http://highered.blogspot.com/search?q=FACS"&gt;on this blog&lt;/a&gt;, as well as some chapters in assessment books, which you can find on my &lt;a href="http://www.zzascape.com/Resume.rtf"&gt;vita&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;br /&gt;In a tight system, monological approaches are useful. 
The human body is a good example of this, but we should note that at least two important systems are more dialogical than monological: the immune system and the conscious mind. The world beyond graduation resembles a competitive ecology more like what the immune system faces than a systematic by-the-numbers existence like a toenail.&lt;br /&gt;&lt;br /&gt;The only reason to use monological tests is if we don't trust faculty. This can't even be done with any intellectual honesty because we can't say that the tests are any good. What I proposed in "The End of Preparation" is that we move to dialogical methods of assessment throughout and beyond the academy. These can still be summarized for administrators to look at, but only if there is trust at all levels. And really, if there is no trust between faculty and administration, the whole enterprise is doomed.&lt;br /&gt;&lt;br /&gt;The mechanism of using public portfolios showing student records of performance can be purely dialogical--a student's work can have different value to different observers inside and outside the academy. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Next time I'll address what all this has to do with rubric-based assessment.&lt;br /&gt;&lt;br /&gt;[Next article in this series: "&lt;a href="http://highered.blogspot.com/2011/11/assessments-signals-and-relevance.html"&gt;Assessments, Signals, and Relevance&lt;/a&gt;"]&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Some Frivolous Thoughts&lt;/b&gt;&lt;br /&gt;As I said, this dichotomy changed the way I think about the world, and I find interesting tidbits everywhere. One interesting idea is the hypothesis that as a domain of interest becomes more reliably theoretical (like alchemy becoming chemistry), the nomenclature transitions from descriptive and dialogical to arbitrary and monological. I went poking through several dictionaries looking for evidence of the names of the elements, to find examples. 
Copper may be an instance, perhaps having been named for Cyprus, as in "Cyprian metal." If the name is too old, the etymology is foggy. Steel is more recent, and it seems to derive from a descriptive Germanic word for stiff. Compare that to Plutonium, which is modern and non-descriptive. Of course, with arbitrary naming, the namer can &lt;i&gt;choose &lt;/i&gt;to be descriptive, as Radium arguably is. This thesis needs some work.&lt;br /&gt;&lt;br /&gt;In biology, Red-Winged Blackbird is a descriptive name for the monological &lt;i&gt;&lt;span class="binomial"&gt;Agelaius phoeniceus&lt;/span&gt;&lt;/i&gt;&lt;span class="binomial"&gt;. In a good theory, it doesn't matter what you call something. What matters is the relationships between elements, like the evolutionary links between bird species as laid out in a cladistic family tree. Modern scientists are more or less free to name new species or sub-atomic particles whatever they want. Organic chemistry is an interesting exception, because the names themselves are associated with composition. They are simultaneously descriptive and monological. &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Drug names are particularly interesting. Viagra, for example, has a chemical name that describes it, but that obviously wouldn't do for advertising purposes. Here's what &lt;a href="http://www.medscape.com/viewarticle/414871_5"&gt;one source&lt;/a&gt; says about the naming process:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Drug companies use several criteria in selecting a brand name. First and foremost, the name must be easy to remember. Ideally, it should be one physicians will like -- short and with a subliminal connotation of the drug. Some companies associate their drugs with certain letters (e.g., Upjohn with &lt;em&gt;X&lt;/em&gt; and Glaxo with &lt;em&gt;Z&lt;/em&gt;). If the drug is expected to be used eventually on a nonprescription basis, the name should not sound medicinal. 
There must be no trademark incompatibilities, and the company must take account of the drug's expected competition.&lt;/blockquote&gt;It sounds like the name is chosen to fit neatly into a dialogical ecology.&lt;br /&gt;&lt;br /&gt;The history of the SAT's name is interesting from this perspective, but I will bring this overlong article to a close.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Acknowledgments:&lt;/b&gt; The idea for the monological/dialogical dichotomy came out of conversations with Dr. Adelheid Eubanks about her research on &lt;a href="http://en.wikipedia.org/wiki/Mikhail_Bakhtin"&gt;Mikhail Bakhtin&lt;/a&gt;. I undoubtedly have mangled Bakhtin's original ideas, and neither he nor Adelheid should be held responsible for that.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-7237854963038597695?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/7237854963038597695/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/tests-and-dialogues.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7237854963038597695'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7237854963038597695'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/tests-and-dialogues.html' title='Tests and Dialogues'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-29cw_jIgxfo/TsvEx9TUxQI/AAAAAAAAAdo/McdkdojV6co/s72-c/nucular.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6622007522200024152</id><published>2011-11-20T05:20:00.001-05:00</published><updated>2011-11-22T14:17:15.429-05:00</updated><title type='text'>The End of Preparation</title><content type='html'>A few days ago, I wrote "&lt;a href="http://highered.blogspot.com/2011/11/perilous-tail.html"&gt;A Perilous Tail&lt;/a&gt;" about problems with the underlying distributions of measurements (in the sense of observations turned into numbers) we employ in education. I've &lt;a href="http://highered.blogspot.com/2011/09/economics-of-imperfect-tests.html"&gt;shown previously&lt;/a&gt; that at least for the SAT, where there is data to check, predictive validity is not very good: we can only classify students correctly 65% of the time. When this much random chance is involved in decision-making, the effect is that we can easily be fooled. Here I cited the example that might lead us to believe that yelling at dice improves their "performance." It's also unfair to hold individuals accountable if their performance is tied to significant factors they can't control, and it invites cheating and finger-pointing.&lt;br /&gt;&lt;br /&gt;As I mentioned in the previous article, I have a solution to propose. But I haven't really done the problem justice yet.&amp;nbsp; We encounter randomness every day, so how do we deal with it? Functioning systems have a hard time with too much randomness, so the systematic response is to manage it by reducing or avoiding uncertainties, and when that cannot be done, we might imagine it away (for example, throwing salt over one's shoulder to ward off bad luck). 
Many of our ancestors undoubtedly faced a great deal of uncertainty about what they would be able to eat, so much of our mental and physical activity and the very way our bodies are constructed, has to do with finding suitable food (which can be of various sorts) and consuming it effectively. Compare that with the homogeneous &lt;i&gt;internal &lt;/i&gt;system of energy delivery that feeds and&amp;nbsp;oxygenates&amp;nbsp;the cells in our body via our circulatory system. Complex systems often take messy stuff from the environment and then organize it for internal use. I will use a mass-production line like the one pictured below for such a system.&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://upload.wikimedia.org/wikipedia/commons/thumb/5/53/Consolidated_TB-32_production_line.jpg/220px-Consolidated_TB-32_production_line.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/5/53/Consolidated_TB-32_production_line.jpg/220px-Consolidated_TB-32_production_line.jpg" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Source: &lt;a href="http://en.wikipedia.org/wiki/Mass_production"&gt;Wikipedia&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The idea is to find materials in the environment that can be used to implement a plan, turning rocks into aluminum and tree sap into rubber, sand into glass, and heating, squashing, and otherwise manipulating raw natural resources until they come to create an airplane. This process is highly systematized so that parts are&amp;nbsp;interchangeable, and the reliability of each step can be very high. 
Motorola invented the concept of &lt;a href="http://en.wikipedia.org/wiki/Six_Sigma"&gt;Six Sigma&lt;/a&gt; to try to reduce the randomness in a manufacturing process to&amp;nbsp;negligible&amp;nbsp;amounts. This is at least theoretically possible in physical systems that have reliable mechanical properties.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What do we do when randomness can't be eliminated from the assembly line?&lt;/b&gt; One approach is to proceed anyway, because assembly lines have great economies of scale, and can perhaps be useful even if a large number of faulty items are produced. Computer chip makers have to deal with a certain percentage of bad chips at the end of the line, for example. When chemists make organic molecules that randomly choose a left-right symmetry (i.e. chirality), sometimes they have to throw away half of the product, and there's no way around it.&lt;br /&gt;&lt;br /&gt;The educational system in the United States has to deal with a great deal of variability in the students it gets as inputs and the processes individuals experience. It superficially resembles a mass production line. There are stages of completion (i.e. grade levels), and bits of assembly that happen in each one. There are quality checks (grades and promotion), quality assurance checks (often standardized tests), and a final stamp of approval that comes at the end (a diploma).&lt;br /&gt;&lt;br /&gt;All this is accomplished while largely ignoring the undeniable fact that students are not standardized along a common design, and their mental machinery cannot be engineered directly the way an airplane can be assembled. In short, the raw material for the process is mind-bendingly more complex than any human-made physical device that exists. &lt;br /&gt;&lt;br /&gt;Because of the high variability in outcomes, the means we use for quality assurance is crucially important, and this is where we have real opportunities to improve. This is the assessment problem. 
Current methods of assessment look a lot like factory floor assessments: the result of analyzing student performance is often a list of numbers that can be&amp;nbsp;aggregated&amp;nbsp;to show the executives how the line is working. Rewards and punishments may be meted out accordingly. In a real factory, the current stage of production must adequately prepare the product for the next stage of production. We must be able to correctly classify parts and pieces as "acceptable" or "not acceptable" according to whether or not they will function as required in the whole assembly. The odd thing about educational testing is that this kind of question doesn't seem to be asked and answered in a way that takes test error into account. Randomness is simply wished away as if it didn't exist. In the case of the SAT (see &lt;a href="http://highered.blogspot.com/2011/09/sat-error-rates.html"&gt;here&lt;/a&gt;), the question might be "is this student going to work out in college?" In practical terms this is defined as earning at least a B- average the first year (as defined by the &lt;a href="http://professionals.collegeboard.com/profdownload/RR2011-5.pdf"&gt;College Board's benchmark&lt;/a&gt;). To their credit, the College Board published the answer, but this transparency is exceptional. Analyzing test quality in this way proceeds like this:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;State the desired observable future effect of the educational component under review.&lt;/li&gt;&lt;li&gt;Compare test scores with actual achievement of the outcome. What percentage succeeded at each score?&lt;/li&gt;&lt;li&gt;Find a suitable compromise between true positive and true negative outcomes to use as your benchmark.&lt;/li&gt;&lt;li&gt;Publish the true positive and true negative prediction rates based on that benchmark.&lt;/li&gt;&lt;/ol&gt;To repeat, the College Board has done this, and the answer is that the SAT benchmark gives the right answer 65% of the time. 
This would make you rich if we were predicting stock prices, but it seems awfully low for a production line quality check.&lt;br /&gt;&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://upload.wikimedia.org/wikipedia/commons/thumb/3/31/United_States_Army_Air_Forces_Recruting_Poster_-_1.jpg/220px-United_States_Army_Air_Forces_Recruting_Poster_-_1.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="320" src="http://upload.wikimedia.org/wikipedia/commons/thumb/3/31/United_States_Army_Air_Forces_Recruting_Poster_-_1.jpg/220px-United_States_Army_Air_Forces_Recruting_Poster_-_1.jpg" width="207" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Source: &lt;a href="http://en.wikipedia.org/wiki/Martin_B-26_Marauder"&gt;Wikipedia&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;Because we can't make the randomness go away, we imagine it away. So the assessments become &lt;i&gt;de facto&lt;/i&gt;&amp;nbsp;the measure of quality, and the quality of the tests themselves remains unexamined. In a real assembly line, an imperfect test would be found out eventually when the planes didn't perform as expected. Someone would notice and eventually track it down to a problem with the quality assurance program. There is so much uncertainty in education that this isn't possible, and the result is truly ironic: the deeper insinuation of tests that are unaccountable for their results. To be clear: any quality test that does not stand in for a clear predictive objective &lt;i&gt;and&lt;/i&gt;&amp;nbsp;provide research for its rate of correct classification in actual practice, is being used simply on faith. To be fair, it's virtually impossible to meet this bar. 
That excuse doesn't make the problem go away, however--it just makes it worse. One result is that test results distort perceptions of reality. If appearance is taken at face value for reality, then appearance has great economic value. There is incentive to do whatever it takes to get higher test scores, with unfortunate and predictable results.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;To sum up the problem:&lt;/b&gt; variability is too high for good standardized tests to support an educational assembly line, and this fact is generally ignored for convenience.&lt;br /&gt;&lt;br /&gt;I don't mean to imply that the major actors in education are&amp;nbsp;incompetent. We are where we are because of historical factors that make sense as a system in evolution, and we have the means to take the next step.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The real world is not an assembly line. &lt;/b&gt;There is this expression we use when talking to students, "when you get out into the real world...", as if the academy were a walled garden that excludes the vulgar world at large. This, too, is a factory mentality. The students have heard it all before. From kindergarten on up, they hear stories about how hard it's going to be "when you get to high school," or "when you get to college." My daughter marveled at this during her first few weeks of high school. She was amazed and somewhat appalled that her middle school teachers had misled her about this. Of &lt;i&gt;course&lt;/i&gt; it's not much harder--the system can only work with a high degree of integration, smoothing out the hard bits. If the wing assembly is slowing down the production line, then it needs attention. One could argue that the whole path from kindergarten through doctorate is becoming a smooth one for anyone who wants to tread it.&lt;br /&gt;&lt;br /&gt;But the real world &lt;i&gt;really is&lt;/i&gt;&amp;nbsp;different. The assembly line stops at the hangar door, and the planes are supposed to be ready to fly. 
The tests don't matter anymore. No one is going to check their validity, nor delve too deeply into what a certification means after graduation. And in the real world, the factory mentality has to be unlearned: one cannot cram all night just before promotions are announced in order to game the system.&lt;br /&gt;&lt;br /&gt;One solution is to try to change the real world to be more like the educational system. This is a practical choice for a military career, perhaps, where strict bureaucracy is essential to function. But it certainly is a mismatch for an&amp;nbsp;entrepreneurial&amp;nbsp;career, the engine of the nation's economic growth.&lt;br /&gt;&lt;br /&gt;I believe it is now a reasonable and desirable choice to go the other direction, and change the assembly line to look more like the real world. The most important aspect to change is to admit uncertainty and begin to take advantage of it. This means we have to forget about the idea of standardizing and certifying. I will argue that we can do this to our great advantage, and introduce efficiencies into the economic structure of the nation in the process. Currently we pass our uncertainties on to employers. We hide behind test results and certificates, and leave it to employers to actually figure out what all that means. The result is that they have only very crude screening information at their disposal, and have to almost start from scratch to see what a graduate can actually do. The Internet can change all that. 
To my surprise, I discovered last week that it already is.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://digication.com/images/logo.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="40" src="http://digication.com/images/logo.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;I had written a draft of this article last week, but when I ran it through my BS-detector, I couldn't bring myself to publish it. The reason is simple: it's just another untested idea, or so I thought. I hadn't actually employed the solution I will describe below, and so I didn't have anything concrete to show. But by coincidence, I saw exactly what I was looking for at the Virginia Assessment Group conference, at a presentation by Jeffrey Yan, who is the CEO of &lt;a href="http://digication.com/"&gt;Digication&lt;/a&gt;. I didn't know about the company before last week. Vendors in higher education technology solutions may be excused perhaps for exaggerating the effectiveness of their products, and I generally think those products are overpriced, too complicated, and too narrowly focused. I didn't have great expectations for Jeffrey's talk, but after about ten minutes I realized that he was showing off a practical application of what I was hypothesizing.&lt;br /&gt;&lt;br /&gt;Digication is an eportfolio product. In what follows, I will not attempt to describe it as a software review would, but as it fits into the flow of ideas in this article.&lt;br /&gt;&lt;br /&gt;The main idea is simple:&lt;b&gt; instead of treating students as if we were preparing them for a future beyond the academy, treat them as if they were already there.&lt;/b&gt; In the real world, as it's called, our careers are not built on formalized assessments. To be sure, we have to deal with them in performance reviews or board certifications, but these are mostly barriers to success, not guarantees of it. 
Instead, it's the record of accomplishment we create as we go that matters. In many instances, promotions and accolades are inefficiently distributed, based on personal relationships and tenure, rather than merit, but this is not what we should aspire to. In fact, these imperfections are vulnerable to the sort of transparency that is within our grasp.&lt;br /&gt;&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="https://stonybrook.digication.com/files/M6d8274848df953a2078ca31dda173250.gif" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="144" src="https://stonybrook.digication.com/files/M6d8274848df953a2078ca31dda173250.gif" width="200" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Senior Design Project at Stonybrook&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;In his presentation, Jeffrey showed examples of the sort of thing that's possible. Take a look at this &lt;a href="https://stonybrook.digication.com/biodesign_group_5/need//"&gt;senior design project&lt;/a&gt;&amp;nbsp;at &lt;a href="http://www.stonybrook.edu/"&gt;Stonybrook University&lt;/a&gt;. It's a real world project to design a new sort of&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Sphygmomanometer"&gt;sphygmomanometer&lt;/a&gt;&amp;nbsp;(blood pressure meter). Quoting from the project page:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;i&gt;We aim to satisfy all the customer needs by designing a [blood pressure measuring] device that translates the vibrations into a visual indication of blood pulses, more specifically the first pulse to force its way through the occluded artery (systolic) and the last pulse detectable before laminar flow is regained (diastolic).  
&lt;/i&gt;&lt;/blockquote&gt;&lt;a href="http://4.bp.blogspot.com/-4PgKAicRrX4/TskPOsQrsBI/AAAAAAAAAdg/r_XitRDP4Hw/s1600/math.PNG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://4.bp.blogspot.com/-4PgKAicRrX4/TskPOsQrsBI/AAAAAAAAAdg/r_XitRDP4Hw/s320/math.PNG" width="320" /&gt;&lt;/a&gt;Another showcased student portfolio was from a second year student at the same institution, who created a public portfolio to tell the world about his interests and abilities. He shows how to solve what we call a difference equation (similar to a differential equation) using combinatoric methods &lt;a href="https://stonybrook.digication.com/mo_lam/recursive_1"&gt;here&lt;/a&gt;. This shows an interest in&amp;nbsp;versatility&amp;nbsp;in the subject that cannot be communicated with a few numbers in an assembly-line type report.&lt;br /&gt;&lt;br /&gt;By concentrating on &lt;i&gt;authentic evidence of accomplishment&lt;/i&gt;, rather than artificially standardized means of observation, we create an important opportunity: a public portfolio can be judged on its own merits, rather than via an uncertain intermediary. It's the difference between seeing a movie yourself and knowing only that it got three and a half stars from some critic.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The solution to the factory mentality presents itself.&lt;/b&gt;&amp;nbsp;If students see that they are working for themselves and not as part of some unfathomable assembly process, accumulating what will become a public portfolio of their accomplishments, their learning becomes transparent. They can directly compare themselves to peers in class, peers at other institutions, graduates from all over, and professionals in the field. 
I imagine this leading to a day when it's simply unthinkable for &lt;i&gt;any professional&lt;/i&gt;&amp;nbsp;not to have an up-to-date professional eportfolio linked to his or her professional social networking presence (see &lt;a href="http://mathoverflow.net/"&gt;mathoverflow.net&lt;/a&gt;, &lt;a href="http://academia.edu/"&gt;Academia.edu&lt;/a&gt;, and &lt;a href="http://linkedin.com/"&gt;LinkedIn.com&lt;/a&gt;&amp;nbsp;as examples of such networks). Once started, the competitive edge enjoyed by those with portfolios will become obvious--you can learn much more from a transparent work history than you can from a resume.&lt;br /&gt;&lt;br /&gt;While in school, of course, some work, maybe much of it, needs to be private, to gestate ideas before presenting them to the world. But the goal should be for a forward-looking institution of higher education to begin to create public sites like the &lt;a href="https://stonybrook.digication.com/spring_colloquium/About_the_Showcase"&gt;Stonybrook showcase&lt;/a&gt; and the one at &lt;a href="https://lagcc-cuny.digication.com/portfolio/directory.digi"&gt;LaGuardia Community College&lt;/a&gt;. Ultimately, universities need to hand the portfolios off to the students to develop as their respective careers unfold. I understand that graduates get to keep their portfolios and continue to develop them with Digication's license, as long as it is maintained.&lt;br /&gt;&lt;br /&gt;Here's the manifesto version: &lt;br /&gt;&lt;blockquote class="tr_bq"&gt;We don't need grades. We don't need tests or diplomas or certificates or credit hours. None of that matters except insofar as it is useful to internal processes that may help students produce authentic evidence of achievement. That, and that alone, is how they should be judged by third parties. 
&lt;/blockquote&gt;Some advantages of switching from "assemble and test" to authentic work that is self-evidently valuable:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;We change student mentality from "cram and forget" to actual accomplishment. We can make the question "when will I ever really use this stuff?" go away.&lt;/li&gt;&lt;li&gt;The method of assessing a portfolio is deferred to the final observer. You may be interested in someone else's opinion or you may not be. It's simply there to inspect. Once this is established, third parties will&amp;nbsp;undoubtedly&amp;nbsp;create a business out of rating portfolios for suitability for your business if you're too busy to do it yourself.&lt;/li&gt;&lt;li&gt;Instead of just a certificate to carry off at graduation, students could have four years' worth of documentation on their authentic efforts. This idea is second nature to a generation who grew up blogging and posting YouTube videos.&lt;/li&gt;&lt;li&gt;It doesn't matter where you learned what. A student who masters quantum mechanics by watching MIT or Khan Academy videos might produce work as good as that of someone sitting in class. It makes a real meritocracy possible. &lt;/li&gt;&lt;li&gt;Intermediate work matters. Even if someone never finishes a degree, they have evidence beyond a list of grades that they learned something. And it's in rich detail.&lt;/li&gt;&lt;/ol&gt;There's more than this, actually. The very nature of publishing novel work is changing. At present, the&amp;nbsp;remnants&amp;nbsp;of paper-bound publication, with its interminable delays,&amp;nbsp;exorbitant&amp;nbsp;costs, virtual inability to correct errors, and tightly bound intellectual property issues, are still around. But it's dying. &amp;nbsp;A journal is nothing more than a news aggregator, and those are now ubiquitous and free. It's hard to say what the final shape of publishing will be, but something like a standardized portfolio will probably be front and center. 
When I say 'standardized', I mean containing certain key features like metadata and historical archive, so that you can find things, cross-reference, and track changes. As the professional eportfolio develops, it will need help from librarians to keep it all straight, but this can be done at a much lower cost than the publishing business now incurs in lost productivity, restricted access, and cost to libraries.&lt;br /&gt;&lt;br /&gt;The focus will, I believe, shift from journals and other information aggregators, to the individuals producing the work. And institutions will share in some of the glory if part of the portfolio was created under their care.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;All of this has been around for a while, of course.&lt;/b&gt;&amp;nbsp;Eportfolios are almost old news in higher education, and I've blogged about them before. My previous opinion was that there was no need for a university to invest in its own portfolio software because everything you already need is on the web. If you want to put a musical composition on the web, just use &lt;a href="http://noteflight.com/"&gt;Noteflight&lt;/a&gt;, and of course there's YouTube for videos, and so on. All that's needed is a way to keep track of hyperlinks to these in a way that can allow instructors to annotate as needed with rubrics and such. The demos convinced me, however, that having a standard platform that can be easily accessible for private, class-wide, collaborative, or public use is worth paying for. I don't know how much it costs in practice, but there is value beyond what one can get for free on the Internet.&lt;br /&gt;&lt;br /&gt;Portfolios have been incorporated here and there as just another part of the machinery, amounting to a private repository of student work that can be used for rubric ratings to produce more or less normalized ratings of performance--an advanced sort of grading. 
This is useful as a formative means of finding all sorts of pedagogical and program strengths and weaknesses. The point of this article is not that portfolios are a better way to produce test-like scores, but that the test scores themselves will become obsolete as external measures of performance. For professors to get feedback on student performance, and for the students themselves to hear directly what the professors and their peers think, is invaluable. It's essential for teaching and learning. But it's downright destructive to use this as a summative measure of performance, for example for holding teachers accountable. The instant you say "accountability," no one trusts anyone else, and there really is no way to run the enterprise but as a factory, with inspectors enforcing every policy. It cannot work in the face of the uncertainties inherent to the inputs and outputs of education.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;There is a history of tension in higher education&lt;/b&gt; between the desire for authenticity and the simultaneous wish for factory-like operational statistics that show success or failure. The Spellings Commission Report has a nice sidebar about &lt;a href="http://www.neumont.edu/results/industry_demand.html"&gt;Neumont University&lt;/a&gt; and mentions their portfolio approach (their showcase is &lt;a href="http://www.neumont.edu/projectshowcasegallery/index.html"&gt;here&lt;/a&gt;), but can't tear itself away from standardized approaches to learning assessment. Three years before, the Council for Higher Education&amp;nbsp;Accreditation&amp;nbsp;beautifully illustrated the tension:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;[I]t is imperative for accrediting organizations–as well as the institutions and programs&lt;br /&gt;they accredit–to avoid narrow definitions of student learning or excessively standardized&lt;br /&gt;measures of student achievement. 
Collegiate learning is complex, and the evidence used&lt;br /&gt;to investigate it must be similarly authentic and contextual. But to pass the test of public&lt;br /&gt;credibility–and thus remain faithful to accreditation’s historic task of quality assurance –&lt;br /&gt;the evidence of student learning outcomes used in the accreditation process must be&lt;br /&gt;rigorous, reliable, and understandable.&lt;/blockquote&gt;This is from CHEA's 2003 paper "&lt;a href="http://www.chea.org/pdf/StmntStudentLearningOutcomes9-03.pdf"&gt;Statement Of Mutual Responsibilities&amp;nbsp;for Student Learning Outcomes:&amp;nbsp;Accreditation, Institutions,&amp;nbsp;and Programs&lt;/a&gt;." &amp;nbsp;More recently, Peter Ewell wrote "&lt;a href="http://www.learningoutcomeassessment.org/documents/PeterEwell_005.pdf"&gt;Assessment, Accountability, and Improvement: Revisiting the Tension&lt;/a&gt;" as the first Occasional Paper for the &lt;a href="http://www.learningoutcomeassessment.org/"&gt;National Institute for Learning Outcomes Assessment&lt;/a&gt;, in which he illuminates the game-theoretic problem I alluded to above:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Accountability requires the entity held accountable to demonstrate, with&amp;nbsp;evidence, conformity with an established standard of process or outcome. The&amp;nbsp;associated incentive for that entity is to look as good as possible, regardless of&amp;nbsp;the underlying performance. Improvement, in turn, entails an opposite set&amp;nbsp;of incentives. Deficiencies in performance must be faithfully detected and&lt;br /&gt;reported so they can be acted upon. Indeed, discovering deficiencies is one of&amp;nbsp;the major objectives of assessment for improvement.&lt;/blockquote&gt;In a real factory setting, tests of mechanical process can be very precise, eliminating the difference between what the assessment folks call formative (used to ferret out useful improvements) and summative (an overall rating of quality). 
If a machine is supposed to produce 100 widgets per hour, and it's only producing 80, it's clear what the deficit is, and the mechanic or engineer can be called in to fix it. But when one is held accountable for numbers like standardized test results that have a considerable amount of uncertainty (which itself is probably unknown, as I pointed out before), the game is very different. It is less like a factory and more like going to market with a bag of some good and some counterfeit coins, which I described in "&lt;a href="http://highered.blogspot.com/2011/09/economics-of-imperfect-tests.html"&gt;The Economics of Imperfect Tests&lt;/a&gt;." One's optimal strategy has less to do with good teaching than with manipulating the test results any way one can. Unfortunate &lt;a href="http://www.washingtonpost.com/blogs/blogpost/post/aps-atlanta-public-schools-embroiled-in-cheating-scandal/2011/07/11/gIQAJl9m8H_blog.html"&gt;examples&lt;/a&gt; of that have made national news in K-12 education.&lt;br /&gt;&lt;br /&gt;My proposal is that we in higher education take a look at what Stonybrook and others are doing, and see if there is not merit to an emphasis on authentic student learning outcomes, showcased when appropriate for their and our benefit. That we don't consider a grade card and a diploma an adequate take-away from four years and a hundred thousand dollars of investment. That instead, we help them begin to use social networking in a professional way. Set them up with a LinkedIn account during the orientation class--why not? Any sea change from teach/test/rinse/repeat to more individual and meaningful experiences will be difficult for most, but I believe there will be a payoff for those who get there first. Showing student portfolios to prospective students as well as prospective employers creates a powerful transparency that will inevitably have valuable side effects. Jeffrey said that some of the portfolios get &lt;i&gt;millions &lt;/i&gt;of Internet views. 
How many views does a typical traditional assignment get? A handful at most, and maybe only one.&lt;br /&gt;&lt;br /&gt;The odd thing is that this idea is already quietly in place and old hat in the fine arts, performing arts, and architecture departments, and there are probably some I'm not aware of. Who would hire a graphic designer without seeing her portfolio, even if she had a wonderful-looking diploma? This means that we probably have experts already on campus. Computer Science is a natural fit for this too, and there's already a professional social network set up at &lt;a href="http://stackoverflow.com/"&gt;Stackoverflow.com&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A good first step &lt;/b&gt;would be to allow portfolio galleries to count for outcomes assessment results in the &lt;a href="http://www.voluntarysystem.org/index.cfm"&gt;Voluntary System of Accountability&lt;/a&gt;&amp;nbsp;(VSA). Currently, the only way to participate is to agree to use standardized tests. From the &lt;a href="http://www.voluntarysystem.org/participants/signup.cfm"&gt;agreement&lt;/a&gt;'s provision 17:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Participate in the VSA pilot project to measure student learning outcomes by selecting one of three tests to measure student learning gains.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;a)     Collegiate Assessment of Academic Proficiency (CAAP) – two modules: critical thinking and writing essay - http://www.act.org/caap/.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;b)     Collegiate Learning Assessment (CLA) – including performance task, analytic writing task - http://www.cae.org/content/pro_collegiate.htm.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;c)     ETS Proficiency Profile (formerly known as MAPP) – two sub scores of the test: critical thinking and written communication - http://www.ets.org/.  
Either the Standard or the Abbreviated form can be used.&lt;/blockquote&gt;The VSA is a wonderful program, but it is handicapped by this requirement. If you already use one of these tests, that's fine, but it's expensive and a distraction if you don't find them useful. More to the point of this article, there is no option on the list to report authentic outcomes. Adopting another pilot project to see how far the public portfolio idea will sail would be a great addition.&lt;br /&gt;&lt;br /&gt;[The next article in this series is "&lt;a href="http://highered.blogspot.com/2011/11/tests-and-dialogues.html"&gt;Tests and Dialogues&lt;/a&gt;"] &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Acknowledgements&lt;/b&gt;: Thanks to Jeffrey Yan for letting me chew his ear off after his presentation. And thanks to the coordinators of the Virginia Assessment Group for putting that wonderful event together.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Disclaimer:&lt;/b&gt; I have no financial interest in any of the companies mentioned in this article.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6622007522200024152?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6622007522200024152/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/end-of-preparation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6622007522200024152'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6622007522200024152'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/end-of-preparation.html' title='The End of 
Preparation'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-4PgKAicRrX4/TskPOsQrsBI/AAAAAAAAAdg/r_XitRDP4Hw/s72-c/math.PNG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1019897282842421256</id><published>2011-11-16T16:43:00.001-05:00</published><updated>2011-11-16T17:51:45.403-05:00</updated><title type='text'>A Perilous Tail</title><content type='html'>&amp;nbsp;A certain kind of intellectual honesty seems to be critical to systems that want to survive. Even without the subtleties I &lt;a href="http://highered.blogspot.com/2011/11/self-limiting-intelligence.html"&gt;discussed earlier&lt;/a&gt;, it's obvious that a system that ignores reality can only survive as long as the environment is completely benign. By coincidence, I came across Daniel Kahneman's &lt;em&gt;&lt;a href="http://www.amazon.com/gp/product/B00555X8OA/ref=pd_lpo_k2_dp_sr_3/179-6326428-7379959?pf_rd_m=ATVPDKIKX0DER&amp;amp;pf_rd_s=lpo-top-stripe-1&amp;amp;pf_rd_r=1T062X10XMYQED3KJWZ3&amp;amp;pf_rd_t=201&amp;amp;pf_rd_p=486539851&amp;amp;pf_rd_i=0374275637"&gt;Thinking, Fast and Slow&lt;/a&gt;&lt;/em&gt;, which catalogs a number of ways in which we humans can fool ourselves. One instance of this is particularly relevant to training, management, and education. It occurs in any rating of performance that involves some element of luck.&lt;br /&gt;Dr. Kahneman describes an episode with military training instructors, where he was talking about studies that show positive reinforcement is the key to better learning. 
This point of view was flatly contradicted by his audience, who claimed the following (my description):&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;When a cadet does a bad job at something, I yell at him. Usually he does better the next time. If I praise him for doing a good job, his performance almost always declines the next time!&lt;/blockquote&gt;The author describes this as an ah-ha! moment for him. The solution to this paradox is cleverly described in the book. Here's my version.&lt;br /&gt;&lt;br /&gt;As we have seen with &lt;a href="http://highered.blogspot.com/2011/09/sat-error-rates.html"&gt;SAT scores&lt;/a&gt;, the predictive validity of even well-researched tests can be poor (65% correct classification in the case of the SAT benchmark). The remaining variance may as well be chalked up to chance unless we have more information to bring to bear. In addition to measurement error, there can be chance involved in the performance itself. That is, many unpredictable environmental variables may come to bear on the outcome. Baseball games, for example, have a large amount of luck injected into the outcome, so that only over a large number of games does relative performance actually reveal itself (see &lt;em&gt;&lt;a href="http://www.amazon.com/Moneyball-Winning-Unfair-Game-ebook/dp/B000RH0C8G/ref=sr_1_1?s=digital-text&amp;amp;ie=UTF8&amp;amp;qid=1321480413&amp;amp;sr=1-1"&gt;Moneyball&lt;/a&gt; &lt;/em&gt;by Michael Lewis).&lt;br /&gt;&lt;br /&gt;When luck is involved, something called &lt;a href="http://en.wikipedia.org/wiki/Regression_toward_the_mean"&gt;regression to the mean&lt;/a&gt; happens--exceptional events are usually followed by unexceptional ones. To make this clear, you can do the following experiment, to mimic the drill instructors' experience. You need some dice.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Roll two dice and add. Consider higher sums as a good performance, and lower sums as poor. 
Feel free to strongly admonish the dice when they roll low numbers like 2 or 3, and lavish praise on them when they roll 11s and 12s.&amp;nbsp;You'll find that the stern words work wonders--the rolls almost always improve afterwards! On the other hand, the praise is counter-productive since 11s and 12s are usually followed by lower rolls.&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;We can imagine most events happening in the 'fat part' of a bell curve, and we are generally ill-equipped to encounter events far out on the tails, which by definition are very rare. It's not just the paradox described above. Nassim Taleb wrote a whole book about this called &lt;em&gt;&lt;a href="http://www.amazon.com/Black-Swan-Improbable-Robustness-ebook/dp/B00139XTG4/ref=sr_1_1?s=digital-text&amp;amp;ie=UTF8&amp;amp;qid=1321481296&amp;amp;sr=1-1"&gt;The Black Swan&lt;/a&gt;&lt;/em&gt;. Other, more speculative thinkers, have imagined thus: &lt;br /&gt;&lt;blockquote class="tr_bq"&gt;If you assembled all the humans who ever have or ever will live into a distribution according to when they were born, the curve would likely look like some kind of hump with tails on both sides. If you chose one human&amp;nbsp;at random,&amp;nbsp;he or she&amp;nbsp;would likely come from the fat part of the curve. Therefore, that's where we likely are at this moment in time--that is, we would expect to be typical rather than exceptional. If this is true, then it puts probabilistic bounds on our expectations for the duration of human civilization. The math varies, depending on your assumptions, but something like a few thousand years would be a reasonable upper bound using this method. 
&lt;/blockquote&gt;In business, there's an idea called &lt;a href="http://en.wikipedia.org/wiki/Six_sigma"&gt;Six Sigma&lt;/a&gt; that is supposed to reduce process errors to an infinitesimal fraction (six sigma means six standard deviations from the mean, which for a Normal distribution is an exceedingly small percentage: about 4 errors per million attempts). Yesterday someone suggested to me that we might use Six Sigma in higher education. I laughed, not because there aren't probably useful ideas there (similar to institutional effectiveness), but because the inherent fuzziness of our core business--changing brains--is so fraught with unknowns.&amp;nbsp;I think one sigma is about as good as we're likely to do. Although we're not well prepared to deal with it, we live in the tail of the distribution.&lt;br /&gt;&lt;br /&gt;What percentage of "&lt;a href="http://www.cgp.upenn.edu/ope_value.html"&gt;value-added&lt;/a&gt;" indices are due to random chance, do you suppose? This statistical method of computing theoretical filling of the learning vessel has been institutionalized to reward or punish teachers and schools. As a mathematician and assessment professional, it's hair-raising to read the pat descriptions from the link above, like:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;strong&gt;Q: How does value-added assessment sort out the teachers' contributions from the students' contributions?&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;A:Because individual students  rather than cohorts are traced over time, each student serves as his or her own  "baseline" or control, which removes virtually all of the influence of the  unvarying characteristics of the student, such as race or socioeconomic  factors.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;Test scores are projected for students and then compared to the scores they actually achieve at the end of the school year. 
Classroom scores that equal or exceed projected values suggest that instruction was highly effective. Conversely, scores that are mostly below projections suggest that the instruction was ineffective.&lt;/blockquote&gt;Taking another page from Dr. Kahneman's book, this is an instance of solving a simple problem that superficially resembles the actual problem, because the original problem is too hard. It's easy to imagine that the distribution is tight and the tails insignificant, that we control and understand all the elements of chance that might contribute to a computed value-added parameter. Unfortunately, the direct link to reality doesn't reveal itself easily, and so there is no immediate feedback that would correct the problem by making it obviously wrong to observers. This is a case where we should be assiduously honest with our reasoning and doubts. Suppose the SAT's accuracy is representative, and the underlying achievement tests classify students correctly no more than 65% of the time. What fraction of the value-added score is simply random?&lt;br /&gt;&lt;br /&gt;The general problem is caused by the unknown unknowns that plague complex observations. There is an elegant way out of this mess that doesn't involve advanced math or huge sample sizes. Moreover, it solves the &lt;a href="http://highered.blogspot.com/2011/09/most-important-problem-in-higher.html"&gt;most important problem in higher education&lt;/a&gt;. 
How's that for a cliff-hanger?&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1019897282842421256?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1019897282842421256/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/perilous-tail.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1019897282842421256'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1019897282842421256'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/perilous-tail.html' title='A Perilous Tail'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-2802352796655513783</id><published>2011-11-15T06:14:00.001-05:00</published><updated>2011-11-15T06:19:37.561-05:00</updated><title type='text'>Searching IPEDS Data</title><content type='html'>I found a nice site for sifting through some of the important bits of IPEDS data at&amp;nbsp;&lt;a href="http://www.collegeresults.org/"&gt;www.collegeresults.org&lt;/a&gt;, especially for comparing institutions. My metric of choice is instructional dollars / FTE when trying to roughly compare quality. 
You can find this under the "Finance and Faculty"&amp;nbsp;tab.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-SbtxrkrvAG0/TsJKgiHJo5I/AAAAAAAAAdQ/lT5VbRjQxcI/s1600/collegeresults.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="176" src="http://4.bp.blogspot.com/-SbtxrkrvAG0/TsJKgiHJo5I/AAAAAAAAAdQ/lT5VbRjQxcI/s640/collegeresults.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-2802352796655513783?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/2802352796655513783/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/searching-ipeds-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2802352796655513783'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2802352796655513783'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/searching-ipeds-data.html' title='Searching IPEDS Data'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-SbtxrkrvAG0/TsJKgiHJo5I/AAAAAAAAAdQ/lT5VbRjQxcI/s72-c/collegeresults.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-4321336273679531063</id><published>2011-11-06T08:47:00.000-05:00</published><updated>2011-11-06T10:15:41.517-05:00</updated><title type='text'>Spark Intelligence</title><content type='html'>Yesterday's post about self-limiting intelligence may come off as pessimistic, and indeed, I think very few leaders of organizations (nations, companies, colleges,...) have a standing agenda item labeled "survival." I think they should, and in some ways think that our forebears were more attuned to that, but it's just a notion. The cathedral in Cologne took about 640 years to complete. What projects do we have ongoing now with that sort of horizon? I think if you asked a large corporation's CEO what he or she thought of the company's prospects three or four centuries out, you'd get a strange look. Next quarter is what matters.&lt;br /&gt;&lt;br /&gt;Suppose for a moment that it's true that a singular intelligent system (SIS) can only get so smart before it starts working against its own interests. It's doomed, and it would be smart enough to realize it's doomed. What then?&lt;br /&gt;&lt;br /&gt;Although we have no evidence of other intelligent life in the universe, since we are here ourselves, it's possible that such life has existed or will exist somewhere else (the &lt;a href="http://en.wikipedia.org/wiki/Drake_equation"&gt;Drake Equation&lt;/a&gt; tries to pin that down, but that's not what I'm concerned with). So our hypothetical doomed SIS knows this too. That is, it knows that although all civilizations will eventually collapse, the universe is a fertile ground for new ones to spring up. This is what I call spark intelligence--new SISs cropping up now and then across the galaxy. Therefore, there is a possibility for the universe to maintain a disjointed "stream of consciousness" if these independent SISs could communicate with each other. 
Time and distance scales make any sort of synchronous communication unlikely, so it has to be asynchronous, like one civilization reading an ancient book left by another long-gone culture. This would enable a sort of meta-intelligence comprising knowledge and culture from a long sequence of dead civilizations: a universal Domesday book. Eventually one of them would have to figure out how to get this package to a new universe before this one suffers heat death, but there are billions of years left to do that.&lt;br /&gt;&lt;br /&gt;Imagine these sparks of intelligence going off all around the hundreds of billions of galaxies in our observable patch of space. Many of them reach the same conclusion I just have. Some of them might have the motivation to participate (motivation is essential to survival, recall, and we're talking about survival of knowledge and culture). There are two ways to participate. One is to create a library that can be seen and decoded from a very long way off, and the other is to search for and assimilate the libraries of others.&lt;br /&gt;&lt;br /&gt;If this giant inter-library loan program exists, it would depend on the "spark" rate, the probability that a civilization will be motivated to participate, and the window of opportunity it has to do so with existing resources before it collapses. An obvious first step would be to see if there are libraries already out there. Of course, we're already doing that with SETI, but it's not a high priority.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;You may be having trouble seeing over the pile of hypotheticals&lt;/b&gt; I assembled in the&amp;nbsp;preceding&amp;nbsp;paragraphs, so let me bring all this back to Earth. Much of the analysis above also applies right here. A regulated market economic system, for example, provides fertile ground for "sparks" of a different sort--businesses of all sorts spring into existence and then eventually get eaten or die. A few last hundreds of years. 
But they too share a common "culture bank," holding conferences and hosting professional organizations in order to share ideas (while hiding trade secrets, of course), similar to the galactic library I proposed. [Edit: We can also see that there are policy implications for the government that regulates the system: keeping the ground fertile for new 'sparks' and making sure that the eventual end of any enterprise is planned for. That way "too big to fail" wouldn't be the critical issue it is now. If the philosophy is that every enterprise &lt;i&gt;will&lt;/i&gt;&amp;nbsp;eventually fail, and that this has to be planned for, it's not a catastrophic surprise when it happens.]&lt;br /&gt;&lt;br /&gt;There may be a case made for education being like this too: providing the right environment for novelty to emerge in the form of new research results, new art, and so on. If so, it's probably not intentional from an organizational leadership viewpoint. The general tone of administration in my experience is more about how to keep a bureaucracy running efficiently. The effect of this machinery on learning is a factor, but the delineation between creating an environment for success and simply expecting it is a fuzzy one. As an example, assuming that learning is mostly related to how well students are taught is a bureaucratic simplification (the inputs have a lot to do with it). Another is the idea that if a student passes a writing class she can then write as well as she needs to. This "inoculation" philosophy is purely process-driven, and is almost antithetical to the idea that minds are cultivated, not stamped out in a factory. 
More on this theme next time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4321336273679531063?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4321336273679531063/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/spark-intelligence.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4321336273679531063'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4321336273679531063'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/spark-intelligence.html' title='Spark Intelligence'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-664252793631927050</id><published>2011-11-05T19:16:00.000-05:00</published><updated>2011-11-05T19:16:05.614-05:00</updated><title type='text'>Self-Limiting Intelligence</title><content type='html'>I think intelligence grows only to the point where it begins to interfere with itself. In short, when we get smart enough, we begin to &lt;i&gt;outsmart&lt;/i&gt;&amp;nbsp;ourselves and actually undermine our own survival. Here, the inclusive 'we' could apply to individuals, but is more aimed at organizations: corporations, institutions, or governments.&lt;br /&gt;&lt;br /&gt;I have been researching survival of these entities in the abstract for several years, and I seem to be all alone in this. 
This is a real mystery, because survival is the &lt;i&gt;sine qua non&lt;/i&gt; for everything else we care about. If you are interested in some of the background, see &lt;i&gt;&lt;a href="http://arxiv.org/abs/0812.0644"&gt;Survival Strategies&lt;/a&gt; &lt;/i&gt;[1]&amp;nbsp;or &lt;a href="http://highered.blogspot.com/2010/04/surviving-entropy.html"&gt;"Surviving Entropy"&lt;/a&gt;&amp;nbsp;[2] in this blog.&lt;br /&gt;&lt;br /&gt;My interest is in the seemingly pessimistic question "is it likely that an intelligent being or organization can survive for an indefinite period?" This is contrasted with the messy sort of survival exhibited by ecologies that evolve over time, which I refer to in short-hand as a MIC (multiple independent copies). The intelligent systems are shortened to SIS for singular intelligent system. The primary difference is that it's impossible to reproduce and mutate an organization the same way one can a bacterium. All your eggs are in one basket, so to speak. Whereas a bacterial culture can lose 99% of its population and pull through, a singular system can't afford a single lethal mistake.&lt;br /&gt;&lt;br /&gt;In [1] I showed a couple of interesting facts about a SIS. First, it has to learn how to predict (or engineer) its environment at a very fast rate, unlike a MIC, which gets this for free via even the most&amp;nbsp;desultory&amp;nbsp;rate of reproduction. In actual fact, we have evidence that the ecology of life on Earth (a MIC) has survived for some billions of years, whereas we have no evidence of any government or other organization (a SIS) surviving for more than a few thousand years (I'm being generous). Put another way, when we look at the vast and enduring features of the universe around us, they are uniformly non-intelligent. 
This is the source of the so-called &lt;a href="http://en.wikipedia.org/wiki/Fermi_paradox"&gt;Fermi Paradox&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The second interesting fact about a SIS is that although it may be smart enough to change itself, it is impossible&lt;i&gt;&amp;nbsp;&lt;/i&gt;for it to predict the ultimate result of those changes. For an organism that is the product of an ecology, this is not an issue. Animals often come prepared for their earthly homes with protective coloration and other adaptations for the environment they will live in. They don't need to change this, or if they do, the provision is built-in but limited (like a chameleon). A frog can't re-engineer itself into a bird if it finds the need to fly. A SIS, on the other hand, may have to adapt to completely foreign environments over time.&lt;br /&gt;&lt;br /&gt;The problem a SIS faces is that it generally cannot predict what will happen to it after a self-change, so it doesn't know if this change is good or bad in the long run. It can try to guess by simulating itself, but there's an essential limitation here. There are two types of simulation, detailed below.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Suppose a SIS considers changing its 'constitution' in some way, which will affect the way future decisions are made. It builds a sophisticated computer model of itself making this change to see what will happen. There are two possibilities:&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;1) The simulation is perfectly good: so good that the SIS cannot change the outcome even if it's a bad one.&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;2) The simulation is only approximate: the SIS can take a look at the future and change its mind about making the change.&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;In the first case, a perfect simulation tells us not only what the future holds, but also &lt;i&gt;whether or not the organization will make the change&lt;/i&gt;. 
This is because it incorporates all information about the SIS, including the complete present state. So it will present a result like "you make the change and then X happens," or "you don't make the change." A perfectly true self-simulation has to have this property. So it's like Cassandra's warning--even if it predicts an undesirable future, it still has to live it!&amp;nbsp;&lt;/blockquote&gt;Such perfect simulations are really only possible with completely deterministic machines, like a computer with known inputs. In practice, all sorts of variables might knock it off course. So what about approximations? The essential element of an approximation is to be able to make a decision about the future. The most fundamental one might be "if I make this change, will I eventually self-destruct?" This is the most fundamental question for a SIS. The most dangerous challenge from the environment for a SIS comes from within itself.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The US constitution makes it harder to change the constitution than to pass ordinary laws. This is a prudent approach to self-modification.&lt;/blockquote&gt;Unfortunately, decision problems like this are not reachable by general-purpose processes. This is covered in [1], but you might peek at &lt;a href="http://en.wikipedia.org/wiki/Rice%27s_theorem"&gt;Rice's Theorem&lt;/a&gt; to see the breathtaking limitations of our knowledge of what deterministic systems will do. So we can simulate in the short term, but the long-term effect will be a mystery.&lt;br /&gt;&lt;br /&gt;So a SIS can only learn about self-change empirically, by trying things out, or by short-term simulations. It can't ask about the general future. Although the external environment may be quite challenging, and survival may be at risk because of factors beyond its control, the internal question of how to manage self-change is just as bad or worse. Hence my hypothesis that the odds will catch up with any SIS eventually, and it will crash. 
This also jibes with all the empirical evidence we have.&lt;br /&gt;&lt;br /&gt;This is where I left the question in [1], but in the last couple of years I think I've identified a fundamental mechanism for self-destruction that any SIS has to overcome. It has practical implications for institutions of higher learning and other sorts of systems like businesses and governments.&lt;br /&gt;&lt;br /&gt;In my &lt;a href="http://highered.blogspot.com/2011/11/language-of-assessment-session-summary.html"&gt;last post&lt;/a&gt;, I showed a diagram for an institutional effectiveness loop that looks more technical than the usual version. Here it is again, with some decorations from the talk I gave at the Assessment Institute.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-YSro4Dqyn8o/TrExrr-RFFI/AAAAAAAAAdA/4BzsV2S79_8/s1600/2011-11-02_0803.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="305" src="http://3.bp.blogspot.com/-YSro4Dqyn8o/TrExrr-RFFI/AAAAAAAAAdA/4BzsV2S79_8/s640/2011-11-02_0803.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;The diagram actually comes from my research on systems survival, and it is a schematic for how a SIS operates in its environment. The (R) and (L) notations refer to 'Reality' and 'Language' respectively. Recall that the I in SIS stands for Intelligent, and this is what I mean: the intelligent system has ways of observing the environment, encoding those observations into a language that compresses the data by looking for interesting features, and modeling the interactions between these. This allows a virtual simulation of reality to be played out in the SIS, enabling it to plan what to do next in order to optimize its goals. 
This is the same thing as an institutional effectiveness loop in higher education, in theory at least.&lt;br /&gt;&lt;br /&gt;Language is much more&amp;nbsp;malleable than reality: we can imagine all sorts of interactions that aren't likely to actually occur. For example, astrology is a language that purports to model reality, but doesn't. It's essential for the SIS to be able to model the real environment increasingly well. The mathematical particulars are given in [1] in terms of increasing survival probabilities.&lt;br /&gt;&lt;br /&gt;There's something essential missing from the diagram above. That is the motivation for doing all this. When the SIS plans, it's trying to optimize something. This motivation is not to be taken for granted, because there's no reason to assume that a SIS even wants to survive unless it's specifically designed that way. For example, a modern air-to-air missile has good on-board ways to observe a target aircraft (e.g. radar or heat signature), a model for predicting the physics of its own flight and the target's, and the means to implement a plan to intercept. So by my definition, it's reasonably intelligent. But it doesn't care that it will be blown up along with its target.&lt;br /&gt;&lt;br /&gt;Motivation to survive is a decoration on a SIS. Of course it won't likely survive long without it, but it's not to be taken for granted, which makes the question of what happens under self-change very important. It's quite possible to make a change that eliminates the motivation for self-survival. 
What exactly constitutes survival is a messy topic, so let's just consider this general feature of an SIS, which has applications to personal life as well as governments, corporations, military organizations, and universities:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;Motivations can change or be subverted when self-modifications are made&lt;/b&gt;.&lt;/blockquote&gt;This doesn't sound very profound; it's the particular mechanism shown below that is the interesting part. Here's how it works. When we observe our environment, we encode this into some kind of language, specialized to help us understand where we are in relation to our goals. For example, if I stub my toe on external reality, I get a finely-tuned message that informs me immediately that my most recent action was inimical to my goals for self-preservation: it hurts! This pain signal is just like any other bit of information encoded into a custom language: it can be intercepted or subverted. There are medicines and anesthetics that can reduce or completely eliminate the pain signal. Because signals are purely informational, they are always vulnerable to such manipulation by any system that can self-change.&lt;br /&gt;&lt;br /&gt;Motivations are closely tied to these signals. It may be a simple correspondence, as with pain, or something abstract that comes from modeling the environment, like fear of illness. Sometimes these come into conflict, as the example below illustrates.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Sometimes I get sleepy driving on the interstate. If I find myself beginning to micro-sleep, I pull off the road and nap for 15 minutes. How is it that my brain can be so dumb as to fall asleep while I'm driving? Something very old in there must be saying "it's comfortable here, there's not much going on, so it's a good time to sleep," in opposition to the more abstract model of the car careening off the road at speed. 
We can try to interfere with the first signal with caffeine or loud music or opening the windows, or we can just admit that it's better to give in to that motivation for a few minutes in a safer place.&lt;/blockquote&gt;The mechanism for limiting intelligence works like this:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;b&gt;A SIS tries to attain goals by acting so as to optimize encoded signals that correspond to motivations. If it can self-modify, the simplest way to do this is to interfere with the signal itself.&lt;/b&gt;&lt;/blockquote&gt;I think it is very natural for a SIS to begin to fail because it fools itself to artificially achieve goals by presenting itself with signals that validate that. Even if external reality would disagree.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;I just finished reading Michael Lewis' &lt;i&gt;&lt;a href="http://www.amazon.com/Big-Short-Inside-Doomsday-Machine/dp/0393072231"&gt;The Big Short&lt;/a&gt;&lt;/i&gt;, which is rife with examples of signal manipulation. Here are a couple. 1) The ratings agencies (S&amp;amp;P, Moody's) had two motivations in conflict: generating revenue by getting business rating financial instruments (such as CDOs), and generating accurate ratings. These are in conflict because if they rate something poorly (and perhaps unfairly), they may lose business. 
The information stream that should have signaled that it was a bad idea to repackage high-risk loans into triple-A rated instruments got subverted.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt; 2) According to Lewis, counterparts at Goldman Sachs learned exactly how to tweak the signals in order to get the result they wanted from the bond raters (by manipulating the way risky loans were structured to optimize an average credit rating, for example).&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;3) In an example of self-deception, risk management offices of the investment banks were fooled by the ratings agencies' "credit-laundering" and by their own trading desks, which allowed vast liabilities to go unnoticed.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;4) The whole economic apparatus of the world largely ignored signs that the system was on the verge of collapse.&amp;nbsp;&lt;/blockquote&gt;If we imagine that a SIS is continually trying to increase its survival chances, an observation that probabilities are &lt;i&gt;decreasing&lt;/i&gt;&amp;nbsp;instead is obviously bad news. If it can self-modify, it has the choice to accept this unwelcome fact about probabilities, or it could interfere with the signal (ignore it, for example).&lt;br /&gt;&lt;br /&gt;Alternatively, the internal model of a SIS may associate a potential benefit with a planned act, which is a good thing. Any evidence that this may not work out as intended would decrease the value of the act, and this (also bad) news might be subverted, so that only supporting evidence is accepted. This is usually called confirmation bias in humans.&lt;br /&gt;&lt;br /&gt;It's natural to ask, if this is such a problem, why hasn't civilization already collapsed from ignoring bad news and amplifying good news? The answer, I think, is that humans comprise the civilization and all its organized systems, and humans can't completely self-modify. Yet. 
Imagine if you could.&lt;br /&gt;&lt;br /&gt;What if every emotional reaction could be consciously tuned through some mental control panel? Want to be happier? Just turn up the dial. Don't like pain? Turn that dial down.&lt;br /&gt;&lt;br /&gt;Because humans are actually members of a MIC (that is, an ecology), we are subject to selection pressure from the environment. Viewed as discrete systems, our organizations inherit some of this evolutionary common sense, but it's diluted. Individual humans often have a lot to say about how an organization operates, and can imbue it with denial and confirmation bias. Organizations are easily self-changed, and can't predict how those changes will turn out. I think, however, that certain strategies can ameliorate some of the most self-destructive behaviors. Here they are:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;1) Create cultures of intellectual honesty, and actively audit signals, languages, and models to make sure they correspond to what's empirically known, whether it's good news or not. Intellectual honesty should be audited the same way financials are: by an outside agency doing an in-depth review. In the long run this doesn't solve the problem because any such agency will have the same problems (self-deception, inability to predict effects of changes, etc.), but it might increase the quality of decision-making in the short term.&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;2) Be conservative and deliberate about changes to signals, languages, predictive models, and fundamental structure. Audit those continually and transparently. Everyone should know what the motivations are and what signals apply to each. Moreover, 'best practices for survival' should be used. 
Since much of our learning is from other systems that failed, this wisdom should be carefully archived and used.&lt;/blockquote&gt;These are particularly advisable for organizations that have motivational signals that are difficult or slow to interpret. For enterprises that are very close to objective reality, these measures are less necessary because of the obviousness of the situation. For example, it's hard to argue with the scoreboard in a sporting event. We can close our eyes if we don't like the score, but there's really not much room for misinterpretation. Therefore, one would expect a successful team either to be very lucky or else to have good models of reality reflected in their language and signals. The same could be said of military units in active service, traders on a stock exchange, or any other occupation with signals that are hard to interfere with.&lt;br /&gt;&lt;br /&gt;Examples in the other direction, where signals are or have been ignored, are the financial crisis already mentioned, the looming disaster of global warming, the eventual end of cheap oil, and human overpopulation. On an individual level, unnecessarily bad diets, lack of exercise, smoking, and so on are examples of abstract survival signals ("doctor says so") versus visceral motivations (e.g. 
tastes good) that show flaws in our motivational calculus.&lt;br /&gt;&lt;br /&gt;I intend in the next post or two to show how this is related to the business of higher education.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-664252793631927050?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/664252793631927050/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/self-limiting-intelligence.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/664252793631927050'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/664252793631927050'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/self-limiting-intelligence.html' title='Self-Limiting Intelligence'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-YSro4Dqyn8o/TrExrr-RFFI/AAAAAAAAAdA/4BzsV2S79_8/s72-c/2011-11-02_0803.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-5811092188982230820</id><published>2011-11-02T08:42:00.000-05:00</published><updated>2011-11-02T08:43:29.439-05:00</updated><title type='text'>Language of Assessment: Session Summary</title><content type='html'>This post summarizes the conclusions from my presentation yesterday at the Assessment Institute in Indianapolis. 
Many thanks to Trudy and everyone else who helps make this conference happen!&lt;br /&gt;&lt;br /&gt;The topics below are taken from my Conclusions slide. They concern the relationship between reality (actually affecting/effecting events) and language (observing, understanding, planning). This is pictured in the diagram below, where R = reality and L = language.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-YSro4Dqyn8o/TrExrr-RFFI/AAAAAAAAAdA/4BzsV2S79_8/s1600/2011-11-02_0803.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="305" src="http://3.bp.blogspot.com/-YSro4Dqyn8o/TrExrr-RFFI/AAAAAAAAAdA/4BzsV2S79_8/s640/2011-11-02_0803.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Language is easy. Reality is hard.&lt;/b&gt;&lt;br /&gt;Language is mostly combinations of arbitrary signs (pointing with your finger and 'oink' being two exceptions), and its flexibility comes from the infinite number of ways that we can arrange these signs (words). 
I used the example of QR codes like the one pictured below to contrast reality (a small square) with language (about $10^{300}$ different codes expressible--far more than there are atoms in the observable universe).&lt;br /&gt;&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://upload.wikimedia.org/wikipedia/commons/thumb/9/9b/Wikipedia_mobile_en.svg/220px-Wikipedia_mobile_en.svg.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/9/9b/Wikipedia_mobile_en.svg/220px-Wikipedia_mobile_en.svg.png" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Source: Wikipedia&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;The larger point here is to be careful that we don't become disconnected from reality. This can happen in a number of ways. We can, for example, do everything through planning, but then fail to execute. Or we can become fixated on some model, some sort of understanding that we preconceive and don't deviate from, and make observations simply conform to this worldview. The language we choose, including the types of assessments and how they convert raw observations into data, necessarily limits us, and it's good to be cognizant of those limits.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Seek meaning: use contextual language&lt;/b&gt;&lt;br /&gt;When talking to faculty members and other non-assessment experts, I recommend avoiding jargon like 'measurement', 'validity', and such. It's more productive to use language that they already use. 
I don't even like to use 'assessment', since the point of the process is really improvement, not just assessment, and the word is now a lightning rod for instant opposition in some quarters. Of course, if you have to get a report out of them for accreditation, it isn't going to matter much what you call it. The Happy Happy Fun-Time Learning Report isn't going to fool them. The deeper philosophy behind the advice under this heading is that the faculty members really are the experts--they know their students, the material, the history of the curriculum, the changes that have been made, and what is on the tests, and they have all this rich contextual information. They are already familiar with the idea of academic success or failure, and don't really need a new lexicon to discuss it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Administrators like summative/monological language&lt;/b&gt;&lt;br /&gt;From the Department of Education down, 'performance measures', 'accountability', and 'value-added' are words that reflect the top-down mindset. Administrators often want to see dashboards that lead to easy conclusions about the success or failure of some endeavor. For more on monological/dialogical, which I didn't cover, see &lt;i&gt;&lt;a href="http://zzascape.com/elephant.pdf"&gt;Assessing the Elephant&lt;/a&gt;&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Faculty like formative/dialogical language&lt;/b&gt;&lt;br /&gt;Faculty may also want to see summary graphs of indices (e.g. pass rates), but unless there's some model of cause and effect that we can build from the information, it's hard to figure out what to do with the data. In other words, we need to be able to construct a story that makes sense. For this, it helps to provide details like percentages (e.g. 
distributions instead of averages), or other connections like correlates, shown below (see my &lt;a href="http://highered.blogspot.com/2011/10/mapping-covariates-part-iii.html"&gt;previous posts&lt;/a&gt; on that).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-jnyp_jcjXeE/TrE2NGDhkTI/AAAAAAAAAdI/s4CKgMVIeGs/s1600/2011-11-02_0823.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="220" src="http://1.bp.blogspot.com/-jnyp_jcjXeE/TrE2NGDhkTI/AAAAAAAAAdI/s4CKgMVIeGs/s320/2011-11-02_0823.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Rubrics are particularly useful, especially if the faculty members construct them using language that makes sense to them (e.g. fitting them into their cause/effect model). The scale used is particularly important. I didn't mention this in the talk, but I will here. Often we use a PAGE rubric (poor, average, good, excellent) for accomplishment levels, but this should only be done for skills or knowledge that do not develop over a long period. For example, you wouldn't want to use this to rate writing because there's really no upper limit on how good you can get. Moreover, a freshman who gets Excellent ratings in the first year is likely to get Excellent in the fourth year as well--demonstrating no improvement! I use a PAGE rubric for rating effort, since the concept of no effort to maximum effort seems a good fit. David Dirlam has a comprehensive way to create very far-reaching rubric language that I won't say more about here. Google him and find out more. I referenced him &lt;a href="http://highered.blogspot.com/2010/10/assessing-creativity-creatively.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I mentioned the FACS a few times in the talk. It's a rating scale based on a student's career. 
You can read all about it in this &lt;a href="http://www.coker.edu/assessment/elephant.pdf"&gt;manuscript&lt;/a&gt; or through &lt;a href="http://highered.blogspot.com/search?q=facs"&gt;these links&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Predictive validity of tests is probably not great&lt;/b&gt;&lt;br /&gt;I showed examples from the College Board's benchmark on the SAT, which gives a correct classification rate of at most 65%. More on that topic in &lt;a href="http://highered.blogspot.com/2011/09/sat-error-rates.html"&gt;this article&lt;/a&gt;. The SAT is exceptional in that we &lt;i&gt;actually can&lt;/i&gt;&amp;nbsp;test its predictive validity. Validity studies for most standardized tests are underwhelming, and don't address this crucial point--the main point, really. If we are content to assume that being able to answer a test question correctly at this moment translates into some future ability without actually testing it, then okay. But I think that's unreasonable given the nature of education. At the national scale, there is also a strange lack of interest in finding out what all this investment in education actually produces in outcomes that can be accurately assessed, like employment and earnings (financial aid data + loan data + IRS data).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Report summaries, not abstractions&lt;/b&gt;&lt;br /&gt;I left this off the conclusions page, but added it here because it relates to the previous topic. Let me illustrate with an example. Suppose we have an English proficiency exam to place incoming students into a first writing course. This is a test whose predictive validity we can (and should) check by looking at success rates. 
We can make summary statements about test results like this:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;The 2011 English proficiency test results showed that 79% met the faculty-established criterion to be placed into ENG 102.&lt;/blockquote&gt;The validity of this statement is unquestionable as long as the calculation was done correctly. Compare that to:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;According to the 2011 English proficiency test, 79% of our students can write well.&lt;/blockquote&gt;Replacing a known fact ("met the criterion") with a notion ("can write well") takes us from a perfectly valid statement to one that is completely subjective, depending on what the reader's idea of "can write well" is. If the reader is trusting, he/she may just pass over this, but the validity of the statement is essentially unknowable.&lt;br /&gt;&lt;br /&gt;This isn't hair-splitting--it's essential. If we allow ourselves to jump from what's known to undefined and untestable abstractions, we may end up creating educational systems that follow this illusion into irrelevance (e.g. a completely artificial test-driven culture like No Child Left Behind). If we stick to what we actually know to be true, we don't have this problem. Of course, this comes into conflict with the summative view demanded by the DoE. We need to make the conversation more nuanced, I think.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Don't aggregate different kinds of stuff&lt;/b&gt;&lt;br /&gt;This falls in the nuts and bolts category. You wouldn't normally add different sorts of data together, like age plus shoe size, and expect to get anything useful. Why do we add up rubric dimensions for an 'overall score'? The only purpose it serves is to save space, by reducing everything to one number. But it's really hard to make sense of the number, and comparing one score to another is problematic. For example, suppose student papers are rated on a 1-10 scale on Correctness (e.g. 
grammar and spelling), Style, and Audience. Is a three-point deficit in correctness exactly made up for by a three-point surfeit of style? It's hard to see how that could be the case.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Average data as a last resort: use richer displays&lt;/b&gt;&lt;br /&gt;Learning outcomes data is complex. Reducing a rich dataset into one number is like burning it and poking through the ashes. Use averages (or better: medians) when you really only have room for one number. For Likert scale data, reporting out the percentage of Agree and Strongly Agree is often more meaningful to the reader than an average.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Validity is not a property of a test, but of a statement&lt;/b&gt;&lt;br /&gt;The quote below pretty much says it all.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;It is a common misconception that validity is a particular phenomenon whose presence in a test may be evaluated concretely and statistically. One often hears exclamations that a given test is “valid” or “not valid.” Such pronouncements are not credible, for they reflect neither the focus nor the complexity of validity.&lt;br /&gt;– College BASE Technical Manual&lt;/blockquote&gt;&lt;br /&gt;It is statements that are valid or not, not tests or results from tests, just as anything we say can be true or not (true = valid). You would think that this would require test makers and test users to only say things they know are true, but this is not the case. The temptation to make the leap to abstraction is&amp;nbsp;irresistible. I wrote a paper on this topic &lt;a href="http://zzascape.com/LanguageofAssessment.pdf"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Complexity and reliability don't go together&lt;/b&gt;&lt;br /&gt;If something takes more language to describe, it's more complex. When we look at something complex, there are by definition many ways of looking at it. 
This means that we often have a collision between reality and language when we assess learning. If we insist on a simple bad-to-good scale (one-dimensional, in other words), we have to throw out much of the original information in order to data-compress it. It's similar to the complaint about averaging. Complex outcomes will have lower reliability in ratings because there are valid differing opinions. I used the example of rating the taste of green beans. You may like them al dente. I like them mushy. What's the correct response on the answer key?&lt;br /&gt;&lt;br /&gt;You can see that the summative approach favored by policy-makers is not a very natural way to look at learning, and the gap widens the more complex the outcome becomes. A simple test for multiplication is probably sufficient because the skill's complexity is so low. But a multiple-choice test for understanding of literature (like the one my daughter took last year) is simply a result of abstraction (test result = ability) and convenience (relatively cheap, easy to score).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Non-cognitives are important&lt;/b&gt;&lt;br /&gt;Outcomes other than knowledge and skill include attitude and personal behaviors. These are increasingly recognized as important in higher ed. I've written more about that subject in my blog.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Opportunity to focus on accomplishment&lt;/b&gt;&lt;br /&gt;This is the punch line, which provides a hopeful way out. If the critique above seems negative in parts, it's only intended to be an honest look at what we're doing so we can make the next evolutionary improvement. My suggestion is that we stick to facts we know when using assessment results (that is, be more modest in our claims), and take two new approaches. One is based on the FACS--gather subjective opinions about our students so we can see if our tests predict those real-world opinions. (I didn't mention this in the talk.) 
The second, and more important, suggestion is that we stop thinking of ourselves as preparing students for something after college. Instead we have the opportunity to lead them to authentic accomplishments that they can use to build their professional portfolio beginning in their first year.&lt;br /&gt;&lt;br /&gt;Compare:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;We test our students constantly. According to the results, they graduate knowing how to think and communicate. That's what the diploma means.&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;Our students demonstrate achievement through substantive works that stand on their own. You can look at them in the portfolios that they build during their time with us. You will see their actual accomplishments in their fields of study and be able to judge for yourself their quality.&lt;/blockquote&gt;The first approach is based on unprovable claims, and has students waiting on the 'factory floor' to receive their certificate. This is the summative industrial standardized-in, standardized-out approach. The second approach is more motivating (I suspect) to both students and faculty--it allows students to see real-world outcomes as they go along. These are outcomes that are incremental and achievable, that students can compare and compete with, and that retain meaning after graduation as part of their professional history. 
And it's free--just use &lt;a href="http://www.linkedin.com/"&gt;LinkedIn&lt;/a&gt; for the 'portfolio' system.&lt;br /&gt;&lt;br /&gt;I have a lot more to say about this, but it will have to wait.&lt;br /&gt;&lt;br /&gt;Thanks to everyone who came to the session--please stay in touch and share your successes and failures with us so we can imitate the former and avoid the latter.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-5811092188982230820?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/5811092188982230820/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/11/language-of-assessment-session-summary.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/5811092188982230820'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/5811092188982230820'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/11/language-of-assessment-session-summary.html' title='Language of Assessment: Session Summary'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-YSro4Dqyn8o/TrExrr-RFFI/AAAAAAAAAdA/4BzsV2S79_8/s72-c/2011-11-02_0803.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-3121470057383401728</id><published>2011-10-23T07:25:00.001-05:00</published><updated>2011-10-23T07:25:07.719-05:00</updated><title type='text'>Assessment Workshop Survey Results</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: left;"&gt;Survey responses for the assessment workshop next week are shown below. Nothing is hyperlinked, since these are just images. I had to do it this way to protect the data, so the individual responses can't be used to identify individuals.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-a-PUASK5ksY/TqQGdPwKvDI/AAAAAAAAAb8/TrQhH0SXo6U/s1600/Survey1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-a-PUASK5ksY/TqQGdPwKvDI/AAAAAAAAAb8/TrQhH0SXo6U/s1600/Survey1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;a href="http://2.bp.blogspot.com/-4yHvp8WUdsI/TqQGfAM87-I/AAAAAAAAAck/0IT9eVm8MWw/s1600/survey2.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em; text-align: center;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-4yHvp8WUdsI/TqQGfAM87-I/AAAAAAAAAck/0IT9eVm8MWw/s1600/survey2.png" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-ITrsfVNjqzk/TqQGelokngI/AAAAAAAAAcc/265CuIXl1Jg/s1600/survey3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-ITrsfVNjqzk/TqQGelokngI/AAAAAAAAAcc/265CuIXl1Jg/s1600/survey3.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-NL4L_uIn1gk/TqQGeWJA03I/AAAAAAAAAcU/a1rWbj2xk74/s1600/survey4.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; 
margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-NL4L_uIn1gk/TqQGeWJA03I/AAAAAAAAAcU/a1rWbj2xk74/s1600/survey4.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-FZs3NCrFxvI/TqQGd8sWXSI/AAAAAAAAAcM/t0lXYevVnP4/s1600/survey5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-FZs3NCrFxvI/TqQGd8sWXSI/AAAAAAAAAcM/t0lXYevVnP4/s1600/survey5.png" /&gt;&lt;/a&gt;&lt;a href="http://1.bp.blogspot.com/-Nklu8bffqCM/TqQGdQJzwVI/AAAAAAAAAcE/e5d1splGeR4/s1600/survey6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-Nklu8bffqCM/TqQGdQJzwVI/AAAAAAAAAcE/e5d1splGeR4/s1600/survey6.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-3121470057383401728?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/3121470057383401728/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/10/assessment-workshop-survey-results.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3121470057383401728'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3121470057383401728'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/10/assessment-workshop-survey-results.html' title='Assessment Workshop Survey Results'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-a-PUASK5ksY/TqQGdPwKvDI/AAAAAAAAAb8/TrQhH0SXo6U/s72-c/Survey1.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-2846463943800551041</id><published>2011-10-18T18:38:00.001-05:00</published><updated>2011-12-09T14:39:44.975-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='surveys'/><category scheme='http://www.blogger.com/atom/ns#' term='correlation'/><title type='text'>Mapping covariates, Part III</title><content type='html'>In my spare time (ha-ha) I refined the software I blogged about in the last two posts in order to automate almost everything about sorting out what's connected to what in a data set. &amp;nbsp;Now I can create a folder with a data file, an index of variables, and an options list, drop that folder on a script on the desktop, and a few seconds later have a graph like the one below. This makes it easy to tweak parameters to find a nice picture. One that tells $2^{10}$ words, give or take.&lt;br /&gt;&lt;br /&gt;For the graph below, I dug out the results of a semester of course evaluation using the new form I got implemented a year ago. I wrote previously about the odd fact that the summative evaluation of the course in Q12 and Q13 didn't seem to relate much to the learning outcomes. The closest other item in this topology is how enjoyable the students reported the course being.&lt;br /&gt;&lt;br /&gt;This graph shows covariances instead of correlations. The latter have the nice property of being normalized to [-1,1], but suffer from the fact that nearly constant items all correlate highly with each other. 
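&lt;br /&gt;&lt;br /&gt;To make the difference concrete, here is the textbook relationship between the two (the symbols $x$, $y$, $s_x$, and $s_y$ are just notation for this note, not anything the program outputs). For two items with $n$ paired responses, the sample covariance is $$\operatorname{cov}(x,y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}),$$ and the correlation rescales it by the two standard deviations: $$r_{xy} = \frac{\operatorname{cov}(x,y)}{s_x\, s_y}.$$ When an item hardly varies, its standard deviation is close to zero, so the division can magnify a very small amount of shared variation into a large $r_{xy}$. The covariance keeps the original scale, so pairs like that produce small values and tend to drop out of a top-N list.&lt;br /&gt;&lt;br /&gt;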
The program can run either way. The graph below shows the top 30 correlates. The means are shown too.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-AbMMv4umKGw/Tp4Nv73L_XI/AAAAAAAAAbw/uj6adP2EBMM/s1600/COV30.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-AbMMv4umKGw/Tp4Nv73L_XI/AAAAAAAAAbw/uj6adP2EBMM/s1600/COV30.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-2846463943800551041?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/2846463943800551041/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/10/mapping-covariates-part-iii.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2846463943800551041'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2846463943800551041'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/10/mapping-covariates-part-iii.html' title='Mapping covariates, Part III'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-AbMMv4umKGw/Tp4Nv73L_XI/AAAAAAAAAbw/uj6adP2EBMM/s72-c/COV30.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-462529497611398081</id><published>2011-10-01T09:20:00.001-05:00</published><updated>2011-12-09T14:39:44.991-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='surveys'/><category scheme='http://www.blogger.com/atom/ns#' term='correlation'/><title type='text'>Creating Graphs with Perl and GraphViz</title><content type='html'>&lt;a href="http://highered.blogspot.com/2011/09/recipe-for-finding-correlates-in-large.html"&gt;Yesterday&lt;/a&gt; I solved one of my data problems, but that just led to another one. I can now filter large correlation tables for (absolute) values above a threshold, but it's still laborious to connect those in a diagram that shows the relationships. So the next step was to look for a program to display graphs. Here, I don't mean bar graphs and pie charts and whatnot, but the mathematical object that was unfortunately also called a graph, which consists of vertices and edges. Or dots and lines connecting them, if you prefer. A map of a subway system is a kind of graph. Here's a very simple one:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-ajeA3ciL2sg/Tocd-CpjhlI/AAAAAAAAAbo/OWWi4K8LzjQ/s1600/CIRP1-2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-ajeA3ciL2sg/Tocd-CpjhlI/AAAAAAAAAbo/OWWi4K8LzjQ/s1600/CIRP1-2.png" /&gt;&lt;/a&gt;&lt;/div&gt;With the correlations, I want to see what belongs with what on a graph that is generated automatically from a threshold I assign. 
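&lt;br /&gt;&lt;br /&gt;In symbols, the rule I have in mind is roughly this (the notation is mine for this post, not anything taken from the software): if $r_{ij}$ is the correlation between items $i$ and $j$ and $t$ is the threshold, the graph has one vertex per item and the edge set $$E = \{\, (i,j) : i \ne j,\ \lvert r_{ij} \rvert \ge t \,\}.$$ Items that have no edge at that threshold can simply be left off the picture.&lt;br /&gt;&lt;br /&gt;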
In the example above, survey results show a perception that creativity is associated with artistic ability.&lt;br /&gt;&lt;br /&gt;Fortunately, other people have already solved this problem, and it's just a matter of putting the machinery in place. I used two pieces of software, &lt;a href="http://graphviz.org/"&gt;GraphViz&lt;/a&gt;&amp;nbsp;(thanks to AT&amp;amp;T and the &lt;a href="http://www.graphviz.org/Credits.php"&gt;development team&lt;/a&gt;)&amp;nbsp;and the &lt;a href="http://search.cpan.org/dist/GraphViz/lib/GraphViz.pm"&gt;perl interface for it&lt;/a&gt;&amp;nbsp;(thanks to developer Leon Brocard). Both of these are open source and free to use.&lt;br /&gt;&lt;br /&gt;The only other thing&lt;i&gt;&amp;nbsp;&lt;/i&gt;I had to do was extract the prompts for each item to correspond to the item codes. Otherwise, you get a nice graph showing that RATE1 connects to RATE2, which doesn't help understand what's going on.&lt;br /&gt;&lt;br /&gt;I used the CIRP data with a .5 threshold to get these interesting networks of association. It takes just a few seconds to select the threshold value, run the scripts, and look at the resulting file. The code for generating the graphs from the output of the correlation summaries is &lt;a href="http://zzascape.com/graph.txt"&gt;here&lt;/a&gt;. There are many, many options for displaying the graphs. 
The &lt;a href="http://www.graphviz.org/Gallery.php"&gt;gallery of GraphViz images&lt;/a&gt; shows off the&amp;nbsp;versatility&amp;nbsp;of the software.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-KuyuOGbhW84/Tocd5X9_WNI/AAAAAAAAAbk/_KD2K49gO6k/s1600/CIRP1-1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="199" src="http://2.bp.blogspot.com/-KuyuOGbhW84/Tocd5X9_WNI/AAAAAAAAAbk/_KD2K49gO6k/s320/CIRP1-1.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-kSkh-nU9m2o/Tocd_0OLNMI/AAAAAAAAAbs/TI4WqiGQNuM/s1600/CIRP2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="442" src="http://2.bp.blogspot.com/-kSkh-nU9m2o/Tocd_0OLNMI/AAAAAAAAAbs/TI4WqiGQNuM/s640/CIRP2.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This software does a great job of placing nodes logically so that the edges (the lines) are organized and neat.&lt;br /&gt;Update: &lt;a href="http://zzascape.com/graph.png"&gt;Here&lt;/a&gt; is a complete output with some cool new modifications.&lt;br /&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-462529497611398081?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/462529497611398081/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/10/creating-graphs-with-perl-and-graphviz.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/462529497611398081'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/462529497611398081'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/10/creating-graphs-with-perl-and-graphviz.html' title='Creating Graphs with Perl and GraphViz'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-ajeA3ciL2sg/Tocd-CpjhlI/AAAAAAAAAbo/OWWi4K8LzjQ/s72-c/CIRP1-2.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-8241535446484080101</id><published>2011-09-30T09:02:00.002-05:00</published><updated>2011-12-09T14:39:45.000-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='statistics'/><category scheme='http://www.blogger.com/atom/ns#' term='surveys'/><category scheme='http://www.blogger.com/atom/ns#' term='correlation'/><title type='text'>A Recipe for Finding Correlates in Large Data Sets</title><content type='html'>The internet has revolutionized intelligence. I've seen articles about how it's making us dumber, and I don't know if that's true, but it's certainly made me spoiled. In the old days if I had a computer problem I would just use brute trial and error, often giving up before finding a solution. Now, I just assume that someone else has already had the same problem and kindly posted the solution on a message board somewhere. So a few Google searches almost always solve the problem. Not this time.&lt;br /&gt;&lt;br /&gt;This problem is a bothersome thing that comes up occasionally, but not often enough that I've taken action on it. 
It happens when I have a large data set to analyze and I want to see what's related to what. It's easy enough in SPSS to generate a correlation table with everything I want to know, but it's &lt;i&gt;too much&lt;/i&gt; information. If there are 100 items on a survey, the correlation matrix is 100x100 = 10,000 cells. Half of them are repeats, but that's still a lot to look at. So I wanted a way to filter out all the results except the correlations above a certain strength.&lt;br /&gt;&lt;br /&gt;I poked around at scripting sites for SPSS, but couldn't find what I was looking for. The idea of writing code in a Basic-like language gives me hives too (don't get me wrong--I grew up on AppleSoft Basic, but somehow using it for this sort of thing just seems wrong). &lt;br /&gt;&lt;br /&gt;So without further ado, here's the solution I found. I'm sure someone has a more elegant one, but this has the virtue of being simple. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: large;"&gt;&lt;b&gt;How-to: Finding Significant Correlates&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The task: take a set of numerical data (possibly with missing values) with column labels in a comma-separated file and produce a list of which variables are correlated with which others, at some given cut-off for the correlation coefficients. Usually we would want to look for coefficients larger (in absolute value) than a certain value.&lt;br /&gt;&lt;br /&gt;Note that some names are definable. I was using CIRP data, so I called my data set that. I'll put the names you can define in bold. Everything else is verbatim. Lines starting with a hash (#) are comments, which you don't need to enter; they just explain what's going on.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step One&lt;/b&gt;&lt;br /&gt;&lt;a href="http://www.r-project.org/"&gt;Download R&lt;/a&gt;, the free stats package, if you don't have it already. 
Launch it to get the command prompt and run these commands (cribbed mostly from &lt;a href="http://www.gardenersown.co.uk/Education/Lectures/R/correl.htm"&gt;this site&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;# choose a file for input data and name it something&lt;br /&gt;&lt;b&gt;cirp&lt;/b&gt;.data=read.csv(file.choose())&lt;br /&gt;&lt;br /&gt;# import the columns into R for analysis &lt;br /&gt;attach(&lt;b&gt;cirp&lt;/b&gt;.data)&lt;br /&gt;&lt;br /&gt;# create a correlation matrix, using pairwise complete observation. other options can be found &lt;a href="http://stat.ethz.ch/R-manual/R-patched/library/stats/html/cor.html"&gt;here&lt;/a&gt;&lt;br /&gt;&lt;b&gt;cirp&lt;/b&gt;.mat = cor(&lt;b&gt;cirp&lt;/b&gt;.data, use ="pairwise.complete.obs")&lt;br /&gt;&lt;br /&gt;# output this potentially huge table to a text file. Note that here you use forward slashes even in Windows&lt;br /&gt;&amp;nbsp;write.table(&lt;b&gt;cirp&lt;/b&gt;.mat,"&lt;b&gt;c:/cirpcor.txt&lt;/b&gt;")&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step Two&lt;/b&gt;&lt;br /&gt;Download ActiveState Perl if you don't have it (that's for Windows).&lt;b&gt; &lt;/b&gt;Run the following script to filter the table. You can change the file names and the threshold value as you like. [Edit: I had to replace the code below with an image because it wasn't rendering right. You can download the script &lt;a href="http://zzascape.com/process.txt"&gt;here&lt;/a&gt;.]&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-oO8KgBjIwE8/TocZZmA_CTI/AAAAAAAAAbg/J9JhCTgTWbE/s1600/process.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="346" src="http://1.bp.blogspot.com/-oO8KgBjIwE8/TocZZmA_CTI/AAAAAAAAAbg/J9JhCTgTWbE/s400/process.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt;Step Three&lt;/b&gt;&lt;br /&gt;Go find the output file you just created. 
It will look like this:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;YRSTUDY2&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;YRSTUDY1 (0.579148687526634)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;YRSTUDY3&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;SATV (0.434618737520563)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;YRSTUDY3&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;SATW (0.491389963307668)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;DISAB2&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;ACTCOMP (-0.513776993632538)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;DISAB4&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;SATV (0.540769639192817)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;DISAB4&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;SATM (0.468981872216475)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: xx-small;"&gt;DISAB4&amp;nbsp;&amp;lt;-&amp;gt;&amp;nbsp;DISAB1 (0.493333333333333)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The variable names are linked by the &amp;lt;-&amp;gt; symbol to show a correlation, and the correlation coefficient is shown in parentheses. If you want the p-value, you'll have to compute that separately.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Step Four (optional)&lt;/b&gt;&lt;br /&gt;Find a nice way to display the results. I am preparing for a board report, and used &lt;a href="http://prezi.com/"&gt;Prezi&lt;/a&gt; to create graphs of connections showing self-reported behaviors, attitudes, and beliefs of a freshman class. Here's a bit of it. A way to improve this display would be to incorporate the frequency of responses as well as the connections between items, perhaps using font size or color. 
[Update: see my &lt;a href="http://highered.blogspot.com/2011/10/creating-graphs-with-perl-and-graphviz.html"&gt;following post&lt;/a&gt; on this topic.]&lt;br /&gt;&lt;b&gt; &lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-v-a-__ed9ng/ToXKGFvFGpI/AAAAAAAAAbc/_UsYB1azdpg/s1600/CIRP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-v-a-__ed9ng/ToXKGFvFGpI/AAAAAAAAAbc/_UsYB1azdpg/s1600/CIRP.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;b&gt; &lt;/b&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-8241535446484080101?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/8241535446484080101/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/09/recipe-for-finding-correlates-in-large.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8241535446484080101'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8241535446484080101'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/09/recipe-for-finding-correlates-in-large.html' title='A Recipe for Finding Correlates in Large Data Sets'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://1.bp.blogspot.com/-oO8KgBjIwE8/TocZZmA_CTI/AAAAAAAAAbg/J9JhCTgTWbE/s72-c/process.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-4802122978734307161</id><published>2011-09-28T19:55:00.001-05:00</published><updated>2011-09-28T19:55:55.956-05:00</updated><title type='text'>SAT Error Rates</title><content type='html'>In "&lt;a href="http://highered.blogspot.com/2011/09/economics-of-imperfect-tests.html"&gt;The Economics of Imperfect Tests&lt;/a&gt;" I explored the consequences of errors when making decisions with a test. By coincidence, the College Board came out with something very similar a few days later: their revised&amp;nbsp;&lt;a href="http://professionals.collegeboard.com/testing/sat/sat-benchmark"&gt;College and Career Readiness Benchmark&lt;/a&gt; &lt;a href="http://professionals.collegeboard.com/profdownload/RR2011-5.pdf"&gt;[1]&lt;/a&gt;. This article and a related &lt;a href="http://professionals.collegeboard.com/data-reports-research/cb/sat-benchmarks"&gt;study&lt;/a&gt;&amp;nbsp;&lt;a href="http://professionals.collegeboard.com/profdownload/pdf/RN-30.pdf"&gt;[2]&lt;/a&gt; from 2007 give statistics that can illuminate the ideas in my prior post.&lt;br /&gt;&lt;br /&gt;When testing for a given criterion, it's essential to be able to check how well the test is working. This is a nice thing about the SAT: since it predicts success in college, we can actually see how well it works. This new benchmark isn't intended for predicting the performance of individual students, but of groups of them. It looks like a bid for the SAT to become a standardized assessment of how well states, school districts, and the like, are preparing students for college. 
One caveat is mentioned in [2] on page 24:&lt;br /&gt;&lt;blockquote&gt;One limitation of the proposed SAT benchmark is that students intending to attend college are more likely to take the SAT and generally have stronger academic credentials than those not taking the exam. This effect is likely to be magnified in states where a low percentage of the student population take the exam, since SAT takers in those states are likely to be high achievers and are less representative of the total student population.&lt;/blockquote&gt;The solution there would be to mandate that everyone has to take the test.&lt;br /&gt;&lt;br /&gt;As with the test for good/counterfeit coins in my prior post, the benchmark is based on a binary decision:&lt;br /&gt;&lt;blockquote&gt;Logistic regression was used to set the SAT benchmarks, using as a criterion a 65 percent probability of obtaining an FYGPA of a B- or higher [...]&lt;/blockquote&gt;The idea is to find an SAT score that gives us statistical assurance that students above this threshold have a 65% probability of having a college GPA of 2.67 or better in their first year of college. There are some complexities in the analysis, including the odd fact that this 65% figure includes students who do not enroll. Of the students who &lt;i&gt;do&lt;/i&gt; enroll, table 4 on page 15 of [1] shows that of those who met the benchmark, 79% were 'good' students having FYGPA of B- or better (i.e. 2.67 or more). For the purposes of rating the quality of large groups of students, I suppose including non-enrolled students makes sense, but I will look at the benchmark from the perspective of trying to understand incoming student abilities or engineer the admissions stream, which means only being concerned with enrolled students.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Using the numbers in the two reports, I tried to find all the conditional probabilities&lt;/b&gt; needed to populate the tree diagram I used in the prior post to illustrate test quality. 
For example, I needed to calculate the proportion of "B- or better" students. I did this three ways, using data from tables in the article, and got 62% to within a percentage point all three times. The article [1] notes that this would be less than 50% if we include those who don't enroll, but that must be an estimate because a student obviously doesn't have a college GPA if they don't enroll.&lt;br /&gt;&lt;br /&gt;Here are the results of my interpretation of the data given. It's easiest to derive the conditional probabilities this direction:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-mlFRZvGSRpA/ToNxPJ-k37I/AAAAAAAAAbM/PGztKvn6Wkk/s1600/SAT1.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="216" src="http://4.bp.blogspot.com/-mlFRZvGSRpA/ToNxPJ-k37I/AAAAAAAAAbM/PGztKvn6Wkk/s400/SAT1.PNG" width="400" /&gt;&amp;nbsp;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;In the tree diagram above, 44% of students pass the benchmark, which I calculated from table 5 on page 16 of [1]. The conditional probabilities on the branches of the tree come from table 4 on the previous page. 
Note that there's a bit of rounding in both these displays. Using Bayes' rule, it's easy to transform the tree to the form I used in the first post.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&amp;nbsp;&lt;a href="http://1.bp.blogspot.com/-EzgqkmSeIRI/ToNxRd54ytI/AAAAAAAAAbQ/VdboLIhRDo8/s1600/SAT2.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="220" src="http://1.bp.blogspot.com/-EzgqkmSeIRI/ToNxRd54ytI/AAAAAAAAAbQ/VdboLIhRDo8/s400/SAT2.PNG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The fraction of 'good' students comes out to 62%, which agrees closely with a calculation from the mean and standard deviation of sampled FYGPA on page 10 of [1], assuming a normal distribution (the tail of "C or worse" grades ends .315 standard deviations left of the mean). It also agrees with the data on page 16 of [1], recalling that high school GPAs are about half a point higher than college GPAs in the aggregate.&lt;br /&gt;&lt;br /&gt;Assuming my reading of the data is right, the benchmark is classifying 35% + 28% = 63% of students correctly, doing a much better job with "C or worse" students than with "B- or better" students. Notice that if we don't use any test at all, and assume that everyone is a "B- or better" student, we'll be right 62% of the time, having a perfect record with the good students and zero accuracy with the others. Accepting only students who exceed the benchmark nets us 79% good students, a 17-percentage-point improvement due to the test, but it means rejecting a lot of good students unnecessarily (44% of them).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;In the prior post I used the analogy of finding counterfeit coins&lt;/b&gt; using an imperfect test. If we use the numbers above, it would be a case where there is a lot of bad currency floating around (38% of it), and on average our transactions, using the test, would leave us with 79 cents on the dollar instead of 62. 
We have to subtract the cost of the test from this bonus, however. It's probably still worth using, but no one would call it an excellent test of counterfeit coins. Nearly half of all good coins are kicked back because they fail the test, which is pretty inefficient, and half the coins that are rejected are good.&lt;br /&gt;&lt;br /&gt;We can create a performance curve&amp;nbsp;using Table 1a from page 3 of [2]. The percentage of B- students is lower here, at about 50% near the benchmark, so I'm not sure how this relates to the numbers in [1] that were used to derive the tree diagrams. But the curves below should be self-consistent at least, coming all from the same data set. They show the ability of the SAT to distinguish between the two types of students.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-nETOdq0Shnk/ToO5UUQuwII/AAAAAAAAAbU/7XjtowJgd4E/s1600/SAT3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-nETOdq0Shnk/ToO5UUQuwII/AAAAAAAAAbU/7XjtowJgd4E/s1600/SAT3.png" /&gt;&lt;/a&gt;&lt;/div&gt;If we set the bar very high, we can be relatively sure that those who meet the threshold are good (B- or better) students, but this comes at a cost in false negatives as we saw before. The "sweet spot" seems to be at 1100, with a &amp;nbsp;65% rate of classifying both groups correctly. 
Using this criterion, it's 15 percentage points better than a random coin toss for predicting both good and poor academic performers.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;It's clear that although the SAT has some statistically detectable merit&lt;/b&gt; as a screening test via this benchmark, it's not really very good at predicting college grades. As others have pointed out, this test has decades of development behind it, and may represent the best there is in standardized testing. Another fact makes the SAT (and ACT) unusual in the catalog of learning outcomes tests: we can check its predictive validity in order to ascertain error rates like the ones above.&lt;br /&gt;&lt;br /&gt;Unlike the SAT, most assessments don't have a way to find error rates because there is no measurable outcome beyond the test itself. Examples include tests of "complex thinking skills," "effective writing," and the like. These are not designed to predict outcomes that have their own intrinsic scalar outputs like college GPA. They often use GPAs as correlates to make the case for validity (ironically sometimes simultaneously declaring that grades are not good assessments of anything), but what exactly is being assessed by the test is left to the imagination. This is a great situation for test-makers because there is ultimately no accountability for the test itself. 
Recall from my previous post that test makers can help their customers show increased performance in two ways: either by helping them improve the product so that the true positives increase (which is impossible if you can't test for a true positive), or by introducing changes that increase the number of positives without regard to whether they are true or not.&lt;br /&gt;&lt;br /&gt;It's ironic that standardized tests of learning are somehow seen as leading to accountability when the tests themselves generally have no accountability for their accuracy.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4802122978734307161?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4802122978734307161/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/09/sat-error-rates.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4802122978734307161'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4802122978734307161'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/09/sat-error-rates.html' title='SAT Error Rates'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-mlFRZvGSRpA/ToNxPJ-k37I/AAAAAAAAAbM/PGztKvn6Wkk/s72-c/SAT1.PNG' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1004666466455142804</id><published>2011-09-09T12:42:00.000-05:00</published><updated>2011-09-10T06:06:34.508-05:00</updated><title type='text'>What to Expect When You're Assessing</title><content type='html'>Along with Kaye Crook and Terri Flateby, I will be leading a one-day pre-institute workshop at the &lt;a href="http://planning.iupui.edu/conferences/national/nationalconf.html"&gt;2011 Assessment Institute&lt;/a&gt; in Indianapolis. This is a large national conference led by Trudy Banta and her team at IUPUI. It runs from October 31 to November 1, with the pre-institute workshops on October 30.&lt;br /&gt;&lt;br /&gt;The description of our workshop "What to Expect When You're Assessing" is:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;div class="MsoNormal"&gt;This workshop is intended for faculty and administrators who have responsibility for administering assessment activities at the program, department, or higher level. Through hands-on activities, participants will learn essential skills for supervision of the whole assessment cycle, including good reporting, tips for data analysis, avoiding assessment pitfalls, good practices with tools like rubrics and curriculum maps, as well as management approaches to get the best out of your team using calendars, policies, and institutional readiness assessment. The workshop is appropriate for those with little assessment experience as well as those who would like to further develop their existing practices to create sustainable and meaningful assessment programs.&lt;/div&gt;&lt;/blockquote&gt;&lt;div class="MsoNormal"&gt;&amp;nbsp;The reason for offering the workshop is to help institutions grow their own expertise in leading assessment processes. 
Because gaining trust of faculty and administrators within the organization is so important, it's a good strategy to find someone who already has that trust and teach them about assessment rather than hiring an assessment expert from outside who then has to win everyone's trust.&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="MsoNormal"&gt;I asked the ASSESS-L email list for their "Must-knows for a new assessment coordinator" (thanks to Katy Hill, Sean A McKitrick, and Rhonda A. Waskeiwicz for their responses). The &lt;a href="http://lsv.uky.edu/scripts/wa.exe?A2=ind1103&amp;amp;L=ASSESS&amp;amp;P=R31622&amp;amp;I=-3"&gt;results&lt;/a&gt; were interesting for a noticeable dearth of technical items, and an emphasis on political and personal skills, some of which actually &lt;i&gt;de-emphasize &lt;/i&gt;technical knowledge, including:&lt;/div&gt;&lt;div class="MsoNormal"&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;It's okay for things to not be perfect.&lt;/li&gt;&lt;li&gt;One has to 'suspend disbelieve' at times with regard to rigor&lt;/li&gt;&lt;/ul&gt;&lt;div class="MsoNormal"&gt;The lists are insightful, and have helped me think about the one-day program we're putting together. 
Roughly, it's about one half technical stuff:&lt;/div&gt;&lt;ul&gt;&lt;li&gt;The basic idea of assessment loops&lt;/li&gt;&lt;li&gt;Common terms and what they mean in practice &lt;/li&gt;&lt;li&gt;How to write good reports&lt;/li&gt;&lt;li&gt;Use of rubrics and curriculum maps&lt;/li&gt;&lt;li&gt;Data analysis and presentation&lt;/li&gt;&lt;/ul&gt;One quarter is strategies for working with people toward a common goal:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Appreciative inquiry&lt;/li&gt;&lt;li&gt;Responding to specific challenges (this was a topic on ASSESS-L too)&lt;/li&gt;&lt;li&gt;Setting expectations&lt;/li&gt;&lt;/ul&gt;The rest, which is heavily represented in the responses to my email, is about management of the process, including:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Assessing institutional readiness&lt;/li&gt;&lt;li&gt;Calendars&lt;/li&gt;&lt;li&gt;Working with other groups on campus (e.g. faculty senate, center for teaching and learning)&lt;/li&gt;&lt;li&gt;Administrative buy-in&lt;/li&gt;&lt;li&gt;What software tools can do and what they can't do&lt;/li&gt;&lt;/ul&gt;The overall objective is for participants to walk out of the workshop with a concrete plan in hand, as well as more resources and contacts that will help them find success.&lt;br /&gt;&lt;br /&gt;I welcome comments or suggestions. More materials will be forthcoming.&lt;br /&gt;&lt;br /&gt;Edit: In addition to the &lt;a href="http://lsv.uky.edu/scripts/wa.exe?S1=assess"&gt;ASSESS-L archive&lt;/a&gt;, there is a wonderful site, &lt;a href="http://www2.acs.ncsu.edu/UPA/assmt/resource.htm"&gt;Internet Resources for Higher Education Outcomes Assessment&lt;/a&gt;, hosted by North Carolina State University and maintained by &lt;a href="http://higheredassessment.com/"&gt;Ephraim Schechter&lt;/a&gt;, a familiar name in assessment circles. 
That page is a familiar open window on my browser, and an essential bookmark for anyone interested in assessment.&lt;br /&gt;&lt;div class="MsoNormal"&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1004666466455142804?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1004666466455142804/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/09/what-to-expect-when-youre-assessing.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1004666466455142804'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1004666466455142804'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/09/what-to-expect-when-youre-assessing.html' title='What to Expect When You&apos;re Assessing'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-8580318635291625937</id><published>2011-09-05T14:17:00.002-05:00</published><updated>2011-09-05T14:17:40.235-05:00</updated><title type='text'>The Economics of Imperfect Tests</title><content type='html'>It's fascinating to me how attracted people are to rankings: colleges, sports teams, best cities to live in, most beautiful people, and so on can seemingly be put in an order. 
Of course, it's ridiculous if you stop to ask if the quality in question could really be as simple as a one-dimensional scalar that can be measured with such precision. But it doesn't stop new lists from being generated. Rank order statistics (e.g. saying that Denver is number one and Charlotte is number two on the list) come with their own sort of confidence intervals, so that we really should be saying "City C is rank R plus or minus E, with probability P." Computing these confidence bounds is &lt;a href="http://ideas.repec.org/a/tsj/stataj/v6y2006i3p309-334.html"&gt;not easy&lt;/a&gt;, and I've never seen it done on one of these lists.&lt;br /&gt;&lt;br /&gt;Leaving aside the issue of bounding error, the generation of the numbers themselves is highly questionable. Often, as with US News rankings of colleges, a bunch of statistics are microwaved together in Frankenstein's covered dish to create the master ranking number. You can read the FAQ on the US News rankings &lt;a href="http://www.usnews.com/education/articles/2010/08/17/frequently-asked-questions-college-rankings#16"&gt;here&lt;/a&gt;. It seems that consumers of these reports are in such a hurry, or have such limited attention spans, that we can only consider one comparative index. That is, we can't simultaneously consider graduation rate in one column and net cost in another to make compromise decisions. Rather, all the variables have assigned weights (by experts, of course), and everything is cooked down into a mush.&lt;br /&gt;&lt;br /&gt;A more substantive example is the use of SAT scores for making decisions about admissions to college (in combination with other factors). In conversations in higher ed circles, SATs are sometimes used as a proxy for the academic potential of a student. It's inarguable that although there is some slight predictive validity for, say, first year college grades, tests like these aren't very good as absolute indicators of ability. 
And so it would seem on the surface of it that the tests are over-valued in the market place. I've argued that this is an opportunity for colleges that want to investigate non-cognitive indicators, or other alternative ways of better valuing student potential.&lt;br /&gt;&lt;br /&gt;But the question I've entertained for the last week is what is the economic effect of an imperfect test. I imagine some economist has dealt with this somewhere, but here's a simple approach.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;How do we make decisions with imperfect information? &lt;/b&gt;We can apply the answer to any number of situations, including college admissions or learning outcomes assessment, but let me take a simpler application as an analogue. Suppose we operate in a market where there is a proportion $g$ of good money and counterfeit, or bad, money $1-g := b$. [You need javascript on to see the formulas properly.] We also have a test that can imperfectly distinguish between the two. I've sketched a diagram to show how well the test works below.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-8NemiYQGVHU/TmTcnX56KGI/AAAAAAAAAbA/sbnyt52TKa0/s1600/coins.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="211" src="http://4.bp.blogspot.com/-8NemiYQGVHU/TmTcnX56KGI/AAAAAAAAAbA/sbnyt52TKa0/s320/coins.JPG" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;A perfect test would avoid two types of errors--false negatives and false positives. These may happen with different rates. Suppose an agent we call the Source goes to the market to exchange coins for goods or services with another agent, the Receiver. Assume that the Receiver only accepts coins that test "good." 
It's interesting to see what happens.&lt;br /&gt;&lt;br /&gt;The fraction of good coins that the Source apparently gives the Receiver is: $gt+b(1-f)$&lt;br /&gt;The fraction of good coins actually received by the Receiver is $gt$&lt;br /&gt;&lt;br /&gt;So the Source will obtain more goods and services in return than are warranted, an excess of $b(1-f)$. The inefficiency can be expressed as the ratio $\frac{b(1-f)}{gt+b(1-f)}$, which is also the conditional probability Pr[false positive|test = "good"]. (Pr means probability, and the vertical bar is read "given.")&lt;br /&gt;&lt;br /&gt;There are two factors in $b(1-f)$: the fraction of bad coins and the false positive rate. So the Source has an incentive to increase both. Increasing the fraction of bad coins is easy to understand. Increasing $1-f$ means trying to fool the test. Students who take SAT prep tests are manipulating this fraction, for example.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;So we have mathematical proof that it's better to give than to receive!&amp;nbsp;&lt;/b&gt;&lt;br /&gt;In a real market, it might work the other direction too. The Receiver might try to fool the Source with a fraction of worthless goods or services in return. In that case, the best test wins, which should lead to an evolutionary advantage to good tests. In fact, in real markets we see that at least for easily-measured quantities like weight.&lt;br /&gt;&lt;br /&gt;In many cases, the test only goes one direction. When you buy a house, for example, there's no issue about counterfeiting money since it all comes from the bank anyway. The only issue is whether your assessment of the value of the house is good. The seller (the Source) has economic incentive to fool your tests by hiding defects and exaggerating the value of upgrades, for example. 
It's interesting to me how poor the tests for real estate value seem to be.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;In terms of college admissions or learning outcomes assessment,&lt;/b&gt; we don't hear much about testing inefficiency. The effects are readily apparent, however. For example, the idea of "teaching to the test" crops up in conversations about standardized testing. If teachers receive economic or other benefit from delivering students that score above certain thresholds on a standardized test, then they are the Source, and the school system (or public) is the&amp;nbsp;Receiver. It's somewhat nebulous what actual quality the tests are testing for, because there isn't any discussion that I can find about the inherent $b(1-f)$ inefficiency that must accompany any less-than-perfect test. For teachers, the supply of "currency" (their students) is fixed, and they don't have any incentive to keep back the "good" currency for themselves. It's a little different from the market scenario described above, but we can easily make the switch. The teachers are motivated to increase the number of tested positives, whether these are true or not. They would also find false negatives galling, which they would see as cheating them out of goods they have delivered. As opposed to the exchange market, they want to increase both $gt$ and $b(1-f)$, not just the latter. They are presumed to have the ability to transmute badly-performing students into academic achievers (shifting the ratio to a higher $g$), and they can also try to fool the test by decreasing $f$. It is generally assumed that the ethical solution is to do the former.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A good case study in this instance is the New York City 2009 Regents exam results&lt;/b&gt;, as described in this &lt;a href="http://online.wsj.com/article/SB10001424052748703445904576117793343465096.html?mod=wsj_share_twitter"&gt;Wall Street Journal article&lt;/a&gt;. 
The charge is made that the teachers manipulated test results to get higher "true" rates, and the evidence given clearly indicates this possibility. The graphs show that students are somehow getting pushed over the gap from not acceptable to acceptable, which is analogous to receiving a "good" result on the test. One of these is reproduced below.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-8CLVk9QQun8/TmTtIEEK0MI/AAAAAAAAAbE/olE1dCBpIVo/s1600/history.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-8CLVk9QQun8/TmTtIEEK0MI/AAAAAAAAAbE/olE1dCBpIVo/s1600/history.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Quoting from the article:&lt;br /&gt;&lt;blockquote&gt;Mr. Rockoff, who reviewed the Regents data, said, "It looks like teachers are pushing kids over the edge. They are very reluctant to fail a kid who needs just one or two points to pass."&lt;/blockquote&gt;This could be construed as either teachers trying to fix false negatives or trying to create false positives. The judgments come down as if it were the latter. The argument that this effect is due to fiddling with $1-f$ instead of increasing $g$ is bolstered by this:&lt;br /&gt;&lt;blockquote&gt;Mr. Rockoff points to the eighth-grade math scores in New York City for 2009, which aren't graded by the students' own teachers. There is no similar clustering at the break point for passing the test. &lt;/blockquote&gt;I find this very interesting because it is assumed that this is the normal situation--that if there were no $1-f$ "test spoofing" going on, then we should see a smooth curve with no bump. 
The implication is that teachers don't have a good understanding of how students will test--that is, despite the incentive to increase the skill levels of students so that they will convert from $b$ to $g$ (and have a better chance of testing "good"), they just don't know how to do it.&lt;br /&gt;&lt;br /&gt;Consider an analogous situation on an assembly line for bags of grain. Your job is to make sure that each bag has no less than a kilo of grain, and you have a scale at your disposal to test. Your strategy would probably be to just top up the bags so they have a kilo of grain, and then move on to the next one. Mr. Rockoff (and probably most of us) assumes that this is not possible for educators. It's an admission that teaching and testing are not very well connected. Otherwise, we would expect to see teachers "topping off" the educational level of students to pass the test and then moving on to other students, to maximize the $gt$ part of their payoff. This quote shows that the school system administrators don't even believe this is possible:&lt;br /&gt;&lt;blockquote&gt;After the audit, the state said it took a series of actions and plans to conduct annual "spike/cluster analysis of scores to identify schools with suspicious results."&lt;/blockquote&gt;It's ironic that on the one hand, this latent assumption questions the value of the tests themselves, and at the same time the system is built around their use. Other language in the article includes such expressions of certitude as this:&lt;br /&gt;&lt;blockquote&gt;Michelle Costa, a high-school math teacher in New York City, said she often hears from friends who teach at other schools who [bump up scores on] tests, though she doesn't do it. "They are really doing the student a disservice since the student has so obviously not mastered the material," she said. &lt;/blockquote&gt;Missing the mark by a couple of points is equated to "obviously not mastering the material." 
There is no discussion about the inherent inefficiencies in the test, although there seems to be a review process that allows for some modification of scores (called scrubbing in the article).&lt;br /&gt;&lt;br /&gt;This situation is designed for teachers to try to affect $f$ as the most sensible approach. Teaching students how to take standardized tests is one way of doing that. &lt;a href="http://performanceassessment.org/consequences/EnglishLanguageArts.pdf"&gt;This critique&lt;/a&gt; of one of the tests makes fascinating reading. &amp;nbsp;Here's a short quote:&lt;br /&gt;&lt;blockquote&gt;Do not attempt to write an original essay.  You don't have time.  Points are awarded      and subtracted on the basis of a formula.  Write the five-paragraph essay, even though      you will never again have a personal or professional occasion to use this format.  It      requires no comprehension of the text you are citing, and you can feel smart for having      wasted no time reflecting on the literature selections.&lt;/blockquote&gt;&lt;b&gt;We don't usually know if the tests are meaningful. &lt;/b&gt;If we did, we would know the ratio $g:b$ both before and after the educational process, and we would be able to tell what the efficiency of the test was. This is essentially a question of test validity, which seems to get short shrift. Maybe it seems like a technicality, and maybe the consumers of the tests don't really understand the problem, but it's essential. Imagine buying a test for&amp;nbsp;counterfeit&amp;nbsp;currency without some known standard against which to judge it!&lt;br /&gt;&lt;br /&gt;In education, the gold standard is predictive validity: we don't really care whether or not Tatiana can multiply single digit numbers on a test because that's not going to happen in real life. 
We care about whether she can use multiplication in some real world situation like calculating taxes, and that she be able to actually do that when it's needed, not in some artificial testing environment. If we identified these outcomes, we could ascertain the efficiency of the test. The College Board publishes reports of this nature, relating test scores to first year college grades, for example. From these reports we can see that the efficiency of the test is quite low, and it's a good guess that most academic standardized tests are equally poor.&lt;br /&gt;&lt;br /&gt;Yet the default assumption is often that the test is 100% efficient, that is $f=t=1$: we always perfectly distinguish $g$ from $b$.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The perspective from the commercial test makers is enlightening. &lt;/b&gt;If the teachers, faced with little way of relating teaching to testing, and no hope of relating that to (probably hypothetical) predicted real outcomes, choose to modify $1-f$ as a strategy, what is the likely motivation of the test makers?&lt;br /&gt;&lt;br /&gt;The efficiency of the test is certainly a selling point. Given the usual vagueness about real predictable outcomes, test makers can perhaps sell the idea that their products really are 100% efficient (no false positives or negatives). An informed consumer would have a clear goal as to the real outcomes that are to be predicted, and demand a &lt;a href="http://en.wikipedia.org/wiki/Receiver_operating_characteristic"&gt;ROC curve&lt;/a&gt; to show how effective the test is. In some situations, like K-12 testing, we have the confused situation of teachers not having a direct relationship to the tests, which have no accountability to predict real outcomes. 
It's similar to trying to be the "best city to live in," by optimizing the formula that produces the rankings.&lt;br /&gt;&lt;br /&gt;Since there is an assumed (but ironically unacknowledged) gap between teaching and testing, even the testing companies have no real incentive to try to improve teaching, improving the $g:b$ ratio. It's far easier for them &lt;i&gt;to sell the means to fool their own tests.&lt;/i&gt;&amp;nbsp;By teaching students how to optimize their time, and deal with complexities of the test itself, which very likely has nothing to do with either the material or any real predictable outcomes, test makers can sell to schools the means to increase the number of reported positives. They are selling ways to raise $t$ and $1-f$ without affecting $g$ at all. Of course, this is an economic advantage and doesn't come for free. Quoting again from the &lt;a href="http://performanceassessment.org/consequences/EnglishLanguageArts.pdf"&gt;critique&lt;/a&gt;&amp;nbsp;of one test:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;[S]everal private, for-profit companies have already developed&amp;nbsp;Regents-specific test-preparation courses for students who can afford their fees [...]&lt;/blockquote&gt;&lt;br /&gt;It's as if instead of trying to distinguish counterfeit coins from good ones, we all engage in trying to fool the test that imperfectly distinguishes between the two. 
That way we can pretend that there's more good money in circulation than there actually is.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-8580318635291625937?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/8580318635291625937/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/09/economics-of-imperfect-tests.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8580318635291625937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8580318635291625937'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/09/economics-of-imperfect-tests.html' title='The Economics of Imperfect Tests'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-8NemiYQGVHU/TmTcnX56KGI/AAAAAAAAAbA/sbnyt52TKa0/s72-c/coins.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1781306244535873012</id><published>2011-09-04T07:45:00.004-05:00</published><updated>2011-09-04T07:45:56.493-05:00</updated><title type='text'>Understanding Assessment through Language</title><content type='html'>Over the summer I wrote a paper that explores the role of language in outcomes assessment: how it can help and how it can get in the way of understanding what's going on. 
This exercise clarified for me the long-term importance of eportfolios, and the risks inherent to abstract forms of assessment. I will look for a publisher eventually, but in the meantime I welcome your comments.&lt;br /&gt;&lt;br /&gt;"&lt;a href="http://zzascape.com/LanguageofAssessment.pdf"&gt;Understanding Assessment through Language&lt;/a&gt;" (.5MB pdf)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1781306244535873012?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1781306244535873012/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/09/understanding-assessment-through.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1781306244535873012'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1781306244535873012'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/09/understanding-assessment-through.html' title='Understanding Assessment through Language'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1031839270411498867</id><published>2011-09-03T11:10:00.002-05:00</published><updated>2011-09-03T11:10:30.454-05:00</updated><title type='text'>The Most Important Problem in Higher Education</title><content type='html'>&lt;a href="http://en.wikipedia.org/wiki/Arrow%27s_impossibility_theorem"&gt;Arrow's Impossibility 
Theorem&lt;/a&gt; is a fascinating application of mathematics to social science. Arrow took up the problem of how to find a voting system that meets certain reasonable criteria, such as taking into account the preferences of more than one voter. He showed that not all of the criteria can be met simultaneously in a single system because it creates a logical contradiction.&lt;br /&gt;&lt;br /&gt;I have thought that it would be interesting to try the same trick with higher education. Informally, we might think of a list of best-case qualities we would wish for an educational system, such as universal access, a mechanism for cost&amp;nbsp;deferral&amp;nbsp;(like loans), public subsidization at a given level, and so on. In these lists, which I scribble on napkins at diners, the hardest quality to come to grips with is the idea of certification: a way of knowing that a particular student showed a certain level of accomplishment.&lt;br /&gt;&lt;br /&gt;The idea of accomplishment is subtle. Here are at least three ways of looking at it:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;We might be interested in what a person can do in the future, as a prospective employer would. Can Tatiana synthesize organic compounds? In this sense, accomplishment has to do with the predictive validity of inducing future performance from past performance. Sports statistics embody this idea: a batting average in baseball, for example.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Another effect of accomplishment is the usefulness of experience in being a consultant. If Tatiana swam the English Channel back in 1981, she might not be able to do it again today, but she could probably tell you some important things to know about it.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;A third effect is social standing. Accomplishment brings its own rewards in terms of access to more connected people, the media, and so on.&amp;nbsp;&lt;/li&gt;&lt;/ol&gt;All of these effects usually derive from indicators of direct accomplishment. 
This isn't always the case, of course. Some people are famous just for being famous. This latter phenomenon should be seen as&amp;nbsp;deleterious because it dilutes and obfuscates the real value of accomplishment (perhaps related to&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Gresham%27s_law"&gt;Gresham's Law&lt;/a&gt;?).&lt;br /&gt;&lt;br /&gt;This last issue is confounded by the fact that accomplishments are often judged subjectively. Someone might think the screenplay I wrote is wonderful, but that would be a minority view. We can disagree about subjective assessments, so 'accomplishment' itself can be very fuzzy. Even in the case where some putative objective fact is presented (e.g. swimming the English Channel), there is room for debate about the significance of that accomplishment, and what it means in terms of the effects listed above. The common denominator is to keep track of evidence of the accomplishment itself (we would call it authentic assessment data in higher ed land), and let the endorsements and comments ebb and flow with the tides. So if Lincoln's Gettysburg Address receives poor reviews the day after, we still have it around 150 years later to review for ourselves. The historiography of accomplishments is almost as interesting as the things themselves.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Existing systems for tracking accomplishment are inadequate to the task.&lt;/b&gt;&amp;nbsp;Let's look at some of them.&lt;br /&gt;&lt;br /&gt;In higher education, we issue diplomas and certificates, sometimes with decorations like &lt;i&gt;cum laude&lt;/i&gt;, and identifying an area of study. The reputation of the issuing institution lends itself to the holder of the degree, and some subject areas are worth more than others. This is rather like a guild system, a gate-keeper approach. 
Some of the problems are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;It doesn't allow for autodidacts who learn outside the system, and so demands a significant investment in time and money.&lt;/li&gt;&lt;li&gt;There isn't much 'partial credit': completing 99% of requirements does not translate to 99% accomplishment since you don't get a diploma.&lt;/li&gt;&lt;li&gt;It's too coarse-grained: we need more information than "John got an engineering degree from State U," or even transcripts.&lt;/li&gt;&lt;li&gt;The lending of reputation from a school's name to the individual distorts the actual accomplishment of the individual.&amp;nbsp;&lt;/li&gt;&lt;li&gt;It's very expensive and time-consuming to become certified.&lt;/li&gt;&lt;/ul&gt;At higher levels, the choice of a dissertation advisor can matter greatly, which leads to a second sort of certification: the letter of recommendation and other ways of personally endorsing someone. Problems with this approach include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;There is no limit to how many letters one can write.&lt;/li&gt;&lt;li&gt;There is incentive to be overly generous in letters you do write, or at least there is a disincentive to be negative.&lt;/li&gt;&lt;li&gt;It's very difficult to compare letters from different endorsers since they are not standardized.&lt;/li&gt;&lt;/ul&gt;Self-certification using a resume to document or summarize one's accomplishments is unsatisfactory because the claims have to be checked for accuracy, leading to one of the other forms of tracking. There are often significant penalties for getting caught in outright fabrication, but the problem of verifying the truth of claims remains difficult and time-consuming. &lt;br /&gt;&lt;br /&gt;If we were interested only in predictive validity of statements like "Tatiana can speak French fluently," then the ideal case would be to have a perfect testing system to ascertain proficiency only in areas of interest. 
This is the &lt;a href="http://www.wgu.edu/"&gt;Western Governors&amp;nbsp;University&lt;/a&gt;&amp;nbsp;approach, and it also exists in the form of professional board exams. However, these are coarse-grained hurdles that have to be overcome to get credit for a course or enter a profession, and the skills actually needed for a job may have only a passing resemblance to the test. For example, there are many, many types of engineering jobs, and the board exams cannot possibly cover the skill sets in that level of detail. This results in a rather arbitrary, and certainly inefficient, standard of accomplishment.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What are some requirements for an accomplishment tracking system?&lt;/b&gt;&lt;br /&gt;Here's a starter list.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Identity is valid. We have to be confident that the accomplishments weren't outsourced or bought on eBay and then the credit transferred to someone else.&lt;/li&gt;&lt;li&gt;The system has to be open and transparent, so that we don't create a bottleneck that is just another gatekeeper.&lt;/li&gt;&lt;li&gt;The system has to be self-correcting with respect to inflation of accomplishments and outright errors.&lt;/li&gt;&lt;li&gt;Individuals have the final say over their own accomplishment profiles.&lt;/li&gt;&lt;/ol&gt;We can probably think of more requirements, but this is enough to begin with. The identity problem can really only be solved with personal contact. That is, a person has to be convinced that some other person did something. No amount of biometrics is ever going to solve that problem, I predict.&lt;br /&gt;&lt;br /&gt;The system or systems should be web-based and easily accessed. There shouldn't be fees or artificial walls to viewing profiles. Of course, per number four above, profiles should be able to be restricted by their owners, like current social media sites sort of allow. It also implies that there is no negative information about an individual, only positive. 
This seems like a drawback, and I might be wrong about it, but I think allowing negative feedback (which necessarily allows others to say things about you publicly that you don't like) is fraught with problems.&lt;br /&gt;&lt;br /&gt;So one possibility is a lightweight endorsement system that works something like the science pre-print site&amp;nbsp;&lt;a href="http://arxiv.org/"&gt;arxiv.org&lt;/a&gt;&amp;nbsp;mashed up with Facebook. Well, like &lt;a href="http://academia.edu/"&gt;academia.edu&lt;/a&gt;, come to think of it.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;I do some work that I think shows off my skills and upload a record of it to the system under my profile. This would necessarily be electronic, but could consist of video or any other kind of media--not just papers.&lt;/li&gt;&lt;li&gt;A portfolio-like central index keeps track of all my stuff, with pointers and meta-data about the works I have listed in my resume.&amp;nbsp;&lt;/li&gt;&lt;li&gt;The index system also allows for endorsements by other registered users. Everyone's ID is real, verified by credit cards and phone numbers or something (optionally a unique SSL certificate).&amp;nbsp;&lt;/li&gt;&lt;li&gt;Endorsements from experts would claim to authenticate that this is indeed my work, and say something appropriate about it that demonstrates a connection with the work itself. &lt;/li&gt;&lt;li&gt;All of this is standardized to the extent possible, and if the user allows it, fed out to a data export.&lt;/li&gt;&lt;/ol&gt;The result could look something like LinkedIn's recommendation system mixed up with Academia.edu's portfolio/vita.&lt;br /&gt;&lt;br /&gt;On the surface, it seems like reciprocal recommendations would be a problem, as it tends to be in journals (citing each other's papers or including as co-authors). But with transparency, I suspect that clever people would create metrics that would attempt to summarize the evidence and connections that exist in the body of work. 
By tracing links of recommendations to create a network of association, it ought to be possible to see how worthwhile they are. It would be far from perfect, of course. But the possibility exists for a real "market" of reputation, where one's own standing is linked to those whom one has endorsed. It would be like picking stocks. In this way, there is an incentive for high-standing individuals to endorse newcomers who show promise. It probably doesn't prevent fads from distorting the 'reputation market', but if there is a relationship to the real world, eventually it becomes self-correcting. (This doesn't apply to all fields, and that problem may be unsolvable there simply because the discipline is more or less defined by the fads that exist at that time.)&lt;br /&gt;&lt;br /&gt;The beauty of a transparent system is that a company or university wanting to evaluate a person's record could either use an off-the-shelf metric provided by one of these&amp;nbsp;hypothetical&amp;nbsp;companies that would spring up, &lt;i&gt;or&lt;/i&gt;&amp;nbsp;they could just look at the evidence themselves. Or they could hire their own experts to look at the portfolios. There is a sliding scale between how quick the evaluation is and how customized it is, and the best solution for one circumstance may be different from another. I assume this would be followed by an interview, which could be very substantive and deal directly with the evidence of performance that pertains to the position.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;A good test case would be teaching evaluation.&lt;/b&gt;&amp;nbsp; I haven't seen a public teaching portfolio. Maybe I just haven't come across one, but it's never occurred to me to post mine either, until now. I have big binders with student projects I've supervised. Why not post the best of those? 
&amp;nbsp;The creation of a teaching-record portal would be a real service to higher education to standardize and validate that important part of academia.&lt;br /&gt;&lt;br /&gt;Although I talked about 'a system', it would almost have to be many systems, each associated with some professional endeavor. This still allows for a resume to link all of the pieces together if necessary. In education, learning outcomes assessments would be a necessary component.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The rewards for figuring out how to accurately and efficiently certify accomplishment &lt;/b&gt;are worth the trouble. Education systems could be revolutionized to be more flexible and transparent, less bureaucratic and guild-like. Someone who teaches herself would not be valued less than someone who learned at a high-priced university--the proof would be in the sweet dairy dessert dish, as the saying goes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1031839270411498867?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1031839270411498867/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/09/most-important-problem-in-higher.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1031839270411498867'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1031839270411498867'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/09/most-important-problem-in-higher.html' title='The Most Important Problem in Higher Education'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-7372607862678294989</id><published>2011-08-19T09:14:00.000-05:00</published><updated>2011-08-19T09:14:51.476-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='directory'/><category scheme='http://www.blogger.com/atom/ns#' term='web'/><title type='text'>Building a Web Directory</title><content type='html'>The project that consumed a large chunk of my time this summer was being project manager for our new &lt;a href="http://www.jcsu.edu/"&gt;university web site&lt;/a&gt;. It launched August 1 (on time and on budget, I'm happy to report), after a very intense summer of teamwork. &lt;a href="http://www.knowmad.com/"&gt;Knowmad Technologies&lt;/a&gt; is the web development company that we partnered with to identify strategies, do design and architecture work, and deploy the site in a &lt;a href="http://www.webgui.org/"&gt;WebGUI&lt;/a&gt; content&amp;nbsp;management&amp;nbsp;system. On the university side I am fortunate to have a talented and dedicated team: Josh Nypaver and Erin Phipps. I took on small pieces of the project myself; one of those was the site directory. Together with William McKee at Knowmad, we created an end-to-end process for creating and maintaining a web-based directory of staff, faculty, and offices. Here's how it works:&lt;br /&gt;&lt;br /&gt;1. I built a light-weight forms solution in openIGOR. The usual way of handling form data is to create a database table to save the entries. In order to maintain flexibility at the cost of query ability, I used plain text files instead, with a name=value syntax, like FirstName=David. Nothing is ever deleted--when entries are edited, it just appends the new version. This allows for complete roll-back and journaling. 
It also allows me to change the form without destroying a link to database tables.&amp;nbsp; Most of the form is shown below.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-50VU_WtlghU/Tk5sQoicUxI/AAAAAAAAAa0/b4iWJxXMHmk/s1600/directory.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-50VU_WtlghU/Tk5sQoicUxI/AAAAAAAAAa0/b4iWJxXMHmk/s1600/directory.png" /&gt;&lt;/a&gt;&lt;/div&gt;2. Users log in to IGOR using their normal authentication through LDAP. This gives me lots of information about them since IGOR keeps track of who belongs to what group. So we know who's a biology professor and who works in the financial aid office, and so on. This is important because the directory is searchable by this criterion. Even better, these memberships are controlled by group administrators, so I don't have to personally modify them (although I spent several days cleaning it up, and we still don't have a great way to purge ex-employees other than manual changes).&lt;br /&gt;&lt;br /&gt;3. When the user saves their directory information, the server sends an email to the web editor, who has administrative access to fix spelling and grammar and to standardize building names. For now, photos have to be edited by hand to meet the specification.&lt;br /&gt;&lt;br /&gt;4. A Perl script is run to download all of this, as well as the names of groups for the directory (there's a flag to say which ones those are--we don't want committees showing up on the public web), and the index that says who is in what group. The three files (directory, groups, and index) are comma delimited text files.&lt;br /&gt;&lt;br /&gt;5. The directory data along with edited photos is uploaded to the web server. Two scripts that William wrote put everything in place. Once that's done, it's live.&lt;br /&gt;&lt;br /&gt;6. 
There are two sorts of queries that can be run on the web directory, which you can find here: (click the image)&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.jcsu.edu/directory"&gt;&lt;img border="0" height="198" src="http://3.bp.blogspot.com/-hY-mpzfkp0U/Tk5th8FSSBI/AAAAAAAAAa4/NeYjjVnCvlE/s400/directory2.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The first searches on all fields, and the second just on the group name relationships. This allows us to display office directories as well as more general searches. This was William's wonderful idea, and it gives us a simple implementation that is very powerful.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Conclusions&lt;/b&gt;&lt;br /&gt;It's hard to keep directories up to date, but by giving staff and faculty members direct control, we can shift responsibility from IT to the users. It seems to make them happy too. So far, the system has worked flawlessly, and the web directory is all I could have hoped it would be. Not all of the information from the IGOR form is yet on the web, and some of the fields really need validation or restricted options (like building names). 
But as just one element in a huge web project, we're quite happy with how it turned out.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-7372607862678294989?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/7372607862678294989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/08/building-web-directory.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7372607862678294989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7372607862678294989'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/08/building-web-directory.html' title='Building a Web Directory'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-50VU_WtlghU/Tk5sQoicUxI/AAAAAAAAAa0/b4iWJxXMHmk/s72-c/directory.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-2576733522019935584</id><published>2011-07-01T17:19:00.000-05:00</published><updated>2011-07-01T17:19:10.053-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='e-learning'/><category scheme='http://www.blogger.com/atom/ns#' term='technology'/><category scheme='http://www.blogger.com/atom/ns#' term='teaching'/><title type='text'>Building a Teaching and Learning Technology lab</title><content 
type='html'>Thanks to our always-helpful government grants office, we were able to find money to develop a technology lab for teaching and learning. We brainstormed for a way to augment our existing installations, which already provide a lot of technology for course instructors. Given our new foray into online learning, lecture capture seemed the best bet--it's focused,&amp;nbsp;manageable&amp;nbsp;under our budget, and can give us an immediate payoff. It's also scalable in application from simple video recording to much more sophisticated productions. It will shine a light on our distribution system for electronic media too, and prompt infrastructure improvements that will have multiplier effects.&lt;br /&gt;&lt;br /&gt;"Lecture capture" is of course a terrible name for it. Lecturing means literally reading, which is what our academic forebears did before there were enough books to go around. Like wearing those heavy black robes in the May sun, some ideas are ripe for overhaul.&lt;br /&gt;&lt;br /&gt;There's a nice introduction by Educause &lt;a href="http://net.educause.edu/ir/library/pdf/ELI7044.pdf"&gt;here&lt;/a&gt;&amp;nbsp;(pdf), if you want to see what they have to say.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What if we started from scratch?&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;It's not just the robes and readings that have seeped into the weave of the academic tapestry. Exposing young minds to the conceptual treasures of our disciplines is constrained by the combinatorics of time and space, so that certain optimizations are so standardized that we don't think about them. Like having students sit for fifty minutes at a stretch in classrooms. What if instead we could engage them for ten minutes at a time on focused topics all day long? 
Another optimization is that all students in the class get the same linear flow of narration, and the discipline itself gets strung out into a one-dimensional string, codified into textbooks and syllabi that have the same linear style.&lt;br /&gt;&lt;br /&gt;Imagine if the world wide web were like that: a single linked list of information, so that you had to start at the beginning and work your way forward link by link until you got where you wanted to go. It's a lousy way to organize rich information.&lt;br /&gt;&lt;br /&gt;Sometimes concepts do have prerequisites. Math is full of it. But even in math there are optional topics, and opportunities to go into more depth in even the simplest subject. If you want, you can think of the long multiplication process you learned in third grade as a convolution product, which can be optimized with Fourier transforms. One of my professors in graduate school (David Kammler) took an approach he compared to "Island hopping" as practiced in the Pacific theater in WWII, which consisted of learning certain important topics in depth and assuming that students could then make logical leaps to nearby topics, which could be skipped or treated as exercises.&lt;br /&gt;&lt;br /&gt;Imagine that instead of a linear trip through a discipline, we captured key ideas and used those as anchors for a network of related facts, concepts, processes, etc. One immediate advantage would be the ability to easily update the network as the discipline changes. Group evolution is back in vogue? Add a node and stick it in there. No need to fell a forest for the new edition of a book. Another advantage is that focused topics can be crowdsourced, as in Wikipedia, with experts in sub-sub-sub-fields filling in blanks ad infinitum. 
Other thoughts:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;interdisciplinary connections become easier to integrate naturally&lt;/li&gt;&lt;li&gt;because learning happens in detail, it's easier to assess a small topic than a big one, allowing for faster improvement of the captures and related&amp;nbsp;pedagogy&lt;/li&gt;&lt;li&gt;custom methods can be applied more easily to particular topics. Think apps. When you get to the bit about singular value decompositions, you can download the stuff you need to see it in action, and then it can go away. It's modular.&lt;/li&gt;&lt;li&gt;It creates a way for professionals to agree on something. A whole textbook? Never happen. But a five minute tour of Darwinian evolution? Better odds. This standardizes at the right level. You can still customize your curriculum--just match assessments to content.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;All of this is probably being done in bits and pieces now, this divide and&amp;nbsp;conquer&amp;nbsp;approach. But it hasn't gelled yet into a common framework supported by real infrastructure and common culture (like say Wikipedia is).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I don't want to go all&amp;nbsp;Utopian&amp;nbsp;here, but let me push this one step further.&lt;/b&gt; With a rich network of high quality instruction/assessment/activities modules, the whole idea of a course becomes outdated. I know, I know, that's scary stuff. But if we don't have to have students sitting in desks for fifty minutes at a time, why do we need them for 15 weeks? All that really matters is that they learn and demonstrate mastery of concepts that we think are important. So then our whole bureaucracy can go hard a-port and do something really interesting. 
All that bricks and mortarboard infrastructure can be focused on &lt;i&gt;motivating students.&lt;/i&gt;&amp;nbsp;I've written enough about that already, so I'll stop here.&lt;br /&gt;&lt;br /&gt;I should hasten to add that our modest project at my institution has none of these grand ambitions. &amp;nbsp;I will be selling the idea that a "lecture capture" doesn't need to be 50 minutes long, however, and keeping my eye on the horizon...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-2576733522019935584?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/2576733522019935584/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/07/building-teaching-and-learning.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2576733522019935584'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2576733522019935584'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/07/building-teaching-and-learning.html' title='Building a Teaching and Learning Technology lab'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-48444866262979097</id><published>2011-07-01T06:39:00.003-05:00</published><updated>2011-07-01T16:17:47.810-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='games'/><category scheme='http://www.blogger.com/atom/ns#' 
term='creativity'/><category scheme='http://www.blogger.com/atom/ns#' term='complexity'/><title type='text'>Teaching 9th graders operations research</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-qd-XB03VFW0/Tg2uXwfxo8I/AAAAAAAAAY4/Rvc1uJJ6GOQ/s1600/DSCF4624.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="240" src="http://1.bp.blogspot.com/-qd-XB03VFW0/Tg2uXwfxo8I/AAAAAAAAAY4/Rvc1uJJ6GOQ/s320/DSCF4624.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;Our two-week project with Upward Bound came to a close yesterday. I've been working with Soumia Ichoua on her NSF-supported project to try to interest high schoolers in the fun side of math.&lt;br /&gt;&lt;br /&gt;More to come after we've analyzed the assessment results and put the paper together.&lt;br /&gt;&lt;br /&gt;&lt;div class="MsoNormal" style="mso-layout-grid-align: none; text-align: justify;"&gt;&lt;b style="mso-bidi-font-weight: normal;"&gt;Acknowledgements&lt;/b&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="mso-layout-grid-align: none; text-align: justify;"&gt;Financial support for this work was provided by the National Science Foundation (NSF) through grant number 0927129 (Ichoua). 
This support is gratefully acknowledged.&amp;nbsp;&lt;b style="mso-bidi-font-weight: normal;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-48444866262979097?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/48444866262979097/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/07/teaching-9th-graders-operations.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/48444866262979097'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/48444866262979097'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/07/teaching-9th-graders-operations.html' title='Teaching 9th graders operations research'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-qd-XB03VFW0/Tg2uXwfxo8I/AAAAAAAAAY4/Rvc1uJJ6GOQ/s72-c/DSCF4624.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1984205442692214073</id><published>2011-06-10T10:06:00.000-05:00</published><updated>2011-06-10T10:06:20.197-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='aalhe'/><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><category scheme='http://www.blogger.com/atom/ns#' term='public policy'/><category 
scheme='http://www.blogger.com/atom/ns#' term='higher education'/><title type='text'>An Addendum and  Apology</title><content type='html'>On Wednesday I &lt;a href="http://highered.blogspot.com/2011/06/policy-questions-raised-at-aalhe.html"&gt;wrote&lt;/a&gt; about my take-aways from the AALHE meeting in Lexington, and drew on some remarks from Trudy Banta. Some of the responses I got were justly critical of the way I mangled Trudy's message about the wider importance of the SAT example. It's true--I botched it, and I'd like to set the record straight so that Trudy doesn't begin to doubt her communications skills.&lt;br /&gt;&lt;br /&gt;In order not to further fold, spindle, or mutilate someone else's message, let me say that what follows is my interpretation, and any faults should be ascribed to the author alone.&lt;br /&gt;&lt;br /&gt;In a comment on the still-infantile state of the art of measurement, Trudy observed that even after more than 80 years of development and efforts to improve the SAT for its intended purpose, there is as much disagreement as ever about the validity of the SAT as a predictor of success in college.  I intended to emphasize this point and what it portends for other projects that are even more ambitious. In the original article I left this argument dangling rather ineptly. So here it is, starting with a closer look at the SAT, generalizing to the characteristics of industrialized tests (meaning massive, usually commercial, standardized instruments), and ending with how we can do better with more authentic assessments.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Even with all the research that has gone into the SAT&lt;/b&gt;, in absolute terms it still isn't very good. The &lt;a href="http://professionals.collegeboard.com/profdownload/Validity_of_the_SAT_for_Predicting_First_Year_College_Grade_Point_Average.pdf"&gt;2008 validity study&lt;/a&gt; from the College Board gives us (pg. 
2):&lt;br /&gt;&lt;blockquote&gt;[T]he weighted&amp;nbsp;average correlation between SAT writing scores and English&amp;nbsp;composition course grades was 0.32, after correcting for&amp;nbsp;range restriction.&lt;/blockquote&gt;This is the most direct link I could find in the study between SAT and learning outcomes. Here we have actual writing in a standardized environment, correlated with course grades in coursework where writing is taught. If the course grade is related to how well students can write, then their potential as demonstrated by the SAT writing component ought to line up with it. And it does, but only to the tune of explaining 10% of the variance in grades (.32 squared). More generally, the variance in first year college grades explained by all the components of the SAT combined is 25% (page 5, squaring the adjusted R).&lt;br /&gt;&lt;br /&gt;Any predictor with an R greater than zero may be useful in the right context. But if we consider SAT's performance as an upper limit to what industrialized tests can do, it's a warning for current and contemplated high-stakes applications. This is a good place to&amp;nbsp;segue&amp;nbsp;to the misstep&amp;nbsp;in my prior article.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I gave three advantages of standardized tests.&lt;/b&gt; Here they are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;They can have good reliability, so they seem to measure something.&lt;/li&gt;&lt;li&gt;Because of this, they can create self-sustaining corporations that provide a standardized service to higher education, so there are fixed points of comparison.&lt;/li&gt;&lt;li&gt;Even if the validity isn't stellar, some information is better than none, right?&lt;/li&gt;&lt;/ul&gt;I was still thinking mostly of the SAT when I wrote this, so you might consider this the best case scenario. 
However, I intended it to be clear (it wasn't) that this is damning with faint praise: although these bullets describe how standardized tests have colonized an ecological niche successfully, these reasons aren't sufficient to warrant their use for high-stakes purposes. Like value-added comparisons of institutions, for example.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The missing "anti-bullets" that describe the corresponding &lt;i&gt;disadvantages &lt;/i&gt;are respectively:&lt;/b&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Depending solely on reliability is like searching for your lost keys under the lamp because that's where the light is. The drive for reliability creates artificial conditions like timed multiple-choice tests that have very little mechanical relationship to real-world application of knowledge. So reliability comes at a cost in validity.&amp;nbsp;&lt;/li&gt;&lt;li&gt;The size of the testing industry means that it can use its resources to circumvent professional review of its products and sell them directly to politicians.&amp;nbsp;&lt;/li&gt;&lt;li&gt;The validity comment is specious: "some" information isn't good enough in many circumstances. Imagine driving on a cliff's edge and only knowing where 10% of the road is. (An unfortunate real case is the publication of ranks of teacher "value-added" scores by the LA Times last year.) Exacerbating the general dearth of validity is the fact that validity has to be determined &lt;i&gt;locally&lt;/i&gt; after application of the test results (that is, to decide if some proposition about the actual results can be supported or not). It's very hard to do. Most of the time we don't know if our applications are valid or not.&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;I would like to apologize to Trudy for bungling the transition between my summary of her remarks and the bullets in the original. I detracted from the point and may have misled or confused some readers. 
&lt;i&gt;Mea culpa.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;We can now step beyond the problems and look for solutions&lt;/b&gt;, and Trudy gave a list of some of these alternative approaches. Reliability isn't produced in a factory in Iowa City: we can get good reliability using rubrics on authentic student work. Additionally, the convenience of industrialized testing can nowadays be matched with local technological solutions. If we want to use portfolios, we don't have to have a thousand file cabinets, or mail copies to prospective employers; it's all on the web.&lt;br /&gt;&lt;br /&gt;The big win for local solutions is validity: with the right application of technique, we can link assessments solidly with pedagogy at many levels of resolution, from individual assignments up to the institutional level.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;As noted above, the testing industry has a lot of ability to protect its own interests, including great access to policy makers. With that in mind, I should wrap back around to the &amp;nbsp;theme of the policy conversations at the conference: the calls for accountability. It's not good enough to just resist the K-12-like solution; we have to find an alternative we find acceptable, and there are some good candidates available.&amp;nbsp;That, I hope, is a clear and unmangled statement of the main message.&lt;br /&gt;&lt;br /&gt;I don't mean to demonize "big testing," by the way. Maybe the best solution would be if we could convince them that authentic assessment and electronic portfolios are profitable new markets. 
I don't know what the chances of that are.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1984205442692214073?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1984205442692214073/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/06/addendum-and-apology.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1984205442692214073'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1984205442692214073'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/06/addendum-and-apology.html' title='An Addendum and  Apology'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-5272225278834913387</id><published>2011-06-09T05:15:00.001-05:00</published><updated>2011-06-09T05:17:27.842-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='higher education'/><title type='text'>Inputs and Outputs</title><content type='html'>There's an interesting &lt;a href="http://www.newyorker.com/arts/critics/atlarge/2011/06/06/110606crat_atlarge_menand#ixzz1OlqMOugr"&gt;article in The New Yorker&lt;/a&gt;. 
There's much more to it, but this quote gets to the heart of the experience for many classroom instructors at all levels, I imagine:&lt;br /&gt;&lt;blockquote&gt;Professor X is shrewd about the reasons it’s hard to teach underprepared students how to write. “I have come to think,” he says, “that the two most crucial ingredients in the mysterious mix that makes a good writer may be (1) having read enough throughout a lifetime to have internalized the rhythms of the written word, and (2) refining the ability to mimic those rhythms.” This makes sense. If you read a lot of sentences, then you start to think in sentences, and if you think in sentences, then you can write sentences, because you know what a sentence sounds like. Someone who has reached the age of eighteen or twenty and has never been a reader is not going to become a writer in fifteen weeks.&lt;/blockquote&gt;The source here is the book by "Professor X" called &lt;i&gt;&lt;a href="http://www.amazon.com/Basement-Ivory-Tower-Confessions-Accidental/dp/067002256X/ref=sr_1_1?ie=UTF8&amp;amp;qid=1307614599&amp;amp;sr=8-1"&gt;In the Basement of the Ivory Tower&lt;/a&gt;&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;In Seattle Education there is a &lt;a href="http://seattleducation2010.wordpress.com/2011/05/26/the-big-picture-privating-education-part-one-of-three/"&gt;three-parter on privatizing education&lt;/a&gt;. There's a great game-theory example I didn't know about:&lt;br /&gt;&lt;blockquote&gt;Proponents claim that by encouraging competition, privatization can improve the efficiency of public services. But there can be serious drawbacks. For instance, before fire departments were publicly run, groups of firefighters sometimes set fires just to earn money by putting them out!&lt;/blockquote&gt;&amp;nbsp;The evidence provided is more suggestive than compelling, but it's a nice business model (from the point of view of investors anyway) to control both supply and demand. 
In education this would translate as institutions not really caring much about education, but really caring about credentialism, and then seeking to maximize the number of credentials available and perceived as necessary to employment. In other words, if education is a gate to the promised land, put up as many toll booths as possible on the road between here and there.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-5272225278834913387?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/5272225278834913387/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/06/professor-x.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/5272225278834913387'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/5272225278834913387'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/06/professor-x.html' title='Inputs and Outputs'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6504108301491104996</id><published>2011-06-08T08:38:00.001-05:00</published><updated>2011-06-10T11:25:10.300-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='aalhe'/><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><category scheme='http://www.blogger.com/atom/ns#' term='public policy'/><category 
scheme='http://www.blogger.com/atom/ns#' term='higher education'/><title type='text'>Policy Questions Raised at AALHE</title><content type='html'>I flew back yesterday from Lexington and the first AALHE conference. I found it very stimulating. I put faces to names from the ASSESS list server, which was delightful.&lt;br /&gt;&lt;br /&gt;In the opening plenary, Trudy Banta gave us a broad perspective on the evolution of measurement and accountability, pointing out the weaknesses of value-added derivations and standardized tests in particular, and suggesting authentic assessment (e.g. portfolios and their analyses) as a useful alternative.&lt;br /&gt;&lt;br /&gt;One point is particularly compelling to me. Trudy mentioned the pedigree of the SAT, and it's not hard to imagine the many hours and dollars that have gone into fine-tuning this test. These are smart people working with a will for a long time toward a well-defined purpose: predicting how well high school students will do in college. In my own experience as an IR person, the SAT does add some predictive strength to linear models, but not much once high school GPA is considered. A few percentage points of R^2 is it. At my present institution, it's virtually worthless as a predictor of first-year grades, which points also to the known biases of the test.&lt;br /&gt;&lt;br /&gt;In short, there is some usefulness to the SAT, but it may not warrant all the trouble and expense. And of course some schools are now SAT-optional. I've written before about how, as a market signal, SAT overprices some students and overlooks others, creating the opportunity to use other (e.g. non-cognitive) indicators to find good students.&lt;br /&gt;&lt;br /&gt;Trudy's comments did not go this far, but it's not hard to connect the dots. It's an important point: if all this effort yields so little result, maybe we're doing it wrong. 
The alternative is to admit that maybe this is the best we can do, and that our ways of knowing will just never be much good.&lt;br /&gt;&lt;br /&gt;It should be noted that the plenary was planned as a kind of debate with two viewpoints, but that because of a cancellation by Mark Chun, only one side was presented. So in the defense of big standardized tests, here are some advantages:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;They can have good reliability, so they seem to measure &lt;i&gt;something.&lt;/i&gt;&lt;/li&gt;&lt;li&gt;Because of this, they can create self-sustaining corporations that provide a standardized service to higher education, so there are fixed points of comparison.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Even if the validity isn't stellar, some information is better than none, right?&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;The conversation is much more subtle than "down with standardized tests!" But you may have noticed that fine distinctions aren't part of the public debate on the quality of higher education. It would be great if Mark and Trudy and other experts introduced a public discussion like this on a message board or something--somewhere that the complexities of the issues could be fully explored and commented on over time.&lt;br /&gt;&lt;br /&gt;[Update, see my "&lt;a href="http://highered.blogspot.com/2011/06/addendum-and-apology.html"&gt;Apology and Addendum&lt;/a&gt;" that goes with this section.]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Bookending the conference, the closing plenary was given by David Paris,&lt;/b&gt; and also included a historical and political overview of where we are now. He is the executive director of the New Leadership Alliance for Student Learning and Accountability. You can sign up for email updates on their &lt;a href="http://www.newleadershipalliance.org/"&gt;website&lt;/a&gt;. Paraphrasing David, the Obama administration talks nicer, but wants the same things as the Bush administration. 
And what that seems to be is "accountability." One important take-away for me is that we (higher education) have a window of opportunity that will soon close. That is, we can "do it to ourselves" or "have it done to us." The 'it' is amorphous, but centers on the accountability for cost and learning trade-off. In other words, all this public noise about how bad college is will create changes, and we should try to shape those.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's hard to disagree with that. One approach is the Alliance's Institutional Certification Program, which you can learn about &lt;a href="http://www.newleadershipalliance.org/what_we_do/excellent_practice_in_student_learning_assessment/"&gt;here&lt;/a&gt;. I'd like to see some examples of how it works in practice, but on the face of it, it seems like a great idea, similar in some aspects to the Bologna 'centering' process in Europe. I would call it a kind of horizontal professionalization, complementary to the 'vertical' type provided by discipline-based accreditations and professional organizations. Overall, it seems like a serious and well-reasoned approach, and I look forward to finding out more about it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;There was a good Q&amp;amp;A afterwards&lt;/b&gt;, and I suggested that higher education needs access to data pertaining to what happens to students after graduation, and that the federal bureaucracy sits on a gold mine of such information. This didn't have a chance to become a real discussion because of the format, but my interpretation of David's response was that the student-level record project that was floated and shot down (with help from higher education lobbyists) would have solved that problem. So we are complicit in causing the problem.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I don't buy this. 
First of all, what we need most right now is a deep historical understanding of how higher education has affected lives after school (graduation or not), for individual institutions if not individual majors. I understand that the data are imperfect and that there are privacy issues to be solved. But I think 9/11 shows that any privacy concerns can be solved rather quickly if the motivation is there. Specifically, it seems like it should be possible to combine student loan records, FAFSAs and PELL grant records, tax records, federal employee records (including military) and so on to link where students took instruction, what their demographics were, and what happened to them afterwards. There are plenty of other places to find data, I'm sure, like state system databases.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The payoff for this is potentially enormous. Suppose we (higher education) can agree with the politicos that employment is a good outcome of a college education, and even talk about the details like what kind, how much pay, where geographically, and so on. Then we could look at what kinds of schools have what kinds of effects on what kinds of students. Is a biology program at my (private) university worth the cost premium over the public one down the road? What about access? Which schools are economic engines by taking low-income inputs and (eventually)&amp;nbsp;producing high-income outputs?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The payoff is so great that I think that if the government really can't find a way to do this, we should figure out how to do it ourselves.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What we have now is survey data that shows that college generally pays for itself, but the price is going up. 
This is in opposition to the cries of critics that say students aren't learning much, if anything.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Here's my analogy.&lt;/b&gt; It's imperfect, but has the advantage of being vivid. Students are like movie scripts coming to our production companies. We have our specialties: niche art films or mass market gloss, and so on. Each script has an individual experience, no matter what our organization is. They get cast differently (different instructors), and we do our best to make a good fit usually. Or maybe we just get the cheapest actors we can find and hope for the best. We always have to rewrite the script to some extent, and our mission--the hope that gets buried in the Sisyphean rolling of semesters up and down the calendar--is that the screenplay comes to life and realizes its potential.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It takes a lot of money to make this happen, and it comes from investors who are increasingly grumpy. They say the movies are no good. They're too expensive and nobody likes them. Film critics like Arum and Roksa used data from a large number of scripts in production to claim that a large portion weren't being significantly improved in the rewriting stage.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But we don't have any real information about what the audience thinks. We can analyze the heck out of our own productions, and do six-sigma and professionalize all we want, but until we understand what the audience thinks, we don't really know if we're doing any good. Maybe all those films get shown once, and then end up in a storage shed.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Maybe some types of films shouldn't be made anymore. 
Maybe Forensic Dolphin Psychoanalysis shouldn't even &lt;i&gt;be &lt;/i&gt;a major because there's no audience for it.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The recent explosion of for-profits makes the problem more urgent to solve. Most industries have to depend on their products working. We don't have much direct evidence one way or the other in the kind of detail we need to make decisions. It's starting to happen, for example with student loan default rates getting attention. But we can go a lot further than that.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Imagine if we could sit down with the investors and agree on the long-term goals, have metrics that more or less tell us how well we're doing, and plan together how to get there. Are the goals&amp;nbsp;strictly economic? Or do we want people to pursue happiness and bear arms? Even if it &lt;i&gt;is&lt;/i&gt; just economics, it presents an opportunity to really understand how, say, liberal arts education matters in job mobility, lifetime income, and so on. Institutions could say "yes, you'll get a fine job, but in addition you'll have a&amp;nbsp;fulfilling&amp;nbsp;career." Or whatever their mission is. Instead of talking past each other about standardized assessments and authentic assessments, we could figure out how to work backwards from the real goals to real assessments that matter at the strategic level and then add institutional flavors that matter to us. That would be an exciting and productive conversation.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The alternative is grim.&lt;/b&gt;&amp;nbsp;If we only focus on what goes on inside our production studios, the future of the nation is at the mercy of every critic with a theory or an agenda. I'm not sure which is worse: theories or agendas. 
Some will want to break down every step of movie making into a reductionist flow-chart, and create spreadsheets to show the rate of script-to-casting time or use biometrics to calculate charisma factors of the actors. There's no bottom to this because although movie scripts are only 120 pages, individual brains of our students have perhaps 100 trillion synapses each. If each one is, say, eight bits of information, that's about&lt;i&gt; four billion&lt;/i&gt; times the complexity of a movie script. Each one. Others will work backwards from agendas to create policies that make no sense in context.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Even if we think we've solved the problem and get universal agreement that our learning outcomes are being achieved at the university, how do we really know what the effect is after graduation unless we &lt;i&gt;measure that&lt;/i&gt;? Maybe they're learning the wrong things. Maybe some small college has methods that work twice as well as ours. Maybe we can reduce costs and increase quality. The ingredients are wonderful: a diverse ecology of isolated experiments. We just can't see where the lab results are recorded in order to draw conclusions.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yes, we need individual student records. But we can't wait for that. The problem is too important. Maybe tax records and FAFSAs can't be mashed up because of political or technical reasons (but do you really think Mark Zuckerberg couldn't figure this out in an afternoon?). 
If that's the case, then we have to find another way to measure historical and ongoing long-term outputs in such detail that it can inform institutional decisions.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What we're doing at present is creating our own dramatic screenplay: an epic version of "&lt;a href="http://en.wikipedia.org/wiki/No_Exit"&gt;No Exit&lt;/a&gt;."&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6504108301491104996?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6504108301491104996/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/06/policy-questions-raised-at-aalhe.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6504108301491104996'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6504108301491104996'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/06/policy-questions-raised-at-aalhe.html' title='Policy Questions Raised at AALHE'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-250085408784603781</id><published>2011-05-23T08:52:00.000-05:00</published><updated>2011-05-23T08:52:57.564-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='course evaluation'/><title type='text'>Evaluation Oddity</title><content type='html'>This year we 
changed our course evaluation form from a very long list of management questions ("Did the instructor meet office hours regularly?") to a short one focused on learning outcomes. The report below is from my fall 2010 Calculus II class.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-8oEfb0qCR_c/Tdpj9Nuv3ZI/AAAAAAAAAXs/noE3A9cC4og/s1600/eval4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="529" src="http://3.bp.blogspot.com/-8oEfb0qCR_c/Tdpj9Nuv3ZI/AAAAAAAAAXs/noE3A9cC4og/s640/eval4.png" width="640" /&gt;&amp;nbsp;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;That's the whole form, except for free-form written responses. As a first look at the meaningfulness of results, I looked at the correlation between responses for each item, and got a surprise. 
To wit: while items 9, 10, and 11 are tightly correlated, as shown in the matrix below,&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-t0NqyHPGkQU/TdpkoKzTJpI/AAAAAAAAAX0/ngh-MUKJMZs/s1600/eval2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="99" src="http://2.bp.blogspot.com/-t0NqyHPGkQU/TdpkoKzTJpI/AAAAAAAAAX0/ngh-MUKJMZs/s200/eval2.png" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;and furthermore, items 12 and 13 are correlated at .88,&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-E7dBZit3pqY/TdplXKw-9dI/AAAAAAAAAX8/sRjmZ0BF_D8/s1600/eval3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="102" src="http://1.bp.blogspot.com/-E7dBZit3pqY/TdplXKw-9dI/AAAAAAAAAX8/sRjmZ0BF_D8/s640/eval3.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&amp;nbsp;these two summative questions (12 and 13) do not correlate well at all with any of the other items, including 9, 10, and 11, which ask whether the course was interesting, whether it was enjoyable, and whether the student learned. Do students not associate those things with their summative evaluations, or is there some other register that is engaged (i.e. they perceive that the last two questions are more personal, about the instructor instead of about themselves)? 
On the other hand, the very low correlation between item 2 (How much effort did you put into the course?) and the overall evaluation seems to be a good sign, implying that students differentiate between their own performance and that of the instructor.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-250085408784603781?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/250085408784603781/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/05/evaluation-oddity.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/250085408784603781'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/250085408784603781'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/05/evaluation-oddity.html' title='Evaluation Oddity'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-8oEfb0qCR_c/Tdpj9Nuv3ZI/AAAAAAAAAXs/noE3A9cC4og/s72-c/eval4.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-8136745044693032088</id><published>2011-05-03T12:46:00.001-05:00</published><updated>2011-05-03T12:47:15.637-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='accreditation'/><category scheme='http://www.blogger.com/atom/ns#' term='SACS'/><title type='text'>SACS Changes to the 
Principles Proposed</title><content type='html'>A few weeks ago I posted about recommended changes to our SACS-COC accreditation standards. Today the review committee's recommendations for change were announced. You can find the document &lt;a href="http://www.sacscoc.org/principles%20worksheet.pdf"&gt;here&lt;/a&gt;. There is a comment period on this document open until May 18.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-8136745044693032088?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/8136745044693032088/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/05/sacs-changes-to-principles.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8136745044693032088'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/8136745044693032088'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/05/sacs-changes-to-principles.html' title='SACS Changes to the Principles Proposed'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-2909383419055909562</id><published>2011-04-29T12:26:00.000-05:00</published><updated>2011-04-29T12:26:53.335-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='word cloud'/><title type='text'>Wordle Again</title><content type='html'>I mentioned &lt;a 
href="http://wordle.net/"&gt;Wordle.net&lt;/a&gt; a while back. It's a way to create 'word clouds' from text. It occurred to me this afternoon that it would be neat to put course descriptions in from the catalog for each major. I tried a couple of them, shown below.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-N4k8Agwx2Qg/Tbr0fz8FT2I/AAAAAAAAAXk/YIRNBnJsw_k/s1600/history.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="259" src="http://2.bp.blogspot.com/-N4k8Agwx2Qg/Tbr0fz8FT2I/AAAAAAAAAXk/YIRNBnJsw_k/s400/history.PNG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-8CeFO8TC-kA/Tbr0oMb1swI/AAAAAAAAAXo/a7xA-r3nglE/s1600/art.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="266" src="http://3.bp.blogspot.com/-8CeFO8TC-kA/Tbr0oMb1swI/AAAAAAAAAXo/a7xA-r3nglE/s400/art.PNG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-2909383419055909562?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/2909383419055909562/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/04/wordle-again.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2909383419055909562'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2909383419055909562'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/04/wordle-again.html' title='Wordle 
Again'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-N4k8Agwx2Qg/Tbr0fz8FT2I/AAAAAAAAAXk/YIRNBnJsw_k/s72-c/history.PNG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-5432114043705693979</id><published>2011-04-29T08:38:00.001-05:00</published><updated>2011-05-02T13:14:16.144-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='effectiveness'/><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><category scheme='http://www.blogger.com/atom/ns#' term='QEP'/><category scheme='http://www.blogger.com/atom/ns#' term='SACS'/><title type='text'>Small College Initiative</title><content type='html'>I attended the SACS Commission on Colleges &lt;i&gt;Small College Initiative&lt;/i&gt; this week. You can find the slides on the SACS web site &lt;a href="http://www.sacscoc.org/SCI.asp"&gt;here&lt;/a&gt;. Below are some of my notes and observations.&lt;br /&gt;&lt;br /&gt;Mike Johnson talked about CR 2.5 (Institutional Effectiveness) and pointed out the difference between assessment and evaluation. In my interpretation of his remarks, the former is gathering data and the latter is using it to draw conclusions for action. In my experience, this gap is where many IE cycles break down. 
Signs of this are "Actions for Improvement" that:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Are missing altogether&lt;/li&gt;&lt;li&gt;Are too general or vague to be put into practice&lt;/li&gt;&lt;li&gt;Report that everything is fine, and no improvements are necessary&lt;/li&gt;&lt;li&gt;Suggest only improvements to the assessments&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;I have a lot more to say about this (there's a surprise), and am preparing a talk for the Assessment Institute and a paper for NILOA on related topics.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Another important point is that IE cannot be outsourced or even in-sourced to a director. The whole point is that it is a &lt;i&gt;collaborative&lt;/i&gt; exercise in striving to achieve goals. I think results are proportional to participation. In a similar vein, Mike noted that computer software can help organize reporting, but doesn't magically solve the problem of generating quality IE loops. Garbage in = garbage out.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A wonderful suggestion was to use the creation of "board books" as a way to encapsulate IE reports in a natural way &lt;i&gt;that's already being done&lt;/i&gt;. Mike's larger point here is that we already have many &lt;i&gt;real&lt;/i&gt;&amp;nbsp;IE processes--all institutions that manage to survive use data one way or another--and there's no need to create an artificial one for reporting. I saw this during a review, where the institution had wonderful processes in place, but didn't include that documentation in the compliance certification, and instead reduced all that rich information into a four-column grid that "looks like it's supposed to." Of course, one problem here (in my opinion) is that there doesn't seem to be a standardized way to look at IE processes. If we were serious about it, we'd do inter-rater reliability studies and create tight rubrics with lots of examples in a library, showing what's acceptable and what's not. 
I think this would go a long way toward reducing the number of out-of-compliance findings. Way back when--over a decade ago--I heard a SACS VP complaining that even back then, IE had been around a long time and &lt;i&gt;colleges should know what to do by now.&lt;/i&gt;&amp;nbsp;That's true as far as it goes, but it should be acknowledged that: 1) it's very hard to satisfy committees, and 2) it's not entirely clear what is acceptable and what's not. Part of the problem is that while the theory of IE loops is easy to understand, practice is far more difficult. Sort of like Socialism. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There was a clarification that it &lt;i&gt;is&lt;/i&gt;&amp;nbsp;acceptable for institutions to sample programs for reporting 3.3.1.1 in lieu of reporting outcomes for every single one. There is supposed to be a policy statement about this on the web site, but I couldn't find it after several minutes of searching the list for 3.3.1, effectiveness, outcomes, sampling, etc. If someone finds it, please let me know. The main thing is that the sample should be representative and not look like it was cherry-picked (e.g. reporting only programs that have discipline-based accreditation).&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It was noted that CS 2.10 implicitly has learning outcomes reporting requirements, making it a pseudo-IE standard. I included this in my recommendations for 'fixes' to the Principles in my letter to SACS, posted &lt;a href="http://highered.blogspot.com/2011/03/improving-principles-of-accreditation.html"&gt;here&lt;/a&gt; for comment. Not many institutions seem to be flunking it, though, unlike 3.3.1.1 (see below).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;The fifth year report&lt;/b&gt; was highlighted in a break-out session. 
You can find additional slides on this topic on the website &lt;a href="http://www.sacscoc.org/staff/cbaird/Whats%20This%20Fifth-Year%20Interim%20Review.pdf"&gt;here&lt;/a&gt;.&amp;nbsp; Out of 39 institutions, 28 were cited on 3.3.1.1, and alarmingly, the citation rate on the QEP Impact Report is 33%. Although this says that the review process is no piece of cake (which is good--it should be meaningful), it points to a problem. In fact, the rationale for the Small College Initiative is to help address this problem, which is particularly acute for small schools. As a side note, over lunch I talked to an IR director who speculated that there is a bias against citing large schools, particularly ones with high rankings. It would be really interesting, in conjunction with the inter-rater reliability study I fantasized about above, to have &lt;i&gt;blind&lt;/i&gt;&amp;nbsp;reviews of 3.3.1.1. Given the growing emphasis on student learning outcomes (including the new credit-hour rules), a whole separate system for learning outcomes may need to be developed. One of the challenges on the horizon, in my view, is the contradiction of grades. On the one hand they are the basic unit of merit for courses, with a vast bureaucracy behind them. On the other hand, grades are not seen as 'real' assessments. This needs to be fixed. I don't know if Western Governors University's model is the answer, but what we have now makes no sense, and it is impossible to explain to the public.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;b&gt;Reasons given for flunking a QEP&lt;/b&gt; included:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Bad planning, which leads to a bad report. One kind of bad plan is one that's too broad.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Failure to execute it, e.g. if a new administration comes in and lacks enthusiasm for the old project&lt;/li&gt;&lt;li&gt;Not talking about goals and outcomes in the report. 
Hard to believe.&lt;/li&gt;&lt;li&gt;Not describing the implementation (just narrating the creation, perhaps)&lt;/li&gt;&lt;li&gt;Not collecting or using data&lt;/li&gt;&lt;li&gt;Bad writing. Ironic, since so many QEPs are about writing.&lt;/li&gt;&lt;/ul&gt;&amp;nbsp;Tips for writing QEP impact reports:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Follow the directions given in the SACS policy&lt;/li&gt;&lt;li&gt;Address all the elements&lt;/li&gt;&lt;li&gt;Keep narrative to 10 pages. (&lt;strike&gt;You can apparently link out to other documents, which I hadn't heard before. I thought everything had to be in 10 pages&lt;/strike&gt;.) [Edit: see the update below]&lt;/li&gt;&lt;li&gt;Use data, but include analysis--don't just put in graphs with no explanation.&lt;/li&gt;&lt;/ul&gt;&lt;b&gt;Networking over lunch,&lt;/b&gt; I gleaned a couple of nifty ideas. At one institution, faculty contracts include a 'gotcha' clause, which stipulates that if assessment reports are not done by date X, then the prof has to stick around until date Y to finish them. This provides an incentive to get them done. Also, the reports are broken down into phases across the academic year, so that not everything is done at once. Smart.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Update: &lt;/b&gt;Mike Johnson posted a note to the list server saying that the links in the 10 page (max) Impact Report can only be internal to the report itself, which does not allow 'extra room'. In his words:&lt;br /&gt;&lt;blockquote&gt; Links within a disk or flash drive are okay as long as the documents  that are part of the link are included in the ten page maximum length.  
So please do not use hyperlinks to documents as a means to lengthen the  report.&lt;/blockquote&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-5432114043705693979?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/5432114043705693979/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/04/small-college-initiative.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/5432114043705693979'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/5432114043705693979'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/04/small-college-initiative.html' title='Small College Initiative'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-4463431770089235132</id><published>2011-04-28T06:54:00.001-05:00</published><updated>2011-04-28T07:10:42.558-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IQ'/><category scheme='http://www.blogger.com/atom/ns#' term='FACS'/><category scheme='http://www.blogger.com/atom/ns#' term='noncognitive'/><category scheme='http://www.blogger.com/atom/ns#' term='motivation'/><category scheme='http://www.blogger.com/atom/ns#' term='intelligence'/><title type='text'>Motivation and Intelligence</title><content type='html'>Angela Duckworth et al have a new article in the Proceedings 
of the National Academy of Sciences (PNAS) entitled "&lt;a href="http://www.pnas.org/content/early/2011/04/19/1018601108"&gt;Role of test motivation in intelligence testing&lt;/a&gt;." The abstract reads, in part:&lt;br /&gt;&lt;blockquote&gt;Intelligence tests are widely assumed to measure maximal intellectual performance, and predictive associations between intelligence quotient (IQ) scores and later-life outcomes are typically interpreted as unbiased estimates of the effect of intellectual ability on academic, professional, and social life outcomes. The current investigation critically examines these assumptions and finds evidence against both.&amp;nbsp;[...]&amp;nbsp;After adjusting for the influence of test motivation, however, the predictive validity of intelligence for life outcomes was significantly diminished, particularly for nonacademic outcomes.&lt;/blockquote&gt;The press on the paper includes Science Daily's "&lt;a href="http://www.sciencedaily.com/releases/2011/04/110427171638.htm"&gt;Motivation Plays a Critical Role in Determining IQ Test Scores&lt;/a&gt;," and Discover's blog post "&lt;a href="http://blogs.discovermagazine.com/notrocketscience/2011/04/26/iq-scores-reflect-motivation-as-well-as-intelligence/"&gt;IQ scores reflect motivation as well as 'intelligence&lt;/a&gt;'."&lt;br /&gt;&lt;br /&gt;&lt;b&gt;We include 'Effort' in our faculty-assessed end-of-semester &lt;a href="http://highered.blogspot.com/search/label/FACS"&gt;FACS&lt;/a&gt; survey&lt;/b&gt;, and have found a link between grades and this effort rating. Of course, it could just be that professors who think students work hard also tend to give them higher grades, so over the summer we will look at multi-year correlations to eliminate that confounding factor.&lt;br /&gt;&lt;br /&gt;The graphs below (courtesy of Google charts) show GPA in red, and hours completed in blue above the distribution bars for rated student effort across all classes. 
The heights of the bars give the percent of the distribution that received that rating. The left one is our first survey, Spring 2009. The one on the right is from Fall 2010. The sample size has increased as we've gotten better participation.&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://chart.apis.google.com/chart?cht=bvg&amp;amp;chs=150x250&amp;amp;chxt=x,y&amp;amp;chxl=0:|Min|Low|Good|Great&amp;amp;chco=4D89F9&amp;amp;chd=t:17,36,35,10&amp;amp;chm=t76,0000FF,0,0,10|t85,0000FF,0,1,10|t87,0000FF,0,2,10|t94,0000FF,0,3,10|t2.1,FF0000,0,0,10,,b::15|t2.58,FF0000,0,1,10,,b::15|t2.89,FF0000,0,2,10,,b::15|t3.32,FF0000,0,3,10,,b::15&amp;amp;chtt=Effort|N=720" /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;img src="http://chart.apis.google.com/chart?cht=bvg&amp;amp;chs=150x250&amp;amp;chxt=x,y&amp;amp;chxl=0:|Min|Low|Good|Great&amp;amp;chco=4D89F9&amp;amp;chd=t:14,20,46,17&amp;amp;chm=t50,0000FF,0,0,10|t59,0000FF,0,1,10|t56,0000FF,0,2,10|t47,0000FF,0,3,10|t2.01,FF0000,0,0,10,,b::15|t2.46,FF0000,0,1,10,,b::15|t2.94,FF0000,0,2,10,,b::15|t3.46,FF0000,0,3,10,,b::15&amp;amp;chtt=Effort|N=1457" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: left;"&gt;The drop in credits earned is due to more first year students being included in the sample. The year-by-year story is similar, except that the overall averages have an interesting shape as ecological samples from first year to fourth:&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://chart.apis.google.com/chart?cht=bvg&amp;amp;chxr=1,0,3,.5&amp;amp;chs=150x250&amp;amp;chxt=x,y&amp;amp;chxl=0:|1st|2nd|3rd|4th&amp;amp;chco=4D89F9&amp;amp;chd=t:60,53.5,50.7,57.3&amp;amp;chm=t1.8,0000FF,0,0,10|t1.6,0000FF,0,1,10|t1.5,0000FF,0,2,10|t1.7,0000FF,0,3,10&amp;amp;chtt=Effort|Average+by+Year" /&gt; &lt;/div&gt;&lt;br /&gt;The first year students in the graph are the first class to fall under the new (much higher) admissions standards. 
The number is the average effort rating on a scale of zero (minimum effort) to three (great effort). This is for N=1403, Fall 2010. Note that there is a survivorship bias, so that we'd expect the averages to grow as the time-in-school increases. I don't yet have true longitudinal data.&lt;br /&gt;&lt;br /&gt;Inter-rater reliability was measured by finding the frequency of exact matches for two instructors rating the same student. There were 385 instances of this, with a match rate of 50.7%. It's not hard to find the rate of pure-chance matches (dot-product the distribution with itself), but I haven't done that. In the past, the chance of matching randomly has been around 35%. See &lt;a href="http://zzascape.com/elephant.pdf"&gt;this source&lt;/a&gt; for more on that.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4463431770089235132?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4463431770089235132/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/04/motivation-and-intelligence.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4463431770089235132'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4463431770089235132'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/04/motivation-and-intelligence.html' title='Motivation and Intelligence'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' 
src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-7554740389720605243</id><published>2011-04-13T07:23:00.000-05:00</published><updated>2011-04-13T07:23:42.008-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='creativity'/><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><category scheme='http://www.blogger.com/atom/ns#' term='learning outcomes'/><title type='text'>Getting to Expression</title><content type='html'>&amp;nbsp;Barbara Fister's &lt;a href="http://www.insidehighered.com/blogs/library_babel_fish/why_the_research_paper_isn_t_working"&gt;"Why the 'Research Paper' Isn't Working&lt;/a&gt;" has some interesting observations about the teaching and assessment of composition, and I made the connection to the deductive/inductive divide I've been going on about lately. Let me reframe the latter as the "language/expression" divide as a preface:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A &lt;i&gt;language&lt;/i&gt; is a set of knowledge that usually comprises vocabulary, methods, reference points of common knowledge, and a web of connections between concepts. Understanding a language is always a prerequisite to being able to produce it intelligently.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;Expression&lt;/i&gt; is the illumination of new ideas, new connections, creation of new parts of the language to contribute to the existing corpus. It is realized with different styles in varying degrees of fluency, and allows the display of insights or brilliance.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;This is the analytical/deductive vs. creative/inductive divide that I've blogged about before, for example in the &lt;a href="http://highered.blogspot.com/2011/04/creativity-as-pinnacle-of-learning.html"&gt;previous post&lt;/a&gt;. We see this division everywhere. 
Learning how to use paint versus expressing yourself in the medium.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A "life-long" learner must either become used to learning new languages all the time or else not plan to live long. My 'stack' of languages to learn currently includes R programming, German, and photography. I think of it in over-generalized terms as a progression from confusion to understanding to expression. I'll come back to that idea.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The "photography language" is one I dabbled in when it meant smelly chemicals and a long time between when you shot a photo and when you got to look at it. Nowadays digital photography obviates many of the skills that one needed, and it does something else very important. &amp;nbsp;It's not any &lt;i&gt;easier&lt;/i&gt;&amp;nbsp;to do digital photography--you have to master software instead of stop bath--but it's so much quicker to get from the snap to the view that &lt;i&gt;you can learn from trial and error in real time. &lt;/i&gt;This is a huge advantage. Conservatively, the gap between taking a shot on film and holding a print in your hands is at minimum several hours (leaving aside Polaroids or other quickie formats). With digital it's a matter of seconds. 
So learning the language through sheer trial and error has been accelerated by a factor of, say 2 hrs/2 seconds = 3600.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-LIj2PfxM5TQ/TaWGaB2OnYI/AAAAAAAAAXg/65da3vMr3vY/s1600/laughpaint.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="146" src="http://2.bp.blogspot.com/-LIj2PfxM5TQ/TaWGaB2OnYI/AAAAAAAAAXg/65da3vMr3vY/s400/laughpaint.JPG" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;Photo: David Eubanks, &lt;a href="http://creativecommons.org/licenses/by/2.0/"&gt;some rights reserved&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div&gt;&lt;div&gt;Different learners approach learning language in different ways. Some people like to read all the manuals first, and others start pushing buttons. My wife (laughing at the end of a long work day, above) learned Italian by working all the exercises in two textbooks and then spending a month in Italy. I struggle along with German because I don't have the patience to memorize vocabulary. I try to bridge the confusion/understanding divide by reading novels translated into German (it makes the language &lt;i&gt;much &lt;/i&gt;simpler), and look up words that come up frequently enough. 
Her way is much more efficient than mine.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, in this epistemological vivisection of learning, the challenge for faculty is to teach and assess the crossing of two metaphorical bridges:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;b&gt;Land o' confusion -&amp;gt; Understanding -&amp;gt; Expression&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In &lt;a href="http://highered.blogspot.com/2011/03/complexity-as-pedagogy.html"&gt;"Complexity as Pedagogy"&lt;/a&gt; I showed how it's possible to take a very narrow road straight to Expression. That is, one can encapsulate a small part of the language and use it to get right to the fun part. Because, let's face it, creating is fun! And if anything distinguishes humans from the rest of the biological kingdom, it's our&amp;nbsp;blabbing--we like to talk.&lt;/div&gt;&lt;blockquote&gt;An art professor once told me how to learn to draw. He said, just draw your hand over and over again in different positions. After about 500 times, you should be pretty good at it. I don't know if he was joking or not, but this is an example of simplifying the language to the point where you can quickly become expressive.&lt;/blockquote&gt;&lt;div&gt;&lt;b&gt;The practice of assessment should be very different across this divide.&lt;/b&gt; Testing language fluency can take many forms, but it's always about correctness, speed, conformity to convention, and so on. One is not supposed to be creative on a spelling test. Otherwise I would have gotten better grades in grade school. Similarly, we're not supposed to invent better names for state capitals for that test, or help the Germans organize the genders of their nouns better.&amp;nbsp;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Assessing language seems easy because of this necessary emphasis on mastering form. 
Vocabulary tests, concept inventories, and the like are easily administered, and even testing understanding of subtle connections through the use of the language itself is straightforward.&amp;nbsp;&lt;/div&gt;&lt;blockquote&gt;Example: In teaching logic, it's simple to write down a logical argument and ask students to justify each step with an axiom or theorem, or even let them find errors with the proof. The only way students can be successful is if they have a good understanding of the language.&lt;/blockquote&gt;&lt;div&gt;&lt;b&gt;This ease of assessment is a bane, and a great peril to learning. &lt;/b&gt;Let me finally get to the points I liked about the article I cited way back at the beginning of this piece. Starting with the idea of forcing students to master arcane rules of correct citations, the author notes more broadly that&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;I have long agreed with Richard Larson who wrote way back in 1982 that the research paper as taught in college is an artificial genre, one that works at cross-purposes to actually developing respect for evidence-based reasoning, a measured appreciation for negotiating ideas that are in conflict, or original thought.&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;An artificial genre that is at cross-purposes with original thought.&lt;/i&gt; That's pretty damning. But it's these very mechanics of &lt;i&gt;any &lt;/i&gt;language that are easily defined, easy to get agreement on, and easy to assess. It's a quick slide down the slope to standardization of a form that becomes&amp;nbsp;inimical&amp;nbsp;to the actual intent of the enterprise! This happens all over the place. Whole subjects taught in school exist only because of such inertia, like Geometry in high school--there's no reason kids should be learning plane geometry with rulers and protractors in this day and age, but it's been so deeply standardized that it's become part of the culture. 
But I digress.&lt;br /&gt;&lt;br /&gt;Barbara goes on to illustrate the point with a fascinating example of how students react to the low-complexity standard we've set institutionally:&lt;br /&gt;&lt;blockquote&gt;I hate it when students who have hit on a novel and interesting way of looking at an issue tell me they have to change their topic because they can’t find sources that say exactly what they plan to say. I try to persuade them otherwise, but they believe that original ideas are not allowed in “research.” How messed up is that? The other and, sadly, more frequent reference desk wince-making moment involves a student needing help finding sources for a paper he’s already written. Most commonly, students pull together a bunch of sources, many of which they barely understand on a topic they know little about, and do their best to mash the contents up into the required number of pages.&lt;/blockquote&gt;Does this sound like a road map from Confusion to Expression? It doesn't to me--it sounds like a Skinner Box: mash the button to get the food pellet.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;It shouldn't be hard to fix this problem. &lt;/b&gt;That's the good news. The recipe is simple:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Focus the language on a small useful subset. In terms of composition, it would mean picking a topic that's narrow enough to actually learn something about quickly.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Demonstrate and assess--with feedback!--fluency in this new language. Have conversations about confusion levels, what you know you know, and what you know you don't know. There are lots of creative ways to organize this with mind maps and such, and it also can be fodder for oral presentations, or other engagement activities. &lt;i&gt;Develop fluency in real time.&lt;br /&gt;&lt;/i&gt;&lt;/li&gt;&lt;li&gt;Emphasize expression and creativity over form as far as it can be pushed. This isn't always possible, e.g. 
in logic--you have to be 100% correct--which is why the focus is so important. If you have to absolutely master some topic in order to be creative, make it a small one.&amp;nbsp;&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;Note that I am not advocating "free form creativity" devoid of any content or ultimate value. This might be fun for the students, but I don't see how it accomplishes any useful learning objectives. But there's a lot of road between many of our current practices and goofing off in the name of creativity.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Barbara's ending paragraph is&amp;nbsp;apposite:&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;But if you want first year college students to understand what sources are for and why they matter, if you want them to develop curiosity and respect for evidence, your best bet is to start by tossing that generic research paper. As for those who will complain that students should have learned how to paraphrase and cite sources in their first semester – we’ve tried to do that for decades, and it hasn’t worked yet. 
Isn’t it time to try something else?&lt;/blockquote&gt;Yes.&amp;nbsp;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-7554740389720605243?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/7554740389720605243/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/04/getting-to-expression.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7554740389720605243'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7554740389720605243'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/04/getting-to-expression.html' title='Getting to Expression'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-LIj2PfxM5TQ/TaWGaB2OnYI/AAAAAAAAAXg/65da3vMr3vY/s72-c/laughpaint.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-7784183579507283847</id><published>2011-04-07T05:48:00.000-05:00</published><updated>2011-04-07T05:48:52.220-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='creativity'/><title type='text'>Creativity as the Pinnacle of Learning</title><content type='html'>In a &lt;a href="http://highered.blogspot.com/2011/03/memory-as-slo.html"&gt;recent post on the topic of memory&lt;/a&gt;, I noted that this skill was at the 
base of the revised Bloom's&amp;nbsp;Taxonomy&amp;nbsp;(or &lt;a href="http://highered.blogspot.com/2009/06/its-all-in-name-comic.html"&gt;taxidermy&lt;/a&gt;?). This morning I woke up thinking about the other end: creativity. I've mused about the role of creativity before, and how to teach and assess it (&lt;a href="http://highered.blogspot.com/search/label/creativity"&gt;here&lt;/a&gt; and &lt;a href="http://highered.blogspot.com/search?q=creativity+creatively"&gt;here&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;It's easy to make the unwarranted leap from creativity to aesthetics, because we associate art justifiably with a creative process. But I prefer to think of creativity as the production of new knowledge in any context. Let me give a pedantic example:&lt;br /&gt;&lt;blockquote&gt;All men are mortal.&lt;br /&gt;Socrates is a man.&lt;br /&gt;------------------&lt;br /&gt;Socrates is mortal.&lt;/blockquote&gt;The conclusion follows deductively from the two statements above it, so it is &lt;i&gt;not&lt;/i&gt; the production of new knowledge. This is the hammer that fell on Bertrand Russell in his quest for ultimate truth by means of logic. Logical, rigorous deductive thinking is an essential skill, but it's not creative. In contrast, Aristotle's encoding of logic into language &lt;i&gt;was &lt;/i&gt;creative, but I have a more interesting example.&lt;br /&gt;&lt;br /&gt;The other day I saw an interesting problem posted on the &lt;a href="http://www.reddit.com/r/math/"&gt;math subreddit&lt;/a&gt;. 
The diagram below shows a laser beam coming from the right and striking a perfect mirror (the thick black line at the bottom) at angle b, bouncing off, and then striking another mirror placed at angle a with respect to the first one:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-cMIntaqSZQk/TZ2HskmPI2I/AAAAAAAAAW4/wYtFaG738ME/s1600/mirror.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://3.bp.blogspot.com/-cMIntaqSZQk/TZ2HskmPI2I/AAAAAAAAAW4/wYtFaG738ME/s400/mirror.JPG" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;The gray line is imaginary here--it extends the line of the bottom mirror to illustrate angle b, but the light hits the very end of the mirror itself. The light will continue to bounce off the mirrors in some way. Where does the beam end up going? I will put the solution at the end, in case you want to think about it first.&lt;br /&gt;&lt;br /&gt;The point is that I suspected there should be an elegant way to think about the problem, where the solution--all solutions--would be obvious. So I cast about, looking for it. This is rather like trying to find the light switch in a dark room, as Andrew Wiles put it (see my other posts on creativity for the link). I gave up before I found it. I found an inelegant solution, which was correct, but wasn't creative enough to be called elegant. It was sort of a plodding, "add up the&amp;nbsp;cumulative&amp;nbsp;effect" solution, where you sort of crush a problem with the weight of logical facts until it leaks out its secrets like a garlic clove exudes oil.&lt;br /&gt;&lt;br /&gt;My lack of blazing imagination does, however, illustrate that the creative process itself deserves its own "taxonomy." In other words, there are qualitative differences to creative enterprise. 
Let's take a look.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Creativity as producing new information&lt;/b&gt; can start with sheer randomness. Flipping a coin and writing down the results is creative. This sounds too trivial to be counted, but it's not. In fact, it's the single most important spark of novelty in history. I have two examples. First,&amp;nbsp;physicists&amp;nbsp;have wondered how galaxies formed. If the big bang started from a single point, for example, why wasn't everything thereafter perfectly uniform? Where did all the novelty come from? One proposal is that tiny differences in the&amp;nbsp;primordial&amp;nbsp;universe were seeded by quantum events, which we know to be &lt;a href="http://en.wikipedia.org/wiki/Bell's_theorem"&gt;deeply random&lt;/a&gt;. So the largest structures in the universe may have started with infinitesimal randomness. Cool, right?&lt;br /&gt;&lt;br /&gt;The second example is the evolution of life, which explores via an ecology a vast space of possible designs for living things. This exploration proceeds by random mutation of genes, and other ways in which genetic material may get mixed around (like parasitism), or a bacterium's&amp;nbsp;lascivious&amp;nbsp;lifestyle with regard to DNA. This is not the deeply puzzling randomness of quantum mechanics, but the sort that emerges from complex systems that is sometimes called chaos.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Randomness is a great entry point into creative thinking.&lt;/b&gt;&amp;nbsp;The casting about for novelty is a skill in itself. It requires courage to be wrong, a good idea of how to recognize your intellectual quarry when you've found it, and&amp;nbsp;determination--because it takes a long time for randomness to hit the right target. Louis Pasteur's "Chance favors the prepared mind." has two parts: chance, and preparation. 
The latter is formed in the laborious mastering of some discipline or subject.&lt;br /&gt;&lt;br /&gt;The whole idea of serendipity is based on these two elements, and our culture has benefited handsomely from it: rubber, penicillin, radioactivity, and many more are on the list. Wikipedia has many examples&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Serendipity"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In the &lt;a href="http://highered.blogspot.com/2011/03/complexity-as-pedagogy.html"&gt;last post&lt;/a&gt;, I showed an example of a game designed for high schoolers that is aimed at creative thinking in a mathematical context. The essential skills are being able to understand the problem and do basic math (easy), and cast about for creative solutions (fun, I hope). These solutions will start with guessing.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Guessing is a step up from randomness.&lt;/b&gt; Humans aren't very good at true randomness--we have to depend on the world around us for that, like moldy bread crumbs falling accidentally into a Petri dish. I suspect that good guessing is an art unto itself, and that it can be taught and practiced. There's an MIT course on the art of making educated guesses with regard to estimation (how many gas stations are in the US, do you think?). Here's the course description from the &lt;a href="http://ocw.mit.edu/courses/mathematics/18-098-street-fighting-mathematics-january-iap-2008/"&gt;Open Courseware site&lt;/a&gt; (it's free!).&lt;br /&gt;&lt;blockquote&gt;This course teaches the art of guessing results and solving problems without doing a proof or an exact calculation. Techniques include extreme-cases reasoning, dimensional analysis, successive approximation, discretization, generalization, and pictorial analysis. 
Applications include mental calculation, solid geometry, musical intervals, logarithms, integration, infinite series, solitaire, and differential equations.&lt;/blockquote&gt;This is targeted at students with a good math foundation (everybody at MIT, I guess), but I find it exciting because it shows how to teach a whole course on &lt;i&gt;guessing&lt;/i&gt;&amp;nbsp;in the context of a discipline. There's no reason that this couldn't be done in other subjects just as well. Guess-and-check is a&amp;nbsp;fundamental&amp;nbsp;human skill that reinforces our knowledge of the world. Think about kids and the funny way they conjugate verbs at first because they are guessing based on simple rules (e.g., "I eated my peas, daddy"). The guess is close enough to communicate, and as an additional reward, they glean information about new complexities of language, if someone is kind enough to point out the right way of saying it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Problem-Solving&lt;/b&gt; might be the next step in the creative chain of being. This is a natural continuation of randomness and guessing, which results in the production of new knowledge in some applied context. This works in art as well as math, I think. It's the evolution from random doodles to purposeful artistic creation. Problem solving weds the analytical/deductive process and discipline-specific skills and knowledge with the trial-and-error process that I've described in prior posts on creativity. This is the nuts and bolts of creative production.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Inspiration may or may not be teachable. &lt;/b&gt;If we help students to be good seekers of randomness, good guessers, and good problem-solvers, can we help them elevate themselves to inspired thought? I don't know, but I &lt;i&gt;guess&lt;/i&gt; that we can provide a fertile environment for this, and foster it in individuals who might otherwise have not reached their potential. 
I don't really believe that we can take every math student and produce another Gauss or Euler, but we can ameliorate one of the great hidden human tragedies--the many, many inspired thinkers who never got the intellectual cultivation they needed to allow their talents to flower.&lt;br /&gt;&lt;br /&gt;This is all first-draft thinking. An interested group of discipline experts could turn these rough ideas into something applicable to a curriculum or institution. That work should include ways to assess creativity at each step along the way. Disciplines can learn from each other and share approaches, opening up the possibility of interdisciplinary learning. I have often thought that math students could benefit from watching art students critically review each other's work.&lt;br /&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;&lt;b&gt;Here's the solution to the problem.&lt;/b&gt; I have redrawn it, but I saw it first &lt;a href="http://www.reddit.com/r/math/comments/ggnyr/how_many_bounces/c1nf8av"&gt;here&lt;/a&gt;. The original problem was in terms of a tiny billiard ball, but I changed it to a laser beam. The key insight is that reflections preserve angles, so that instead of imagining the beam bouncing off at the same angle (incidence = reflection), imagine it passing through the mirror as if it were a pane of glass. Then add another pane of glass where the return bounce would have occurred, so that copies of the mirrors look like spokes on a wheel separated by angle a. This illustrates clearly that the beam will swiftly exit the mirrors and go on its way in most arrangements. The whole process is laid bare. I've illustrated it with a=45 degrees below. 
This is an inspired and elegant solution, unlike my workable but problem-solving brute force approach (not shown).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-3sYGmzpzTj0/TZ2Vh4AxvoI/AAAAAAAAAW8/KkAZ81UkMiY/s1600/mirror2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="146" src="http://4.bp.blogspot.com/-3sYGmzpzTj0/TZ2Vh4AxvoI/AAAAAAAAAW8/KkAZ81UkMiY/s320/mirror2.JPG" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-7784183579507283847?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/7784183579507283847/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/04/creativity-as-pinnacle-of-learning.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7784183579507283847'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/7784183579507283847'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/04/creativity-as-pinnacle-of-learning.html' title='Creativity as the Pinnacle of Learning'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-cMIntaqSZQk/TZ2HskmPI2I/AAAAAAAAAW4/wYtFaG738ME/s72-c/mirror.JPG' height='72' 
width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6162374948182637016</id><published>2011-03-27T10:03:00.002-05:00</published><updated>2011-03-27T10:16:51.969-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='complexity'/><category scheme='http://www.blogger.com/atom/ns#' term='learning'/><category scheme='http://www.blogger.com/atom/ns#' term='math'/><title type='text'>Complexity as Pedagogy</title><content type='html'>Learning an academic subject usually goes like this. First you have to get used to a new language and ideas expressed in that language. At first you're quizzed on meaning, but you're increasingly required to actively use the new concepts (using a &lt;a href="http://en.wikipedia.org/wiki/Broca's_area"&gt;different part of your brain&lt;/a&gt; to do so). When you get far enough along, you can begin to critique work using the new ideas. Ideally you critique your own work as you produce it. I think of this as the analysis and creativity cycle. The former demands use of the language (terms, ideas, grammar and syntax), whereas the latter depends on insight, trial and error, and imagination.&lt;br /&gt;&lt;br /&gt;The problem is that it usually takes a long time to get any good at a new language before you can effectively be creative in it. Take speaking in a foreign language, for example. You can't just make up words and sentences that 'sound German' and hope for it to be meaningful. A German friend gave a good example of this in reverse, when his sister tried to order a sandwich in New York:&lt;br /&gt;&lt;blockquote&gt;Using "Ich will einen Hamburger bekommen"&amp;nbsp;&amp;nbsp;(I want to get a hamburger.) 
to create the English approximation: "I will become a hamburger."&amp;nbsp;&lt;/blockquote&gt;&lt;blockquote&gt;But 'will' and 'become' mean very different things in the two languages, and there's no way to approximate the knowledge.&lt;/blockquote&gt;You have to understand how verbs are&amp;nbsp;consummated&amp;nbsp;and whatnot...&lt;i&gt;then&lt;/i&gt; you can express yourself.&lt;br /&gt;&lt;br /&gt;All this preparation is a&amp;nbsp;hindrance&amp;nbsp;if you just want to introduce a student to a subject. One of my professors, David Kammler, exposed me to the interesting idea in math that it should be taught like MacArthur's&lt;a href="http://en.wikipedia.org/wiki/Leapfrogging_(strategy)"&gt; island-hopping&lt;/a&gt; campaign in the Pacific in WWII. That is, don't try to do everything, just teach strategic material that an able student can use as a basis to explore other parts as needed. This has the effect of lessening the language burden.&lt;br /&gt;&lt;br /&gt;There is an advantage to that. Being creative is like play, or it can be. It reinforces the language and gives students the thrill of discovery. This is how games work. If you make the learning curve too steep for beginners, they may not even make it through the rule book. Moreover, it's not necessary to have a lot of language/rules to have an interesting game. Chess, for example, or Backgammon, or Poker. The trick is to create a tight language that is unambiguous, and thereby allow a platform for exploration that can be easily checked by the rules.&lt;br /&gt;&lt;br /&gt;Math and computer science are particularly amenable to this approach. My colleague Soumia Ichoua has a grant to do operations research activities, and part of it is an outreach project for middle or high school students. 
We are working with the university's Upward Bound program to coordinate logistics, and have created several types of logistical games for the kids.&lt;br /&gt;&lt;br /&gt;The idea is to create a simple system that has these three properties:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It's easy to understand the rules and check them. (low complexity rules = simple descriptive language)&lt;/li&gt;&lt;li&gt;It's difficult to solve the problem. (high complexity solution = perfect solution &lt;i&gt;method &lt;/i&gt;has a long description, and is unlikely to be found.)&lt;/li&gt;&lt;li&gt;There is a wide range of possible solutions. (expressive language, allowing heuristics to be effective)&lt;/li&gt;&lt;/ol&gt;As it turns out, there is a major class of algorithmic challenges in computer science that has these properties. They are in a &lt;a href="http://en.wikipedia.org/wiki/NP_(complexity)"&gt;class called NP&lt;/a&gt;. One example is the &lt;a href="http://en.wikipedia.org/wiki/Travelling_salesman_problem"&gt;traveling salesman problem&lt;/a&gt;, where you try to find the shortest circuit to visit a list of destinations. The rules are very simple: add up the length of each leg (transit from one location to the next) in the circuit. The total distance is the sum. Finding the shortest circuit is very hard, however. 
So it's a perfect environment to employ creativity in a real discipline-based problem without having to learn a bunch of language and rules first.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-fvd_XaSoYdg/TY9NOL-2TBI/AAAAAAAAAW0/7aa9msKc3jA/s1600/TravelingSalesmanProblem_1000.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="211" src="http://1.bp.blogspot.com/-fvd_XaSoYdg/TY9NOL-2TBI/AAAAAAAAAW0/7aa9msKc3jA/s400/TravelingSalesmanProblem_1000.gif" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The image above is taken from &lt;a href="http://mathworld.wolfram.com/TravelingSalesmanProblem.html"&gt;Wolfram MathWorld&lt;/a&gt;, and shows a minimal distance circuit for the given points. The space of possible solutions is vast, but humans can use heuristics to find reasonably good solutions.&lt;br /&gt;&lt;br /&gt;One of the game boards was created by my daughter, and is shown below. There's a story behind it about King Lolly, who needs to keep his people fed in winter; play involves moving food markers around from the granary and the castle to the villages. There are several scenarios to keep it interesting.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-raGYD0LH2J8/TY9FHTdfnKI/AAAAAAAAAWw/nIC9ldoQjGM/s1600/lollyland.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://2.bp.blogspot.com/-raGYD0LH2J8/TY9FHTdfnKI/AAAAAAAAAWw/nIC9ldoQjGM/s320/lollyland.jpg" width="242" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Other logistics games include air-dropping supplies into a remote region, delivering hot dogs to hot dog vendors in a city, and planning for natural disasters. The last one is the focus of Dr. 
Ichoua's research, which has her visiting FEMA to get real data.&lt;br /&gt;&lt;br /&gt;A draft of the rules for Lollyland is found below. The plan is to use undergraduate helpers to assist rising 9th and 10th graders in pairs. We are developing assessments for cognitive and non-cognitive outcomes.&lt;br /&gt;&lt;br /&gt;&lt;hr /&gt;&lt;br /&gt;&lt;b&gt;Rules&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;Food travels from the castle and keep to the villages over the roads. If you have to move the food through the castle, the hungry people will eat half of whatever you send through (rounded up). Each food marker is enough to satisfy one hungry villager.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Winter in Lollyland&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;In the small kingdom of Lollyland, King Lollygaster has to plan well to feed his people during the winter. Being wise, he has constructed a grain tower in his castle and another one at a distant keep. Between these two sources, he must distribute emergency food if the winter is harder than expected. Besides the town surrounding the castle, which has its own store and can take care of itself, there are four large villages that have no such protection. It is these that the king must provide for in time of famine.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Scenario 1.&lt;/b&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;The ground stayed frozen so long that the crops were put out late. The king must distribute food to the four villages to keep them from starving. He isn’t good with numbers, so he puts you in charge and entrusts you to get his people fed.&lt;br /&gt;&lt;br /&gt;Set up: 25 food in the castle, and 11 in the keep. Each village gets 9 hungry people. 
&lt;br /&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;br /&gt;&lt;i&gt;Questions:&lt;/i&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Can all the villagers be fed?&lt;/li&gt;&lt;li&gt;How many ways are there to distribute the food to feed them?&lt;/li&gt;&lt;/ol&gt;&lt;b&gt;Scenario 2.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After many happy years King Lollygaster became too frail to rule effectively, and turned the kingdom over to his son Lollyfright, who promptly dumped the old man down a well. Lollyfright neglected to lay up much in the way of food storage, assuming the winter would be mild. You are now the Minister of Happiness, and tasked with the following situation:&lt;br /&gt;&lt;br /&gt;Set up: 18 food in the castle and 8 in the keep. Each village gets 9 hungry people. All of the roads are closed due to lack of interest in clearing snow from them. Choose four roads to open and deliver food; the others will remain closed.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Questions:&lt;/i&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;What is the greatest number of villagers that can be fed?&lt;/li&gt;&lt;li&gt;What are the best roads to open?&lt;/li&gt;&lt;/ol&gt;&lt;b&gt;Scenario 3.&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After the disaster, the merchant class of Lollyland dethrone Lollyfright and send him into exile. They form a governing council and put you in charge of planning for the next famine.&lt;br /&gt;&lt;br /&gt;Set up: Assume there will be 7 hungry villagers in each village, the population having declined of late. With this decline there is less ability to clear roads of snow. 
Assume that you can open only three roads in the worst case.&lt;br /&gt;&lt;br /&gt;Question: How much food do you need to put at the castle and the keep?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6162374948182637016?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6162374948182637016/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/03/complexity-as-pedagogy.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6162374948182637016'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6162374948182637016'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/03/complexity-as-pedagogy.html' title='Complexity as Pedagogy'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-fvd_XaSoYdg/TY9NOL-2TBI/AAAAAAAAAW0/7aa9msKc3jA/s72-c/TravelingSalesmanProblem_1000.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6965108255190143859</id><published>2011-03-25T19:10:00.000-05:00</published><updated>2011-03-25T19:10:24.176-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><title type='text'>Why Assessment?</title><content type='html'>When the word came filtering down through the academic rumor enhancement facility (an endowed 
building, thankfully) that we'd have to do assessment--this happening in a past demarcated by one (1) millennium post, one (1) century post, and two (2) decades--we academics ran for cover. Some are still down there in their fox holes, waiting for the war to be over, not having access to the headlines proclaiming victory for the other side.&lt;br /&gt;&lt;br /&gt;I have to wonder why the emphasis was ever put on assessment in the first place. It's certainly a bad word to build a PR campaign around, like trying to advertise lamprey on the menu. Anyway, assessment isn't the point at all. This has only recently occurred to me, which may be symptomatic of all those sedimentation layers from said rumor enhancement facility that piled 'assessment' on top of 'measurement' and so on and on. You could make chalk out of the stuff if you had the patience.&lt;br /&gt;&lt;br /&gt;When I finally made the synaptic leap, it was a shock. But perhaps I can be excused for that, given the sustained emphasis on assessment in conferences, publications, and public discourse.&lt;br /&gt;&lt;br /&gt;Assessment is one component of a theoretical institutional effectiveness process, but it's not the most important one. The spiral of excellence is to be climbed by:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Setting goals&lt;/li&gt;&lt;li&gt;Finding assessments for same&lt;/li&gt;&lt;li&gt;Gathering data and analyzing it&lt;/li&gt;&lt;li&gt;Making changes that may improve things&lt;/li&gt;&lt;/ol&gt;The first step and the last are the most important, and if you could only pick one, number four would be it. If the mandate were to set goals and try to achieve them, anyone taking that seriously would pretty quickly figure out some sort of assessment was in order. So why do we start with assessment? Historical accident?&lt;br /&gt;&lt;br /&gt;Compounding the premier position of assessment is the way we sometimes talk about the activity, as if we were doing science in order to do engineering. 
What I mean here is that I characterize science as pinning down cause and effect, whereas engineering is the use of those principles to accomplish some aim. This is all mixed up in learning outcomes assessment, because it's often expected to do both at the same time. Here again, the heavy emphasis on assessment gets in the way. To find a link between cause and effect, we have to vary starting conditions, and compare ending conditions. Then we have to hope that the universe is reasonable and lets us get away with inductively assuming that because it worked last time it will work again the next time.&lt;br /&gt;&lt;br /&gt;Suppose we want to test some variations of fertilizers on corn to see which one works best. We'd try to keep everything constant except our treatment (type of fertilizer), which varies in some suitable way. We'd be careful to randomize over plots of ground to average out soil or topographical variance, and so on. We would carefully document all the conditions (water, acidity, parasites, etc.) during the experiment, and finally assess the yield at the end. Now all that stuff I just reeled off is part of the experiment, but with learning outcomes we typically wish everything away except for the last bit--assessing the yield. I have yet to see an effectiveness model that has a box asking for the experimental conditions during the teaching process. Maybe that's because few would do it.&lt;br /&gt;&lt;br /&gt;The effect of this casual approach to experimentation is to make the assessment less valid. This may be remedied to some extent by bringing back into light the context, which is exactly why having teaching faculty involved with broad discussions that include but are not limited to assessment results is so important. Given the typical (and understandable) lack of rigor for most learning assessments, the results are more like a psychic reading than science. And that's fine. 
It should be told that way, so that faculty members don't get so stressed out about a "scientific" use of the results. Focusing more heavily on goals and improvements makes it easier to engage faculty, and puts the emphasis where it ought to be: taking action.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6965108255190143859?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6965108255190143859/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/03/why-assessment.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6965108255190143859'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6965108255190143859'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/03/why-assessment.html' title='Why Assessment?'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6926513051143044694</id><published>2011-03-24T07:02:00.000-05:00</published><updated>2011-03-24T07:02:52.477-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='learning'/><category scheme='http://www.blogger.com/atom/ns#' term='memory'/><title type='text'>Memory as a SLO</title><content type='html'>Why don't we teach memory skills as learning outcomes? 
Given the amount of memorizing required for many college courses, wouldn't it make sense to have a first-year course on the general subject: theory and application? Or better yet, integrate techniques within classes. I can remember my uncle, who is a doctor, reciting a long list of bones from some mnemonic device he'd used back in school, decades before. The subject has a long rich history. The Romans practiced memory techniques, and probably considered them essential (see the Wikipedia article "&lt;a href="http://en.wikipedia.org/wiki/Art_of_memory"&gt;Art of Memory&lt;/a&gt;" for sources and details). The book &lt;i&gt;&lt;a href="http://www.amazon.com/Moonwalking-Einstein-Science-Remembering-Everything/dp/159420229X/ref=zg_bs_books_4"&gt;Moonwalking with Einstein: The Art and Science of Remembering Everything&lt;/a&gt;&lt;/i&gt; is currently the number four &lt;a href="http://www.amazon.com/Moonwalking-Einstein-Science-Remembering-Everything/dp/159420229X/ref=zg_bs_books_4"&gt;best-seller on Amazon.com&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Memory shows up as the base of the pyramid on the revised Bloom's Taxonomy (taken from &lt;a href="http://www.nwlink.com/~donclark/hrd/bloom.html"&gt;here&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh6.googleusercontent.com/-97lzFdEYsKI/TYstQYl1cnI/AAAAAAAAAWs/yqh5VUuSYpw/s1600/bloom_taxonomy.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="293" src="https://lh6.googleusercontent.com/-97lzFdEYsKI/TYstQYl1cnI/AAAAAAAAAWs/yqh5VUuSYpw/s320/bloom_taxonomy.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I'm guessing that there are institutions out there that do this, but I haven't heard about them (or, ironically, forgot about them). Does anyone have a "memory across the curriculum" program? It seems like this skill does belong at the base of cognitive skills, and moreover is transferable from one discipline to another. 
Simple things could easily be done, like teaching students how big flash-card decks should be and how to time the reinforcements, or providing online resources like &lt;a href="http://www.wired.com/medtech/health/magazine/16-05/ff_wozniak"&gt;SuperMemo&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;SuperMemo is based on the insight that there is an ideal moment to practice what you've learned. Practice too soon and you waste your time. Practice too late and you've forgotten the material and have to relearn it. The right time to practice is just at the moment you're about to forget.&lt;/blockquote&gt;Here's another tip: &lt;a href="http://www.livescience.com/1473-moving-eyes-improves-memory-study-suggests.html"&gt;move your eyes&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;If you’re looking for a quick memory fix, move your eyes from side-to-side for 30 seconds, researchers say. Horizontal eye movements are thought to cause the two hemispheres of the brain to interact more with one another, and communication between brain hemispheres is important for retrieving certain types of memories.&lt;/blockquote&gt;Finally, here's a very cool way to remember numbers called &lt;a href="http://www.buildyourmemory.com/pegging.php"&gt;pegging&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;Basically what pegging does is to turn a number (any number), into a set of phonetic sounds or letters. These sounds are then joined together to form words, and these words may then be linked together to form a series of images. Finally these images may then be committed to memory. 
This enables an individual to recall numbers of up to (and above) 100 digits, with relative ease.&lt;/blockquote&gt;It seems like helping students improve memory would have broad positive consequences for the rest of their learning. They could spend less time in the process, stop cramming vocabulary just before the test (only to forget it just after), and take away a useful life skill. We could ask students to memorize presentations instead of reading them.&lt;br /&gt;&lt;br /&gt;Memory as an outcome is attractive because it's easy to assess, there are straightforward techniques that are known to work, and it can be incorporated into the curriculum. The biggest barrier might be faculty who don't know and don't want to learn the techniques themselves.&lt;br /&gt;&lt;br /&gt;Undoubtedly the drama department already does some of this--you can't take the flash cards on stage with you for the performance. Music too, maybe. Can we broaden the scope and take similar advantage institutionally? 
I'd love to know if someone out there is doing this already.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6926513051143044694?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6926513051143044694/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/03/memory-as-slo.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6926513051143044694'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6926513051143044694'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/03/memory-as-slo.html' title='Memory as a SLO'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh6.googleusercontent.com/-97lzFdEYsKI/TYstQYl1cnI/AAAAAAAAAWs/yqh5VUuSYpw/s72-c/bloom_taxonomy.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-6375663299165919448</id><published>2011-03-23T07:56:00.000-05:00</published><updated>2011-03-23T07:56:47.946-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><category scheme='http://www.blogger.com/atom/ns#' term='epistemology'/><category scheme='http://www.blogger.com/atom/ns#' term='assessment higher education'/><title type='text'>Compare and Contrast</title><content type='html'>These pairs of quotes come from different viewpoints about how 
we know what we know, and how we should conduct higher education. Write a 3,000-word essay in which you reflect on how these statements fit into the practice of teaching and learning as you currently understand it. Due Friday.&lt;br /&gt;&lt;blockquote style="color: red;"&gt;If we take in our hand any volume, of divinity or school metaphysics for instance, let us ask, Does it contain any abstract reasoning concerning quantity or number? No. Does it contain any experimental reasoning concerning matter of fact and existence? No. Commit it then to the flames, for it can contain nothing but sophistry and illusion. -- David Hume, &lt;i&gt;An Enquiry Concerning Human Understanding&lt;/i&gt;, ed. L. Selby-Bigge, p. 163, taken from &lt;a href="http://www.ucl.ac.uk/%7Euctytho/AyerbyTH.html"&gt;here&lt;/a&gt;.&lt;/blockquote&gt;Versus:&lt;br /&gt;&lt;blockquote style="color: blue;"&gt;Epistemological anarchism is an epistemological theory advanced by Austrian philosopher of science Paul Feyerabend which holds that there are no useful and exception-free methodological rules governing the progress of science or the growth of knowledge. It holds that the idea that science can or should operate according to universal and fixed rules is unrealistic, pernicious and detrimental to science itself. --&lt;a href="http://www.facebook.com/pages/Epistemological-anarchism/110448748975828"&gt;Facebook Group on Epistemological anarchism(!)&lt;/a&gt;&lt;/blockquote&gt;And&lt;br /&gt;&lt;blockquote style="color: red;"&gt;We say that a sentence is factually significant to any given person, if and only if, [she or] he knows how to verify the proposition which it purports to express—that is, if [she or] he knows what observations would lead [her or him], under certain conditions, to accept the proposition as being true, or reject it as being false. – A. J. 
Ayer, &lt;i&gt;Language, Truth, and Logic&lt;/i&gt;&lt;/blockquote&gt;&amp;nbsp;Versus:&lt;br /&gt;&lt;br /&gt;&lt;blockquote style="color: blue;"&gt;[T]he meaning of a word is its usage in the language. – L. Wittgenstein&lt;/blockquote&gt;And&lt;br /&gt;&lt;blockquote style="color: red;"&gt;The CLA measures were designed by nationally recognized experts in  psychometrics and assessment, and field tested in order to ensure the  highest levels of validity and reliability.-- Advertising flyer for the Collegiate Learning Assessment&lt;/blockquote&gt;Versus:&lt;br /&gt;&lt;blockquote style="color: blue;"&gt;It is a common misconception that validity is a particular phenomenon whose presence in a test may be evaluated concretely and statistically. One often hears exclamations that a given test is “valid” or “not valid.” Such pronouncements are not credible, for they reflect neither the focus nor the complexity of validity. – College BASE Technical Manual&lt;/blockquote&gt;And finally this pair of quotes from a &lt;a href="http://www.nytimes.com/roomfordebate/2011/03/20/career-counselor-bill-gates-or-steve-jobs"&gt;New York Times article&lt;/a&gt;: &lt;br /&gt;&lt;blockquote style="color: red;"&gt;In a talk to the nation's governors earlier this month, [Microsoft founder] Mr. Gates  emphasized work-related learning, arguing that education investment  should be aimed at academic disciplines and departments that are  "well-correlated to areas that actually produce jobs."&amp;nbsp;&lt;/blockquote&gt;Versus:&lt;br /&gt;&lt;blockquote style="color: blue;"&gt;At an event unveiling new Apple products, [Apple CEO] Mr. 
Jobs said: "It's in Apple's DNA that technology alone is not enough  -- it's technology married with liberal arts, married with the  humanities, that yields us the result that makes our heart sing and  nowhere is that more true than in these post-PC devices."&lt;/blockquote&gt;&lt;br /&gt;What positions on the red/blue spectrum are more consistent with your own practices and beliefs? There are no right answers. Well, unless you believe there are, I guess.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-6375663299165919448?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/6375663299165919448/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/03/compare-and-contrast.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6375663299165919448'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/6375663299165919448'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/03/compare-and-contrast.html' title='Compare and Contrast'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-3322020901298697035</id><published>2011-03-15T09:25:00.001-05:00</published><updated>2011-03-15T15:42:09.581-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IE'/><category scheme='http://www.blogger.com/atom/ns#' 
term='assessment'/><category scheme='http://www.blogger.com/atom/ns#' term='SACS'/><category scheme='http://www.blogger.com/atom/ns#' term='Principles of Accreditation'/><title type='text'>Improving the Principles of Accreditation</title><content type='html'>If you are in the &lt;a href="http://sacscoc.org/"&gt;SACS/CoC&lt;/a&gt; region, you know all about the &lt;i&gt;&lt;a href="http://sacscoc.org/pdf/2010principlesofacreditation.pdf"&gt;Principles of Accreditation&lt;/a&gt;, &lt;/i&gt;the document that outlines accreditation standards, and which every institution must use to report compliance every ten years and (with fewer sections) every five years in between. &amp;nbsp;Until March 31, the Commission is accepting comments that will inform a review of said document. This is an excellent opportunity to make your voice heard in what is, after all, a peer-review process.&lt;br /&gt;&lt;br /&gt;I have some observations I will post here for comment before sending them off to the Commission, in order to see if others agree or can suggest better approaches. I will restrict my comments to the institutional effectiveness (IE) sections. So here goes.&lt;br /&gt;&lt;br /&gt;Note: CR = Core Requirement, CS = Comprehensive Standard, and FR = Federal Requirement&lt;br /&gt;&lt;hr /&gt;&lt;blockquote&gt;&lt;b&gt;CR 2.10&lt;/b&gt;&amp;nbsp;&amp;nbsp;The institution provides student support programs, services, and activities consistent with its mission that promote student learning and&amp;nbsp;enhance the development of its students. (Student Support Services)&lt;/blockquote&gt;Although this isn't in the IE sections formally, it has a requirement that student support services "promote student learning and enhance the development of its students." This is a clear IE requirement, and is exceptional in that no other functional units, including academic programs, are required to pass this level of detailed IE review as a &lt;i&gt;core requirement&lt;/i&gt;. 
Taken literally, an institution can be sanctioned severely for not assessing learning for student support services, which seems out of line with the more strategic level requirements that comprise the CR sections. I would suggest moving the IE language to 3.3.1.1, quoted below in the current version:&lt;br /&gt;&lt;blockquote&gt;&lt;b&gt;3.3.1 &lt;/b&gt;The institution identifies expected outcomes, assesses the extent to&amp;nbsp;which it achieves these outcomes, and provides evidence of&amp;nbsp;improvement based on analysis of the results in each of the following areas: (Institutional Effectiveness)&lt;br /&gt;&lt;b&gt;3.3.1.1&lt;/b&gt; educational programs, to include student learning outcomes&lt;br /&gt;&lt;b&gt;3.3.1.2&lt;/b&gt; administrative support services&lt;br /&gt;&lt;b&gt;3.3.1.3&lt;/b&gt; educational support services&lt;br /&gt;&lt;b&gt;3.3.1.4&lt;/b&gt; research within its educational mission, if appropriate&lt;br /&gt;&lt;b&gt;3.3.1.5&lt;/b&gt; community/public service within its educational mission, if appropriate&lt;/blockquote&gt;Note that "student support services" doesn't appear. There are administrative and educational support services, but not student support services. The nomenclature needs to be cleaned up so we know exactly what we're talking about. 
One approach is to assume that any service that has learning outcomes is an educational support service, and therefore the modification could be as follows:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;End the statement of CR 2.10 after "mission," omitting the IE component.&lt;/li&gt;&lt;li&gt;Change 3.3.1.3 to read "educational support programs, to include student learning outcomes."&lt;/li&gt;&lt;/ol&gt;&lt;hr /&gt;&lt;div&gt;&lt;blockquote&gt;&lt;b&gt;CS 3.5.1&lt;/b&gt;&amp;nbsp;&amp;nbsp;The institution identifies college-level general education competencies and the extent to which graduates have attained them.&amp;nbsp;(College-level competencies)&lt;/blockquote&gt;This is the general education assessment requirement. Note that it doesn't appear in 3.3.1 &lt;i&gt;unless general education is defined as a program by the institution&lt;/i&gt;. On the other hand, 3.5.1 is NOT an IE requirement--there is no statement about use of results to improve, just that you assess the extent to which students meet competencies. This is a knotty puzzle, so let me take it one part at a time.&lt;br /&gt;&lt;br /&gt;First, the phrase "the extent to which" is ambiguous. Does it mean relative to an absolute standard or a relative one? This is by no means splitting hairs. If it means the former, then the institution MUST define what an acceptable competency is for each outcome, and presumably report out percentages that meet the standard. If it's a &lt;i&gt;relative&lt;/i&gt; "extent to which" then simply reporting raw scores of a standardized test against national norms would work.&lt;br /&gt;&lt;blockquote&gt;Example (absolute): Graduates will score 85% or more on the Comprehensive Brain Test. In 2010, 51% of graduates met this standard.&lt;br /&gt;&lt;br /&gt;Example (relative): In 2010, graduates averaged 3.1 on the Comprehensive Brain Test, versus a national average of 2.9.&lt;/blockquote&gt;The standard is silent about the complexities of sampling graduates too. 
Are ratings from &lt;i&gt;all&lt;/i&gt;&amp;nbsp;graduates to be included? I would assume not, since this standard generally doesn't apply to IE processes because of the&amp;nbsp;impracticality&amp;nbsp;of it.&lt;br /&gt;&lt;br /&gt;My sense of this is that we should allow institutions to define success in either absolute or relative terms, as best suits them, and include this standard with the other 3.3.1 sections so that it formally becomes part of IE. This will resolve the ambiguity about whether or not general education is a program, and require that general education assessments actually be used for improvements. It also would broaden the scope to include students generally, not just graduates, who may have been out of general education courses for two years by then.&lt;br /&gt;&lt;br /&gt;The modification could simply be to:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Add:&amp;nbsp;&lt;b&gt;CS&lt;/b&gt;&amp;nbsp;&lt;b&gt;3.3.1.6&lt;/b&gt; general education, to include learning outcomes&lt;/li&gt;&lt;li&gt;Delete:&amp;nbsp;&lt;b&gt;CS&amp;nbsp;3.5.1&lt;/b&gt;&lt;/li&gt;&lt;/ol&gt;&lt;hr /&gt;Finally, let's look at 3.3.1.1 itself. The first issue is subtle. It concerns the meaning of the language "provides evidence of&amp;nbsp;improvement based on analysis of the results". This can mean two different things, and I've seen it interpreted both ways, causing confusion.&lt;br /&gt;&lt;br /&gt;The first interpretation is that it means that you have to &lt;i&gt;demonstrate that improvement happened. &lt;/i&gt;This is a very high standard. It means things like benchmarking before changes are made, and then assessing the same way later on to see what impact occurred. When I hear speakers talk about QEP assessment, this is generally assumed, but it leaks over into the other IE areas too. Any time you take a difference between two measurements, it amplifies the relative error, particularly when the two values are close--this is a basic fact from numerical analysis. 
So you have to have very good assessments, and they have to be objective (for reliability), numerous (small standard error), and scalar (so you can subtract and still have meaning). Also, pre-post tests are the only method that can be used with any&amp;nbsp;resemblance&amp;nbsp;to the scientific method. That is, you can't survey two different populations to compare unless you think you can explain all the variance between the populations (read &lt;i&gt;Academically Adrift&lt;/i&gt; to see how problematic that is even for educational researchers). My objections to this are:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;No small program could ever meet this standard; the N will never be big enough.&lt;/li&gt;&lt;li&gt;Many subjective assessments are very valuable, but are useless in this interpretation.&lt;/li&gt;&lt;li&gt;We don't yet have the technology to create scientific scalar indices of things like "complex reasoning" or other fuzzy goals, despite the testing companies' sales literature.&lt;span style="font-size: x-small;"&gt;[1]&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Random sampling is often impossible, which introduces biases that are probably not well understood.&lt;/li&gt;&lt;li&gt;Pre-post testing is quite limited, and severely restricts the kinds of useful assessments we might employ.&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;The other interpretation is simply that we use analysis of assessment data to take actions that would reasonably be expected to improve things, &lt;i&gt;but we don't have to prove they did.&lt;/i&gt;&amp;nbsp;This is the standard I've seen most widely applied, except perhaps for QEP impact. Arguably the higher standard &lt;i&gt;should&lt;/i&gt; apply to QEP, but that's beyond the scope of my comments here.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In practical terms, the second interpretation is the more useful. 
It has to be borne in mind that assessment programs are mostly implemented by teaching faculty, who are probably not educational researchers by training, and in my experience tend to become frozen with a sort of helplessness if they think they are expected to track learning outcomes like a stock ticker tracks equity prices. I &lt;a href="http://highered.blogspot.com/2010/05/assessing-your-program-level-assessment.html"&gt;blogged about this&lt;/a&gt; a while back. Too much emphasis on proofs of improvement is paralyzing and counterproductive. On the other hand, free-ranging discussions that include the meaning of results, subjective impressions from course instructors, and other information that speaks to the learning outcome under consideration are a gold mine of opportunities to make changes for the better.&lt;br /&gt;&lt;br /&gt;The most powerful argument against the strict (first) interpretation, however, is that there is simply no way to guarantee improvement on some index unless (1) one cheats somehow, manipulating the index, or (2) the index reflects some goal that is so obviously easy to improve that it's trivial. 
Either way the meaningfulness of the program vanishes, and we are left with many programs that are out of compliance (not showing improvement) or in compliance in name only (showing fake improvement or showing trivial improvement).&amp;nbsp; &lt;br /&gt;&lt;br /&gt;My recommendation is to clarify the meaning of the language to read (bold emphasizes the change):&lt;br /&gt;&lt;blockquote&gt;&lt;b&gt;CS 3.3.1 &lt;/b&gt;The institution identifies expected outcomes, assesses the  extent to&amp;nbsp;which it achieves these outcomes, &lt;b&gt;and takes actions&lt;/b&gt; based on analysis of the results in each of the following  areas: (Institutional Effectiveness)&lt;/blockquote&gt;&lt;/div&gt;It should be obvious that the actions are intended to effect positive change, since this is rather the whole point of IE.&lt;br /&gt;&lt;hr /&gt;There is another issue with 3.3.1 that deserves attention. This concerns goals or outcomes that are not about student learning. It seems to me from the interpretations I've seen of 3.3.1 by reviewers and practitioners that the methods that apply to learning outcomes leak over to administrative areas, and that the expectations for compliance are tilted out of plumb. For example, there seems to be an expectation that every unit do surveys and have goals related to those, and that moreover this is &lt;i&gt;necessary and sufficient.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Part of the problem may be the language "the extent to which," which almost insists on a scalar quantity.&amp;nbsp; But in fact, the most important goals for a given unit's effectiveness may have little to do with surveys and not be naturally a scalar.&lt;br /&gt;&lt;br /&gt;One example I saw takes issue with a compliance report that presented "action steps" as goals. An action step might be "Approval of architectural drawings of the new library by 5/1/11." 
This sort of thing shows up all over Gantt charts:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh3.googleusercontent.com/-FpOWZdBhz3s/TX9tA5bk4yI/AAAAAAAAAWo/Fwy2RivqSHE/s1600/Pert_example_gantt_chart.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="146" src="https://lh3.googleusercontent.com/-FpOWZdBhz3s/TX9tA5bk4yI/AAAAAAAAAWo/Fwy2RivqSHE/s640/Pert_example_gantt_chart.gif" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-size: xx-small;"&gt;(image courtesy of &lt;a href="http://en.wikipedia.org/wiki/File:Pert_example_gantt_chart.gif"&gt;Wikipedia&lt;/a&gt;)&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;This method of tracking complicated goals to completion is very common and very effective. The bars may or may not represent progress toward completion--they are simply timelines. So, for example, an OK stamp from the city engineer on your electrical plans is not a "percent to completion" item--it's either done or not. In other words it's Boolean.&lt;br /&gt;&lt;br /&gt;Reviewers can be allergic to Boolean outcomes like:&lt;br /&gt;&lt;ul&gt;&lt;li&gt; Complete the library building on time and on budget.&lt;/li&gt;&lt;li&gt;Implement the new MS-Social Work program by Fall 2012&lt;/li&gt;&lt;li&gt;Gain approval via the substantive change process for a full online program in Dance by 2013 (good luck!)&lt;/li&gt;&lt;li&gt;Maintain a balanced budget every fiscal year.&lt;/li&gt;&lt;/ul&gt;For some reason, there seems to be a bias against this kind of goal, and I can't figure out why. These are obviously important operational items, key to the effectiveness of respective units. But a unit that includes these and doesn't have a satisfaction survey may be cited, whereas the reverse may be true too. 
&lt;br /&gt;&lt;br /&gt;It may be that I'm making a mountain out of a termite hill here, but I think the language of "extent to which" could be changed to make it clearer that Boolean objectives are sometimes the natural way to express effectiveness goals.&lt;br /&gt;&lt;br /&gt;For example, 3.3.1 could read (change bolded):&lt;br /&gt;&lt;blockquote&gt;&lt;b&gt;CS 3.3.1 &lt;/b&gt;The institution identifies expected outcomes, &lt;b&gt;assesses success in a manner appropriate to each outcome&lt;/b&gt;, and takes actions based on analysis of the results in each of the following  areas: (Institutional Effectiveness)&lt;/blockquote&gt;I know the non-parallel structure will offend the grammarians, but someone more expert can consider that conundrum.&lt;br /&gt;&lt;br /&gt;One final nit-pick is that outcomes may not really be expected, but simply striven for--aspirational outcomes, in other words. If they are expected, then by nature they are less ambitious than they might otherwise be. I also put the odd dangling " in each of the following  areas" at the front where it belongs, so my final version is this:&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;b&gt;CS 3.3.1 In each of the following areas, the institution&lt;/b&gt;&lt;b&gt; identifies aspirational outcomes&lt;/b&gt;, assesses success in a manner appropriate to each outcome, and takes actions based on analysis of the results: (Institutional Effectiveness)&lt;/div&gt;&lt;/blockquote&gt;&lt;hr /&gt;&lt;div&gt;[1] I've written much about assessing complex outcomes before, and this is not the page to rehash that issue. The short version is that learning happens in brains, and unless we understand how brains change when we learn, we are not able to speak about causes and effects as they relate to the physical world. See &lt;a href="http://www.holah.co.uk/study/maguire/"&gt;this article&lt;/a&gt; about London taxicab drivers to see a study that links learning to physiological changes to the brain. 
I don't mean to imply that assessing complex outcomes is useless, but just that we should be modest about our conclusions. It's called complex for a reason.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-3322020901298697035?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/3322020901298697035/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/03/improving-principles-of-accreditation.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3322020901298697035'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3322020901298697035'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/03/improving-principles-of-accreditation.html' title='Improving the Principles of Accreditation'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh3.googleusercontent.com/-FpOWZdBhz3s/TX9tA5bk4yI/AAAAAAAAAWo/Fwy2RivqSHE/s72-c/Pert_example_gantt_chart.gif' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-2719935779042802887</id><published>2011-03-14T08:32:00.002-05:00</published><updated>2011-03-14T20:52:22.271-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IQ'/><category scheme='http://www.blogger.com/atom/ns#' term='critical thinking'/><category 
scheme='http://www.blogger.com/atom/ns#' term='academically adrift'/><category scheme='http://www.blogger.com/atom/ns#' term='learning outcomes'/><category scheme='http://www.blogger.com/atom/ns#' term='CLA'/><title type='text'>Academically Adrift</title><content type='html'>I've come up for air after posting part three of &lt;i&gt;&lt;a href="http://lifeartificial.com/"&gt;Life Artificial&lt;/a&gt;&lt;/i&gt; yesterday evening. Over the last few weeks I've had dozens of browser tabs open to write a post about, but have been too focused on getting particular projects done to write them.&lt;br /&gt;&lt;br /&gt;I finished reading most of &lt;i&gt;Academically Adrift&lt;/i&gt; last week, meaning I've started but not finished the Appendices. I downloaded the book from Amazon and read it on the iPad's Kindle application. This is very convenient, and the reading experience is fine, but it does have a significant drawback. Despite the ability to bookmark pages and leave yourself notes and highlights, it's not very easy for me to mark up the book in an easily accessible fashion for reference purposes. Normally I'd have sticky notes coming out of the leaves of the volume, and have text circled by scribbled notes in the margin. This by way of apology that I don't have many quotes in this post.&lt;br /&gt;&lt;br /&gt;I've never been impressed with the CLA, and the research in &lt;i&gt;Academically Adrift&lt;/i&gt; depends heavily on it. I hasten to add that I don't have any problems with the laudable aim of assessing "critical thinking, complex reasoning, and writing." I think the instrument itself can even be useful. It's the exaggerated claims of importance and over-statement of validity that seem unwarranted to me. Because so much of the book relies on CLA scores, let me elaborate.&lt;br /&gt;&lt;br /&gt;First, the test is advertised as posing "real world" problems to test subjects. I'm not quite sure what this means, but it sounds good. 
An example is given on pages 22-23:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh4.googleusercontent.com/-qTtXOfjL5C8/TX32QI2Kf0I/AAAAAAAAAWk/ClLijLrEzys/s1600/blog.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="https://lh4.googleusercontent.com/-qTtXOfjL5C8/TX32QI2Kf0I/AAAAAAAAAWk/ClLijLrEzys/s1600/blog.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;This seems to be a particularly poor exemplar of a real world problem. On the face of it: would you ask a twenty-something with two years of college to make a decision like this? Undoubtedly the aircraft is an expensive purchase that, more to the point, you'll be trusting your life to. So the way you make the decision is to give a non-expert a limited amount of documentation and ask for a report?&amp;nbsp; I don't think so.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;Real life problems are not neatly defined&lt;/b&gt; by a few newspaper articles, FAA reports, and such. There's a vast body of knowledge to sort through and analyze if one cares to look for it. It is in the form of web searches, academic and professional articles, interviews with experts, and on and on. I understand that it would make the test much harder to administer to allow open-ended searches, and it would take a long time. Days or weeks. This just highlights how far this exercise is from a real "real life problem."&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Real life problems are generally solved by people who know what they are doing. There's a good reason you don't ask the guy at the car wash to take out your gall bladder. 
If you want to know about safety issues with a particular model of airplane, I think it would be really good to talk to a senior mechanic that works on such planes and to pilots that fly them. In fact, putting an expert in the field in charge of your investigation would be the best thing to do, no? This is one of the problems with assessments of "critical thinking" that try to ignore content knowledge. It doesn't work. Content is really important when you want to solve a problem, unless all you care about is sophistry.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Here's a real life problem:&lt;/div&gt;&lt;blockquote&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;You're the head mechanic who supervises a small fleet of helicopters for a rent-a-bird operation. The chief pilot is planning to fly in tomorrow to test out a new Bell Jet Ranger that you're getting ready for operations. But you and your team haven't been able to get the main rotor balanced, and you suspect that the part that came from Bell to do this is defective. But the pilot has a volatile temper, and is going to be very upset with you if the helicopter isn't ready for him to fly when he gets here. What do you do?&lt;/div&gt;&lt;/blockquote&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;This requires technical knowledge, creativity, and judgment about social dynamics to get through, and there's probably no perfect solution. This &lt;i&gt;was&lt;/i&gt; a real world problem I witnessed as a teenager. I saw the mechanics try to balance that rotor blade for hours. They finally gave up, &lt;i&gt;put it on the helicopter anyway&lt;/i&gt;, and let the pilot try to take off with it that way! It was so out of balance, that he could only wobble the aircraft around on the ground before giving up. 
He emerged looking like a ghost, too relieved to be alive to be mad at anyone. Probably not an ideal solution.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;A problem with the analysis in the book&lt;/b&gt; is the emphasis on differences in scores and the way the information is represented. Others have mentioned (e.g. &lt;a href="http://chronicle.com/article/Academically-Adrift-a/126371/?sid=cr&amp;amp;utm_source=cr&amp;amp;utm_medium=en"&gt;this article in The Chronicle&lt;/a&gt;) that the authors of &lt;i&gt;Academically Adrift&lt;/i&gt; put a lot of importance on how many students haven't demonstrated significant learning. The literature I've read on the CLA says that it's not even designed for student-level analysis, but for comparing institutions. But never mind that. The statistical problem that has been pointed out by others is that a hypothesis test that does not find a difference in pre-post tests does not mean that there is no difference. This is from Stats 101. As an analogy, let's say my kid is getting ready for school and can't find her gym clothes. I don't have my glasses on yet and am bumbling around the house before the first cup of coffee has kicked in. But I look here and there for the gym clothes halfheartedly as I walk around. There are two possibilities:&lt;/div&gt;&lt;ol&gt;&lt;li&gt;I see the distinctive purple bag. Even without my glasses on, there's a reasonably high probability that I've found the gym clothes.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;I look in the room but don't see the bag. I cannot say that the bag isn't in the room, only that I didn't see it. It may, in fact, be just behind the door.&lt;/li&gt;&lt;/ol&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;This is the difference between rejecting the null hypothesis (#1) and not rejecting it (#2). 
Yet the authors lead us to believe through their narrative that not finding a difference means that there is no difference, and go on to claim that not much learning is happening during the first two years of college.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;Although the authors wave at the notion of IQ&lt;/b&gt; late in the book, they never say anything about the relationship between what they are trying to measure and intelligence (as assessed by standard instruments). Psychologists assure us that IQ is largely static. It's also closely related to verbal skills, which is what the CLA assesses too. If IQ is largely immutable, don't we need to factor it out before drawing conclusions about improvement? I understand that this is a very sensitive issue because you ultimately would have to say that some people are smarter than other people (with all the fine print that goes with that), but to completely ignore the issue seems disingenuous.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;We shouldn't necessarily reject all the findings &lt;/b&gt;of the book just because the analysis is flawed. The attention to how hard students work and the corresponding rigor of academics is an important discussion. Unfortunately, the way the book's headline claims overstate the findings on complex learning outcomes only feeds into the current of public discourse advertising that teachers are villains. Despite carefully worded disclaimers that general readers won't decode, the book practically screams that the first two years of college don't produce any meaningful learning. 
This is a disservice to all the course instructors out there who work hard to teach students calculus, computer programming, world history, composition, and so on. &lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The fact is that the content of the CLA doesn't correspond well to the course content of a general education curriculum. This could be remedied easily. I am quite sure that a one-semester prep course on items like the ones on the CLA would significantly increase the scores. Would this result in real, meaningful knowledge? Maybe, depending on the depth of the course. A pure test-prep course that teaches how to game the test is probably worthless. But perhaps the current curricula are out of date. Even if that's true, the CLA and its standardized measurement are not very important, because they are too simplified and rely too much (I assume) on IQ and not enough on discipline-based knowledge. There are already recommendations (e.g. from the AAC&amp;amp;U) to increase complex problem solving skills, and many institutions are already struggling with how to teach and assess this. But almost by definition, complex problems aren't trivial to solve. If the CLA becomes the measure of these learning outcomes--if we really take analyses like &lt;i&gt;Academically Adrift&lt;/i&gt; seriously--then we will shortly have another instance of &lt;a href="http://en.wikipedia.org/wiki/Campbell%27s_law"&gt;Campbell's Law&lt;/a&gt; at work. See this &lt;a href="http://online.wsj.com/article/SB10001424052748703445904576117793343465096.html?mod=wsj_share_twitter"&gt;Wall Street Journal article&lt;/a&gt; for a good example. 
Or &lt;a href="http://ifors.org/web/the-number-thats-devouring-science/"&gt;this opinion&lt;/a&gt; about "impact factors."&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;Learning outcomes don't matter anyway&lt;/b&gt;, in the big picture. That is, test scores and other performance indicators taken during college aren't ultimately the measure of success of a program. It's what happens after graduation that ultimately matters. Are your graduates curious about the world? Do they read newspapers and vote intelligently? Are they engaged in service to their fellow humans? Do they contribute to the overall economy? Depending on the mission of the institution, these goals may vary (but always include "do they donate money?"). At the level of the Education Department, where the strategy should inform the national interests, perhaps the easiest metric of success would be income histories. The federal government has vast data stores on financial aid given. It also has all the IRS data. I asked an official from DoE about mashing these up to answer important questions like "what's the lifetime earnings benefit for attending a private vs public college?". I wasn't encouraged by the answer. No one cares enough or the problem seems too hard. 
And yet, I'm sure if we gave all the data to the guys at Facebook, they'd have it figured out over the weekend.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-2719935779042802887?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/2719935779042802887/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/03/academically-adrift.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2719935779042802887'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2719935779042802887'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/03/academically-adrift.html' title='Academically Adrift'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh4.googleusercontent.com/-qTtXOfjL5C8/TX32QI2Kf0I/AAAAAAAAAWk/ClLijLrEzys/s72-c/blog.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-101306024263329935</id><published>2011-02-25T06:28:00.000-05:00</published><updated>2011-02-25T06:28:53.542-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='higher education'/><title type='text'>UniLeaks</title><content type='html'>Perhaps this was inevitable...a &lt;a 
href="http://www.unileaks.org/index.php"&gt;WikiLeaks clone&lt;/a&gt; for higher education. Seems to be UK-centric so far, but don't wait for the rush--dust off those fiery curriculum committee minutes and start some controversy!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;img border="0" height="195" src="http://4.bp.blogspot.com/--Gvw6MGmH74/TWeSGRcfNGI/AAAAAAAAAWg/TtshvWBnrBw/s320/unileaks.png" width="320" /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://www.blogger.com/"&gt;&lt;/a&gt;&lt;span id="goog_851518225"&gt;&lt;/span&gt;&lt;span id="goog_851518226"&gt;&lt;/span&gt;Image from &lt;a href="http://unileaks.org/"&gt;UniLeaks.org&lt;/a&gt;.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-101306024263329935?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/101306024263329935/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/02/unileaks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/101306024263329935'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/101306024263329935'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/02/unileaks.html' title='UniLeaks'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://4.bp.blogspot.com/--Gvw6MGmH74/TWeSGRcfNGI/AAAAAAAAAWg/TtshvWBnrBw/s72-c/unileaks.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-4492186613040467403</id><published>2011-01-20T07:06:00.000-05:00</published><updated>2011-01-20T07:06:19.703-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='FACS'/><category scheme='http://www.blogger.com/atom/ns#' term='assessment'/><title type='text'>Individual FACS reports</title><content type='html'>I have about two dozen web pages marked to write articles about, but haven't found the time. I'm trying to wrap up part three of my novel (see &lt;a href="http://lifeartificialblog.blogspot.com/"&gt;that blog&lt;/a&gt;), and still working on writing up my research. A couple of new things for me:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Our non-cognitive research is going well. Preliminary results based on one semester's grades indicate that a non-cog survey has the potential to add information to the usual enrollment inputs. We'll have much better data in the fall, with a whole year of grades and year-to-year retention data.&lt;/li&gt;&lt;li&gt;We built and launched an early alert system for tracking students who have academic difficulty early in the semester. It's in the testing phase right now.&lt;/li&gt;&lt;li&gt;We're developing a new web site, and one of the most important and (to all&amp;nbsp;appearances) most ignored sections is the description of academic programs. We're spending a large effort there to develop superb pages that will sell programs to students. Stay tuned. Our Google Analytics shows that these are the most frequented pages by outside visitors. Not surprising, since academics is the product a university sells.&lt;/li&gt;&lt;li&gt;I've done a lot of development on reporting FACS scores. There is now a self-serve site to generate reports by term, program, or class. 
This week I added the ability to drill down to the individual student level, for use by advisors. Here's an example:&lt;/li&gt;&lt;/ul&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_KC3QZzlY64A/TTgkJcjQ6HI/AAAAAAAAAWQ/vOuwsb7W1eI/s1600/ifacs.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="488" src="http://1.bp.blogspot.com/_KC3QZzlY64A/TTgkJcjQ6HI/AAAAAAAAAWQ/vOuwsb7W1eI/s640/ifacs.JPG" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;br /&gt;The red lines and bars show where this second-year student is performing relative to other students, based on faculty assessments from the prior semester. At a glance you can see that this student is performing well below par, and according to professors is also not putting forth effort at the same level as peers. The sample sizes are necessarily small (although they will increase as the assessment becomes institutionalized). 
Note, however, that all three raters agreed that creative thinking was demonstrated at the pre-college "developmental" level.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Off to the Southern Education Foundation&lt;/b&gt;&amp;nbsp;today to attend an assessment meeting in San Antonio.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-4492186613040467403?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/4492186613040467403/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2011/01/individual-facs-reports.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4492186613040467403'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/4492186613040467403'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2011/01/individual-facs-reports.html' title='Individual FACS reports'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_KC3QZzlY64A/TTgkJcjQ6HI/AAAAAAAAAWQ/vOuwsb7W1eI/s72-c/ifacs.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-2012378339495829646</id><published>2010-12-07T14:51:00.001-05:00</published><updated>2010-12-07T14:53:51.792-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SACS'/><category scheme='http://www.blogger.com/atom/ns#' term='twitter'/><title 
type='text'>SACS gets a Back Channel</title><content type='html'>Last year's tweets from the SACS/COC December meeting were almost non-existent. That has been remedied this year thanks to contributions from several indefatigable tweeters. You can see them at&amp;nbsp;&lt;a href="http://twitter.com/search?q=%23sacs"&gt;http://twitter.com/search?q=%23sacs&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_KC3QZzlY64A/TP6P8GwpgyI/AAAAAAAAAWE/jXUSopIz72k/s1600/twitter.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="217" src="http://3.bp.blogspot.com/_KC3QZzlY64A/TP6P8GwpgyI/AAAAAAAAAWE/jXUSopIz72k/s320/twitter.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I have notes to write up and post here, but will have to wait until the weekend. The big news for me was that Coker's Fifth Year report sailed through with no&amp;nbsp;recommendations. Congrats to Kaye and Pat and Daniel and everyone I don't know about who made that happen.&lt;br /&gt;&lt;br /&gt;Here's a gem I found in the resource room, while picking over the 3.3.1 sections (paraphrased):&lt;br /&gt;&lt;blockquote&gt;&lt;i&gt;Objective:&lt;/i&gt; 60% of the students taking the test will be in the 100th percentile&lt;/blockquote&gt;I know what they meant, but I got a good laugh out of it. 
It puts Lake Wobegon to shame!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-2012378339495829646?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/2012378339495829646/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2010/12/sacs-gets-back-channel.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2012378339495829646'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/2012378339495829646'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2010/12/sacs-gets-back-channel.html' title='SACS gets a Back Channel'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_KC3QZzlY64A/TP6P8GwpgyI/AAAAAAAAAWE/jXUSopIz72k/s72-c/twitter.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-3512697782153272533</id><published>2010-12-03T08:42:00.000-05:00</published><updated>2010-12-03T08:42:50.924-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='SACS'/><category scheme='http://www.blogger.com/atom/ns#' term='twitter'/><title type='text'>SACS 2010</title><content type='html'>I'm off to Louisville tomorrow morning for the December SACS meeting. 
Last year's backchannel was about zilch, and it looks like #SACS means something in another context, judging from the &lt;a href="http://twitter.com/search#search?q=%23SACS"&gt;Twitter search for #SACS&lt;/a&gt;. I'll tweet to #SACS anyway, from my phone or iPad. Send me an email if you want to have a coffee and compare notes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-3512697782153272533?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/3512697782153272533/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2010/12/sacs-2010.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3512697782153272533'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/3512697782153272533'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2010/12/sacs-2010.html' title='SACS 2010'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-1847268076622330935</id><published>2010-11-21T09:03:00.009-05:00</published><updated>2010-11-21T15:23:24.071-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='collatz'/><category scheme='http://www.blogger.com/atom/ns#' term='artificial life'/><category scheme='http://www.blogger.com/atom/ns#' term='evolution'/><category scheme='http://www.blogger.com/atom/ns#' term='math'/><title type='text'>Collatz 
Ecologies</title><content type='html'>I mentioned the &lt;a href="http://en.wikipedia.org/wiki/Collatz_conjecture"&gt;Collatz Conjecture&lt;/a&gt; in "&lt;a href="http://highered.blogspot.com/2010/11/on-design.html"&gt;On Design&lt;/a&gt;" as an example of the qualitative difference between simulation and inverse problem solving. In this article I want to use it for another purpose: to show how structure emerges out of iteration. Specifically, I want to create a very simple model of Darwinian evolution and demonstrate with simulations and mathematical proof that patterns emerge naturally. In a later post I will talk more about what is significant about this, but here's the preview: when stable patterns emerge in some iterated system, it's possible to build new systems on top of the old ones. Moreover, these new systems can be seen as independent of the old ones. The full discussion on that can wait. This is the fun part.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Iteration.&lt;/b&gt; At the heart of the conjecture is a single branching formula that works on (usually positive) integers:&lt;br /&gt;&lt;blockquote&gt;$N_{new}:=\left\{ {N_{old}/2\mbox{ if even}\atop (3N_{old}+1)/2\mbox{ if odd}}\right.$&lt;/blockquote&gt;I have used the more compact form of the formula that goes ahead and divides by two in the odd case, since 3N+1 will always give an even number when N is odd. As an example, starting with 8, we get the sequence 8 -&amp;gt; 4 -&amp;gt; 2 -&amp;gt; 1, since every term but the last is even (8 is a power of 2). A more interesting example is 3 -&amp;gt; 5 -&amp;gt; 8 -&amp;gt; 4 -&amp;gt; 2 -&amp;gt; 1. The unproven conjecture is that any starting number eventually ends up at one. If the conjecture is not true, then for some starting N, either the sequence grows without bound, or it forms a repeating loop. 
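&lt;br /&gt;&lt;br /&gt;As a rough illustration (this is not the Perl script linked at the end of the post), the compact iterator can be sketched in a few lines of Python. The names collatz_step and trajectory are hypothetical, chosen only for this example, and the loop assumes the starting number does eventually reach 1:&lt;br /&gt;&lt;pre&gt;
```python
def collatz_step(n):
    """One step of the compact Collatz map: n/2 if even, (3n+1)/2 if odd."""
    if n % 2 == 0:
        return n // 2
    return (3 * n + 1) // 2


def trajectory(n):
    """List the values visited from n until the sequence first reaches 1."""
    path = [n]
    while n != 1:  # assumes the conjecture holds for this starting value
        n = collatz_step(n)
        path.append(n)
    return path


print(trajectory(8))  # [8, 4, 2, 1]
print(trajectory(3))  # [3, 5, 8, 4, 2, 1]
```
&lt;/pre&gt;The two printed lists match the example sequences given above.&lt;br /&gt;&lt;br /&gt;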
The Wikipedia page has a nice summary of what is known.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Artificial Life&lt;/b&gt; is the use of computer simulations to understand biological-like behavior. &lt;a href="http://en.wikipedia.org/wiki/Conway's_Game_of_Life"&gt;Conway's Game of Life&lt;/a&gt; is one of the best known. For more on the general topic see the Wiki on &lt;a href="http://en.wikipedia.org/wiki/Artificial_life"&gt;ALife&lt;/a&gt;. For our purposes here, I want to use the Collatz iterator to construct a population that is subject to Darwinian evolution. For that we need these components:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;An initial state from which to begin the simulation. In practice this will be an individual species, identified with an odd integer 3,5,7...&lt;/li&gt;&lt;li&gt;A simulation of change over time. This will be the Collatz iterator acting on the numerical species.&lt;/li&gt;&lt;li&gt;A fitness function. If a species "evolves to" 1 via the iteration, it is eliminated from the population. The Collatz Conjecture in this context is that all species eventually go extinct.&lt;/li&gt;&lt;li&gt;Reproduction and variation. For any odd numbered species, when it is transformed by the iterator, an imperfect copy is also created. The copy is the species N-2, where N is the new number of the original after iteration. For example, when 11 -&amp;gt; (3*11+1)/2 = 17, the species 17 - 2 = 15 is also added to the population. &lt;/li&gt;&lt;li&gt;A carrying capacity C. When the population comprises C species already, reproduction is paused until a vacancy opens up through some species going extinct. &lt;/li&gt;&lt;/ol&gt;&lt;div&gt;For my purposes I don't care how many copies of an individual species exist, since they will all share the same deterministic fate. It also helps keep the computation cost down. 
So if 17 already exists in the population, and another 17 comes along, I don't create two copies of 17.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Initial Results&lt;/b&gt;. ALife sims are just plain fun to play with. I include my code at the bottom of this post in case you want to try it yourself. For what follows, the carrying capacity is set to 100. Here are the first seven starting species charted over 20 generations.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_KC3QZzlY64A/TOkS37kPX5I/AAAAAAAAAVw/xW6lCRQ6HL8/s1600/collatz1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_KC3QZzlY64A/TOkS37kPX5I/AAAAAAAAAVw/xW6lCRQ6HL8/s1600/collatz1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We see exponential growth capped at C except for N=3 and N=5 (they overlap), which form the line at the bottom. What's going on there? &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A little investigation shows that the population {2,3,4,5,6,8} forms a closed "ecosystem" that is generated by starting with 3 or 5 (or 6 if we include evens). Moreover, this system follows a stable pattern that will go on forever, showing that there exist Collatz ecologies that never go extinct. In fact, if any other N-ecology ever generates 3 or 5, it will create this pattern as well, and so will also live forever. This gives us:&lt;br /&gt;&lt;blockquote&gt;&lt;i&gt;Conjecture 1&lt;/i&gt;: All odd N-ecologies with N &amp;gt; 1 survive forever for sufficiently large C&lt;/blockquote&gt;I hedged a bit there, but I don't think C needs to be very large at all. 
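&lt;br /&gt;&lt;br /&gt;The closed ecosystem described above can be checked with a short Python sketch. This is not the Perl script linked at the end of the post. The function names are hypothetical, and the order in which copies fill vacancies (smallest first, after all survivors are placed) is an assumption made for this sketch, since the rules above do not pin it down:&lt;br /&gt;&lt;pre&gt;
```python
def collatz_step(n):
    """Compact Collatz map: n/2 if even, (3n+1)/2 if odd."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def next_generation(population, capacity=100):
    """Advance every species (all assumed greater than 1) by one step."""
    survivors = set()
    offspring = set()
    for n in population:
        new = collatz_step(n)
        if new != 1:          # a species that reaches 1 goes extinct
            survivors.add(new)
        if n % 2 == 1:        # odd parents also leave the imperfect copy new - 2
            offspring.add(new - 2)
    # Assumption: copies fill vacancies smallest-first until capacity is reached.
    for child in sorted(offspring - survivors):
        if len(survivors) >= capacity:
            break
        survivors.add(child)
    return survivors


def run(population, generations, capacity=100):
    """Iterate the ecology for a fixed number of generations."""
    current = set(population)
    for _ in range(generations):
        current = next_generation(current, capacity)
    return current


print(sorted(run({3}, 20)))  # [2, 3, 4, 5, 6, 8]
print(sorted(run({5}, 20)))  # [2, 3, 4, 5, 6, 8]
```
&lt;/pre&gt;Under these assumptions, starting from either 3 or 5 settles into the population {2,3,4,5,6,8} and stays there, because each generation of that set maps back onto itself.&lt;br /&gt;&lt;br /&gt;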
Another question that we can frame as an affirmative in the form of a conjecture is:&lt;/div&gt;&lt;blockquote&gt;&lt;i&gt;Conjecture 2:&lt;/i&gt; There is only one bounded ecosystem for odd N &amp;gt; 1.&lt;/blockquote&gt;&lt;div&gt;Here bounded means that the largest species encountered doesn't grow beyond a ceiling. It looks like the other graphs are bounded too, because they run into C, but this is only true for the &lt;i&gt;number of species at any one time&lt;/i&gt;. The individual species identifications continue to grow. We'll look at that next.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The graph below shows the population of N=9 after 16 generations. It has been capped by C, and so the number of species will stay at about 100, but which ones those are will continue to change. The individual species numbers are shown by the heights of the bars. The x-axis is unimportant--it just arranges the species from smallest to largest. Each bar is a distinct species. The four highest ones are {1893,1895,1896,1898}.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_KC3QZzlY64A/TOiZaTn1cEI/AAAAAAAAAVY/sapnN0KIrro/s1600/collatz2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_KC3QZzlY64A/TOiZaTn1cEI/AAAAAAAAAVY/sapnN0KIrro/s1600/collatz2.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Obviously there's a lot of structure here. If we plot the populations of all 16 generations together, we see a consistent pattern over time. Each generation has its own color:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_KC3QZzlY64A/TOiY7bKcy6I/AAAAAAAAAVU/MjdOwyvGh6A/s1600/collatz3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_KC3QZzlY64A/TOiY7bKcy6I/AAAAAAAAAVU/MjdOwyvGh6A/s1600/collatz3.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The quartet we saw as the top plateau for the population seems to be an enduring structure. Moreover, this pattern is visible in other populations, for example those created from N = 9, 13, and 25:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_KC3QZzlY64A/TOiaqYwMItI/AAAAAAAAAVc/8qYYACNgECs/s1600/collatz4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_KC3QZzlY64A/TOiaqYwMItI/AAAAAAAAAVc/8qYYACNgECs/s1600/collatz4.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The graph shows the 16th generation of each, when N=9 and 25 are just hitting the carrying capacity. The quartet and plateau patterns are obvious in all three.&lt;br /&gt;&lt;br /&gt;A note about the above graphs. I was constantly fiddling with the program, running different numbers of generations and messing around with how reproduction gets handled. 
The final version of the program kills any number that attempts to grow beyond a billion, in order to prevent integer overflows. So if you run the program at the bottom and compare it to these you'll see minor differences at the upper end. The graphs above were generated without that constraint, but no overflows actually happened in the data represented.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Quartets are not hard to explain. &lt;/b&gt;Whenever N begets another two odd numbers by the action of the iterator and mutator/reproducer, a quartet will be formed. And because the "two odds in a row" means that the starting number is multiplied by essentially (3/2)(3/2) = 2.25, it and its three new kinfolk will leapfrog over the competition. Suppose for some odd N we get another odd number (plus new variation). This will happen when  $(3N+1)/2 = 2k+1$ for some $k=1,2,...$. Solving for $N$ gives $N=(4k+1)/3$. This fraction is only an integer when  $k$ is of the form $k=3m+2, m=0,1,...$ Plugging all this back in, we find that $N=4m+3, m=0,1,...$, so $N=3,7,11,15,19,...$ all have the property that they are odd numbers that also yield new odd numbers under iteration. When this happens, two pairs of new species are generated. In terms of $m$ they are $9m+8, 9m+6, 9m+5, 9m+3$. This gives exactly the pattern we observe in the simulations.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_KC3QZzlY64A/TOhh428xYEI/AAAAAAAAAU8/y0elHHOPOrg/s1600/collatz5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_KC3QZzlY64A/TOhh428xYEI/AAAAAAAAAU8/y0elHHOPOrg/s1600/collatz5.png" /&gt;&lt;/a&gt;&lt;/div&gt;Plotting the population of N=7 over 100 generations against a log scale shows how the shape of the ecology changes over time. The lines of dots sloping up represent the constant log(3/2) growth of the odds. 
The dots seem to pretty much cover all the integers (not all at once, but eventually). That gives us a third question:&lt;br /&gt;&lt;blockquote&gt;&lt;i&gt;Conjecture 3:&lt;/i&gt; Except for N = 1, 3, or 5, the ecology for odd N eventually reaches any given positive integer.&lt;/blockquote&gt;As supporting evidence (certainly not proof!), if we look at N=7,9,...99 and locate the missing integers after 1000 generations, we see that every one has swept up the integers through 250. &lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_KC3QZzlY64A/TOkUSp1UMHI/AAAAAAAAAV0/Xe1wyX-jAGg/s1600/collatz6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_KC3QZzlY64A/TOkUSp1UMHI/AAAAAAAAAV0/Xe1wyX-jAGg/s1600/collatz6.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;The horizontal axis is N (odd numbers from 7 to 99), and the vertical axis locates the missing value: each dot shows an integer that was missed by the N-ecology after 1000 generations. 
The distribution of these misses, inclusive of all the populations above, shows some structure:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_KC3QZzlY64A/TOkUqUWekxI/AAAAAAAAAV4/segr_OhExvM/s1600/collatz7.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_KC3QZzlY64A/TOkUqUWekxI/AAAAAAAAAV4/segr_OhExvM/s1600/collatz7.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Is there some crazy number out there that isn't reachable by a given N-ecology? Does the carrying capacity change this one way or the other, compared with straight exponential growth?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Behavior at the point of capacity&lt;/b&gt; shows another emergent pattern. We see this in the number of species in an N-ecology after it faces the limitation of C, and new species are created much more slowly. We would expect that larger Ns have lower extinction rates once population size C is reached. This is because in order for a species to die, it has to land on a power of two, after which it zips to 1 and goes extinct. Powers of two are distributed more sparsely (on a linear scale) as N increases. Of course, there is only one "2" in the population at any given time, so the actual extinction rate due to the 2 -&amp;gt; 1 evolution is either zero or one each generation. The graph below shows 1000 generations, averaging the odd Ns from 7 to 99, giving the demand for new population, the actual number created, and the extinction rate. The first of these would be expected to be half the population, since the odds get reproduced, and indeed we see the line bounces around 50. 
The extinction events come from 2-&amp;gt;1 and from the overflow limit, when a number tries to grow beyond a billion.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_KC3QZzlY64A/TOkccZIitFI/AAAAAAAAAV8/WWYPObQFRRQ/s1600/collatz8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_KC3QZzlY64A/TOkccZIitFI/AAAAAAAAAV8/WWYPObQFRRQ/s1600/collatz8.png" /&gt;&lt;/a&gt;&lt;/div&gt;The red line is more interesting. It shows the actual number of slots that opened up for new population. If we subtract out the ones we can account for due to 2-&amp;gt;1 or overflow, we get a constant average of about 5.4. This must be the rate at which two species turn into one species due to the action of the iterator. In other words,&lt;br /&gt;&lt;blockquote&gt;$(3N_1 + 1)/2 = N_2 / 2$ or $3N_1 + 1 = N_2$.&amp;nbsp;&lt;/blockquote&gt;How likely is it that $N_2 \mod 3 = 1$? It should be one third, but we have to remember that $N_1$ is odd and $N_2$ is even, so for positive integers $n_1$ and $n_2$ we have $3(2n_1+1)+1 = 2n_2$ which gives us $3n_1 + 2 = n_2$. That doesn't change the odds. So if all integers were in the population at the same time, we'd expect a third of them to join up after iteration, decreasing the new generation's size by one each. Given that we have to account for an actual average loss of 5.4, this seems to imply that at any given time, the ecology is 16% saturated (5.4%/33%). At 100% saturation, all integers are present, and one third will disappear due to collisions. If this analysis is right, is this saturation level really constant over time? It seems unlikely.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Reference.&lt;/b&gt; I discovered by Googling around that Hiroki Sayama had a similar idea to use the Collatz sequence to study artificial life. 
His version is completely different from mine, and you can find it &lt;a href="http://mitpress.mit.edu/books/chapters/0262290758chap75.pdf"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The code. &lt;/b&gt;&lt;a href="http://snipplr.com/view/44573/collatz-ecology-generator/"&gt;[download perl script from snipplr]&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_KC3QZzlY64A/TOklJ0Gz96I/AAAAAAAAAWA/qcpsOmzpRNQ/s1600/collatz10.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_KC3QZzlY64A/TOklJ0Gz96I/AAAAAAAAAWA/qcpsOmzpRNQ/s1600/collatz10.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;b&gt;Edit: &lt;/b&gt;I made a couple of minor edits to two sentences after posting.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/20035359-1847268076622330935?l=highered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://highered.blogspot.com/feeds/1847268076622330935/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://highered.blogspot.com/2010/11/collatz-ecologies.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1847268076622330935'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/20035359/posts/default/1847268076622330935'/><link rel='alternate' type='text/html' href='http://highered.blogspot.com/2010/11/collatz-ecologies.html' title='Collatz 
Ecologies'/><author><name>dave</name><uri>http://www.blogger.com/profile/08633920160358488401</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://3.bp.blogspot.com/-sugmOxPHNqo/Tk5cXuhcycI/AAAAAAAAAaY/dhBdJ17ZzlI/s220/self.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_KC3QZzlY64A/TOkS37kPX5I/AAAAAAAAAVw/xW6lCRQ6HL8/s72-c/collatz1.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-20035359.post-5052070486619561732</id><published>2010-11-20T08:06:00.001-05:00</published><updated>2010-11-20T16:31:14.541-05:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='strategy'/><category scheme='http://www.blogger.com/atom/ns#' term='decisions'/><category scheme='http://www.blogger.com/atom/ns#' term='planning'/><category scheme='http://www.blogger.com/atom/ns#' term='design'/><category scheme='http://www.blogger.com/atom/ns#' term='higher education'/><title type='text'>The Long View</title><content type='html'>In "&lt;a href="http://highered.blogspot.com/2010/11/on-design.html"&gt;On Design&lt;/a&gt;," I gave three versions of designing for an outcome: soft, forward, and inverse, in increasing degree of difficulty. The question "how difficult is inverse design?" is of utmost importance when we consider complex systems. As a real example, consider how the government of the United States is "designed." By this I mean, the way laws and policies are created and enforced. Because of conflicting goals of different constituents and the inherent difficulty of the project, there is no complete empirical language to describe a state of affairs, let alone do a forward simulation to see the status of the nation in, say, three years. And even if we &lt;i&gt;did&lt;/i&gt;&amp;nbsp;have such a language, we would be limited to simulation and prediction of only the "easiest" parameters. 
And even these would be subject to the whims of entropy, a subject I'll take up later.&lt;br /&gt;&lt;br /&gt;I submit that individual governments, as well as companies, universities, and militaries, use soft design with bits and pieces of forward and inverse design thrown in (for example, trying to forecast near-term economic conditions to help determine monetary policy).&lt;br /&gt;&lt;br /&gt;Government in the general sense has been "designed" by the process of being tossed into the blender of fate, to be tested by real events. See the following video of the history of Europe to see what I mean.&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;object height="385" width="480"&gt;&lt;param name="movie" value="http://www.youtube.com/v/I3HPIG_rUHQ?fs=1&amp;amp;hl=en_US&amp;amp;rel=0"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/I3HPIG_rUHQ?fs=1&amp;amp;hl=en_US&amp;amp;rel=0" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;br /&gt;Natural selection would seem to be at work here, weeding out the worst designs. But it's not Darwinian, because the countries change rapidly over time with the population and minds of leaders. One truly spectacular bad idea (like invading Russia, apparently) can bring down a whole nation. So what we are left with is a very temporary list of "least bad designs." Of course, many other factors are important, such as geography, natural resources, and so on. Even so, the people who live there still have to make use of such advantages. If Switzerland abandoned its natural mountain fortress and invaded Russia, it likely wouldn't end well.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Darwinian evolution&lt;/b&gt; is different from this national evolution. 
In the former, good solutions can be remembered and reused through genes or any other information passed from generation to generation. Diversity is created through recombination, mutation, population isolation, and so on. Darwinian evolution comes with an empirical language that we partly understand. To make a metaphor of it, "programs" are written in phenotypes and these "are computed by" the laws of physics and chemistry using the design and environment as "inputs." The fact that scientists can discover this language and use it to make predictions should be appreciated for the miracle that it is: we are witnesses to a dynamic but understandable problem-solving machine of enormous scope that has worked spectacularly well at producing&amp;nbsp;viable&amp;nbsp;designs &lt;i&gt;with only an empirical language&lt;/i&gt;. Evolution does not use predictive techniques (that is, anticipating that a critter will need wings and therefore building them--for a dramatic example of this, see &lt;a href="http://www.youtube.com/watch?v=cO1a1Ek-HD0"&gt;this video&lt;/a&gt;). But there do exist creatures who &lt;i&gt;do&lt;/i&gt;&amp;nbsp;use forward and inverse design to plan their day. If you throw a ball at a target, you're predicting. If you go to the fridge to get food, you're using the inverse technique: starting with the outcome (get food) and working back to the solution. But this still isn't good enough.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Here's the rub:&lt;/b&gt; forward design isn't enough to guarantee any more than short term outcomes, and our ability to do
