Higher Ed/: Do this, Don't do that

Sign, sign
Everywhere a sign
Blockin' out the scenery
Breakin' my mind
Do this, don't do that
Can't you read the sign?

-"Signs" by Les Emmerson (Five Man Electrical Band)

A February 2023 Ezra Klein podcast pokes at his audience with the title "How Liberals — Yes, Liberals — Are Hobbling Government." He interviews Nick Bagley, a law professor at the University of Michigan, who has fascinating ideas about government regulation. Really.

The whole piece is great, but I want to extract one simple idea: regulations can get in the way of regulation. That is, we can have too many rules for them to be effective in performing their advertised purpose, like maintaining a high-quality educational system or ensuring clean air and water. Bagley notes that small-government-leaning lawmakers can restrict the power of government by creating so many regulations that the functions of an office grind to a halt. But even when lawmakers have good intentions, the effect can be the same. The obvious question is: how much regulation is enough?

Rules, Thin and Thick

I heard the podcast after having read Lorraine Daston's Rules: A Short History of What We Live By. If you want to get up to speed, you can read reviews in The Wall Street Journal, Law & Liberty, and The New Yorker. A key idea is Daston's conception of a "thick" rule, which can be succinct because it relies on good judgment in enforcement (e.g., the golden rule). Exceptions can be made, and rare conditions accommodated, according to the characteristics of a given case. A "thin" rule, by contrast, is very specific and meant to be followed absolutely (e.g., always stop at a stop sign). She observes that, coinciding with the scientific revolution and its absolute "laws of nature," thin rules have increasingly become the norm.

Thin rules appeal to objectivity and fairness ("justice is blind"), whereas thick rules depend on the wisdom and benevolence of rulers. We can see immediately that a population suspicious of its government would tilt toward thin rules. However, governments cannot be run on thin rules exclusively, as I'll discuss below.

For a thin rule to work, we need a reliable means of knowing whether it is being followed. Games like chess have rules that pertain to a small number of game piece types (pawns, etc.) and a checkered board. The piece types are recognizable with respect to their roles, and the rules are unambiguously defined for every board position. This is quite different from a baseball umpire calling balls and strikes, where the rules are just as clear, but their adjudication is somewhat subjective. Chess rules are thinner than baseball's, not because of the way they are written but because of limitations in the reliability of identifying states (e.g., checkmate versus called strike). We may hear fans yelling "the ump is blind," but there's no equivalent for chess.

The first conclusion, then, is that the effectiveness of regulation depends on the specificity of rules and how reliably we can determine the state of compliance. There are two possible ways we might proceed: (1) make rules thinner and thinner until we reach acceptable reliability, or (2) accept that some rules will be relatively thick. Unfortunately, the first option has severe problems.

Reliability

Imagine that some case of potential rule violation is being adjudicated, and we give this task to several independent groups who all use the same evidence for their respective deliberations. If they reach similar conclusions over many such cases, we'd conclude that the process is reliable. This can be formalized in a number of ways, including intuitive measures of "rater agreement." This is a subject I've done work on: see here and here. A more measurement-based approach is to compare the variation in the objects being observed to the variation in instrumentation. A speedometer that wiggles around while the car is moving at constant speed exhibits "noise" or "error," whence we get ideas like signal-to-noise ratios, systematic errors (when they are biased in one direction), and the educational measurement definition of reliability as the ratio of "good" variance to total variance.
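To make the two notions of reliability above concrete, here is a minimal Python sketch. Everything in it is illustrative: the function names, the two "panels" of verdicts, and the speedometer readings are made up for the example, and the variance ratio is only a rough estimate (it treats the spread of per-object averages as the "good" variance).

```python
from statistics import mean, pvariance


def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two raters reach the same verdict."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if expected == 1:  # both raters always give the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


def reliability_ratio(readings_by_object):
    """Rough ratio of 'good' variance (between objects) to total variance.

    readings_by_object maps each observed object to repeated readings of it.
    """
    object_means = [mean(r) for r in readings_by_object.values()]
    signal = pvariance(object_means)  # variation in the things observed
    noise = mean(pvariance(r) for r in readings_by_object.values())  # instrument wiggle
    return signal / (signal + noise)


# Hypothetical data: two independent panels judging the same six cases.
panel_a = ["violation", "ok", "ok", "violation", "ok", "ok"]
panel_b = ["violation", "ok", "violation", "violation", "ok", "ok"]
print(percent_agreement(panel_a, panel_b), cohens_kappa(panel_a, panel_b))

# Hypothetical data: two speedometers read twice at three true speeds.
steady = {30: [29, 31], 60: [59, 61], 90: [89, 91]}
wobbly = {30: [20, 40], 60: [50, 70], 90: [80, 100]}
print(reliability_ratio(steady), reliability_ratio(wobbly))
```

The wobbly speedometer scores lower on the ratio even though it is right on average, which is the sense in which noise, rather than bias, erodes reliability.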

In adjudicating rules, we might roughly divide reliability issues into perception and decision-making. The first concerns the quality of the information available, and the second refers to the potential arbitrariness of decisions (inter-rater agreement again).

Reliability may be associated with fairness, but too much reliability can be harmful, as with mandatory sentencing guidelines that leave no room for a judge's discretion. The illusion of reliability can also be harmful. For example, processes and paperwork can be verified as existing, but the reliable existence of reports, say, does not mean that those reports are meaningful signs of following rules. It is part of the human condition, I think, to equate process reliability with process validity, but it's a bad assumption. This relates to the educational measurement idea of consequential validity, which we can simplify here to "does it really matter?"

Deterministic Rules

The thinnest possible rules are purely logical, like computer programs. That's fortunate, because a vast amount of research has been done on their limits. One of the foundational results, due to Alan Turing (1937), concerns the halting problem. It can be extended to show that many of the questions we would like to ask of a computer program, such as "will it crash?", can't be answered in general.

Suppose you move to a new state, and the power company won't take your check without an in-state driver's license. But the DMV won't give you a driver's license without proof that you live in state, proof being a power bill sent to your address. This is one way for a deterministic process to freeze up: a circular dependency, or deadlock (see also race conditions). There's no general way to analyze a system of rules to eliminate such problems. Software companies have systems for spotting and fixing problems as they arise (debugging), which means continual adjustment to the computer code. Libraries of known-good code are used as building blocks. But this is an uncertain human process, and not guaranteed to succeed.
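The driver's license trap can be written down as a tiny graph of prerequisites. The Python sketch below is a toy under strong assumptions: the rule names are my own labels, and every rule is reduced to a flat list of "you need X before Y." In that simplified setting a circular requirement is easy to find mechanically. Real rules carry conditions, exceptions, and discretion that don't reduce to such a list, which is where the lack of a general method bites.

```python
def find_cycle(requires):
    """Return a chain of items that ends where it starts, or None.

    requires maps each item to the items that must be obtained first.
    """
    visiting = []  # the chain of prerequisites currently being followed
    done = set()   # items already shown to be obtainable without a loop

    def visit(item):
        if item in done:
            return None
        if item in visiting:
            # We have come back to something we are still waiting on.
            return visiting[visiting.index(item):] + [item]
        visiting.append(item)
        for needed in requires.get(item, []):
            cycle = visit(needed)
            if cycle:
                return cycle
        visiting.pop()
        done.add(item)
        return None

    for item in requires:
        cycle = visit(item)
        if cycle:
            return cycle
    return None


# Hypothetical encoding of the new-resident example.
new_resident_rules = {
    "power account": ["in-state driver's license"],
    "in-state driver's license": ["power bill"],
    "power bill": ["power account"],
}
print(find_cycle(new_resident_rules))

# Accepting a lease as proof of residence (a thicker rule) breaks the loop.
with_exception = dict(new_resident_rules)
with_exception["in-state driver's license"] = ["signed lease"]
print(find_cycle(with_exception))
```

Note that the fix in the second example is not a cleverer algorithm but a new exception, which is how thin rule systems tend to grow.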

In practice, software engineering is a lot of planning followed by finding and fixing problems. There are different philosophies of design, as described in The Problem with Software: Why Smart Engineers Write Bad Code. The author concludes that "[...] software developers seem to do everything in their power to make even the easy parts harder by wasting an inordinate amount of time on reinvention and inefficient approaches. A lot of mistakes are in fundamental areas that should be understood by now [...]" (see ACM review).

All of this is to say that the problems of regulation are likely to multiply the thinner we make the rules. The more rules resemble computer programming, the more the resulting system becomes bewilderingly complex and impossible to maintain. When my university's finance and human resources processes moved to an intricate online system, some issues that could previously have been resolved informally suddenly had to be resolved within a system strictly defined by computer programs. A simple phone call to clear up confusion instead became a software "escalation event" to find a solution at a high enough level of abstraction to solve the problem. Thin rules multiply because the world is complex, constantly demanding new exceptions.

Theories of Regulation

The properties of regulation have not gone unnoticed by the academy. There are organizations devoted to the study of regulations, for example, this one, supported by the University of Florida. Here's a snip from that source:

Normative theories of regulation generally conclude that regulators should encourage competition where feasible, minimize the costs of information asymmetries by obtaining information and providing operators with incentives to improve their performance, provide for price structures that improve economic efficiency, and establish regulatory processes that provide for regulation under the law and independence, transparency, predictability, legitimacy, and credibility for the regulatory system.

The role of competition is to take burdens off of the regulator by delegating adjudication to the "real world," which allows a regulator to focus on creating a level playing field, so that "natural laws" of competition apply equally to all, like an umpire in a baseball game.

The same source describes the role of the human regulator, concluding with the advice that

A regulator should carefully map crucial relationships, know their natures, and build a strong regulatory agency. The regulator should also stir and steer, but always with humility, knowing that by stirring the pot the regulator is surfacing problems that others might think the regulator should leave alone, and that by steering the regulator is providing direction that policymakers and lawmakers properly see as theirs to provide, but which they cannot provide because of their limited information and knowledge.

This does not sound like computer programming at all, and it's clear that this conception of leadership leans on professional judgment and the desire to do good.

The thinnest rules I saw mentioned at this source concerned price controls, where rules and metrics mix. And there's an indirect reference to reliability in "Regulation is predictable if regulatory decisions are consistent over time so that stakeholders are able to anticipate how the regulator will resolve issues." Such consistency depends on (1) common understanding of rules, (2) reliable state identification, and (3) reliable adjudication. The middle part of that sandwich is likely to be a problem, because words seem concrete when we write them, but whatever bits of reality correspond to those words may be ill-defined.

Rules for Humans

Rules constraining human behavior are not like computer programs, because logic gates don't have internal motivations. As soon as motivation enters the picture, all this changes, including with computers. One of the early electromechanical computers famously attracted a moth to the warm innards, causing errors in the output. To "debug" a computer was already a term of art, but this made it a literal process. I assume that the glowing vacuum tubes of the subsequent generation of computing machines attracted even more of the furry beasts, but Google has let me down there; it wants to sell me moth balls and vacuum cleaners.

One way that thin rules can be subverted is through motivated nominal adherence, where the regulation is followed to the letter, but not in spirit. Witness the case of VW's car emissions. As described in Car and Driver,

Volkswagen installed emissions software on more than a half-million diesel cars in the U.S.—and roughly 10.5 million more worldwide—that allows them to sense the unique parameters of an emissions drive cycle set by the Environmental Protection Agency. [...]

In the test mode, the cars are fully compliant with all federal emissions levels. But when driving normally, the computer switches to a separate mode—significantly changing the fuel pressure, injection timing, exhaust-gas recirculation [permitting nitrogen-oxide emissions] up to 40 times higher than the federal limit.
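The logic of nominal compliance in the passage above fits in a few lines of Python. This is a toy, not a description of the actual software: the limit, the test-detection heuristic, and the function names are all invented for illustration. Only the "40 times" multiplier echoes the quoted report.

```python
LIMIT = 1.0  # hypothetical emissions limit, in arbitrary units


def looks_like_test_cycle(steering_angle_deg, speed_kmh):
    """Made-up heuristic: wheels are turning but the steering wheel never moves."""
    return steering_angle_deg == 0 and speed_kmh > 0


def emissions(steering_angle_deg, speed_kmh):
    """Toy vehicle that is clean only when it believes it is being tested."""
    if looks_like_test_cycle(steering_angle_deg, speed_kmh):
        return 0.9 * LIMIT  # compliant "test mode"
    return 40 * LIMIT       # normal driving


def thin_inspection(vehicle):
    """The rule as written: run a fixed drive cycle and compare to the limit."""
    fixed_cycle = [(0, 30), (0, 50), (0, 80)]  # (steering angle, speed)
    return all(vehicle(angle, speed) <= LIMIT for angle, speed in fixed_cycle)


def thick_inspection(vehicle):
    """The rule as intended: is the car clean the way people actually drive?"""
    road_conditions = [(0, 50), (5, 50), (-10, 80)]
    return all(vehicle(angle, speed) <= LIMIT for angle, speed in road_conditions)


print(thin_inspection(emissions), thick_inspection(emissions))
```

The thin inspection is perfectly reliable, in that it gives the same answer every time, and still measures the wrong thing.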

The pattern of nominal compliance with rules, combined with self-serving side effects, is a very old one, and the founding principle of classical Cynicism, where Diogenes was instructed by the oracle to "debase the coin of the realm." See my extended thoughts on the role of Cynicism in higher education here, and see Louisa Shea's excellent book The Cynical Enlightenment.

Nominal compliance is related to the idea of cheating, where the rules appear to be followed, but are either being secretly broken or are being followed in a way that subverts their larger purpose. There's an online forum devoted to stories of the latter case (https://www.reddit.com/r/MaliciousCompliance/). One of my favorite stories, however, comes from Richard Goodwin's memoir Remembering America. While serving in the army, he helped prepare for an inspection of the post and discovered that there was one more two-and-a-half ton truck in the motor pool than was listed on the manifest. Having this extra truck created quite a panic until Goodwin came up with the solution. At night, a trusted crew dug a large pit and buried a truck, so it would never be seen by the inspectors and the count would be correct.

The point here is two-fold. First, no thin regulation will capture the overall intent of the rule: there's a gap between the "letter of the law" and the "spirit of the law," as we say. This gap can only be perceived and addressed by a thicker adjudication than the formal (thin) one. If this thick perception did not exist, we would never detect cheating or subversion.

A nice example of these layers of perception comes from Arlie Russell Hochschild's book Strangers In Their Own Land. Louisiana (finally) outlawed driving with unsealed containers of alcohol in 2004, but compliance can be nominal:

At a Caribbean Hut in Lake Charles, a satisfied customer reported ordering a 32-ounce Long Island Iced Tea with a few extra shots, a piece of Scotch tape placed over the straw hole--so it was "sealed"--and drove on (p. 67).

Second, rule enforcement in the face of adverse motivation is an evolving challenge. Arnold Kling (substack) wrote about this in a blog post, "The Chess Game of Financial Regulation":

It turns out that financial regulation is not like a math problem, which can be solved once and stays solved. Instead, financial regulation is like a chess game, in which moves and counter-moves proceed continually, eventually changing the board in ways that players have not anticipated.

Effective regulation, then, is complex and entails second- and third-order thinking about how the rules are likely to change behavior in ways that subvert them. Prohibition in the United States is an example of a massive regulatory failure because of such side effects.

Thick Rules

Not all rules can be thin, because not all rules can be reliably adjudicated. Facebook's moderation efforts, as documented in Steven Levy's very readable Facebook: The Inside Story, provide several examples. Here's one: "lactivists" demonstrated at the company's headquarters to push for photos of breastfeeding moms to be exempt from the no-nudity moderation rules. Daston's book has sometimes-hilarious examples of attempts to use thin rules for thick purposes concerning sumptuary laws in France (e.g., "beaked shoes cannot have points longer than a finger's width"). Another of Arnold Kling's posts relates thin/thick rules (he doesn't use those terms; he talks about formal versus informal rules) to the sociology of small and large groups.

Daston's book describes the adjudication of thick rules as "casuistry," a process of looking at a complex case from all sides, considering rules in the context of exceptions to the rules. Like "Cynicism," the word casuistry has lost its potency; both are victims of modernity's regression to the mean (in German there are two words for the former to mark the change: Kynismus and Zynismus, see here). It's more useful to use them in a more original sense, where they are complementary in my reading:

Casuistry: the art of making judgments for thick rules by considering all angles.
Cynicism: the art of revealing a deeper reality by subverting conventions.

Plato defined humans as "featherless bipeds," whereupon Diogenes tossed a plucked chicken at his feet. Regulators assure us that banks are sound; depositors withdraw their money and demonstrate that "soundness" is just an appeal to have faith. The Ivy League uses SAT scores as proof of ability; rich parents pay smart kids to take the test for their kids.

Casuistry easily handles Cynical acts in adjudicating thick rules. We can see through the Varsity Blues deceptions and know that they are immoral. Thin rules are where Cynical attacks present the most challenges.

Our university accreditor requires that we list all teaching faculty along with the courses they teach and their qualifications for doing so. There are thin rules to handle most of the cases: having a terminal degree in biology means you can teach any undergraduate biology class. For exceptions, the review team must use casuistry (no one calls it that). A yoga teacher for a physical education class probably doesn't have an advanced degree in yoga, for example. I think this combination of thick and thin rules works well in this case. The main vulnerability is to a Cynical attack that omits or misrepresents the evidence. If institutions regularly engaged in such deception, the accreditation process would devolve to paperwork reliability without validity.

Conclusions

My first conclusion is that rules should only be as thin as they need to be, to avoid the measurement costs associated with the required reliability (thin, low-reliability rules are worse than useless). One way to audit an existing set of rules, then, is to examine the reliability of the "how do we know?" challenge. In the case of the emissions requirement, I assume the actual measurements were highly reliable, but they failed to generalize to the performance of cars on the road, creating a gap between perceived reliability (in the lab) and actual meaningfulness.

The reliability calculation is confounded by motivation, which leads me to the second conclusion: regulation will be most effective when the motivation to subvert it is minimized. For example, a law requiring that taxes be included in all retail prices would make everyone's life easier, and as long as everyone has to do this, retailers have no motivation to subvert the rule. The same goes for the German rule (I'm told) that rotates which gas stations are allowed to be open on Sunday, so the staff get a break on the weekend, and on average no gas station loses revenue.

My intent in working through these ideas is to apply them to higher education policy at the levels I engage with, including accreditation. I think there are shared goals in the regulatory triad (states, accreditors, federal government), which suggests that a joint discovery of the appropriate thickness of rules will be productive. Shared goals and collaborative rule-making can ideally result in the amount of trust necessary to permit casuistry for thick rules when they are called for. That is, everyone must agree that in some cases it is acceptable to give up some reliability in order to achieve depth and reasonableness. This may or may not be achievable, but at least these ideas provide a vocabulary to describe what actually does happen.

This article was lightly edited on 5/21/2023, and I added the sealed container example.


Friday, March 17, 2023
