
Tuesday, December 03, 2019

Causal Inference

Introduction

We all develop an intuitive sense of cause and effect in order to navigate the world. It turns out that formalizing exactly what "cause" means is complicated, and I think it's fair to say that there is not complete consensus. A few years ago, I played around with the simplest possible version of cause/effect to produce "Causal Interfaces," a project I'm still working on when I have time. For background I did a fair amount of reading in the philosophy of causality, but that wasn't very helpful. The most interesting work done in the field comes from Judea Pearl, and in particular I recommend his recent The Book of Why, which is a non-technical introduction to the key concepts.

One important question for this area of research is: how much can we say about causality from observational data? That is, without doing an experiment?

Causal Graphs

The primary conceptual aid, both in philosophy and in Pearl's work, is a diagram of circles and arrows--called a "graph" in math, despite the confusion this can cause. In particular, these are Directed Acyclic Graphs (DAGs), where "acyclic" means the arrows can't be followed to arrive back at your starting point (no cycles). DAGs encompass a wide variety of problems we're interested in, including confounding scenarios.
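As a minimal sketch, a DAG can be written as a mapping from each variable to its direct causes; Python's standard graphlib will refuse to order it if the arrows form a cycle. The variable names here are illustrative (a classic smoking example), not from any particular dataset:

```python
from graphlib import TopologicalSorter

# A toy causal DAG, written as effect -> set of direct causes.
# "genotype" plays the role of a confounder of smoking and cancer.
dag = {
    "genotype": set(),
    "smoking": {"genotype"},
    "tar": {"smoking"},
    "cancer": {"tar", "genotype"},
}

# static_order() raises CycleError if the arrows can be followed back
# to their start, so a successful sort certifies the graph is acyclic;
# it also lists causes before their effects.
order = list(TopologicalSorter(dag).static_order())
print(order)
```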

If you want a quick primer on this subject, take a look at Fabian Dablander's "An introduction to causal inference." He does an excellent job at describing a hierarchy of knowledge about systems: look, do, imagine, which we could rephrase as observe, experiment, model. 

Application

In the comments to Fabian's article there's a reference to an R package that calculates generalized (i.e. non-linear) correlation coefficients, which looks interesting but complicated. I found another R package called pcalg, which directly implements the graph operations that are integral to Pearl's theory.

The image below comes from page 17 of the linked documentation. It's the result of simulating data with known properties, and then asking the algorithm to uncover the properties.

From the pcalg documentation: left is the true model, right is the one discovered by the causal analysis

The diagram shows an example of the algorithm inferring the most likely causal structure when given only observational data. We can see that the skeleton of the graph is correct, but not all of the arrow directions are discoverable from the data set (e.g. between 2 and 3).
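The reason some directions are unrecoverable can be seen without pcalg: two models with the arrow pointing opposite ways can generate observationally identical data (they are "Markov equivalent"), so no algorithm can orient that edge from observation alone. A small sketch, using simple linear-Gaussian models of my own choosing:

```python
import random

random.seed(0)
n = 50_000

# Model A: X causes Y.   Model B: Y causes X.
xs_a = [random.gauss(0, 1) for _ in range(n)]
ys_a = [x + random.gauss(0, 1) for x in xs_a]

ys_b = [random.gauss(0, 1) for _ in range(n)]
xs_b = [y + random.gauss(0, 1) for y in ys_b]

def corr(u, v):
    # Pearson correlation, computed from scratch to stay dependency-free
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    su = (sum((a - mu) ** 2 for a in u) / len(u)) ** 0.5
    sv = (sum((b - mv) ** 2 for b in v) / len(v)) ** 0.5
    return cov / (su * sv)

# Both come out near 1/sqrt(2) ~ 0.71: the observational joint
# distribution can't tell the two arrow directions apart.
print(round(corr(xs_a, ys_a), 2), round(corr(xs_b, ys_b), 2))
```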

This is fascinating stuff, and I look forward to trying it out on, say, retention analysis. 

Monday, May 26, 2014

2014 AIR Forum: Correlation, Prediction, and Causation

I've put my slides for my AIR presentation on dropbox. You can access them here. The text of my remarks is included as comments under the slides.

Thursday, May 15, 2014

Why "Correlation doesn't imply Causation" isn't very sophisticated

At www.tylervigen.com you can find graphs of two variables that are correlated over time, but aren't plausibly causal. For example, the divorce rate in Maine versus margarine consumption. On his blog, David R. MacIver argues that coincidences like these are inevitable in large data sets. He's right, but there's a more fundamental problem with "correlation doesn't imply causation."

Causality is widely discussed by researchers, and Judea Pearl gives a historical perspective here. Correlation is a statistic computed from paired data samples that measures the strength of their linear relationship.

Causation is one-directional. If A causes B, we don't normally assume that B causes A too. The latter implication doesn't make sense because we insist on A preceding B. Correlation, however, is symmetrical--it can't distinguish between these two cases. A causing B or B causing A give the same numerical answer. In fact, we can think of the correlation coefficient as an average causal index over A => B and B => A [1, pg 15-16].

What we should really say is that "implication doesn't imply causation," meaning that if our data supports A => B, this doesn't necessarily mean that A causes B. If we observe people often putting on socks and then shoes (Socks => Shoes), it doesn't mean that it's causal. The causes ?? => socks and ??? => shoes may be related somehow, or it may just be a coincidence. (We can mostly rule out coincidence with experimentation.)

Everyone knows that even if A and B are highly correlated, that doesn't necessarily identify a causal relationship between the two, but it's even worse than that. A and B can have a correlation close to zero while A still causes B. So correlation fails in both directions.

Example: Suppose that S1 and S2 control a light bulb L, and are wired in parallel, so that closing either switch causes the light to be on. An experimenter who is unaware of S2 is randomly flipping S1 to see what happens. Unfortunately for her, S2 is closed 99% of the time, so that L is almost always on. During the remaining 1%, S1 perfectly controls L as an on/off interface. The correct conclusion is that closing S1 causes L to be on, but the correlation between the two is small. By contrast, the implication [S1 closed => L is on] is always true. Note that this is different from [S1 open => L is off]. The combination of the two is called an interface in [1], and methods are given to generate separate coefficients of causality.
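A quick simulation makes the point concrete. The circuit below is exactly the hypothetical one described above: S2 hidden and closed 99% of the time, S1 flipped at random by the experimenter.

```python
import random

random.seed(1)
n = 100_000

def corr(u, v):
    # Pearson correlation, computed from scratch
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    su = (sum((a - mu) ** 2 for a in u) / len(u)) ** 0.5
    sv = (sum((b - mv) ** 2 for b in v) / len(v)) ** 0.5
    return cov / (su * sv)

s1_vals, light_vals = [], []
for _ in range(n):
    s1 = random.random() < 0.5     # experimenter flips S1 at random
    s2 = random.random() < 0.99    # hidden switch, closed 99% of the time
    light = s1 or s2               # parallel wiring: either switch suffices
    s1_vals.append(int(s1))
    light_vals.append(int(light))

phi = corr(s1_vals, light_vals)
# [S1 closed => L on] holds in every single trial
implication_holds = all(l == 1 for s, l in zip(s1_vals, light_vals) if s == 1)
print(round(phi, 3), implication_holds)
```

The correlation comes out small (around 0.07) even though S1 genuinely controls the light whenever S2 happens to be open, while the implication never fails.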

This masking is very common. Your snow tires may work really well on snow, but if you live in Florida, you're not going to see much evidence of it. Because correlation is blind to the difference between [A => B] and [~A => ~B], it is an average indicator over the whole interface. It's heavily weighted by the conclusion that ~A does not imply ~B, and therefore the statistic doesn't accurately signal a causal connection.

One last problem with correlation I'll mention: it's not transitive the way we want causality to be. If A causes B and B causes C, we'd like to be able to conclude something about A indirectly causing C. But it's easy to produce examples where A and B have positive correlation, and likewise B and C, yet A and C have zero correlation.
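One standard construction, sketched below: let A and C be independent and B their sum. Then A correlates with B, and B with C, but A and C not at all. (This is a counterexample about correlation itself, not a causal chain.)

```python
import random

random.seed(2)
n = 50_000

def corr(u, v):
    # Pearson correlation, computed from scratch
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    su = (sum((a - mu) ** 2 for a in u) / len(u)) ** 0.5
    sv = (sum((b - mv) ** 2 for b in v) / len(v)) ** 0.5
    return cov / (su * sv)

a = [random.gauss(0, 1) for _ in range(n)]
c = [random.gauss(0, 1) for _ in range(n)]   # independent of a
b = [x + y for x, y in zip(a, c)]            # b is just a + c

# a~b and b~c are each strongly correlated; a~c is not correlated at all
print(round(corr(a, b), 2), round(corr(b, c), 2), round(corr(a, c), 2))
```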


Tomorrow I'll resume the "A Cynical Argument for the Liberal Arts" series with part seven.

[1] Eubanks, D.A. "Causal Interfaces," arXiv:1404.4884 [cs.AI]

Thursday, May 01, 2014

Cause and Because

My new article "Causal Interfaces" at arxiv.org is about disentangling cause and effect. Here's the abstract:
The interaction of two binary variables, assumed to be empirical observations, has three degrees of freedom when expressed as a matrix of frequencies. Usually, the size of causal influence of one variable on the other is calculated as a single value, as increase in recovery rate for a medical treatment, for example. We examine what is lost in this simplification, and propose using two interface constants to represent positive and negative implications separately. Given certain assumptions about non-causal outcomes, the set of resulting epistemologies is a continuum. We derive a variety of particular measures and contrast them with the one-dimensional index.
I was moved to finish the thing, which had been languishing on my computer, because of a deadline for the AIR forum in Orlando later this month. The title of my presentation there is "Correlation, Prediction, and Causation", with the program blurb below.
Everyone knows the mantra “correlation doesn’t imply causation,” but that doesn’t make the desire to find cause-effect relationships disappear! This session will address the relationship between correlation and prediction, and take up the philosophical question of what “causation” can be thought to mean, and how we can usefully talk to decision-makers about these issues. These ideas are immediately useful in analyzing and reporting information to decision-makers, and are both practical and optimistic. The goal is to answer the question “what’s the next best thing we can try to improve our situation?” There is some math involved, but it is not necessary to understand the main ideas.
In my review of literature, I turned up the tome Causality in the Sciences, which is pictured below, decorated by Greg in ILL. I'm not sure why it's upside down--some mysterious cause, no doubt.


As you can see, there is a lot to say on the subject, but there is one particular idea that seems to lie at the heart of cause-effect analysis. I didn't know about it until I read Judea Pearl's work. Here it is, in my words, not his.

If all we do is observe the world, we can never be sure what causes what because there always might be some ultimate cause hidden from us. If we watch Stanislav flip a light switch up and down and observe that a light goes on and off, this prompts the idea that the former causes the latter. But we can't be sure that Stanislav's circuit is not dead, and that in another room Nadia manipulates the live switch. Assume the two of them are timing their moves by the motion of a clock that we cannot observe. The claim that Stanislav's switch causes the light to illuminate is therefore called into doubt.

However, now imagine that we abandon our lazy recline and ask Stanislav if we can flip the switch ourselves. This we do randomly to eliminate coincidence with other variables. We have gone from observation to experiment. If the light's cycle still corresponds to the switch, then we can conclude that the switch causes the light to shine or not to shine.

In Pearl's publications, he uses a do() notation to show that a system variable is set experimentally rather than merely observed. This is a new element that cannot be accounted for in usual statistical methods.
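To make the contrast concrete, here is a toy simulation of the Stanislav/Nadia story, with do() modeled as overwriting the switch with a random value of our own. The scenario (a hidden clock Z driving both switches) is the one assumed above:

```python
import random

random.seed(3)
n = 20_000

def corr(u, v):
    # Pearson correlation, computed from scratch
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    su = (sum((a - mu) ** 2 for a in u) / len(u)) ** 0.5
    sv = (sum((b - mv) ** 2 for b in v) / len(v)) ** 0.5
    return cov / (su * sv)

# Hidden clock Z drives both Nadia's live switch and Stanislav's dead one.
z = [int(random.random() < 0.5) for _ in range(n)]

# Passive observation: Stanislav flips in time with the clock, and the
# light follows Nadia's switch, which also follows the clock.
stan_obs, light_obs = list(z), list(z)

# Intervention, do(S1): we set Stanislav's switch at random ourselves,
# severing its tie to Z. The light keeps following Nadia (i.e. Z).
stan_do = [int(random.random() < 0.5) for _ in range(n)]
light_do = list(z)

print(round(corr(stan_obs, light_obs), 2), round(corr(stan_do, light_do), 2))
```

Observationally, the dead switch looks like a perfect cause of the light; under the intervention, the association vanishes, which is exactly what do() is designed to expose.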

My paper takes up the question of partial causality. Suppose the light corresponds to the switch some of the time, and not other times. What can we conclude in this case?
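As a rough illustration of the idea, and not necessarily the paper's actual definitions, one can read the two implication directions off a 2x2 table as separate conditional frequencies. The counts here are made up:

```python
# Hypothetical 2x2 table of observation counts for binary A and B.
n11, n10 = 90, 10   # A true:  B true / B false
n01, n00 = 30, 70   # A false: B true / B false

# Score each implication direction separately, instead of one index:
pos = n11 / (n11 + n10)   # strength of  A =>  B
neg = n00 / (n01 + n00)   # strength of ~A => ~B
print(pos, neg)           # a single summary number would blur these together
```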

References can be found in the article linked above. Additionally, you may be interested in this reddit post on inferring cause in purely observational systems.