Causation and determinism

In this post I will try to disentangle the notions of determinism and causality, and suggest a different way to think of them. I came to think of these issues via the following informal assertion

The decay of a radioactive atom has no cause

I will not be discussing hidden variable theories nor Bell’s inequalities; I will assume outright that the phenomenon of radioactive decay is intrinsically random (as opposed to “apparent randomness” induced by ignorance), its quantum mechanical description is the most complete model possible; said model assigns probabilities to outcomes. With that out of the way, the usual argument that arrives at the above statement is

1) Radioactive decay is intrinsically random, indeterminate and cannot be predicted

2) There is no physical factor which determines whether a radioactive atom decays or not

3) Therefore, that a specific atom decays has no cause

Although the argument makes sense I am hesitant to accept 3) as is, and what it implies about how we think of causality.

Causality has been confusing minds for hundreds of years, it is a very difficult subject as evidenced by the volumes that have been written on it. So there’s not much point in trying to figure out what causality means exhaustively, via conceptual analysis in the tradition of analytic philosophy. Instead we will just quickly define causality using the mathematics of causal models, and see where that takes us for a specific scenario. In the context of these models, we will define causality according to two complementary questions:

A) what is meant by “the effect of a cause”

B) what is meant by “the cause of an effect”

Two types of causal models have been developed over the last thirty years, causal bayesian networks and structural causal models. These two formalisms are largely equivalent[4]; both make use of graph representations. Vertices in these graphs correspond to the variables under study whereas edges represent causal influences between the variables.  Guided by these graphs, one follows precise procedures to obtain mathematical expressions for causal queries over variables. These expressions are cast in the language of probability.

From [Pearl2000] page 15

In this post I will refer to structural causal models as described in Pearl’s Causality: Models, Reasoning, and Inference [Pearl2000]. To begin, we have the definition[2]


In the above, D is a directed graph whose edges represent causal influences; these influences are quantitavely specified by functions (structural equations) on variables. Finally, probabilities are assigned to variables not constrained by functions, these are exogenous variables.

The effect of a cause

Given a structural causal model, question A) can be answered with the following result



The difference E(Y | do(x’)) – E(Y | do(x”)) is sometimes taken as the definition of “causal effect” (Rosenbaum and Rubin 1983)

The causal effect of changing the variable x’ => x” on y is defined as the difference in expectation of the value that y will take. Note how the formalism includes explicit notation for interventions, do(x).

The cause of an effect

Question b) looks at it from a different point of view. Instead of asking what the effects of some cause are, we ask what the cause of some effect is; it’s a question of attribution. These questions naturally assume the form of counterfactuals (wikipedia):


A counterfactual conditional, is a conditional (or “if-then”) statement indicating what would be the case if its antecedent were true.

For example

If it were raining, then he would be inside.

Before you run off screaming “metaphysics!”, “non-falsifiability!” or other variants of hocus pocus, rest assured: counterfactuals have a clear empirical content. In fact, what grants counterfactuals their empirical content is the same assumption that allows confirmation of theories via experiments: that physical laws are invariant. Counterfactuals make predictions just the same way as experiments validate hypothesis. If I say

“if you had dropped the glass it would have accelerated downwards at g”

I am also saying that

“If you now drop the glass, it will accelerate downwards at g”

Given that I make the assumption that all relevant factors remain equal (ie, gravity has not suddenly disappeared).

The following result allows us to answer queries about counterfactuals[3]:


Once we have expressions for counterfactuals, we can answer questions of type B), with the following results. Note that these results are expressed in terms of counterfactuals, which is why one needs theorem 7.1 as a prerequisite.


This completes the brief listing of key results for the purposes of the discussion.


So what was the point of pasting in all these definitions without going into the details? The point is that given the formalisms of these models and their associated assumptions[5], we can think quantitatively about questions A) and B), without going into the nightmare of trying to figure out what causality “really means” from scratch. Our original criteria now have assumed a quantitative form:

A) The difference in expectation on some value Y when changing some variable X

B1)The probability that some variable X is a necessary requirement for the value of some observed variable Y

B2) The probability that some variable X is a sufficient requirement for the value of some observed variable Y

Thankfully our example of a radioactive atom is very simple compared to the applications causal models were designed for; for our purposes we do not need to work hard to identify the structure nor the probabilities  involved, these are given to us by the physics of nuclear decay.

Feynman diagram for Beta- decay (wikipedia)

Having said this, we construct a minimal model for eg. negative beta decay with the following two variables

r: The neutron-proton ratio, with values High, Normal, Low (using some arbitrary numerical threshold)

d: Whether β- decay occurs at some time t, with values True, False

Our questions, then, are

Q1) What is the causal effect of r=High on d?

Q2) What is the probability of necessity P(N) of r = High, relative to the observed effect d = True?

Q3) What is the probability of necessity P(S) of r = High, relative to the observed effect d = True?

In order to interpret the answers to the above questions we must first go into some more details about causality and the models we have used.

General causes, singular causes, and probabilities

Research into causality has distinguished two categories of causal claims:

General (or type-level) causal claims:

Drunk driving causes accidents.

Singular (or token-level) causal claims:

The light turning on was caused by me flipping the switch.

General claims describe overall patterns in events, singular claims describe specific events. This distinction brings us to another consideration. The language in which causal models yield expressions is that of probability. We have seen probabilities assigned to the value of some effect, as well as probabilities assigned to the statement that a cause is sufficient, or is necessary. But how do these probabilities arise?

Functional causal models are deterministic; the structural equations that describe causal mechanisms (graph arrows) yield unique values for variables as a function of their influences. On the other hand, the exogenous variables, those that are not specified within the model, but rather are inputs to it, have an associated uncertainty. Thus probabilities arise from our lack of knowledge about the exact values that these external conditions have. The epistemic uncertainty spreads from exogeneous variables throughout the rest of the model.

[Pearl2000] handles the general/singular dichotomy elegantly: there is no crisp border, rather there is a continuous spectrum as models range from general to specific, corresponding to how much uncertainty exists in the associated variables. A general causal claim is one where exogenous variables have wide probability distributions; as information is added these probabilities are tightened and the claim becomes singular. In the limit, there is no uncertainty, the model is deterministic.

We can go back to question 1) whose answer can be interpreted without much difficulty.

Q1) What is the causal effect of r=High (high nuclear ratio) on d (decay)?

If physics is correct, having a certain values for r will increase the expectaton of d being equal to True, relative to some other value for r. This becomes a general causal claim,

A1) High nuclear ratio causes Beta- decay

So, relative to our model, high nuclear ratio is a cause of Beta- decay. Note that we can say this despite the fact that decay is intrinsically indeterministic. Even though the probabilities are of a fundamentally different nature, the empirical content is indistinguishable from any other general claim with epistemic uncertainty. Hence, in this particular case determinism is not required to speak of causation.

The more controversial matter is attribution of cause for a singular indeterministic phenomenon, which is where we began.

3) A specific atom decay has no cause

This is addressed by questions 2) and 3).

Q2) What is the probability of necessity P(N) of r = High, relative to the observed effect d = True?

Q3) What is the probability of necessity P(S) of r = High, relative to the observed effect d = True?

Recall, functional causal models assign probabilities that arise from uncertainty in exogenous variables; this is what we see in definitions 9.2.1 and 9.2.2. The phrase “probability of sufficiency/necessity” conveys that sufficiency/necessity is a determinate property of the phenomenon, it’s just that we don’t have enough information to identify it. Therefore, in the singular limit these properties can be expressed as logical predicates

Sufficiency(C, E): Cause => Effect

Necessity(C, E): Effect => Cause

In the case of the decay of a specific atom at some time the causal claims become completely singular, definitions 9.2.1 and 9.2.2 reduce to evaluations of whether the above predicates hold. If we assume that atoms with low nuclear ratio do not undergo Beta- decay, our answers are:

A2) High nuclear ratio is a necessary cause of Beta- decay

A3) High nuclear ratio is not a sufficient cause of Beta- decay

Thus the truth of the statement that the decay of a radioactive atom has no cause depends on whether you are interested in sufficiency or necessity. In particular, that the atom would not have decayed were it not for its high nuclear ratio suggests this ratio was a cause of its decay.

But let’s make things more complicated, let’s say there is a small probability that atoms with low nuclear ratios show Beta- decay. We’d have to say that (remember, relative to our model) the decay of a specific atom at some time has no cause, because neither criterion of sufficiency or necessity is met.

The essence of causality, determinism?

We can continue to stretch the concept. Imagine that a specific nuclear ratio for a specific atom implied a 99.99% probability of decay at some time t, and also that said probability of decay for any other nuclear ratio were 0.001%. Would we still be comfortable saying that the decay of such an atom had no cause?

Singular indeterministic events are peculiar things. They behave according to probabilities, like those of general causation, but are fully specified, like instances of singular deterministic causation. Can we not just apply the methods and vocabulary of general causation to singular indeterministic events?

In fact, we can. We can modify functional causal models such that the underlying structural equations are stochastic, as mentioned in [Pearl2000] section 7.2.2. Another method found in [Steel2005] is to add un-physical exogenous variables that account for the outcomes of indeterministic events. Both of these can be swapped into regular functional models. This should yield equivalent definitions of 9.2.1 and 9.2.2, where probability of sufficiency and necessity are replaced with degrees, giving corresponding versions of A2) and A3).

In this approach, singular causation is not an all or nothing property, it is progressive. Just as general causal claims are expressed with epistemic probabilities, singular causal claims are expressed in terms of ontological probabilities. In this picture, saying that a particular radioactive decay had no cause would be wrong. Instead, perhaps we could say that a specific decay was “partially” or “mostly” caused by some property of that atom, rather than that there was no cause.

I believe this conception of causality is more informative. Throwing out causation just because probabilities are not 100% is excessive and misleading, it ignores regularities and discards information that has predictive content. The essence of causation, I believe, is not determinism, but counterfactual prediction, which banks on regularity, not certainty. It seems reasonable to extend the language we use for general causes onto singular ones, as their implications have the same empirical form. Both make probabilistic predictions, both can be tested.

No cause

What would it mean to say that some event has no cause, according to this interpretation? It would mean that an event is entirely unaffected by, and independent of, any of the universe’s state; no changes made anywhere would alter the probabilities we assign to its occurence. Such an event would be effectively “disconnected” or “transparent”.

Pollock – Lavender Mist Number 1

We could even imagine a completely causeless universe, where all events would be of this kind. It is not easy to see how such a strange place would look like. The most obvious possibility would be a chaotic universe, with no regularities. If we described such a universe as an n-dimensional (eg 3 + 1) collection of random variables, a causeless universe would exhibit zero interaction information, and zero intelligibility, as if every variable resulted of an independent coinflip. But it is not clear to me whether this scenario necessarily follows from a causeless  universe assumption.








[5] See eg causal markov condition, minimality, stability


The essential ingredient of causation, as argued in Pearl (2009:361) is responsiveness, namely, the capacity of some variables to respond to variations in other variables, regardless of how those variations came about.

[7] Laurea and her tolerance