Football without boundaries

Until recently, the offside rule stated that you are in an offside position if any part of the body you can score with is behind the last defender. You can’t score with your arm or hands, so they can’t make you offside. If any other part of you is behind the last defender, you’re in an offside position.

Harry Kane is offside here because he is behind the last defender when the ball is played forward to him. 

Before 2019, we relied on hapless bald linesmen to make offside decisions. These linesmen had to monitor when the ball was played forward and, simultaneously, whether the attacking player was in an offside position. This is a tough task because the ball is often played forward from more than 20 yards away and some footballers (Ronaldo, Agbonlahor, etc.) are quick. Linesmen’s decisions were analysed to death in slow motion by clueless football pundits who, with the camaraderie of the changing room a distant memory, are typically clinically bored.

In 2019 the Premier League brought in the Video Assistant Referee (VAR). Humans were replaced by infallible robot referees housed in a chrome and granite bunker in west London guarded by armed Premier League drone swarms. The VAR process is utterly interminable: decisions are not referred to the European Court of Human Rights, but it feels like it. The new robot referees really don’t want to get it wrong lest they upset Gary Neville. 

All of this means that the precise meaning of the offside rule really starts to matter. A lot of offside decisions are tight. To arrive at an answer in this case, for example, the robot draws geometric lines at tangents off the contorted limbs of Tyrone Mings in a gross perversion of Da Vinci.

Is toothy Brazilian Bobby Firmino offside here? He can score with his shoulder, is his shoulder off? Where does his arm start again? I can score with my nose; can my nose be offside? 

To Graeme Souness, all of this is health and safety gone mad and can’t possibly be right. Plain old bloody common sense means that you can’t be a bloody millimetre offside.

The obvious solution: a thicker line. This will be introduced next season in the Premier League. The line may be up to 10cm thick, so we won’t get any of these nonsense decisions any more. 

***

Picture Boris Johnson. Due to the pandemic and continual mortar shots from Dominic Cummings, he starts to develop progressive stress-induced male pattern baldness, shedding a hair every day. How many hairs would Bohnson have to lose to become bald? Is there a precise answer to this question? Intuitively, removing one hair from his head cannot be the difference between him being bald and not bald. But if we apply this principle to each hair on his head, then Bohnson would still not be bald even if he had no hair.  

This is the sorites paradox, which exploits the vagueness of the term ‘bald’. Most important words are vague, so vagueness might be important. Notably, it might be vague at which point a foetus becomes a person, so vagueness might matter for abortion law. 

The sorites can be stated more formally as follows:

Base step: A person with 100,000 hairs is not bald

Induction step: If a person with n hairs is not bald, then that person is also not bald with n-1 hairs. 

Conclusion: Therefore, a person with 0 hairs is not bald. 
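In symbols (my formalization, not in the original): writing $B(n)$ for ‘a person with $n$ hairs is bald’, the argument runs

$$\neg B(100{,}000); \qquad \forall n \, \big( \neg B(n) \rightarrow \neg B(n-1) \big); \qquad \therefore \;\; \neg B(0)$$

Applying the induction step 100,000 times carries you from the base step all the way to the conclusion.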

The Base step is clearly true and the Conclusion is clearly false, so something must have gone wrong. The natural thing to do is to follow the logic where it leads and to say that the induction step is false. This is the approach that epistemicists take: there is a precise number of hairs we could take off Bohnson’s head at which he becomes bald where before he was not. The point is just that we cannot know where that point is. Our concepts are learned from clear cases – this man is fat, this man is bald – not quantitative definitions – a man of 100kg is fat, a man with fewer than 10,000 hairs is bald. We are bamboozled by the fact that, without us even realising it, these concepts draw sharp boundaries.

The epistemicist approach accepts a crucial lesson I learned when studying philosophy: a boundary has no width.

Some theories of vagueness try to get round this by saying that there are clear cases and a zone of borderline cases in the middle for which it is neither true nor false that Bohnson is bald. 

The most popular theory that takes this approach is known as supervaluationism (discussed here). There are several problems with this approach. Firstly, what does this say about the sorites paradox? Supervaluationism says that the induction step is false, but that for any particular n you might pick as the cutoff, the claim that the cutoff falls there is not true. So, if I were to say that Bohnson minus 50,000 hairs is not bald but Bohnson minus 50,001 hairs is bald, my claim would be neither true nor false. This would apply to any individual number I might pick. Weirdly, for the same reason, supervaluationism says that it is true that “Bohnson minus 50,001 hairs is either bald or not bald” but that it is not true that “Bohnson minus 50,001 hairs is bald” and not true that “Bohnson minus 50,001 hairs is not bald”. This looks like (and is) a contradiction.

Second, note that on the diagram above, the boundaries to the rectangular zone of borderline cases are sharp. So, on this approach, while there is no sharp transition from baldness to non-baldness, there is a sharp transition from borderline baldness to baldness. If so, where is it? Supervaluationism does posit sharp boundaries, but treats them differently to the sharp boundary between baldness and non-baldness which motivated the theory in the first place. This is known as ‘higher-order vagueness’. Boundary moving is not a good solution to vagueness. 

***

Returning to the offside rule, the new thicker line is not really thick; it has just been moved 10cm back from the last defender. In the rules of football, players are offside or they aren’t; there is no purgatory of ‘borderline offside’ where we have a drop ball rather than a free kick. Since a boundary has no width, there just has to be a sharp line – moving it does not solve this problem. Next season, there will still be tight offsides – they will just be measured against a line that has been moved.

If a millimetre can’t be the difference between being offside and not, then, as in the sorites paradox, this implies that someone who is 6 yards offside is not offside. Each season, we will have to move the line. By 2029, the offside rule will have been abolished and goal hanging will be the norm.

According to top referee mandarins, the rationale for the ‘thicker’ line is that it restores “the benefit of the doubt in favour of the attacker”. But that was never the rule. The rule is and always has been crisp and clear – if you’re offside, you’re offside. There’s no mention of doubt. (Indeed, this is why offside decisions are not cases of vagueness). 

Moreover, with VAR, doubt has been eradicated by machines. The decisions are not in doubt, but they are close. Pundits and managers object because the offside rule is now being enforced with previously unattainable accuracy. But the rule has always been that there is a sharp line, and it will ever be thus. If there is a sharp line, then players can be offside by a nose. 

Creating a thicker line fails to come to terms with the fact that the world is full of sharp boundaries. The sorites paradox and VAR make us pay attention to these sharp boundaries. They offend common sense but they exist. 

Clarity

I studied political philosophy at university. This meant I spent a lot of my time doing exegesis of John Rawls, a prominent philosopher. There was a particularly big exegesis industry surrounding Political Liberalism and related work. People just could not agree on what his arguments for political liberalism were.

This is a failing on Rawls’ part. Rule number one when making an argument is clarity. Smart, well-informed people should not be left uncertain about what you are saying by the time they get to the end of your long book.

What difference can my emissions make?

Some people argue against strong action on climate change with the following reasoning: “Whatever we do in this country makes no difference because China produces more than that in a week.” This line of thought is mistaken. There are two ways of thinking about the damage from CO2 emissions:

  1. Cumulative Damage: The damage from CO2 emissions is cumulative, such that the cost of CO2 increases with additional CO2 emissions. 
  2. Passing Thresholds: The damage from CO2 emissions stems from the risk that we pass tipping points, such as the melting of the Greenland ice sheet, or the burning of the Amazon rainforest. 

On the Cumulative Damage view, the fact that China emits a lot more than the UK doesn’t matter for the question of whether the UK causes damage by emitting CO2. Regardless of whether China emits or not, the UK’s emissions still cause damage. When one thinks about small amounts of emissions, such as one person might produce, the cumulative damage view may look counterintuitive. But it is not really. If my personal emissions cause each of many millions of people to die slightly earlier, then even though each individual loss is tiny, the losses sum across people and can amount to years of life lost in aggregate.

On the Passing Thresholds view, the UK’s emissions might not in fact cause us to pass a dangerous threshold, and so might not in fact do damage. However, we are uncertain about when we will pass climate thresholds. We don’t know what amount of emissions could cause the permafrost to melt, or the Greenland ice sheet to melt, or the Amazon to burn. If passing a threshold has some cost X, then the expected cost of our emissions is the probability p that our emissions cause us to pass that threshold multiplied by that cost: p × X. So, on the Passing Thresholds view, our emissions do impose expected costs on society. It’s irrelevant that China’s emissions collectively push us much closer to the threshold than my personal emissions do.
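As a rough sketch of that calculation (all numbers are mine and purely illustrative):

```python
# Expected cost of marginal emissions under the Passing Thresholds view.
# Assumed, illustrative numbers: passing a tipping point costs X dollars,
# and our emissions raise the probability of passing it by p.
X = 10**12   # cost of passing the threshold (assumed)
p = 10**-9   # probability increase from our emissions (assumed)

expected_cost = p * X
print(expected_cost)  # 1000.0: even tiny emitters impose real expected costs
```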

Fatal flaws of nonconsequentialism: rights that trump the common good

Almost all nonconsequentialists hold that people have rights that may not be infringed simply because the consequences are better. For example, here is Peter Vallentyne:

“[I]ndividuals have certain rights that may not be infringed simply because the consequences are better. Unlike prudential rationality, morality involves many distinct centers of will (choice) or interests, and these cannot simply be lumped together and traded off against each other. 

The basic problem with standard versions of core consequentialism is that they fail to recognize adequately the normative separateness of persons. Psychological autonomous beings (as well, perhaps, as other beings with moral standing) are not merely means for the promotion of value. They must be respected and honored, and this means that at least sometimes certain things may not be done to them, even though this promotes value overall. An innocent person may not be killed against her will, for example, in order to make a million happy people significantly happier. This would be sacrificing her for the benefit of others.” (Vallentyne in Norcross)

1. Justifications for rights

Rights are often defended with claims about the separateness of persons:

There is no social entity with a good that undergoes some sacrifice for its own good. There are only individual people, different individual people, with their own individual lives. Using one of these people for the benefit of others, uses him and benefits the others. Nothing more. What happens is that something is done to him for the sake of others. Talk of an overall social good covers this up. (Intentionally?) To use a person in this way does not sufficiently respect and take account of the fact that he is a separate person, that his is the only life he has. (Nozick in Norcross)

One can find similar defences of the separateness of persons by Rawls, Nagel, Gauthier and other nonconsequentialist luminaries.

Vallentyne appeals to the apparently distinct idea that individuals “must be respected and honoured” as an argument for rights. Some also defend it by appealing to the Kantian idea that to sacrifice one for many treats people as a means, and fails to recognise their status as an end in themselves. 

As a result, nonconsequentialists, along with most people, think that it is impermissible for a doctor to kill one person and harvest their organs to save five other people. They think that we may never punish the innocent even if doing so is for the greater good. The reason is that the one person has a right not to be killed or punished, even if doing so produces better consequences overall. 

2. An absolute prohibition?

One natural initial interpretation of claims about rights is that they imply an absolute prohibition on violation of the right regardless of the consequences. So, we may never kill one person even to save one million people from dying or from being tortured for years. 

Problems

There are several problems with rights absolutism.

Counterintuitive

This is extremely counterintuitive. This is why few nonconsequentialists – John Taurek and a handful of others aside – actually endorse the absolutist position.

Risk

Secondly, as Michael Huemer argues here, absolutist theories run into problems when they have to deal with risk. Ok, we may never punish the innocent for the greater good. But can we punish someone with a 0.0001% chance of being innocent for the greater good? If not, then we need to say goodbye to the criminal justice system. We know for a fact that the criminal justice system punishes lots of innocent people every year. I am not pointing to corruption or bureaucratic ineptitude. The point is just that an infallible legal system is practically impossible. So, even a legal system in some advanced social democracy like Sweden is going to punish lots and lots of innocent people every year: we can never be 100% certain that those we imprison are guilty. 

Similarly, by driving, you impose a nonzero risk of death on others by causing a car accident. Does this mean that driving is never permissible?

Near certain harms

In fact, as Will MacAskill argues, by driving, you, with almost 100% certainty, cause some people to die by causally affecting traffic flow – you pulling into the road will through some distant causal chain change the identity of who is killed in a car crash. Does this mean that driving is never permissible? To reiterate, this isn’t about imposing a small risk of harm, it is about knowingly and with near-certainty changing the identity of who is killed through a positive action that you take. If you say that this doesn’t matter because the net harms are the same, then welcome to the consequentialist club. 

3. Moderate nonconsequentialism

One solution to the first two problems is to give up on absolutism. Huemer proposes that the existence of a right has the effect of raising the standards for justifying a harm. That is, it’s harder to justify a rights-violating harm than an ordinary, non-rights-violating harm. E.g., you might need to have expected benefits many times greater than the harm. Huemer writes:

“This view has a coherent response to risk. The requirements for justification are simply discounted by the probability. So, suppose that, to justify killing an innocent person, it would be necessary to have (expected) benefits equal to saving 1,000 lives. (I don’t know what the correct ratio should be.) Then, to justify imposing a 1% risk of killing an innocent person, it would be necessary to have expected benefits equal to saving 10 lives (= (1%)(1,000)).” [my emphasis]
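Here is a minimal sketch of the probability discounting the quote describes, taking its illustrative 1,000-life ratio at face value:

```python
# Huemer-style discounting: the justification bar for a risky rights
# violation is the bar for a certain violation times the probability of harm.
BASE_RATIO = 1000  # illustrative ratio from the quote

def required_benefit(p_harm, base=BASE_RATIO):
    """Expected benefit (in lives saved) needed to justify imposing a
    probability p_harm of killing an innocent person."""
    return p_harm * base

print(required_benefit(1.0))   # 1000.0: a certain killing
print(required_benefit(0.01))  # 10.0: the 1% risk case from the quote
```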

This avoids problems with risk and also offers a way out of the counter-intuitiveness of saying that we may never sacrifice one person even if we can thereby prevent billions from being tortured. 

Problems

Inconsistent with justification for rights

The first problem with this is that it is inconsistent with the justifications for rights offered above. To say that one can be sacrificed for the many is to fail to recognise “the normative separateness of persons”, to act as though “people’s interests can be traded off against each other”. Ok, but why can people’s interests be traded off for 1,001 lives? The separateness of persons sounds like a claim to the effect that we can never make interpersonal trade-offs. If it isn’t this, I don’t know what it means. If it means that the standard for inflicting harm on others is raised to 1,000 lives, then the separateness of persons is merely an elaborate and rhetorical way of redescribing the intuition that people have non-absolute rights. Arguments from the separateness of persons entail absolutism, not moderate deontology.

Similarly, where does this leave the argument that respecting and honouring an individual means that we cannot sacrifice them for the greater good? If the idea of respect does some work in the argument, why does respect stop at 1,000 lives? What if I respond as a typical consequentialist and say that respecting and honouring an individual means giving their interests equal weight to everyone else’s? One counts for one, so more count for more. So, we can sacrifice one for many. What would count as an argument against this from ‘respect’? Would it be to just restate that respect requires that the standard for inflicting harm is raised to 1,000 lives? If so, again, the appeal to rights just seems to be an elaborate rhetorical way to redescribe the intuition that people have non-absolute rights.

What about the idea that people should be treated as an end and not as a means? On the most natural interpretation of this claim, it means that we must never impose costs on people for the greater good. Why does sacrificing someone for 1,001 people not treat them as a means? If the answer is that their interests were considered but they were outweighed, then why can’t we use that argument when deciding whether to sacrifice 1 person for 2 people? Again, the appeal to the idea that people are an end just seems to be an elaborate rhetorical redescription of the intuition that people have non-absolute rights. 

(There is a theme emerging here, which I will return to in a later post).

What is the threshold?

The second problem is related to the first. In the quote above, Huemer says “I don’t know what the correct ratio should be”. Can this question be resolved, then, by further inquiry? What would such an argument look like? Consequentialists have a coherent and compelling account of these cases. We consider each person’s interests equally. Sacrificing 1 for 2 produces more of what we ultimately care about, so we should save 2. Saving 1,000 is even better!

What could be the nonconsequentialist argument for the threshold being 6 lives, 50 lives, 948 lives, 1 million lives or 1 billion lives? This is not a case of vagueness where there are clear cases at either end of the scale and a fuzzy boundary in the middle. It is not like the question: how many hairs do we have to remove before someone becomes bald? There are clear answers at either end of that spectrum: someone with 5 hairs is clearly bald, someone with 100,000 clearly is not. The rights threshold isn’t like this. I genuinely do not know what arguments you could use in favour of a 1,000 person threshold vs a 1 billion person threshold. We’re not uncertain about a fuzzy boundary case; rather, there seem to be no criteria telling us how to decide between any of the possible answers.

As we have seen above, the tools in the nonconsequentialist toolkit don’t seem like they will be much help. The reason for this is that the heart of the nonconsequentialist project is to ignore how good certain actions are. Rights are not grounded in how good they would be for anyone’s life – they’re prohibitions that are independent of their service of welfare. I say “We will be able to produce more welfare if we use our healthcare resources to improve the health of other people rather than keep this 90 year old alive for another week.” The nonconsequentialist retort is “the 90 year old has a right to health”. Where does this leave us? He has a right to treatment that doesn’t seem to be grounded in anything, especially not something that can be compared and prioritised.

Return to the threshold. Maybe one answer is simply intuition. Maybe people have the intuition that the threshold is 1,000 and that is good enough. 

Several things may be said here. Firstly, nonconsequentialists themselves implicitly deny that this kind of argument is good enough. That is why they try to build theories that justify where the threshold should be, just as they try to justify why people have rights in the first place. In truth, I would prefer the entirely intuition-led approach because it is more honest and transparent.

Secondly, this is the kind of thing about which there would be massive disagreement among nonconsequentialists. I would wager that some people will answer 100, some 1,000, some civilisation collapse, and some will endorse no threshold (Taurek). Since no arguments can be brought to bear, how do we decide who is right? Do we vote? Moreover, if we are apprehending an independent moral reality, why would there be such disagreement among smart people that cannot be resolved by further argument? 

The better explanation is that this is an ad hoc modification erected to save a theory that cannot, in the end, be saved. I would expect that if people really believed that persons are separate, need to be respected, not treated as a means, and so on, there would be many more people who end up in Taurek’s position of denying any trade-offs. I would expect moral philosophy to be divided between utilitarians and Taurekians who refuse to leave the house lest they risk a car accident. The world is not like this, so I don’t think people actually believe these claims.

Not a response to near-certain harms

Moderate nonconsequentialism is not a response to the near-certain harms objection: discounting by probability does no work when the probability of the harm is close to 1.

Summing up

Rights were initially defended with what seemed to be arguments with premises and a conclusion: separateness of persons therefore rights; people as an end therefore rights; respect therefore rights. The implications of these arguments are so unpalatable that almost no nonconsequentialists actually accept them. In the end, they endorse something more moderate which is inconsistent with the arguments that they initially appealed to. Moreover, on closer examination the arguments seemed merely to be elaborate rhetorical redescriptions of the intuition that people have rights. Until better arguments are forthcoming, this looks like a good reason to believe that people do not have rights that trump the common good.

The simple case for Bayesian statistics

There is a debate among some scientists, philosophers of science and statisticians about which of frequentist statistics and Bayesian statistics is correct. Here is a simple case for Bayesian statistics. 

1. Everyone agrees that Bayes’ theorem is true

Bayes’ theorem is stated mathematically as follows:

$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}$$

As far as I know, everyone accepts that Bayes’ theorem is true. It is a theorem with a mathematical proof. 

2. The probability of the hypothesis given the data is what we should care about

When we are developing credences in a hypothesis, H, what we should ultimately care about is the probability of H given the data, D, that we actually have. This is what is on the left-hand side of the equation above. Here ends the defence of Bayesian statistics; no further argument is needed. Either you deny a mathematical proof or you deny that we should form beliefs on the basis of the evidence we have. Neither is acceptable, so Bayesian statistics is correct. This argument is straightforward and there should no longer be any debate about it.

Appendix. Contrast to frequentism

(Note again that the argument for Bayesian statistics is over; this is just to make the contrast to frequentism clear.) In contrast to Bayesianism, frequentist statistics using p-values asks us:

Assuming that the hypothesis H is false, what is the probability of obtaining a result equal to or more extreme than the one you in fact observed?

What you actually care about is how likely the hypothesis is, given the data, rather than the above question. So, you should not form beliefs on the basis of p-values. Whether a Bayesian prior is ‘subjective’ or not, it is necessary to form rational beliefs given the evidence we have. 
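To make the contrast concrete, here is a toy coin-flip example (my construction; the simple two-hypothesis prior is an assumption of the sketch, not part of the argument above):

```python
from math import comb

# Observed data: 9 heads in 10 flips.
n, k = 10, 9

# Frequentist p-value: assuming the coin is fair, the probability of a
# result at least as extreme as the one observed (one-sided).
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# Bayesian posterior P(H | D): a fair coin (p = 0.5) vs a heads-biased
# coin (p = 0.9), with a 50/50 prior over the two hypotheses.
def likelihood(p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

prior = 0.5
posterior = likelihood(0.9) * prior / (
    likelihood(0.9) * prior + likelihood(0.5) * (1 - prior)
)

print(f"p-value: {p_value:.4f}")             # ~0.0107
print(f"P(biased | data): {posterior:.4f}")  # ~0.9754
```

The p-value answers the frequentist question; the posterior answers the question we actually care about.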

Aggregating health and future people

Argument: Common sense intuitions about different cases involving large numbers of people, each with a small amount of welfare, have the same underlying cause. If one aims to construct a theory defending these common sense intuitions, it should plausibly be applicable across these different cases. Some theories fail this test.

**

What ought you to do in the following cases?

Case 1. You can bring into existence a world (A) of 1 million very happy people or a world (B) of 100 quadrillion people with very low, but positive welfare.

Case 2. You can cure (C) James of a terminal illness, or (D) cure one quadrillion people of a moderate headache lasting one day.

Some people argue that you ought to choose options (B) and (D). Call these the ‘repugnant intuitions’. One rationale for these intuitions is that the value of these states of affairs is a function of the aggregate welfare of each individual. Each small amount of welfare adds up across persons and the contribution of each small amount of welfare does not diminish, such that due to the size of the populations involved options (B) and (D) have colossal value, which outweighs that of (A) and (C) respectively. The most notable theory supporting this line of reasoning is total utilitarianism.
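For illustration, with some assumed welfare numbers, the totals in case 1 come out as follows:

```python
# Total utilitarian values for case 1 (welfare numbers assumed for illustration).
world_A = 10**6 * 90     # 1 million people at welfare 90 each
world_B = 10**17 * 0.01  # 100 quadrillion people at barely positive welfare

print(world_A)  # 90,000,000
print(world_B)  # 1e15: B's total dwarfs A's, hence the 'repugnant' verdict
```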

Common sense dictates the ‘non-repugnant intuitions’ about cases 1 and 2: that we ought to choose (A) and (C). Philosophical theories have been constructed to defend common sense on this front, but they usually deal with cases 1 and 2 separately, in spite of the obvious parallels between them. In both cases, we face a choice between giving each of a massive number of people a small amount of welfare, and giving large amounts of welfare to each of a much smaller number of people. In both cases, the root of the supposed counterintuitiveness of the aggregationist moral view is that it aggregates small amounts of welfare across very large numbers of people to the extent that this outweighs a smaller number of people having large welfare.

Are there any differences between these two cases that could justify trying to get to the non-repugnant intuitions using different theoretical tools? I do not think so. It might be argued that the crucial difference is that in case 1 we are choosing between possible future people, whereas in case 2 we are choosing how to benefit groups of already existing people. But this is not a good reason to treat them differently, assuming that one’s aim is to get the non-repugnant intuitions for cases 1 and 2. Standard person-affecting views imply that (A) and (B) are incomparable, and therefore that we are permitted to choose either. But the non-repugnant intuition is that (A) is better than (B) and/or that we ought to choose (A). Person-affecting views don’t get the required non-repugnant conclusions dictated by common sense.

Moreover, there are present generation analogues of the repugnant conclusion, which seem repugnant for the same reason.

Case 3. Suppose that we have to choose between (E) saving the lives of 1 million very happy people, and (F) saving the lives of 100 quadrillion people with very low but positive welfare.

Insofar as I am able to grasp repugnance-intuitions, the conclusion that we ought to choose (F) is just as repugnant as the conclusion that we ought to choose (B), and for the same reason. But in this case, future generations are out of the picture, so they cannot explain differential treatment of the problem.

In sum, the intuitive repugnance in all three cases is rooted in the counterintuitiveness of aggregating small amounts of welfare, and is only incidentally and contingently related to population ethics.

**

If the foregoing argument is correct, then we would expect theories that are designed to produce the non-repugnant verdicts in these cases to be structurally similar, and for any differences to be explained by relevant differences between the cases. One prominent theory of population ethics fails this test: critical level utilitarianism (CLU). CLU is a theory that tries to get a non-repugnant answer for case 1. On CLU, the contribution a person makes to the value of a state of affairs is equal to that person’s welfare level minus some positive constant K. A person increases the value of a world if her welfare is above K and decreases it if her welfare level is below K. So, people with very low but positive welfare do not add value to the world. Therefore, world B has negative value and world A is better than B. This gets us the non-repugnant answer in case 1.
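A minimal sketch of how CLU evaluates case 1, with an assumed critical level and welfare numbers:

```python
# Critical level utilitarianism: each person contributes (welfare - K).
K = 5  # assumed critical level

def clu_value(n_people, welfare_each, critical_level=K):
    return n_people * (welfare_each - critical_level)

world_A = clu_value(10**6, 90)   # 1 million very happy people
world_B = clu_value(10**17, 1)   # 100 quadrillion people at low positive welfare

print(world_A)  # 85,000,000: large positive value
print(world_B)  # -4e17: enormous negative value, so CLU prefers A
```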

CLU has implications for case 2. However, it is interesting to explore an analogue critical level theory constructed exclusively to produce non-repugnant intuitions about case 2. How would this theory work? It would imply that the contributory value of providing a benefit to a person is equal to the size of the benefit minus a positive constant K. So, the contributory value of curing Sandra’s moderate headache is the value of that to Sandra – let’s say 5 utils – minus K, where K>5. In this case, curing Sandra’s headache would have negative contributory value; it would make the world worse.

The analogue-CLU theory for case 2 is crazy. Clearly, curing Sandra’s headache does not make the world worse. This casts doubt on CLU in general. Firstly, these theories both try to arrive at non-repugnant answers for cases 1 and 2, and the non-repugnant intuition for each case has the same explanation (discussed above). Thus, it needs to be explained why the theoretical solution to each problem should be different – why does a critical level make sense for case 1 but not for case 2? In the absence of such an explanation, we have good reason to doubt critical level approaches in general.

This brings me to the second point. In my view, the most compelling explanation for why a critical level approach clearly fails in one case but not the other is that the critical level approach to case 1 exploits our tendency to underrate low quality lives, but that an analogous bias is not at play in case 2.

When we imagine a low quality life, we may be unsure what its welfare level is. We may be unsure what constitutes utility, how to weight good experiences of different kinds, how to weight good experiences against bad experiences, and so on. In light of this, assessing the welfare level of a life that lasts for years would be especially difficult. We may therefore easily mistake a life with welfare level -1, for example, for one with welfare level 2. According to advocates of repugnant intuitions, the ability to distinguish such alternatives would be crucial for evaluating an imagined world of low average utility: it would be the difference between world B having extremely large positive value and world B having extremely large negative value.[1]

Thus, it is very easy to wrongly judge that a low positive welfare life is bad. But one cannot plausibly claim that curing a headache is bad. The value of curing a day-long moderate headache is intuitively easy to grasp: we have all experienced moderate headaches, we know they are bad, and we know what it would be like for one to last a day. This explains why the critical level approach is clearly implausible in one case but not the other: its verdict in case 1 seems plausible only because we underrate low quality lives, while no analogous bias masks its implausibility in case 2. Thus, we have good reason to doubt CLU as a theory of population ethics.

The following general principle seems to follow. If our aim is to theoretically justify non-repugnant intuitions for cases 1 and 2, then one theory should do the job. If the exact analogue of one theory is completely implausible for one of the cases, that should lead us to question whether the theory can be true for the other case.

 

[1] Huemer, ‘In Defence of Repugnance’, Mind, 2008, p. 910.

Where should anti-paternalists donate?

GiveDirectly gives out unconditional cash transfers to some of the poorest people in the world. It’s clearly an outstanding organisation that is exceptionally data driven and transparent. However, according to GiveWell’s cost-effectiveness estimates (which represent a weighted average of the diverse views of GiveWell staffers), it is significantly less cost-effective than other recommended charities. For example, the Against Malaria Foundation (AMF) is ~4 times as cost-effective, and Deworm the World (DtW) is ~10 times as cost-effective. This is a big difference in terms of welfare. (The welfare can derive from averting deaths, preventing illness, increasing consumption, etc).

One prima facie reason to donate to GiveDirectly in spite of this, suggested by e.g. Matt Zwolinski and Dustin Moskovitz, is that it is not paternalistic.[1] Roughly: giving recipients cash respects their autonomy by allowing them to choose what good to buy, whereas giving recipients bednets or deworming drugs makes the choice for them in the name of enhancing their welfare. On the version of the anti-paternalism argument I’m considering, paternalism is non-instrumentally bad, i.e. it is bad regardless of whether it produces bad outcomes.

I’ll attempt to rebut the argument from anti-paternalism with two main arguments.

(i) Reasonable anti-paternalists should value welfare to some extent. Since bednets and deworming are so much more cost-effective than GiveDirectly, only someone who put a very high, arguably implausible, weight on anti-paternalism would support GiveDirectly.

(ii) More importantly, the premise that GiveDirectly is much better from an anti-paternalistic perspective probably does not hold. My main arguments here are that: the vast majority of beneficiaries of deworming and bednets are children; deworming and bednets yield cash benefits for others that probably exceed the direct and indirect benefits of cash transfers; and the health benefits of deworming and bednets produce long-term autonomy benefits.

Some of the arguments made here have been discussed before e.g. by Will MacAskill  and GiveWell, but I think it’s useful to have all the arguments brought together in one place.

It is important to bear in mind in what follows that according to GiveWell, their cost-effectiveness estimates are highly uncertain, not meant to be taken literally, and that the outcomes are very sensitive to different assumptions. Nonetheless, for the purposes of this post, I assume that the cost-effectiveness estimates are representative of the actual relative cost-effectiveness of these interventions, noting that some of my conclusions may not hold if this assumption is relaxed.

 

  1. What is paternalism and why is it bad?

A sketch of the paternalism argument for cash transfers goes as follows:

  • Anti-malaria and deworming charities offer recipients a specific good, rather than giving them the cash and allowing them to buy whatever they want. This is justified by the fact that anti-malaria and deworming charities enhance recipients’ welfare more than cash. Thus, donating to anti-malaria or deworming charities to some extent bypasses the autonomous judgement of recipients in the name of enhancing their welfare. Thus, anti-malaria and deworming charities are more paternalistic than GiveDirectly.

This kind of paternalism, the argument goes, is non-instrumentally bad: even if deworming and anti-malaria charities in fact produce more welfare, their relative paternalism counts against them. The badness of paternalism is often explained by appeal to the value of autonomy. Autonomy is roughly the capacity for self-governance; it is the ability to decide for oneself and pursue one’s own chosen projects.

Even if the argument outlined in this section is sound, deworming and bednets improve the autonomy of recipients relative to no aid because they give them additional opportunities which they may take or decline if they (or their parents) wish. Giving people new opportunities and options is widely agreed to be autonomy-enhancing. This marks out an important difference between these and other welfare-enhancing interventions. For example, tobacco taxes reduce the (short-term) autonomy and liberty of those subject to them by using threats of force to encourage a welfare-enhancing behaviour.

 

  2. How bad is paternalism?

Even if one accepted the argument in section 1, this would only show that donating to GiveDirectly is less paternalistic than donating to bednets or deworming. This does not necessarily entail that anti-paternalists ought to donate to GiveDirectly. Whether that’s true depends on how we ought to trade off paternalism and welfare. With respect to AMF for example, paternalism would have to be bad enough that it is worth losing ~75% of the welfare gains from a donation; with respect to DtW, ~90%.
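Spelling out that arithmetic from the cost-effectiveness multipliers given earlier:

```python
# Fraction of welfare gains forgone by donating to GiveDirectly rather than
# a charity estimated to be `ratio` times as cost-effective.
def welfare_forgone(ratio):
    return 1 - 1 / ratio

print(welfare_forgone(4))   # 0.75: ~75% forgone relative to AMF
print(welfare_forgone(10))  # 0.9:  ~90% forgone relative to DtW
```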

It might be argued that anti-paternalism has ‘trumping’ force such that it always triumphs over welfarist considerations. However, ‘trumping’ is usually reserved for rights violations, and neither deworming nor anti-malaria charities violates rights. So, trumping is hard to justify here.

Nonetheless, it’s difficult to say what weight anti-paternalism should have and giving it very large weight would, if the argument in section 1 works, push one towards donating to GiveDirectly. However, there are a number of reasons to believe that donating to deworming and bednets is actually attractive from an anti-paternalistic point of view.

 

  3. Are anti-malaria and deworming charities paternalistic?

(a) The main beneficiaries are children

Mass deworming programmes overwhelmingly target children. According to GiveWell’s cost-effectiveness model, 100% of DtW’s recipients are children, Sightsavers ~90%, and SCI ~85%. Around a third of the modelled benefits of bednets derive from preventing deaths of under 5s, and around a third from developmental benefits to children. The final third of the modelled benefits derive from preventing deaths of people aged 5 and over. Thus, the vast majority (>66%) of the modelled benefits of bednets accrue to children under the age of 15, though it is unclear what the overall proportion is because GiveWell does not break down the ‘over 5 mortality’ estimate.

Paternalism for children is widely agreed to be justified. The concern with bednets and deworming must then stem from the extent to which they are paternalistic with respect to adults.[2]

In general, this shows that deworming and anti-malaria charities involve little or no objectionable paternalism. So, paternalism would have to be very bad indeed to justify donating to GiveDirectly. Moreover, anti-paternalists can play it safe by donating to DtW, which does not target adults at all.

This alone shows that anti-paternalism provides weak or zero additional reason to donate to cash transfer charities, rather than deworming or anti-malaria charities.

 

(b) Positive Externalities

Deworming drugs and bednets probably produce substantial positive externalities. Some of these come in the form of health benefits to others. According to GiveWell, there is pretty good evidence that there are community-level health benefits to bednets: giving A a bednet reduces his malaria risk, as well as his neighbour B’s. However, justifying giving A a bednet on the basis that it provides health benefits to B is more paternalistic towards B than giving her the cash, for the reasons outlined in section 1.

However, by saving lives and making people more productive, deworming and bednets are also likely to produce large monetary positive externalities over the long term. According to a weighted average of GiveWell staffers, for the same money, one can save ~10 equivalent lives by donating to DtW, but ~1 equivalent life by donating to GiveDirectly. (An ‘equivalent life’ is based on the “DALYs per death of a young child averted” input each GiveWell staffer uses. What a life saved equivalent represents will therefore vary between staffers because they are likely to adopt different value assumptions).

What are the indirect monetary benefits of all the health and mortality benefits that constitute these extra ‘equivalent lives’? I’m not sure if there’s hard quantitative evidence on this, but for what it’s worth, GiveWell believes that “If one believes that, on average, people tend to accomplish good when they become more empowered, it’s conceivable that the indirect benefits of one’s giving swamp the first-order effects”. What GiveWell is saying here is as follows. “Suppose that the direct benefits of a $1k donation are x. If people accomplish good when they are empowered, the indirect benefits of this $1k are plausibly >x.” If this is true, then what if the direct benefits are 10*x? This must make it very likely that the indirect benefits >>x.

So, given certain plausible assumptions, it’s plausible that the indirect monetary benefits of deworming and bednets exceed the direct and indirect monetary benefits of cash transfers. DtW and AMF are like indirect GiveDirectlys: they ensure that lots of people receive large cash dividends down the line.

As I argued in section 1, providing bednets and deworming drugs is autonomy-enhancing relative to no aid: it adds autonomy to the world. If, as I’ve suggested, bednets and deworming also produce larger overall cash benefits than GiveDirectly, then bednets and deworming dominate cash transfers in terms of autonomy-production. One possible counter to this is to discount the autonomy-enhancements brought about by future cash. I briefly discuss discounting future autonomy in (c).

This shows that anti-paternalists should arguably prefer deworming or anti-malaria charities to GiveDirectly, other things equal.

 

(c) Short-term and long-term autonomy

Short-term paternalism can enhance not only the welfare but also the long-term autonomy of an individual. For the same amount of money, one can save 10 equivalent lives by donating to DtW vs. 1 equivalent life by donating to GiveDirectly. The morbidity and mortality benefits that constitute these equivalent lives enable people to pursue their own autonomously chosen projects. It’s very plausible that this produces more autonomy than providing these benefits only to one person. Anti-paternalists who ultimately aim to maximise overall autonomy therefore have reason to favour deworming and bednets over GiveDirectly.

Some anti-paternalists may not want to maximise overall autonomy. Rather, they may argue that we should maximise autonomy with respect to some specific near-term choices. When we are deciding what to do with $100, we should maximise autonomy with respect to that $100. So, we should give them $100 rather than using the $100 to buy bednets.

This argument shows that how one justifies anti-paternalism is important. If you’re concerned with the overall long-term autonomy of recipients, you have reason to favour bednets or deworming. If you’re especially concerned with near-term autonomy over a particular subset of choices, the case for GiveDirectly is a bit stronger, but still probably defeated by argument (a).

 

(d) Missing markets

Deworming charities receive deworming drugs at subsidised prices from drug companies. Deworming charities can also take advantage of economies of scale in order to make the cost per treatment very low – around $0.50. I’m not sure how much it would cost recipients to purchase deworming drugs at market rates, but it seems likely to be much higher than $0.50. Similar things are likely true of bednets. The market cost of bednets is likely to be much greater than what it would cost AMF to get one. Indeed, GiveWell mentions some anecdotal evidence that the long-lasting insecticide-treated bednets that AMF gives out are simply not available in local markets.

From the point of view of anti-paternalists, this is arguably important if the following is true: recipients would have purchased bednets or deworming drugs if they were available at the cost that AMF and DtW pay for them. Suppose that if Mike could buy a bednet for the same price that AMF can deliver them – about $5 – he would buy one, but that they aren’t available at anywhere near that price. If this were true, then giving Mike cash would deprive him of an option he autonomously prefers, and therefore ought to be avoided by anti-paternalists. This shows that cash is not necessarily the best way to leave it to the individual – it all depends on what you can do with cash.

However, the limited evidence may suggest that most recipients would not in fact buy deworming drugs or bednets even if they were available at the price at which deworming and anti-malaria charities can get them. This may in part be because recipients expect to get them for free. However, Poor Economics outlines a lot of evidence showing that the very poor do not spend their money in the most welfare-enhancing way possible. (Neither do the very rich). The paper ‘Testing Paternalism’ presents some evidence in the other direction.

In sum, for anti-paternalists, concerns about missing markets may have limited force.

 

Conclusion

Deworming and anti-malaria charities target children, probably provide large long-term indirect monetary benefits, and enhance the long-term autonomy of beneficiaries. This suggests that anti-paternalism provides at best very weak reasons to donate to GiveDirectly over deworming and anti-malaria charities, and may favour deworming and anti-malaria charities, depending on how anti-paternalism is justified. Concerns about missing markets for deworming drugs and bednets may also count against cash transfers to some extent.

Nonetheless, even if GiveDirectly is less cost-effective than other charities, there may be other reasons to donate to GiveDirectly. One could for example argue, as George Howlett does, that GiveDirectly promises substantial systemic benefits and that its model is a great way to attract more people to the idea of effective charity.

Thanks to Catherine Hollander, James Snowden, Stefan Schubert, and Michael Plant for thorough and very helpful comments.

 

 

[1] See this excellent discussion of paternalism by the philosopher Gerald Dworkin.

[2] It’s an interesting and difficult question what we are permitted to do to parents in order to help their children. We can discuss this in the comments.

The asymmetry and the far future

TL;DR: One way to justify support for causes which mainly promise near-term but not far future benefits, such as global development and animal welfare, is the ‘intuition of neutrality’: adding possible future people with positive welfare does not add value to the world. Most people who endorse claims like this also endorse ‘the asymmetry’: adding possible future people with negative welfare subtracts value from the world. However, asymmetric neutralist views are under significant pressure to accept that steering the long-run future is overwhelmingly important. In short, given some plausible additional premises, these views are practically similar to negative utilitarianism.

  1. Neutrality and the asymmetry

Disagreements about population ethics – how to value populations of different sizes realised at different times – appear to drive a significant portion of disagreements about cause selection among effective altruists.[1] Those who believe that the far future has extremely large value tend to move away from spending their time and money on cause areas that don’t promise significant long-term benefits, such as global poverty reduction and animal welfare promotion. In contrast, people who put greater weight on the current generation tend to support these cause areas.

One of the most natural ways to ground this weighting is the ‘intuition of neutrality’:

Intuition of neutrality – Adding future possible people with positive welfare does not make the world better.

One could ground this in a ‘person-affecting theory’. Such theories, like all others in population ethics, have many counterintuitive implications.

Most proponents of what I’ll call neutralist theories also endorse ‘the asymmetry’ between future bad lives and future good lives:

The asymmetry – Adding future possible people with positive welfare does not make the world better, but adding future possible people with negative welfare makes the world worse.

The intuition behind the asymmetry is obvious: we should not, when making decisions today, ignore, say, possible people born in 100 years’ time who live in constant agony. (It isn’t clear whether the asymmetry has any justification beyond this intuition. The justifiability of the asymmetry continues to be a source of philosophical disagreement.)

To be as clear as possible, I think both the intuition of neutrality and the asymmetry are very implausible. However, here I’m going to figure out what asymmetric neutralist theories imply for cause selection. I’ll argue that asymmetric neutralist theories are under significant pressure to be aggregative and temporally neutral about future bad lives. They are therefore under significant pressure to accept that the far future is astronomically bad.

  2. What should asymmetric neutralist theories say about future bad lives?

The weight asymmetric neutralist theories give to lives with future negative welfare will determine the theories’ practical implications. So, what should the weight be? I’ll explore this by looking at what I call Asymmetric Neutralist Utilitarianism (ANU).

Call lives with net suffering over pleasure ‘bad lives’. It seems plausible that ANU should say that bad lives have non-diminishing disvalue across persons and across time. More technically, it should endorse additive aggregation across future bad lives, and be temporally neutral about the weighting of these lives. (We should substitute ‘sentient life’ for ‘people’ in this, but it’s a bit clunky).

Impartial treatment of future bad lives, regardless of when they occur

It’s plausible that future people suffering the same amount should count equally regardless of when those lives occur. Suppose that Gavin suffers a life of agony at -100 welfare in the year 2200, and that Stacey also has -100 welfare in the year 2600. It seems wrong to say that merely because Stacey’s suffering happens later, it should count less than Gavin’s. This seems to violate an important principle of impartiality. It is true that many people believe that partiality is often permitted, but this is usually towards people we know, rather than to strangers who are not yet born. Discounting using pure time preference at, say, 1% per year entails that the suffering of people born 500 years into the future counts for only a small fraction of the suffering of people born 100 years into the future. This looks hard to justify. We should be willing to sacrifice a small amount of value today in order to prevent massive future suffering.
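A quick calculation (standard exponential discounting; the code is mine) shows how steep this gets:

```python
# Weight given to suffering t years in the future under a 1% pure rate
# of time preference (exponential discounting).
rate = 0.01

def weight(years):
    return (1 - rate) ** years

print(round(weight(100), 4))  # 0.366
print(round(weight(500), 4))  # 0.0066
# Suffering 500 years out gets under 2% of the weight of suffering 100 years out.
```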

The badness of future bad lives adds up and is non-diminishing as the population increases

It’s plausible that future suffering should aggregate and have non-diminishing disvalue across persons. Consider two states of affairs involving possible future people:

A. Vic lives at -100 welfare.

B. Vic and Bob each live at -100 welfare.

It seems that ANU ought to say that B is twice as bad as A. The reason for this is that the badness of suffering adds up across persons. In general, it is plausible that N people living at –x welfare is N times as bad as 1 person living at –x. It just does not seem plausible that suffering has diminishing marginal disutility across persons: even if there are one trillion others living in misery, that doesn’t make it in any way less bad to add a new suffering person. We can understand why resources like money might have diminishing utility for a person, but it is difficult to see why suffering across persons would behave in the same way.
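Putting these two commitments together, a minimal formalization of ANU (my notation, not the author’s) is

$$V(w_1, \dots, w_N) \;=\; \sum_{i=1}^{N} \min(w_i, 0)$$

where $w_i$ is the lifetime welfare of the $i$-th future individual, whenever they live: positive welfare contributes nothing, and negative welfare aggregates additively with no temporal discounting.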

  3. Reasons to think there will be an extremely large number of expected bad lives in the future

There is an extremely large number of expected (very) bad lives in the future. This could come from four sources:

  1. Bad future human lives

There are probably lots of bad human lives at the moment: adults suffering rare and painful diseases or prolonged and persistent unipolar depression, or children in low income countries suffering and then dying. It’s likely that poverty and illness-caused bad lives will fall a lot in the next 100 years as incomes rise and health improves. It’s less clear whether there will be vast and rapid reductions in depression over the next 100 years and beyond because, unlike health and money, this doesn’t appear to be a major policy priority even in high income countries, and it’s only weakly affected by health and money.[2] The arrival of machine superintelligence could arguably prevent a lot of human suffering in the future. But since the future is so long, even a very low error rate at preventing bad lives would imply a truly massive number of future bad lives. It seems unreasonable to be certain that the error rate would be sufficiently low.

  2. Wild animal suffering

It’s controversial whether there is a preponderance of suffering over pleasure among wild animals. It’s not controversial that there is a massive number of bad wild animal lives. According to Oscar Horta, the overwhelming majority of animals die shortly after coming into existence, after starving or being eaten alive. It seems reasonable to expect there to be at least a 1% chance that billions of animals will suffer horribly beyond 2100. Machine superintelligence could help, but preventing wild animal suffering is much harder than preventing human suffering, and it is less probable that wild animal suffering prevention will be in the value function of an AI than human suffering prevention: whether we put the goals into the AI directly or it learns our values, since most people don’t care about wild animal suffering, neither would the AI. Again, even a low error rate would imply massive future wild animal suffering.

  3. Sentient AI

It’s plausible that we will eventually be able to create sentient machines. If so, there is a non-negligible probability that someone will in the far future, by accident or design, create a large number of suffering machines.

  4. Suffering on other planets

There are probably sentient life forms in other galaxies that are suffering. It’s plausibly in our power to reach these life forms and prevent them suffering, over very long timeframes.

The practical upshot

Since ANU only counts future bad lives and there are lots of them in the future, ANU + some plausible premises implies that the far future is astronomically bad. This is a swamping concern for ANU: if we have even the slightest chance of preventing all future bad lives occurring, that should take precedence over anything we could plausibly achieve for the current generation. It’s equivalent to a tiny chance of destroying a massive torture factory.

It’s not completely straightforward to figure out the practical implications of ANU. It’s tempting to say that it implies that the expected value of a minuscule increase in existential risk to all sentient life is astronomical. This is not necessarily true. An increase in existential risk might also deprive people of superior future opportunities to prevent future bad lives.

Example

Suppose that Basil could perform action A, which increases the risk of immediate extinction to all sentient life by 1%. However, we know that if Basil doesn’t perform A, then in 100 years’ time Manuel will perform action B, which increases the risk of immediate extinction to all sentient life by 50%.

From the point of view of ANU, Basil should not perform A even though it increases the risk of immediate extinction to all sentient life: doing this might not be the best way to prevent the massive number of future bad lives.
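With some toy numbers (mine, and assuming, as the example implies, that performing A forecloses Manuel’s option B), the expected disvalue works out as follows:

```python
# Expected future bad lives under ANU for the Basil/Manuel example.
baseline = 10**15  # expected future bad lives if sentient life continues (assumed)

ev_perform_A = (1 - 0.01) * baseline  # A: 1% chance of immediate extinction
ev_refrain   = (1 - 0.50) * baseline  # no A: Manuel's B, a 50% extinction risk

print(ev_perform_A)  # 9.9e14
print(ev_refrain)    # 5.0e14: refraining leaves fewer expected bad lives
```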

It might be argued that most people cannot in fact have much influence on the chance that future bad lives occur, so they should instead devote their time to things they can affect, such as global poverty. This argument seems to work equally well against total utilitarians who work on existential risk reduction, so those who accept the former should also accept the latter.


[1] I’m not sure how much.

[2] The WHO projects that depressive disorders will be the number two leading cause of DALYs in 2030. Also, DALYs understate the health burden of depression.