In a recent McSweeney’s article, Kyle York proposes some interesting takes on familiar ethical dilemmas with the potential to help reveal something truly fundamental about moral judgment. For instance, pull or no pull:
“There’s an out of control trolley speeding towards four workers. You have the ability to pull off your head and turn it into a Chinese lantern. Your head floats into the sky until it takes the place of the sun. You look down upon the planet. It is as small as the eye of a moth. The moth flies away.”
Clearly a particular way of doing philosophy and moral psychology is being mocked. Philosophers and psychologists often rely on unrealistic thought experiments and experimental stimuli when conducting their research. To be fair, they do tend to avoid the surreal, but this does raise an interesting question. How worried should we be that deep down, we don’t actually believe the things we are asked to imagine as we make philosophically and psychologically relevant judgments about them?
Sacrificial dilemmas, and trolley problems especially, have gotten a lot of hate for this. They are infamous enough for the Atlantic, inspired by a great Compass article by Bauman, McGraw, Bartels & Warren, to wonder "Is One of the Most Popular Psychology Experiments Worthless?" The concerns are numerous. They lack ecological validity. They make an awful lot of assumptions. They involve stipulations real-life agents couldn't possibly know. People find them humorous. Maybe these things don't matter for the inferences we draw from them. For instance, Millar, Turri & Friedman replicate basic trolley findings using admirably realistic adaptations. Programmable driverless car adaptations in Bonnefon, Shariff & Rahwan ground the cases in more practical calculations. But you still have to wonder.
Many thought experiments in philosophy are no better. We all have our favorites. Epistemology has fake barn country. In philosophy of mind, we are asked to imagine what would happen if the population of China started simulating the actions of neurons. In philosophy of action, thought experiments involving free will feature supercomputers that predict the exact state of the entire world down to minute details, like that "Jeremy will definitely rob Fidelity Bank at 6:00 PM on January 26, 2195." What do you mean "definitely"? He can't rob it at 6:01? What if Jeremy gets hit by a bus?
In the supercomputer case, interpreting judgments as they often get interpreted, to support compatibilism or incompatibilism, presupposes that we accept the deterministic details stipulated. When people were directly asked whether they thought this case was possible, researchers found that the majority said "no". Related research by Rose, Buckwalter & Nichols suggests that laypeople sometimes might not accept the neuro-deterministic details of futuristic cases, even if they say "yes". Maybe philosophers are better at this than other people, or maybe just more confident. You have to wonder.
Some cases create intuitive resistance. Imaginative cases are often very useful for theorizing. But there is also inherent risk in asking us to imagine a world so different from our own, and then attempting to apply core concepts of social cognition – knowledge, responsibility, choice, blame, and so on – in a context so foreign to the ones with which they are explicitly or implicitly associated. Those concepts want to be used in the universe in which they normally are. Given those associations, it seems reasonable to expect the process of applying certain concepts to pressure us toward subtle rejections of stimuli in favor of the familiar.
As with the phenomenon of imaginative resistance, there's probably going to be a lot of unexpected variation in the kinds of judgments that lead us to subtly reject features of cases, as well as in whether doing so is ultimately theoretically meaningful for the inferences we draw about our reactions. Unlike with imaginative resistance, we probably won't be very aware it's happening through introspection alone.
Most famous cases aren't quite Chinese lantern-level bad, but should we be doing more to improve them? Should we, for instance, as scientists take greater care to understand how stimuli are processed beyond just the judgments we're interested in? Could philosophers work a little harder to construct materials that avoid unnecessary difficulties from the start, when it's possible to do so? What would improve thought experiment construction so that the usefulness of this tool is maximized in these fields?
Thanks for bringing up this topic in a great post, Wesley! I'm perhaps not as concerned about this issue, for several reasons that I'll try to quickly sketch below. However, I'd like to emphasize that I am a *bit* worried and I of course welcome ideas for how to improve.
(1) Some people have taken steps to control for such issues, and not just in the last year or two. In their "Pushing Moral Buttons" (2009) paper, for example, Josh Greene and his collaborators test for what they call "unconscious realism"---"a tendency to unconsciously replace a moral dilemma’s unrealistic assumptions with more realistic ones" (p. 365).
https://dash.harvard.edu/bitstream/handle/1/4264763/Greene_MoralButtons.pdf?sequence=2
Here's an example: "Subjects estimated the likelihood (0–100%) that the consequences of Joe’s action would be (a) as described in the dilemma (five lives saved at the cost of one), (b) worse than this, or (c) better than this." (p. 366).
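Just to illustrate how such a realism check could be used in analysis, here is a minimal sketch in Python. The data, column names, and 50% cutoff are invented for illustration and are not from Greene et al.

```python
# Hypothetical sketch: using a Greene-style realism check to flag participants
# who seem to reject a dilemma's stipulated consequences. Column names, data,
# and the 50% cutoff are illustrative assumptions, not from the original paper.
import pandas as pd

# Each row is one participant: their 0-100 likelihood estimates that the
# outcome would be (a) as described, (b) worse, or (c) better, plus their
# moral permissibility rating for the action (e.g., on a 1-7 scale).
data = pd.DataFrame({
    "p_as_described": [85, 40, 90, 20, 70],
    "p_worse":        [10, 50,  5, 70, 20],
    "p_better":       [ 5, 10,  5, 10, 10],
    "permissibility": [ 5,  2,  6,  1,  4],
})

# Flag participants who think the stipulated outcome is less likely than not.
data["accepts_stipulation"] = data["p_as_described"] >= 50

# Compare mean permissibility with and without the skeptical participants.
print(data.groupby("accepts_stipulation")["permissibility"].mean())
```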
(2) As you say, another way to improve is to come up with more realistic cases in the first place. And some have already been doing this, yet finding similar results. I agree that Millar et al. (2014) is an excellent example. Those sorts of examples make me less worried that more realistic cases in other areas will yield substantially different results. Of course, they might, but the existing evidence is reassuring. The worry about ecological validity, after all, is ultimately an empirical one. So it doesn't make a lot of sense to me to treat it as a major problem in a particular area without data suggesting it is.
Posted by: Josh May | 11/19/2015 at 02:06 PM
Hey Josh,
I think you make a really good point about the advancements Greene and collaborators have made to help improve this in that particular set of experiments. At the same time, it’s hard not to still have a lot of questions about what is going on in trolley processing more generally if, as has recently been suggested, answers don’t end up measuring commitment to consequentialism. The study by Millar et al. (2014) is one of my favorites, but it is insufficient for me to make any inferences at all about whether realistic cases in other areas will or will not yield substantially different results. And of course, recent data do suggest this issue could be a big problem for other areas of research, such as free will. I take these two examples to show that how and when this is an issue will vary with specific thought experiments and judgments -- which worries me, because we have so many in philosophy and there are basically no procedures or limits on constructing them to try to get the answer you want.
Posted by: Wesley Buckwalter | 11/20/2015 at 11:43 AM
Hi Wesley, great post. This is a problem I've worried about a lot and dealt with from both sides of the issue, as it were. I have some old studies on trolley problems, done with Bradley Thomas and Dylan Murray, that we never got around to publishing. We tried to create more realistic pushing cases (using automatic braking systems) and less realistic switch cases (using weird loops), hoping to show that people (a) would make believability judgments and (more importantly, I think) judgments about the agent's being *justified* in believing the action would work to save net-4 lives, and (b) would make moral permissibility judgments that tracked their epistemic judgments (and hence would go way up for the pushing case and way down for the switch case).
Instead, what we found was that people's epistemic judgments seemed to track their moral judgments. That is, they still found the pushing case equally impermissible and also judged it unbelievable (and vice versa for switch), controlling for order of questions. It might seem, then, that our cases simply failed to be more (or less) believable. But when we took out the moral features (by replacing the people with crash test dummies and bags of luggage), the believability ratings did move in the right directions.
There are several interpretations of these results, but one is that people are rationalizing their moral judgments with their epistemic judgments (it might also involve higher stakes increasing epistemic standards).
You also discuss my collaborators' and my determinism and neuro-prediction cases. I share the worry that some participants may be rejecting the stipulations of the case because of implicit or explicit commitments to indeterministic, libertarian, or dualist beliefs, and I hope we can find ways to better test for this possibility (as you know, I don't think the ways you, Rose & Nichols tested it effectively show that most people are 'intruding' libertarian commitments into the scenarios). It's important to note that when we explicitly asked people whether it is possible for the neuro-prediction technology (allowing perfect prediction of decisions and actions based on prior brain activity) to exist in the future, 80% said yes. And when we asked why or why not, we found almost no one saying that it is impossible because humans have free will (or have a non-physical soul, or because indeterminism is true), as it seems more people would if they were committed to such beliefs. Instead, the 20% who say it's impossible talk about breakdowns in the technology, moral or political constraints on developing it, or the complexity of the human brain.
Finally, I've long thought the thought experiments used in phil mind to challenge physicalism and/or functionalism are really problematic for the reasons you suggest. My former student Toni Adleberg did a great thesis on this issue, suggesting that the best explanation of these thought experiments involves conflicts between our agency detection mechanisms (or maybe theory of mind) and our physical/mechanistic explanation systems. The work by Tony Jack and collaborators suggests something similar (as does some work by Brian Talbot). I think something like this also accounts for some of the results in the free will x-phi.
Posted by: Eddy Nahmias | 11/23/2015 at 01:04 PM
Great to see this being discussed! Just some general observations, in response to the closing questions of the OP:
Unnecessarily long, complicated, or unrealistic materials threaten to cause trouble in many ways. At the very least, they raise questions about reliability and external validity. Assuming that you're measuring what you want to, do you want to decrease random error in the measurement? Shorter and simpler materials decrease random error by minimizing opportunity for distraction, fatigue, and confusion, among other things. Do the findings generalize to other contexts, especially everyday situations where the concepts have their home? More realistic materials increase the extent to which this is true.
Manipulation checks are extremely helpful and can usually be included in a way that doesn't influence people's responses on the variable of interest. You spend time crafting short, simple, realistic, and tightly matched stimuli that, on the face of it, effectively manipulate the independent variable. But however plausible this seems to you, and however silly it would be for people not to get it, it's wise to check that people understood things in the intended way. If someone — say, a reviewer, maybe? — asks whether your manipulation was effective, you can plead plausibility. Or you can just let the data do the talking. The latter is definitely preferable, but it's possible only if you have the data.
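For what it's worth, here is a minimal sketch of "letting the data do the talking" with a simple believability check; the conditions, ratings, and scales below are invented for illustration, not from any actual study.

```python
# Hypothetical sketch of a manipulation check: compare believability ratings
# between two vignette conditions before analyzing the judgment of interest.
# Data, variable names, and scales are invented for illustration.
from scipy import stats

# Believability ratings (1-7) for the "realistic" vs. "unrealistic" versions.
believability_realistic   = [6, 7, 5, 6, 7, 6, 5, 7]
believability_unrealistic = [3, 4, 2, 5, 3, 4, 3, 2]

# A simple independent-samples t-test shows whether the manipulation
# actually moved believability in the intended direction.
t, p = stats.ttest_ind(believability_realistic, believability_unrealistic)
print(f"Manipulation check: t = {t:.2f}, p = {p:.3f}")

# If the check fails (ratings don't differ, or differ in the wrong direction),
# that is worth knowing before interpreting the main dependent variable.
```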
Just my $.02!
Posted by: John Turri | 11/23/2015 at 07:47 PM