ChatGPT is bullshit
Ethics and Information Technology (2024) 26:38
https://doi.org/10.1007/s10676-024-09775-5
ORIGINAL PAPER
Michael Townsen Hicks¹ · James Humphries¹ · Joe Slater¹
Published online: 8 June 2024
© The Author(s) 2024
Abstract
Recently, there has been considerable interest in large language models: machine learning systems which produce humanlike text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are
often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, are better
understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important
way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters,
and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations
as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.
Keywords Artificial intelligence · Large language models · LLMs · ChatGPT · Bullshit · Frankfurt · Assertion · Content
Introduction
Large language models (LLMs), programs which use reams
of available text and probability calculations in order to
create seemingly-human-produced writing, have become
increasingly sophisticated and convincing over the last
several years, to the point where some commentators suggest that we may now be approaching the creation of artificial general intelligence (see e.g. Knight, 2023 and Sarkar,
2023). Alongside worries about the rise of Skynet and the
use of LLMs such as ChatGPT to replace work that could
and should be done by humans, one line of inquiry concerns
what exactly these programs are up to: in particular, there
is a question about the nature and meaning of the text produced, and of its connection to truth. In this paper, we argue
against the view that when ChatGPT and the like produce
false claims they are lying or even hallucinating, and in
favour of the position that the activity they are engaged in
is bullshitting, in the Frankfurtian sense (Frankfurt, 2002,
2005). Because these programs cannot themselves be concerned with truth, and because they are designed to produce
text that looks truth-apt without any actual concern for truth,
it seems appropriate to call their outputs bullshit.
¹ University of Glasgow, Glasgow, Scotland
We think that this is worth paying attention to. Descriptions of new technology, including metaphorical ones, guide
policymakers’ and the public’s understanding of new technology; they also inform applications of the new technology. They tell us what the technology is for and what it can
be expected to do. Currently, false statements by ChatGPT
and other large language models are described as “hallucinations”, which gives policymakers and the public the
idea that these systems are misrepresenting the world, and
describing what they “see”. We argue that this is an inapt
metaphor which will misinform the public, policymakers,
and other interested parties.
The structure of the paper is as follows: in the first section, we outline how ChatGPT and similar LLMs operate.
Next, we consider the view that when they make factual
errors, they are lying or hallucinating: that is, deliberately
uttering falsehoods, or blamelessly uttering them on the
basis of misleading input information. We argue that neither of these ways of thinking is accurate, insofar as both
lying and hallucinating require some concern with the truth
of their statements, whereas LLMs are simply not designed
to accurately represent the way the world is, but rather to
give the impression that this is what they’re doing. This, we
suggest, is very close to at least one way that Frankfurt talks
about bullshit. We draw a distinction between two sorts of
bullshit, which we call ‘hard’ and ‘soft’ bullshit, where the
former requires an active attempt to deceive the reader or
listener as to the nature of the enterprise, and the latter only
requires a lack of concern for truth. We argue that at minimum, the outputs of LLMs like ChatGPT are soft bullshit:
bullshit – that is, speech or text produced without concern for
its truth – that is produced without any intent to mislead the
audience about the utterer’s attitude towards truth. We also
suggest, more controversially, that ChatGPT may indeed
produce hard bullshit: if we view it as having intentions (for
example, in virtue of how it is designed), then the fact that it
is designed to give the impression of concern for truth qualifies it as attempting to mislead the audience about its aims,
goals, or agenda. So, with the caveat that the particular kind
of bullshit ChatGPT outputs is dependent on particular
views of mind or meaning, we conclude that it is appropriate
to talk about ChatGPT-generated text as bullshit, and flag up
why it matters that – rather than thinking of its untrue claims
as lies or hallucinations – we call bullshit on ChatGPT.
What is ChatGPT?
Large language models are becoming increasingly good
at carrying on convincing conversations. The most prominent large language model is OpenAI’s ChatGPT, so it’s the
one we will focus on; however, what we say carries over to
other neural network-based AI chatbots, including Google’s
Bard chatbot, Anthropic’s Claude (claude.ai), and Meta’s
LLaMa. Despite being merely complicated bits of software,
these models are surprisingly human-like when discussing a
wide variety of topics. Test it yourself: anyone can go to the
OpenAI web interface and ask for a ream of text; typically,
it produces text which is indistinguishable from that of your
average English speaker or writer. The variety, length, and
similarity to human-generated text that GPT-4 is capable of
have convinced many commentators that this chatbot
has finally cracked it: that this is real (as opposed to merely
nominal) artificial intelligence, one step closer to a human-like mind housed in a silicon brain.
However, large language models, and other AI models
like ChatGPT, are doing considerably less than what human
brains do, and it is not clear whether they do what they do in
the same way we do. The most obvious difference between
an LLM and a human mind involves the goals of the system.
Humans have a variety of goals and behaviours, most of
which are extra-linguistic: we have basic physical desires,
for things like food and sustenance; we have social goals
and relationships; we have projects; and we create physical
objects. Large language models simply aim to replicate
human speech or writing. This means that their primary
goal, insofar as they have one, is to produce human-like
text. They do so by estimating the likelihood that a particular
word will appear next, given the text that has come before.
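To illustrate what estimating the likelihood of the next word involves, here is a minimal sketch in Python. It is a toy under strong assumptions: it conditions on only the single preceding word and uses raw counts from a three-sentence corpus invented for this example, whereas systems like ChatGPT use neural networks conditioned on much longer stretches of text. The function names and the corpus are hypothetical and are not drawn from any real system.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes immediately after it.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probabilities(counts, prev_word):
    """Estimate P(next word | previous word) from the raw counts."""
    followers = counts.get(prev_word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # The word was never seen with a follower.
    return {word: n / total for word, n in followers.items()}


# Hypothetical toy corpus, invented for illustration only.
TOY_CORPUS = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

counts = train_bigram_counts(TOY_CORPUS)
probs = next_word_probabilities(counts, "the")

# The "prediction" is simply the most frequent follower in the corpus.
most_likely = max(probs, key=probs.get)
```

In this toy corpus, "cat" follows "the" in two of six cases, so it is selected as the most likely continuation. The selection depends only on how often words co-occur in the text; nothing in the procedure checks whether the resulting sentence is true.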
The ma (...truncated)