Probabilistic coherence, logical consistency, and Bayesian learning: Neural language models as epistemic agents

PLOS ONE, Feb 2023

It is argued that suitably trained neural language models exhibit key properties of epistemic agency: they hold probabilistically coherent and logically consistent degrees of belief, which they can rationally revise in the face of novel evidence. To this end, we conduct computational experiments with rankers: T5 models [Raffel et al. 2020] that are pretrained on carefully designed synthetic corpora. Moreover, we introduce a procedure for eliciting a model’s degrees of belief, and define numerical metrics that measure the extent to which given degrees of belief violate (probabilistic, logical, and Bayesian) rationality constraints. While pretrained rankers are found to suffer from global inconsistency (in agreement with, e.g., [Jang et al. 2021]), we observe that subsequent self-training on auto-generated texts allows rankers to gradually obtain a probabilistically coherent belief system that is aligned with logical constraints. In addition, such self-training is found to have a pivotal role in rational evidential learning, too, for it seems to enable rankers to propagate a novel evidence item through their belief systems, successively re-adjusting individual degrees of belief. All this, we conclude, confirms the Rationality Hypothesis, i.e., the claim that suitably trained NLMs may exhibit advanced rational skills. We suggest that this hypothesis has empirical, yet also normative and conceptual ramifications far beyond the practical linguistic problems NLMs were originally designed to solve.

RESEARCH ARTICLE (OPEN ACCESS)

Gregor Betz (1), Kyle Richardson (2)

1 Department of Philosophy, Karlsruhe Institute of Technology, Karlsruhe, Germany
2 Aristo, Allen Institute for AI, Seattle, WA, United States of America

Citation: Betz G, Richardson K (2023) Probabilistic coherence, logical consistency, and Bayesian learning: Neural language models as epistemic agents. PLoS ONE 18(2): e0281372. https://doi.org/10.1371/journal.pone.0281372

Editor: Anu Sayal, Taylor’s University - Lakeside Campus: Taylor’s University, MALAYSIA

Received: June 1, 2022. Accepted: January 22, 2023. Published: February 9, 2023.

Peer Review History: PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. The editorial history of this article is available here: https://doi.org/10.1371/journal.pone.0281372

Copyright: © 2023 Betz, Richardson. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: The data is now available via wandb under the following links/repos:
https://wandb.ai/doxlm2/doxlm2_model_runs
https://wandb.ai/doxlm2/doxlm2_finetuning
https://wandb.ai/doxlm2/dataset_versions

Funding: This work is supported by the Helmholtz Association Initiative and Networking Fund on the HAICORE@KIT partition. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Neural language models (NLMs) are powerful natural language processing systems which have sparked a scientific revolution in the field of AI & NLP [1–4] and excel at such diverse tasks as, e.g., machine translation [5], text summarization [6], question answering [7, 8], or natural-language inference [9, 10]. The performance of these systems has exploded with the advent of the so-called Transformer network architecture [11] and has been increasing steadily over the last years (e.g., [12]) through further optimizations of machine learning algorithms and system design, increases in model size, or quantitatively and qualitatively improved training datasets. Technically, and leaving aside all the details, NLMs are essentially probabilistic word prediction machines. They are, first and foremost, trained to fill in missing or next words in a text; and they predict a word by assigning probabilities to all words available in a given vocabulary.

The strong performance of NLMs in natural language understanding tasks triggers the more fundamental question whether NLMs are rational agents:

(Rationality Hypothesis) Suitably designed and trained NLMs may systematically display advanced rational skills.

By discussing the (Rationality Hypothesis), we put the more specific questions addressed in this study (see Q1–Q4 below) in a broader scientific context, sketching their potential relevance for a variety of disciplines and fields.

Rationality is arguably a contested concept (like justice). So what exactly does it mean that an NLM possesses advanced rational skills? We take it that such skills would include, more specifically, the abilities to reason correctly (infer, argue, and explain), to produce linguistic output that is sufficiently stable and globally consistent, and to adjust a former output in the light of novel evidence (or, more precisely, a linguistic representation of novel evidence). Moreover, advanced rational behavior of NLMs would allow one to adopt an “intentional stance” [13] towards these systems and to treat them as doxastic, if not epistemic agents holding beliefs and acquiring knowledge.
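The rational skills listed above (coherent degrees of belief, consistency with logical relations, and revision in the light of evidence) can be made a little more concrete with a toy sketch in Python. The function names, the example sentences, and the particular violation scores below are illustrative assumptions only; they are not the metrics or the elicitation procedure defined in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a degree of belief in [0, 1] per sentence.
Beliefs = Dict[str, float]


def complement_violation(beliefs: Beliefs, pairs: List[Tuple[str, str]]) -> float:
    """Average |P(s) + P(not-s) - 1| over sentence/negation pairs (0 = coherent)."""
    return sum(abs(beliefs[s] + beliefs[n] - 1.0) for s, n in pairs) / len(pairs)


def entailment_violation(beliefs: Beliefs, entailments: List[Tuple[str, str]]) -> float:
    """Average amount by which P(premise) exceeds P(conclusion) (0 = consistent)."""
    return sum(max(0.0, beliefs[p] - beliefs[c]) for p, c in entailments) / len(entailments)


def bayes_update(prior: float, lik_if_true: float, lik_if_false: float) -> float:
    """Posterior P(h | e) from P(h), P(e | h) and P(e | not-h) via Bayes' rule."""
    evidence = lik_if_true * prior + lik_if_false * (1.0 - prior)
    return lik_if_true * prior / evidence


# Toy belief state that is deliberately incoherent.
beliefs = {"rain": 0.7, "no rain": 0.4, "rain or snow": 0.6}

cv = complement_violation(beliefs, [("rain", "no rain")])
ev = entailment_violation(beliefs, [("rain", "rain or snow")])
posterior = bayes_update(prior=0.5, lik_if_true=0.8, lik_if_false=0.2)

print(round(cv, 3), round(ev, 3), round(posterior, 3))
```

In this toy state both scores are positive: the beliefs in a sentence and its negation sum to more than 1, and a sentence is believed more strongly than something it entails. A fully coherent and consistent belief state would score 0 on both.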
In this study, we focus on, further specify, and operationalize the aforementioned epistemic competences. In doing so, we don’t, however, intend to imply that all dimensions of rationality can be reduced to such theoretical or epistemic skills.

To say that future NLMs (trained on linguistic data) may exhibit artificial general intelligence (AGI) means to endorse the Rationality Hypothesis.

The Rationality Hypothesis has ramifications far beyond the practical linguistic problems NLMs have been developed (and are used) to solve. Normatively and conceptually, its investigation may shed new light on the notion of rationality itself (see [14]), helping us to see whether reason is an emergent property [15]: Is reliable rational behavior a cognitive macro pattern that emerges when agents exercise basic linguistic skill (predicting missing words)? Or, to give this a normative twist: The Rationality Hypothesis asks which, if any, rational practices are grounded in elementary language norms.

Empirically, an investigation of the Rationality Hypothesis will potentially alter our scientific understanding of human cognition (see also [16, 17], especially so as NLMs are found to accurately predict humans’ behavioral and neural (...truncated)


This is a preview of a remote PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0281372&type=printable
Article home page: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0281372

Gregor Betz, Kyle Richardson. Probabilistic coherence, logical consistency, and Bayesian learning: Neural language models as epistemic agents, PLOS ONE, 2023, Volume 18, Issue 2, DOI: 10.1371/journal.pone.0281372