Emergent language: a survey and taxonomy
Autonomous Agents and Multi-Agent Systems (2025) 39:18
https://doi.org/10.1007/s10458-025-09691-y
Jannik Peters1 · Constantin Waubert de Puiseau1 · Hasan Tercan1 ·
Arya Gopikrishnan2 · Gustavo Adolpho Lucas de Carvalho3 ·
Christian Bitter1 · Tobias Meisen1
Accepted: 26 January 2025 / Published online: 7 March 2025
© The Author(s) 2025
Abstract
The field of emergent language represents a novel area of research within the domain of
artificial intelligence, particularly within the context of multi-agent reinforcement learning.
Although the concept of studying language emergence is not new, early approaches were
primarily concerned with explaining human language formation, with little consideration
given to its potential utility for artificial agents. In contrast, studies based on reinforcement
learning aim to develop communicative capabilities in agents that are comparable to or even
superior to human language. Thus, they extend beyond the learned statistical representations
that are common in natural language processing research. This gives rise to a number of
fundamental questions, from the prerequisites for language emergence to the criteria for
measuring its success. This paper addresses these questions by providing a comprehensive
review of relevant scientific publications on emergent language in artificial intelligence. Its
objective is to serve as a reference for researchers interested in or proficient in the field.
Consequently, the main contributions are the definition and overview of the prevailing
terminology, the analysis of existing evaluation methods and metrics, and the description of
the identified research gaps.
Keywords Emergent language · Emergent communication · Artificial intelligence ·
Reinforcement learning · Multi-agent
1 Introduction
Communication between individual entities is based on conventions and rules that emerge
from the necessity or advantage of coordination. Accordingly, Lewis [1] formalized settings
that facilitate the emergence of language as “coordination problems” [1] and introduced a
simple signaling game. This game, in which a speaker describes an object and a listener
confronted with multiple options has to identify the indicated one, extensively shaped the
field of emergent language (EL) research in computer science. Early works examined
narrowly defined questions regarding the characteristics of emergent communication (EC)
via hand-crafted simulations [2–12]. These approaches mostly utilized supervised learning
methods and non-situated settings, limiting their ability to examine the origins and
development of complex linguistic features [2]. However, EL research experienced an
upsurge between 2016 and 2018 [13–20], with a focus on MARL approaches
[21–32] to enable the examination of more complex features.
Arya Gopikrishnan and Gustavo Adolpho Lucas de Carvalho carried out work during and after a DAAD
RISE internship at the Institute of Technologies and Management of Digital Transformation.
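The signaling game described above, in which a speaker names an object and a listener must pick it out from several options, can be illustrated with a minimal sketch. The learning rule used here (a simple urn-style reinforcement of successful choices) and all names such as `play_signaling_game`, `n_objects`, and `n_signals` are assumptions made for illustration; they are not taken from the surveyed works, which typically use neural agents trained with MARL.

```python
import random


def play_signaling_game(n_objects=3, n_signals=3, rounds=5000, seed=0):
    """Illustrative Lewis-style signaling game (hypothetical helper).

    Assumption: both agents learn by urn-style reinforcement, i.e. the
    weight of a choice grows by 1 whenever it led to a successful round.
    """
    rng = random.Random(seed)
    # speaker[o][s]: propensity of the speaker to send signal s for object o
    speaker = [[1.0] * n_signals for _ in range(n_objects)]
    # listener[s][o]: propensity of the listener to pick object o on signal s
    listener = [[1.0] * n_objects for _ in range(n_signals)]
    successes = []
    for _ in range(rounds):
        target = rng.randrange(n_objects)  # object the speaker must describe
        signal = rng.choices(range(n_signals), weights=speaker[target])[0]
        guess = rng.choices(range(n_objects), weights=listener[signal])[0]
        success = guess == target
        if success:
            # Shared reward: reinforce the choices that coordinated both agents
            speaker[target][signal] += 1.0
            listener[signal][guess] += 1.0
        successes.append(success)
    return speaker, listener, successes


if __name__ == "__main__":
    _, _, outcomes = play_signaling_game()
    late_rate = sum(outcomes[-1000:]) / 1000
    print(f"success rate over the last 1000 rounds: {late_rate:.2f}")
```

Neither agent is given a meaning for any signal in this sketch. Whatever mapping between objects and signals appears is a convention reinforced by the shared reward, which is the sense in which the game is a coordination problem.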
One fundamental goal of EL research from the multi-agent reinforcement learning
(MARL) perspective is to have agents autonomously develop a form of communication that
allows not only agent-to-agent but also agent-to-human communication in a natural language
(NL) style [2, 16, 24, 29, 33, 34]. Reinforcement learning (RL) methods
are attractive for this goal from two points of view. First, successful communication settings might lead
to agents that are “more flexible and useful in everyday life” [35]. Second, they may
provide insights into the evolution of NL itself [36]. However, encouraging communication
alone will not automatically produce a language with natural language characteristics [37].
Providing the right incentives for language development is therefore crucial.
EL is the methodological attempt to enable agents to not only statistically understand and
use NL, like natural language processing (NLP) models that learn on text alone [38, 39], but
also to design, acquire, develop, and learn their own language [40, 41]. The autonomy and
independent active experience of RL settings is a crucial difference from the data-driven approaches in the field of NLP [42–44] and its large language models (LLMs).
According to Browning and LeCun, “we should not confuse the shallow understanding
LLM possess for the deep understanding humans acquire” [41] through their experiences in
life. In EL settings, the agents experience the benefits of communication through goal-oriented tasks [45], just as it happens naturally [1], and therefore have the opportunity to
develop a deeper understanding of the world [33, 46]. Hence, advances in EL research
enable novel applications of multi-agent systems and a considerably advanced form of
human-centric AI [35].
In the current state of EL research, numerous methods and metrics are already
established, but they are difficult to structure, and important issues remain regarding the
analysis and comparison of achieved results [29, 35, 47]. Therefore, we see a need for a
taxonomy to prevent misunderstandings and incorrect use of established metrics. In this
paper, we address these issues by providing a comprehensive overview of publications in
EL research and by introducing a taxonomy for discrete EL that encompasses key concepts
and terminologies of this field. Additionally, we present established and recent metrics for
discrete EL categorized according to the taxonomy and discuss their utility. Our goal is to
provide a clear and concise description that researchers can use as a shared resource for
guidance. Finally, we create a summary of EL research that highlights its achievements and
provides an outlook on future research directions. We base our work on a comprehensive
and systematic literature search with reproducible search terms on well-known databases.
We follow the PRISMA [48] specifications and show a corresponding flow diagram in
Fig. 11 in Appendix B. The literature search and review process as well as its results are
described in detail in Sect. 4. All identified work has been reviewed and categorized
according to an extensive list of specific characteristics, e.g. regarding communication
setting, game composition, environment configuration, language design, language metrics,
and more.
Previous surveys of EL in computer science focused only on a subgroup of characteristics or specific parts of this research area. Some of these earlier surveys focus on specific
learning settings [45, 49, 50] or on methodological summaries and criticism [29, 40, 51–55],
while others provide a more general overview [24, 35, 36, 47, 56–58]. The surveys most similar to our
work are [35] and [58]. Lazaridou [35] gives an introduction and overview of the EL field