Large language models open new way of AI-assisted molecule design for chemists
Journal of Cheminformatics
(2025) 17:36
Ishida et al. Journal of Cheminformatics
https://doi.org/10.1186/s13321-025-00984-8
Open Access
SOFTWARE
Shoichi Ishida1,2*, Tomohiro Sato3, Teruki Honma3 and Kei Terayama1,2,4,5*
Abstract
Recent advancements in artificial intelligence (AI)-based molecular design methodologies have offered
synthetic chemists new ways to design functional molecules with their desired properties. While various AI-based
molecule generators have significantly advanced toward practical applications, their effective use still requires
specialized knowledge and skills concerning AI techniques. Here, we develop a large language model (LLM)-powered
chatbot, ChatChemTS, that assists users in designing new molecules using an AI-based molecule generator
through only chat interactions, including automated construction of reward functions for the specified properties.
Our study showcases the utility of ChatChemTS through de novo design cases involving chromophores and
anticancer drugs (epidermal growth factor receptor inhibitors), exemplifying single- and multiobjective molecule
optimization scenarios, respectively. ChatChemTS is provided as an open-source package on GitHub at
https://github.com/molecule-generator-collection/ChatChemTS.
Scientific contribution
ChatChemTS is an open-source application that assists users in utilizing an AI-based molecule generator, ChemTSv2,
solely through chat interactions. This study demonstrates that LLMs have the potential to operate advanced software, such as AI-based molecule generators, whose use requires specialized knowledge and technical skills.
Introduction
Artificial intelligence (AI)-based techniques for molecular design are becoming promising methods for designing
synthetically accessible and insightful molecules with desired functionalities [1–8]. Research articles on these
techniques have been reported in a wide range of fields, from material design to drug discovery. In terms of
material design, fluorescent [4] and photofunctional [1, 5] molecules have been designed using AI-based molecule
generators, and the designed molecules were experimentally validated to exhibit the desired properties.
Similarly, in drug discovery, new proton pump inhibitors [6] and inhibitors targeting antifibrotic effects [7]
were designed and shown to have good inhibitory effects. The AI-based molecule generators used in the above
studies represent just a fraction of the techniques that have been developed thus far [9–25], and applying and
testing various promising molecule generators to solve real-world problems is vital for achieving
further advancements.
*Correspondence: Shoichi Ishida; Kei Terayama
1 Graduate School of Medical Life Science, Yokohama City University, 1-7-29, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
2 MolNavi LLC, #402 Wizard building 1-4-3 Sengen-cho Nishi-ku, Yokohama, Kanagawa 220-0072, Japan
3 RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
4 RIKEN Center for Advanced Intelligence Project, 1-4-1, Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
5 MDX Research Center for Element Strategy, Institute of Science Tokyo, 4259, Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan
While various AI-based molecule generators have
made significant progress toward practical applications, their effective utilization still requires specialized
knowledge and skills concerning AI techniques [26]. This
high level of expertise presents a critical obstacle to the
widespread adoption of AI-based molecule generators.
© The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/.
The effective use of these methods necessitates a deep
understanding of how to design reward functions that
appropriately represent the desired functionalities and
the ability to configure the run settings according to
the specifications of each AI-based molecule generator.
In chemical, pharmaceutical, and other industries, the
complexity of utilizing AI-based molecule generators
and the need for skills such as machine learning (ML) to
design reward functions pose significant obstacles that
prevent users from easily adopting these technologies for
their projects. These challenges complicate the effective
utilization of AI-based molecule generators to solve real-world
problems, especially for researchers and developers who possess expert knowledge and skills in chemistry
but are not well versed in AI techniques.
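To make the notion of a reward function concrete, the following minimal Python sketch shows how desired functionalities might be turned into a score between 0 and 1 for single- and multiobjective cases. It is an illustration under stated assumptions, not the ChemTSv2 reward interface: the property names (absorption_nm, activity_pIC50, drug_likeness, sa_score), the 600 nm target, the scaling ranges, and the geometric-mean aggregation are all hypothetical choices, and the property values are assumed to come from some external prediction model.

```python
import math
from typing import Dict, List

# Hypothetical property record for one molecule. In practice these values
# would come from prediction models or simulations, not hand-written numbers.
Properties = Dict[str, float]


def gaussian_score(value: float, target: float, width: float) -> float:
    """Score in (0, 1] that peaks at 1.0 when value equals target."""
    return math.exp(-((value - target) ** 2) / (2.0 * width ** 2))


def linear_score(value: float, low: float, high: float) -> float:
    """Map value linearly onto [0, 1], clipping outside [low, high]."""
    if high == low:
        raise ValueError("low and high must differ")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def single_objective_reward(props: Properties) -> float:
    """Reward a predicted absorption wavelength close to an assumed 600 nm target."""
    return gaussian_score(props["absorption_nm"], target=600.0, width=50.0)


def multi_objective_reward(props: Properties) -> float:
    """Combine several scaled objectives with a geometric mean.

    The geometric mean is zero if any single objective scores zero, so a
    molecule cannot offset one failed criterion with another.
    """
    scores: List[float] = [
        linear_score(props["activity_pIC50"], low=5.0, high=9.0),  # higher is better
        linear_score(props["drug_likeness"], low=0.0, high=1.0),   # higher is better
        1.0 - linear_score(props["sa_score"], low=1.0, high=6.0),  # lower is better
    ]
    return math.prod(scores) ** (1.0 / len(scores))


if __name__ == "__main__":
    # Hypothetical candidate with made-up property values, for illustration only.
    candidate = {
        "absorption_nm": 600.0,
        "activity_pIC50": 7.0,
        "drug_likeness": 0.5,
        "sa_score": 3.5,
    }
    print(single_objective_reward(candidate))
    print(multi_objective_reward(candidate))
```

Even in this small sketch, the user must choose targets, scaling ranges, and an aggregation rule for each property, which is the kind of expertise the preceding paragraph identifies as a barrier.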
To address these challenges, we developed ChatChemTS, a large language model (LLM)-powered chatbot
that assists users in utilizing ChemTSv2 [11], an AI-based
molecule generator that has been experimentally validated in
various molecule design studies [1, 3–5], through only interactive chats. Users need only express
a request to ChatChemTS via chat; ChatChemTS
then prepares the appropriate reward functions, configures the desired conditions, and executes ChemTSv2
for them. In addition, ChatChemTS provides a tool
for analyzing the molecule generation results.
ChatChemTS is based on the ReAct framework so that it
can address the whole workflow of general AI-based molecule generators, and the framework employs the generative pretrained transformer (GPT) model of OpenAI,
which has shown great potential as an LLM chemistry agent for chemistry-related tasks [27–33]. As
example applications of ChatChemTS, we performed two
de novo molecular design tasks, one involving a photofunctional organic molecule and another concerning a
kinase inhibitor, as single- and multiobjective molecule
optimization problems, respectively. Notably, users only
need to prepare data related to the physicochemical
properties of molecules or information about the target proteins of interest to perform AI-based molecule
design with ChatChemTS. We show that this concept
of utilizing an LLM as an assistant for AI-based molecule generators can be easily extended to various generators developed with organized application structures (...truncated)