Enhancing bitcoin transaction confirmation prediction: a hybrid model combining neural networks and XGBoost
World Wide Web (2023) 26:4173–4191
https://doi.org/10.1007/s11280-023-01212-9
Limeng Zhang¹ · Rui Zhou¹ · Qing Liu² · Jiajie Xu³ · Chengfei Liu¹ ·
Muhammad Ali Babar⁴
Received: 30 April 2023 / Revised: 12 September 2023 / Accepted: 14 September 2023 /
Published online: 26 December 2023
© The Author(s) 2023
Abstract
With Bitcoin widely recognized as the most popular cryptocurrency, more transactions
are expected to be submitted to the Bitcoin blockchain system. As a result, many
transactions encounter confirmation delays of varying length. It is therefore
vital to help a user understand (if possible) how long it may take for a transaction to be confirmed in the Bitcoin blockchain. In this work, we address the issue of predicting confirmation
time within a block interval rather than pinpointing a specific timestamp. After dividing the
future into a set of block intervals (i.e., classes), the prediction of a transaction’s confirmation
is treated as a classification problem. To solve it, we propose a framework, Hybrid Confirmation Time Estimation Network (Hybrid-CTEN), based on neural networks and XGBoost to
predict transaction confirmation time in the Bitcoin blockchain system using three different
sources of information: historical transactions in the blockchain, unconfirmed transactions in
the mempool, as well as the transaction to be estimated itself. Experiments on real-world
blockchain data show that XGBoost performs best in the binary classification
case (predicting whether a transaction will be confirmed in the next generated block), while our
proposed framework Hybrid-CTEN outperforms state-of-the-art methods on precision, recall
and F1-score in all the multiclass classification cases (4-class, 6-class and 8-class), which predict
in which future block interval a transaction will be confirmed.
Keywords Transaction confirmation time · Bitcoin · Blockchain · XGBoost · Neural
network
1 Introduction
As Bitcoin gains recognition from more organisations, institutes and governments, it is
being adopted in an increasing number of areas [1]. Currently, many businesses, such as PayPal,
This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2022
Guest editors: Richard Chbeir, Helen Huang, Yannis Manolopoulos and Fabrizio Silvestri.
Corresponding author: Rui Zhou
Microsoft, and Overstock, have embraced Bitcoin as a method of payment. Meanwhile,
various online cryptocurrency trading platforms, such as Coinbase, Gemini¹, and PayPal,
have enabled users to purchase, sell, store, and transfer Bitcoins. As a result, more Bitcoin
transactions are expected to be submitted to the Bitcoin blockchain. Unfortunately, due to
the confirmation mechanism in the system, only a limited number of transactions (restricted
to the capacity of a block) can be confirmed at a time. Therefore, many transactions cannot
be immediately confirmed, and confirmation delays commonly occur in the Bitcoin system.
It is therefore vital to help a user understand (if possible) how long it
may take for a transaction to be confirmed in the Bitcoin blockchain.
Most previous attempts at estimating the confirmation time for a transaction focus on
predicting a specific timestamp or predicting the number of blocks a transaction needs to
wait for before it is confirmed [2–9]. However, it is usually more practical to predict the
confirmation time as falling into the corresponding predefined time intervals (e.g., within 1
hour, between 1 hour and 4 hours, and more than 4 hours). This is motivated by the following
considerations. On the one hand, when attempting to estimate a specific timestamp, one issue
is that the estimation performance can be affected by the submission time, especially for
transactions that are scheduled for confirmation in the subsequent block. The confirmation
time for these transactions is influenced by the remaining time before the next block is
produced. Consequently, because it is submitted later, a transaction with a
significantly higher fee can experience a longer delay than a transaction
with a lower fee that was submitted earlier. The second
issue arises from the unpredictable nature of block generation time, which can span from mere
seconds to several hundred seconds. As a result, the confirmation time for two transactions
submitted at different block heights but confirmed within the same block interval can exhibit
unpredictable differences, which may undermine users’ satisfaction when using a client-side
transaction system.
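A small worked example may make the first issue concrete. The sketch below assumes, purely for illustration, that every transaction is confirmed in the first block produced after its submission. The block timestamps and submission times are hypothetical values, not data from this study, and `confirmation_delay` is an illustrative helper name.

```python
from bisect import bisect_right


def confirmation_delay(submit_time, block_times, blocks_to_wait=1):
    """Seconds between submission and confirmation.

    Assumption (illustrative): the transaction is confirmed in the
    `blocks_to_wait`-th block produced strictly after `submit_time`.
    `block_times` is a sorted list of block timestamps in seconds.
    """
    first_later = bisect_right(block_times, submit_time)  # index of the next block
    target = first_later + blocks_to_wait - 1
    if blocks_to_wait < 1 or target >= len(block_times):
        raise ValueError("no such block in block_times")
    return block_times[target] - submit_time


# Hypothetical timeline: one block at t=600 s, the next at t=1500 s.
block_times = [600, 1500]

low_fee_delay = confirmation_delay(590, block_times)   # submitted just before a block
high_fee_delay = confirmation_delay(601, block_times)  # submitted just after that block

print(low_fee_delay, high_fee_delay)  # prints: 10 899
```

Both hypothetical transactions wait exactly one block, yet their delays in seconds differ by a factor of roughly 90. Measured in blocks, the two cases are identical.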
On the other hand, by utilizing the block as the unit of measurement for confirmation
time, the variance in confirmation time can be significantly diminished. However, a challenge
arises as the estimation result can be heavily influenced by a small proportion of transactions,
especially when there is a scarcity of historical transactions for that interval. In such cases, the
estimation result may become highly dependent on a single or a few transactions. Moreover,
when the estimated confirmation time (in terms of both a specific time and a block interval)
exceeds a certain level, users tend to pay a higher transaction fee to prioritize the confirmation
process. In conclusion, we suggest that, as long as the confirmation time falls within an
acceptable range, it may be more practical and reasonable to present system users with a
confirmation time range rather than a specific confirmation timestamp. Against this background, if we
divide the future into a number of block intervals (each representing a class), the
confirmation time prediction problem can be treated as a classification problem.
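The mapping from a block delay to a class label can be sketched as follows. This is a minimal illustration, not the interval scheme used in this work: the interval boundaries in `UPPER_BOUNDS` and the function name `interval_class` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical upper bounds (in blocks) of all but the last interval.
# They define four classes: 1 block, 2-3 blocks, 4-6 blocks, more than 6.
UPPER_BOUNDS = [1, 3, 6]


def interval_class(blocks_waited, upper_bounds=UPPER_BOUNDS):
    """Return the index of the block interval containing `blocks_waited`.

    `blocks_waited` counts blocks from submission to confirmation, so a
    transaction confirmed in the next generated block has the value 1.
    """
    if blocks_waited < 1:
        raise ValueError("blocks_waited must be at least 1")
    return bisect_left(upper_bounds, blocks_waited)


labels = [interval_class(b) for b in (1, 2, 3, 4, 6, 7, 50)]
print(labels)  # [0, 1, 1, 2, 2, 3, 3]
```

With a single bound, `interval_class(b, [1])` reduces to the binary case of predicting whether a transaction is confirmed in the next generated block.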
The accuracy of transaction confirmation time estimation is crucial for blockchain-based
applications. However, existing efforts suffer from four key drawbacks in their frameworks:
(1) The existing methods for transaction confirmation estimation do not provide tailored
estimates for individual transactions. Instead, most of them estimate the confirmation time
for a group of transactions. For example, some works such as [5, 6] estimate the average
confirmation time of high-feerate class transactions and low-feerate transactions, while others like [8] estimate the average confirmation time of all the unconfirmed transactions. (2)
Models proposed in [3, 10] predict only whether a transaction can be confirmed in the next
block, treating the problem as a binary classification task. However, such models may not be
1 https://www.gemini.com
sufficient in practice as they do not provide more detailed confirmation information beyond
a simple yes or no. (3) Some of t (...truncated)