A method for identifying faulty cells using a classification tree-based UE diagnosis in LTE
Muñoz et al. EURASIP Journal on Wireless Communications and
Networking (2017) 2017:130
DOI 10.1186/s13638-017-0914-3
RESEARCH
Open Access
P. Muñoz1*, R. Barco1, E. Cruz2, A. Gómez-Andrades1, E. J. Khatib1 and N. Faour2
Abstract
The latest advances in wireless technologies have led to a proliferation of mobile data devices and services. As a
consequence, mobile networks have experienced a significant increase in data traffic, while voice traffic has shown
nearly no growth. It is therefore essential for operators to understand the data traffic behavior at the user level in order
to ensure a good customer experience. In the radio access networks (RANs), traditional solutions based on cell-level
measurements are not adequate to analyze performance of individual users. Instead, novel alternatives such as the
use of call traces and the definition of new user-centric indicators will provide detailed and valuable information for
each connection. One of the key measurements related to data services is the user throughput. In this work, the user
throughput is adopted as the main attribute to conduct diagnosis in the RAN, which has typically been the bottleneck
for data services. To that end, a binary classification tree is proposed to determine the root cause of poor throughput
in user-level data sessions. Then, this information is aggregated at the cell level in order to provide effective diagnosis
of degraded cells. In particular, a correlation-based analysis of the cell status is proposed in order to identify abnormal
cell behaviors in an automatic way. Evaluation has been carried out with datasets from live cellular networks. Results
show that the proposed diagnosis approach is an effective means to identify the main factors that limit the user
throughput in network cells.
Keywords: Self-healing, Fault diagnosis, Long-Term Evolution, Correlation, Self-Organizing Networks
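The pipeline summarized in the abstract (per-session diagnosis with a binary tree, aggregation at the cell level, and a correlation-based check of the cell status) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the attribute names (`throughput_mbps`, `sinr_db`, `prb_share`), the thresholds, the cause labels, the reference profile, and the `min_corr` cutoff are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical cause labels; the paper's actual label set may differ.
CAUSES = ["ok", "poor_radio", "cell_load", "low_demand"]

def diagnose_session(s):
    """Hypothetical binary tree: each node tests one attribute against a threshold."""
    if s["throughput_mbps"] >= 5.0:   # assumed target throughput
        return "ok"
    if s["sinr_db"] < 3.0:            # assumed radio-quality threshold
        return "poor_radio"
    if s["prb_share"] < 0.2:          # assumed share of radio resources obtained
        return "cell_load"
    return "low_demand"

def cell_profiles(sessions):
    """Aggregate per-session diagnoses into a per-cell fraction of each cause."""
    counts = defaultdict(Counter)
    for s in sessions:
        counts[s["cell"]][diagnose_session(s)] += 1
    return {cell: [c[k] / sum(c.values()) for k in CAUSES]
            for cell, c in counts.items()}

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def abnormal_cells(profiles, reference, min_corr=0.5):
    """Flag cells whose cause profile correlates weakly with a reference profile."""
    return sorted(cell for cell, p in profiles.items()
                  if pearson(p, reference) < min_corr)

# Toy data: cell A is mostly healthy, cell B is dominated by poor radio conditions.
sessions = [
    {"cell": "A", "throughput_mbps": 10.0, "sinr_db": 15.0, "prb_share": 0.5},
    {"cell": "A", "throughput_mbps": 8.0, "sinr_db": 12.0, "prb_share": 0.4},
    {"cell": "A", "throughput_mbps": 2.0, "sinr_db": 1.0, "prb_share": 0.4},
    {"cell": "B", "throughput_mbps": 1.0, "sinr_db": 0.0, "prb_share": 0.5},
    {"cell": "B", "throughput_mbps": 2.0, "sinr_db": 1.0, "prb_share": 0.5},
    {"cell": "B", "throughput_mbps": 9.0, "sinr_db": 14.0, "prb_share": 0.5},
]
reference = [0.7, 0.1, 0.1, 0.1]  # assumed "normal" cause distribution
profiles = cell_profiles(sessions)
flagged = abnormal_cells(profiles, reference)
```

In this toy example only cell B is flagged, because its cause profile departs from the assumed reference. In practice the tree structure and thresholds would be derived from the operator's data and expert knowledge rather than fixed by hand as done here.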
1 Introduction
In recent years, wireless data services have
become the dominant traffic source in cellular networks.
Behind this, there is an expansion of new mobile applications and a rapid growth in the number of subscribers,
both motivated by the advances in cellular communication technologies and the development of user-friendly
smartphones. According to a large network vendor [1],
global mobile data traffic grew 69% in 2014 while the
average smartphone usage grew 45% in the same year.
This enormous increase in data traffic has forced operators not only to invest large amounts of money in new
infrastructure but also to reduce operational expenditures (OPEX) in order to maintain the levels of user
satisfaction.
*Correspondence:
1 Communications Engineering Dept., University of Málaga, Málaga, Spain
Full list of author information is available at the end of the article
To produce significant cost savings, one of the solutions adopted
by standardization bodies was the creation of
Self-Organizing Networks (SONs) [2], which provide
a new concept of network management where the maintenance and optimization tasks are carried out mostly
in an automated way. Typically, technical experts in
these fields have to deal with hundreds of traffic measurements and performance indicators every day [3, 4].
The vast diversity and quantity of these metrics makes
the operational work very complex. Thus, the use of
automated techniques for cellular traffic data analysis is
essential to reduce human effort while expertise can be
focused on new areas, bringing additional value to the
operator [5].
© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made.
Traditionally, mobile operators focused on
providing good quality for the voice service, since it
was the main service offered. To ensure this Quality-of-Service (QoS), troubleshooting experts mainly monitored
the call blocking and dropping rates at the cell level to
measure the levels of accessibility and retainability, respectively, in the network. However, with the explosion of
Internet services, the QoS of multimedia and data applications is given by the data rates experienced by the users,
where integrity metrics such as throughput and latency
are essential traffic measurements [4]. The problem with
throughput performance indicators is that they are often
difficult to interpret because of their dependence on many
factors. In particular, there are some aspects beyond the
typical variables related to the radio environment (e.g.,
distance to base station, cell loading, user speed, etc.) that
should be considered. First, unlike in traditional voice services, the mobile network is only one segment of the end-to-end connection in an IP world. For example, a router in
the IP cloud that suffers congestion may influence the user-perceived
data rate. Second, recent radio access technologies (RATs) such as Long-Term Evolution (LTE) have
included a class-based QoS model as a mechanism to differentiate between services, establishing various levels of
service to the users. Third, the traffic pattern of new data
services clearly impacts throughput measurements. Due
to the increasing popularity of web navigation, streaming video, social networking, file sharing, online gaming,
and other data services, there are significant differences
in traffic patterns [6]. As a consequence, operators are
investing a large amount of money to investigate traffic
modeling and classification through packet inspection in
order to better understand the characteristics of today’s
cellular data traffic. In addition, sophisticated traffic data
filtering, processing, and correlation with other network
metrics are also important features to identify root causes
of any detected anomaly and increase the reliability of the
network [7, 8].
The increasing complexity of network infrastructure
and services has also led operators to be interested in
managing performance at the user level, instead of the
cell or network level, with the aim of maintaining their
competitiveness. Today’s solutions based on per-cell performance counters are insufficient to perform
adequate root-cause analysis. For this reason, the standardization bodies have proposed the use of user-centric
indicators and call traces to support the optimization and
troubleshooting processes [9]. With the Minimization of
Drive Tests (MDT) described in [10], the collection of traffic measurements can be done in an autonomous manner.
In other words, each device that is active in (...truncated)