Implementation of real-time energy management strategy based on reinforcement learning for hybrid electric vehicles and simulation validation

PLOS ONE, July 2017

To further improve the fuel economy of series hybrid electric tracked vehicles, a reinforcement learning (RL)-based real-time energy management strategy is developed in this paper. To exploit the statistical characteristics of the online driving schedule effectively, a recursive algorithm for the transition probability matrix (TPM) of the power request is derived. RL is applied to compute and update the control policy at regular intervals, adapting to varying driving conditions. A forward-facing powertrain model is built in detail, including the engine-generator model, battery model and vehicle dynamics model. The robustness and adaptability of the real-time energy management strategy are validated in simulation through comparison with a stationary control strategy based on an initial TPM generated from a long naturalistic driving cycle. Results indicate that the proposed method achieves better fuel economy than the stationary one and is more effective for real-time control.
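The abstract describes a recursive update of the power-request TPM so that the learned policy can track changing driving conditions. The paper's exact recursion is not visible in this preview, so the following Python sketch only illustrates one common form of such an update, using transition counts with an exponential forgetting factor; the class name, the forgetting-factor value and the Laplace prior are assumptions, not the authors' algorithm:

```python
import numpy as np

class RecursiveTPM:
    """Recursive estimate of a power-request transition probability matrix.

    Illustrative sketch: states are discretized power-request levels, and
    `forgetting` (< 1) down-weights old observations so the TPM adapts to
    changing driving conditions.
    """

    def __init__(self, n_states, forgetting=0.99):
        # Start from uniform pseudo-counts (Laplace prior) to avoid zero rows.
        self.counts = np.ones((n_states, n_states))
        self.forgetting = forgetting

    def update(self, prev_state, next_state):
        # Exponentially forget old transitions out of prev_state,
        # then count the newly observed transition.
        self.counts[prev_state] *= self.forgetting
        self.counts[prev_state, next_state] += 1.0

    @property
    def tpm(self):
        # Row-normalize counts into transition probabilities.
        return self.counts / self.counts.sum(axis=1, keepdims=True)
```

After each sampled transition, the control policy could be recomputed from `tpm` at a fixed interval, which is the regular-update scheme the abstract describes.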

Full text PDF:

https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0180491&type=printable


Zehui Kong, Yuan Zou, Teng Liu

National Engineering Laboratory for Electric Vehicles, School of Mechanical Engineering, Beijing Institute of Technology, Beijing, China

Funding: Nature Science Foundation of China (Grant 51375044), University 111 Project (B12022), and Defense Basic Research Project (B20132010)

Editor: Xiaosong Hu, Chongqing University, China

Data Availability Statement: All relevant data are within the paper and its Supporting Information files.

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

Hybrid electric vehicles (HEVs) are developing rapidly as a response to the depletion of fossil fuels and severe air pollution.
Due to the cooperation of a battery pack and an internal combustion engine, the powertrain allows the engine to avoid operating at low load with poor efficiency, so fuel economy and emissions can be improved significantly. However, the flexibility of the power split also makes the energy management problem more challenging. The energy management strategy (EMS) plays a crucial role in the trade-off among performance, fuel economy and emissions of HEVs. Numerous studies have been conducted on the energy management of HEVs [1]. Generally, energy management strategies for HEVs are classified into rule-based and optimization-based control strategies [2,3]. Rule-based strategies are widely used in practice due to their straightforward implementation and high computational efficiency. Jalil proposed a rule-based energy management strategy that determines the power split between the engine and battery by setting thresholds [4]. Trovão presented a new rule-based energy management strategy integrating meta-heuristic optimization for a multilevel EMS in an electric vehicle [5]. To further improve the performance of the energy management system, an adaptive fuzzy logic controller was used to calibrate the operating points and key parameters to minimize fuel consumption according to the driving cycles [6]. However, the performance of any rule-based strategy is highly dependent on the proper design of the control rules, which usually relies on engineering experience. Therefore, many researchers have devoted more effort to optimization-based energy strategies. With a priori knowledge of the driving cycle, dynamic programming (DP) yields an optimal result and determines the best fuel economy. However, the real-time and robust performance of this strategy cannot be guaranteed [7]. Instead, DP is implemented offline and serves as a benchmark to explore the potential of fuel economy [8].
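The threshold-based power split mentioned above can be made concrete with a minimal sketch. Note that the SOC bounds, the engine's efficient operating point and the branch logic below are hypothetical illustrations of the general idea, not the rules from [4]:

```python
def rule_based_power_split(p_request, soc,
                           p_engine_opt=30.0,   # efficient engine power, kW (assumed)
                           soc_low=0.4, soc_high=0.8):
    """Toy threshold rule for an HEV power split.

    Returns (engine power, battery power) in kW such that their sum
    always meets the driver's power request.
    """
    if soc < soc_low:
        # Battery depleted: engine alone covers the demand
        # (at least at its efficient operating point).
        p_engine = max(p_request, p_engine_opt)
    elif soc > soc_high and p_request < p_engine_opt:
        # Battery full and demand low: run electric-only.
        p_engine = 0.0
    else:
        # Blended mode: keep the engine near its efficient point.
        p_engine = min(p_request, p_engine_opt)
    p_batt = p_request - p_engine  # negative value means battery charging
    return p_engine, p_batt
```

The appeal of such rules is their negligible computational cost; the drawback, as noted above, is that the thresholds must be hand-tuned for each vehicle and drive cycle.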
To make online optimization possible, the equivalent consumption minimization strategy (ECMS) and model predictive control (MPC) have been adopted for energy management [9,10]. ECMS is based on the assumption that the variation of the battery state of charge (SOC) is negligible due to its slow dynamics compared with the other dynamics in an HEV [11]. The equivalence factor of ECMS has an important effect on the control performance; however, its optimal value must be determined offline for a specific cycle [12]. MPC is a promising method for dynamic models due to its prediction ability over a finite future time horizon. An MPC-based strategy was developed by predicting the road slope; the results show that the method not only maintains the battery SOC within its boundary, but also achieves better fuel economy [13]. Pontryagin's minimum principle (PMP) has been used to find the optimal energy management strategy by combining power prediction with traffic information, such as the maximum acceleration, average velocity and maximum velocity [14]. However, the performance of MPC depends on the predicti (...truncated)
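The core of ECMS described above is to minimize, at each instant, an equivalent fuel cost that charges battery power at a fuel rate via the equivalence factor s: cost = m_fuel(P_eng) + s * P_batt / Q_lhv. A minimal sketch follows; the equivalence factor, the quadratic fuel-rate model in the test, and the candidate grid are illustrative assumptions, not values from the paper:

```python
def ecms_split(p_request, candidates, fuel_rate, s=2.5, q_lhv=42.5e6):
    """Pick the engine power minimizing instantaneous equivalent fuel cost.

    p_request : requested power, W
    candidates: engine-power candidates, W
    fuel_rate : function mapping engine power (W) -> fuel mass flow (kg/s)
    s         : dimensionless equivalence factor (assumed value)
    q_lhv     : lower heating value of the fuel, J/kg
    """
    best, best_cost = None, float("inf")
    for p_eng in candidates:
        p_batt = p_request - p_eng           # battery covers the remainder
        # Equivalent cost: actual fuel plus fuel-equivalent of battery power
        cost = fuel_rate(p_eng) + s * p_batt / q_lhv
        if cost < best_cost:
            best, best_cost = p_eng, cost
    return best
```

This makes the limitation cited above tangible: the chosen split depends directly on s, which is why the optimal equivalence factor must be calibrated offline for a given cycle [12].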



Zehui Kong, Yuan Zou, Teng Liu. Implementation of real-time energy management strategy based on reinforcement learning for hybrid electric vehicles and simulation validation, PLOS ONE, 2017, Volume 12, Issue 7, DOI: 10.1371/journal.pone.0180491