Submission 174
Optimisation of trading for renewable energy systems in combination with battery storage systems using reinforcement learning
WISO25-174
Presented by: Jonas Petzschmann
Summary
Trading electricity from renewable sources is challenging for market participants because bids must be placed before delivery, while the future generation of the renewable energy plants is still uncertain. Battery storage systems can reduce this uncertainty by compensating for the forecast errors of renewable energy plants, and they can also be used for arbitrage trading. In this work, reinforcement learning (RL) is used to determine the optimal trading strategy for battery storage systems on the short-term energy markets. Artificial neural networks trained with the PGPE learning algorithm serve as function approximators. The results show that the RL approach generates 7.3% higher profits than equation-based optimisation by reducing the costs of cyclic ageing by about 90%.
Methodology
RL is used to determine the optimal trading strategy for a system pool consisting of renewable energy systems and flexible battery storage systems on the short-term markets. On the one hand, the battery storage systems are used for arbitrage trading. On the other hand, they are also activated for short-term balancing by providing or absorbing energy at short notice to mitigate forecast errors.
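The balancing role described above can be illustrated with a minimal sketch. It is not the model used in this work: the `Battery` class, the function name, the 15-minute interval and the lossless charging are all assumptions made for illustration. The battery covers as much of the gap between committed and generated energy as its power and energy limits allow, and whatever remains is the imbalance that would be settled.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    """Hypothetical lossless battery model (efficiency losses are ignored)."""
    capacity_mwh: float
    max_power_mw: float
    soc_mwh: float


def balance_forecast_error(battery, committed_mwh, generated_mwh, dt_h=0.25):
    """Cover the forecast error of one delivery interval with the battery.

    Returns (battery_energy_mwh, residual_imbalance_mwh).
    Positive battery energy means discharging, negative means charging.
    A positive residual is a shortfall, a negative residual is a surplus.
    """
    gap = committed_mwh - generated_mwh          # > 0: plants delivered too little
    energy_limit = battery.max_power_mw * dt_h   # most energy movable in one interval
    if gap > 0:
        # Discharge, limited by power rating and stored energy.
        delivered = min(gap, energy_limit, battery.soc_mwh)
    else:
        # Charge, limited by power rating and free capacity.
        headroom = battery.capacity_mwh - battery.soc_mwh
        delivered = -min(-gap, energy_limit, headroom)
    battery.soc_mwh -= delivered
    return delivered, gap - delivered


battery = Battery(capacity_mwh=10.0, max_power_mw=4.0, soc_mwh=5.0)
shortfall_case = balance_forecast_error(battery, committed_mwh=3.0, generated_mwh=2.0)
surplus_case = balance_forecast_error(battery, committed_mwh=2.0, generated_mwh=4.0)
```

In the first call the battery can cover the whole shortfall. In the second call the surplus exceeds what the battery can absorb in one interval, so part of it remains as imbalance.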
The latest status and forecast data of the plants are used to determine the optimal dispatch of the plants and, from it, the bids on the day-ahead and intraday markets. As shown in Figure 1, policy gradients with parameter-based exploration (PGPE) serve as the learning algorithm, and artificial neural networks are used for function approximation. Cyclic ageing is determined with the rainflow counting algorithm. The results are compared with an equation-based optimisation approach (EB).
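To make the learning step more concrete, the following is a minimal sketch of PGPE with symmetric sampling, assuming NumPy. It is not the implementation used in this work. The parameter vector stands in for the network weights, and the toy quadratic `toy_reward` stands in for the simulated trading profit. The function name, learning rates, population size and batch-mean baseline are illustrative assumptions.

```python
import numpy as np


def pgpe(reward_fn, dim, iterations=300, pairs=20,
         lr_mu=0.1, lr_sigma=0.02, seed=0):
    """Sketch of PGPE with symmetric sampling (all hyperparameters are assumptions).

    Parameters are drawn from N(mu, sigma^2). Exploration happens in parameter
    space, so each candidate is evaluated once per episode as a black box.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    sigma = np.ones(dim)
    for _ in range(iterations):
        # Symmetric perturbations: evaluate mu + eps and mu - eps.
        eps = rng.normal(0.0, sigma, size=(pairs, dim))
        r_plus = np.array([reward_fn(mu + e) for e in eps])
        r_minus = np.array([reward_fn(mu - e) for e in eps])
        # Baseline: mean reward of the current batch.
        baseline = np.mean(np.concatenate([r_plus, r_minus]))
        # Mean update follows the reward difference of each pair.
        grad_mu = np.mean(((r_plus - r_minus) / 2.0)[:, None] * eps, axis=0)
        # Sigma update follows the pair's mean reward relative to the baseline.
        advantage = (r_plus + r_minus) / 2.0 - baseline
        grad_sigma = np.mean(
            advantage[:, None] * (eps ** 2 - sigma ** 2) / sigma, axis=0)
        mu = mu + lr_mu * grad_mu
        sigma = np.maximum(sigma + lr_sigma * grad_sigma, 1e-3)  # keep sigma positive
    return mu, sigma


# Toy objective (hypothetical): the reward peaks at `target`.
target = np.array([1.0, -2.0])


def toy_reward(theta):
    return -np.sum((theta - target) ** 2)


mu, sigma = pgpe(toy_reward, dim=2)
```

Because the update only needs the total reward of each sampled parameter vector, the same loop would apply when `reward_fn` runs a full trading simulation with a neural network policy. That is the property that makes parameter-based exploration convenient for episodic problems.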
Results
This work demonstrates that RL optimisation of storage utilisation achieves higher profits than established methods by extending the lifetime of the battery storage system. As shown in Figure 2, RL optimisation results in higher imbalance settlement costs than EB optimisation because of greater deviations between the energy offered and the energy physically delivered. At the same time, the cyclic ageing costs of RL-optimised storage use are only 11% of those incurred with EB optimisation. Overall, RL optimisation achieves 7.31% higher profits than EB optimisation.
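The cyclic ageing costs compared here depend on how charge and discharge cycles are counted in the storage profile. The following sketch shows three-point rainflow counting in plain Python under stated assumptions: the function names are hypothetical, and the mapping from counted cycles to ageing costs is not specified in this abstract, so only the counting and a simple equivalent-full-cycle sum are shown.

```python
from collections import defaultdict


def reversals(series):
    """Return the turning points of a non-empty series (first and last included)."""
    points = [series[0]]
    for x in series[1:]:
        if x == points[-1]:
            continue  # skip plateaus
        if len(points) >= 2 and (points[-1] - points[-2]) * (x - points[-1]) > 0:
            points[-1] = x  # still moving in the same direction
        else:
            points.append(x)
    return points


def count_cycles(series):
    """Three-point rainflow counting. Returns {cycle depth: number of cycles}."""
    counts = defaultdict(float)
    stack = []
    for point in reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            if len(stack) == 3:
                # The previous range contains the start point: half cycle.
                counts[y] += 0.5
                stack.pop(0)
            else:
                # The previous range is enclosed: full cycle, drop its two points.
                counts[y] += 1.0
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Ranges left on the stack count as half cycles.
    for a, b in zip(stack, stack[1:]):
        counts[abs(b - a)] += 0.5
    return dict(counts)


def equivalent_full_cycles(soc_profile):
    """Sum of depth times count (assumes the state of charge is a fraction in [0, 1])."""
    return sum(depth * n for depth, n in count_cycles(soc_profile).items())


# Illustrative load sequence often used to demonstrate rainflow counting.
example = [-2, 1, -3, 5, -1, 3, -4, 4, -2]
cycles = count_cycles(example)
```

A strategy that avoids many deep cycles yields fewer and shallower counted cycles, which is consistent with lower cyclic ageing costs.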