On-line reinforcement learning for optimization of real-life energy trading strategy