Modern Reinforcement Learning: Actor-Critic Algorithms