TOPPO: Rethinking PPO for Multi-Task Reinforcement Learning with Critic Balancing