Exploring the flavor structure of quarks and leptons with reinforcement learning