Multi-criteria Hardware Trojan Detection: A Reinforcement Learning Approach