Representative Action Selection for Large Action Space: From Bandits to MDPs