q-exponential family for policy optimization