Efficient and Sharp Off-Policy Learning under Unobserved Confounding