Confident Approximate Policy Iteration for Efficient Local Planning in $q^\pi$-realizable MDPs