AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization