From Demonstrations to Rewards: Alignment Without Explicit Human Preferences