How Much of Your Data Can Suck? Thresholds for Domain Performance and Emergent Misalignment in LLMs