When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering