It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance