Assessing AI system performance: thinking beyond models to deployment contexts - Microsoft Research
AI systems are becoming increasingly complex as we move from visionary research to deployable technologies such as self-driving cars, clinical predictive models, and novel accessibility devices. Compared with singular AI models, it is more difficult to assess whether these complex AI systems are performing consistently and as intended to realize human benefit. How do we know when these more advanced systems are 'good enough' for their intended use?

When assessing the performance of AI models, we often rely on aggregate performance metrics like percentage of accuracy. But this ignores the many, often human, elements that make up an AI system. Our research on what it takes to build forward-looking, inclusive AI experiences has demonstrated that getting to 'good enough' requires multiple performance assessment approaches at different stages of the development lifecycle, based upon realistic data and key user needs (figure 1).
September 28, 2022