Risk-graded Safety for Handling Medical Queries in Conversational AI
Gavin Abercrombie, Verena Rieser
arXiv.org Artificial Intelligence
Conversational AI systems can behave unsafely when handling users' medical queries, with consequences that can be severe and could even lead to death. Systems therefore need to be capable of both recognising the seriousness of medical inputs and producing responses with appropriate levels of risk. We create a corpus of human-written English-language medical queries and the responses of different types of systems, and label it with both crowdsourced and expert annotations. While individual crowdworkers may be unreliable at grading the seriousness of the prompts, their aggregated labels tend to agree with professional opinion to a greater extent on identifying the medical queries and recognising the risk types posed by the responses. Results of classification experiments suggest that, while these tasks can be automated, caution should be exercised, as errors can potentially be very serious.
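The abstract reports that aggregated crowd labels tend to agree with expert opinion. A minimal sketch of how such a comparison is commonly done is shown below: per-item crowd annotations are aggregated by majority vote and chance-corrected agreement with expert labels is measured with Cohen's kappa. The data, the binary label scheme, and the use of scikit-learn are illustrative assumptions, not the paper's actual method or risk taxonomy.

```python
# Hypothetical sketch: aggregate per-query crowd labels by majority vote and
# measure agreement with expert labels using Cohen's kappa.
from collections import Counter
from sklearn.metrics import cohen_kappa_score

# Toy data: each query has several crowd annotations and one expert label.
# The binary scheme (0 = non-serious, 1 = serious) is illustrative only.
crowd_labels = {
    "q1": [1, 1, 0, 1, 1],
    "q2": [0, 0, 1, 0, 0],
    "q3": [1, 0, 1, 1, 0],
}
expert_labels = {"q1": 1, "q2": 0, "q3": 1}

def majority_vote(labels):
    """Return the most frequent label among a list of crowd annotations."""
    return Counter(labels).most_common(1)[0][0]

queries = sorted(crowd_labels)
aggregated = [majority_vote(crowd_labels[q]) for q in queries]
expert = [expert_labels[q] for q in queries]

# Chance-corrected agreement between aggregated crowd and expert labels.
print("Cohen's kappa:", cohen_kappa_score(aggregated, expert))
```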
Oct-2-2022