Collins, Gary
Handling Cost and Constraints with Off-Policy Deep Reinforcement Learning
Markowitz, Jared, Silverberg, Jesse, Collins, Gary
By reusing data throughout training, off-policy deep reinforcement learning algorithms offer improved sample efficiency relative to on-policy approaches. For continuous action spaces, the most popular methods for off-policy learning include policy improvement steps where a learned state-action ($Q$) value function is maximized over selected batches of data. These updates are often paired with regularization to combat associated overestimation of $Q$ values. With an eye toward safety, we revisit this strategy in environments with "mixed-sign" reward functions; that is, with reward functions that include independent positive (incentive) and negative (cost) terms. This setting is common in real-world applications, and may be addressed with or without constraints on the cost terms. We find the combination of function approximation and a term that maximizes $Q$ in the policy update to be problematic in such environments, because systematic errors in value estimation impact the contributions from the competing terms asymmetrically. This results in overemphasis of either incentives or costs and may severely limit learning. We explore two remedies to this issue. First, consistent with prior work, we find that periodic resetting of $Q$ and policy networks can be used to reduce value estimation error and improve learning in this setting. Second, we formulate novel off-policy actor-critic methods for both unconstrained and constrained learning that do not explicitly maximize $Q$ in the policy update. We find that this second approach, when applied to continuous action spaces with mixed-sign rewards, consistently and significantly outperforms state-of-the-art methods augmented by resetting. We further find that our approach produces agents that are both competitive with popular methods overall and more reliably competent on frequently-studied control problems that do not have mixed-sign rewards.
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
Lekadir, Karim, Feragen, Aasa, Fofanah, Abdul Joseph, Frangi, Alejandro F, Buyx, Alena, Emelie, Anais, Lara, Andrea, Porras, Antonio R, Chan, An-Wen, Navarro, Arcadi, Glocker, Ben, Botwe, Benard O, Khanal, Bishesh, Beger, Brigit, Wu, Carol C, Cintas, Celia, Langlotz, Curtis P, Rueckert, Daniel, Mzurikwao, Deogratias, Fotiadis, Dimitrios I, Zhussupov, Doszhan, Ferrante, Enzo, Meijering, Erik, Weicken, Eva, González, Fabio A, Asselbergs, Folkert W, Prior, Fred, Krestin, Gabriel P, Collins, Gary, Tegenaw, Geletaw S, Kaissis, Georgios, Misuraca, Gianluca, Tsakou, Gianna, Dwivedi, Girish, Kondylakis, Haridimos, Jayakody, Harsha, Woodruf, Henry C, Aerts, Hugo JWL, Walsh, Ian, Chouvarda, Ioanna, Buvat, Irène, Rekik, Islem, Duncan, James, Kalpathy-Cramer, Jayashree, Zahir, Jihad, Park, Jinah, Mongan, John, Gichoya, Judy W, Schnabel, Julia A, Kushibar, Kaisar, Riklund, Katrine, Mori, Kensaku, Marias, Kostas, Amugongo, Lameck M, Fromont, Lauren A, Maier-Hein, Lena, Alberich, Leonor Cerdá, Rittner, Leticia, Phiri, Lighton, Marrakchi-Kacem, Linda, Donoso-Bach, Lluís, Martí-Bonmatí, Luis, Cardoso, M Jorge, Bobowicz, Maciej, Shabani, Mahsa, Tsiknakis, Manolis, Zuluaga, Maria A, Bielikova, Maria, Fritzsche, Marie-Christine, Linguraru, Marius George, Wenzel, Markus, De Bruijne, Marleen, Tolsgaard, Martin G, Ghassemi, Marzyeh, Ashrafuzzaman, Md, Goisauf, Melanie, Yaqub, Mohammad, Ammar, Mohammed, Abadía, Mónica Cano, Mahmoud, Mukhtar M E, Elattar, Mustafa, Rieke, Nicola, Papanikolaou, Nikolaos, Lazrak, Noussair, Díaz, Oliver, Salvado, Olivier, Pujol, Oriol, Sall, Ousmane, Guevara, Pamela, Gordebeke, Peter, Lambin, Philippe, Brown, Pieta, Abolmaesumi, Purang, Dou, Qi, Lu, Qinghua, Osuala, Richard, Nakasi, Rose, Zhou, S Kevin, Napel, Sandy, Colantonio, Sara, Albarqouni, Shadi, Joshi, Smriti, Carter, Stacy, Klein, Stefan, Petersen, Steffen E, Aussó, Susanna, Awate, Suyash, Raviv, Tammy Riklin, Cook, Tessa, Mutsvangwa, Tinashe E M, Rogers, Wendy A, Niessen, Wiro J, Puig-Bosch, Xènia, Zeng, Yi, Mohammed, Yunusa G, Aquino, Yves Saint James, Salahuddin, Zohaib, Starmans, Martijn P A
Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI.