Last week, the U.S. Food and Drug Administration released the agency's first Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. The plan describes a multi-pronged approach to the agency's oversight of AI/ML-based medical software and responds to stakeholder feedback on the FDA's 2019 regulatory framework for AI- and ML-based medical products. The FDA will also hold a public workshop on algorithm transparency and engage its stakeholders and partners on other key initiatives, such as evaluating bias in algorithms. While the Action Plan lays out a roadmap for advancing a regulatory framework, an operational framework appears to be further down the road.
Machine learning algorithms in healthcare have the potential to continually learn from real-world data generated during healthcare delivery and adapt to dataset shifts. As such, the FDA is looking to design policies that can autonomously approve modifications to machine learning algorithms while maintaining or improving the safety and effectiveness of the deployed models. However, selecting a fixed approval strategy, a priori, can be difficult because its performance depends on the stationarity of the data and the quality of the proposed modifications. To this end, we investigate a learning-to-approve approach (L2A) that uses accumulating monitoring data to learn how to approve modifications. L2A defines a family of strategies that vary in their "optimism," where more optimistic policies have faster approval rates, and searches over this family using an exponentially weighted average forecaster. To control the cumulative risk of the deployed model, we give L2A the option to abstain from making a prediction and incur some fixed abstention cost instead. We derive bounds on the average risk of the model deployed by L2A, assuming the distributional shifts are smooth. In simulation studies and empirical analyses, L2A tailors the level of optimism for each problem setting: it learns to abstain when performance drops are common and approve beneficial modifications quickly when the distribution is stable.
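As a rough illustration of the search step, the sketch below runs a generic exponentially weighted average forecaster over three candidate strategies. It is not the L2A implementation: the strategy names, the per-round losses, the fixed abstention cost of 0.3, and the learning rate `eta` are all hypothetical values chosen for the example.

```python
import math

def exp_weighted_forecaster(loss_rounds, eta):
    """Generic exponentially weighted average forecaster.

    loss_rounds: list of rounds; each round is a list with one loss in [0, 1]
                 per candidate strategy.
    eta:         learning rate (hypothetical value, not taken from the source).
    Returns the weight vector used in each round and the final cumulative losses.
    """
    n = len(loss_rounds[0])
    cum_loss = [0.0] * n
    history = []
    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss so far).
        # Subtracting the minimum keeps the exponentials numerically stable.
        base = min(cum_loss)
        raw = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(raw)
        history.append([r / total for r in raw])
        # The losses are revealed only after the weights have been chosen.
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
    return history, cum_loss

# Hypothetical strategies: always abstain (fixed cost), conservative, optimistic.
strategies = ["abstain", "conservative", "optimistic"]
rounds = [[0.3, 0.1, 0.6] for _ in range(20)]  # synthetic, illustrative losses
history, cum_loss = exp_weighted_forecaster(rounds, eta=0.5)
final_weights = dict(zip(strategies, history[-1]))
```

In this toy setting the weight shifts toward whichever strategy has accumulated the smallest loss; if the optimistic strategy's losses were lower, it would gain weight instead.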
I frequently emphasize the importance of data in the U.S. Food and Drug Administration's work as a science-based regulatory agency, and the need to "unleash the power of data" through sophisticated mechanisms for collection, review and analysis so that it may become preventive, action-oriented information. As one example of this commitment, I would like to tell you about cross-cutting work the agency is undertaking to leverage our use of artificial intelligence (AI) as part of the FDA's New Era of Smarter Food Safety initiative. This work promises to equip the FDA with important new ways to apply available data sources to strengthen our public health mission. The ultimate goal is to see if AI can improve our ability to quickly and efficiently identify products that may pose a threat to public health. One area in which the FDA is assessing the use of AI is in the screening of imported foods.
An app designed for Apple Watch has received approval from the Food and Drug Administration (FDA) as a treatment for nightmares caused by post-traumatic stress disorder (PTSD). Called NightWare, the application is now marketed as an aid for the 'temporary reduction of sleep disturbances related to nightmares in adults.' The app uses Apple Watch sensors to monitor body movement and sleep, and when it detects that the user is experiencing a nightmare, the watch vibrates to disturb their sleep. NightWare is currently available only with a prescription, and the company stresses that it is not a standalone treatment for PTSD. Approximately eight million Americans suffer from PTSD, and up to 96 percent of them have nightmares as a result.
AI algorithms in medical devices that are not properly trained on data can end up harming patients, experts told the FDA. The federal agency held a nearly seven-hour patient engagement meeting on the use of artificial intelligence in healthcare Oct. 22, in which experts addressed the public's questions about machine learning in medical devices. Experts and executives in the fields of medicine, regulation, technology and public health discussed the composition of the datasets that train AI-based medical devices. A lack of transparency surrounding the datasets that train algorithms can lead to public mistrust in AI-powered medical tools, as these devices may not have been trained using patient data that accurately represents the individuals they will be treating. During the meeting, Center for Devices and Radiological Health Director Jeffrey Shuren, MD, noted that 562 AI-powered medical devices have received FDA emergency use authorization and pointed out that all patients should be considered when these devices are being developed and regulated.
The environments in which we deploy machine learning (ML) algorithms rarely look exactly like the environments in which we collected our training data. Unfortunately, we lack methodology for evaluating how well an algorithm will generalize to new environments that differ in a structured way from the training data (i.e., the case of dataset shift (Quiñonero-Candela et al., 2009)). Such methodology is increasingly important as ML systems are being deployed across a number of industries, such as health care and personal finance, in which system performance translates directly to real-world outcomes. Further, as regulation and product reviews become more common across industries, system developers will be expected to produce evidence of the validity and safety of their systems. For example, the United States Food and Drug Administration (FDA) currently regulates ML systems for medical applications, requiring evidence for the validity of such systems before approval is granted (US Food and Drug Administration, 2019). Evaluation methods for assessing model validity have typically focused on how the model performs on data from the training distribution, known as internal validity. Powerful tools, such as cross-validation and the bootstrap, rely on the assumption that the training and test data are drawn from the same distribution. However, these validation methods do not capture a model's ability to generalize to new environments, known as external validity (Campbell and Stanley, 1963). Currently, the main way to assess a model's external validity is to empirically evaluate performance on multiple, independently collected datasets (e.g.,
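The gap between internal and external validity can be illustrated with a small synthetic experiment. The sketch below is not drawn from the excerpt: the one-dimensional Gaussian data, the threshold classifier, and the shift of 1.5 applied to the "external" environment are all assumptions made for illustration.

```python
import random

def sample(rng, n, shift=0.0):
    """Draw n (feature, label) pairs; `shift` offsets the feature to mimic a new environment."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(2.0 * y + shift, 1.0)
        data.append((x, y))
    return data

def fit_threshold(data):
    """Place the decision threshold midway between the two class means."""
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0

def accuracy(data, threshold):
    """Fraction of points for which 'x > threshold' matches the label."""
    return sum((x > threshold) == (y == 1) for x, y in data) / len(data)

rng = random.Random(0)
train = sample(rng, 2000)
internal_test = sample(rng, 2000)             # same distribution as training
external_test = sample(rng, 2000, shift=1.5)  # hypothetical shifted environment

threshold = fit_threshold(train)
internal_acc = accuracy(internal_test, threshold)
external_acc = accuracy(external_test, threshold)
print(f"internal accuracy: {internal_acc:.3f}, external accuracy: {external_acc:.3f}")
```

The held-out test from the training distribution gives no warning of the degradation that appears in the shifted environment, which is why an evaluation on independently collected data is treated as a separate question.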
At a virtual meeting of the U.S. Food and Drug Administration's Center for Devices and Radiological Health and Patient Engagement Advisory Committee on Thursday, regulators offered updates and new discussion around medical devices and decision support powered by artificial intelligence. One of the topics on the agenda was how to strike a balance between safety and innovation with algorithms getting smarter and better trained by the day. In his discussion of AI and machine learning validation, Bakul Patel, director of the FDA's recently-launched Digital Health Center of Excellence, said he sees huge breakthroughs on the horizon. "This new technology is going to help us get to a different place and a better place," said Patel. You're seeing automated image diagnostics.
While AI and machine learning have the potential to transform healthcare, the technology has inherent biases that could negatively impact patient care, senior FDA officials and Philips' head of global software standards said at the meeting. Bakul Patel, director of FDA's new Digital Health Center of Excellence, acknowledged significant challenges to AI/ML adoption, including bias and the lack of large, high-quality and well-curated datasets. "There are some constraints because of just location or the amount of information available and the cleanliness of the data might drive inherent bias. We don't want to set up a system and we would not want to figure out after the product is out in the market that it is missing a certain type of population or demographic or other aspects that we would have accidentally not realized," Patel said. Pat Baird, Philips' head of global software standards, warned that without proper context there will be "improper use" of AI/ML-based devices that deliver "incorrect conclusions" as part of clinical decision support.
The U.S. Food and Drug Administration on Thursday convened a public meeting of its Patient Engagement Advisory Committee to discuss issues regarding artificial intelligence and machine learning in medical devices. "Devices using AI and ML technology will transform healthcare delivery by increasing efficiency in key processes in the treatment of patients," said Dr. Paul Conway, PEAC chair and chair of policy and global affairs of the American Association of Kidney Patients. As Conway and others noted during the panel, AI and ML systems may have algorithmic biases and lack transparency – potentially leading, in turn, to an undermining of patient trust in devices. Medical device innovation has already ramped up in response to the COVID-19 crisis, with Center for Devices and Radiological Health Director Dr. Jeff Shuren noting that 562 medical devices have already been granted emergency use authorization by the FDA. It's imperative, said Shuren, that patients' needs be considered as part of the creation process.