Just a few years ago, there were no legions of deep learning scientists developing intelligent products and services at major companies and startups. When the youngest among us (the authors) entered the field, machine learning did not command headlines in daily newspapers. Our parents had no idea what machine learning was, let alone why we might prefer it to a career in medicine or law. Machine learning was a forward-looking academic discipline with a narrow set of real-world applications. And those applications, e.g., speech recognition and computer vision, required so much domain knowledge that they were often regarded as separate areas entirely for which machine learning was one small component. Neural networks then, the antecedents of the deep learning models that we focus on in this book, were regarded as outmoded tools. In just the past five years, deep learning has taken the world by surprise, driving rapid progress in fields as diverse as computer vision, natural language processing, automatic speech recognition, reinforcement learning, and statistical modeling. With these advances in hand, we can now build cars that drive themselves with more autonomy than ever before (and less autonomy than some companies might have you believe), smart reply systems that automatically draft the most mundane emails, helping people dig out from oppressively large inboxes, and software agents that dominate the worldʼs best humans at board games like Go, a feat once thought to be decades away. Already, these tools exert ever-wider impacts on industry and society, changing the way movies are made, diseases are diagnosed, and playing a growing role in basic sciences--from astrophysics to biology.
AI, computer vision and machine learning systems proved that machines are better and faster than humans analyzing big data. Today, organizations have large datasets of patient data and insights about diseases through techniques like Genome Wide Association Studies (GWAS). Using AI, healthcare providers can analyze and interpret the available patient data more precisely for early diagnosis and better treatment. Today, it is possible to say whether a person has the chance to get cancer from a selfie using computer vision and machine learning to detect increased bilirubin levels in a person's sclera, the white part of the eye. As the interest in AI in the healthcare industry continues to grow, there are numerous current AI applications, and more use cases will emerge in the future.
It's probably no surprise that money is pouring into life sciences and healthcare startups during the biggest medical crisis in a century. CB Insights reported that global healthcare funding hit a new record $31.6 billion in this first quarter of 2021. It's also no shock that the two biggest trends – artificial intelligence and telehealth – also reaped record amounts of private cash. AI healthcare startups raised nearly $2.5 billion, while telehealth companies did even better by netting $4.2 billion in equity funding. That's the third consecutive quarter to hit record highs in both sectors dating back to Q3'20.
Gupta, Abhishek, Royer, Alexandrine, Wright, Connor, Khan, Falaah Arif, Heath, Victoria, Galinkin, Erick, Khurana, Ryan, Ganapini, Marianna Bergamaschi, Fancy, Muriam, Sweidan, Masa, Akif, Mo, Butalid, Renjie
The 3rd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in AI Ethics since October 2020. It aims to help anyone, from machine learning experts to human rights activists and policymakers, quickly digest and understand the field's ever-changing developments. Through research and article summaries, as well as expert commentary, this report distills the research and reporting surrounding various domains related to the ethics of AI, including: algorithmic injustice, discrimination, ethical AI, labor impacts, misinformation, privacy, risk and security, social media, and more. In addition, The State of AI Ethics includes exclusive content written by world-class AI Ethics experts from universities, research institutes, consulting firms, and governments. Unique to this report is "The Abuse and Misogynoir Playbook," written by Dr. Katlyn Tuner (Research Scientist, Space Enabled Research Group, MIT), Dr. Danielle Wood (Assistant Professor, Program in Media Arts and Sciences; Assistant Professor, Aeronautics and Astronautics; Lead, Space Enabled Research Group, MIT) and Dr. Catherine D'Ignazio (Assistant Professor, Urban Science and Planning; Director, Data + Feminism Lab, MIT). The piece (and accompanying infographic), is a deep-dive into the historical and systematic silencing, erasure, and revision of Black women's contributions to knowledge and scholarship in the United Stations, and globally. Exposing and countering this Playbook has become increasingly important following the firing of AI Ethics expert Dr. Timnit Gebru (and several of her supporters) at Google. This report should be used not only as a point of reference and insight on the latest thinking in the field of AI Ethics, but should also be used as a tool for introspection as we aim to foster a more nuanced conversation regarding the impacts of AI on the world.
In 2019, The Centers for Medicare and Medicaid Services (CMS) launched an Artificial Intelligence (AI) Health Outcomes Challenge seeking solutions to predict risk in value-based care for incorporation into CMS Innovation Center payment and service delivery models. Recently, modern language models have played key roles in a number of health related tasks. This paper presents, to the best of our knowledge, the first application of these models to patient readmission prediction. To facilitate this, we create a dataset of 1.2 million medical history samples derived from the Limited Dataset (LDS) issued by CMS. Moreover, we propose a comprehensive modeling solution centered on a deep learning framework for this data. To demonstrate the framework, we train an attention-based Transformer to learn Medicare semantics in support of performing downstream prediction tasks thereby achieving 0.91 AUC and 0.91 recall on readmission classification. We also introduce a novel data pre-processing pipeline and discuss pertinent deployment considerations surrounding model explainability and bias.
Patient monitoring is vital in all stages of care. We here report the development and validation of ICU length of stay and mortality prediction models. The models will be used in an intelligent ICU patient monitoring module of an Intelligent Remote Patient Monitoring (IRPM) framework that monitors the health status of patients, and generates timely alerts, maneuver guidance, or reports when adverse medical conditions are predicted. We utilized the publicly available Medical Information Mart for Intensive Care (MIMIC) database to extract ICU stay data for adult patients to build two prediction models: one for mortality prediction and another for ICU length of stay. For the mortality model, we applied six commonly used machine learning (ML) binary classification algorithms for predicting the discharge status (survived or not). For the length of stay model, we applied the same six ML algorithms for binary classification using the median patient population ICU stay of 2.64 days. For the regression-based classification, we used two ML algorithms for predicting the number of days. We built two variations of each prediction model: one using 12 baseline demographic and vital sign features, and the other based on our proposed quantiles approach, in which we use 21 extra features engineered from the baseline vital sign features, including their modified means, standard deviations, and quantile percentages. We could perform predictive modeling with minimal features while maintaining reasonable performance using the quantiles approach. The best accuracy achieved in the mortality model was approximately 89% using the random forest algorithm. The highest accuracy achieved in the length of stay model, based on the population median ICU stay (2.64 days), was approximately 65% using the random forest algorithm.
Machine learning modules can be trained with the use of electronic health record (EHR) data to differentiate between transient and persistent cases of early childhood asthma, according the results of an analysis published in PLoS One. Researchers conducted a retrospective cohort study using data derived from the Pediatric Big Data (PBD) resource at the Children's Hospital of Philadelphia (CHOP) -- a pediatric tertiary academic medical center located in Pennsylvania. The researchers sought to develop machine learning modules that could be used to identify individuals who were diagnosed with asthma at aged 5 years or younger whose symptoms will continue to persist and who will thus continue to experience asthma-related visits. They trained 5 machine learning modules to distinguish between individuals without any subsequent asthma-related visits (transient asthma diagnosis) from those who did experience asthma-related visits from 5 to 10 years of age (persistent asthma diagnosis), based on clinical information available in these children up to 5 years of age. The PBD resource used in the current study included data obtained from the CHOP Care Network -- a primary care network of more than 30 sites -- and from CHOP Specialty Care and Surgical Centers.
Algorithms produce a growing portion of decisions and recommendations both in policy and business. Such algorithmic decisions are natural experiments (conditionally quasi-randomly assigned instruments) since the algorithms make decisions based only on observable input variables. We use this observation to develop a treatment-effect estimator for a class of stochastic and deterministic algorithms. Our estimator is shown to be consistent and asymptotically normal for well-defined causal effects. A key special case of our estimator is a high-dimensional regression discontinuity design. The proofs use tools from differential geometry and geometric measure theory, which may be of independent interest. The practical performance of our method is first demonstrated in a high-dimensional simulation resembling decision-making by machine learning algorithms. Our estimator has smaller mean squared errors compared to alternative estimators. We finally apply our estimator to evaluate the effect of Coronavirus Aid, Relief, and Economic Security (CARES) Act, where more than \$10 billion worth of relief funding is allocated to hospitals via an algorithmic rule. The estimates suggest that the relief funding has little effects on COVID-19-related hospital activity levels. Naive OLS and IV estimates exhibit substantial selection bias.
Machine learning, artificial intelligence, and other modern statistical methods are providing new opportunities to operationalise previously untapped and rapidly growing sources of data for patient benefit. Despite much promising research currently being undertaken, particularly in imaging, the literature as a whole lacks transparency, clear reporting to facilitate replicability, exploration for potential ethical concerns, and clear demonstrations of effectiveness. Among the many reasons why these problems exist, one of the most important (for which we provide a preliminary solution here) is the current lack of best practice guidance specific to machine learning and artificial intelligence. However, we believe that interdisciplinary groups pursuing research and impact projects involving machine learning and artificial intelligence for health would benefit from explicitly addressing a series of questions concerning transparency, reproducibility, ethics, and effectiveness (TREE). The 20 critical questions proposed here provide a framework for research groups to inform the design, conduct, and reporting; for editors and peer reviewers to evaluate contributions to the literature; and for patients, clinicians and policy makers to critically appraise where new findings may deliver patient benefit. Machine learning (ML), artificial intelligence (AI), and other modern statistical methods are providing new opportunities to operationalise previously untapped and rapidly growing sources of data for patient benefit. The potential uses include improving diagnostic accuracy,1 more reliably predicting prognosis,2 targeting treatments,3 and increasing the operational efficiency of health systems.4 Examples of potentially disruptive technology with early promise include image based diagnostic applications of ML/AI, which have shown the most early clinical promise (eg, deep learning based algorithms improving accuracy in diagnosing retinal pathology compared with that of specialist physicians5), or natural language processing used as a tool to extract information from structured and unstructured (that is, free) text embedded in electronic health records.2 Although we are only just …
The potential presented by Artificial Intelligence (AI) for healthcare has long been recognised by the technical community. More recently, this potential has been recognised by policymakers, resulting in considerable public and private investment in the development of AI for healthcare across the globe. Despite this, excepting limited success stories, real-world implementation of AI systems into front-line healthcare has been limited. There are numerous reasons for this, but a main contributory factor is the lack of internationally accepted, or formalised, regulatory standards to assess AI safety and impact and effectiveness. This is a well-recognised problem with numerous ongoing research and policy projects to overcome it. Our intention here is to contribute to this problem-solving effort by seeking to set out a minimally viable framework for evaluating the safety, acceptability and efficacy of AI systems for healthcare. We do this by conducting a systematic search across Scopus, PubMed and Google Scholar to identify all the relevant literature published between January 1970 and November 2020 related to the evaluation of: output performance; efficacy; and real-world use of AI systems, and synthesising the key themes according to the stages of evaluation: pre-clinical (theoretical phase); exploratory phase; definitive phase; and post-market surveillance phase (monitoring). The result is a framework to guide AI system developers, policymakers, and regulators through a sufficient evaluation of an AI system designed for use in healthcare.