Plotting

Science


Contrapuntal gene risk

Science

Identifying functional genetic variation in humans requires sifting through hundreds of thousands of individual variants and linking them to the trait of interest. We often do not know whether a gene is functional in a given tissue or cell type. Machine-learning models have become valuable for such endeavors. Somepalli et al. developed a model they call FUGUE, which they used to map the tissue-specific expression of human disease–associated genes and their protein context and interactions. Interestingly, FUGUE revealed that tissue-relevant genes cluster on the genome within topologically associating domains. The authors supply prioritized lists of genes associated with heart disease, Alzheimer's disease, cancer, and development for 30 human tissues. PLoS Comput. Biol. 17, e1009194 (2021).


Accurate prediction of protein structures and interactions using a three-track neural network

Science

In 1972, Anfinsen won a Nobel prize for demonstrating a connection between a protein's amino acid sequence and its three-dimensional structure. Since 1994, scientists have competed in the biennial Critical Assessment of Structure Prediction (CASP) protein-folding challenge. Deep learning methods took center stage at CASP14, with DeepMind's AlphaFold2 achieving remarkable accuracy. Baek et al. explored network architectures based on the DeepMind framework. They used a three-track network to process sequence, distance, and coordinate information simultaneously and achieved accuracies approaching those of DeepMind. The method, RoseTTAFold, can solve challenging x-ray crystallography and cryo–electron microscopy modeling problems and generate accurate models of protein-protein complexes. Science, abj8754, this issue p. [871][1]

DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging x-ray crystallography and cryo–electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biological research.

[1]: /lookup/doi/10.1126/science.abj8754
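The core idea of a three-track network — 1D sequence features, a 2D pair/distance map, and 3D coordinates, each updated in alternating passes that mix in information from the other tracks — can be caricatured in a few lines of NumPy. The sketch below is a toy with invented shapes and update rules, not the actual RoseTTAFold architecture:

```python
import numpy as np

# Toy three-track update loop (illustrative only; all shapes and
# update rules here are invented, not taken from RoseTTAFold).
rng = np.random.default_rng(0)
L, d1, d2 = 10, 8, 4                        # residues, feature widths

seq_feats = rng.normal(size=(L, d1))        # 1D track: per-residue features
pair_feats = rng.normal(size=(L, L, d2))    # 2D track: pairwise features
coords = rng.normal(size=(L, 3))            # 3D track: coordinates

def update_pair_from_seq(seq, pair):
    # outer product of sequence features feeds the pair track
    outer = seq @ seq.T                     # (L, L) coupling matrix
    return pair + 0.1 * outer[..., None]

def update_seq_from_pair(seq, pair):
    # each residue receives a summary of its pair-track row
    row_summary = pair.mean(axis=(1, 2))    # (L,)
    return seq + 0.1 * row_summary[:, None]

def update_coords_from_pair(coords, pair):
    # nudge coordinates toward a weighted average of their neighbors
    weights = np.abs(pair.mean(axis=2)) + 1e-6        # (L, L)
    target = weights @ coords / weights.sum(axis=1, keepdims=True)
    return 0.9 * coords + 0.1 * target

for _ in range(3):                          # a few alternating passes
    pair_feats = update_pair_from_seq(seq_feats, pair_feats)
    seq_feats = update_seq_from_pair(seq_feats, pair_feats)
    coords = update_coords_from_pair(coords, pair_feats)

print(seq_feats.shape, pair_feats.shape, coords.shape)
```

The real network replaces these hand-written updates with learned attention layers trained end to end; the point here is only the flow of information among the three representations.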


Evolving threat

Science

New variants have changed the face of the pandemic. What will the virus do next?

[Figure omitted. Credits: (graphic) N. Desai/Science; (data) Nextstrain; GISAID]

Edward Holmes does not like making predictions, but last year he hazarded a few. Again and again, people had asked Holmes, an expert on viral evolution at the University of Sydney, how he expected SARS-CoV-2 to change. In May 2020, 5 months into the pandemic, he started to include a slide with his best guesses in his talks. The virus would probably evolve to avoid at least some human immunity, he suggested. But it would likely make people less sick over time, he said, and there would be little change in its infectivity. In short, it sounded like evolution would not play a major role in the pandemic's near future. “A year on I've been proven pretty much wrong on all of it,” Holmes says. Well, not all: SARS-CoV-2 did evolve to better avoid human antibodies. But it has also become a bit more virulent and a lot more infectious, causing more people to fall ill. That has had an enormous influence on the course of the pandemic. The Delta strain circulating now—one of four “variants of concern” identified by the World Health Organization, along with four “variants of interest”—is so radically different from the virus that appeared in Wuhan, China, in late 2019 that many countries have been forced to change their pandemic planning. Governments are scrambling to accelerate vaccination programs while prolonging or even reintroducing mask wearing and other public health measures. As to the goal of reaching herd immunity—vaccinating so many people that the virus simply has nowhere to go—“With the emergence of Delta, I realized that it's just impossible to reach that,” says Müge Çevik, an infectious disease specialist at the University of St. Andrews. Yet the most tumultuous period in SARS-CoV-2's evolution may still be ahead of us, says Aris Katzourakis, an evolutionary biologist at the University of Oxford.
There's now enough immunity in the human population to ratchet up an evolutionary competition, pressuring the virus to adapt further. At the same time, much of the world is still overwhelmed with infections, giving the virus plenty of chances to replicate and throw up new mutations. Predicting where those worrisome factors will lead is just as tricky as it was a year and a half ago, however. “We're much better at explaining the past than predicting the future,” says Andrew Read, an evolutionary biologist at Pennsylvania State University, University Park. Evolution, after all, is driven by random mutations, which are impossible to predict. “It's very, very tricky to know what's possible, until it happens,” Read says. “It's not physics. It doesn't happen on a billiard table.” Still, experience with other viruses gives evolutionary biologists some clues about where SARS-CoV-2 may be headed. The courses of past outbreaks show the coronavirus could well become even more infectious than Delta is now, Read says: “I think there's every expectation that this virus will continue to adapt to humans and will get better and better at us.” Far from making people less sick, it could also evolve to become even deadlier, as some previous viruses including the 1918 flu have. And although COVID-19 vaccines have held up well so far, history shows the virus could evolve further to elude their protective effect—although a recent study in another coronavirus suggests that could take many years, which would leave more time to adapt vaccines to the changing threat. Holmes himself uploaded one of the first SARS-CoV-2 genomes to the internet on 10 January 2020. Since then, more than 2 million genomes have been sequenced and published, painting an exquisitely detailed picture of a changing virus. “I don't think we've ever seen that level of precision in watching an evolutionary process,” Holmes says. Making sense of the endless stream of mutations is complicated. 
Each is just a tiny tweak in the instructions for how to make proteins. Which mutations end up spreading depends on how the viruses carrying those tweaked proteins fare in the real world. The vast majority of mutations give the virus no advantage at all, and identifying the ones that do is difficult. There are obvious candidates, such as mutations that change the part of the spike protein—which sits on the surface of the virus—that binds to human cells. But changes elsewhere in the genome may be just as crucial—yet are harder to interpret. Some genes' functions aren't even clear, let alone what a change in their sequence could mean. The impact of any one change on the virus' fitness also depends on other changes it has already accumulated. That means scientists need real-world data to see which variants appear to be taking off. Only then can they investigate, in cell cultures and animal experiments, what might explain that viral success. The most eye-popping change in SARS-CoV-2 so far has been its improved ability to spread between humans. At some point early in the pandemic, SARS-CoV-2 acquired a mutation called D614G that made it a bit more infectious. That version spread around the world; almost all current viruses are descended from it. Then in late 2020, scientists identified a new variant, now called Alpha, in patients in Kent, U.K., that was about 50% more transmissible. Delta, first seen in India and now conquering the world, is another 40% to 60% more transmissible than Alpha. Read says the pattern is no surprise. “The only way you could not get infectiousness rising would be if the virus popped into humans as perfect at infecting humans as it could be, and the chance of that happening is incredibly small,” he says. But Holmes was startled. “This virus has gone up three notches in effectively a year and that, I think, was the biggest surprise to me,” Holmes says. 
“I didn't quite appreciate how much further the virus could get.” Bette Korber at Los Alamos National Laboratory and her colleagues first suggested that D614G, the early mutation, was taking over because it made the virus better at spreading. She says skepticism about the virus' ability to evolve was common in the early days of the pandemic, with some researchers saying D614G's apparent advantage might be sheer luck. “There was extraordinary resistance in the scientific community to the idea this virus could evolve as the pandemic grew in seriousness in spring of 2020,” Korber says. Researchers had never watched a completely novel virus spread so widely and evolve in humans, after all. “We're used to dealing with pathogens that have been in humanity for centuries, and their evolutionary course is set in the context of having been a human pathogen for many, many years,” says Jeremy Farrar, head of the Wellcome Trust. Katzourakis agrees. “This may have affected our priors and conditioned many to think in a particular way,” he says. Another, more practical problem is that real-world advantages for the virus don't always show up in cell culture or animal models. “There is no way anyone would have noticed anything special about Alpha from laboratory data alone,” says Christian Drosten, a virologist at the Charité University Hospital in Berlin. He and others are still figuring out what, at the molecular level, gives Alpha and Delta an edge. Alpha seems to bind more strongly to the human ACE2 receptor, the virus' target on the cell surface, partly because of a mutation in the spike protein called N501Y. It may also be better at countering interferons, molecules that are part of the body's viral immune defenses. Together those changes may lower the amount of virus needed to infect someone—the infectious dose.
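The "three notches" Holmes describes compound multiplicatively. A back-of-envelope calculation using the transmissibility estimates quoted above (the article does not quantify the D614G advantage, so a 20% gain is assumed here purely for illustration):

```python
# Compounding the quoted transmissibility gains. The D614G figure
# is an assumption for illustration; the Alpha and Delta ranges are
# the estimates given in the text.
d614g_gain = 1.20               # assumed ("a bit more infectious")
alpha_vs_prior = 1.50           # Alpha: ~50% more transmissible
delta_vs_alpha = (1.40, 1.60)   # Delta: 40% to 60% more than Alpha

low = d614g_gain * alpha_vs_prior * delta_vs_alpha[0]
high = d614g_gain * alpha_vs_prior * delta_vs_alpha[1]
print(f"Delta vs. the original virus: roughly {low:.1f}x to {high:.1f}x as transmissible")
```

Even with a modest assumed D614G effect, the compounded gain is well over a factor of two, which is what made the cumulative change so striking.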
In Delta, one of the most important changes may be near the furin cleavage site on spike, where a human enzyme cuts the protein, a key step enabling the virus to invade human cells. A mutation called P681R in that region makes cleavage more efficient, which may allow the virus to enter more cells faster and lead to greater numbers of virus particles in an infected person. In July, Chinese researchers posted a preprint showing Delta could lead to virus levels in patient samples 1000 times higher than for previous variants. Evidence is accumulating that infected people not only spread the virus more efficiently, but also faster, allowing the variant to spread even more rapidly. The new variants of SARS-CoV-2 may also cause more severe disease. For example, a study in Scotland found that an infection with Delta was about twice as likely to lead to hospital admission as one with Alpha. It wouldn't be the first time a newly emerging disease quickly became more serious. The 1918–19 influenza pandemic also appears to have caused more serious illness as time went on, says Lone Simonsen, an epidemiologist at Roskilde University who studies past pandemics. “Our data from Denmark suggests it was six times deadlier in the second wave.” A popular notion holds that viruses tend to evolve over time to become less dangerous, allowing the host to live longer and spread the virus more widely. But that idea is too simplistic, Holmes says. “The evolution of virulence has proven to be quicksand for evolutionary biologists,” he says. “It's not a simple thing.” Two of the best studied examples of viral evolution are myxoma virus and rabbit hemorrhagic disease virus, which were released in Australia in 1950 and 1996, respectively, to decimate populations of European rabbits that were destroying croplands and wreaking ecological havoc.
Myxoma virus initially killed more than 99% of infected rabbits, but then less pathogenic strains evolved, likely because the virus was killing many animals before they had a chance to pass it on. (Rabbits also evolved to be less susceptible.) Rabbit hemorrhagic disease virus, by contrast, got more deadly over time, probably because the virus is spread by blow flies feeding on rabbit carcasses, and quicker death accelerated its spread. Other factors loosen the constraints on deadliness. For example, a virus variant that can outgrow other variants within a host can end up dominating even if it makes the host sicker and reduces the likelihood of transmission. And an assumption about human respiratory diseases may not always hold: that a milder virus—one that doesn't make you crawl into bed, say—might allow an infected person to spread the virus further. In SARS-CoV-2, most transmission happens early on, when the virus is replicating in the upper airways, whereas serious disease, if it develops, comes later, when the virus infects the lower airways. As a result, a variant that makes the host sicker might spread just as fast as before. From the start of the pandemic, researchers have worried about a third type of viral change, perhaps the most unsettling of all: that SARS-CoV-2 might evolve to evade immunity triggered by natural infections or vaccines. Already, several variants have emerged sporting changes in the surface of the spike protein that make it less easily recognized by antibodies. But although news of these variants has caused widespread fear, their impact has so far been limited. Derek Smith, an evolutionary biologist at the University of Cambridge, has worked for decades on visualizing immune evasion in the influenza virus in so-called antigenic maps. The farther apart two variants are on Smith's maps, the less well antibodies against one virus protect against the other. 
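The geometry behind such antigenic maps can be illustrated with a toy computation: given a matrix of pairwise antigenic distances (in practice derived from fold-drops in cross-neutralization titers), find 2D positions whose Euclidean distances approximate them. The distance matrix below is invented, and this bare-bones gradient-descent embedding merely stands in for the Smith group's actual cartography methods:

```python
import numpy as np

# Toy antigenic map: embed four variants in 2D so that map distance
# matches a target "antigenic distance" matrix. All distances here
# are invented for illustration.
variants = ["Wuhan", "Alpha", "Delta", "Beta"]
D = np.array([
    [0.0, 0.5, 2.0, 3.5],
    [0.5, 0.0, 2.0, 3.5],
    [2.0, 2.0, 0.0, 3.0],
    [3.5, 3.5, 3.0, 0.0],
])

rng = np.random.default_rng(1)
X = rng.normal(scale=0.1, size=(4, 2))          # initial 2D positions

for _ in range(2000):                           # minimize stress by gradient descent
    diff = X[:, None, :] - X[None, :, :]        # (4, 4, 2) pairwise offsets
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, 1.0)                 # avoid dividing by zero
    dist = np.maximum(dist, 1e-9)
    grad = ((dist - D) / dist)[:, :, None] * diff
    X -= 0.01 * grad.sum(axis=1)

dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
for name, row in zip(variants, np.round(dist, 1)):
    print(name, row)
```

With distances set up this way, the embedding recovers the qualitative picture described in the article: Alpha sits near the original virus, Delta farther out, Beta farthest of all.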
In a recently published preprint, Smith's group, together with David Montefiori's group at Duke University, has applied the approach to mapping the most important variants of SARS-CoV-2 (see graphic, below). The new maps place the Alpha variant very close to the original Wuhan virus, which means antibodies against one still neutralize the other. The Delta variant, however, has drifted farther away, even though it doesn't completely evade immunity. “It's not an immune escape in the way people think of an escape in slightly cartoonish terms,” Katzourakis says. But Delta is slightly more likely to infect fully vaccinated people than previous variants. “It shows the possible beginning of a trajectory and that's what worries me,” Katzourakis says.

[Figure omitted. Credits: (graphic) N. Desai/Science; (data) Derek Smith/University of Cambridge; David Montefiori/Duke University]

Other variants have evolved more antigenic distance from the original virus than Delta. Beta, which first appeared in South Africa, has traveled the farthest on the map, although natural or vaccine-induced immunity still largely protects against it. And Beta's attempts to get away may come at a price, as Delta has outstripped it worldwide. “It's probably the case that when a virus changes to escape immunity, it loses other aspects of its fitness,” Smith says. The map shows that for now, the virus is not moving in any particular direction. If the original Wuhan virus is like a town on Smith's map, the virus has been taking local trains to explore the surrounding area, but it has not traveled to the next city—not yet. Although it's impossible to predict exactly how infectiousness, virulence, and immune evasion will develop in the coming months, some of the factors that will influence the virus' trajectory are clear. One is the immunity that is now rapidly building in the human population.
On one hand, immunity reduces the likelihood of people getting infected, and may hamper viral replication even when they are. “That means there will be fewer mutations emerging if we vaccinate more people,” Çevik says. On the other hand, any immune escape variant now has a huge advantage over other variants. In fact, the world is probably at a tipping point, Holmes says: With more than 2 billion people having received at least one vaccine dose and hundreds of millions more having recovered from COVID-19, variants that evade immunity may now have a bigger leg up than those that are more infectious. Something similar appears to have happened when a new H1N1 influenza strain emerged in 2009 and caused a pandemic, says Katia Koelle, an evolutionary biologist at Emory University. A 2015 paper found that changes in the virus in the first 2 years appeared to make the virus more adept at human-to-human transmission, whereas changes after 2011 were mostly to avoid human immunity. It may already be getting harder for SARS-CoV-2 to make big gains in infectiousness. “There are some fundamental limits to exactly how good a virus can get at transmitting and at some point SARS-CoV-2 will hit that plateau,” says Jesse Bloom, an evolutionary biologist at the Fred Hutchinson Cancer Research Center. “I think it's very hard to say if this is already where we are, or is it still going to happen.” Evolutionary virologist Kristian Andersen of Scripps Research guesses the virus still has space to evolve greater transmissibility. “The known limit in the viral universe is measles, which is about three times more transmissible than what we have now with Delta,” he says.

[Figure omitted. Credits: (graphic) N. Desai/Science; (data) E. Wall et al., The Lancet 397:10292, 2331 (2021)]

The limits of immune escape are equally uncertain. Smith's antigenic maps show the space the virus has explored so far. But can it go much farther?
If the variants on the map are like towns, then where are the country's natural boundaries—where does the ocean start? A crucial clue will be where the next few variants appear on the map, Smith says. Beta evolved in one direction away from the original virus and Delta in another. “It's too soon to say this now, but we might be heading for a world where there are two serotypes of this virus that would also both have to be considered in any vaccines,” Drosten says. Immune escape is so worrying because it could force humanity to update its vaccines continually, as happens for flu. Yet the vaccines against many other diseases—measles, polio, and yellow fever, for example—have remained effective for decades without updates, even in the rare cases where immune-evading variants appeared. “There was big alarm around 2000 that maybe we'd need to replace the hepatitis B vaccines,” because an escape variant had popped up, Read says. But the variant has not spread around the world: It is able to infect close contacts of an infected person, but then peters out. The virus apparently faces a trade-off between transmissibility and immune escape. Such trade-offs likely exist for SARS-CoV-2 as well. Some clues about SARS-CoV-2's future path may come from coronaviruses with a much longer history in humans: those that cause common colds. Some are known to reinfect people, but until recently it was unclear whether that's because immunity in recovered people wanes, or because the virus changes its surface to evade immunity. In a study published in April in PLOS Pathogens , Bloom and other researchers compared the ability of human sera taken at different times in the past decades to block virus isolated at the same time or later. They showed that the samples could neutralize strains of a coronavirus named 229E isolated around the same time, but weren't always effective against virus from 10 years or more later. 
The virus had evidently evolved to evade human immunity, but it had taken 10 years or more. “Immune escape conjures this catastrophic failure of immunity when it is really immune erosion,” Bloom says. “Right now it seems like SARS-CoV-2, at least in terms of antibody escape, is actually behaving a lot like coronavirus 229E.” Others are probing SARS-CoV-2 itself. In a preprint published this month, researchers tinkered with the virus to learn how much it has to change to evade the antibodies generated in vaccine recipients and recovered patients. They found that it took 20 changes to the spike protein to escape current antibody responses almost completely. That means the bar for complete escape is high, says one of the authors, virologist Paul Bieniasz of Rockefeller University. “But it's very difficult to look into a crystal ball and say whether that is going to be easy for the virus to acquire or not,” he says. “It seems plausible that true immune escape is hard,” concludes William Hanage of the Harvard T.H. Chan School of Public Health. “However, the counterargument is that natural selection is a hell of a problem solver and the virus is only beginning to experience real pressure to evade immunity.” And the virus has tricks up its sleeve. Coronaviruses are good at recombining, for instance, which could allow new variants to emerge suddenly by combining the genomes—and the properties—of two different variants. In pigs, recombination of a coronavirus named porcine epidemic diarrhea virus with attenuated vaccine strains of another coronavirus has led to more virulent variants of PEDV. “Given the biology of these viruses, recombination may well factor into the continuing evolution of SARS-CoV-2,” Korber says. Given all that uncertainty, it's worrisome that humanity hasn't done a great job of limiting the spread of SARS-CoV-2, says Eugene Koonin, a researcher at the U.S. National Center for Biotechnology Information. 
Some dangerous variants may only be possible if the virus hits on a very rare, winning combination of mutations, he says. It might have to replicate an astronomical number of times to get there. “But with all these millions of infected people, it may very well find that combination.” Indeed, Katzourakis adds, the past 20 months are a warning to never underestimate viral evolution. “Many still see Alpha and Delta as being as bad as things are ever going to get,” he says. “It would be wise to consider them as steps on a possible trajectory that may challenge our public health response further.”


Malaria infection and severe disease risks in Africa

Science

Understanding how changes in community parasite prevalence alter the rate and age distribution of severe malaria is essential for optimizing control efforts. Paton et al. assessed the incidence of pediatric severe malaria admissions from 13 hospitals in East Africa from 2006 to 2020 (see the Perspective by Taylor and Slutsker). Each 25% increase in community parasite prevalence shifted hospital admissions toward younger children. Low rates of lifetime infections appeared to confer some immunity to severe malaria in very young children. Children under the age of 5 years thus need to remain a focus of disease prevention for malaria control. Science, abj0089, this issue p. [926][1]; see also abk3443, p. [855][2]

The relationship between community prevalence of Plasmodium falciparum and the burden of severe, life-threatening disease remains poorly defined. To examine the three most common severe malaria phenotypes from catchment populations across East Africa, we assembled a dataset of 6506 hospital admissions for malaria in children aged 3 months to 9 years from 2006 to 2020. Admissions were paired with data from community parasite infection surveys. A Bayesian procedure was used to calibrate uncertainties in exposure (parasite prevalence) and outcomes (severe malaria phenotypes). Each 25% increase in prevalence conferred a doubling of severe malaria admission rates. Severe malaria remains a burden predominantly among young children (3 to 59 months) across a wide range of community prevalence typical of East Africa. This study offers a quantitative framework for linking malaria parasite prevalence and severe disease outcomes in children.

[1]: /lookup/doi/10.1126/science.abj0089
[2]: /lookup/doi/10.1126/science.abk3443
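One way to read the reported dose-response — admission rates doubling with each 25-percentage-point rise in community prevalence — is as exponential scaling. A sketch, with an assumed (purely illustrative) baseline rate that is not a figure from the study:

```python
# Exponential scaling implied by "each 25% increase in prevalence
# conferred a doubling of severe malaria admission rates".
# The baseline rate is an assumed illustrative figure, not from the study.
BASELINE_RATE = 10.0   # admissions per 100,000 child-years near 0% prevalence

def admission_rate(prevalence_pct):
    """Rate implied by one doubling per 25-point rise in prevalence."""
    return BASELINE_RATE * 2 ** (prevalence_pct / 25.0)

for p in (0, 25, 50, 75):
    print(f"prevalence {p:2d}% -> relative rate {admission_rate(p) / BASELINE_RATE:.0f}x")
```

Whatever the true baseline, the multiplicative form is the point: moving from low- to high-prevalence settings implies severalfold higher severe disease rates, concentrated in young children.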


Spatiotemporal invasion dynamics of SARS-CoV-2 lineage B.1.1.7 emergence

Science

The B.1.1.7 lineage of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused fast-spreading outbreaks globally. Intrinsically, this variant has greater transmissibility than its predecessors, but this capacity has been amplified in some circumstances to tragic effect by a combination of human behavior and local immunity. What are the extrinsic factors that help or hinder the rapid dissemination of variants? Kraemer et al. explored the invasion dynamics of B.1.1.7 in fine detail, from its location of origin in Kent, U.K., to its heterogeneous spread around the country. A combination of mobile phone and virus data including more than 17,000 genomes shows how distinct phases of dispersal were related to intensity of mobility and the timing of lockdowns. As the local outbreaks grew, importation from the London source area became less important. Had B.1.1.7 emerged at a slightly different time of year, its impact might have been different. Science, abj0113, this issue p. [889][1]

Understanding the causes and consequences of the emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern is crucial to pandemic control yet difficult to achieve because they arise in the context of variable human behavior and immunity. We investigated the spatial invasion dynamics of lineage B.1.1.7 by jointly analyzing UK human mobility, virus genomes, and community-based polymerase chain reaction data. We identified a multistage spatial invasion process in which early B.1.1.7 growth rates were associated with mobility and asymmetric lineage export from a dominant source location, enhancing the effects of B.1.1.7’s increased intrinsic transmissibility. We further explored how B.1.1.7 spread was shaped by nonpharmaceutical interventions and spatial variation in previous attack rates.
Our findings show that careful accounting of the behavioral and epidemiological context within which variants of concern emerge is necessary to correctly interpret their observed relative growth rates.

[1]: /lookup/doi/10.1126/science.abj0113


Ecology in the age of automation

Science

The accelerating pace of global change is driving a biodiversity extinction crisis (1) and is outstripping our ability to track, monitor, and understand ecosystems, which is traditionally the job of ecologists. Ecological research is an intensive, field-based enterprise that relies on the skills of trained observers. This process is both time-consuming and expensive, thus limiting the resolution and extent of our knowledge of the natural world. Although technology will never replace the intuition and breadth of skills of the experienced naturalist (2), ecologists cannot ignore the potential to greatly expand the scale of our studies through automation. The capacity to automate biodiversity sampling is being driven by three ongoing technological developments: the commoditization of small, low-power computing devices; advances in wireless communications; and an explosion in automated data-recognition algorithms in the field of machine learning. Automated data collection and machine learning are set to revolutionize in situ studies of natural systems. Automation has swept across all human endeavors over recent decades, and science is no exception. The extent of ecological observation has traditionally been limited by the costs of manual data collection. We envision a future in which data from field studies are augmented with continuous, fine-scale, remotely sensed data recording the presence, behavior, and other properties of individual organisms. As automation drives down costs of these networks, there will not be a simple expansion of the quantity of data. Rather, the potential high resolution and broad extent of these data will lead to qualitatively new findings and will result in new discoveries about the natural world that will enable ecologists to better predict and manage changing ecosystems (3).
This will be especially true as different types of sensing networks, including mobile elements such as drones, are connected together to provide a rich, multidimensional view of nature. Given the role that biodiversity plays in lending resilience to the ecosystems on which humans depend (4), monitoring the distribution and abundance of species along with climate and other variables is a critical need in developing ecological hypotheses and for adapting to emerging global challenges. Ecosystems are alive with sound and motion that can be captured with audio and video sensors. Rapid advances in audio and video classification algorithms can allow the recognition of species and labeling of complex traits and behaviors, which were traditionally the domain of manual species identification by experts. The major advance has been the development of deep convolutional neural networks (5). These algorithms extract fundamental aspects of contrast and shape in a manner analogous to how we and other animals recognize objects in our visual field. Applied to audio signals, these neural networks are highly effective at classifying natural and anthropogenic sounds (6). A canonical example is the classification of bird songs. Other acoustic examples include insects, amphibians, and disturbance indicators such as chainsaws. Naturally, these algorithms also lend themselves to species identification from images and videos. In cases of animals displaying complex color patterns, individuals may be distinguished, allowing minimally invasive mark-recapture, an important tool in population studies and conservation (7). Beyond sight and sound, sensors can target a wide range of physical, chemical, and biological phenomena. Particularly intriguing is the possibility for widespread environmental sensing of biomolecular compounds that could, for example, allow quantification of “DNA-scapes” by means of laboratory-on-a-chip–type sensors (8).
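The convolution-and-pooling pipeline at the heart of such classifiers can be sketched in plain NumPy. Everything below — the stand-in signal, the random (untrained) filter bank, the layer sizes — is illustrative; real systems stack many trained layers:

```python
import numpy as np

# One conv -> ReLU -> max-pool stage, the basic building block of
# the convolutional networks used for audio and image recognition.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20 * np.pi, 400))   # stand-in for an audio clip
filters = rng.normal(size=(8, 16))                 # 8 untrained filters, width 16

def conv_relu_pool(x, bank, pool=8):
    feats = []
    for f in bank:
        resp = np.convolve(x, f, mode="valid")     # slide the filter over the signal
        resp = np.maximum(resp, 0.0)               # ReLU nonlinearity
        n = len(resp) // pool * pool               # trim to a multiple of the pool size
        pooled = resp[:n].reshape(-1, pool).max(axis=1)  # max-pooling
        feats.append(pooled)
    return np.stack(feats)                         # (n_filters, time_steps)

features = conv_relu_pool(signal, filters)
print(features.shape)
```

A classifier head, and training of the filters themselves, would sit on top of features like these; species-recognition systems learn filters that respond to call syllables rather than using random ones.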
Several technological trends are shaping the emergence of large-scale sensor networks. One is the ongoing miniaturization of technology, allowing deployment of extended arrays of low-power sensor devices across landscapes [for example, (9)]. In many cases, these can be solar-powered in remote locations. The widespread availability of computer-on-a-chip devices along with various attached sensors is enabling the construction of large distributed sensing networks at price points that were formerly unattainable. Similarly, the ubiquitous availability of cloud-based computing and storage for back-end processing is facilitating large-scale deployments. Another trend is advancements in wireless communications. For example, the emerging internet of things (10) enables low-power devices to establish ad hoc mesh networks that can pass information from node to node, eventually reaching points of aggregation and analysis. The same technology used to connect smart doorbells and lightbulbs can be leveraged to move data across sensor networks distributed across a landscape. These protocols are designed for low power consumption but may not have sufficient bandwidth for all applications. An alternative, although more power hungry, is cellular technology, which has increasing coverage globally. In remote locations, where commercial cellular data services may not be available, researchers can consider a private cellular network for on-site telemetry and satellite uplinks for internet streaming. However, in the near term, telecommunications costs and per-device power requirements may nonetheless prove prohibitive in certain high-bandwidth applications, such as video and audio streaming. An alternative for sites where communications bandwidth is limited by cost, isolation, or power constraints is edge computing (11).
In this design, computation is moved to the sensing devices themselves, which then transmit filtered or classified results for analysis, greatly reducing transmission requirements. A third trend is the advancement of machine-learning methods (12) that can classify and extract patterns from data streams. Much of this technology has been commoditized through intensive development efforts in the technology sector, resulting in widely available software libraries usable by nonexperts. The aforementioned convolutional neural networks can be coded both to segment data into units and to label those units with appropriate classes. The major bottleneck is training classifiers, because initial training inputs must be labeled manually by experts. Although labeled training sets exist in some domains (most notably, image recognition), future analysts may be able to skip much of the training step as large collections of pretrained networks become available. These pretrained networks can be combined and modified for specific tasks without requiring comprehensive training sets. Of particular interest from the standpoint of automation are new developments in continual learning (13), in which networks adjust in response to changing inputs. This holds the promise of automating model adaptation for detecting emerging phenomena, such as species shifting their ranges in response to climate change or other shifts in ecosystem properties. Ecologists could leverage these developments to create automated sensing networks at scales previously unimaginable. As an example, consider the North American Breeding Bird Survey, a highly successful citizen-science initiative with continental-scale coverage that has run since the late 1960s. Expert observers conduct point counts of birds along routes, generating data that have proved invaluable in tracking trends in songbird populations (14).
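The edge-computing design described above can be sketched as a device-side filter: the classifier runs locally, and only confident detections are transmitted to the uplink instead of the raw stream. The classifier, threshold, and data here are hypothetical stand-ins for whatever model a deployment would carry.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    timestamp: float
    label: str
    confidence: float

def edge_filter(windows, classify, threshold=0.8):
    """Run the classifier on-device; queue only confident detections
    for transmission, rather than streaming raw audio."""
    uplink = []
    for ts, window in windows:
        label, conf = classify(window)
        if conf >= threshold:
            uplink.append(Detection(ts, label, conf))
    return uplink

# Hypothetical classifier: flags windows whose energy exceeds a cutoff.
def toy_classifier(window):
    energy = sum(x * x for x in window) / len(window)
    return ("bird_song", 0.95) if energy > 0.5 else ("background", 0.3)

stream = [(0.0, [0.9, -0.8, 0.7]), (1.0, [0.01, 0.02, -0.01])]
sent = edge_filter(stream, toy_classifier)  # only the loud window is sent
```

The bandwidth saving comes from transmitting a few bytes of labels per detection rather than continuous audio, at the cost of on-device compute.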
Although we hope to see such efforts continue, imagine what could be learned if, instead of sampling these communities once per year, a long-term, continental-scale songbird observatory could be constructed to record and classify bird vocalizations in near–real time along with environmental covariates. Similar networks could use camera traps or video streams to reveal details of diurnal and seasonal variation across diverse floras and faunas. As with all sampling methods, sensing networks will not be without biases in sensitivity and discrimination, yet they hold the extraordinary promise of regional sampling of biodiversity at the organismal scale, something that has proven difficult with, for example, traditional satellite-based remote sensing. These efforts would complement the ongoing development of continental-scale observatories in ecology [for example, (15)] by increasing the spatial and temporal resolution of sampling.

1. S. Díaz et al., Science 366, eaax3100 (2019).
2. J. Travis, Am. Nat. 196, 1 (2020).
3. M. C. Dietze et al., Proc. Natl. Acad. Sci. U.S.A. 115, 1424 (2018).
4. B. J. Cardinale et al., Nature 486, 59 (2012).
5. Y. LeCun, Y. Bengio, G. Hinton, Nature 521, 436 (2015).
6. S. S. Sethi et al., Proc. Natl. Acad. Sci. U.S.A. 117, 17049 (2020).
7. R. C. Whytock et al., Methods Ecol. Evol. 12, 1080 (2021).
8. B. C. Dhar, N. Y. Lee, Biochip J. 12, 173 (2018).
9. A. P. Hill et al., Methods Ecol. Evol. 9, 1199 (2018).
10. L. Atzori, A. Iera, G. Morabito, Comput. Netw. 54, 2787 (2010).
11. W. Shi, J. Cao, Q. Zhang, Y. Li, L. Xu, IEEE Internet Things J. 3, 637 (2016).
12. M. I. Jordan, T. M. Mitchell, Science 349, 255 (2015).
13. R. Aljundi, K. Kelchtermans, T. Tuytelaars, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019), pp. 11254–11263.
14. J. R. Sauer, W. A. Link, J. E. Fallon, K. L. Pardieck, D. J. Ziolkowski Jr., N. Am. Fauna 79, 1 (2013).
15. M. Keller, D. S. Schimel, W. W. Hargrove, F. M. Hoffman, Front. Ecol. Environ. 6, 282 (2008).

Acknowledgments: Our perspective on autonomous sensing was developed with the support of the Stengl-Wyer Endowment and the Office of the Vice President for Research Bridging Barriers programs at the University of Texas at Austin, and the National Science Foundation (BCS-2009669). Comments from members of the Keitt laboratory, Planet Texas 2050, A. Wolf, and M. Abelson were invaluable in refining our ideas.


Banking on protein structural data

Science

In 1953, the proposed structure of DNA magnificently linked biological function and structure. By contrast, 4 years later, the first elucidation of the structure of a protein (myoglobin, by Kendrew and colleagues) revealed an inelegant shape, described disdainfully as a "visceral knot." Additional complexity, as well as some general principles, was revealed as more protein structures were solved over the next decade. In 1971, scientists at Brookhaven National Laboratory launched the Protein Data Bank (PDB) as a repository to collect the atomic coordinates of structures (seven at the time) and make them available to interested parties. The PDB now includes more than 180,000 structures, and this resource has fueled an incalculable number of advances, including the recent development of powerful structure prediction tools. Biology takes place in three dimensions, yet most biological information is stored in one-dimensional sequences of DNA that encode the amino acid sequences of proteins. The transition from one to three dimensions is accomplished through the spontaneous folding of a sequence of amino acids into a folded protein structure. Comparing elucidated structures revealed that proteins that are at least 30% identical in amino acid sequence almost always have the same folded structure; evolutionarily, structure is much more conserved than sequence. Conversely, identical short stretches of five or more amino acids can adopt completely different structures in different proteins; structure is context dependent. Thus, the relationship between sequence and structure is not a simple one. Predicting protein structures from sequences has been a grand challenge for decades. In 1994, fueled by the explosion of sequence data, biophysicist John Moult and colleagues organized the first Critical Assessment of Structure Prediction (CASP) meeting. CASP is based on blinded assessments, which are common in clinical trials.
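The 30% sequence-identity rule of thumb can be made concrete with a small helper that scores two pre-aligned sequences. The fragments below are invented for illustration; real comparisons first require a sequence alignment, which this sketch assumes has already been done.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences
    of equal length (gaps and mismatches both count against identity)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Toy aligned fragments (hypothetical, for illustration only):
seq1 = "MKT-AYIAKQR"
seq2 = "MKTQAYLAKQR"
pid = percent_identity(seq1, seq2)  # 9 of 11 positions match, ~81.8%
```

By the rule of thumb in the text, two full-length proteins scoring well above 30% identity would be expected to share the same fold.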
Sequences of proteins whose structures had been determined but not yet publicly shared were made available to would-be predictors, who submitted structural predictions for subsequent independent assessment. The first CASP meeting was somewhat depressing because the results revealed that predictors were doing substantially worse than they thought. CASP meetings have continued every 2 years and have driven the field forward through feedback and competition. The most recent CASP meeting, in November 2020, was shaken by results from the company DeepMind. Its AlphaFold program performed substantially better than other programs had in the past, producing many results of quality comparable to that of experimental structures. The RoseTTAFold program, developed by the laboratory of structural biologist David Baker, builds on that laboratory's previous work, combined with insights from the DeepMind success (see page 871). The results of both programs are good enough that many are claiming they represent relatively general (though certainly not perfect or complete) solutions to the structure prediction problem. Notably, both groups have made the computer code for their methods available for others to use, test, and enhance. These programs are based on deep-learning artificial intelligence methods. Such approaches depend on the availability of many thousands of questions with known answers to train the neural networks at their core. Thus, without the sequences and structures shared in the PDB by structural biologists from around the world, these approaches would not have been feasible. The teams that developed these powerful programs deserve great credit for their accomplishments, but their work stands on a foundation of billions of dollars of public investment in structural biology and the sustained worldwide support of the PDB (now overseen by the Worldwide PDB).
Policies from funders, publishers, and the scientific community have led to requirements that reported structures be promptly deposited in the PDB. As someone who has interacted with the PDB as a consumer, a contributor, a policy-maker, and a funder, I have experienced the power and challenges of trying to optimize such a public resource. The cultural shifts, at the cutting (and often bleeding) edge of open science, were often controversial, but it is hard to argue that they have not both increased the impact of individually determined structures and accelerated scientific progress in many ways. The ever-growing PDB provides researchers with a universe of structures with which to compare their favorite structures. The new structure prediction tools expand this universe further and provide truly compelling evidence of the power of open science. Moreover, these tools bring truth to an old saying in structural biology circles: "The structure prediction problem has been solved; it's hiding in the PDB."


Making machine learning trustworthy

Science

Machine learning (ML) has advanced dramatically during the past decade and continues to achieve impressive human-level performance on nontrivial tasks in image, speech, and text recognition. It increasingly powers high-stakes application domains such as autonomous vehicles, self–mission-fulfilling drones, intrusion detection, medical image classification, and financial prediction (1). However, ML must make several advances before it can be deployed with confidence in domains where it directly affects humans during training and operation, in which cases security, privacy, safety, and fairness are all essential considerations (1, 2). The development of a trustworthy ML model must build in protections against several types of adversarial attacks (see the figure). An ML model requires training datasets, which can be "poisoned" through the insertion, modification, or removal of training samples with the purpose of influencing the decision boundary of the model to serve the adversary's intent (3). Poisoning happens when models learn from crowdsourced data or from inputs they receive while in operation, both of which are susceptible to tampering. Adversaries can also evade ML models at prediction time through purposely crafted inputs called adversarial examples (4). For example, in an autonomous vehicle, a control model may rely on road-sign recognition for navigation. By placing a tiny sticker on a stop sign, an adversary can cause the model to mistakenly recognize the stop sign as a yield sign or a "speed limit 45" sign, whereas a human driver would simply ignore the visually inconsequential sticker and apply the brakes at the stop sign (see the figure). Attacks can also abuse the input-output interaction of a model's prediction interface to steal the ML model itself (5, 6).
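A minimal sketch of how such adversarial examples are crafted, assuming a toy logistic-regression "model" and the classic fast gradient sign method rather than the physical sticker attack described above: each input feature is nudged by a small amount in the direction that most increases the model's loss, flipping the prediction while barely changing the input.

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """Fast gradient sign method against a logistic-regression model:
    move each feature of x by eps in the direction that increases the
    cross-entropy loss for the true label y (y in {0, 1})."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # predicted P(y = 1)
    grad_x = (p - y) * w                    # gradient of the loss w.r.t. x
    return x + eps * np.sign(grad_x)

# Hypothetical model weights and input, for illustration only.
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.5, 0.2])   # the model scores this as class 1

def predict(x):
    return int((w @ x + b) > 0)

x_adv = fgsm_perturb(x, w, b, y=1, eps=0.6)  # prediction flips to class 0
```

Against deep networks the same one-step gradient trick works with far smaller eps, which is what makes imperceptible perturbations possible.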
By supplying a batch of inputs (for example, publicly available images of traffic signs) and obtaining predictions for each, the adversary can use the model as a labeling oracle with which to train a surrogate model that is functionally equivalent to the original. Such attacks pose greater risks for ML models that learn from high-stakes data such as intellectual property and military or national security intelligence.

[Figure: Adversarial threats to machine learning. Machine learning models are vulnerable to attacks that degrade model confidentiality and model integrity or that reveal private information. GRAPHIC: KELLIE HOLOSKI/SCIENCE]

When models are trained for predictive analytics on privacy-sensitive data, such as patient clinical data and bank customer transactions, privacy is of paramount importance. Privacy-motivated attacks can reveal sensitive information contained in training data through mere interaction with deployed models (7). The root cause of such attacks is that ML models tend to "memorize" ancillary parts of their training data and, at prediction time, inadvertently divulge identifying details about individuals who contributed to the training data. One common strategy, called membership inference, enables an adversary to exploit the differences in a model's responses to members and nonmembers of a training dataset (7). In response to these threats, the quest for countermeasures is promising. Research has made progress on detecting poisoning and adversarial inputs and on limiting what an adversary can learn by merely interacting with a model, thereby curbing model stealing and membership inference attacks (1, 8). One promising example is the formally rigorous formulation of privacy.
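Membership inference can be illustrated with the simplest possible attack rule: threshold the model's confidence, exploiting the tendency of models to be more confident on examples they memorized during training. The confidences and threshold below are hypothetical; real attacks calibrate them with shadow models.

```python
def membership_guess(confidence: float, threshold: float = 0.9) -> bool:
    """Toy membership-inference rule: guess 'member of the training set'
    when the model's top-class confidence exceeds a threshold, since
    models often assign higher confidence to memorized training data."""
    return confidence >= threshold

# Hypothetical confidences returned by a deployed model:
train_example_conf = 0.98   # an example the model was trained on
unseen_example_conf = 0.62  # an example the model never saw

in_guess = membership_guess(train_example_conf)    # guessed: member
out_guess = membership_guess(unseen_example_conf)  # guessed: nonmember
```

The attack succeeds exactly to the extent that the confidence distributions for members and nonmembers differ, which is why regularization and differential privacy blunt it.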
Differential privacy promises each individual who participates in a dataset that whether or not their record belongs to a model's training dataset, what an adversary learns about them by interacting with the model is essentially the same (9). Beyond technical remedies, the lessons learned from the ML attack-defense arms race provide opportunities to motivate broader efforts to make ML truly trustworthy in terms of societal needs. Issues include how a model "thinks" when it makes decisions (transparency) and the fairness of an ML model trained to solve high-stakes inference tasks in which the equivalent human decisions would have been biased. Making meaningful progress toward trustworthy ML requires an understanding of the connections, and at times tensions, between traditional security and privacy requirements and the broader issues of transparency, fairness, and ethics when ML is used to address human needs. Several worrisome instances of bias in consequential ML applications have been documented (10, 11), such as race and gender misidentification, wrongfully scoring darker-skinned faces as more likely to be criminal, disproportionately favoring male applicants in resume screening, and disfavoring Black patients in medical trials. These harmful consequences require that the developers of ML models look beyond technical solutions to win trust among the human subjects who are affected. On the research front, especially for the security and privacy of ML, the aforementioned defensive countermeasures have solidified understanding of the blind spots of ML models in adversarial settings (8, 9, 12, 13). On the fairness and ethics front, there is more than enough evidence to demonstrate the pitfalls of ML, especially for underrepresented subjects in training datasets.
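The differential-privacy guarantee described above can be sketched with the standard Laplace mechanism on a counting query: because adding or removing one person's record changes a count by at most 1, Laplace noise scaled to 1/epsilon makes the two worlds statistically hard to distinguish. The dataset and epsilon below are illustrative.

```python
import numpy as np

def dp_count(values, predicate, epsilon, rng):
    """Epsilon-differentially-private count. A counting query has
    sensitivity 1 (one record changes the count by at most 1), so
    Laplace noise with scale 1/epsilon suffices for epsilon-DP."""
    true_count = sum(predicate(v) for v in values)
    return true_count + rng.laplace(0.0, 1.0 / epsilon)

rng = np.random.default_rng(42)
ages = [34, 51, 29, 62, 45, 38]          # hypothetical records
noisy = dp_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
# 'noisy' is the true count (3) plus Laplace noise; smaller epsilon
# means stronger privacy but noisier answers.
```

The trade-off the text describes is visible here: each released query spends privacy budget, and utility degrades as epsilon shrinks.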
Thus, there is still more to be done by way of human-centered and inclusive formulations of what it means for ML to be fair and ethical. One misconception about the root cause of bias in ML is attributing bias to data and data alone. Data collection, sampling, and annotation play a critical role in causing historical bias, but there are multiple junctures in the data-processing pipeline where bias can manifest: from data sampling to feature extraction, and from aggregation during training to evaluation methodologies and metrics during testing. At present, there is a lack of broadly accepted definitions and formulations of adversarial robustness (13) and privacy-preserving ML (except for differential privacy, which is formally appealing yet not widely deployed). The lack of transferability of notions of attacks, defenses, and metrics from one domain to another is also a pressing issue that impedes progress toward trustworthy ML. For example, most of the ML evasion and membership inference attacks illustrated earlier target applications such as image classification (road-sign detection by an autonomous vehicle), object detection (identifying a flower in a living room photo with multiple objects), speech processing (voice assistants), and natural language processing (machine translation). The threats and countermeasures proposed in the vision, speech, and text domains hardly translate to one another, let alone to naturally adversarial domains such as network intrusion detection and financial-fraud detection. Another important consideration is the inherent tension between some trustworthiness properties. For example, transparency and privacy often conflict: if a model is trained on privacy-sensitive data, aiming for the highest level of transparency in production would inevitably leak privacy-sensitive details of data subjects (14).
Thus, choices need to be made about the extent to which transparency is penalized to gain privacy, and vice versa, and such choices need to be made clear to system purchasers and users. Generally, privacy concerns prevail because of the legal implications of failing to enforce them (for example, patient privacy under the Health Insurance Portability and Accountability Act in the United States). Privacy and fairness may also fail to be synergistic. For example, although privacy-preserving ML (such as differential privacy) provides a bounded guarantee on the indistinguishability of individual training examples, research shows that, in terms of utility, minority groups in the training data (for example, based on race, gender, or sexuality) tend to be negatively affected by model outputs (15). Broadly speaking, the scientific community needs to step back and align the robustness, privacy, transparency, fairness, and ethical norms of ML with human norms. To do this, clearer norms for robustness and fairness need to be developed and accepted. In research efforts, limited formulations of adversarial robustness, fairness, and transparency must be replaced with broadly applicable formulations akin to what differential privacy offers. In policy formulation, there need to be concrete steps toward regulatory frameworks that spell out actionable accountability measures on bias and ethical norms for datasets (including diversity guidelines), training methodologies (such as bias-aware training), and decisions on inputs (such as augmenting model decisions with explanations). The hope is that these regulatory frameworks will eventually evolve into ML governance modalities backed by legislation, leading to accountable ML systems in the future.
Most critically, there is a dire need for insights from diverse scientific communities to consider societal norms of what makes a user confident about using ML for high-stakes decisions, such as a passenger in a self-driving car, a bank customer accepting investment recommendations from a bot, or a patient trusting an online diagnostic interface. Policies need to be developed to govern the safe and fair adoption of ML in such high-stakes applications. Equally important, the fundamental tensions between adversarial robustness and model accuracy, privacy and transparency, and fairness and privacy invite more rigorous and socially grounded reasoning about trustworthy ML. Fortunately, at this juncture in the adoption of ML, a consequential window of opportunity remains open to tackle its blind spots before ML is pervasively deployed and becomes unmanageable.

1. I. Goodfellow, P. McDaniel, N. Papernot, Commun. ACM 61, 56 (2018).
2. S. G. Finlayson et al., Science 363, 1287 (2019).
3. B. Biggio, B. Nelson, P. Laskov, in Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, UK, J. Langford, J. Pineau, Eds. (Omnipress, 2012), pp. 1807–1814.
4. K. Eykholt et al., in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2018), pp. 1625–1634.
5. F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, T. Ristenpart, in Proceedings of the 25th USENIX Security Symposium, Austin, TX (USENIX Association, 2016), pp. 601–618.
6. A. Ali, B. Eshete, in Proceedings of the 16th EAI International Conference on Security and Privacy in Communication Networks, Washington, DC (EAI, 2020), pp. 318–338.
7. R. Shokri, M. Stronati, C. Song, V. Shmatikov, in Proceedings of the 2017 IEEE Symposium on Security and Privacy, San Jose, CA (IEEE, 2017), pp. 3–18.
8. N. Papernot, M. Abadi, U. Erlingsson, I. Goodfellow, K. Talwar, arXiv:1610.05755 [stat.ML] (2017).
9. I. Jarin, B. Eshete, in Proceedings of the 7th ACM International Workshop on Security and Privacy Analytics (2021), pp. 25–35.
10. J. Buolamwini, T. Gebru, in Proceedings of the Conference on Fairness, Accountability and Transparency, New York, NY (ACM, 2018), pp. 77–91.
11. A. Birhane, V. U. Prabhu, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (IEEE, 2021), pp. 1537–1547.
12. N. Carlini et al., arXiv:1902.06705 [cs.LG] (2019).
13. N. Papernot, P. McDaniel, A. Sinha, M. P. Wellman, in Proceedings of the 3rd IEEE European Symposium on Security and Privacy (London, 2018), pp. 399–414.
14. R. Shokri, M. Strobel, Y. Zick, in Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, New York, NY (2021).
15. V. M. Suriyakumar, N. Papernot, A. Goldenberg, M. Ghassemi, in FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (ACM, 2021), pp. 723–734.


Phenotyping Alzheimer's disease with blood tests

Science

Alzheimer's disease (AD) is characterized by brain protein aggregates of amyloid-β (Aβ) and phosphorylated tau (pTau) that form plaques and tangles, together with dystrophic neurites surrounding the plaques, accompanied by downstream neurodegeneration. These protein changes can be detected as biomarkers through positron emission tomography (PET) imaging and in cerebrospinal fluid (CSF), allowing ATN (amyloid, tau, and neurodegeneration) classification of patients. This phenotyping has become standard in AD clinical trials to overcome the high misclassification rate (20 to 30%) of clinical criteria, and it also enables enrollment of preclinical AD patients. The recent approval of the first disease-modifying anti-amyloid immunotherapy for AD, aducanumab, will generate a need for widely accessible and inexpensive biomarkers for ATN classification of patients with cognitive complaints. Technological advances have overcome the challenge of measuring the extraordinarily low amounts of brain-derived proteins in blood samples, and recent studies indicate that AD blood tests may soon be possible.

The Aβ42 variant of Aβ is aggregation-prone and is deposited in plaques in the brains of people with AD, whereas the shorter Aβ40 isoform is by far the most abundant Aβ peptide (see the figure). Thus, as AD progresses and Aβ42 forms plaques, its concentration in the CSF and blood falls. The ratio of Aβ42 to Aβ40 concentrations in the CSF adjusts for between-individual differences in "total" Aβ production, thereby increasing concordance with amyloid PET imaging in detecting brain amyloidosis. Applying the same principle to blood plasma, immunoprecipitation–mass spectrometry (IP-MS) measures of the plasma Aβ42/Aβ40 ratio can exceed 90% accuracy in identifying brain amyloidosis (1).
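The ATN classification described above amounts to labeling a patient by which of the three biomarker classes are abnormal. A minimal sketch of that labeling follows; the function name `atn_profile` and its boolean inputs are illustrative naming choices, not a clinical standard:

```python
def atn_profile(amyloid_pos: bool, tau_pos: bool, neurodeg_pos: bool) -> str:
    """Build an ATN biomarker profile string such as 'A+T-N-'.

    Each flag records whether the corresponding biomarker class
    (amyloid, tau, neurodegeneration) is abnormal, e.g. as judged
    from PET imaging or CSF assays.
    """
    def sign(positive: bool) -> str:
        return "+" if positive else "-"

    return f"A{sign(amyloid_pos)}T{sign(tau_pos)}N{sign(neurodeg_pos)}"

# A preclinical AD profile (amyloid abnormal only):
# atn_profile(True, False, False) -> 'A+T-N-'
```

The eight possible profiles span everything from fully normal (A-T-N-) to fully abnormal (A+T+N+), which is what makes the scheme useful for stratifying trial enrollment.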
A population-based study of 441 asymptomatic elderly individuals indicates that IP-MS plasma Aβ can identify those who are amyloid PET-positive with high accuracy (2).

[Figure: Biomarkers of Alzheimer's disease. A low amyloid-β (Aβ) 42/40 isoform ratio is associated with brain amyloidosis, and several phosphorylated tau (pTau) fragments increase with tau pathology; both are AD-specific blood biomarkers. Among neurodegeneration biomarkers, neurofilament light (NFL) is modestly increased in AD, and total tau (T-tau) is markedly increased in AD only in cerebrospinal fluid (CSF), not blood. Glial fibrillary acidic protein (GFAP) is a candidate blood biomarker for astrocytic activation, indicating neuroinflammation. GRAPHIC: KELLIE HOLOSKI/SCIENCE]

The question then arises whether plasma Aβ detection can replace PET or CSF tests for brain amyloidosis. A potential issue is that Aβ is produced not only in the brain but also in platelets and peripheral tissues, which obscures the central nervous system–derived Aβ signal in plasma. Consequently, in amyloid PET-positive cases the plasma Aβ42/Aβ40 ratio is only ∼10% lower than in individuals without brain amyloidosis, whereas it is more than 40% lower in CSF (3). This overlap makes it difficult to robustly classify individuals as amyloid positive or negative, especially those with Aβ42/Aβ40 ratios close to the cutoff for normality. Algorithms combining the plasma Aβ42/Aβ40 ratio with the ϵ4 variant of apolipoprotein E (APOE), the major AD risk gene, and age (the main risk factor for AD) increase accuracy in detecting brain amyloidosis by 2 to 6% (2, 3).
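An algorithm of the kind just described, combining the plasma Aβ42/Aβ40 ratio with APOE-ϵ4 status and age, can be sketched as a logistic score. This is a toy illustration only: the coefficients below are placeholders chosen so the score moves in the directions the text describes (a lower ratio, more ϵ4 alleles, and higher age all raise risk); they are not fitted values from any study.

```python
import math

def amyloid_positivity_score(ab42_ab40_ratio: float,
                             apoe_e4_alleles: int,
                             age_years: float) -> float:
    """Toy logistic score (0 to 1) for brain amyloidosis risk.

    Coefficients are illustrative placeholders, NOT fitted values.
    A lower plasma Abeta42/40 ratio, more APOE-e4 alleles, and
    higher age all push the score upward, mirroring the predictor
    combination described in the text.
    """
    z = (-60.0 * ab42_ab40_ratio   # plasma ratio, typically ~0.05-0.15
         + 0.9 * apoe_e4_alleles   # 0, 1, or 2 e4 alleles
         + 0.04 * age_years
         + 2.0)                    # placeholder intercept
    return 1.0 / (1.0 + math.exp(-z))  # logistic link -> probability
```

In practice such models are fitted to cohort data and validated against amyloid PET; as the next paragraph notes, they can misclassify individuals whose ratio is discordant with their genotype and age.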
However, merging biomarker data with genetic risk and age can mislead: some younger APOE-ϵ4 noncarriers with low plasma Aβ42/Aβ40 will be misclassified as amyloid negative by such algorithms, whereas a proportion of older APOE-ϵ4 homozygotes with normal plasma Aβ42/Aβ40 will be wrongly classified as amyloid positive.

Tau protein is truncated into amino-terminal to mid-domain fragments before being secreted into blood plasma and CSF (4). CSF pTau has long been used as an AD-specific biomarker. A major breakthrough is the use of new ultrasensitive methods that allow quantification of pTau in blood plasma, with high concentrations occurring in AD (5). In a study of 321 patients and controls, high plasma concentrations of pTau181 fragments were associated with brain tau pathology as measured by PET (6). Similar results were subsequently reported for other pTau species, including pTau217 (7) and pTau231 (8). The very high accuracy of plasma pTau217 in discriminating AD from other neurodegenerative disorders (7), together with IP-MS data showing a larger increase and a better association with amyloid plaques on PET for plasma pTau217 than for pTau181 (4), suggests there may be diagnostic or pathophysiological differences between pTau species, but this remains a matter of debate. Nonetheless, these pTau blood biomarkers all show high concordance with AD pathology at autopsy, with accuracies in differentiating AD from non-AD dementia of up to 99% for pTau231 (8). However, these studies are based on different analytical methods and cohorts. In an attempt to compare the pTau species directly, a study of 381 participants employing digital immunoassays for pTau181, pTau217, and pTau231 found strong correlations with the amounts of the corresponding pTau species in CSF.
Moreover, although the fold change was highest for pTau217, the accuracy in identifying amyloid PET positivity was very high for all pTau species (9), suggesting that the differences are not meaningful. A study of two large cohorts comprising 883 individuals with cognitive symptoms also showed high accuracy (90 to 91%) of both plasma pTau181 and pTau217 in predicting clinical progression to AD dementia, in algorithms that include memory and executive function tests and APOE genotyping (10). Overall, plasma pTau biomarkers fulfill many requirements for a clinically useful AD test: a high fold change in AD (two to four times higher in AD than in non-AD controls across studies), an increase early in the AD continuum (even preclinically), an association with amyloid-associated tau pathophysiology and tangle burden in the brain, and an increase specific to AD rather than other types of dementia.

The early increase in plasma pTau fragments in patients with PET evidence of amyloid plaques, but not tau abnormalities, may be interpreted as a neuronal response to Aβ aggregates that increases pTau secretion into CSF and blood plasma. However, biomarker findings are only associations and may not directly reveal causal relationships. For example, plasma pTau231 shows a 10- to 15-fold increase within 24 hours after acute traumatic brain injury, especially in younger patients (who are unlikely to have amyloid or tau pathology) (11). Total tau (T-tau), referring to any tau variant or fragment regardless of phosphorylation, and other brain proteins such as glial fibrillary acidic protein (GFAP) also increase in blood plasma after trauma, hypothetically mediated by a trauma-induced compromise of the blood-brain barrier that releases proteins preexisting in the extracellular space.
Even if different mechanisms operate in specific disorders, further research is needed to understand what underlies the increase in plasma pTau in AD.

In the search for blood biomarkers of neurodegeneration, it has become evident that, in contrast to CSF, where T-tau is markedly increased in AD, T-tau does not work as a biomarker of AD neurodegeneration in blood. Instead, another axonal protein, neurofilament light (NFL), has been evaluated as a substitute neurodegeneration biomarker, even though it is not involved in AD pathogenesis. Plasma NFL concentrations correlate well with CSF concentrations, supporting the idea that they reflect brain pathophysiology. But high amounts are found in a wide variety of neurodegenerative disorders, so this biomarker lacks specificity. Nevertheless, plasma NFL, which shows a modest increase in AD, predicts both cognitive deterioration and the rate of neurodegeneration as measured by atrophy on brain imaging. Notably, both plasma and CSF NFL concentrations increase in cognitively unimpaired people with autosomal dominant AD 7 years before symptom onset (12), so NFL may be a good biomarker for predicting AD. Another candidate AD blood biomarker is the astrocytic protein GFAP, which is markedly increased in AD. Plasma GFAP distinguishes amyloid PET-positive from PET-negative cognitively normal elderly individuals with high accuracy (13) and may serve as a blood biomarker for glial activation and neuroinflammation.

Despite both rapid and robust reductions in amyloid PET ligand binding after treatment with Aβ immunotherapies (indicative of drug target engagement), effects on cognitive outcomes have been less evident. Therefore, biomarker evidence for downstream effects on reducing tau pathology and neurodegeneration is important to support disease-modifying effects of this class of drugs.
Given that in most clinical trials only a small percentage of enrolled patients undergo repeat lumbar puncture for CSF testing, blood biomarkers could play an important role here. Data from other areas of clinical neuroscience show that children with spinal muscular atrophy have markedly increased CSF NFL, but treatment with the antisense oligonucleotide drug nusinersen produces a successive reduction of NFL concentrations in CSF, with normalization after ∼7 months, and the reduction correlates with clinical improvement (14). Similar, though less pronounced, reductions of plasma NFL are seen with disease-modifying treatments in multiple sclerosis patients. These findings may serve as proof of concept for the usefulness of plasma NFL in identifying downstream drug effects on neurodegeneration. Target engagement for the anti-Aβ drug aducanumab was demonstrated in 2017, with dose-dependent reductions on amyloid PET (15), but to date there are no reports of effects on blood biomarkers of neurodegeneration (or tau pathology) from any Aβ immunotherapy trial.

Current studies of blood AD biomarkers come exclusively from cohorts at highly specialized research centers. Thus, further clinical validation is needed, specifically of the diagnostic accuracy of the AD blood biomarkers, alone or in combination, in consecutive patient populations at memory clinics and in primary care settings. In addition, because plasma pTau increases progressively with brain tau pathology and more advanced clinical stage, more data are needed on the accuracy of plasma pTau biomarkers in identifying individuals with preclinical or early symptoms who will go on to develop AD. Moreover, studies comparing plasma pTau species in the same cohorts and with the same technology are needed to establish whether there are pathophysiological differences across the pTau epitopes.
Current assays are research grade, and full analytical validation of methods is needed to achieve accurate and comparable results between laboratories, along with global efforts to develop certified reference materials for harmonization across assay platforms. Transferring the blood tests to fully automated platforms would also streamline these procedures and help establish them as clinically useful tools. Lastly, to make blood biomarkers attractive substitutes for imaging, their costs need to be substantially lower than those of PET scans.

1. A. Nakamura et al., Nature 554, 249 (2018).
2. A. Keshavan et al., Brain 144, 434 (2021).
3. S. E. Schindler et al., Neurology 93, 17 (2019).
4. N. R. Barthélemy, K. Horie, C. Sato, R. J. Bateman, J. Exp. Med. 217, e20200861 (2020).
5. M. M. Mielke et al., Alzheimers Dement. 14, 989 (2018).
6. T. K. Karikari et al., Lancet Neurol. 19, 422 (2020).
7. S. Palmqvist et al., JAMA 324, 772 (2020).
8. N. J. Ashton et al., Acta Neuropathol. 141, 709 (2021).
9. M. Suárez-Calvet et al., EMBO Mol. Med. 12, e12921 (2020).
10. S. Palmqvist et al., Nat. Med. 27, 1034 (2021).
11. R. Rubenstein et al., JAMA Neurol. 74, 1063 (2017).
12. O. Preische et al., Nat. Med. 25, 277 (2019).
13. P. Chatterjee et al., Transl. Psychiatry 11, 27 (2021).
14. B. Olsson et al., J. Neurol. 266, 2129 (2019).
15. J. Sevigny et al., Nature 546, 564 (2017).

Acknowledgments: K.B. has consulted for Axon, Biogen, Lilly, and Roche Diagnostics and is cofounder of Brain Biomarker Solutions in Gothenburg AB.


Chinese students fight back against visa rejections

Science

When Chen Siyu met a consular official at the U.S. embassy in Beijing in March to review her qualifications for a student visa, “Everything was going well,” she says—or so it seemed. Chen, who has a master's in public health from the University of Hong Kong, had won a fully funded slot in an epidemiology Ph.D. program at the University of Florida. When the consular officer asked about her current employment, Chen explained that she had worked as an epidemiology research assistant at a major hospital for 5 years. She mentioned that the hospital is affiliated with a military medical university. The consular officer thanked Chen for the information and moments later handed her a rejection form letter with “Other: 212(f)” ticked off from among a selection of reasons. The interview was over, as were her dreams of earning a Ph.D. in the United States. Chen is one of a growing group of Chinese students barred from the United States based on 212(f), a clause in the decades-old Immigration and Nationality Act (INA) that allows the U.S. president to identify aliens whose entry would be “detrimental to the interests of the United States.” In May 2020, then-President Donald Trump signed a proclamation that invoked the clause to bar Chinese graduate students and postgraduate researchers with ties to an entity in China “that implements or supports China's ‘military-civil fusion strategy.’” The proclamation exempts those working in fields that don't contribute to that strategy—but apparently epidemiology is not among them. Now, Chen is one of 2500 activists—Chinese students with visa problems and their supporters—who are fighting back against what they see as an arbitrary and discriminatory policy. Armed with a website and a Twitter account, the students have written to more than 50 top U.S. research universities to focus attention on their plight. They are getting a sympathetic hearing in the U.S. 
academic world: A 10 June letter from the American Council on Education to the Department of State warned of “delays in students' academic careers and critical projects.” The group is also discussing legal action with a U.S. immigration lawyer and recently launched a fundraising campaign to try to cover the costs. “We think this is a policy of discrimination based on nationality,” says Hu Desheng, a doctoral candidate in computer science at Northeastern University who got stuck in China because of pandemic-related travel restrictions in early 2020, and whose visa application is now backlogged. Trump's proclamation initially had little impact because the pandemic disrupted academic travel globally. But after more than a year, the U.S. embassy and consulates in China resumed processing routine visa applications on 4 May. Between then and mid-June, more than 500 visa applications were rejected, according to the students' tally. More than 1000 Chinese scholars already in the United States reportedly had their visas revoked by September 2020. Many others hesitate to leave the United States, fearing they won't get back in. How many students will be affected annually is unclear, in part because the U.S. government has not said which Chinese entities are deemed to be supporting the military-civil fusion strategy and which fields of study are considered sensitive or exempt. A study of the measure's potential impact published in February by Georgetown University's Center for Security and Emerging Technology (CSET) assumed the designated entities include 11 universities subject to stringent export control restrictions by the U.S. Department of Commerce, including the so-called Seven Sons of National Defence, schools with historical ties to China's defense establishment. The study also assumed the sensitive fields mentioned in the proclamation will cover all areas of science, technology, engineering, and math (STEM).
If so, it could block 3000 to 5000 of the roughly 19,000 Chinese students who start graduate programs each year, CSET estimated. The report did not cover postdoctoral and visiting researchers, graduates of other universities, or those in non-STEM fields. (The proclamation exempts undergraduate students from scrutiny.) A spokesperson for the State Department declined to name which institutions are blacklisted, but said the sensitive technologies include quantum computing, big data, semiconductors, biotechnology, 5G, advanced nuclear technology, aerospace technology, and artificial intelligence. “By design, the policy is narrowly targeted,” the spokesperson says. But the Chinese students say rejections are broad. Even those intending to study finance, obstetrics and gynecology, water conservation, medicine, agronomy, and other seemingly nonmilitary topics have had visas rejected under clause 212(f), they say. Li Xiang, for example, earned a master's in linguistics from the Harbin Institute of Technology, one of the schools with historical defense ties, then studied at an art school to prepare for a master's program in game development at the Academy of Art University in San Francisco. “To be an artist in the game and film industry is my dream,” she says. Her application was rejected and she was told she is not even eligible for a visa to visit her husband, who is working in the United States. The visa of another student, Xue Shilue, was revoked in the summer of 2020 after she had completed the first year of a master's program in “user experience design” at the University of Texas, Austin. She happened to be in China at the time and can't go back to Austin to complete her degree or even collect her personal belongings. 
The proclamation also appears to target students supported by the China Scholarship Council (CSC), which falls under China's Ministry of Education but has been under scrutiny for supposed links to the defense establishment, according to a separate CSET study. Blacklisting CSC could have dramatic implications. CSET estimates that during the 2017–18 academic year, the council supported 26,000 Chinese scholars in all disciplines in the United States. Huang Yunan, who last year started a Ph.D. program in food science at Cornell University remotely because of the pandemic, was denied a visa after telling a consular officer about her CSC support during a May interview. More than 100 of some 500 CSC-supported members of a chat group she belongs to have recently had visa applications rejected, she says. The students object to the absence of any individual assessment. “There is a presumption of guilt on the part of every Chinese student who has studied at a targeted university,” Hu says. As to the Seven Sons, “We go to those schools because they are top-ranked universities,” Hu says, not because of their military ties. Wendy Wolford, vice provost for International Affairs at Cornell University, asked U.S. Secretary of State Antony Blinken in a 26 May letter to rectify the “capricious, unclear, and excessive” interpretations of the proclamation that are “creating tremendous uncertainty and confusion for international students and their U.S. universities.” (Wolford did not respond to an email asking whether she had heard back from Blinken.) A lawsuit, however, is a long shot, says Charles Kuck, a U.S. immigration lawyer who has advised the students. “The Supreme Court has given a literal carte blanche to the president to use INA 212(f), along with a ‘reasonable’ explanation, for whatever entry ban the president wants to put into place,” Kuck says. The problems are driving some students to pursue advanced degrees elsewhere; Chen, for one, will now get her Ph.D. 
at the University of Hong Kong. Moves like hers should be a bigger worry than the possibility that graduate students are stealing U.S. technology, says Denis Simon, an expert in innovation at Duke University who studies China's research efforts. “The notion of there being a conspiratorial effort [to acquire advanced technology] is just far beyond the reality.” In contrast, he says, slowing the flow of Chinese students will harm the United States, where they help sustain many research programs. “It's a pipeline that has been built over 40 years, and by deconstructing it, we will do some very serious damage to our ability to have the kind of talent needed to drive our innovation system forward.”