

Transmission heterogeneities, kinetics, and controllability of SARS-CoV-2


A minority of people infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmit most infections. How does this happen? Sun et al. reconstructed transmission in Hunan, China, up to April 2020. Such detailed data can be used to separate the contribution of control measures aimed at isolating individuals from that of population-level distancing measures. The authors found that most secondary transmissions could be traced back to a minority of infected individuals, and that well over half of transmission occurred in the presymptomatic phase. Furthermore, the duration of exposure to an infected person, combined with the closeness and number of household contacts, constituted the greatest risks for transmission, particularly when lockdown conditions prevailed. These findings could help in the design of infection control policies that have the potential to minimize both virus transmission and economic strain. Science , this issue p. [eabe2424][1]

### INTRODUCTION

The role of transmission heterogeneities in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) dynamics remains unclear, particularly those heterogeneities driven by demography, behavior, and interventions. To understand individual heterogeneities and their effect on disease control, we analyze detailed contact-tracing data from Hunan, a province in China adjacent to Hubei and one of the first regions to experience a SARS-CoV-2 outbreak in January to March 2020. The Hunan outbreak was swiftly brought under control by March 2020 through a combination of nonpharmaceutical interventions, including population-level mobility restriction (i.e., lockdown), traveler screening, case isolation, contact tracing, and quarantine. In parallel, highly detailed epidemiological information on SARS-CoV-2–infected individuals and their close contacts was collected by the Hunan Provincial Center for Disease Control and Prevention.
### RATIONALE

Contact-tracing data provide information to reconstruct transmission chains and understand outbreak dynamics. These data can in turn generate valuable intelligence on key epidemiological parameters and risk factors for transmission, which paves the way for more-targeted and cost-effective interventions.

### RESULTS

On the basis of epidemiological information and exposure diaries on 1178 SARS-CoV-2–infected individuals and their 15,648 close contacts, we developed a series of statistical and computational models to stochastically reconstruct transmission chains, identify risk factors for transmission, and infer the infectiousness profile over the course of a typical infection. We observe overdispersion in the distribution of secondary infections, with 80% of secondary cases traced back to 15% of infections, which indicates substantial transmission heterogeneities. We find that SARS-CoV-2 transmission risk scales positively with the duration of exposure and the closeness of social interactions, with the highest per-contact risk estimated in the household. Lockdown interventions increase transmission risk in families and households, whereas the timely isolation of infected individuals reduces risk across all types of contacts. There is a gradient of increasing susceptibility with age but no significant difference in infectivity by age or clinical severity. Early isolation of SARS-CoV-2–infected individuals drastically alters transmission kinetics, leading to shorter generation and serial intervals and a higher fraction of presymptomatic transmission. After adjusting for the censoring effects of isolation, we find that the infectiousness profile of a typical SARS-CoV-2 patient peaks just before symptom onset, with 53% of transmission occurring in the presymptomatic phase in an uncontrolled setting.
We then use these results to evaluate the effectiveness of individual-based strategies (case isolation and contact quarantine), both alone and in combination with population-level contact reductions. We find that a plausible parameter space for SARS-CoV-2 control is restricted to scenarios in which interventions are synergistically combined, owing to the particular transmission kinetics of this virus.

### CONCLUSION

There is considerable heterogeneity in SARS-CoV-2 transmission, owing to individual differences in biology and contacts, that is modulated by the effects of interventions. We estimate that about half of secondary transmission events occur in the presymptomatic phase of a primary case in uncontrolled outbreaks. Achieving epidemic control requires that isolation and contact-tracing interventions be layered with population-level approaches, such as mask wearing, increased teleworking, and restrictions on large gatherings. Our study also demonstrates the value of conducting high-quality contact-tracing investigations to advance our understanding of the transmission dynamics of an emerging pathogen.

Figure: Transmission chains, contact patterns, and transmission kinetics of SARS-CoV-2 in Hunan, China, based on case and contact-tracing data. (Top left) One realization of the reconstructed transmission chains, with a histogram representing overdispersion in the distribution of secondary infections. (Top right) Contact matrices of community, social, extended-family, and household contacts reveal distinct age profiles. (Bottom) Earlier isolation of primary infections shortens the generation and serial intervals while increasing the relative contribution of transmission in the presymptomatic phase.

A long-standing question in infectious disease dynamics concerns the role of transmission heterogeneities, which are driven by demography, behavior, and interventions.
On the basis of detailed patient and contact-tracing data in Hunan, China, we find that 80% of secondary infections traced back to 15% of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) primary infections, which indicates substantial transmission heterogeneities. Transmission risk scales positively with the duration of exposure and the closeness of social interactions and is modulated by demographic and clinical factors. The lockdown period increased transmission risk within families and households, whereas isolation and quarantine reduced risks across all types of contacts. The reconstructed infectiousness profile of a typical SARS-CoV-2 patient peaks just before symptom presentation. Modeling indicates that SARS-CoV-2 control requires the synergistic efforts of case isolation, contact quarantine, and population-level interventions because of the specific transmission kinetics of this virus.

[1]: /lookup/doi/10.1126/science.abe2424
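The overdispersion reported above (80% of secondary infections traced back to 15% of primary cases) can be illustrated with a short simulation. The sketch below draws secondary-case counts from a negative binomial offspring distribution and computes what share of primary cases accounts for 80% of all transmission; the mean `R` and dispersion `k` are illustrative assumptions, not the paper's fitted values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed, not the paper's estimates):
# mean number of secondary cases R, and dispersion k.
# Smaller k means more heterogeneity (more superspreading).
R, k = 1.2, 0.3

# NumPy parameterizes the negative binomial as (n, p) with
# n = k and p = k / (k + mean).
secondary = rng.negative_binomial(k, k / (k + R), size=100_000)

# Rank primary cases by how many secondary cases they caused,
# then find the smallest top fraction responsible for 80% of them.
counts = np.sort(secondary)[::-1]
cum = np.cumsum(counts) / counts.sum()
frac = (np.argmax(cum >= 0.8) + 1) / len(counts)
print(f"~{frac:.0%} of primary cases cause 80% of transmission")
```

With strong overdispersion (small `k`), the printed fraction lands in the vicinity of the 15% figure quoted in the abstract; with `k` large, transmission is spread much more evenly.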

Code Adam Gradient Descent Optimization From Scratch


Gradient descent is an optimization algorithm that follows the negative gradient of an objective function in order to locate the minimum of that function. A limitation of gradient descent is that a single step size (learning rate) is used for all input variables. Extensions to gradient descent such as AdaGrad and RMSProp give each input variable its own step size, but that step size may rapidly decay to very small values. The Adaptive Moment Estimation algorithm, or Adam for short, is an extension to gradient descent and a natural successor to techniques like AdaGrad and RMSProp: it automatically adapts a learning rate for each input variable of the objective function and further smooths the search by using an exponentially decaying moving average of the gradient to update the variables. In this tutorial, you will discover how to develop gradient descent with the Adam optimization algorithm from scratch.
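The update rule the blurb describes can be sketched in a few lines of NumPy. This is a minimal, self-contained implementation of Adam with the standard default hyperparameters; the toy quadratic objective is an illustrative assumption, not part of the tutorial.

```python
import numpy as np

def adam(grad, x0, alpha=0.02, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimal Adam: per-variable step sizes derived from moving
    averages of the gradient (m) and of the squared gradient (v)."""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)  # first moment (mean of gradients)
    v = np.zeros_like(x)  # second moment (uncentered variance)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for zero init
        v_hat = v / (1 - beta2 ** t)
        x = x - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy objective f(x, y) = x**2 + 5 * y**2, minimum at the origin.
grad_f = lambda p: np.array([2 * p[0], 10 * p[1]])
x_min = adam(grad_f, [3.0, -2.0])
print(x_min)  # close to [0, 0]
```

Note how the effective step size per variable is `alpha * m_hat / sqrt(v_hat)`: variables with persistently large squared gradients take smaller steps, which is the per-variable adaptation the paragraph refers to.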

2020 in Review: 10 Nonfiction Books AI Experts Enjoyed Reading


The long-anticipated revision of Artificial Intelligence: A Modern Approach explores the full breadth and depth of the field of artificial intelligence (AI). The 4th Edition brings readers up to date on the latest technologies, presents concepts in a more unified manner, and offers new or expanded coverage of machine learning, deep learning, transfer learning, multiagent systems, robotics, natural language processing, causality, probabilistic programming, privacy, fairness, and safe AI.

A learning perspective on the emergence of abstractions: the curious case of phonemes Machine Learning

In the present paper we use a range of modeling techniques to investigate whether an abstract phone could emerge from exposure to speech sounds. We test two opposing principles regarding the development of language knowledge in linguistically untrained language users: Memory-Based Learning (MBL) and Error-Correction Learning (ECL). A process of generalization underlies the abstractions linguists operate with, and we probed whether MBL and ECL could give rise to a type of language knowledge that resembles linguistic abstractions. Each model was presented with a significant amount of pre-processed speech produced by one speaker. We assessed the consistency or stability of what the models have learned and their ability to give rise to abstract categories. The two types of models fare differently on these tests. We show that ECL models can learn abstractions and that at least part of the phone inventory can be reliably identified from the input.
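The Error-Correction Learning principle tested in the abstract is commonly implemented as a delta-rule (Rescorla-Wagner / Widrow-Hoff style) update, in which cue-outcome associations are strengthened in proportion to prediction error. The sketch below shows that idea on made-up toy data; the cue features and phone labels are hypothetical, not the paper's speech input.

```python
import numpy as np

# Hypothetical toy data: each row marks which acoustic cues (1/0)
# are present in a speech token; each token belongs to one "phone".
cues = np.array([[1, 1, 0],   # token of phone A
                 [1, 0, 1],   # token of phone B
                 [1, 1, 0],
                 [1, 0, 1]])
outcomes = np.array([0, 1, 0, 1])  # phone label per token

n_cues, n_phones = cues.shape[1], 2
W = np.zeros((n_cues, n_phones))   # cue-to-phone association weights
lr = 0.1
for _ in range(200):               # repeated exposure to the tokens
    for x, y in zip(cues, outcomes):
        target = np.eye(n_phones)[y]
        pred = x @ W
        # Delta rule: adjust weights in proportion to the error.
        W += lr * np.outer(x, target - pred)

print(W.round(2))  # learned association strengths
```

After training, the shared cue (column 0) carries little discriminative weight while the distinctive cues dominate, which is the kind of error-driven cue competition that distinguishes ECL from purely memory-based storage.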

Inference for High-dimensional Maximin Effects in Heterogeneous Regression Models Using a Sampling Approach Machine Learning

Heterogeneity is an important feature of modern data sets, and a central task is to extract information from large-scale and heterogeneous data. In this paper, we consider multiple high-dimensional linear models and adopt the definition of the maximin effect (Meinshausen, B{\"u}hlmann, AoS, 43(4), 1801--1830) to summarize the information contained in this heterogeneous model. We define the maximin effect for a targeted population whose covariate distribution is possibly different from that of the observed data. We further introduce a ridge-type maximin effect to simultaneously account for reward optimality and statistical stability. To identify the high-dimensional maximin effect, we estimate the regression covariance matrix by a debiased estimator and use it to construct the aggregation weights for the maximin effect. A main challenge for statistical inference is that the estimated weights might have a mixture distribution, and the resulting maximin effect estimator is therefore not necessarily asymptotically normal. To address this, we devise a novel sampling approach to construct the confidence interval for any linear contrast of high-dimensional maximin effects. The coverage and precision properties of the proposed confidence interval are studied. The proposed method is demonstrated over simulations and a genetic data set on yeast colony growth under different environments.

Learning Graphons via Structured Gromov-Wasserstein Barycenters Machine Learning

We propose a novel and principled method to learn a nonparametric graph model called a graphon, which is defined in an infinite-dimensional space and represents arbitrary-size graphs. Based on the weak regularity lemma from the theory of graphons, we leverage a step function to approximate a graphon. We show that the cut distance of graphons can be relaxed to the Gromov-Wasserstein distance of their step functions. Accordingly, given a set of graphs generated by an underlying graphon, we learn the corresponding step function as the Gromov-Wasserstein barycenter of the given graphs. Furthermore, we develop several enhancements and extensions of the basic algorithm, e.g., the smoothed Gromov-Wasserstein barycenter for guaranteeing the continuity of the learned graphons and the mixed Gromov-Wasserstein barycenters for learning multiple structured graphons. The proposed approach overcomes drawbacks of prior state-of-the-art methods, and outperforms them on both synthetic and real-world data. The code is available at
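The step-function approximation that the weak regularity lemma licenses can be illustrated without the paper's Gromov-Wasserstein machinery. The hedged sketch below uses the simplest empirical variant: sort nodes by degree, partition them into blocks, and average edge density within each block pair. This is only a toy illustration of the step-function idea, not the barycenter algorithm the paper proposes; the stochastic-block-model graph is an assumed example.

```python
import numpy as np

def step_function_estimate(A, n_blocks=2):
    """Estimate a step-function graphon from one adjacency matrix:
    degree-sort the nodes, split them into equal blocks, and take
    the mean edge density within each pair of blocks."""
    order = np.argsort(-A.sum(axis=1))          # degree-sorted nodes
    A = A[np.ix_(order, order)]
    blocks = np.array_split(np.arange(len(A)), n_blocks)
    S = np.zeros((n_blocks, n_blocks))
    for i, bi in enumerate(blocks):
        for j, bj in enumerate(blocks):
            S[i, j] = A[np.ix_(bi, bj)].mean()
    return S

# Toy graph from a two-block stochastic block model with a dense
# block (0.7), a sparse block (0.3), and weak cross-links (0.1).
rng = np.random.default_rng(1)
n = 200
z = rng.integers(0, 2, n)
P = np.where(z[:, None] == z[None, :],
             np.where(z[:, None] == 0, 0.7, 0.3), 0.1)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                     # symmetric, no self-loops
print(step_function_estimate(A).round(2))
```

On such input the recovered step function shows the planted block densities, which is exactly the piecewise-constant structure a learned graphon's step-function approximation encodes.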

Are we Forgetting about Compositional Optimisers in Bayesian Optimisation? Machine Learning

Bayesian optimisation presents a sample-efficient methodology for global optimisation. Within this framework, a crucial performance-determining subroutine is the maximisation of the acquisition function, a task complicated by the fact that acquisition functions tend to be non-convex and thus nontrivial to optimise. In this paper, we undertake a comprehensive empirical study of approaches to maximise the acquisition function. Additionally, by deriving novel, yet mathematically equivalent, compositional forms for popular acquisition functions, we recast the maximisation task as a compositional optimisation problem, allowing us to benefit from the extensive literature in this field. We highlight the empirical advantages of the compositional approach to acquisition function maximisation across 3958 individual experiments comprising synthetic optimisation tasks as well as tasks from Bayesmark. Given the generality of the acquisition function maximisation subroutine, we posit that the adoption of compositional optimisers has the potential to yield performance improvements across all domains in which Bayesian optimisation is currently being applied.
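The acquisition-maximisation subroutine the abstract discusses can be made concrete with Expected Improvement (EI), one of the standard acquisition functions. The sketch below evaluates the closed-form EI under a Gaussian posterior and maximises it over a candidate grid; the posterior mean and standard deviation are hypothetical stand-ins for a fitted surrogate model, and this is grid search, not the compositional optimisers the paper studies.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best, xi=0.01):
    """Closed-form EI for minimisation under a Gaussian posterior:
    EI(x) = (best - mu - xi) * Phi(z) + sigma * phi(z)."""
    imp = best - mu - xi
    z = np.where(sigma > 0, imp / sigma, 0.0)
    ei = imp * norm.cdf(z) + sigma * norm.pdf(z)
    return np.where(sigma > 0, ei, 0.0)

# Hypothetical GP posterior over [0, 1] (stand-in for a surrogate).
x = np.linspace(0, 1, 1001)
mu = np.sin(6 * x)          # assumed posterior mean
sigma = 0.2 + 0.3 * x       # assumed posterior standard deviation
best = mu.min()             # incumbent objective value

ei = expected_improvement(mu, sigma, best)
x_next = x[np.argmax(ei)]   # next point to evaluate
print(x_next)
```

Because EI trades off the predicted improvement (`imp`) against uncertainty (`sigma`), it is non-convex even for this smooth posterior, which is what makes the maximisation step nontrivial and motivates the study above.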

A connection between the pattern classification problem and the General Linear Model for statistical inference Machine Learning

A connection between the General Linear Model (GLM) in combination with classical statistical inference and the machine learning (MLE)-based inference is described in this paper. First, the estimation of the GLM parameters is expressed as a Linear Regression Model (LRM) of an indicator matrix, that is, in terms of the inverse problem of regressing the observations. In other words, the two approaches, GLM and LRM, apply to different domains, the observation and the label domains, and are linked by a normalization value at the least-squares solution. Subsequently, from this relationship we derive a statistical test based on a more refined predictive algorithm, the (non)linear Support Vector Machine (SVM), which maximizes the margin of class separation, within a permutation analysis. The MLE-based inference employs a residual score and includes an upper bound to compute a better estimate of the actual (real) error. Experimental results demonstrate how the parameter estimates derived from each model result in different classification performances in the equivalent inverse problem. Moreover, using real data, the aforementioned predictive algorithms within permutation tests, including such model-free estimators, are able to provide a good trade-off between type I error and statistical power.
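The "linear regression of an indicator matrix" the abstract refers to is a classical construction: one-hot encode the class labels, solve an ordinary least-squares problem for them, and classify by the largest fitted indicator value. The sketch below demonstrates it on synthetic two-class data; the data and dimensions are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: Gaussian clouds centered at -1 and +1.
n = 100
X = np.vstack([rng.normal(-1, 1, (n, 2)), rng.normal(1, 1, (n, 2))])
y = np.repeat([0, 1], n)

# Linear regression of an indicator (one-hot) matrix:
# solve the least-squares problem X_aug @ B = Y for B.
Y = np.eye(2)[y]
X_aug = np.hstack([np.ones((2 * n, 1)), X])   # intercept column
B, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)

# Classify by the largest fitted indicator value.
pred = (X_aug @ B).argmax(axis=1)
print(f"training accuracy: {(pred == y).mean():.2f}")
```

In this two-class case the indicator regression recovers a linear decision rule closely related to linear discriminant analysis, which is the "inverse problem" direction (regressing labels on observations) that the paper contrasts with the GLM's observation-domain formulation.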

Acceleration in Hyperbolic and Spherical Spaces Machine Learning

We further the study of the acceleration phenomenon on Riemannian manifolds by introducing the first global first-order method that achieves the same rates as accelerated gradient descent in Euclidean space for the optimization of smooth and geodesically convex (g-convex) or strongly g-convex functions defined on the hyperbolic space or a subset of the sphere, up to constants and log factors. To the best of our knowledge, this is the first method proved to achieve these rates globally for functions defined on a Riemannian manifold $\mathcal{M}$ other than Euclidean space. Additionally, for any Riemannian manifold of bounded sectional curvature, we provide reductions from optimization methods for smooth and g-convex functions to methods for smooth and strongly g-convex functions, and vice versa. As a proxy, we solve a constrained non-convex Euclidean problem under a condition between convexity and quasar-convexity.

Quality-Diversity Optimization: a novel branch of stochastic optimization Machine Learning

Traditional optimization algorithms search for a single global optimum that maximizes (or minimizes) the objective function. Multimodal optimization algorithms search for the highest peaks in the search space, of which there can be more than one. Quality-Diversity algorithms are a recent addition to the evolutionary computation toolbox that do not search for a single set of local optima but instead try to illuminate the search space. In effect, they provide a holistic view of how high-performing solutions are distributed throughout a search space. The main differences from multimodal optimization algorithms are that (1) Quality-Diversity typically works in the behavioral space (or feature space), not in the genotypic (or parameter) space, and (2) Quality-Diversity attempts to fill the whole behavior space, even where a niche is not a peak in the fitness landscape. In this chapter, we provide a gentle introduction to Quality-Diversity optimization, discuss the main representative algorithms, and review the main topics currently under consideration in the community. Throughout the chapter, we also discuss several successful applications of Quality-Diversity algorithms, including deep learning, robotics, and reinforcement learning.
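One of the representative Quality-Diversity algorithms, MAP-Elites, makes the "illuminate the search space" idea concrete: it maintains a grid archive over the behavior space and keeps the best solution found in each cell. The sketch below is a minimal toy version; the objective, behavior descriptor, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(x):
    """Quality: objective to maximise (toy sphere function)."""
    return -np.sum(x ** 2)

def behavior(x):
    """Behavior descriptor in [0, 1]^2 (here: first two coordinates)."""
    return np.clip((x[:2] + 2) / 4, 0, 1)

# MAP-Elites: a grid over behavior space, one elite per cell.
bins = 10
archive = {}                  # cell index -> (fitness, solution)
for _ in range(5000):
    if archive and rng.random() < 0.9:
        # Select a random elite and mutate it.
        key = list(archive)[rng.integers(len(archive))]
        x = archive[key][1] + rng.normal(0, 0.2, 4)
    else:
        x = rng.uniform(-2, 2, 4)                 # random bootstrap
    cell = tuple((behavior(x) * bins).astype(int).clip(0, bins - 1))
    f = fitness(x)
    if cell not in archive or f > archive[cell][0]:
        archive[cell] = (f, x)                    # keep the cell's best

print(f"filled {len(archive)} / {bins * bins} cells")
```

Note the contrast with a standard optimizer: instead of one best solution, the result is an archive of high-performing solutions spread across the behavior grid, including cells whose best fitness is far from the global optimum.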