

The language of a virus


Uncovering connections between seemingly unrelated branches of science might accelerate research in one branch by using the methods developed in the other branch as stepping stones. On page 284 of this issue, Hie et al. (1) provide an elegant example of such unexpected connections. The authors have uncovered a parallel between the properties of a virus and its interpretation by the host immune system and the properties of a sentence in natural language and its interpretation by a human. By leveraging an extensive natural language processing (NLP) toolbox (2, 3) developed over the years, they have come up with a powerful new method for the identification of mutations that allow a virus to escape from recognition by neutralizing antibodies.
In 1950, Alan Turing predicted that machines will eventually compete with men in “intellectual fields” and suggested that one possible way forward would be to build a machine that can be taught to understand and speak English (4). This was, and still is, an ambitious goal. It is clear that language grammar can provide a formal skeleton for building sentences, but how can machines be trained to infer the meanings?
In natural language, there are many ways to express the same idea, and yet small changes in expression can often change the meaning. Linguistics developed a way of quantifying the similarity of meaning (semantics). Specifically, it was proposed that words that are used in the same context are likely to have similar meanings (5, 6). This distributional hypothesis became a key feature for the computational technique in NLP, known as word (semantic) embedding. The main idea is to characterize words as vectors that represent distributional properties in a large amount of language data and then embed these sparse, high-dimensional vectors into more manageable, low-dimensional space in a distance-preserving manner.
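As a rough illustration of the distributional hypothesis, the sketch below builds sparse co-occurrence vectors from a tiny made-up corpus and compares words by cosine similarity. The corpus, the window size, and the function names are assumptions chosen for illustration, and the step that embeds the vectors into a low-dimensional space is omitted.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (an assumption for illustration, not real language data).
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
]
vecs = cooccurrence_vectors(corpus)

# "cat" and "dog" occur in near-identical contexts; "cat" and "milk" do not.
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_milk = cosine(vecs["cat"], vecs["milk"])
print(f"cat~dog: {sim_cat_dog:.3f}, cat~milk: {sim_cat_milk:.3f}")
```

In this toy corpus, "cat" and "dog" share almost all of their contexts and so score as far more similar than "cat" and "milk", even though "cat" and "milk" appear in the same sentence.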
By the distributional hypothesis, this technique should group words that have similar semantics together in the embedding space.
Hie et al. proposed that viruses can also be thought to have a grammar and semantics. Intuitively, the grammar describes which sequences make specific viruses (or their parts). Biologically, a viral protein sequence should have all the properties needed to invade a host, multiply, and continue invading another host. Thus, in some way, the grammar represents the fitness of a virus. With enough data, current machine learning approaches can be used to learn this sequence-based fitness function.
Figure: Predicting immune escape. The constrained semantic change search algorithm obtains semantic embeddings of all mutated protein sequences using bidirectional long short-term memory (LSTM). The sequences are ranked according to the combined score of the semantic change (the distance of a mutation from the original sequence) and fitness (the probability that a mutation appears in viral sequences). GRAPHIC: V. ALTOUNIAN/SCIENCE
But what would be the meaning (semantics) of a virus? Hie et al. suggested that the semantics of a virus should be defined in terms of its recognition by immune systems. Specifically, viruses with different semantics would require a different state of the immune system (for example, different antibodies) to be recognized. The authors hypothesized that semantic embeddings allow sequences that require different immune responses to be uncovered. In this context, words represent protein sequences (or protein fragments), and recognition of such protein fragments is the task performed by the immune system. To escape immune responses, viral genomes can become mutated so that the virus evolves to no longer be recognized by the immune system. However, a virus that acquires a mutation that compromises its function (and thus fitness) will not survive.
Using the NLP analogy, immune escape will be achieved by the mutations that change the semantics of the virus while maintaining its grammaticality so that the virus will remain infectious but escape the immune system. On the basis of this idea, Hie et al. developed a new approach, called constrained semantic change search (CSCS). Computationally, the goal of CSCS is to identify mutations that confer high fitness and substantial semantic changes at the same time (see the figure). The immune escape scores are computed by combining the two quantities. The search algorithm builds on a powerful deep learning technique for language modeling, called long short-term memory (LSTM), to obtain semantic embeddings of all mutated sequences and rank the sequences according to their immune escape scores in the embedded space. The semantic change of a mutated sequence corresponds to its distance from the original sequence in the semantic embedding, and its “grammaticality” (or fitness) is estimated by the probability that the mutation appears in viral sequences. The immune escape scores can then be computed by simultaneously considering both the semantic distance and fitness probability.
Hie et al. confirmed their hypothesis for the correspondence of grammaticality and semantics to fitness and immune response in three viral proteins: influenza A hemagglutinin (HA), HIV-1 envelope (Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike. For the analogy of semantics to immune response, they found that clusters of semantically similar viruses were in good correspondence with virus subtypes, host, or both, confirming that the language model can extract functional meanings from protein sequences. The clustering patterns also revealed interspecies transmissibility and antigenic similarity.
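The ranking step of CSCS described above can be sketched as follows. The text says only that semantic change and fitness are combined, so this sketch assumes a rank-sum with a weight `beta`, and an L1 distance in embedding space. The mutation names, embeddings, and probabilities are hypothetical placeholders, not model outputs.

```python
def l1_distance(u, v):
    """L1 distance between two embedding vectors (the metric is an assumption)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def ranks(values):
    """Ascending ranks: the smallest value gets 1, the largest gets len(values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def cscs_rank(wild_type_embedding, candidates, beta=1.0):
    """Rank candidates by rank(semantic change) + beta * rank(fitness).

    `candidates` is a list of (name, embedding, probability) tuples.
    Returns (name, score) pairs, highest escape score first.
    """
    change = [l1_distance(wild_type_embedding, emb) for _, emb, _ in candidates]
    fitness = [prob for _, _, prob in candidates]
    change_rank, fitness_rank = ranks(change), ranks(fitness)
    scored = [
        (name, change_rank[i] + beta * fitness_rank[i])
        for i, (name, _, _) in enumerate(candidates)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical mutations with made-up embeddings and probabilities.
wild_type = [0.0, 0.0]
candidates = [
    ("A10V", [0.1, 0.0], 0.30),     # plausible, but semantically close to wild type
    ("K45E", [0.9, 0.8], 0.35),     # plausible and semantically distant
    ("G77W", [1.5, 1.2], 0.001),    # distant, but very unlikely to be viable
    ("S12T", [0.02, 0.03], 0.01),   # neither distant nor likely
]
ranking = cscs_rank(wild_type, candidates)
for name, score in ranking:
    print(name, score)
```

Ranks are used here rather than raw values so that the two quantities, which live on different scales, contribute comparably. Other ways of combining them would fit the description in the text equally well.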
The correspondence of grammaticality to fitness was assessed more directly by using deep mutational scans evaluated for replication fitness (for HA and Env) or binding (for Spike). The combined model was tested against experimentally verified mutations that allow for immune escape. Scoring each amino acid residue with CSCS, the authors uncovered viral protein regions that are significantly enriched with escape potential: the head of HA for influenza, the V1/V2 hypervariable regions for HIV Env, and the receptor-binding domain (RBD) and amino-terminal domain for SARS-CoV-2 Spike.
The language of viral evolution and escape proposed by Hie et al. provides a powerful framework for predicting mutations that lead to viral escape. However, interesting questions remain. Further extending the natural language analogy, it is notable that individuals can interpret the same English sentence differently depending on their past experience and their fluency in the language. Similarly, immune response differs between individuals depending on factors such as past pathogenic exposures and overall “strength” of the immune system. It will be interesting to see whether the proposed approach can be adapted to provide a “personalized” view of the language of virus evolution.
1. B. Hie, E. Zhong, B. Berger, B. Bryson, Science 371, 284 (2021).
2. Y. LeCun, Y. Bengio, G. Hinton, Nature 521, 436 (2015).
3. T. Young, D. Hazarika, S. Poria, E. Cambria, IEEE Comput. Intell. Mag. 13, 55 (2018).
4. A. Turing, Mind LIX, 433 (1950).
5. Z. S. Harris, Word 10, 146 (1954).
6. J. R. Firth, in Studies in Linguistic Analysis (1957), pp. 1–32.
Acknowledgments: The authors are supported by the Intramural Research Programs of the National Library of Medicine at the National Institutes of Health, USA.

What's coming up at IJCAI-PRICAI 2020?


IJCAI-PRICAI 2020, the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence, starts today and will run until 15 January. Find out what's happening during the event. The conference schedule includes tutorials, workshops, invited talks and technical sessions. There are also competitions, early career spotlight talks, panel discussions and social events. There will be eight invited talks on a wide variety of topics.

Top 7 NLP Trends To Look Forward To In 2021


Natural language processing (NLP), first studied in the 1950s, is one of the most dynamic and exciting fields of artificial intelligence. With the rise of technologies such as chatbots, voice assistants, and translators, NLP has continued to show some very encouraging developments. In this article, we attempt to predict what NLP trends will look like in the near future, in 2021. A large amount of data is generated at every moment on social media. This creates the peculiar problem of making sense of all the information generated, which cannot possibly be done manually.

Top 100 Artificial Intelligence Companies in the World


Artificial Intelligence (AI) is not just a buzzword, but a crucial part of the technology landscape. AI is changing every industry and business function, which results in increased interest in its applications, subdomains and related fields. This makes AI companies the top leaders driving the technology shift. AI helps us to optimise and automate crucial business processes, gather essential data and transform the world, one step at a time. From Google and Amazon to Apple and Microsoft, every major tech company is dedicating resources to breakthroughs in artificial intelligence. As big enterprises are busy acquiring or merging with other emerging inventions, small AI companies are also working hard to develop their own intelligent technology and services. By leveraging artificial intelligence, organizations get an innovative edge in the digital age. AI consultancies are also working to provide companies with expertise that can help them grow. In this digital era, AI is also a significant area for investment. AI companies are constantly developing the latest products to provide the simplest solutions. Hence, Analytics Insight brings you the list of top 100 AI companies that are leading the technology drive towards a better tomorrow. AEye develops advanced vision hardware, software, and algorithms that act as the eyes and visual cortex of autonomous vehicles. AEye is an artificial perception pioneer and creator of iDAR, a new form of intelligent data collection that acts as the eyes and visual cortex of autonomous vehicles. Since demonstrating its solid-state LiDAR scanner in 2013, AEye has pioneered breakthroughs in intelligent sensing. Its mission was to acquire the most information with the fewest ones and zeros. This would allow AEye to drive the automotive industry into the next realm of autonomy. Algorithmia invented the AI Layer.

XAI-P-T: A Brief Review of Explainable Artificial Intelligence from Practice to Theory Artificial Intelligence

In this work, we report the practical and theoretical aspects of Explainable AI (XAI) identified in some fundamental literature. Although there is a vast body of work on the background of XAI, most of it focuses on a single, discrete direction of thought. Providing insights into the literature on practice and theory concurrently is still a gap in this field. This is important because such a connection facilitates the learning process for early-stage XAI researchers and gives experienced XAI scholars a clear vantage point. Accordingly, we first focus on the categories of black-box explanation and give a practical example. Later, we discuss how explanation has been theoretically grounded in a body of multidisciplinary fields. Finally, some directions for future work are presented.

Predicting Events In MOBA Games: Dataset, Attribution, and Evaluation Artificial Intelligence

Multiplayer online battle arena (MOBA) games have become increasingly popular in recent years. Consequently, many efforts have been devoted to providing pre-game or in-game predictions for MOBA games. However, these works are limited in the following two aspects: 1) the lack of sufficient in-game features; 2) the absence of interpretability in the prediction results. These two limitations greatly restrict their practical performance and industrial applications. In this work, we collect and release a large-scale dataset containing rich in-game features for the popular MOBA game Honor of Kings. We then propose to predict four types of important events in an interpretable way by attributing the predictions to the input features using two gradient-based attribution methods: Integrated Gradients and SmoothGrad. To evaluate the explanatory power of different models and attribution methods, a fidelity-based evaluation metric is further proposed. Finally, we evaluate the accuracy and fidelity of several competitive methods on the collected dataset to assess how well machines predict the events in MOBA games.
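For readers unfamiliar with Integrated Gradients, one of the two attribution methods named above, the following is a minimal sketch on a toy logistic model. The model, its weights, the all-zero baseline, and the numerical gradient are all assumptions made for illustration and are unrelated to the paper's models or dataset.

```python
from math import exp


def model(x):
    """Toy stand-in for a trained predictor: logistic regression with fixed weights."""
    weights, bias = [2.0, -1.0, 0.5], 0.1
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + exp(-z))


def gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        down = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def integrated_gradients(f, x, baseline, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients.

    Averages the gradient along the straight path from `baseline` to `x`
    and scales each component by (x_i - baseline_i).
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(f, point)):
            total[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference input
attributions = integrated_gradients(model, x, baseline)

# Completeness: the attributions should sum to f(x) - f(baseline).
completeness_gap = abs(sum(attributions) - (model(x) - model(baseline)))
print(attributions, completeness_gap)
```

The completeness check at the end is a convenient sanity test for any implementation: if the attributions do not sum to the change in model output, the path integral is being approximated too coarsely.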

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning Artificial Intelligence

The attention mechanism is becoming increasingly popular in Natural Language Processing (NLP) applications, showing performance superior to that of convolutional and recurrent architectures. However, general-purpose platforms such as CPUs and GPUs are inefficient when performing attention inference due to complicated data movement and low arithmetic intensity. Moreover, existing NN accelerators mainly focus on optimizing convolutional or recurrent models, and cannot efficiently support attention. In this paper, we present SpAtten, an efficient algorithm-architecture co-design that leverages token sparsity, head sparsity, and quantization opportunities to reduce the attention computation and memory access. Inspired by the high redundancy of human languages, we propose the novel cascade token pruning to prune away unimportant tokens in the sentence. We also propose cascade head pruning to remove unessential heads. Cascade pruning is fundamentally different from weight pruning since there is no trainable weight in the attention mechanism, and the pruned tokens and heads are selected on the fly. To efficiently support them on hardware, we design a novel top-k engine to rank token and head importance scores with high throughput. Furthermore, we propose progressive quantization that first fetches MSBs only and performs the computation; if the confidence is low, it fetches LSBs and recomputes the attention outputs, trading computation for memory reduction. Extensive experiments on 30 benchmarks show that, on average, SpAtten reduces DRAM access by 10.0x with no accuracy loss, and achieves 1.6x, 3.0x, 162x, 347x speedup, and 1.4x, 3.2x, 1193x, 4059x energy savings over A3 accelerator, MNNFast accelerator, TITAN Xp GPU, Xeon CPU, respectively.
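The idea of cascade token pruning can be sketched in software as below. The abstract does not say how token importance is computed, so this sketch assumes it is the attention probability a token receives, accumulated across layers. The attention matrices and keep counts are made-up values, and in a real model later layers would compute attention only over the surviving tokens.

```python
def cascade_token_pruning(attention_per_layer, keep_counts):
    """Return the indices of tokens that survive pruning after every layer.

    attention_per_layer: one square matrix per layer, attn[q][k] being the
        attention probability from query token q to key token k.
    keep_counts: number of tokens to keep after each layer.
    A token pruned at one layer stays pruned in all later layers (the cascade).
    """
    n = len(attention_per_layer[0])
    alive = list(range(n))
    importance = [0.0] * n  # cumulative attention received (assumed metric)
    for attn, keep in zip(attention_per_layer, keep_counts):
        # Only surviving tokens take part in this layer's attention.
        for q in alive:
            for k in alive:
                importance[k] += attn[q][k]
        # Top-k selection over the surviving tokens, decided on the fly.
        top = sorted(alive, key=lambda t: importance[t], reverse=True)[:keep]
        alive = sorted(top)
    return alive


# Hypothetical attention probabilities for a 4-token sentence and 2 layers.
layer1 = [
    [0.4, 0.10, 0.4, 0.10],
    [0.3, 0.20, 0.4, 0.10],
    [0.3, 0.10, 0.5, 0.10],
    [0.4, 0.05, 0.3, 0.25],
]
layer2 = [
    [0.20, 0.10, 0.30, 0.40],
    [0.25, 0.25, 0.25, 0.25],
    [0.10, 0.10, 0.40, 0.40],
    [0.10, 0.10, 0.30, 0.50],
]
kept = cascade_token_pruning([layer1, layer2], keep_counts=[3, 2])
print(kept)
```

Token 1 receives the least attention in the first layer and is dropped there. Token 0 survives the first layer but is dropped in the second, once its cumulative score falls behind that of tokens 2 and 3.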

Unsupervised Learning of Discourse Structures using a Tree Autoencoder Artificial Intelligence

Discourse information, as postulated by popular discourse theories, such as RST and PDTB, has been shown to improve an increasing number of downstream NLP tasks, showing positive effects and synergies of discourse with important real-world applications. While methods for incorporating discourse become more and more sophisticated, the growing need for robust and general discourse structures has not been sufficiently met by current discourse parsers, usually trained on small-scale datasets in a strictly limited number of domains. This makes the prediction for arbitrary tasks noisy and unreliable. The overall resulting lack of high-quality, high-quantity discourse trees poses a severe limitation to further progress. In order to alleviate this shortcoming, we propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective. The proposed approach can be applied to any tree-structured objective, such as syntactic parsing, discourse parsing and others. However, due to the especially difficult annotation process to generate discourse trees, we initially develop a method to generate larger and more diverse discourse treebanks. In this paper, we infer general tree structures of natural text in multiple domains, showing promising results on a diverse set of tasks.

Continual Lifelong Learning in Natural Language Processing: A Survey Artificial Intelligence

Continual learning (CL) aims to enable information systems to learn from a continuous data stream across time. However, it is difficult for existing deep learning architectures to learn a new task without largely forgetting previously acquired knowledge. Furthermore, CL is particularly challenging for language learning, as natural language is ambiguous: it is discrete, compositional, and its meaning is context-dependent. In this work, we look at the problem of CL through the lens of various NLP tasks. Our survey discusses major challenges in CL and current methods applied in neural network models. We also provide a critical review of the existing CL evaluation methods and datasets in NLP.

Multi-type Disentanglement without Adversarial Training Artificial Intelligence

Controlling the style of natural language by disentangling the latent space is an important step towards interpretable machine learning. After the latent space is disentangled, the style of a sentence can be transformed by tuning the style representation without affecting other features of the sentence. Previous works usually use adversarial training to guarantee that disentangled vectors do not affect each other. However, adversarial methods are difficult to train, especially when there are multiple features (e.g., sentiment or tense, which we call style types in this paper), because each feature requires a separate discriminator for extracting a disentangled style vector corresponding to that feature. In this paper, we propose a unified distribution-controlling method, which provides each specific style value (the value of style types, e.g., positive sentiment, or past tense) with a unique representation. This method provides a solid theoretical basis for avoiding adversarial training in multi-type disentanglement. We also propose multiple loss functions to achieve a style-content disentanglement as well as a disentanglement among multiple style types. In addition, we observe that if two different style types always have some specific style values that occur together in the dataset, they will affect each other when transferring the style values. We call this phenomenon training bias, and we propose a loss function to alleviate such training bias while disentangling multiple types. We conduct experiments on two datasets (Yelp service reviews and Amazon product reviews) to evaluate the style-disentangling effect and the unsupervised style transfer performance on two style types: sentiment and tense. The experimental results show the effectiveness of our model.