

What's next for AlphaFold and the AI protein-folding revolution


For more than a decade, molecular biologist Martin Beck and his colleagues have been trying to piece together one of the world's hardest jigsaw puzzles: a detailed model of the largest molecular machine in human cells. This behemoth, called the nuclear pore complex, controls the flow of molecules in and out of the nucleus of the cell, where the genome sits. Hundreds of these complexes exist in every cell. Each is made up of more than 1,000 proteins that together form rings around a hole through the nuclear membrane. These 1,000 puzzle pieces are drawn from more than 30 protein building blocks that interlace in myriad ways. Making the puzzle even harder, the experimentally determined 3D shapes of these building blocks are a potpourri of structures gathered from many species, so they don't always mesh together well. And the picture on the puzzle's box -- a low-resolution 3D view of the nuclear pore complex -- lacks sufficient detail to know how many of the pieces precisely fit together. In 2016, a team led by Beck, who is based at the Max Planck Institute of Biophysics (MPIBP) in Frankfurt, Germany, reported a model that covered about 30% of the nuclear pore complex and around half of the 30 building blocks, called Nup proteins.

Protein structure prediction using AlphaFold2


My name is Dima, and here I want to share a small project of mine. It is about applying a deep-learning tool to protein structure prediction. In late December 2021 I was lucky to find an online internship in bioinformatics. It was the NyBerMan Merit Internship from LLBio-IT School, and the main focus was, surprisingly (not), Covid research. After several technical interviews and heavy competition (nearly 1,000 participants for 20 places), I was planning the next weeks of learning and doing.

First Wholly AI-Developed Drug Enters Phase 1 Trials


For several years we have been hearing about the potential of Artificial Intelligence (AI) to improve traditional drug discovery and development. In the last two years, clinical trials have begun. The UK's Exscientia made headlines last April by announcing the start of a Phase 1 clinical trial for a drug it designed using AI for an established protein target. Recursion Pharmaceuticals in Utah uses AI to find new uses for the drugs owned by other companies. Insilico Medicine has now announced the crucial next step: the start of the world's first Phase 1 clinical trial of a drug developed from scratch using AI.

AI for protein folding


The software, which uses an AI technique called deep learning, can predict the shape of proteins to the nearest atom, the first time a computer has matched the slow but accurate techniques used in the lab. Scientific teams around the world have started using it for research on cancer, antibiotic resistance, and covid-19. DeepMind has also set up a public database that it's filling with protein structures as AlphaFold2 predicts them. It currently has around 800,000 entries, and DeepMind says it will add more than 100 million--nearly every protein known to science--in the next year. DeepMind has spun off this work into a company called Isomorphic Labs, which it says will collaborate with existing biotech and pharma companies.

State of AI Ethics Report (Volume 6, February 2022)

This report from the Montreal AI Ethics Institute (MAIEI) covers the most salient progress in research and reporting over the second half of 2021 in the field of AI ethics. Particular emphasis is placed on an "Analysis of the AI Ecosystem", "Privacy", "Bias", "Social Media and Problematic Information", "AI Design and Governance", "Laws and Regulations", "Trends", and other areas covered in the "Outside the Boxes" section. The two AI spotlights feature application pieces on "Constructing and Deconstructing Gender with AI-Generated Art" as well as "Will an Artificial Intellichef be Cooking Your Next Meal at a Michelin Star Restaurant?". Given MAIEI's mission to democratize AI, submissions from external collaborators have featured, such as pieces on the "Challenges of AI Development in Vietnam: Funding, Talent and Ethics" and using "Representation and Imagination for Preventing AI Harms". The report is a comprehensive overview of what the key issues in the field of AI ethics were in 2021, what trends are emergent, what gaps exist, and a peek into what to expect from the field of AI ethics in 2022. It is a resource for researchers and practitioners alike in the field to set their research and development agendas to make contributions to the field of AI ethics.

GENEOnet: A new machine learning paradigm based on Group Equivariant Non-Expansive Operators. An application to protein pocket detection

Explainable machine learning techniques are currently receiving a great deal of attention. Here we introduce a new computational paradigm based on Group Equivariant Non-Expansive Operators, which can be regarded as the product of a rising mathematical theory of information-processing observers. This approach, which can be adjusted to different situations, may have many advantages over other common tools, such as neural networks: knowledge injection and information engineering, selection of relevant features, a small number of parameters and higher transparency. We chose to test our method, called GENEOnet, on a key problem in drug design: detecting pockets on the surface of proteins that can host ligands. Experimental results confirmed that our method works well even with quite a small training set, thus providing a great computational advantage, while the final comparison with other state-of-the-art methods shows that GENEOnet provides better or comparable results in terms of accuracy.

DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations

AI-aided drug discovery (AIDD) is gaining increasing popularity due to its promise of making the search for new pharmaceuticals quicker, cheaper and more efficient. In spite of its extensive use in many fields, such as ADMET prediction, virtual screening, protein folding and generative chemistry, little has been explored in terms of the out-of-distribution (OOD) learning problem with noise, which is inevitable in real-world AIDD applications. In this work, we present DrugOOD, a systematic OOD dataset curator and benchmark for AI-aided drug discovery, which comes with an open-source Python package that fully automates the data curation and OOD benchmarking processes. We focus on one of the most crucial problems in AIDD: drug target binding affinity prediction, which involves both a macromolecule (protein target) and a small molecule (drug compound). In contrast to only providing fixed datasets, DrugOOD offers an automated dataset curator with user-friendly customization scripts, rich domain annotations aligned with biochemistry knowledge, realistic noise annotations and rigorous benchmarking of state-of-the-art OOD algorithms. Since molecular data is often modeled as irregular graphs using graph neural network (GNN) backbones, DrugOOD also serves as a valuable testbed for graph OOD learning problems. Extensive empirical studies have shown a significant performance gap between in-distribution and out-of-distribution experiments, which highlights the need to develop better schemes that can allow for OOD generalization under noise for AIDD.
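To make the in-distribution versus out-of-distribution distinction concrete, here is a minimal sketch of a domain-based split, in which whole domains are held out so that the test set contains no domain seen during training. It does not use the DrugOOD package or its API; the function name `split_by_domain`, the record fields (`scaffold`, `affinity`) and the toy values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def split_by_domain(records, domain_key, test_fraction=0.2):
    """Hold out entire domains so train and test share no domain label.

    Hypothetical helper for illustration; not part of any real package.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[domain_key]].append(rec)

    # Sort domain labels so the split is deterministic.
    domains = sorted(groups)
    n_test = max(1, round(len(domains) * test_fraction))
    test_domains = set(domains[-n_test:])

    train = [r for d in domains if d not in test_domains for r in groups[d]]
    test = [r for d in domains if d in test_domains for r in groups[d]]
    return train, test


# Toy records: the scaffold labels and affinity values are made up.
records = [
    {"scaffold": "A", "affinity": 6.1},
    {"scaffold": "A", "affinity": 5.8},
    {"scaffold": "B", "affinity": 7.2},
    {"scaffold": "C", "affinity": 4.9},
    {"scaffold": "D", "affinity": 6.6},
    {"scaffold": "E", "affinity": 5.3},
]

train, test = split_by_domain(records, "scaffold")
train_domains = {r["scaffold"] for r in train}
test_domains = {r["scaffold"] for r in test}
```

A random split would instead scatter each scaffold across both sets, which is the in-distribution setting. The performance gap mentioned in the abstract is the difference between evaluating on these two kinds of test set.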

Deep Learning Based DNA and RNA Binding Sites Prediction for Accelerated Drug Discovery - CBIRT


Scientists from Skoltech's iMolecule group have created an artificial intelligence-driven approach to identify sites on the structures of DNA or RNA molecules where drug compounds may bind. The drug-binding site information will allow pharmaceutical firms to find novel therapeutic compounds – including antiviral agents – in a far more focused manner. The new method, which was published in NAR Genomics and Bioinformatics, is claimed to be more accurate than previous methods because it considers how a nucleic acid molecule's shape affects which binding sites are accessible. Most drugs target proteins because pharmacologists have traditionally seen RNA as just a mediator between DNA and the functional proteins it encodes. Yet although almost 85% of the genome is transcribed into RNA, only a tiny percentage of those RNAs encode proteins.

How Artificial Intelligence is set to evolve in 2022? - ELE Times


Machines are getting smarter every year, but artificial intelligence has yet to live up to the hype generated by some of the world's largest technology companies. Artificial intelligence can excel at specific narrow tasks such as playing chess, but it struggles to do more than one thing well. A seven-year-old has far broader intelligence than any of today's AI systems, for example. "AI algorithms are good at approaching individual tasks, or tasks that include a small degree of variability," said Edward Grefenstette, a research scientist at Meta AI, formerly Facebook AI Research. "However, the real world encompasses the significant potential for change, a dynamic which we are bad at capturing within our training algorithms, yielding brittle intelligence," he added.

A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part II: Applications, Cognitive Models, and Challenges

This is Part II of the two-part comprehensive survey devoted to a computing framework most commonly known under the names Hyperdimensional Computing and Vector Symbolic Architectures (HDC/VSA). Both names refer to a family of computational models that use high-dimensional distributed representations and rely on the algebraic properties of their key operations to incorporate the advantages of structured symbolic representations and vector distributed representations. Holographic Reduced Representations is an influential HDC/VSA model that is well known in the machine learning domain and often used to refer to the whole family. However, for the sake of consistency, we use HDC/VSA to refer to the area. Part I of this survey covered foundational aspects of the area, such as the historical context leading to the development of HDC/VSA, key elements of any HDC/VSA model, known HDC/VSA models, and the transformation of input data of various types into high-dimensional vectors suitable for HDC/VSA. This second part surveys existing applications, the role of HDC/VSA in cognitive computing and architectures, as well as directions for future work. Most of the applications lie within the machine learning/artificial intelligence domain; however, we also cover other applications to provide a thorough picture. The survey is written to be useful for both newcomers and practitioners.
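As a rough illustration of the "key operations" the abstract refers to, the sketch below uses random bipolar vectors with elementwise multiplication as binding and a majority vote as bundling. This is one common HDC/VSA variant, not the specific model of any surveyed work; the dimensionality, the role and filler names, and the helper functions are assumptions made for the example.

```python
import random

DIM = 10_000  # assumed dimensionality; high enough that random vectors are nearly orthogonal


def random_hv(rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise multiplication (self-inverse for bipolar vectors)."""
    return [x * y for x, y in zip(a, b)]


def bundle(*vectors):
    """Bundling: elementwise majority vote (ties, possible only for an even count, go to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]


def similarity(a, b):
    """Normalized dot product, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
color, shape, size = random_hv(rng), random_hv(rng), random_hv(rng)
red, circle, large = random_hv(rng), random_hv(rng), random_hv(rng)

# Encode the structured record {color: red, shape: circle, size: large} as one vector.
record = bundle(bind(color, red), bind(shape, circle), bind(size, large))

# Query the record: unbinding with the role vector yields a noisy copy of its filler.
recovered = bind(record, color)
sim_red = similarity(recovered, red)
sim_circle = similarity(recovered, circle)
```

Here `sim_red` comes out clearly positive while `sim_circle` stays close to zero. That is how a symbolic lookup ("what is the color?") is answered with plain vector arithmetic, followed by a nearest-neighbour search over the known fillers.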