Collaborating Authors


DeepMind releases AlphaFold database of nearly all human protein structures


British artificial intelligence giant DeepMind has released a database of nearly all human protein structures that it amassed as part of its AlphaFold program. Last year, the organisers of the biennial Critical Assessment of protein Structure Prediction (CASP) recognised AlphaFold as a solution to the grand challenge of figuring out what shapes proteins fold into. "We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we'd ever get there, is a very special moment." AlphaFold is a major scientific advance that will play a crucial role in helping scientists solve important problems, including the protein misfolding associated with diseases such as Alzheimer's, Parkinson's, cystic fibrosis and Huntington's disease.

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems Artificial Intelligence

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection. Offline reinforcement learning algorithms hold tremendous promise for making it possible to turn large datasets into powerful decision making engines. Effective offline reinforcement learning methods would be able to extract policies with the maximum possible utility out of the available data, thereby allowing automation of a wide range of decision-making domains, from healthcare and education to robotics. However, the limitations of current algorithms make this difficult. We will aim to provide the reader with an understanding of these challenges, particularly in the context of modern deep reinforcement learning methods, and describe some potential solutions that have been explored in recent work to mitigate these challenges, along with recent applications, and a discussion of perspectives on open problems in the field.
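To make the offline setting concrete, here is a minimal sketch of tabular Q-learning run only on a fixed batch of logged transitions, with no further interaction with an environment. It is an illustration under stated assumptions, not the method of the tutorial: the transition format, the function names and the toy dataset are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical logged-transition format: (state, action, reward, next_state, done)
Transition = Tuple[int, int, float, int, bool]


def offline_q_learning(
    dataset: List[Transition],
    n_states: int,
    n_actions: int,
    gamma: float = 0.9,
    lr: float = 0.5,
    sweeps: int = 200,
) -> List[List[float]]:
    """Repeatedly sweep a fixed dataset with the Q-learning update.

    No new data is collected: the learner only ever sees `dataset`.
    """
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s2, done in dataset:
            # Bootstrapped target; terminal transitions use the reward alone.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += lr * (target - q[s][a])
    return q


def greedy_policy(q: List[List[float]]) -> List[int]:
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy two-state dataset (made up for illustration). Every state-action
# pair appears once, so no action is missing from the data here.
toy_dataset: List[Transition] = [
    (0, 0, 0.0, 0, False),  # stay in state 0, no reward
    (0, 1, 0.0, 1, False),  # move to state 1, no reward
    (1, 0, 0.0, 0, False),  # fall back to state 0, no reward
    (1, 1, 1.0, 1, True),   # finish with reward 1
]

q_values = offline_q_learning(toy_dataset, n_states=2, n_actions=2)
policy = greedy_policy(q_values)
```

The toy dataset covers every state-action pair, which real logged data rarely does. This sketch does nothing about actions that are absent from the data; their values simply stay at the initial zero.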

Deep Learning in Mining Biological Data Machine Learning

Recent technological advancements in data acquisition tools have allowed life scientists to acquire multimodal data from different biological application domains. Broadly categorized into three types (i.e., sequences, images, and signals), these data are huge in amount and complex in nature. Mining such an enormous amount of data for pattern recognition is a big challenge and requires sophisticated data-intensive machine learning techniques. Artificial neural network-based learning systems are well known for their pattern recognition capabilities, and lately their deep architectures, known as deep learning (DL), have been successfully applied to solve many complex pattern recognition problems. Highlighting the role of DL in recognizing patterns in biological data, this article provides: applications of DL to biological sequence, image, and signal data; an overview of open access sources of these data; a description of open source DL tools applicable to these data; and a comparison of these tools from qualitative and quantitative perspectives. At the end, it outlines some open research challenges in mining biological data and puts forward a number of possible future perspectives.

Industry News


Find here a listing of the latest industry news in genomics, genetics, precision medicine, and beyond. Updates are provided on a monthly basis. As 2019 came to an end, Veritas Genetics struggled to get funding due to concerns it had previously taken money from China. It was forced to cease US operations and is in talks with potential buyers. The GenomeAsia 100K Project announced its pilot phase with hopes to tackle the underrepresentation of non-Europeans in human genetic studies and enable genetic discoveries across Asia.

Veritas Genetics, the start-up that can sequence a human genome for less than $600, ceases US operations and is in talks with potential buyers: Veritas Genetics ceases US operations but will continue Veritas Europe and Latin America. It had trouble raising funding due to previous China investments and is looking to be acquired.

Illumina loses DNA sequencing patents: The European Patent ...

Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities Machine Learning

New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include myriad properties describing genome, epigenome, transcriptome, microbiome, phenotype, and lifestyle. No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease. Integrative methods that combine data from multiple technologies have thus emerged as critical statistical and computational approaches. The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view. An ideal method can answer a biological or medical question, identifying important features and predicting outcomes, by harnessing heterogeneous data across several dimensions of biological variation. In this Review, we describe the principles of data integration and discuss current methods and available implementations. We provide examples of successful data integration in biology and medicine. Finally, we discuss current challenges in biomedical integrative methods and our perspective on the future development of the field.

Focal onset seizure prediction using convolutional networks Machine Learning

Objective: This work investigates the hypothesis that focal seizures can be predicted using scalp electroencephalogram (EEG) data. Our first aim is to learn features that distinguish between the interictal and preictal regions. The second aim is to define a prediction horizon in which the prediction is as accurate and as early as possible, clearly two competing objectives. Methods: Convolutional filters on the wavelet transformation of the EEG signal are used to define and learn quantitative signatures for each period: interictal, preictal, and ictal. The optimal seizure prediction horizon is also learned from the data as opposed to making an a priori assumption. Results: Computational solutions to the optimization problem indicate a ten-minute seizure prediction horizon. This result is verified by measuring Kullback-Leibler divergence on the distributions of the automatically extracted features. Conclusion: The results on the EEG database of 204 recordings demonstrate that (i) the preictal phase transition occurs approximately ten minutes before seizure onset, and (ii) the prediction results on the test set are promising, with a sensitivity of 87.8% and a low false prediction rate of 0.142 FP/h. Our results significantly outperform a random predictor and other seizure prediction algorithms. Significance: We demonstrate that a robust set of features can be learned from scalp EEG that characterize the preictal state of focal seizures.
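The quantities named in the abstract can be illustrated with a short sketch: a Kullback-Leibler divergence between two discrete feature distributions, plus sensitivity and false predictions per hour. The function names, the smoothing constant `eps` and all numbers below are assumptions made for illustration. They are not taken from the paper, and the paper's exact definitions may differ.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(P || Q) in nats for two discrete distributions over the same bins.

    `eps` is an assumed smoothing constant that avoids division by zero
    when a bin of Q is empty.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0  # bins with zero mass in P contribute nothing
    )


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of seizures that were correctly predicted."""
    return true_positives / (true_positives + false_negatives)


def false_predictions_per_hour(false_positives: int, monitored_hours: float) -> float:
    """False alarms divided by the hours monitored (one common definition)."""
    return false_positives / monitored_hours


# Made-up feature histograms for two periods (normalized to sum to 1).
interictal_hist = [0.7, 0.2, 0.1]
preictal_hist = [0.2, 0.3, 0.5]

separation = kl_divergence(preictal_hist, interictal_hist)

# Made-up counts: 9 of 10 seizures predicted, 3 false alarms in 20 hours.
example_sensitivity = sensitivity(true_positives=9, false_negatives=1)
example_fp_rate = false_predictions_per_hour(false_positives=3, monitored_hours=20.0)
```

A larger divergence between the preictal and interictal histograms means the learned features separate the two periods more clearly. Note that the divergence is asymmetric, so the order of the arguments matters.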