

Autonomous Discovery of Unknown Reaction Pathways from Data by Chemical Reaction Neural Network

arXiv.org Machine Learning

The inference of chemical reaction networks is an important task in understanding the chemical processes in life sciences and the environment. Yet, only a few reaction systems are well understood, because many of the important reaction pathways involved remain unknown. Revealing unknown reaction pathways is a key task in scientific discovery, one that can take decades and requires substantial expert knowledge. This work presents a neural network approach for discovering unknown reaction pathways from concentration time series data. The neural network, denoted the Chemical Reaction Neural Network (CRNN), is designed to be equivalent to chemical reaction networks by following two fundamental physical laws: the Law of Mass Action and the Arrhenius Law. The CRNN is physically interpretable, and its weights correspond to the reaction pathways and rate constants of the chemical reaction network. Inferring the reaction pathways and rate constants is then accomplished by training the equivalent CRNN via stochastic gradient descent. The approach precludes the need for expert knowledge in proposing candidate reactions, so the inference is autonomous and applicable to new systems for which there is no existing empirical knowledge from which to propose reaction pathways. The physical interpretability also makes the CRNN capable not only of fitting the data for a given system but also of developing knowledge of unknown pathways that can be generalized to similar chemical systems. Finally, the approach is applied to several chemical systems in chemical engineering and biochemistry to demonstrate its robustness and generality.
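As a hedged illustration of the mass-action/Arrhenius parameterization described above, the sketch below encodes reaction rates as exp(ln k + reactant stoichiometry · ln C); the species and reaction counts, initialization, and any training loop are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a CRNN-style layer (assumed shapes and names, not the authors' code).
# Mass action: rate_j = k_j * prod_i C_i^(nu_in[j, i])  ->  ln(rate) = ln(k) + nu_in @ ln(C)
import torch
import torch.nn as nn

class CRNNLayer(nn.Module):
    def __init__(self, n_species: int, n_reactions: int):
        super().__init__()
        # Input weights play the role of reactant stoichiometric coefficients (reaction orders).
        self.nu_in = nn.Parameter(torch.rand(n_reactions, n_species))
        # Output weights play the role of net stoichiometric coefficients (sign = produce/consume).
        self.nu_out = nn.Parameter(torch.randn(n_species, n_reactions))
        # Bias plays the role of ln(rate constant), e.g. the Arrhenius term ln(A) - Ea/(R*T).
        self.ln_k = nn.Parameter(torch.zeros(n_reactions))

    def forward(self, conc: torch.Tensor) -> torch.Tensor:
        # conc: (batch, n_species) concentrations; clamp to keep the logarithm finite.
        ln_c = torch.log(conc.clamp_min(1e-12))
        ln_rate = ln_c @ self.nu_in.t() + self.ln_k   # (batch, n_reactions)
        rate = torch.exp(ln_rate)
        return rate @ self.nu_out.t()                 # dC/dt, shape (batch, n_species)

# Training would fit dC/dt estimated from the concentration time series (or integrate the ODE)
# with stochastic gradient descent; the learned nu_in, nu_out, and ln_k are then read off as
# reaction pathways and rate constants.
```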


How to Teach Artificial Intelligence - Getting Smart

#artificialintelligence

Artificial intelligence--code that learns--is likely to be humankind's most important invention. It's a 60-year-old idea that took off five years ago when fast chips enabled massive computing and sensors, cameras, and robots fed data-hungry algorithms. We're a couple of years into a new age where machine learning (a functional subset of AI), big data and enabling technologies are transforming every sector. In every sector, there is a big data set behind every question. Every field is computational: healthcare, manufacturing, law, finance and accounting, retail, and real estate.


The Synthesizability of Molecules Proposed by Generative Models

arXiv.org Machine Learning

The discovery of functional molecules is an expensive and time-consuming process, exemplified by the rising costs of small molecule therapeutic discovery. One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization, catalyzed by the development of new deep learning approaches. These techniques can suggest novel molecular structures intended to maximize a multi-objective function, e.g., suitability as a therapeutic against a particular target, without relying on brute-force exploration of a chemical space. However, the utility of these approaches is stymied by ignorance of synthesizability. To highlight the severity of this issue, we use a data-driven computer-aided synthesis planning program to quantify how often molecules proposed by state-of-the-art generative models cannot be readily synthesized. Our analysis demonstrates that there are several tasks for which these models generate unrealistic molecular structures despite performing well on popular quantitative benchmarks. Synthetic complexity heuristics can successfully bias generation toward synthetically-tractable chemical space, although doing so necessarily detracts from the primary objective. This analysis suggests that to improve the utility of these models in real discovery workflows, new algorithm development is warranted.
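One widely used synthetic-complexity heuristic of the kind mentioned above is the RDKit SA score; the sketch below shows how generated SMILES might be re-scored with it. It assumes RDKit is installed, and the penalty weight and rejection behavior are arbitrary choices; the paper's synthesis-planning analysis and exact objective are not reproduced here.

```python
# Illustrative re-scoring of generated SMILES with a synthetic-accessibility heuristic.
import os
import sys

from rdkit import Chem
from rdkit.Chem import RDConfig

# The SA score implementation ships in RDKit's contrib directory.
sys.path.append(os.path.join(RDConfig.RDContribDir, "SA_Score"))
import sascorer  # noqa: E402

def penalized_objective(smiles: str, property_score: float, weight: float = 0.5) -> float:
    """Combine a model's property score with an SA-score penalty (1 = easy ... 10 = hard)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return float("-inf")  # unparsable structures are rejected outright
    sa = sascorer.calculateScore(mol)
    return property_score - weight * sa

# Hypothetical candidates from a generative model, scored with a placeholder property value.
candidates = ["CCO", "c1ccccc1C(=O)NC2CC2"]
print([penalized_objective(s, property_score=1.0) for s in candidates])
```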


Is Analytics-driven Innovation the Ultimate Oxymoron?

#artificialintelligence

Sometimes it just takes a simple, provocative statement to kick off the innovation process: removing an everyday given like driving a car, possessing a landline phone, or centralizing all of your data in the cloud. Henrik Christensen, director of UC San Diego's Contextual Robotics Institute, issued such a provocative statement: "My own prediction is that kids born today will never get to drive a car." I have recently been promoted to Chief Innovation Officer at Hitachi Vantara. I am very excited about the opportunity to build upon my work to interweave data science, design thinking, value engineering and economics to create a "Pathway to Analytics-driven Innovation" map that helps organizations derive and drive new sources of customer, product and operational value. Think of the "Pathway to Analytics-driven Innovation" as a maturity model that measures how effective organizations are at leveraging analytics to deliver innovative products and services to the market.


Reducing the Computational Burden of Deep Learning with Recursive Local Representation Alignment

arXiv.org Machine Learning

Training deep neural networks on large-scale datasets requires significant hardware resources whose costs (even on cloud platforms) put them out of reach of smaller organizations, groups, and individuals. Backpropagation (backprop), the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. Furthermore, it requires researchers to continually develop various tricks, such as specialized weight initializations and activation functions, to ensure stable parameter optimization. Our goal is to seek an effective, parallelizable alternative to backprop that can be used to train deep networks. In this paper, we propose a gradient-free learning procedure, recursive local representation alignment, for training large-scale neural architectures. Experiments with deep residual networks on CIFAR-10 and the massive-scale benchmark ImageNet show that our algorithm generalizes as well as backprop while converging sooner due to weight updates that are parallelizable and computationally less demanding. This is empirical evidence that a backprop-free algorithm can scale up to larger datasets. We also significantly reduce the total parameter count of our networks by using fast, fixed noise maps in place of convolutional operations, without compromising generalization.
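The paper's exact recursive procedure is not reproduced here, but the general flavor of local representation alignment (layer-wise targets derived from error fed back through fixed matrices, followed by purely local, parallelizable weight updates) can be sketched as follows; the layer sizes, feedback scheme, and update rule are assumptions for illustration only.

```python
# Schematic of layer-local representation alignment (illustrative, not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 256, 128, 10]                                                  # assumed layer sizes
W = [rng.normal(0, 0.05, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]    # forward weights
E = [rng.normal(0, 0.05, (n, m)) for n, m in zip(sizes[1:-1], sizes[2:])]   # fixed feedback weights

def f(x):
    return np.tanh(x)  # simple nonlinearity

def step(x, y_onehot, lr=0.01, beta=0.1):
    # Forward pass, caching activations.
    acts = [x]
    for Wl in W:
        acts.append(f(Wl @ acts[-1]))
    # Top-layer error, then local targets for each hidden layer via the fixed feedback matrices.
    errors = [None] * len(W)
    errors[-1] = acts[-1] - y_onehot
    for l in range(len(W) - 2, -1, -1):
        target = acts[l + 1] - beta * (E[l] @ errors[l + 1])  # nudge activation toward a target
        errors[l] = acts[l + 1] - target
    # Local weight updates: outer product of each layer's error and its own input.
    # Because each update uses only local quantities, the layers can be updated in parallel.
    for l in range(len(W)):
        W[l] -= lr * np.outer(errors[l], acts[l])

x = rng.random(784)
y = np.zeros(10); y[3] = 1.0
step(x, y)
```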


Improved inter-scanner MS lesion segmentation by adversarial training on longitudinal data

arXiv.org Machine Learning

The evaluation of white matter lesion progression is an important biomarker in the follow-up of MS patients and plays a crucial role when deciding the course of treatment. Current automated lesion segmentation algorithms are susceptible to variability in image characteristics related to MRI scanner or protocol differences. We propose a model that improves the consistency of MS lesion segmentations in inter-scanner studies. First, we train a CNN base model to approximate the performance of icobrain, an FDA-approved, clinically available lesion segmentation software package. A discriminator model is then trained to predict whether two lesion segmentations are based on scans acquired using the same scanner type, achieving 78% accuracy on this task. Finally, the base model and the discriminator are trained adversarially on multi-scanner longitudinal data to improve the inter-scanner consistency of the base model. The performance of the models is evaluated on an unseen dataset containing manual delineations. The inter-scanner variability is evaluated on test-retest data, where the adversarial network produces improved results over the base model and the FDA-approved solution.
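A minimal sketch of the adversarial-consistency idea follows, assuming a PyTorch segmentation model and a discriminator that takes a pair of segmentations; the network definitions, loss weights, and data handling are placeholders rather than the study's implementation.

```python
# Minimal sketch of adversarial training for scanner-invariant segmentations (assumptions only).
import torch
import torch.nn.functional as F

def train_step(seg_model, discriminator, opt_seg, opt_disc,
               scan_a, scan_b, target_a, same_scanner, adv_weight=0.1):
    """scan_a/scan_b: longitudinal scans of one patient on (possibly) different scanners.
    same_scanner: float tensor, 1 if both scans share a scanner type, else 0."""
    # 1) Discriminator: predict whether two segmentations come from the same scanner type.
    opt_disc.zero_grad()
    with torch.no_grad():
        seg_a, seg_b = seg_model(scan_a), seg_model(scan_b)
    d_logit = discriminator(torch.cat([seg_a, seg_b], dim=1))
    d_loss = F.binary_cross_entropy_with_logits(d_logit, same_scanner)
    d_loss.backward()
    opt_disc.step()

    # 2) Segmentation model: fit the reference labels while fooling the discriminator,
    #    encouraging segmentations that look scanner-independent.
    opt_seg.zero_grad()
    seg_a, seg_b = seg_model(scan_a), seg_model(scan_b)
    seg_loss = F.binary_cross_entropy_with_logits(seg_a, target_a)
    d_logit = discriminator(torch.cat([seg_a, seg_b], dim=1))
    adv_loss = F.binary_cross_entropy_with_logits(d_logit, torch.ones_like(d_logit))
    (seg_loss + adv_weight * adv_loss).backward()
    opt_seg.step()
    return seg_loss.item(), d_loss.item()
```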


Google publishes the largest synapse-resolution map of brain connectivity - Tech Explorist

#artificialintelligence

The connectivity between brain cells plays a significant role in the function of the brain. In general, brain regions and their interactions can be modeled as a complex brain network, which describes the highly efficient information transmission in the brain. To study brain networks in detail, neuroscientists use various neuroimaging techniques. Last year, in collaboration with Janelia Research Campus and Cambridge University, Google published a study presenting the automated reconstruction of an entire fruit fly brain. That study mainly focused on the individual shapes of the cells.


Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes

arXiv.org Machine Learning

Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals. In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationships among comorbid conditions. Learning such temporal patterns from event data is crucial for understanding disease pathology and predicting prognoses. To this end, we develop deep diffusion processes (DDP) to model "dynamic comorbidity networks", i.e., the temporal relationships between comorbid disease onsets expressed through a dynamic graph. A DDP comprises events modelled as a multi-dimensional point process, with an intensity function parameterized by the edges of a dynamic weighted graph. The graph structure is modulated by a neural network that maps patient history to edge weights, enabling rich temporal representations for disease trajectories. The DDP parameters decouple into clinically meaningful components, allowing the model to serve the dual purpose of accurate risk prediction and intelligible representation of disease pathology. We illustrate these features in experiments using cancer registry data.
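As a rough illustration of an intensity function parameterized by a history-dependent weighted graph (not the paper's exact DDP formulation), one might write the following; the module names, sizes, and softplus link are assumptions.

```python
# Illustrative history-modulated intensity for a multivariate point process (one dim per disease).
import torch
import torch.nn as nn

class DynamicComorbidityIntensity(nn.Module):
    def __init__(self, n_diseases: int, hidden: int = 64):
        super().__init__()
        # Maps an encoding of the patient's event history to a full edge-weight matrix.
        self.edge_net = nn.Sequential(
            nn.Linear(n_diseases, hidden), nn.ReLU(),
            nn.Linear(hidden, n_diseases * n_diseases),
        )
        self.base_rate = nn.Parameter(torch.zeros(n_diseases))
        self.n = n_diseases

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (batch, n_diseases), e.g. indicators or counts of past disease onsets.
        edges = self.edge_net(history).view(-1, self.n, self.n)       # dynamic weighted graph
        # Intensity of each disease = baseline + influence from already-onset comorbidities.
        excitation = torch.bmm(edges, history.unsqueeze(-1)).squeeze(-1)
        return nn.functional.softplus(self.base_rate + excitation)    # nonnegative rates
```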


Facebook Open-Sources PySlowFast Codebase for Video Understanding

#artificialintelligence

Facebook AI Research (FAIR) has been contributing heavily to video understanding research in recent years. At ICCV 2019 in October, the team unveiled a Python-based codebase, PySlowFast. FAIR has now open-sourced PySlowFast, along with a pretrained model library and a pledge to continue adding cutting-edge resources to the project. The name "PySlowFast" derives from a novel duality: the model has both a slow pathway that operates at a low frame rate to capture spatial semantics, and a lightweight, fast pathway that operates at a high frame rate, captures motion at fine temporal resolution, and can learn useful temporal information for video recognition. The introduction of PySlowFast addresses a couple of needs for ML researchers.
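The slow/fast duality amounts to feeding two clips of the same video at different temporal resolutions. The toy sketch below illustrates only the sampling idea and is not PySlowFast's actual API; the frame-rate ratio (alpha = 8) and clip shape are assumptions.

```python
# Toy illustration of SlowFast-style frame sampling (not PySlowFast's API).
import torch

def slowfast_clips(video: torch.Tensor, alpha: int = 8):
    """video: (channels, frames, height, width); alpha = fast/slow frame-rate ratio."""
    fast = video              # every frame: fine temporal resolution for motion
    slow = video[:, ::alpha]  # every alpha-th frame: spatial semantics at low frame rate
    return slow, fast

clip = torch.randn(3, 64, 224, 224)   # assumed clip shape
slow, fast = slowfast_clips(clip)
print(slow.shape, fast.shape)          # (3, 8, 224, 224) and (3, 64, 224, 224)
```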