California Institute of Technology
Phase Mapper: Accelerating Materials Discovery with AI
Bai, Junwen (Cornell University) | Xue, Yexiang (Cornell University) | Bjorck, Johan (Cornell University) | Bras, Ronan Le (Cornell University) | Rappazzo, Brendan (Cornell University) | Bernstein, Richard (Cornell University) | Suram, Santosh K. (California Institute of Technology) | Dover, Robert Bruce van (Cornell University) | Gregoire, John M. (California Institute of Technology) | Gomes, Carla P. (Cornell University)
From the Stone Age through the Bronze and Iron Ages to the modern Silicon Age, the discovery and characterization of new materials has always been instrumental to humanity's progress and development. With the current pressing need to address sustainability challenges and find alternatives to fossil fuels, we look for solutions in the development of new materials that will enable renewable energy. To discover materials with the required properties, materials scientists can perform high-throughput materials discovery, which includes rapid synthesis and characterization via X-ray diffraction (XRD) of thousands of materials. A central problem in materials discovery, the phase map identification problem, involves the determination of the crystal structure of materials from materials composition and structural characterization data. This analysis is traditionally performed mainly by hand, which can take days for a single material system. In this work we present Phase-Mapper, a solution platform that tightly integrates XRD experimentation, AI problem solving, and human intelligence for interpreting XRD patterns and inferring the crystal structures of the underlying materials. Phase-Mapper is compatible with any spectral demixing algorithm, including our novel solver, AgileFD, which is based on convolutive non-negative matrix factorization. AgileFD allows materials scientists to rapidly interpret XRD patterns, and incorporates constraints to capture prior knowledge about the physics of the materials as well as human feedback. With our system, materials scientists have been able to interpret previously unsolvable systems of XRD data at the Department of Energy's Joint Center for Artificial Photosynthesis, including the Nb-Mn-V oxide system, which led to the discovery of new solar light absorbers and is provided as an illustrative example of AI-enabled high-throughput materials discovery.
Mars Target Encyclopedia: Rock and Soil Composition Extracted From the Literature
Wagstaff, Kiri L. (California Institute of Technology) | Francis, Raymond (California Institute of Technology) | Gowda, Thamme (California Institute of Technology) | Lu, You (Information Sciences Institute, University of Southern California ) | Riloff, Ellen (California Institute of Technology) | Singh, Karanjeet (University of Utah) | Lanza, Nina L. (California Institute of Technology)
We have constructed an information extraction system called the Mars Target Encyclopedia that takes in planetary science publications and extracts scientific knowledge about target compositions. The extracted knowledge is stored in a searchable database that can greatly accelerate the ability of scientists to compare new discoveries with what is already known. To date, we have applied this system to ~6000 documents and achieved 41-56% precision in the extracted information.
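The extraction step described above can be sketched as a toy co-occurrence matcher. This is only an illustrative stand-in: the actual Mars Target Encyclopedia uses trained NLP components, and the lexicons and sentence below are hypothetical examples, not the system's real data.

```python
import re

# Toy sketch of the relation-extraction idea: find (target, component)
# pairs that co-occur in one sentence. Lexicons here are illustrative.
KNOWN_TARGETS = {"Rocknest", "Jake_M", "Bathurst_Inlet"}
KNOWN_COMPONENTS = {"olivine", "hematite", "feldspar", "iron"}

def extract_pairs(sentence):
    """Return (target, component) pairs mentioned together in a sentence."""
    words = re.findall(r"[A-Za-z_]+", sentence)
    targets = [w for w in words if w in KNOWN_TARGETS]
    comps = [w.lower() for w in words if w.lower() in KNOWN_COMPONENTS]
    return [(t, c) for t in targets for c in comps]

pairs = extract_pairs("ChemCam data suggest that Rocknest contains olivine and hematite.")
print(pairs)
```

A real pipeline would add named-entity recognition and relation classification on top of this skeleton, which is where the reported 41-56% precision comes into play.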
Deep Mars: CNN Classification of Mars Imagery for the PDS Imaging Atlas
Wagstaff, Kiri L. (California Institute of Technology) | Lu, You (California Institute of Technology) | Stanboli, Alice (California Institute of Technology) | Grimes, Kevin (California Institute of Technology) | Gowda, Thamme (California Institute of Technology) | Padams, Jordan (Information Sciences Institute, University of Southern California)
NASA has acquired more than 22 million images from the planet Mars. To help users find images of interest, we developed a content-based search capability for Mars rover surface images and Mars orbital images. We started with the AlexNet convolutional neural network, which was trained on Earth images, and used transfer learning to adapt the network for use with Mars images. We report on our deployment of these classifiers within the PDS Imaging Atlas, a publicly accessible web interface, to enable the first content-based image search for NASA's Mars images.
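The transfer-learning recipe in this abstract (keep a pretrained feature extractor fixed, retrain only a classification head) can be sketched with stand-ins. Here a fixed random ReLU projection plays the role of the frozen AlexNet backbone and synthetic vectors play the role of Mars images; every name and dimension is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen pretrained backbone (AlexNet in the paper):
# a fixed random ReLU projection that is never updated during adaptation.
W_backbone = rng.standard_normal((8, 64))
def features(x):
    return np.maximum(x @ W_backbone, 0.0)

# Synthetic stand-in for labeled target-domain images (8-D inputs, 2 classes).
X = rng.standard_normal((400, 8))
y = (X[:, 0] > 0).astype(float)

# Transfer learning here = training only a new linear head by
# logistic-regression gradient descent; backbone weights stay untouched.
F = features(X)
w = np.zeros(64)
b = 0.0
for _ in range(2000):
    z = np.clip(F @ w + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))   # predicted probabilities
    g = p - y                      # gradient of the logistic loss
    w -= 0.1 * (F.T @ g) / len(y)
    b -= 0.1 * g.mean()

acc = ((F @ w + b > 0) == (y > 0.5)).mean()
print(f"head-only training accuracy: {acc:.2f}")
```

In practice one would fine-tune the later convolutional layers as well when enough Mars labels are available; the head-only variant shown is the cheapest form of adaptation.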
Safe Exploration and Optimization of Constrained MDPs Using Gaussian Processes
Wachi, Akifumi (University of Tokyo) | Sui, Yanan (California Institute of Technology) | Yue, Yisong (California Institute of Technology) | Ono, Masahiro (California Institute of Technology)
We present a reinforcement learning approach to exploring and optimizing a safety-constrained Markov Decision Process (MDP). In this setting, the agent must maximize discounted cumulative reward while keeping the probability of entering unsafe states, defined via a safety function, within some tolerance. The safety values of all states are not known a priori, and we model them probabilistically via a Gaussian Process (GP) prior. As such, behaving properly in such an environment requires balancing a three-way trade-off between exploring the safety function, exploring the reward function, and exploiting acquired knowledge to maximize reward. We propose a novel approach to balancing this trade-off. Specifically, our approach explores unvisited states selectively; that is, it prioritizes the exploration of a state if visiting that state significantly improves the knowledge of the achievable cumulative reward. Our approach relies on a novel information gain criterion based on Gaussian Process representations of the reward and safety functions. We demonstrate the effectiveness of our approach on a range of experiments, including a simulation using real Martian terrain data.
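The core mechanism (a GP posterior over the safety function, with a high-probability safe set given by a lower confidence bound) can be sketched in one dimension. This is a minimal sketch, not the paper's algorithm: the kernel, threshold, and toy safety function are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# States on a 1-D grid; the true safety function is unknown to the agent.
X = np.linspace(0, 10, 101)
def safety(x):
    return np.sin(x) + 1.0          # a state is safe where safety(x) >= h

h = 0.5                              # safety threshold
obs_x = np.array([1.0, 1.5, 2.0])    # states visited so far (safe seed set)
obs_y = safety(obs_x)                # observed safety values

# GP posterior with an RBF kernel; the certified safe set is where the
# lower confidence bound mu - beta*sigma clears the threshold.
def rbf(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

K = rbf(obs_x, obs_x) + 1e-6 * np.eye(len(obs_x))
k_star = rbf(X, obs_x)
mu = k_star @ np.linalg.solve(K, obs_y)
var = 1.0 - np.einsum('ij,ij->i', k_star @ np.linalg.inv(K), k_star)
sigma = np.sqrt(np.maximum(var, 0.0))

beta = 2.0
safe = mu - beta * sigma >= h        # conservative high-probability safe set
print(f"{safe.sum()} of {len(X)} states certified safe")
```

The paper's contribution sits on top of this: among states in the certified safe set, it chooses which to visit next using an information gain criterion over both the reward and safety GPs.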
A Parallelizable Acceleration Framework for Packing Linear Programs
London, Palma (California Institute of Technology) | Vardi, Shai (California Institute of Technology) | Wierman, Adam (California Institute of Technology) | Yi, Hanling (The Chinese University of Hong Kong)
This paper presents an acceleration framework for packing linear programming problems where the amount of data available is limited, i.e., where the number of constraints m is small compared to the variable dimension n. The framework can be used as a black box to speed up linear programming solvers dramatically, by two orders of magnitude in our experiments. We present worst-case guarantees on the quality of the solution and the speedup provided by the algorithm, showing that the framework provides an approximately optimal solution while running the original solver on a much smaller problem. The framework can be used to accelerate exact solvers, approximate solvers, and parallel/distributed solvers. Further, it can be used for both linear programs and integer linear programs.
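The sample-then-extend idea can be illustrated on the simplest packing LP, a fractional knapsack (m = 1). This is only a sketch of the core intuition under that assumption, not the paper's general framework: solve a small sampled subproblem to estimate the optimal dual price, then use that price as a threshold to shrink the full problem before solving it.

```python
import random

random.seed(0)

# Packing LP with one constraint (fractional knapsack):
#   max c.x  s.t.  w.x <= B,  0 <= x_i <= 1.
n, B = 10000, 500.0
c = [random.random() for _ in range(n)]
w = [random.random() + 0.1 for _ in range(n)]

def solve(idx, budget):
    """Exact greedy LP solver restricted to variables idx; also returns
    the density threshold at which the budget runs out (the dual price)."""
    val, price = 0.0, 0.0
    for i in sorted(idx, key=lambda i: c[i] / w[i], reverse=True):
        if budget <= 1e-12:
            break
        take = min(1.0, budget / w[i])
        val += take * c[i]
        budget -= take * w[i]
        price = c[i] / w[i]
    return val, price

opt, _ = solve(range(n), B)                      # expensive full solve

# Accelerated: estimate the dual price from a 5% sample with a
# proportionally scaled budget, then solve only over the variables
# that clear the threshold.
sample = random.sample(range(n), n // 20)
_, price = solve(sample, B * len(sample) / n)
cand = [i for i in range(n) if c[i] / w[i] >= price]
approx, _ = solve(cand, B)
print(f"approx/opt = {approx / opt:.4f} using {len(cand)} of {n} variables")
```

The final solve touches only a small fraction of the variables, which is where the speedup comes from; the paper's guarantees bound how much objective value the thresholding step can lose.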
Non-Exploitable Protocols for Repeated Cake Cutting
Tamuz, Omer (California Institute of Technology) | Vardi, Shai (California Institute of Technology) | Ziani, Juba (California Institute of Technology)
We introduce the notion of exploitability in cut-and-choose protocols for repeated cake cutting. If a cut-and-choose protocol is repeated, the cutter can possibly gain information about the chooser from her previous actions, and exploit this information for her own gain, at the expense of the chooser. We define a generalization of cut-and-choose protocols - forced-cut protocols - in which some cuts are made exogenously while others are made by the cutter, and show that there exist non-exploitable forced-cut protocols that use a small number of cuts per day: When the cake has at least as many dimensions as days, we show a protocol that uses a single cut per day. When the cake is 1-dimensional, we show an adaptive non-exploitable protocol that uses 3 cuts per day, and a non-adaptive protocol that uses n cuts per day (where n is the number of days). In contrast, we show that no non-adaptive non-exploitable forced-cut protocol can use a constant number of cuts per day. Finally, we show that if the cake is at least 2-dimensional, there is a non-adaptive non-exploitable protocol that uses 3 cuts per day.
Phase-Mapper: An AI Platform to Accelerate High Throughput Materials Discovery
Xue, Yexiang (Cornell University) | Bai, Junwen (Shanghai Jiaotong University) | Bras, Ronan Le (Cornell University) | Rappazzo, Brendan (Cornell University) | Bernstein, Richard (Cornell University) | Bjorck, Johan (Cornell University) | Longpre, Liane (Cornell University) | Suram, Santosh K. (California Institute of Technology) | Dover, Robert B. van (Cornell University) | Gregoire, John (California Institute of Technology) | Gomes, Carla P. (Cornell University)
High-throughput materials discovery involves the rapid synthesis, measurement, and characterization of many different but structurally related materials. A central problem in materials discovery, the phase map identification problem, involves the determination of the crystal structure of materials from materials composition and structural characterization data. We present Phase-Mapper, a novel solution platform that allows humans to interact with both the data and products of AI algorithms, including the incorporation of human feedback to constrain or initialize solutions. Phase-Mapper is compatible with any spectral demixing algorithm, including our novel solver, AgileFD, which is based on convolutive non-negative matrix factorization. AgileFD allows materials scientists to rapidly interpret XRD patterns, and can incorporate constraints to capture the physics of the materials as well as human feedback. We compare three solver variants with previously proposed methods in a large-scale experiment involving 20 synthetic systems, demonstrating the efficacy of imposing physical constraints using AgileFD. Since the deployment of Phase-Mapper at the Department of Energy's Joint Center for Artificial Photosynthesis (JCAP), thousands of X-ray diffraction patterns have been processed and the results are yielding the discovery of new materials for energy applications, as exemplified by a new family of metal oxide solar light absorbers found in the previously unsolved Nb-Mn-V oxide system, which is provided here as an illustrative example. Phase-Mapper is also being deployed at the Stanford Synchrotron Radiation Lightsource (SSRL) to enable phase mapping on datasets in real time.
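The demixing at the heart of AgileFD can be sketched via standard non-negative matrix factorization: observed XRD patterns are modeled as non-negative mixtures of a few non-negative basis patterns. This is a minimal sketch of that core (plain multiplicative-update NMF on synthetic data); AgileFD itself additionally handles peak shifting via the convolutive formulation and imposes physical constraints, which are not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for XRD data: 3 basis patterns mixed at 30 samples.
W_true = rng.random((50, 3))   # basis patterns over 50 diffraction angles
H_true = rng.random((3, 30))   # per-sample activations
V = W_true @ H_true            # observed (noise-free) mixed patterns

def nmf(V, k, iters=500, eps=1e-9):
    """Lee-Seung multiplicative updates for V ~ W @ H with W, H >= 0."""
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

W, H = nmf(V, 3)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.4f}")
```

In the phase-mapping setting, the columns of W correspond to candidate phase diffraction patterns and H gives each phase's activation across the composition spread; human feedback enters as constraints or initializations on these factors.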
Entropic Causal Inference
Kocaoglu, Murat (The University of Texas at Austin) | Dimakis, Alexandros G. (The University of Texas at Austin) | Vishwanath, Sriram (The University of Texas at Austin) | Hassibi, Babak (California Institute of Technology)
We consider the problem of identifying the causal direction between two discrete random variables using observational data. Unlike previous work, we keep the most general functional model but make an assumption on the unobserved exogenous variable: Inspired by Occam's razor, we assume that the exogenous variable is simple in the true causal direction. We quantify simplicity using Rényi entropy. Our main result is that, under natural assumptions, if the exogenous variable has low H0 entropy (cardinality) in the true direction, it must have high H0 entropy in the wrong direction. We establish several algorithmic hardness results about estimating the minimum entropy exogenous variable. We show that the problem of finding the exogenous variable with minimum H1 entropy (Shannon entropy) is equivalent to the problem of finding the minimum joint entropy given n marginal distributions, also known as the minimum entropy coupling problem. We propose an efficient greedy algorithm for the minimum entropy coupling problem that, for n=2, provably finds a local optimum. This gives a greedy algorithm for finding the exogenous variable with minimum Shannon entropy. Our greedy entropy-based causal inference algorithm performs comparably to state-of-the-art additive noise models on real datasets. One advantage of our approach is that we make no use of the values of the random variables, only their distributions. Our method can therefore be used for causal inference on both ordinal and categorical data, unlike additive noise models.
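The greedy heuristic for the minimum entropy coupling of two marginals can be sketched as follows: repeatedly take the largest remaining probability mass in each marginal and couple as much of it as possible. This is a sketch of the greedy idea for n=2 as described in the abstract; implementation details are assumptions.

```python
import heapq
from math import log2

def greedy_coupling(p, q):
    """Greedily build a joint distribution with marginals p and q,
    coupling the largest remaining masses first to keep joint entropy low."""
    hp = [(-v, i) for i, v in enumerate(p)]
    hq = [(-v, j) for j, v in enumerate(q)]
    heapq.heapify(hp)
    heapq.heapify(hq)
    joint = []                       # list of (i, j, mass) entries
    while hp and hq:
        vp, i = heapq.heappop(hp)    # largest remaining mass in p
        vq, j = heapq.heappop(hq)    # largest remaining mass in q
        m = min(-vp, -vq)
        if m > 1e-12:
            joint.append((i, j, m))
        if -vp - m > 1e-12:          # push back any leftover mass
            heapq.heappush(hp, (vp + m, i))
        if -vq - m > 1e-12:
            heapq.heappush(hq, (vq + m, j))
    return joint

def entropy(masses):
    return -sum(m * log2(m) for m in masses if m > 0)

p = [0.5, 0.3, 0.2]
q = [0.6, 0.4]
joint = greedy_coupling(p, q)
H = entropy([m for _, _, m in joint])
print(joint, round(H, 3))
```

Any valid coupling has joint entropy between max(H(p), H(q)) and H(p) + H(q); the greedy construction aims for the low end of that range.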
When and Why Are Deep Networks Better Than Shallow Ones?
Mhaskar, Hrushikesh (California Institute of Technology) | Liao, Qianli (Massachusetts Institute of Technology) | Poggio, Tomaso (Massachusetts Institute of Technology)
While the universal approximation property holds for both hierarchical and shallow networks, deep networks can approximate the class of compositional functions as well as shallow networks can, but with an exponentially smaller number of training parameters and lower sample complexity. Compositional functions are obtained as a hierarchy of local constituent functions, where "local functions" are functions of low dimensionality. This result proves an old conjecture by Bengio on the role of depth in networks, characterizing precisely the conditions under which it holds. It also suggests possible answers to the puzzle of why high-dimensional deep networks trained on large training sets often do not seem to overfit.
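The exponential gap can be made concrete with back-of-the-envelope parameter counts. The scalings below (roughly eps**-d parameters for a shallow network versus on the order of (d-1) * eps**-2 for a deep network matched to a binary compositional hierarchy) are an illustrative simplification: the paper's bounds also depend on smoothness constants that are omitted here.

```python
# Illustrative parameter-count scaling for approximating a function of
# d variables to accuracy eps (smoothness constants omitted):
def shallow_params(d, eps):
    # curse of dimensionality: cost exponential in d
    return eps ** -d

def deep_params(d, eps):
    # binary-tree compositional hierarchy: d-1 constituent functions,
    # each of only 2 variables, so cost linear in d
    return (d - 1) * eps ** -2

d, eps = 8, 0.1
print(f"shallow: {shallow_params(d, eps):.0e}  deep: {deep_params(d, eps):.0e}")
```

Even at modest dimension d = 8 and accuracy 0.1, the shallow count is on the order of 10**8 while the deep count is in the hundreds, which is the sense in which depth pays off for compositional targets.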
Tropel: Crowdsourcing Detectors with Minimal Training
Patterson, Genevieve (Brown University) | Horn, Grant Van (California Institute of Technology) | Belongie, Serge (Cornell University and Cornell Tech) | Perona, Pietro (California Institute of Technology) | Hays, James (Brown University)
This paper introduces the Tropel system which enables non-technical users to create arbitrary visual detectors without first annotating a training set. Our primary contribution is a crowd active learning pipeline that is seeded with only a single positive example and an unlabeled set of training images. We examine the crowd's ability to train visual detectors given severely limited training themselves. This paper presents a series of experiments that reveal the relationship between worker training, worker consensus and the average precision of detectors trained by crowd-in-the-loop active learning. In order to verify the efficacy of our system, we train detectors for bird species that work nearly as well as those trained on the exhaustively labeled CUB 200 dataset at significantly lower cost and with little effort from the end user. To further illustrate the usefulness of our pipeline, we demonstrate qualitative results on unlabeled datasets containing fashion images and street-level photographs of Paris.
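The crowd-in-the-loop active learning loop described above can be sketched with a simulated crowd. Everything here is an illustrative stand-in under stated assumptions: 1-D features instead of image descriptors, a nearest-centroid "detector", and an always-correct simulated crowd in place of real workers.

```python
import random

random.seed(0)

# Synthetic "images": 1-D features, positives near +2, negatives near -2.
items = [(random.gauss(2, 1), True) for _ in range(100)] + \
        [(random.gauss(-2, 1), False) for _ in range(100)]
random.shuffle(items)

labeled = [next(it for it in items if it[1])]     # single positive seed
pool = [it for it in items if it is not labeled[0]]

def centroid(lab, cls):
    xs = [x for x, y in lab if y == cls]
    return sum(xs) / len(xs)

# Bootstrap: ask the "crowd" about random items until both classes appear.
while all(y for _, y in labeled):
    labeled.append(pool.pop(random.randrange(len(pool))))

for _ in range(10):                                # crowd rounds
    cp, cn = centroid(labeled, True), centroid(labeled, False)
    # Uncertainty sampling: send the crowd the pool items closest to the
    # current decision boundary between the two class centroids.
    pool.sort(key=lambda it: abs(abs(it[0] - cp) - abs(it[0] - cn)))
    labeled += [pool.pop(0) for _ in range(min(5, len(pool)))]

cp, cn = centroid(labeled, True), centroid(labeled, False)
pred_pos = [y for x, y in items if abs(x - cp) < abs(x - cn)]
precision = sum(pred_pos) / len(pred_pos)
print(f"detector precision: {precision:.2f}")
```

The point of the sketch is the label economics: the detector ends up trained on a few dozen crowd judgments rather than an exhaustively annotated set, mirroring Tropel's cost argument.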