Mattmann, Chris A.
MARVIN: An Open Machine Learning Corpus and Environment for Automated Machine Learning Primitive Annotation and Execution
Mattmann, Chris A., Shah, Sujen, Wilson, Brian
In this demo paper, we introduce the DARPA D3M program for automatic machine learning (ML) and JPL's MARVIN tool, which provides an environment to locate, annotate, and execute machine learning primitives for use in ML pipelines. MARVIN is a web-based application and associated back-end interface written in Python that enables composition of ML pipelines from hundreds of primitives drawn from Scikit-Learn, Keras, DL4J, and other widely used libraries. MARVIN allows for the creation of Docker containers that run on Kubernetes clusters within DARPA to provide an execution environment for automated machine learning. MARVIN currently contains over 400 datasets and challenge problems from a wide array of ML domains, ranging from routine classification and regression to advanced video/image classification and remote sensing.
Creating a Mars Target Encyclopedia by Extracting Information from the Planetary Science Literature
Wagstaff, Kiri L. (Jet Propulsion Laboratory) | Riloff, Ellen (University of Utah) | Lanza, Nina L. (Los Alamos National Laboratory) | Mattmann, Chris A. (Jet Propulsion Laboratory) | Ramirez, Paul M. (Jet Propulsion Laboratory)
Staying up to date with the latest discoveries is a challenge in any scientific field. In planetary science, new observation targets on the surface of Mars are identified and named every day, and publications announcing discoveries and conclusions provide frequent updates about these targets. We are constructing a system that uses information extraction and retrieval methods to mine the steadily growing body of planetary science publications about Mars surface targets and automatically construct a concise summary of what is known about each target. The Mars Target Encyclopedia will provide a central, continually updated resource for use by planetary scientists and the interested public. We describe our use of Tika, Sundance, and AutoSlog to extract and summarize information, some of the challenges associated with this domain, and our plans for maturing the system.