

How to use AI to play Sonic the Hedgehog. It's NEAT!


Generation after generation, humans have adapted to fit our surroundings better. We started off as primates living in a world of eat or be eaten. Eventually we evolved into who we are today, reflecting modern society. Through the process of evolution we became smarter. We are able to work better with our environment and accomplish what we need to.

Next-generation natural language technologies: The deep learning agenda


The most appreciable advancements in statistical AI, the ones with the most meaning and potential to improve data's worth to the enterprise, are deep learning deployments of computer vision and natural language technologies. The distinctions between these applications involve much more than image recognition versus speech or language recognition. Horizontal computer vision use cases pertain to some aspects of inter-machine intelligence, e.g., scanning videos or production settings for anomalies and generating alerts to initiate automated procedures to address them. Conversely, natural language technologies provide the most effective cognitive computing application for furthering human intelligence, decision making, and the action required to extract business value from such perceptivity. While the utility derived from image recognition largely varies according to the vertical, the capability for machines to understand natural language--for humans to interact with databases in layperson's terms across sources--strikes at the core of converting the unstructured data of language into informed action.

Unsupervised meta-learning: learning to learn without supervision


The history of machine learning has largely been a story of increasing abstraction. In the dawn of ML, researchers spent considerable effort engineering features. As deep learning gained popularity, researchers then shifted towards tuning the update rules and learning rates for their optimizers. Recent research in meta-learning has climbed one level of abstraction higher: many researchers now spend their days manually constructing task distributions, from which they can automatically learn good optimizers. What might be the next rung on this ladder?

ASNets: Deep Learning for Generalised Planning

Journal of Artificial Intelligence Research

In this paper, we discuss the learning of generalised policies for probabilistic and classical planning problems using Action Schema Networks (ASNets). The ASNet is a neural network architecture that exploits the relational structure of (P)PDDL planning problems to learn a common set of weights that can be applied to any problem in a domain. By mimicking the actions chosen by a traditional, non-learning planner on a handful of small problems in a domain, ASNets are able to learn a generalised reactive policy that can quickly solve much larger instances from the domain. This work extends the ASNet architecture to make it more expressive, while still remaining invariant to a range of symmetries that exist in PPDDL problems. We also present a thorough experimental evaluation of ASNets, including a comparison with heuristic search planners on seven probabilistic and deterministic domains, an extended evaluation on over 18,000 Blocksworld instances, and an ablation study. Finally, we show that sparsity-inducing regularisation can produce ASNets that are compact enough for humans to understand, yielding insights into how the structure of ASNets allows them to generalise across a domain.

Bridging the Gap Between Probabilistic Model Checking and Probabilistic Planning: Survey, Compilations, and Empirical Comparison

Journal of Artificial Intelligence Research

Markov decision processes are of major interest in the planning community as well as in the model checking community. But in spite of the similarity in the considered formal models, the development of new techniques and methods happened largely independently in both communities. This work is intended as a beginning to unite the two research branches. We consider goal-reachability analysis as a common basis between both communities. The core of this paper is the translation from Jani, an overarching input language for quantitative model checkers, into the probabilistic planning domain definition language (PPDDL), and vice versa from PPDDL into Jani. These translations allow the creation of an overarching benchmark collection, including existing case studies from the model checking community, as well as benchmarks from the international probabilistic planning competitions (IPPC). We use this benchmark set as a basis for an extensive empirical comparison of various approaches from the model checking community, variants of value iteration, and MDP heuristic search algorithms developed by the AI planning community. On a per benchmark domain basis, techniques from one community can achieve state-of-the-art performance in benchmarks of the other community. Across all benchmark domains of one community, the performance comparison is however in favor of the solvers and algorithms of that particular community. Reasons are the design of the benchmarks, as well as tool-related limitations. Our translation methods and benchmark collection foster cross-fertilization between both communities, pointing out specific opportunities for widening the scope of solvers to different kinds of models, as well as for exchanging and adopting algorithms across communities.
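
As an illustration of the goal-reachability analysis and value iteration mentioned in this abstract, here is a minimal sketch in Python. It is not taken from the paper: the toy MDP, the state and action names, and the function `max_reach_probability` are all hypothetical, and the code reads no Jani or PPDDL input. It only shows the basic idea of iterating a Bellman-style update until the maximal probability of reaching a goal state stabilises.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> action -> list of (probability, next_state).
Transitions = Dict[str, Dict[str, List[Tuple[float, str]]]]


def max_reach_probability(
    mdp: Transitions,
    goal: Set[str],
    tol: float = 1e-10,
    max_iters: int = 10_000,
) -> Dict[str, float]:
    """Approximate the maximal probability of eventually reaching a goal state."""
    # Start from below: goal states have value 1, everything else 0.
    values = {s: (1.0 if s in goal else 0.0) for s in mdp}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in mdp.items():
            # Goal states and states without actions keep their value.
            if state in goal or not actions:
                continue
            # Pick the action with the highest expected value.
            best = max(
                sum(p * values[nxt] for p, nxt in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:  # stop once no value changed noticeably
            break
    return values


# Toy example (assumed for illustration only).
TOY_MDP: Transitions = {
    "s0": {"safe": [(1.0, "s1")], "risky": [(0.9, "goal"), (0.1, "dead")]},
    "s1": {"try": [(0.5, "goal"), (0.5, "s0")]},
    "goal": {},
    "dead": {},
}

if __name__ == "__main__":
    for state, value in max_reach_probability(TOY_MDP, {"goal"}).items():
        print(f"{state}: {value:.6f}")
```

The tools compared in the paper use considerably more refined variants and heuristic search; this sketch stops on a simple change threshold, which gives no formal error bound.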

Cancer Epidemiology Biomarkers & Prevention: 29 (5)


The cover image is adapted from Figure 4 in the article, "Harnessing Population Pedigree Data and Machine Learning Methods to Identify Patterns of Familial Bladder Cancer Risk," by Hanson and colleagues. To simultaneously assess risks for multiple cancers to identify distinct multicancer configurations, this study takes advantage of a unique population-level data resource, the Utah Population Database, containing vast genealogy and statewide cancer data. This study identified five familial bladder cancer FMCs showing unique risk patterns for cancers of other organs, suggesting phenotypic heterogeneity in familial bladder cancer. FMC configurations could permit better definitions of cancer phenotypes (subtypes or multicancer) for gene discovery and environmental risk factor studies. For more information, see the article beginning on page 918.

r/MachineLearning - [D] AI21 Labs Asks: How Much Does It Cost to Train NLP Models?


AI21 Labs Co-CEO, Stanford University Professor of Computer Science (emeritus), and AI Index initiator Yoav Shoham describes the motivation for the project. "It started with an inquiry we got at the AI Index. I started jotting down a quick answer and realized it deserved a longer one. I also realized we had a lot of the expertise at AI21 Labs. So we spun up a small effort to put this report together, to benefit the community."

AI21 Labs Asks: How Much Does It Cost to Train NLP Models?


These days the total cost of training an NLP model can climb into the millions of dollars. And so it is only natural that budget-constrained researchers, engineers and scientists, when planning their model-training experiments, ask the important question: How much is this going to cost? And what are the main factors affecting that price tag? Israeli research company AI21 Labs looks for answers in its recently published paper, "The Cost of Training NLP Models: A Concise Overview." AI21 Labs Co-CEO, Stanford University Professor of Computer Science (emeritus), and AI Index initiator Yoav Shoham describes the motivation for the project.

Running machine learning at scale


Our team runs dozens of production machine learning models on a daily, weekly, and monthly basis. We recently redesigned our ML infrastructure to enable self-service, scale to match computing needs, reduce interference among models running on the same VM, and remove differences between dev and production environments. In this post, I will describe the challenges we faced with the previous infrastructure and how we addressed them with our Version 2 architecture. Our machine learning engineers use Python and R to implement models. Our Version 1 infrastructure used a custom XML format from which we generated Azure Data Factory (ADF) v1 pipelines to copy the model input data to blob storage.

Language may help AI navigate new environments


In a new study published this week on the preprint server, Both it and several baseline models will soon be available on GitHub. One of the most powerful techniques in machine learning -- reinforcement learning, which entails spurring software agents toward goals via rewards -- is also one of the most flawed. It's sample inefficient, meaning it requires a large number of compute cycles to complete, and without additional data to cover variations, it adapts poorly to environments that differ from the training environment. It's theorized that prior knowledge of tasks through structured language could be combined with reinforcement learning to mitigate its shortcomings, and BabyAI was designed to put this theory to the test.