If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Jing, Li (Massachusetts Institute of Technology) | Gulcehre, Caglar (MILA - Universite de Montreal) | Peurifoy, John (Massachusetts Institute of Technology) | Shen, Yichen (Massachusetts Institute of Technology) | Tegmark, Max (Massachusetts Institute of Technology) | Soljacic, Marin (Massachusetts Institute of Technology) | Bengio, Yoshua (MILA - Universite de Montreal)
We present a novel recurrent neural network (RNN) based model that combines the remembering ability of unitary RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in their memory. We achieve this by extending unitary RNNs with a gating mechanism. Our model outperforms LSTMs, GRUs and unitary RNNs on several long-term dependency benchmark tasks. We show empirically both that orthogonal/unitary RNNs lack the ability to forget and that our model, GORU, can simultaneously remember long-term dependencies while forgetting irrelevant information, a capability that plays an important role in recurrent neural networks. We provide competitive results along with an analysis of our model on many natural sequential tasks, including bAbI Question Answering, TIMIT speech spectrum prediction, and Penn TreeBank, and on synthetic tasks that involve long-term dependencies, such as the algorithmic, parenthesis, denoising and copying tasks.
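To make the idea of "an orthogonal recurrence plus gates" concrete, here is a minimal sketch, not the authors' implementation. It assumes a 2-dimensional hidden state, a plain rotation matrix as the orthogonal transition, GRU-style update and reset gates that depend only on a scalar input, and a tanh nonlinearity. All function and parameter names (`rotate`, `gated_orthogonal_step`, `w_in`, `w_update`, `w_reset`) are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def rotate(h: Vector, theta: float) -> Vector:
    """Apply a 2x2 rotation, the simplest orthogonal matrix; it preserves the norm of h."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * h[0] - s * h[1], s * h[0] + c * h[1]]


def gated_orthogonal_step(
    h: Vector,
    x: float,
    theta: float,
    w_in: Vector,
    w_update: Vector,
    w_reset: Vector,
) -> Vector:
    """One recurrent step (illustrative assumption, not the paper's exact cell).

    The orthogonal transition carries the old state forward without shrinking it;
    the reset gate scales how much of that state enters the candidate, and the
    update gate blends the old state with the candidate.
    """
    rotated = rotate(h, theta)
    new_h = []
    for i in range(2):
        reset = sigmoid(w_reset[i] * x)
        update = sigmoid(w_update[i] * x)
        candidate = math.tanh(w_in[i] * x + reset * rotated[i])
        new_h.append(update * h[i] + (1.0 - update) * candidate)
    return new_h
```

With the update gate saturated near 1 the state is carried over unchanged (remembering); with both gates near 0 and no input drive the state collapses toward zero (forgetting), which is the behaviour an ungated orthogonal recurrence cannot produce.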
Ghassemi, Mohammad M. (Massachusetts Institute of Technology) | AlHanai, Tuka (Massachusetts Institute of Technology) | Westover, M. Brandon (Massachusetts General Hospital) | Mark, Roger G. (Massachusetts Institute of Technology) | Nemati, Shamim (Emory University)
One area of medicine that could benefit from personalized procedures is medication dosing. Mis-dosing medications may incur additional morbidity, or unnecessarily increase the length of patient stay. Here we illustrate a novel approach to personalized medication dosing that is robust to missing data, a common problem in the clinical care setting. We perform dose estimation using a novel take on multinomial logistic regression where model parameters are continuously estimated, for each patient, using a weighted combination of the data from a population of other patients, and a volatile data stream available from the individual under treatment. We evaluate our approach on 4,470 patients who received anti-coagulation therapy during intensive care treatment. Our approach was 29% more accurate than intensive care staff, and better able to distinguish outcomes than a non-personalized baseline (0.11 improvement in model VUS, a multiclass version of AUC). The advantages of our approach are its ease of interpretation, robustness to missing features, and extensibility to other problems with similar structure.
Chen, Jingkai (Massachusetts Institute of Technology) | Fang, Cheng (Massachusetts Institute of Technology) | Muise, Christian (Massachusetts Institute of Technology) | Shrobe, Howard (Massachusetts Institute of Technology) | Williams, Brian C. (Massachusetts Institute of Technology) | Yu, Peng (Massachusetts Institute of Technology)
Current network approaches aim to maximize network utilization when routing flows. While such approaches are fast and usually result in acceptable behavior, existing methods are not mission aware. There is no concept of utility maximization, no capability to handle flows with specified deadlines and loss requirements, and no guarantees over the probability of network saturation. In the presence of network degradation due to attacks, there is no guarantee that important flows will be properly transported. In this paper, we present RADMAX, a system for Risk And Deadline Aware Planning for Maximum Utility based on constraint programming, which allows us to handle higher level mission specifications. We show the correctness of RADMAX with respect to loss and delay bounds, provide results for the optimality of RADMAX with respect to the mission utility, and review current results on computational performance.
Neller, Todd W. (Gettysburg College) | Butler, Zack (Rochester Institute of Technology) | Derbinsky, Nate (Northeastern University) | Furey, Heidi (Manhattan College) | Martin, Fred (University of Massachusetts Lowell) | Guerzhoy, Michael (University of Toronto) | Anders, Ariel (Massachusetts Institute of Technology) | Eckroth, Joshua (Stetson University)
The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of student learning experience, we here present abstracts of seven AI assignments from the 2018 session that are easily adoptable, playfully engaging, and flexible for a variety of instructor needs.
Mihaylova, Tsvetomila (Sofia University "St. Kliment Ohridski") | Nakov, Preslav (Qatar Computing Research Institute, HBKU) | Màrquez, Lluís (Qatar Computing Research Institute, HBKU) | Barrón-Cedeño, Alberto (Qatar Computing Research Institute, HBKU) | Mohtarami, Mitra (Massachusetts Institute of Technology) | Karadzhov, Georgi (Sofia University "St. Kliment Ohridski") | Glass, James (Massachusetts Institute of Technology)
Community Question Answering (cQA) forums are very popular nowadays, as they represent effective means for communities around particular topics to share information. Unfortunately, this information is not always factual. Thus, here we explore a new dimension in the context of cQA, which has been ignored so far: checking the veracity of answers to particular questions in cQA forums. As this is a new problem, we create a specialized dataset for it. We further propose a novel multi-faceted model, which captures information from the answer content (what is said and how), from the author profile (who says it), from the rest of the community forum (where it is said), and from external authoritative sources of information (external support). Evaluation results show a MAP value of 86.54, which is 21 points absolute above the baseline.
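For readers unfamiliar with the metric, the following is a small sketch of how mean average precision (MAP) is commonly computed over ranked lists. It assumes each question yields a list of answers ranked by predicted factuality, represented here as booleans marking which ranked answers are actually factual. The function names are hypothetical, and the snippet does not reproduce the paper's evaluation script; reported scores such as 86.54 are presumably this quantity scaled to a 0-100 range.

```python
from typing import List


def average_precision(relevance: List[bool]) -> float:
    """Average precision for one ranked list.

    relevance[i] is True when the item at rank i + 1 is a correct (relevant) one.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[List[bool]]) -> float:
    """Mean of the per-list average precisions; 0.0 for an empty input."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For example, a ranking `[True, False, True]` has precision 1/1 at the first hit and 2/3 at the second, so its average precision is (1 + 2/3) / 2.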
Hoang, Trong Nghia (Massachusetts Institute of Technology) | Hoang, Quang Minh (National University of Singapore) | Ouyang, Ruofei (National University of Singapore) | Low, Kian Hsiang (National University of Singapore)
This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee no-regret performance asymptotically. Empirical evaluation on synthetic and real-world experiments (e.g., sparse Gaussian process model with 1811 hyperparameters) shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.
We propose semi-random features for nonlinear function approximation. The flexibility of semi-random features lies between that of the fully adjustable units in deep learning and that of the random features used in kernel methods. For one-hidden-layer models with semi-random features, we prove with no unrealistic assumptions that the model classes contain an arbitrarily good function as the width increases (universality), and that, despite non-convexity, we can find such a good function (optimization theory) that generalizes to unseen new data (generalization bound). For deep models, with no unrealistic assumptions, we prove universal approximation ability, a lower bound on approximation error, a partial optimization guarantee, and a generalization bound. Depending on the problem, the generalization bound of deep semi-random features can be exponentially better than the known bounds of deep ReLU nets; our generalization error bound can be independent of the depth, the number of trainable weights, and the input dimensionality. In experiments, we show that semi-random features can match the performance of neural networks by using slightly more units, and that they outperform random features by using significantly fewer units. Moreover, we introduce a new implicit ensemble method by using semi-random features.
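The abstract does not spell out what a semi-random unit computes, so the sketch below rests on an assumption: each unit multiplies a gate driven by a fixed random direction `r` with a linear response driven by a trainable weight vector `w`, i.e. `sigma_s(x . r) * (x . w)` with `sigma_s(z) = z**s` for `z > 0` and `0` otherwise. The names `semi_random_unit`, `make_random_directions`, and `semi_random_layer` are hypothetical.

```python
import random
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def semi_random_unit(x: Vector, r: Vector, w: Vector, s: int = 0) -> float:
    """Gate from the random direction r times a linear response from the learnable w.

    Assumed form: sigma_s(x . r) * (x . w), where sigma_s(z) = z**s if z > 0 else 0.
    """
    z = dot(x, r)
    gate = float(z ** s) if z > 0 else 0.0
    return gate * dot(x, w)


def make_random_directions(n_units: int, dim: int, seed: int = 0) -> List[Vector]:
    """Sample the random directions once; they stay fixed during training."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_units)]


def semi_random_layer(x: Vector, R: List[Vector], W: List[Vector], s: int = 0) -> Vector:
    """Outputs of one hidden layer; only W would be updated by an optimizer."""
    return [semi_random_unit(x, r, w, s) for r, w in zip(R, W)]
```

Under this assumed form, the random part decides where a unit is active, as in random-feature methods, while the trainable part decides what it outputs there, which is one way to read the "between fully adjustable units and random features" positioning above.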
Noothigattu, Ritesh (Carnegie Mellon University) | Gaikwad, Snehalkumar S. (Massachusetts Institute of Technology) | Awad, Edmond (Massachusetts Institute of Technology) | Dsouza, Sohan (Massachusetts Institute of Technology) | Rahwan, Iyad (Massachusetts Institute of Technology) | Ravikumar, Pradeep (Carnegie Mellon University) | Procaccia, Ariel D. (Carnegie Mellon University)
We present a general approach to automating ethical decisions, drawing on machine learning and computational social choice. In a nutshell, we propose to learn a model of societal preferences, and, when faced with a specific ethical dilemma at runtime, efficiently aggregate those preferences to identify a desirable choice. We provide a concrete algorithm that instantiates our approach; some of its crucial steps are informed by a new theory of swap-dominance efficient voting rules. Finally, we implement and evaluate a system for ethical decision making in the autonomous vehicle domain, using preference data collected from 1.3 million people through the Moral Machine website.
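The aggregation step can be illustrated with a familiar voting rule. The sketch below uses the Borda count purely as a stand-in; the abstract does not say which rule the system applies, and the learned preference model is replaced here by hand-written rankings. The function name `borda_winner` and the alternative labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def borda_winner(rankings: Sequence[Sequence[str]]) -> str:
    """Return the alternative with the highest Borda score.

    Each ranking lists alternatives from most to least preferred. With m
    alternatives, the top choice earns m - 1 points and the last earns 0.
    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    if not rankings:
        raise ValueError("at least one ranking is required")
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += m - 1 - position
    # sorted() fixes the iteration order so max() breaks ties alphabetically.
    return max(sorted(scores), key=lambda a: scores[a])
```

In a system like the one described, each ranking would come from a learned model of one voter's preferences over the outcomes available in the dilemma at hand, rather than from a fixed list.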
In robotics, it is essential to be able to plan efficiently in high-dimensional continuous state-action spaces for long horizons. For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth. In this paper, we present an approach that guides search in continuous spaces for generic planners by learning an action sampler from past search experience. We use a Generative Adversarial Network (GAN) to represent an action sampler, and address an important issue: search experience consists of a relatively large number of actions that are not on a solution path and a relatively small number of actions that actually are on a solution path. We introduce a new technique, based on an importance-ratio estimation method, for using samples from a non-target distribution to make GAN learning more data-efficient. We provide theoretical guarantees and empirical evaluation in three challenging continuous robot planning problems to illustrate the effectiveness of our algorithm.
Bohg, Jeannette (Max Planck Institute for Intelligent Systems) | Boix, Xavier (Massachusetts Institute of Technology) | Chang, Nancy (Google) | Churchill, Elizabeth F. (Google) | Chu, Vivian (Georgia Institute of Technology) | Fang, Fei (Harvard University) | Feldman, Jerome (University of California at Berkeley) | González, Avelino J. (University of Central Florida) | Kido, Takashi (Preferred Networks in Japan) | Lawless, William F. (Paine College) | Montaña, José L. (University of Cantabria) | Ontañón, Santiago (Drexel University) | Sinapov, Jivko (University of Texas at Austin) | Sofge, Don (Naval Research Laboratory) | Steels, Luc (Institut de Biologia Evolutiva) | Steenson, Molly Wright (Carnegie Mellon University) | Takadama, Keiki (University of Electro-Communications) | Yadav, Amulya (University of Southern California)