Europe
Reports of the AAAI 2010 Conference Workshops
Aha, David W. (Naval Research Laboratory) | Boddy, Mark (Adventium Labs) | Bulitko, Vadim (University of Alberta) | Garcez, Artur S. d' (City University London) | Avila (University of Georgia) | Doshi, Prashant (TZI, Bremen University) | Edelkamp, Stefan (University of Edinburgh) | Geib, Christopher (University of Illinois, Chicago) | Gmytrasiewicz, Piotr (Smart Information Flow Technologies) | Goldman, Robert P. (Wright State University) | Hitzler, Pascal (Georgia Institute of Technology) | Isbell, Charles (University of Maryland, College Park) | Josyula, Darsana (Massachusetts Institute of Technology) | Kaelbling, Leslie Pack (University of Bonn) | Kersting, Kristian (Georgia Institute of Technology) | Kunda, Maithilee (Universidade Federal do Rio Grande do Sul (UFRGS)) | Lamb, Luis C. (Willow Garage) | Marthi, Bhaskara (Georgia Institute of Technology) | McGreggor, Keith (EML Research gGmbH) | Nastase, Vivi (University College Cork) | Provan, Gregory (University of North Carolina, Charlotte) | Raja, Anita (Georgia Institute of Technology) | Ram, Ashwin (Georgia Institute of Technology) | Riedl, Mark (University of California, Berkeley) | Russell, Stuart (Cornell University) | Sabharwal, Ashish (University of Freiburg) | Smaus, Jan-Georg (University of Central Florida) | Sukthankar, Gita (Maastricht University) | Tuyls, Karl (University of New South Wales) | Meyden, Ron van der (Google, Inc.) | Halevy, Alon (University of Maryland) | Mihalkova, Lilyana (University of Wisconsin) | Natarajan, Sriraam
The AAAI-10 Workshop program was held Sunday and Monday, July 11–12, 2010 at the Westin Peachtree Plaza in Atlanta, Georgia. The AAAI-10 workshop program included 13 workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were AI and Fun, Bridging the Gap between Task and Motion Planning, Collaboratively-Built Knowledge Sources and Artificial Intelligence, Goal-Directed Autonomy, Intelligent Security, Interactive Decision Theory and Game Theory, Metacognition for Robust Social Systems, Model Checking and Artificial Intelligence, Neural-Symbolic Learning and Reasoning, Plan, Activity, and Intent Recognition, Statistical Relational AI, Visual Representations and Reasoning, and Abstraction, Reformulation, and Approximation. This article presents short summaries of those events.
Algorithmic Game Theory and Artificial Intelligence
Elkind, Edith (Nanyang Technological University) | Leyton-Brown, Kevin (University of British Columbia)
Indeed, game theory now serves as perhaps the main analytical framework in microeconomic theory, as evidenced by its prominent role in economics textbooks (for example, Mas-Colell, Whinston, and Green 1995) and by the many Nobel prizes in economic sciences awarded to prominent game theorists. Artificial intelligence got its start shortly after game theory (McCarthy et al. 1955), and indeed pioneers such as von Neumann and Simon made early contributions to both fields (see, for example, Findler [1988], Simon [1981]). Both game theory and AI draw (nonexclusively) on decision theory (von Neumann and Morgenstern 1947); for example, one prominent view defines artificial intelligence as "the study and construction of rational agents" (Russell and Norvig 2003), and hence takes a decision-theoretic approach when the world is stochastic. However, artificial intelligence spent most of its first 40 years focused on the design and analysis of agents that act in isolation, and hence had little need for game-theoretic analysis. Starting in the mid to late 1990s, game theory became a major topic of study for computer scientists, for at least two main reasons. First, economists began to be interested in systems whose computational properties posed serious barriers to practical use, and hence reached out to computer scientists; notably, this occurred around the study of combinatorial auctions (see, for example, Cramton, Shoham, and Steinberg 2006). Second, the rise of distributed computing in general and the Internet in particular made it increasingly necessary for computer scientists to study settings in which intelligent agents reason about and interact with other agents.
A Factorial Experiment on Scalability of Search Based Software Testing
Mehrmand, Arash, Feldt, Robert
Software testing is an expensive process, which is vital in the industry. Construction of the test-data in software testing requires the major cost and to decide which method to use in order to generate the test data is important. This paper discusses the efficiency of search-based algorithms (preferably genetic algorithm) versus random testing, in soft- ware test-data generation. This study differs from all previous studies due to sample programs (SUTs) which are used. Since we want to in- crease the complexity of SUTs gradually, and the program generation is automatic as well, Grammatical Evolution is used to guide the program generation. SUTs are generated according to the grammar we provide, with different levels of complexity. SUTs will first undergo genetic al- gorithm and then random testing. Based on the test results, this paper recommends one method to use for automation of software testing.
Node harvest
When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret. Tree ensembles like Random Forests provide usually more accurate predictions. Yet tree ensembles are also more difficult to analyze than single trees and are often criticized, perhaps unfairly, as `black box' predictors. Node harvest is trying to reconcile the two aims of interpretability and predictive accuracy by combining positive aspects of trees and tree ensembles. Results are very sparse and interpretable and predictive accuracy is extremely competitive, especially for low signal-to-noise data. The procedure is simple: an initial set of a few thousand nodes is generated randomly. If a new observation falls into just a single node, its prediction is the mean response of all training observation within this node, identical to a tree-like prediction. A new observation falls typically into several nodes and its prediction is then the weighted average of the mean responses across all these nodes. The only role of node harvest is to `pick' the right nodes from the initial large ensemble of nodes by choosing node weights, which amounts in the proposed algorithm to a quadratic programming problem with linear inequality constraints. The solution is sparse in the sense that only very few nodes are selected with a nonzero weight. This sparsity is not explicitly enforced. Maybe surprisingly, it is not necessary to select a tuning parameter for optimal predictive accuracy. Node harvest can handle mixed data and missing values and is shown to be simple to interpret and competitive in predictive accuracy on a variety of data sets.
Learning Hidden Markov Models using Non-Negative Matrix Factorization
Cybenko, George, Crespi, Valentino
The Baum-Welsh algorithm together with its derivatives and variations has been the main technique for learning Hidden Markov Models (HMM) from observational data. We present an HMM learning algorithm based on the non-negative matrix factorization (NMF) of higher order Markovian statistics that is structurally different from the Baum-Welsh and its associated approaches. The described algorithm supports estimation of the number of recurrent states of an HMM and iterates the non-negative matrix factorization (NMF) algorithm to improve the learned HMM parameters. Numerical examples are provided as well.
Non-Deterministic Policies in Markovian Decision Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional methods in reinforcement learning have proved to be useful in problems concerning sequential decision-making, they cannot be applied in their current form to decision support systems, such as those in medical domains, as they suggest policies that are often highly prescriptive and leave little room for the user's input. Without the ability to provide flexible guidelines, it is unlikely that these methods can gain ground with users of such systems. This paper introduces the new concept of non-deterministic policies to allow more flexibility in the user's decision-making process, while constraining decisions to remain near optimal solutions. We provide two algorithms to compute non-deterministic policies in discrete domains. We study the output and running time of these method on a set of synthetic and real-world problems. In an experiment with human subjects, we show that humans assisted by hints based on non-deterministic policies outperform both human-only and computer-only agents in a web navigation task.
The Local Optimality of Reinforcement Learning by Value Gradients, and its Relationship to Policy Gradient Learning
Fairbank, Michael, Alonso, Eduardo
In this theoretical paper we are concerned with the problem of learning a value function by a smooth general function approximator, to solve a deterministic episodic control problem in a large continuous state space. It is shown that learning the gradient of the value-function at every point along a trajectory generated by a greedy policy is a sufficient condition for the trajectory to be locally extremal, and often locally optimal, and we argue that this brings greater efficiency to value-function learning. This contrasts to traditional value-function learning in which the value-function must be learnt over the whole of state space. It is also proven that policy-gradient learning applied to a greedy policy on a value-function produces a weight update equivalent to a value-gradient weight update, which provides a surprising connection between these two alternative paradigms of reinforcement learning, and a convergence proof for control problems with a value function represented by a general smooth function approximator. Email addresses: michael.fairbank.1 'at' city.ac.uk (Michael Fairbank), eduardo'at' soi.city.ac.uk (Eduardo Alonso) Preprint submitted to Arxiv January 8, 2018 1. Introduction Reinforcement learning (RL) is the study of how an agent can learn actions that maximise some given reward function. The robot keeps moving until it reaches one of the designated terminal states. One key approach to tackle this RL problem is to assign a score to every point in state space that gives the best possible total reward attainable if starting from that state.
Inference and communication in the game of Password
Communication between a speaker and hearer will be most efficient when both parties make accurate inferences about the other. We study inference and communication in a television game called Password, where speakers must convey secret words to hearers by providing one-word clues. Our working hypothesis is that human communication is relatively efficient, and we use game show data to examine three predictions. First, we predict that speakers and hearers are both considerate, and that both take the other's perspective into account. Second, we predict that speakers and hearers are calibrated, and that both make accurate assumptions about the strategy used by the other. Finally, we predict that speakers and hearers are collaborative, and that they tend to share the cognitive burden of communication equally. We find evidence in support of all three predictions, and demonstrate in addition that efficient communication tends to break down when speakers and hearers are placed under time pressure.
The LASSO risk: asymptotic results and real world examples
Bayati, Mohsen, Pereira, José, Montanari, Andrea
We consider the problem of learning a coefficient vector x0 from noisy linear observation y=Ax0+w. In many contexts (ranging from model selection to image processing) it is desirable to construct a sparse estimator. In this case, a popular approach consists in solving an l1-penalized least squares problem known as the LASSO or BPDN. For sequences of matrices A of increasing dimensions, with iid gaussian entries, we prove that the normalized risk of the LASSO converges to a limit, and we obtain an explicit expression for this limit. Our result is the first rigorous derivation of an explicit formula for the asymptotic risk of the LASSO for random instances. The proof technique is based on the analysis of AMP, a recently developed efficient algorithm, that is inspired from graphical models ideas. Through simulations on real data matrices (gene expression data and hospital medical records) we observe that these results can be relevant in a broad array of practical applications.