Optimism in Reinforcement Learning and Kullback-Leibler Divergence
Filippi, Sarah, Cappé, Olivier, Garivier, Aurélien
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. In MDPs, optimism can be implemented by carrying out extended value iterations under a constraint of consistency with the estimated model transition probabilities. The UCRL2 algorithm by Auer, Jaksch and Ortner (2009), which follows this strategy, has recently been shown to guarantee near-optimal regret bounds. In this paper, we strongly argue in favor of using the Kullback-Leibler (KL) divergence for this purpose. By studying the linear maximization problem under KL constraints, we provide an efficient algorithm, termed KL-UCRL, for solving KL-optimistic extended value iteration. Using recent deviation bounds on the KL divergence, we prove that KL-UCRL provides the same guarantees as UCRL2 in terms of regret. However, numerical experiments on classical benchmarks show a significantly improved behavior, particularly when the MDP has reduced connectivity. To support this observation, we provide elements of comparison between the two algorithms based on geometric considerations.
Random scattering of bits by prediction
We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter, which is defined as the ratio between the lengths of the compressed and uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density $\rho$ while bad learners have a high $\rho$. Bad learners generate mistake sequences that are atypically complex or diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences, and they include mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of \cite{Ratsaby_entropy}, the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density $\rho$, thereby deforming its randomness characteristics.
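The information density described above is a ratio of compressed to uncompressed length, which can be illustrated with a minimal sketch. The abstract does not say which compressor the authors used, so the choice of `zlib` here is an assumption, and the function name `information_density` and the two sample inputs are hypothetical.

```python
import os
import zlib


def information_density(data: bytes) -> float:
    """Return rho = compressed length / uncompressed length.

    Assumption: zlib at maximum compression level stands in for the
    (unspecified) compressor applied to the learner's decision-rule file.
    """
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical stand-ins for two decision-rule files of equal size:
# a highly regular one, which compresses well (low rho) ...
regular = b"01" * 5000
# ... and one made of random bytes, which barely compresses (rho near 1).
noisy = os.urandom(10000)

rho_regular = information_density(regular)
rho_noisy = information_density(noisy)
print(f"rho (regular rule): {rho_regular:.4f}")
print(f"rho (noisy rule):   {rho_noisy:.4f}")
```

Under this stand-in, a compact, regular rule yields a small $\rho$ and an incompressible one yields $\rho$ close to 1, which matches the low-versus-high contrast the abstract draws between good and bad learners.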
The World as Evolving Information
This paper discusses the benefits of describing the world as information, especially in the study of the evolution of life and cognition. Traditional studies encounter problems because it is difficult to describe life and cognition in terms of matter and energy, since their laws are valid only at the physical scale. However, if matter and energy, as well as life and cognition, are described in terms of information, evolution can be described consistently as information becoming more complex. The paper presents eight tentative laws of information, valid at multiple scales, which are generalizations of Darwinian, cybernetic, thermodynamic, psychological, philosophical, and complexity principles. These are further used to discuss the notions of life, cognition and their evolution.
Online Multiple Kernel Learning for Structured Prediction
Martins, Andre F. T., Figueiredo, Mario A. T., Aguiar, Pedro M. Q., Smith, Noah A., Xing, Eric P.
Despite the recent progress towards efficient multiple kernel learning (MKL), the structured output case remains an open research front. Current approaches involve repeatedly solving a batch learning problem, which makes them inadequate for large scale scenarios. We propose a new family of online proximal algorithms for MKL (as well as for group-lasso and variants thereof), which overcomes that drawback. We show regret, convergence, and generalization bounds for the proposed method. Experiments on handwriting recognition and dependency parsing testify to the success of the approach.
AAAI Conferences Calendar
IEA/AIE-2011 continues the tradition of emphasizing Industrial, Engineering and Other applications of Applied Intelligent Systems' technology in solving real-life problems. Paper submission is required by November 12, 2010. Submission instructions and additional details may be obtained from the website or from the Program Chair, Dr. Kishan G. Mehrotra. This page includes forthcoming AAAI sponsored conferences, conferences presented by AAAI Affiliates, and conferences held in cooperation with AAAI. AI Magazine also maintains a calendar listing that includes nonaffiliated conferences at www.aaai.org/Magazine/calendar.php.
Reports of the AAAI 2010 Spring Symposia
Barkowsky, Thomas (University of Bremen) | Bertel, Sven (University of Illinois at Urbana-Champaign) | Broz, Frank (University of Hertfordshire) | Chaudhri, Vinay K. (SRI International) | Eagle, Nathan (txteagle, Inc.) | Genesereth, Michael (Stanford University) | Halpin, Harry (University of Edinburgh) | Hamner, Emily (Carnegie Mellon University) | Hoffmann, Gabe (Palo Alto Research Center) | Hölscher, Christoph (University of Freiburg) | Horvitz, Eric (Microsoft Research) | Lauwers, Tom (Carnegie Mellon University) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute) | Michalowski, Marek (BeatBots LLC) | Mower, Emily (University of Southern California) | Shipley, Thomas F. (Temple University) | Stubbs, Kristen (iRobot) | Vogl, Roland (Stanford University) | Williams, Mary-Anne (University of Technology)
The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University’s Department of Computer Science, is pleased to present the 2010 Spring Symposium Series, to be held Monday through Wednesday, March 22–24, 2010 at Stanford University. The titles of the seven symposia are Artificial Intelligence for Development; Cognitive Shape Processing; Educational Robotics and Beyond: Design and Evaluation; Embedded Reasoning: Intelligence in Embedded Systems; Intelligent Information Privacy Management; It’s All in the Timing: Representing and Reasoning about Time in Interactive Behavior; and Linked Data Meets Artificial Intelligence.
Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries
Lenat, Douglas (Cycorp) | Witbrock, Michael (Cycorp) | Baxter, David (Cycorp) | Blackstone, Eugene (Cleveland Clinic Foundation) | Deaton, Chris (Cycorp) | Schneider, Dave (Cycorp) | Scott, Jerry (Research Intelligence) | Shepard, Blake (Cycorp)
By extending Cyc’s ontology and KB approximately 2%, Cycorp and Cleveland Clinic Foundation (CCF) have built a system to answer clinical researchers’ ad hoc queries. The query may be long and complex, hence only partially understood at first, parsed into a set of CycL (higher-order logic) fragments with open variables. But, surprisingly often, after applying various constraints (medical domain knowledge, common sense, discourse pragmatics, syntax), there is only one single way to fit those fragments together, one semantically meaningful formal query P. The system, SRA (for Semantic Research Assistant), dispatches a series of database calls and then combines, logically and arithmetically, their results into answers to P. Seeing the first few answers stream back, the user may realize that they need to abort, modify, and re-ask their query. Even before they push ASK, just knowing approximately how many answers would be returned can spark such editing. Besides real-time ad hoc query-answering, queries can be bundled and persist over time. One bundle of 275 queries is rerun quarterly by CCF to produce the procedures and outcomes data it needs to report to STS (Society of Thoracic Surgeons, an external hospital accreditation and ranking body); another bundle covers ACC (American College of Cardiology) reporting. Until full articulation/answering of precise, analytical queries becomes as straight-forward and ubiquitous as text search, even partial understanding of a query empowers semantic search over semi-structured data (ontology-tagged text), avoiding many of the false positives and false negatives that standard text searching suffers from.
AI Theory and Practice: A Discussion on Hard Challenges and Opportunities Ahead
Horvitz, Eric (Microsoft Research) | Getoor, Lise (University of Maryland) | Guestrin, Carlos (Carnegie Mellon University) | Hendler, James (Rensselaer Polytechnic Institute) | Konstan, Joseph (University of Minnesota) | Subramanian, Devika (Rice University) | Wellman, Michael (University of Michigan) | Kautz, Henry (University of Rochester)
So, we have a variety of people here with different interests and backgrounds that I asked to talk about not just the key challenges ahead but potential opportunities and promising pathways, trajectories to solving those problems, and their predictions about how R&D might proceed in terms of the timing of various kinds of development over time. I asked the panelists briefly to frame their comments sharing a little bit about fundamental questions, such as, "What is the research goal?" Not everybody stays up late at night hunched over a computer or a simulation or a robotic system, pondering the foundations of intelligence and human-level AI. We have here today Lise Getoor from the University of Maryland; Devika Subramanian, who comes to us from Rice University; we have Carlos Guestrin from Carnegie Mellon University (CMU); James Hendler from Rensselaer Polytechnic Institute (RPI); Mike Wellman at the University of Michigan; Henry Kautz at the University of Rochester; and Joe Konstan, who comes to us from the Midwest, as our Minneapolis person here on the panel.
Joe Konstan: I was actually surprised when you [...] I think of myself at the core in human-computer interaction. So I went back and started looking at what I knew of artificial intelligence to try to see where the path forward was, and I was inspired by the past. [...]ipate the liability and insurance industry; and the other one, that it was a human interface problem, that people don't necessarily want to go and type a bunch of yes/no questions into a computer to get an answer, even with a rule-based explanation, that if you'd taken that just a step further and solved the human problem, it might have worked. Related to that, I was remembering a bunch of these smart house projects. And I have to admit I [...] I think everyone hates smart spaces. [...] there's nobody there, do you warn people and give them a chance to answer? There's no good answer to this question. I can tell you if that person is in bed asleep, the answer is no, don't wake them up [...]
Report on the Twenty-Third International Florida Artificial Intelligence Research Society Conference (FLAIRS-23)
Murray, R. Charles (Carnegie Mellon University) | Guesgen, Hans W. (Massey University)
The Best Paper award went to Sidney D'Mello, Blair Lehman, and Natalie Person for "Expert Tutors' Feedback Is Immediate, Direct, and Discriminating" in the special track on Intelligent Tutoring Systems. The Best Student Paper award went to Rong Hu, Brian Mac Namee, and Sarah Jane Delany for "Off to a Good Start: Using Clustering to Select the Initial Training Set in Active Learning" in the general conference. The Best Poster award went to Robert Holder for "Problem Space Analysis for Library Generation and Algorithm Selection in Real-Time Systems" in the general conference. In addition to a diverse assortment of papers and posters presented at the conference, FLAIRS-23 featured [...] Université de Paris-Sorbonne, who presented "Reasoning in Natural Language Using Combinatory Games"; and David Poole of the University of British Columbia, who presented "What Should the World-Wide Mind Believe?" [...] University, who presented "Rational Ways of Talking"; and Janet L. Kolodner of the Georgia Institute of Technology, who presented "How Can We Help [...] Information about FLAIRS-24, including the call for papers, is available online at www.flairs-24.info.
AAAI News
Hamilton, Carol M. (Association for the Advancement of Artificial Intelligence)
[...] AAAI/SIGART Doctoral Consortium, and the second AAAI Educational Advances in Artificial Intelligence Symposium, to name only a few of the highlights. For complete information on these programs, including Tutorial [...] AAAI is pleased to present the 2011 Spring Symposium Series, to be held Monday through Wednesday, March 21-23, 2011, at Stanford University.