Métodos para la Selección y el Ajuste de Características en el Problema de la Detección de Spam (Methods for Feature Selection and Tuning in the Spam Detection Problem)
Lorenzetti, Carlos M., Cecchini, Rocío L., Maguitman, Ana G., Benczúr, András A.
Email is used daily by millions of people around the globe, and it is a mission-critical application for many businesses. Over the last decade, unsolicited bulk email has become a major problem for email users, with an overwhelming amount of spam flowing into their mailboxes every day. In 2004, an estimated 62% of all email was spam. Spam is not only frustrating for most email users; it also strains the IT infrastructure of organizations and costs businesses billions of dollars in lost productivity. In recent years, spam has evolved from an annoyance into a serious security threat, and it is now a prime medium for phishing of sensitive information as well as for the spread of malicious software. This work presents a first approach to the spam problem. We propose an algorithm that improves a classifier's results by adjusting its training data. It improves the documents' vocabulary representation by detecting good topic descriptors and discriminators.
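The abstract does not define "descriptors" and "discriminators", so the following is only a minimal sketch of one plausible reading: a term describes a topic well if many of the topic's documents contain it, and discriminates well if most documents containing it belong to the topic. The function name, the document-frequency scoring, and the toy data are all assumptions for illustration, not the authors' algorithm.

```python
from collections import Counter

def descriptor_and_discriminator_scores(docs, labels, topic):
    """Return {term: (descriptor, discriminator)} for one topic.

    Hypothetical scoring, assumed for illustration:
      descriptor    = share of the topic's documents that contain the term
      discriminator = share of the documents containing the term that
                      belong to the topic
    """
    in_topic = Counter()   # document frequency inside the topic
    overall = Counter()    # document frequency over the whole collection
    n_topic = 0
    for tokens, label in zip(docs, labels):
        present = set(tokens)          # count each term once per document
        overall.update(present)
        if label == topic:
            in_topic.update(present)
            n_topic += 1
    scores = {}
    for term, df in overall.items():
        descriptor = in_topic[term] / n_topic if n_topic else 0.0
        discriminator = in_topic[term] / df
        scores[term] = (descriptor, discriminator)
    return scores

# Toy training set (invented data).
docs = [["free", "offer"], ["free", "meeting"], ["meeting", "agenda"]]
labels = ["spam", "spam", "ham"]
scores = descriptor_and_discriminator_scores(docs, labels, "spam")
```

Under this reading, a term such as "offer" can be a strong discriminator while being only a moderate descriptor, which is the kind of distinction a training-set adjustment could exploit.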
Online Multiple Kernel Learning for Structured Prediction
Martins, Andre F. T., Figueiredo, Mario A. T., Aguiar, Pedro M. Q., Smith, Noah A., Xing, Eric P.
Despite the recent progress towards efficient multiple kernel learning (MKL), the structured output case remains an open research front. Current approaches involve repeatedly solving a batch learning problem, which makes them inadequate for large-scale scenarios. We propose a new family of online proximal algorithms for MKL (as well as for group-lasso and variants thereof), which overcomes that drawback. We show regret, convergence, and generalization bounds for the proposed method. Experiments on handwriting recognition and dependency parsing demonstrate the effectiveness of the approach.
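To make the idea of an online proximal update concrete, here is a minimal sketch of a generic proximal gradient round for a group-lasso penalty. It is the textbook block soft-thresholding operator, not the specific algorithm family proposed in the paper; the function names, the step size `step`, and the regularization weight `lam` are assumptions.

```python
import math

def group_soft_threshold(weights, groups, tau):
    """Proximal operator of tau * sum_g ||w_g||_2 (block soft-thresholding)."""
    out = list(weights)
    for idx in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        # Shrink the whole group toward zero; a group whose norm is at
        # most tau is set exactly to zero, which yields group sparsity.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * weights[i]
    return out

def online_proximal_step(weights, grad, groups, step, lam):
    """One online round: gradient step on the loss, then the group prox."""
    moved = [w - step * g for w, g in zip(weights, grad)]
    return group_soft_threshold(moved, groups, step * lam)

# Two groups of two coordinates each; the second group is small and is zeroed.
updated = online_proximal_step(
    weights=[3.0, 4.0, 0.1, 0.1],
    grad=[0.0, 0.0, 0.0, 0.0],
    groups=[[0, 1], [2, 3]],
    step=1.0,
    lam=1.0,
)
```

Because each round touches only the current gradient and a closed-form shrinkage, no batch problem has to be re-solved, which is the property the abstract contrasts with existing approaches.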
The World as Evolving Information
This paper discusses the benefits of describing the world as information, especially in the study of the evolution of life and cognition. Traditional studies encounter problems because it is difficult to describe life and cognition in terms of matter and energy, since their laws are valid only at the physical scale. However, if matter and energy, as well as life and cognition, are described in terms of information, evolution can be described consistently as information becoming more complex. The paper presents eight tentative laws of information, valid at multiple scales, which are generalizations of Darwinian, cybernetic, thermodynamic, psychological, philosophical, and complexity principles. These are further used to discuss the notions of life, cognition and their evolution.
Adapting Open Information Extraction to Domain-Specific Relations
Soderland, Stephen (University of Washington) | Roof, Brendan (University of Washington) | Qin, Bo (University of Washington) | Xu, Shi (University of Washington) | Mausam, - (University of Washington) | Etzioni, Oren (University of Washington)
Information extraction (IE) can identify a set of relations from free text to support question answering (QA). Until recently, IE systems were domain-specific and needed a combination of manual engineering and supervised learning to adapt to each target domain. A new paradigm, Open IE, operates on large text corpora without any manual tagging of relations, and indeed without any pre-specified relations. We explore the steps needed to adapt Open IE to a domain-specific ontology and demonstrate our approach of mapping domain-independent tuples to an ontology using domains from DARPA's Machine Reading Project.
Reports of the AAAI 2010 Spring Symposia
Barkowsky, Thomas (University of Bremen) | Bertel, Sven (University of Illinois at Urbana-Champaign) | Broz, Frank (University of Hertfordshire) | Chaudhri, Vinay K. (SRI International) | Eagle, Nathan (txteagle, Inc.) | Genesereth, Michael (Stanford University) | Halpin, Harry (University of Edinburgh) | Hamner, Emily (Carnegie Mellon University) | Hoffmann, Gabe (Palo Alto Research Center) | Hölscher, Christoph (University of Freiburg) | Horvitz, Eric (Microsoft Research) | Lauwers, Tom (Carnegie Mellon University) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute) | Michalowski, Marek (BeatBots LLC) | Mower, Emily (University of Southern California) | Shipley, Thomas F. (Temple University) | Stubbs, Kristen (iRobot) | Vogl, Roland (Stanford University) | Williams, Mary-Anne (University of Technology)
The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University's Department of Computer Science, is pleased to present the 2010 Spring Symposium Series, to be held Monday through Wednesday, March 22–24, 2010 at Stanford University. The titles of the seven symposia are Artificial Intelligence for Development; Cognitive Shape Processing; Educational Robotics and Beyond: Design and Evaluation; Embedded Reasoning: Intelligence in Embedded Systems; Intelligent Information Privacy Management; It's All in the Timing: Representing and Reasoning about Time in Interactive Behavior; and Linked Data Meets Artificial Intelligence.
Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries
Lenat, Douglas (Cycorp) | Witbrock, Michael (Cycorp) | Baxter, David (Cycorp) | Blackstone, Eugene (Cleveland Clinic Foundation) | Deaton, Chris (Cycorp) | Schneider, Dave (Cycorp) | Scott, Jerry (Research Intelligence) | Shepard, Blake (Cycorp)
By extending Cyc's ontology and KB by approximately 2%, Cycorp and Cleveland Clinic Foundation (CCF) have built a system to answer clinical researchers' ad hoc queries. But, surprisingly often, after applying various constraints (medical domain knowledge, common sense, discourse pragmatics, syntax), there is only one single way to fit those fragments together, one semantically meaningful formal query P. The system, SRA (for Semantic Research Assistant), dispatches a series of database calls and then combines, logically and arithmetically, their results into answers to P. Seeing the first few answers stream back, the user may realize that they need to abort, modify, and re-ask their query. Besides real-time ad hoc query-answering, queries can be bundled and persist over time. Until full articulation/answering of precise, analytical queries becomes as straightforward and ubiquitous as text search, even partial understanding of a query empowers semantic search over semi-structured data (ontology-tagged text), avoiding many of the false positives and false negatives that standard text searching suffers from.
AI Theory and Practice: A Discussion on Hard Challenges and Opportunities Ahead
Horvitz, Eric (Microsoft Research) | Getoor, Lise (University of Maryland) | Guestrin, Carlos (Carnegie Mellon University) | Hendler, James (Rensselaer Polytechnic Institute) | Konstan, Joseph (University of Minnesota) | Subramanian, Devika (Rice University) | Wellman, Michael (University of Michigan) | Kautz, Henry (University of Rochester)
So, we have a variety of people here with different interests and backgrounds that I asked to talk about not just the key challenges ahead but potential opportunities and promising pathways, trajectories to solving those problems, and their predictions about how R&D might proceed in terms of the timing of various kinds of development over time. I asked the panelists briefly to frame their comments sharing a little bit about fundamental questions, such as, "What is the research goal?" Not everybody stays up late at night hunched over a computer or a simulation or a robotic system, pondering the foundations of intelligence and human-level AI. We have here today Lise Getoor from the University of Maryland; Devika Subramanian, who comes to us from Rice University; we have Carlos Guestrin from Carnegie Mellon University (CMU); James Hendler from Rensselaer Polytechnic Institute (RPI); Mike Wellman at the University of Michigan; Henry Kautz at the University of Rochester; and Joe Konstan, who comes to us from the Midwest, as our Minneapolis person here on the panel. Joe Konstan: I was actually surprised when you [...] I think of myself at the core in human-computer interaction. So I went back and started looking at what I knew of artificial intelligence to try to see where the path forward was, and I was inspired by the past.
[...]ipate the liability and insurance industry; and the other one, that it was a human interface problem, that people don't necessarily want to go and type a bunch of yes/no questions into a computer to get an answer, even with a rule-based explanation, that if you'd taken that just a step further and solved the human problem, it might have worked. Related to that, I was remembering a bunch of these smart house projects. And I have to admit I [...] I think everyone hates smart spaces. [...] there's nobody there, do you warn people and give them a chance to answer? There's no good answer to this question. I can tell you if that person is in bed asleep, the answer is no, don't wake them up [...]
Fuzzy Micro-Agents for Interactive Narrative
Magerko, Brian (Georgia Tech) | Fiesler, Casey (Georgia Institute of Technology) | Baumer, Allan (Georgia Institute of Technology)
This paper describes our current approach to implementing computational improvisational micro-agents. This approach is intended to foster bottom-up research to better understand how to build more complex agent behaviors in a theatrical improvisational setting. Micro-agent designs are based on our current findings in a multi-year study of real-life theatrical improvisers, which aims to better understand the cognition employed in improvisation at the individual and group level. The paper also introduces a key architectural component from the domain of fuzzy logic that enables us to clearly represent some of our current findings.
On the Complexity of Two-Player Attrition Games Played on Graphs
Furtak, Timothy Michael (University of Alberta) | Buro, Michael (University of Alberta)
The attrition game considered in this study is a graph-based strategic game that is a movement-prohibited analogue of small-scale combat situations that arise frequently in popular real-time strategy video games. We present proofs that the attrition game, under a variety of assumptions, is a computationally hard problem in general. We also analyze the 1 vs. n unit case, for which we derive optimal target-orderings that can be computed in polynomial time and used as a core for heuristics for the general case. Finally, we present small problem instances that require randomizing moves, a fact that at first glance seems counterintuitive.
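The abstract does not state the ordering rule it derives, so the sketch below uses a deliberately simplified model that is an assumption, not the paper's: one unit with fixed `attack` per step faces n enemies, each given as `(hit_points, damage_per_step)`, and the goal is to minimize the total damage received before all enemies are dead. In that simplified model the problem is the classic weighted completion time scheduling problem, and sorting by damage per step divided by time to kill (Smith's rule) is optimal.

```python
from itertools import permutations
from math import ceil

def kill_time(hp, attack):
    """Steps needed to destroy an enemy with the given hit points."""
    return ceil(hp / attack)

def damage_taken(order, attack):
    """Total damage received when enemies are destroyed in this order."""
    elapsed = 0
    total = 0
    for hp, dmg in order:
        elapsed += kill_time(hp, attack)
        total += dmg * elapsed   # the enemy keeps firing until it dies
    return total

def target_order(enemies, attack):
    """Attack enemies with the highest damage-per-kill-time ratio first."""
    return sorted(
        enemies,
        key=lambda e: e[1] / kill_time(e[0], attack),
        reverse=True,
    )

# Invented example instance: (hit_points, damage_per_step).
enemies = [(10, 1), (4, 3), (6, 2)]
best = target_order(enemies, attack=2)
brute_force = min(damage_taken(p, 2) for p in permutations(enemies))
```

The sort runs in O(n log n), while the brute-force check over all n! orderings is included only to confirm the rule on a small instance.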
Modeling User Knowledge with Dynamic Bayesian Networks in Interactive Narrative Environments
Rowe, Jonathan P. (North Carolina State University) | Lester, James C. (North Carolina State University)
Recent years have seen a growing interest in interactive narrative systems that dynamically adapt story experiences in response to users’ actions, preferences, and goals. However, relatively little empirical work has investigated runtime models of user knowledge for informing interactive narrative adaptations. User knowledge about plot scenarios, story environments, and interaction strategies is critical in a range of interactive narrative contexts, such as mystery and detective genre stories, as well as narrative scenarios for education and training. This paper proposes a dynamic Bayesian network approach for modeling user knowledge in interactive narrative environments. A preliminary version of the model has been implemented for the Crystal Island interactive narrative-centered learning environment. Results from an initial empirical evaluation suggest several future directions for the design and evaluation of user knowledge models for guiding interactive narrative generation and adaptation.
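As an illustration of how a dynamic Bayesian network can track user knowledge over time, here is a minimal sketch with a single binary "knows the concept" variable per time slice, updated by forward filtering. This is a generic knowledge-tracing style model, not the Crystal Island model; the parameter names and values (`p_learn`, `p_slip`, `p_guess`) are assumptions chosen for the example.

```python
def filter_step(belief, observed_correct, p_learn=0.2, p_slip=0.1, p_guess=0.25):
    """Advance the belief P(user knows) by one time slice.

    First condition on the observed action, then apply the transition.
    """
    # Likelihood of the observation under "knows" and "does not know".
    like_known = (1 - p_slip) if observed_correct else p_slip
    like_unknown = p_guess if observed_correct else (1 - p_guess)
    evidence = belief * like_known + (1 - belief) * like_unknown
    posterior = belief * like_known / evidence
    # Transition: knowledge is kept once acquired and may be gained
    # with probability p_learn between slices.
    return posterior + (1 - posterior) * p_learn

def track(prior, observations, **params):
    """Return the belief after each observation in the sequence."""
    beliefs = []
    belief = prior
    for obs in observations:
        belief = filter_step(belief, obs, **params)
        beliefs.append(belief)
    return beliefs

# Belief trajectory for a user who fails once, then succeeds twice.
trajectory = track(0.5, [False, True, True])
```

A narrative planner could read the latest belief from such a trajectory to decide, for example, whether to surface a hint or advance the plot.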