Métodos para la Selección y el Ajuste de Características en el Problema de la Detección de Spam

arXiv.org Artificial Intelligence

Email is used daily by millions of people to communicate around the globe, and it is a mission-critical application for many businesses. Over the last decade, unsolicited bulk email has become a major problem for email users. An overwhelming amount of spam is flowing into users' mailboxes daily. In 2004, an estimated 62% of all email was attributed to spam. Spam is not only frustrating for most email users; it also strains the IT infrastructure of organizations and costs businesses billions of dollars in lost productivity. In recent years, spam has evolved from an annoyance into a serious security threat, and is now a prime medium for phishing of sensitive information, as well as the spread of malicious software. This work presents a first approach to attacking the spam problem. We propose an algorithm that improves a classifier's results by adjusting its training set data. It improves the document's vocabulary representation by detecting good topic descriptors and discriminators.


Report on the Twenty-Third International Florida Artificial Intelligence Research Society Conference (FLAIRS-23)

AI Magazine

The 23rd International Florida Artificial Intelligence Research Society Conference (FLAIRS-23) was held May 19-21, 2010 at The Shores Resort & Spa in Daytona Beach Shores, Florida, USA. The conference featured an exciting lineup of invited speakers, a general conference track on artificial intelligence research, and numerous special tracks. The conference chair was David Wilson from the University of North Carolina at Charlotte. The special tracks coordinator was Philip McCarthy from the University of Memphis.


Adapting Open Information Extraction to Domain-Specific Relations

AI Magazine

Information extraction (IE) can identify a set of relations from free text to support question answering (QA). Until recently, IE systems were domain-specific and needed a combination of manual engineering and supervised learning to adapt to each target domain. A new paradigm, Open IE, operates on large text corpora without any manual tagging of relations, and indeed without any pre-specified relations. We explore the steps needed to adapt Open IE to a domain-specific ontology and demonstrate our approach of mapping domain-independent tuples to an ontology using domains from DARPA's Machine Reading Project.


True Knowledge: Open-Domain Question Answering Using Structured Knowledge and Inference

AI Magazine

This article gives a detailed description of True Knowledge: a commercial, open-domain question answering platform. The system combines a large and growing structured knowledge base of common sense, factual, and lexical knowledge; a natural language translation system that turns user questions into internal language-independent queries; and an inference system that can answer those queries using both directly represented and inferred knowledge. The system is live and answers millions of questions per month asked by internet users.


Reports of the AAAI 2010 Spring Symposia

AI Magazine

The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University's Department of Computer Science, is pleased to present the 2010 Spring Symposium Series, to be held Monday through Wednesday, March 22–24, 2010 at Stanford University. The titles of the seven symposia are Artificial Intelligence for Development; Cognitive Shape Processing; Educational Robotics and Beyond: Design and Evaluation; Embedded Reasoning: Intelligence in Embedded Systems; Intelligent Information Privacy Management; It's All in the Timing: Representing and Reasoning about Time in Interactive Behavior; and Linked Data Meets Artificial Intelligence.


Project Halo Update--Progress Toward Digital Aristotle

AI Magazine

In the Winter 2004 issue of AI Magazine, we reported Vulcan Inc.'s first step toward creating a question-answering system called "Digital Aristotle." The goal of that first step was to assess the state of the art in applied Knowledge Representation and Reasoning (KRR) by asking AI experts to represent 70 pages from the advanced placement (AP) chemistry syllabus and to deliver knowledge-based systems capable of answering questions from that syllabus. This paper reports the next step toward realizing a Digital Aristotle: we present the design and evaluation results for a system called AURA, which enables domain experts in physics, chemistry, and biology to author a knowledge base and then allows a different set of users to ask novel questions against that knowledge base. These results represent a substantial advance over what we reported in 2004, both in the breadth of covered subjects and in the provision of sophisticated technologies in knowledge representation and reasoning, natural language processing, and question answering to domain experts and novice users.


Introduction to the Special Issue on Question Answering

AI Magazine

This special issue of AI Magazine presents six articles on some of the most interesting question answering systems in development today. Included are articles on Project Halo, the Semantic Research Assistant, Watson, True Knowledge, and TextRunner (University of Washington's clever use of statistical NL techniques to answer questions across the open web).


Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries

AI Magazine

By extending Cyc’s ontology and KB approximately 2%, Cycorp and Cleveland Clinic Foundation (CCF) have built a system to answer clinical researchers’ ad hoc queries. The query may be long and complex, hence only partially understood at first, parsed into a set of CycL (higher-order logic) fragments with open variables. But, surprisingly often, after applying various constraints (medical domain knowledge, common sense, discourse pragmatics, syntax), there is only one way to fit those fragments together, one semantically meaningful formal query P. The system, SRA (for Semantic Research Assistant), dispatches a series of database calls and then combines, logically and arithmetically, their results into answers to P. Seeing the first few answers stream back, the user may realize that they need to abort, modify, and re-ask their query. Even before they push ASK, just knowing approximately how many answers would be returned can spark such editing. Besides real-time ad hoc query-answering, queries can be bundled and persist over time. One bundle of 275 queries is rerun quarterly by CCF to produce the procedures and outcomes data it needs to report to STS (Society of Thoracic Surgeons, an external hospital accreditation and ranking body); another bundle covers ACC (American College of Cardiology) reporting. Until full articulation/answering of precise, analytical queries becomes as straightforward and ubiquitous as text search, even partial understanding of a query empowers semantic search over semi-structured data (ontology-tagged text), avoiding many of the false positives and false negatives that standard text searching suffers from.


A Distributed Method for Evaluating Properties of a Robot Formation

AAAI Conferences

As a robot formation increases in size or explores places where it is difficult for a human operator to interact, autonomous control becomes critical. We propose a distributed autonomous method for evaluating properties of multi-robot systems, and then discuss how this information can be applied to improve performance with respect to a given operation. We present this as an extension of our previous work on robot formations; however, the techniques described could be adapted to other multi-robot systems.