Rensselaer Polytechnic Institute
Toward an Intelligent Agent for Fraud Detection — The CFE Agent
Johnson, Joe (Rensselaer Polytechnic Institute)
One of the primary realms into which artificial intelligence research has ventured is that of psychometric tests. Whether performance on tests should serve as the metric by which we determine whether a machine is intelligent has been debated since Alan Turing proposed the Turing Test. Depending on the reader's predisposition, this idea may either solidify or challenge one's sense of what artificial intelligence really is. As discussed in this paper, there is a history of efforts to create agents that perform well on tests, in the spirit of an interpretation of artificial intelligence called ``Psychometric AI''. The focus of this paper, however, is to describe a machine agent developed in this tradition, hereafter called the CFE Agent. The CFE Exam is the gateway to certification by the Association of Certified Fraud Examiners (ACFE), a widely recognized professional credential within the fraud examiner profession. The CFE Agent attempts to emulate the successful performance of a human test taker, using what would appear to be simplistic natural language processing approaches to answer test questions. It is also hoped that the reader will be convinced that the same core technologies can be successfully applied within the larger domain of fraud detection. We also briefly discuss further work in which we attempt to take these techniques to a deeper level, one that gives a better sense of the knowledge the agent is using and how that knowledge is applied to formulate answers.
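By way of illustration, one simplistic NLP approach of the kind the abstract alludes to is lexical-overlap answer selection for multiple-choice questions: pick the option sharing the most words with the question stem. The function names and the sample question below are illustrative assumptions, not the CFE Agent's actual method or test content.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_answer(question, options):
    """Return the option sharing the most tokens with the question stem."""
    q_tokens = tokenize(question)
    return max(options, key=lambda opt: len(q_tokens & tokenize(opt)))

question = "Which report documents the findings of a fraud examination?"
options = [
    "A balance sheet",
    "A fraud examination report",
    "An audit engagement letter",
]
print(pick_answer(question, options))  # -> A fraud examination report
```

Such a bag-of-words heuristic ignores word order and negation entirely, which is part of why deeper approaches, as discussed at the end of the abstract, are the natural next step.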
OntoAgents Gauge Their Confidence In Language Understanding
McShane, Marjorie (Rensselaer Polytechnic Institute) | Nirenburg, Sergei (Rensselaer Polytechnic Institute)
This paper details how OntoAgents, language-endowed intelligent agents developed in the OntoAgent framework, assess their confidence in understanding language inputs. It presents scoring heuristics for the following subtasks of natural language understanding: lexical disambiguation and the establishment of semantic dependencies; reference resolution; nominal compounding; the treatment of fragments; and the interpretation of indirect speech acts. The scoring of confidence in individual linguistic subtasks is a prerequisite for computing the overall confidence in the understanding of an utterance. This, in turn, is a prerequisite for the agent’s deciding how to act upon that level of understanding.
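As a minimal sketch of the aggregation step described above, one plausible rule combines per-subtask confidences into an utterance-level score by taking their product, so that any single poorly understood subtask drags down the whole. The aggregation rule, the scores, and the action threshold below are illustrative assumptions, not OntoAgent's actual heuristics.

```python
from math import prod

def overall_confidence(subtask_scores):
    """Combine per-subtask confidences (each in [0, 1]) into one score.

    A product makes overall understanding only as strong as its
    weakest link, since any low subtask score shrinks the result."""
    return prod(subtask_scores.values())

scores = {
    "lexical_disambiguation": 0.9,
    "reference_resolution": 0.8,
    "nominal_compounding": 1.0,
}
conf = overall_confidence(scores)
# The agent might act directly above some threshold, or seek
# clarification below it.
action = "act" if conf >= 0.7 else "ask_for_clarification"
print(round(conf, 2), action)  # 0.72 act
```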
Why the Data Train Needs Semantic Rails
Janowicz, Krzysztof (University of California, Santa Barbara) | Harmelen, Frank van (Vrije Universiteit Amsterdam) | Hendler, James A. (Rensselaer Polytechnic Institute) | Hitzler, Pascal (Wright State University)
While catchphrases such as big data, smart data, data-intensive science, or smart dust highlight different aspects, they share a common theme: namely, a shift towards a data-centric perspective in which the synthesis and analysis of data at an ever-increasing spatial, temporal, and thematic resolution promises new insights, while, at the same time, reducing the need for strong domain theories as starting points. In terms of the envisioned methodologies, those catchphrases tend to emphasize the role of predictive analytics, that is, statistical techniques including data mining and machine learning, as well as supercomputing. Interestingly, however, while this perspective takes the availability of data as a given, it does not answer the question of how one would discover the required data in today's chaotic information universe, how one would understand which datasets can be meaningfully integrated, and how to communicate the results to humans and machines alike. The semantic web addresses these questions. In the following, we argue why the data train needs semantic rails. We point out that making sense of data and gaining new insights works best if inductive and deductive techniques go hand-in-hand instead of competing over the prerogative of interpretation.
Semantics for Big Data
Harmelen, Frank van (Vrije Universiteit Amsterdam) | Hendler, James A. (Rensselaer Polytechnic Institute) | Hitzler, Pascal (Wright State University) | Janowicz, Krzysztof (University of California, Santa Barbara)
A Novel Neural Topic Model and Its Supervised Extension
Cao, Ziqiang (Peking University) | Li, Sujian (Peking University) | Liu, Yang (Peking University) | Li, Wenjie (Hong Kong Polytechnic University) | Ji, Heng (Rensselaer Polytechnic Institute)
Topic modeling techniques have the benefit of modeling words and documents uniformly under a probabilistic framework. However, they also suffer from the limitations of sensitivity to initialization and unigram topic distributions, which can be remedied by deep learning techniques. To explore the combination of topic modeling and deep learning techniques, we first explain the standard topic model from the perspective of a neural network. Based on this, we propose a novel neural topic model (NTM) in which the representations of words and documents are efficiently and naturally combined into a uniform framework. Extending NTM, we can easily add a label layer, yielding the supervised neural topic model (sNTM) for supervised tasks. Experiments show that our models are competitive in both topic discovery and classification/regression tasks.
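A shape-level sketch of the neural view of topic modeling that such work builds on: each word carries a softmax-normalized distribution over topics, and a document's topic distribution is the average over its words. The dimensions, random weights, and averaging rule below are illustrative assumptions, not the exact NTM architecture.

```python
import math
import random

random.seed(0)
n_words, n_topics = 100, 5

def softmax(xs):
    """Normalize raw scores into a probability distribution."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# One softmax-normalized row per word: P(topic | word).
word_topic = [softmax([random.gauss(0, 1) for _ in range(n_topics)])
              for _ in range(n_words)]

def doc_topics(word_ids):
    """Average the topic distributions of a document's words."""
    cols = zip(*(word_topic[w] for w in word_ids))
    return [sum(c) / len(word_ids) for c in cols]

theta = doc_topics([3, 17, 42, 42, 99])  # a toy bag of word indices
print(len(theta), round(sum(theta), 6))  # 5 1.0
```

In a trained model the word-topic weights would of course be learned rather than random, and a supervised variant would stack a label-prediction layer on top of the document representation.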
Spatio-Temporal Signatures of User-Centric Data: How Similar Are We?
Shukla, Samta (Rensselaer Polytechnic Institute) | Telang, Aditya (IBM Research, India) | Joshi, Salil (IBM Research, India) | Subramaniam, L. Venkat (IBM Research, India)
Much work has been done on understanding and predicting human mobility in time. In this work, we are interested in obtaining the set of users who are spatio-temporally most similar to a query user. We propose an efficient user data representation called Spatio-Temporal Signatures to keep a complete record of user movement, and we define a measure called Spatio-Temporal similarity for comparing a given pair of users. Although computing exact pairwise Spatio-Temporal similarities between the query user and all other users is inefficient, we show that with our hybrid pruning scheme the most similar users can be obtained in logarithmic time within a (1+\epsilon)-factor approximation of the optimal. We are developing a framework to test our models against a real dataset of urban users.
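To make the comparison concrete, the sketch below reduces a user's signature to a set of (grid-cell, hour) visit pairs and scores a pair of users by Jaccard overlap. Both the representation and the measure are illustrative stand-ins for the paper's Spatio-Temporal Signatures and similarity, whose exact definitions are not reproduced here; the sketch also performs a naive linear scan rather than the paper's logarithmic-time pruning.

```python
def st_similarity(sig_a, sig_b):
    """Jaccard overlap of two sets of (cell, hour) visit pairs."""
    if not sig_a and not sig_b:
        return 1.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)

def most_similar(query_sig, users):
    """Rank user names by similarity to the query signature (naive scan)."""
    return sorted(users, key=lambda u: st_similarity(query_sig, users[u]),
                  reverse=True)

users = {
    "alice": {("c1", 9), ("c2", 13), ("c3", 18)},
    "bob":   {("c1", 9), ("c2", 13), ("c9", 22)},
    "carol": {("c7", 2), ("c8", 5)},
}
query = {("c1", 9), ("c2", 13), ("c3", 18)}
print(most_similar(query, users))  # ['alice', 'bob', 'carol']
```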
Automatic Ellipsis Resolution: Recovering Covert Information from Text
McShane, Marjorie (Rensselaer Polytechnic Institute) | Babkin, Petr (Rensselaer Polytechnic Institute)
Ellipsis is a linguistic process that makes certain aspects of text meaning not directly traceable to surface text elements and, therefore, inaccessible to most language processing technologies. However, detecting and resolving ellipsis is an indispensable capability for language-enabled intelligent agents. The key insight of the work presented here is that not all cases of ellipsis are equally difficult: some can be detected and resolved with high confidence even before we are able to build agents with full human-level semantic and pragmatic understanding of text. This paper describes a fully automatic, implemented and evaluated method of treating one class of ellipsis: elided scopes of modality. Our cognitively-inspired approach, which centrally leverages linguistic principles, has also been applied to overt referring expressions with equally promising results.
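As a toy illustration of the phenomenon treated here, consider a modal whose verb-phrase scope is elided, as in "Mary can attend, and John can too." The regex heuristic below merely flags candidate ellipsis sites where a modal is followed by "too"/"not" or punctuation rather than an overt verb phrase; it is a simplification for exposition, not the paper's implemented method.

```python
import re

MODALS = r"(?:can|could|will|would|should|must|may|might)"
# A modal followed by "too"/"not" or punctuation, not a verb phrase.
ELIDED = re.compile(
    rf"\b{MODALS}\b(?=\s*(?:too|not)?\s*[,.!?]|\s+(?:too|not)\b)",
    re.IGNORECASE,
)

def find_elided_modals(sentence):
    """Return modals that appear without an overt verb-phrase scope."""
    return [m.group(0) for m in ELIDED.finditer(sentence)]

print(find_elided_modals("Mary can attend, and John can too."))  # ['can']
```

A real resolver must then recover the elided scope ("attend") from the antecedent clause, which is the harder step the paper addresses.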
Toward Next Generation Integrative Semantic Health Information Assistants
Patton, Evan W. (Rensselaer Polytechnic Institute) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute)
Traditionally, artificial intelligence in medical applications has focused on improving the abilities of medical professionals to perform tasks such as diagnosis (e.g., Shortliffe 1986; Wyatt and Spiegelhalter 1991; Garg et al. 2005; Vihinen and Samarghitean 2008) or to aid in managing drug interactions (e.g., Bindoff et al. 2007) or side effects (Edwards and Aronson 2000, p. 1258). These efforts target users who have years of medical experience. In contrast, patients often have limited medical knowledge, and they may be coping with new life-threatening diagnoses that may require a number of adjustments. We can also leverage medical ontologies/taxonomies to help abstract specific details to concepts that can be more easily introduced and then later refined when a patient is ready. Additionally, we can have annotations to provide information about the authoritativeness of content. Furthermore, in many cases information will need to travel beyond the patient to family or hired caregivers (Williams et al. 2002, p. 387), which means that multiple explanations will need to be generated based on the target individual's knowledge. Explanation generation also involves applications of user modeling.
Motivation, Microdrives and Microgoals in Mockingbird
Lynch, Michael Francis (Rensselaer Polytechnic Institute)
This paper is a work-in-progress report about Mockingbird, an intelligent musical agent (IMA) based on Sun's Clarion cognitive architecture (Sun 2003). In the first part we present the Clarion architecture and the manner in which its Motivation Subsystem models drive states and goals. In the second part we propose a potential structure for modeling fine-grained secondary drives in the context of a free improvisational performance.
Preface
Braziunas, Darius (Kobo Inc.) | Endres, Markus (University of Augsburg) | Venable, K. Brent (Tulane University) | Weng, Paul (Université Pierre et Marie Curie) | Xia, Lirong (Rensselaer Polytechnic Institute)
Nearly all areas of artificial intelligence deal with choice situations and can thus benefit from computational methods for handling preferences. Moreover, social choice methods are also of key importance in computational domains such as multiagent systems. This broadened scope of preferences leads to new types of preference models, new problems for applying preference structures, and new kinds of benefits. Preferences are inherently a multi-disciplinary topic, of interest to economists, computer scientists, operations researchers, mathematicians, and more. The workshop on Advances in Preference Handling promotes this broadened scope of preference handling. The workshop seeks to improve the overall understanding of the benefits of preferences for these tasks. Another important goal is to provide cross-fertilization between different fields.