The Relative Expressiveness of Abstract Argumentation and Logic Programming

AAAI Conferences

We analyze the relative expressiveness of the two-valued semantics of abstract argumentation frameworks, normal logic programs and abstract dialectical frameworks. By expressiveness we mean the ability to encode a desired set of two-valued interpretations over a given propositional vocabulary A using only atoms from A. While the computational complexity of the two-valued model existence problem for all these languages is (almost) the same, we show that the languages form a neat hierarchy with respect to their expressiveness. We then demonstrate that this hierarchy collapses once we allow the introduction of a linear number of new vocabulary elements.


Expressing Arbitrary Reward Functions as Potential-Based Advice

AAAI Conferences

Effectively incorporating external advice is an important problem in reinforcement learning, especially as it moves into the real world. Potential-based reward shaping is a way to provide the agent with a specific form of additional reward, with the guarantee of policy invariance. In this work we give a novel way to incorporate an arbitrary reward function with the same guarantee, by implicitly translating it into the specific form of dynamic advice potentials, which are maintained as an auxiliary value function learnt at the same time. We show that advice provided in this way captures the input reward function in expectation, and demonstrate its efficacy empirically.
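As a rough illustration of the policy-invariance idea mentioned above, the sketch below uses the classic static form of potential-based shaping, F(s, s') = γΦ(s') − Φ(s). This is not the dynamic advice-potential method of the paper; the potential `phi`, the toy trajectory, and all function names are assumptions made for the example. It shows that the shaping terms telescope along a trajectory, so the shaped return differs from the original only by a term that depends on the start and end states.

```python
from typing import Callable, List, Sequence


def shaping_term(phi: Callable[[int], float], s: int, s_next: int, gamma: float) -> float:
    """Static potential-based shaping reward: gamma * phi(s') - phi(s)."""
    return gamma * phi(s_next) - phi(s)


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma^t * r_t over a finite trajectory."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def shaped_rewards(states: Sequence[int], rewards: Sequence[float],
                   phi: Callable[[int], float], gamma: float) -> List[float]:
    """Add the shaping term to each environment reward along a trajectory.

    states has one more entry than rewards: rewards[t] is received on the
    transition states[t] -> states[t + 1].
    """
    return [r + shaping_term(phi, states[t], states[t + 1], gamma)
            for t, r in enumerate(rewards)]


# Hypothetical toy trajectory and potential (assumptions for illustration only).
states = [0, 1, 2, 3]
rewards = [0.0, 0.0, 1.0]
gamma = 0.9


def phi(s: int) -> float:
    return float(s)


shaped = shaped_rewards(states, rewards, phi, gamma)
difference = discounted_return(shaped, gamma) - discounted_return(rewards, gamma)
# The shaping terms telescope: the difference depends only on the endpoints.
telescoped = (gamma ** len(rewards)) * phi(states[-1]) - phi(states[0])
```

Because the difference does not depend on the actions taken in between, the ranking of policies is left unchanged in this static setting.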


Policy Tree: Adaptive Representation for Policy Gradient

AAAI Conferences

Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Policy gradient algorithms, which directly represent the policy, often need fewer parameters to learn good policies. However, they typically employ a fixed parametric representation that may not be sufficient for complex domains. This paper introduces the Policy Tree algorithm, which can learn an adaptive representation of policy in the form of a decision tree over different instantiations of a base policy. Policy gradient is used both to optimize the parameters and to grow the tree by choosing splits that enable the maximum local increase in the expected return of the policy. Experiments show that this algorithm can choose genuinely helpful splits and significantly improve upon the commonly used linear Gibbs softmax policy, which we choose as our base policy.
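For readers unfamiliar with the base policy named above, here is a minimal sketch of a linear Gibbs softmax policy, assuming one weight vector per action and a shared state feature vector. The names `gibbs_policy` and `log_policy_gradient` and the example numbers are hypothetical; the tree-growing procedure of the paper is not reproduced here.

```python
import math
from typing import List, Sequence


def gibbs_policy(theta: Sequence[Sequence[float]], features: Sequence[float]) -> List[float]:
    """Action probabilities pi(a|s) proportional to exp(theta_a . features)."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in theta]
    max_score = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - max_score) for score in scores]
    normalizer = sum(exps)
    return [e / normalizer for e in exps]


def log_policy_gradient(theta: Sequence[Sequence[float]], features: Sequence[float],
                        action: int) -> List[List[float]]:
    """Gradient of log pi(action|s) with respect to each action's weights.

    For weight vector theta_b this is features * (1[b == action] - pi(b|s)).
    """
    probs = gibbs_policy(theta, features)
    return [[((1.0 if b == action else 0.0) - probs[b]) * x for x in features]
            for b in range(len(theta))]


# Hypothetical example: three actions, two state features.
theta = [[0.5, -0.2], [0.0, 0.1], [-0.3, 0.4]]
features = [1.0, 2.0]
probs = gibbs_policy(theta, features)
grad = log_policy_gradient(theta, features, action=0)
```

A policy gradient method would scale this gradient by an estimate of the return and take a step in `theta`; the abstract describes using the same gradient signal to decide where to split the tree.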


Unsupervised Word Sense Disambiguation Using Markov Random Field and Dependency Parser

AAAI Conferences

Word Sense Disambiguation is a difficult problem to solve in the unsupervised setting. This is because in this setting inference becomes more dependent on the interplay between different senses in the context due to the unavailability of learning resources. Using two basic ideas, sense dependency and selective dependency, we model the WSD problem as a Maximum A Posteriori (MAP) Inference Query on a Markov Random Field (MRF) built using WordNet and Link Parser or Stanford Parser. To the best of our knowledge this combination of dependency and MRF is novel, and our graph-based unsupervised WSD system beats the state-of-the-art system on SensEval-2, SensEval-3 and SemEval-2007 English all-words datasets while being over 35 times faster.


Dictionary Learning with Mutually Reinforcing Group-Graph Structures

AAAI Conferences

In this paper, we propose a novel dictionary learning method in the semi-supervised setting by dynamically coupling graph and group structures. To this end, samples are represented by sparse codes inheriting their graph structure while the labeled samples within the same class are represented with group sparsity, sharing the same atoms of the dictionary. Instead of statically combining graph and group structures, we take advantage of them in a mutually reinforcing way: in the dictionary learning phase, we introduce the unlabeled samples into groups by an entropy-based method and then update the corresponding local graph, resulting in a more structured and discriminative dictionary. We analyze the relationship between the two structures and prove the convergence of our proposed method. Focusing on the image classification task, we evaluate our approach on several datasets and obtain superior performance compared with the state-of-the-art methods, especially in the case of only a few labeled samples and limited dictionary size.


Bayesian Approach to Modeling and Detecting Communities in Signed Network

AAAI Conferences

There has been an increasing interest in exploring signed networks with positive and negative links in that they contain more information than unsigned networks. As fundamental problems of signed network analysis, community detection and sign (or attitude) prediction are still primary challenges. To address them, we propose a generative Bayesian approach, in which 1) a signed stochastic blockmodel is proposed to characterize the community structure in the context of signed networks, by means of explicitly formulating the distributions of both density and frustration of signed links from a stochastic perspective, and 2) a model learning algorithm is proposed by theoretically deriving a variational Bayes EM for parameter estimation and a variation-based approximate evidence for model selection. Through the comparisons with state-of-the-art methods on synthetic and real-world networks, the proposed approach shows its superiority in both community detection and sign prediction for exploratory networks.


From Non-Negative to General Operator Cost Partitioning

AAAI Conferences

Operator cost partitioning is a well-known technique to make admissible heuristics additive by distributing the operator costs among individual heuristics. Planning tasks are usually defined with non-negative operator costs and therefore it appears natural to demand the same for the distributed costs. We argue that this requirement is not necessary and demonstrate the benefit of using general cost partitioning. We show that LP heuristics for operator-counting constraints are cost-partitioned heuristics and that the state equation heuristic computes a cost partitioning over atomic projections. We also introduce a new family of potential heuristics and show their relationship to general cost partitioning.


An Unsupervised Framework of Exploring Events on Twitter: Filtering, Extraction and Categorization

AAAI Conferences

Twitter, as a popular microblogging service, has become a new information channel for users to receive and exchange the most up-to-date information on current events. However, since there is no control on how users can publish messages on Twitter, finding newsworthy events from Twitter becomes a difficult task like "finding a needle in a haystack". In this paper we propose a general unsupervised framework to explore events from tweets, which consists of a pipeline process of filtering, extraction and categorization. To filter out noisy tweets, the filtering step exploits a lexicon-based approach to separate tweets that are event-related from those that are not. Then, based on these event-related tweets, the structured representations of events are extracted and categorized automatically using an unsupervised Bayesian model without the use of any labelled data. Moreover, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets which were collected for one month in December 2010. A precision of 70.49% is achieved in event extraction, outperforming a competitive baseline by nearly 6%. Events are also clustered into coherent groups with the automatically assigned event type label.


Unsupervised Cross-Domain Transfer in Policy Gradient Reinforcement Learning via Manifold Alignment

AAAI Conferences

The success of applying policy gradient reinforcement learning (RL) to difficult control tasks hinges crucially on the ability to determine a sensible initialization for the policy. Transfer learning methods tackle this problem by reusing knowledge gleaned from solving other related tasks. In the case of multiple task domains, these algorithms require an inter-task mapping to facilitate knowledge transfer across domains. However, there are currently no general methods to learn an inter-task mapping without requiring either background knowledge that is not typically present in RL settings, or an expensive analysis of an exponential number of inter-task mappings in the size of the state and action spaces. This paper introduces an autonomous framework that uses unsupervised manifold alignment to learn inter-task mappings and effectively transfer samples between different task domains. Empirical results on diverse dynamical systems, including an application to quadrotor control, demonstrate its effectiveness for cross-domain transfer in the context of policy gradient RL.


Convex Batch Mode Active Sampling via α-Relative Pearson Divergence

AAAI Conferences

Active learning is a machine learning technique that trains a classifier after selecting a subset from an unlabeled dataset for labeling and using the selected data for training. Recently, batch mode active learning, which selects a batch of samples to label in parallel, has attracted a lot of attention. Its challenge lies in the choice of criteria used for guiding the search of the optimal batch. In this paper, we propose a novel approach to selecting the optimal batch of queries by minimizing the α-relative Pearson divergence (RPE) between the labeled and the original datasets. This particular divergence is chosen since it can distinguish the optimal batch more easily than other measures, especially when available candidates are similar. The proposed objective is a min-max optimization problem, and it is difficult to solve due to the involvement of both minimization and maximization. We find that the objective has an equivalent convex form, and thus a global optimal solution can be obtained. Then the subgradient method can be applied to solve the simplified convex problem. Our empirical studies on UCI datasets demonstrate the effectiveness of the proposed approach compared with the state-of-the-art batch mode active learning methods.