Goto

Collaborating Authors

 Industry


How Many Vote Operations Are Needed to Manipulate A Voting System?

arXiv.org Artificial Intelligence

In this paper, we propose a framework to study a general class of strategic behavior in voting, which we call vote operations. We prove the following theorem: if we fix the number of alternatives, generate $n$ votes i.i.d. according to a distribution $\pi$, and let $n$ go to infinity, then for any $\epsilon >0$, with probability at least $1-\epsilon$, the minimum number of operations that are needed for the strategic individual to achieve her goal falls into one of the following four categories: (1) 0, (2) $\Theta(\sqrt n)$, (3) $\Theta(n)$, and (4) $\infty$. This theorem holds for any set of vote operations, any individual vote distribution $\pi$, and any integer generalized scoring rule, which includes (but is not limited to) almost all commonly studied voting rules, e.g., approval voting, all positional scoring rules (including Borda, plurality, and veto), plurality with runoff, Bucklin, Copeland, maximin, STV, and ranked pairs. We also show that many well-studied types of strategic behavior fall under our framework, including (but not limited to) constructive/destructive manipulation, bribery, and control by adding/deleting votes, margin of victory, and minimum manipulation coalition size. Therefore, our main theorem naturally applies to these problems.


The Guppy Effect as Interference

arXiv.org Artificial Intelligence

A concrete formal understanding of how concepts combine is vital to significant progress in many fields including psychology, linguistics, and cognitive science. However, concepts have been resistant to mathematical description because people use conjunctions and disjunctions of concepts in ways that violate the rules of classical logic; i.e., concepts interact in ways that are non-compositional [4]. This is true also with respect to properties (e.g., although people do not rate talks as a characteristic property of Pet or Bird, they rate it as characteristic of Pet Bird) and exemplar typicalities (e.g., although people do not rate Guppy as a typical Pet, nor a typical Fish, they rate it as a highly typical Pet Fish [5]). This has come to be known as the Pet Fish Problem, and the general phenomenon wherein the typicality of an exemplar for a conjunctively combined concept is greater than that for either of the constituent concepts has come to be called the Guppy Effect, although further investigation revealed that the Pet Fish Problem is not a particularly good example of the Guppy Effect, and that other concept combinations exhibit this effect more strongly [6]. One can refer to the situation wherein people estimate the typicality of an exemplar of the concept combination as more extreme than it is for one of the constituent concepts in a conjunctive combination as overextension.


Experiments with Game Tree Search in Real-Time Strategy Games

arXiv.org Artificial Intelligence

Game tree search algorithms such as minimax have been used with enormous success in turn-based adversarial games such as Chess or Checkers. However, such algorithms cannot be directly applied to real-time strategy (RTS) games because a number of reasons. For example, minimax assumes a turn-taking game mechanics, not present in RTS games. In this paper we present RTMM, a real-time variant of the standard minimax algorithm, and discuss its applicability in the context of RTS games. We discuss its strengths and weaknesses, and evaluate it in two real-time games.


The Graphical Lasso: New Insights and Alternatives

arXiv.org Machine Learning

The graphical lasso \citep{FHT2007a} is an algorithm for learning the structure in an undirected Gaussian graphical model, using $\ell_1$ regularization to control the number of zeros in the precision matrix ${\B\Theta}={\B\Sigma}^{-1}$ \citep{BGA2008,yuan_lin_07}. The {\texttt R} package \GL\ \citep{FHT2007a} is popular, fast, and allows one to efficiently build a path of models for different values of the tuning parameter. Convergence of \GL\ can be tricky; the converged precision matrix might not be the inverse of the estimated covariance, and occasionally it fails to converge with warm starts. In this paper we explain this behavior, and propose new algorithms that appear to outperform \GL. By studying the "normal equations" we see that, \GL\ is solving the {\em dual} of the graphical lasso penalized likelihood, by block coordinate ascent; a result which can also be found in \cite{BGA2008}. In this dual, the target of estimation is $\B\Sigma$, the covariance matrix, rather than the precision matrix $\B\Theta$. We propose similar primal algorithms \PGL\ and \DPGL, that also operate by block-coordinate descent, where $\B\Theta$ is the optimization target. We study all of these algorithms, and in particular different approaches to solving their coordinate sub-problems. We conclude that \DPGL\ is superior from several points of view.


Structured Prediction Cascades

arXiv.org Machine Learning

Structured prediction tasks pose a fundamental trade-off between the need for model complexity to increase predictive power and the limited computational resources for inference in the exponentially-sized output spaces such models require. We formulate and develop the Structured Prediction Cascade architecture: a sequence of increasingly complex models that progressively filter the space of possible outputs. The key principle of our approach is that each model in the cascade is optimized to accurately filter and refine the structured output state space of the next model, speeding up both learning and inference in the next layer of the cascade. We learn cascades by optimizing a novel convex loss function that controls the trade-off between the filtering efficiency and the accuracy of the cascade, and provide generalization bounds for both accuracy and efficiency. We also extend our approach to intractable models using tree-decomposition ensembles, and provide algorithms and theory for this setting. We evaluate our approach on several large-scale problems, achieving state-of-the-art performance in handwriting recognition and human pose recognition. We find that structured prediction cascades allow tremendous speedups and the use of previously intractable features and models in both settings.


Toward an Integrated Framework for Automated Development and Optimization of Online Advertising Campaigns

arXiv.org Artificial Intelligence

Creating and monitoring competitive and cost-effective pay-per-click advertisement campaigns through the web-search channel is a resource demanding task in terms of expertise and effort. Assisting or even automating the work of an advertising specialist will have an unrivaled commercial value. In this paper we propose a methodology, an architecture, and a fully functional framework for semi- and fully- automated creation, monitoring, and optimization of cost-efficient pay-per-click campaigns with budget constraints. The campaign creation module generates automatically keywords based on the content of the web page to be advertised extended with corresponding ad-texts. These keywords are used to create automatically the campaigns fully equipped with the appropriate values set. The campaigns are uploaded to the auctioneer platform and start running. The optimization module focuses on the learning process from existing campaign statistics and also from applied strategies of previous periods in order to invest optimally in the next period. The objective is to maximize the performance (i.e. clicks, actions) under the current budget constraint. The fully functional prototype is experimentally evaluated on real world Google AdWords campaigns and presents a promising behavior with regards to campaign performance statistics as it outperforms systematically the competing manually maintained campaigns.


Domain and Function: A Dual-Space Model of Semantic Relations and Compositions

Journal of Artificial Intelligence Research

Given appropriate representations of the semantic relations between carpenter and wood and between mason and stone (for example, vectors in a vector space model), a suitable algorithm should be able to recognize that these relations are highly similar (carpenter is to wood as mason is to stone; the relations are analogous). Likewise, with representations of dog, house, and kennel, an algorithm should be able to recognize that the semantic composition of dog and house, dog house, is highly similar to kennel (dog house and kennel are synonymous). It seems that these two tasks, recognizing relations and compositions, are closely connected. However, up to now, the best models for relations are significantly different from the best models for compositions. In this paper, we introduce a dual-space model that unifies these two tasks. This model matches the performance of the best previous models for relations and compositions. The dual-space model consists of a space for measuring domain similarity and a space for measuring function similarity. Carpenter and wood share the same domain, the domain of carpentry. Mason and stone share the same domain, the domain of masonry. Carpenter and mason share the same function, the function of artisans. Wood and stone share the same function, the function of materials. In the composition dog house, kennel has some domain overlap with both dog and house (the domains of pets and buildings). The function of kennel is similar to the function of house (the function of shelters). By combining domain and function similarities in various ways, we can model relations, compositions, and other aspects of semantics.


SAP Speaks PDDL: Exploiting a Software-Engineering Model for Planning in Business Process Management

Journal of Artificial Intelligence Research

Planning is concerned with the automated solution of action sequencing problems described in declarative languages giving the action preconditions and effects. One important application area for such technology is the creation of new processes in Business Process Management (BPM), which is essential in an ever more dynamic business environment. A major obstacle for the application of Planning in this area lies in the modeling. Obtaining a suitable model to plan with -- ideally a description in PDDL, the most commonly used planning language -- is often prohibitively complicated and/or costly. Our core observation in this work is that this problem can be ameliorated by leveraging synergies with model-based software development. Our application at SAP, one of the leading vendors of enterprise software, demonstrates that even one-to-one model re-use is possible. The model in question is called Status and Action Management (SAM). It describes the behavior of Business Objects (BO), i.e., large-scale data structures, at a level of abstraction corresponding to the language of business experts. SAM covers more than 400 kinds of BOs, each of which is described in terms of a set of status variables and how their values are required for, and affected by, processing steps (actions) that are atomic from a business perspective. SAM was developed by SAP as part of a major model-based software engineering effort. We show herein that one can use this same model for planning, thus obtaining a BPM planning application that incurs no modeling overhead at all. We compile SAM into a variant of PDDL, and adapt an off-the-shelf planner to solve this kind of problem. Thanks to the resulting technology, business experts may create new processes simply by specifying the desired behavior in terms of status variable value changes: effectively, by describing the process in their own language.


Learning a peptide-protein binding affinity predictor with kernel ridge regression

arXiv.org Machine Learning

We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalize eight kernels, such as the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for it's approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the PepX database. For the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. On all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. The method should be of value to a large segment of the research community with the potential to accelerate peptide-based drug and vaccine development.


Decision Making for Symbolic Probability

arXiv.org Artificial Intelligence

This paper proposes a decision theory for a symbolic generalization of probability theory (SP). Darwiche and Ginsberg [2, 3] proposed SP to relax the requirement of using numbers for uncertainty while preserving desirable patterns of Bayesian reasoning. SP represents uncertainty by symbolic supports that are ordered partially rather than completely as in the case of standard probability. We show that a preference relation on acts that satisfies a number of intuitive postulates is represented by a utility function whose domain is a set of pairs of supports. We argue that a subjective interpretation is as useful and appropriate for SP as it is for numerical probability. It is useful because the subjective interpretation provides a basis for uncertainty elicitation. It is appropriate because we can provide a decision theory that explains how preference on acts is based on support comparison.