Goto

Collaborating Authors

 opt1


ImprovedGuaranteesfork-means + + andk-means ++Parallel

Neural Information Processing Systems

Lloyd's algorithm uses iterative improvements to find a locally optimalk-means clustering. The performance of Lloyd'salgorithm crucially depends on the quality of the initial clustering, which isdefined bytheinitial setofcenters, called aseed.



Prompting Contrastive Explanations for Commonsense Reasoning Tasks

arXiv.org Artificial Intelligence

Many commonsense reasoning NLP tasks involve choosing between one or more possible answers to a question or prompt based on knowledge that is often implicit. Large pretrained language models (PLMs) can achieve near-human performance on such tasks, while providing little human-interpretable evidence of the underlying reasoning they use. In this work, we show how to use these same models to generate such evidence: inspired by the contrastive nature of human explanations, we use PLMs to complete explanation prompts which contrast alternatives according to the key attribute(s) required to justify the correct answer (for example, peanuts are usually salty while raisins are sweet). Conditioning model decisions on these explanations improves performance on two commonsense reasoning benchmarks, as compared to previous non-contrastive alternatives. These explanations are also judged by humans to be more relevant for solving the task, and facilitate a novel method to evaluate explanation faithfulfness.


Understanding Distributional Ambiguity via Non-robust Chance Constraint

arXiv.org Machine Learning

The choice of the ambiguity radius is critical when an investor uses the distributionally robust approach to address the issue that the portfolio optimization problem is sensitive to the uncertainties of the asset return distribution. It cannot be set too large because the larger the size of the ambiguity set the worse the portfolio return. It cannot be too small either; otherwise, one loses the robust protection. This tradeoff demands a financial understanding of the ambiguity set. In this paper, we propose a non-robust interpretation of the distributionally robust optimization (DRO) problem. By relating the impact of an ambiguity set to the impact of a non-robust chance constraint, our interpretation allows investors to understand the size of the ambiguity set through parameters that are directly linked to investment performance. We first show that for general $\phi$-divergences, a DRO problem is asymptotically equivalent to a class of mean-deviation problem, where the ambiguity radius controls investor's risk preference. Based on this non-robust reformulation, we then show that when a boundedness constraint is added to the investment strategy, the DRO problem can be cast as a chance-constrained optimization (CCO) problem without distributional uncertainties. If the boundedness constraint is removed, the CCO problem is shown to perform uniformly better than the DRO problem, irrespective of the radius of the ambiguity set, the choice of the divergence measure, or the tail heaviness of the center distribution. Our results apply to both the widely-used Kullback-Leibler (KL) divergence which requires the distribution of the objective function to be exponentially bounded, as well as those more general divergence measures which allow heavy tail ones such as student $t$ and lognormal distributions.


Parametric Dual Maximization for Non-Convex Learning Problems

AAAI Conferences

We consider a class of non-convex learning problems that can be formulated as jointly optimizing regularized hinge loss and a set of auxiliary variables. Such problems encompass but are not limited to various versions of semi-supervised learning,learning with hidden structures, robust learning, etc. Existing methods either suffer from local minima or have to invoke anon-scalable combinatorial search. In this paper, we propose a novel learning procedure, namely Parametric Dual Maximization(PDM), that can approach global optimality efficiently with user specified approximation levels. The building blocks of PDM are two new results: (1) The equivalent convex maximization reformulation derived by parametric analysis.(2) The improvement of local solutions based on a necessary and sufficient condition for global optimality. Experimental results on two representative applications demonstrate the effectiveness of PDM compared to other approaches.