Collaborating Authors

 Garg, Dinesh


Improved Linear Embeddings via Lagrange Duality

arXiv.org Machine Learning

Near-isometric orthogonal embeddings to lower dimensions are a fundamental tool in data science and machine learning. In this paper, we present the construction of such embeddings that minimize the maximum distortion for a given set of points. We formulate the problem as a nonconvex constrained optimization problem. We first construct a primal relaxation and then use the theory of Lagrange duality to create a dual relaxation. We also suggest a polynomial-time algorithm, based on the theory of convex optimization, that provably solves the dual relaxation. We provide a theoretical upper bound on the approximation guarantee of our algorithm, which depends only on the spectral properties of the dataset. We experimentally demonstrate the superiority of our algorithm over baselines in terms of scalability and the ability to achieve lower distortion.
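To make the objective concrete, the following minimal sketch evaluates the maximum distortion of a given orthogonal projection over all pairs of points. It assumes one common definition of distortion, namely the absolute deviation of the squared projected pairwise distance from the squared original distance, relative to the latter; the abstract does not spell out the exact definition. The function name `max_distortion` and the toy data are hypothetical, and the sketch only scores a candidate embedding rather than implementing the paper's dual-relaxation algorithm.

```python
import itertools


def max_distortion(points, basis):
    """Return the largest | ||P d||^2 / ||d||^2 - 1 | over all pairwise differences d.

    points: list of equal-length tuples of floats.
    basis:  list of orthonormal row vectors defining the projection P
            (assumed orthonormal; this is not checked here).
    """
    worst = 0.0
    for x, y in itertools.combinations(points, 2):
        diff = [a - b for a, b in zip(x, y)]
        norm_sq = sum(d * d for d in diff)
        if norm_sq == 0.0:
            continue  # identical points carry no distance information
        # Squared length of the projected difference vector.
        proj_sq = sum(sum(b * d for b, d in zip(row, diff)) ** 2 for row in basis)
        worst = max(worst, abs(proj_sq / norm_sq - 1.0))
    return worst


# Toy data: four points that lie almost in the xy-plane.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.1)]
xy_plane = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
xz_plane = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

print(max_distortion(points, xy_plane))  # small: the data is nearly planar
print(max_distortion(points, xz_plane))  # large: one pair collapses to a point
```

Comparing the two candidate projections illustrates why the choice of subspace matters: the problem described above searches over all orthogonal projections for the one with the smallest worst-case value.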


A Sparse Nonlinear Classifier Design Using AUC Optimization

arXiv.org Machine Learning

AUC (area under the ROC curve) is an important performance measure for applications where the data is highly imbalanced. Learning to maximize AUC performance is thus an important research problem. Using a max-margin based surrogate loss function, the AUC optimization problem can be approximated as a pairwise rankSVM learning problem. Batch learning methods for solving the kernelized version of this problem suffer from poor scalability and may not result in sparse classifiers. Recent years have witnessed an increased interest in the development of online or single-pass online learning algorithms that design a classifier by maximizing the AUC performance. On many real-world datasets, however, the AUC performance of nonlinear classifiers designed using online methods is not comparable with that of nonlinear classifiers designed using batch learning algorithms. Motivated by these observations, we design a scalable algorithm for maximizing AUC performance by greedily adding the required number of basis functions into the classifier model. The resulting sparse classifiers perform faster inference. Our experimental results show that the achievable model size can be an order of magnitude smaller than that of the Kernel RankSVM model without affecting the AUC performance much.
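The link between AUC and pairwise ranking can be illustrated with a short sketch. AUC equals the fraction of (positive, negative) pairs that the classifier scores in the correct order, and a pairwise hinge loss is a standard max-margin surrogate for that count. The function names, the `margin` parameter, and the toy scores below are assumptions made for illustration; this is not the paper's greedy basis-selection algorithm.

```python
def _split_scores(scores, labels):
    """Separate scores into positives (label 1) and negatives (any other label)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def pairwise_auc(scores, labels):
    """AUC as the fraction of correctly ordered (positive, negative) pairs.

    Ties count as half a correct pair.
    """
    pos, neg = _split_scores(scores, labels)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Average hinge penalty over pairs: max(0, margin - (s_pos - s_neg))."""
    pos, neg = _split_scores(scores, labels)
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


scores = [0.9, 0.4, 0.35, 0.8, 0.1]
labels = [1, 1, 0, 0, 0]
print(pairwise_auc(scores, labels))         # 5 of the 6 pairs are ordered correctly
print(pairwise_hinge_loss(scores, labels))  # surrogate loss on the same pairs
```

Because the loss sums over all positive-negative pairs, its cost grows with the product of the two class sizes, which is one reason batch solvers for the kernelized problem scale poorly.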


Mechanism Design for Cost Optimal PAC Learning in the Presence of Strategic Noisy Annotators

arXiv.org Machine Learning

We consider the problem of Probably Approximately Correct (PAC) learning of a binary classifier from noisy labeled examples acquired from multiple annotators (each characterized by a respective classification noise rate). First, we consider the complete information scenario, where the learner knows the noise rates of all the annotators. For this scenario, we derive a sample complexity bound for the Minimum Disagreement Algorithm (MDA) on the number of labeled examples to be obtained from each annotator. Next, we consider the incomplete information scenario, where each annotator is strategic and holds the respective noise rate as private information. For this scenario, we design a cost-optimal procurement auction mechanism that adapts Myerson's optimal auction design framework in a non-trivial manner. This mechanism satisfies the incentive compatibility property, thereby enabling the learner to elicit the true noise rates of all the annotators.
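For intuition about how the noise rate drives the number of labels required, the sketch below evaluates the classical single-annotator bound of Angluin and Laird for minimum-disagreement learning over a finite hypothesis class: m >= 2 / (eps^2 (1 - 2 eta_b)^2) * ln(2N / delta), where eta_b < 1/2 is an upper bound on the noise rate and N is the number of hypotheses. This is background only and is not the multi-annotator bound derived in the paper; the function name and the example parameters are hypothetical.

```python
import math


def mda_sample_size(epsilon, delta, noise_bound, num_hypotheses):
    """Classical single-annotator sample size for minimum-disagreement learning.

    epsilon:        target error, in (0, 1)
    delta:          allowed failure probability, in (0, 1)
    noise_bound:    upper bound on the classification noise rate, in [0, 0.5)
    num_hypotheses: size N of the finite hypothesis class
    """
    if not 0.0 < epsilon < 1.0 or not 0.0 < delta < 1.0:
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    if not 0.0 <= noise_bound < 0.5:
        raise ValueError("noise_bound must lie in [0, 0.5)")
    if num_hypotheses < 1:
        raise ValueError("num_hypotheses must be at least 1")
    scale = 2.0 / (epsilon ** 2 * (1.0 - 2.0 * noise_bound) ** 2)
    return math.ceil(scale * math.log(2.0 * num_hypotheses / delta))


# The requirement grows quickly as the noise bound approaches 1/2.
for eta in (0.0, 0.1, 0.3, 0.4):
    print(eta, mda_sample_size(epsilon=0.1, delta=0.05, noise_bound=eta, num_hypotheses=1000))
```

The 1 / (1 - 2 eta_b)^2 factor shows why a noisier annotator is less valuable per label, which is the trade-off a procurement mechanism has to price when noise rates are private.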


Threats and Trade-Offs in Resource Critical Crowdsourcing Tasks Over Networks

AAAI Conferences

In recent times, crowdsourcing over social networks has emerged as an active tool for complex task execution. In this paper, we address the problem faced by a planner who must incentivize agents in the network both to execute a task and to help recruit other agents for this purpose. We study this mechanism design problem under two natural resource optimization settings: (1) cost-critical tasks, where the planner's goal is to minimize the total cost, and (2) time-critical tasks, where the goal is to minimize the total time elapsed before the task is executed. We define a set of fairness properties that a crowdsourcing mechanism should ideally satisfy. We prove that no mechanism can satisfy all these properties simultaneously. We relax some of these properties and define their approximate counterparts. Under appropriate approximate fairness criteria, we obtain a non-trivial family of payment mechanisms. Moreover, we provide precise characterizations of cost-critical and time-critical mechanisms.