Mathematical & Statistical Methods
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper formalizes the problem of feature cross-substitution in adversarial classification and presents some approaches to solving it (one heuristic and one exact based on mixed-integer linear programming). Previous work in adversarial classification have considered simpler models for how an adversary would try to fool a classifier by replacing values of features (basically, simple distance functions) and assume that the adversary would try to minimize the distance to the real features why fooling the classifier. The paper points out that this approach has a serious shortcoming when feature selection is done. The authors then suggest that it makes more sense to consider equivalent classes of features as basically being the same features and treating them transparently in feature selection and classifier building. They show results of their two approaches (heuristic and exact) on three datasets, outperforming previous work.
Random Quadratic Forms with Dependence: Applications to Restricted Isometry and Beyond
Arindam Banerjee, Qilong Gu, Vidyashankar Sivakumar, Steven Z. Wu
Several important families of computational and statistical results in machine learning and randomized algorithms rely on uniform bounds on quadratic forms of random vectors or matrices. Such results include the Johnson-Lindenstrauss (J-L) Lemma, the Restricted Isometry Property (RIP), randomized sketching algorithms, and approximate linear algebra. The existing results critically depend on statistical independence, e.g., independent entries for random vectors, independent rows for random matrices, etc., which prevent their usage in dependent or adaptive modeling settings. In this paper, we show that such independence is in fact not needed for such results which continue to hold under fairly general dependence structures. In particular, we present uniform bounds on random quadratic forms of stochastic processes which are conditionally independent and sub-Gaussian given another (latent) process. Our setup allows general dependencies of the stochastic process on the history of the latent process and the latent process to be influenced by realizations of the stochastic process. The results are thus applicable to adaptive modeling settings and also allows for sequential design of random vectors and matrices. We also discuss stochastic process based forms of J-L, RIP, and sketching, to illustrate the generality of the results.
Graph Coloring for Multi-Task Learning
Patapati, Santosh, Srinivasan, Trisanth
When different objectives conflict with each other in multi-task learning, gradients begin to interfere and slow convergence, thereby potentially reducing the final model's performance. To address this, we introduce SON-GOKU, a scheduler that computes gradient interference, constructs an interference graph, and then applies greedy graph-coloring to partition tasks into groups that align well with each other. At each training step, only one group (color class) of tasks are activated, and the grouping partition is constantly recomputed as task relationships evolve throughout training. By ensuring that each mini-batch contains only tasks that pull the model in the same direction, our method improves the effectiveness of any underlying multi-task learning optimizer without additional tuning. Since tasks within these groups will update in compatible directions, multi-task learning will improve model performance rather than impede it. Empirical results on six different datasets show that this interference-aware graph-coloring approach consistently outperforms baselines and state-of-the-art multi-task optimizers. We provide extensive theory showing why grouping and sequential updates improve multi-task learning, with guarantees on descent, convergence, and accurately identifying what tasks conflict or align.