Table 4: Notations
    L — loss function
    b — batch size
    E — total training epochs
    L — epoch interval for subset selection
    SDA — supervised domain adaptation
    g — feature extractor
    h — predictor function
    h ∘ g — composition of predictor and feature extractor

Neural Information Processing Systems

Hence, the above formulation of the set function is submodular and is an instance of a concave over modular function. In our setting, we use the labeled target data D_t as the validation set. The complete d-SNE loss is defined as a combination of L and the cross-entropy loss on the source and target domains. Table 8 shows the training times for this setting. Again, we see that all instantiations of ORIENT achieve a 2.5x speed-up compared to Full.
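
The snippet invokes the standard fact that a concave function of a modular (additive) function is submodular; a minimal sketch of that argument, where the symbols \psi and m are introduced here for illustration and are not from the excerpt:

\[
f(A) \;=\; \psi\Big(\textstyle\sum_{a \in A} m(a)\Big), \qquad \psi \text{ concave and nondecreasing}, \quad m(a) \ge 0 .
\]
Writing $m(A) = \sum_{a \in A} m(a)$, for $A \subseteq B$ and $x \notin B$ we have $m(A) \le m(B)$, so concavity of $\psi$ gives
\[
\psi\big(m(A) + m(x)\big) - \psi\big(m(A)\big) \;\ge\; \psi\big(m(B) + m(x)\big) - \psi\big(m(B)\big),
\]
that is, $f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B)$, which is exactly the diminishing-returns definition of submodularity.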


ORIENT: Submodular Mutual Information Measures for Data Subset Selection under Distribution Shift

Neural Information Processing Systems

The recent success of deep learning frameworks in applications such as image classification [9], speech recognition [20], and object detection [13] stems primarily from the availability of large amounts of labeled data.



ORIENT: Submodular Mutual Information Measures for Data Subset Selection under Distribution Shift

Neural Information Processing Systems

Real-world machine-learning applications require robust models that generalize well under distribution shift, a condition typical of deployment. Domain adaptation techniques address distribution shift by minimizing the disparities between domains, so that a model trained on the source domain performs well on the target domain. Existing domain adaptation methods, however, are computationally very expensive. In this work, we improve the efficiency of existing supervised domain adaptation (SDA) methods by training on a subset of the source data that is similar to the target data. Specifically, we propose ORIENT, a subset selection framework that uses submodular mutual information (SMI) functions to select a source-data subset similar to the target data for faster training. Additionally, we show how existing robust subset selection strategies, such as GLISTER, GRADMATCH, and CRAIG, when used with a held-out query set, fit within the proposed framework, and we establish the connections with them. Finally, we empirically demonstrate that SDA approaches like d-SNE, CCSA, and standard cross-entropy training, when employed together with ORIENT, achieve (a) faster training and (b) better performance on the target data.
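
To make the selection step concrete, here is a minimal sketch of greedy maximization of one facility-location-style SMI instantiation, I(A; Q) = sum over queries q of max over a in A of s(a, q); the cosine-similarity choice and the function names (cosine_similarity_matrix, greedy_smi_select) are illustrative assumptions, not the paper's implementation:

    import numpy as np

    def cosine_similarity_matrix(X, Q):
        # Rows of X: source features; rows of Q: target (query) features.
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        Qn = Q / np.linalg.norm(Q, axis=1, keepdims=True)
        return Xn @ Qn.T  # shape: (n_source, n_target)

    def greedy_smi_select(X, Q, budget):
        """Greedily pick source points that 'cover' the target queries.

        The objective sum_q max_{a in A} s(a, q) is monotone submodular
        in A, so greedy selection enjoys the usual (1 - 1/e) guarantee.
        """
        S = cosine_similarity_matrix(X, Q)   # s(a, q)
        cover = np.zeros(S.shape[1])         # current max_{a in A} s(a, q)
        selected, remaining = [], set(range(S.shape[0]))
        for _ in range(budget):
            # Marginal gain of each candidate: how much it lifts coverage.
            gains = {i: (np.maximum(cover, S[i]) - cover).sum()
                     for i in remaining}
            best = max(gains, key=gains.get)
            selected.append(best)
            remaining.remove(best)
            cover = np.maximum(cover, S[best])
        return selected

In ORIENT-style training, a selection of this kind would be re-run every L epochs (cf. the notation table above) on the current feature-extractor outputs, with training then restricted to the selected source subset.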




Thank you for pointing out the need for more emphasis on our setting, such as the consideration of only

Neural Information Processing Systems

We thank the reviewers for their valuable feedback. We will emphasize this in the abstract and more clearly throughout the paper. Shanmugam et al. (2015) prove minimax bounds, both of which involve the size of the largest maximal clique. R2 points out that the paper might appear dense and inaccessible to non-expert audiences. In fact, any apparent simplicity in the proofs is partially the result of careful definitions.



Active Structure Learning of Causal DAGs via Directed Clique Trees

Neural Information Processing Systems

A growing body of work has begun to study intervention design for efficient structure learning of causal directed acyclic graphs (DAGs). A typical setting assumes causal sufficiency, i.e., a system with no latent confounders, selection bias, or feedback, in which the essential graph of the observational equivalence class (EC) is given as input and interventions are assumed to be noiseless. Most existing works focus on worst-case or average-case lower bounds for the number of interventions required to orient a DAG. These worst-case lower bounds only establish that the largest clique in the essential graph could make it difficult to learn the true DAG. In this work, we develop a universal lower bound for single-node interventions which establishes that the largest clique is always a fundamental impediment to structure learning.
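
Since both the worst-case and the universal bounds above are stated in terms of the largest clique of the essential graph, here is a minimal sketch of computing that quantity; the use of networkx and the helper name largest_clique_sizes are illustrative assumptions:

    import networkx as nx

    def largest_clique_sizes(essential_graph: nx.Graph):
        """Clique number of each connected chain component of the
        (undirected skeleton of the) essential graph."""
        sizes = []
        for comp in nx.connected_components(essential_graph):
            sub = essential_graph.subgraph(comp)
            # Bron-Kerbosch clique enumeration; fine for moderate sizes.
            sizes.append(max(len(c) for c in nx.find_cliques(sub)))
        return sizes

    # Example: a complete graph on 4 nodes has clique number 4, the
    # quantity the lower bounds identify as the impediment to orientation.
    print(largest_clique_sizes(nx.complete_graph(4)))  # [4]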