Linear Causal Bandits: Unknown Graph and Soft Interventions
Designing causal bandit algorithms depends on two central categories of assumptions: (i) the extent of information about the underlying causal graph and (ii) the extent of information about the interventional statistical models. There have been extensive recent advances in dispensing with assumptions in either category. These include assuming a known graph but unknown interventional distributions, and the converse setting of assuming an unknown graph but access to restrictive hard/do interventions, which remove the stochasticity and ancestral dependencies. Nevertheless, the problem in its general form, i.e., an unknown graph and unknown stochastic intervention models, remains open.
Efficient and Modular Implicit Differentiation, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert
Automatic differentiation (autodiff) has revolutionized machine learning. It allows complex computations to be expressed by composing elementary ones in creative ways, and removes the burden of computing their derivatives by hand. More recently, differentiation of optimization problem solutions has attracted widespread attention, with applications such as optimization layers, and in bi-level problems such as hyper-parameter optimization and meta-learning. However, so far, implicit differentiation has remained difficult for practitioners to use, as it often required tedious case-by-case mathematical derivations and implementations. In this paper, we propose automatic implicit differentiation, an efficient and modular approach for implicit differentiation of optimization problems.
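To make the idea concrete, the following is a minimal sketch of implicit differentiation for a ridge-regression solution, written directly from the implicit function theorem rather than from the paper's library or API; the example problem and the names solve_ridge, optimality_residual, and implicit_grad_wrt_lam are assumptions made here for illustration.

```python
# Minimal sketch (assumed example, not the paper's implementation):
# differentiate the ridge solution w*(lam) through its optimality condition
# F(w, lam) = X^T (X w - y) + lam * w = 0 via the implicit function theorem.
import jax
import jax.numpy as jnp

def solve_ridge(X, y, lam):
    # Closed-form ridge solver; stands in for any black-box optimizer.
    n_features = X.shape[1]
    return jnp.linalg.solve(X.T @ X + lam * jnp.eye(n_features), X.T @ y)

def optimality_residual(w, X, y, lam):
    # Stationarity condition of the ridge objective: zero at the solution.
    return X.T @ (X @ w - y) + lam * w

def implicit_grad_wrt_lam(X, y, lam):
    # Implicit function theorem: dw*/dlam = -(dF/dw)^{-1} dF/dlam at w*,
    # obtained without unrolling or re-deriving the solver.
    w_star = solve_ridge(X, y, lam)
    dF_dw = jax.jacobian(optimality_residual, argnums=0)(w_star, X, y, lam)
    dF_dlam = jax.jacobian(optimality_residual, argnums=3)(w_star, X, y, lam)
    return -jnp.linalg.solve(dF_dw, dF_dlam)

X = jax.random.normal(jax.random.PRNGKey(0), (20, 3))
y = jax.random.normal(jax.random.PRNGKey(1), (20,))

# The implicit gradient matches differentiating through the solver directly.
print(implicit_grad_wrt_lam(X, y, 0.1))
print(jax.jacobian(solve_ridge, argnums=2)(X, y, 0.1))
```

The two printed Jacobians should agree up to numerical error; the appeal of automating this pattern is that the user only declares the optimality condition, while the differentiation and linear solve are handled generically.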
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
Recently, vision model pre-training has evolved from relying on manually annotated datasets to leveraging large-scale, web-crawled image-text data. Despite these advances, no existing pre-training method effectively exploits interleaved image-text data, which is prevalent on the Internet. Inspired by the recent success of compression learning in natural language processing, we propose a novel vision model pre-training method called Latent Compression Learning (LCL) for interleaved image-text data. The method performs latent compression learning by maximizing the mutual information between the inputs and outputs of a causal attention model. The training objective can be decomposed into two basic tasks: 1) contrastive learning between a visual representation and its preceding context, and 2) generating the subsequent text based on the visual representation. Our experiments demonstrate that our method not only matches the performance of CLIP on paired pre-training datasets (e.g., LAION), but can also leverage interleaved pre-training data (e.g., MMC4) to learn robust visual representations from scratch, showcasing the potential of vision model pre-training with interleaved image-text data. Code is released at https://github.com/OpenGVLab/LCL.
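As a rough illustration of that two-part decomposition, the sketch below combines an InfoNCE-style contrastive term between visual embeddings and their preceding-context embeddings with a next-token cross-entropy term on the subsequent text. It is a minimal sketch, not the released LCL code; the function names, shapes, temperature, and weighting are assumptions made here.

```python
# Hypothetical sketch of the two-part objective (not the released LCL code).
import jax
import jax.numpy as jnp

def info_nce(visual, context, temperature=0.07):
    # Contrastive loss between L2-normalized visual and preceding-context
    # embeddings; matching pairs sit on the diagonal of the logit matrix.
    v = visual / jnp.linalg.norm(visual, axis=-1, keepdims=True)
    c = context / jnp.linalg.norm(context, axis=-1, keepdims=True)
    logits = v @ c.T / temperature                  # (batch, batch)
    log_probs = jax.nn.log_softmax(logits, axis=-1)
    diag = jnp.arange(logits.shape[0])
    return -jnp.mean(log_probs[diag, diag])

def next_token_loss(text_logits, text_targets):
    # Autoregressive cross-entropy for generating the subsequent text.
    log_probs = jax.nn.log_softmax(text_logits, axis=-1)
    picked = jnp.take_along_axis(log_probs, text_targets[..., None], axis=-1)
    return -jnp.mean(picked)

def latent_compression_loss(visual, context, text_logits, text_targets, alpha=1.0):
    # Combined objective: align visual latents with the preceding context
    # and predict the text that follows them.
    return info_nce(visual, context) + alpha * next_token_loss(text_logits, text_targets)
```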
A Supplementary Material
The output of each model (baselines or VADeR) is followed by a 1×1 convolution. We train on the train_aug set of VOC and the train set of Cityscapes, and evaluate on the val set of each dataset. We use a crop size of 512 on PASCAL VOC and 768 on Cityscapes, and evaluation is done at the original image size. Each training sample is generated with random cropping, scaling (by a ratio in [0.5, 2.0]), and normalization. We train with a batch size of 64 for 60 epochs on VOC, and with a batch size of 16 for 100 epochs on Cityscapes.
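For quick reference, the same setup can be summarized as a configuration dict; this is a hypothetical summary, and the key names and structure are illustrative assumptions rather than the authors' code.

```python
# Hypothetical summary of the fine-tuning setup described above.
SEGMENTATION_FINETUNE = {
    "pascal_voc": {
        "train_split": "train_aug",
        "eval_split": "val",
        "crop_size": 512,
        "scale_range": (0.5, 2.0),
        "batch_size": 64,
        "epochs": 60,
    },
    "cityscapes": {
        "train_split": "train",
        "eval_split": "val",
        "crop_size": 768,
        "scale_range": (0.5, 2.0),
        "batch_size": 16,
        "epochs": 100,
    },
}
```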
Response to comments from reviewers
We thank the reviewers for their detailed and helpful reviews. Current SSL methods (including MoCo, which we build upon) train only the bottom-up encoder without labels; this is fundamentally different from an 'intensive data augmentation', as suggested by R3. Regarding the comparison with MoCo trained for 50 extra epochs (R2, R4): everything else is different, e.g., the high-level goal, the dataset (ImageNet vs. fine-grained CUB), and the loss. Why not share parameters in f and g (L118)?
While studying semantics in the brain, neuroscientists use two approaches. One is to identify areas that are correlated with semantic processing load. Another is to find areas that are predicted by the semantic representation of the stimulus words. However, most studies of syntax have focused only on identifying areas correlated with syntactic processing load. One possible reason for this discrepancy is that representing syntactic structure in an embedding space such that it can be used to model brain activity is a non-trivial computational problem. Another possible reason is that it is unclear if the low signal-to-noise ratio of neuroimaging tools such as functional Magnetic Resonance Imaging (fMRI) can allow us to reveal the correlates of complex (and perhaps subtle) syntactic representations. In this study, we propose novel multi-dimensional features that encode information about the syntactic structure of sentences.