

Supplementary Material for CrossGNN: Confronting Noisy Multivariate Time Series Via Cross Interaction Refinement

Neural Information Processing Systems

We conduct extensive experiments on 8 real-world datasets following [4]. CrossGNN adopts a correlation mechanism to capture cross-time dependency for forecasting, and the channel dimension is set to 16 based on efficiency considerations. In the figures below, the first row reports performance for a prediction horizon of 96, while the second row reports performance for a prediction horizon of 336.

Figure 3: The MSE (left Y-axis) and MAE (right Y-axis) results of CrossGNN with different numbers of scales (X-axis) on ETTh2, ETTm2, Traffic, and Weather.
Figure 4: The MSE (left Y-axis) and MAE (right Y-axis) results of CrossGNN with different K (X-axis) on ETTh2, ETTm2, Traffic, and Weather.
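As a rough illustration of how a correlation-based cross-time graph can be built for a multivariate window, here is a minimal sketch. The function name `cross_time_topk_graph`, the Pearson-style similarity between time steps, and the top-K sparsification are assumptions for illustration, not CrossGNN's actual construction; the window length 96 and channel dimension 16 echo the settings mentioned above.

```python
# Minimal sketch (not the paper's code): build a cross-time adjacency
# from correlations between time steps, keeping only the K most
# correlated neighbors per step. All names and values are illustrative.
import numpy as np

def cross_time_topk_graph(x: np.ndarray, K: int = 10) -> np.ndarray:
    """x: (T, C) multivariate window; returns a (T, T) sparse adjacency."""
    T, _ = x.shape
    xc = x - x.mean(axis=0, keepdims=True)           # center each channel
    norms = np.linalg.norm(xc, axis=1) + 1e-8
    corr = (xc @ xc.T) / np.outer(norms, norms)      # time-step similarity
    adj = np.zeros((T, T))
    for t in range(T):
        nbrs = np.argsort(-corr[t])[:K]              # K most correlated steps
        adj[t, nbrs] = corr[t, nbrs]
    return adj

window = np.random.randn(96, 16)                     # horizon 96, channel dim 16
A = cross_time_topk_graph(window, K=10)
```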



NeuralSteiner: Learning Steiner Tree for Overflow-avoiding Global Routing in Chip Design

Neural Information Processing Systems

Global routing plays a critical role in modern chip design. The routing paths generated by global routers often form a rectilinear Steiner tree (RST). Recent advances from the machine learning community have shown the power of learning-based route generation; however, the routing paths yielded by existing approaches often suffer from considerable overflow, greatly hindering their application in practice. We propose NeuralSteiner, an accurate approach to overflow-avoiding global routing in chip design. The key idea of NeuralSteiner is to learn Steiner trees: we first predict the locations of highly likely Steiner points with a neural network that considers full-net spatial and overflow information, then select appropriate points by running a graph-based post-processing algorithm, and finally connect these points with the input pins to yield overflow-avoiding RSTs. NeuralSteiner offers two advantages over previous learning-based models. First, through its learning scheme, NeuralSteiner ensures the connectivity of generated routes while significantly reducing congestion. Second, NeuralSteiner can effectively scale to large nets and transfer to unseen chip designs without any modification or fine-tuning. Extensive experiments on public large-scale benchmarks reveal that, compared with state-of-the-art deep generative methods, NeuralSteiner achieves up to a 99.8% reduction in overflow while speeding up generation and keeping the wirelength loss within only 1.8%.
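To make the three-stage pipeline concrete, here is a minimal sketch under simplifying assumptions: `route_net`, the thresholding step, and the L1-metric minimum spanning tree over pins and selected points are illustrative stand-ins for the paper's neural Steiner-point predictor, graph-based post-processing, and RST construction.

```python
# Sketch of the predict -> select -> connect pipeline described above,
# not the authors' implementation. The heatmap stands in for the output
# of a CNN fed full-net spatial and overflow features; an L1-metric MST
# over pins plus selected points approximates an RST.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def route_net(pin_xy: np.ndarray, heatmap: np.ndarray, thresh: float = 0.5):
    # 1) Steiner-point prediction: keep grid cells scored above `thresh`.
    cand = np.argwhere(heatmap > thresh)             # (M, 2) candidate points
    # 2) Point selection (placeholder for the graph-based post-processing).
    pts = np.vstack([pin_xy, cand]) if len(cand) else pin_xy
    # 3) Connect pins and points with rectilinear (Manhattan) edges.
    dist = cdist(pts, pts, metric="cityblock")
    mst = minimum_spanning_tree(dist).toarray()
    edges = np.argwhere(mst > 0)                     # tree edges over grid points
    return pts, edges

pins = np.array([[2, 3], [10, 4], [7, 12]])
hm = np.random.rand(16, 16)                          # stand-in predicted heatmap
points, tree_edges = route_net(pins, hm, thresh=0.98)
```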




AverNet: All-in-one Video Restoration for Time-varying Unknown Degradations

Neural Information Processing Systems

Traditional video restoration approaches were designed to recover clean videos from a specific type of degradation, making them ineffective in handling multiple unknown types of degradation. To address this issue, several studies have been conducted and have shown promising results. However, these studies overlook the fact that degradations in video usually change over time, dubbed time-varying unknown degradations (TUD). To tackle this less-touched challenge, we propose an innovative method, termed the All-in-one VidEo Restoration Network (AverNet), which comprises two core modules, i.e., a Prompt-Guided Alignment (PGA) module and a Prompt-Conditioned Enhancement (PCE) module. Specifically, PGA addresses the pixel shifts caused by time-varying degradations by learning and utilizing prompts to align video frames at the pixel level. To handle multiple unknown degradations, PCE recasts the task into a conditional restoration problem by implicitly establishing a conditional map between degradations and ground truths. Thanks to the collaboration between the PGA and PCE modules, AverNet empirically demonstrates its effectiveness in recovering videos from TUD. Extensive experiments are carried out on two synthesized datasets featuring seven types of degradations with random corruption levels.
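The following schematic forward pass illustrates how prompt-guided alignment and prompt-conditioned enhancement could fit together. It is a sketch under assumptions, not the released AverNet architecture: the prompt pool, channel sizes, flow head, and warping operator are all illustrative placeholders.

```python
# Schematic two-module forward pass (assumed, for illustration only):
# a prompt selected from degradation cues conditions both the alignment
# flow (PGA-like) and the restoration step (PCE-like).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AverNetSketch(nn.Module):
    def __init__(self, c=64, n_prompts=8):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(n_prompts, c))  # learned prompts
        self.flow_head = nn.Conv2d(3 * c, 2, 3, padding=1)      # PGA: offsets
        self.enhance = nn.Conv2d(2 * c, c, 3, padding=1)        # PCE: conditional

    def forward(self, feat_t, feat_ref):
        # Soft-select a prompt from the reference features (degradation cue).
        w = torch.softmax(feat_ref.mean(dim=(2, 3)) @ self.prompts.T, dim=-1)
        p = (w @ self.prompts)[:, :, None, None].expand(-1, -1, *feat_t.shape[2:])
        # PGA-like step: predict a flow field conditioned on the prompt and
        # warp the neighboring frame's features to the current frame.
        flow = self.flow_head(torch.cat([feat_t, feat_ref, p], dim=1))
        aligned = F.grid_sample(feat_ref, self._grid(flow), align_corners=True)
        # PCE-like step: restore conditioned on the same degradation prompt.
        return self.enhance(torch.cat([aligned, p], dim=1))

    @staticmethod
    def _grid(flow):
        b, _, h, w = flow.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                                torch.linspace(-1, 1, w), indexing="ij")
        base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        return base + flow.permute(0, 2, 3, 1)
```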


Disentangling Causal Effects from Sets of Interventions in the Presence of Unobserved Confounders

Neural Information Processing Systems

The ability to answer causal questions is crucial in many domains, as causal inference allows one to understand the impact of interventions. In many applications, only a single intervention is possible at a given time. However, in some important areas, multiple interventions are applied concurrently. Disentangling the effects of single interventions from jointly applied interventions is a challenging task, especially as simultaneously applied interventions can interact. This problem is made harder still by unobserved confounders, which influence both the treatments and the outcome.
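A toy simulation makes the difficulty concrete. In the sketch below (an assumed example, not taken from the paper), two binary treatments with an interaction term share an unobserved confounder, and a naive regression on the jointly intervened observational data recovers biased effect estimates.

```python
# Toy illustration: treatments T1, T2 interact, and an unobserved
# confounder U drives both treatment assignment and the outcome.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
U = rng.normal(size=n)                        # unobserved confounder
T1 = (U + rng.normal(size=n) > 0).astype(float)
T2 = (U + rng.normal(size=n) > 0).astype(float)
# True structural equation: single effects 1.0 and 2.0, interaction 1.5.
Y = 1.0 * T1 + 2.0 * T2 + 1.5 * T1 * T2 + 2.0 * U + rng.normal(size=n)

# A naive regression omitting U yields biased coefficients; disentangling
# single-intervention effects requires handling confounding + interaction.
X = np.column_stack([np.ones(n), T1, T2, T1 * T2])
beta = np.linalg.lstsq(X, Y, rcond=None)[0]
print(beta)   # noticeably off from [0, 1.0, 2.0, 1.5]
```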


Learning Group Actions on Latent Representations

Neural Information Processing Systems

In this work, we introduce a new approach to modeling group actions in autoencoders. Diverging from prior research in this domain, we propose to learn the group actions on the latent space rather than strictly on the data space. This adaptation enhances the versatility of our model, enabling it to learn a broader range of scenarios prevalent in the real world, where groups can act on latent factors. Our method allows wide flexibility in the encoder and decoder architectures and does not require group-specific layers. In addition, we show that our model theoretically serves as a superset of methods that learn group actions on the data space. We test our approach on five image datasets with diverse groups acting on them and demonstrate superior performance to recently proposed methods for modeling group actions.
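One simple way to realize a learned action on the latent space is a one-parameter group acting by a learned matrix exponential, sketched below; the generator parameterization and layer sizes are assumptions for illustration, not the paper's exact formulation. Note that the encoder and decoder are plain MLPs with no group-specific layers, matching the flexibility claimed above.

```python
# Sketch (assumed formulation): a group element g acts on the latent
# code as z -> exp(g * A) z, where the generator A is learned jointly
# with an otherwise unconstrained autoencoder.
import torch
import torch.nn as nn

class LatentGroupAction(nn.Module):
    def __init__(self, d_in=784, d_z=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                 nn.Linear(256, d_z))
        self.dec = nn.Sequential(nn.Linear(d_z, 256), nn.ReLU(),
                                 nn.Linear(256, d_in))
        self.A = nn.Parameter(torch.zeros(d_z, d_z))   # learned Lie generator

    def act(self, z, g):
        # The matrix exponential gives an invertible action for any g.
        rho = torch.matrix_exp(g * self.A)
        return z @ rho.T

    def forward(self, x, g):
        # Encode, act on the latent, decode; training would match the
        # output to the group-transformed input g . x.
        return self.dec(self.act(self.enc(x), g))
```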


A Transfer and finetuning details

Neural Information Processing Systems

Few-shot evaluation. We use the linear adaptation protocol and evaluation sets from [68, 70], reporting 10-shot classification accuracy. For every combination of dataset and model, we run the 10-shot adaptation three times and report the mean (and the standard deviation for key results).

LiT decoder and T5 decoder. To train a multi-task decoder from scratch on top of the frozen representation for classification, captioning, and VQA, we precisely follow the setup and hyperparameters from [2], except for the data mixing strategy, which we set to "concat image-question pairs" ([2, Sec. ]). For all encoders, we use the full feature sequence before pooling (including the class token for the evaluation of CLIP). Throughout, we rely on a B-sized transformer decoder [60] with 12 layers.
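A minimal sketch of the 10-shot linear-probe protocol described above: sample 10 examples per class from frozen features, fit a linear classifier, and average accuracy over three runs. The choice of scikit-learn logistic regression as the linear adapter and the random stand-in features are assumptions; [68, 70] may use a different solver or regularization.

```python
# Sketch of 10-shot linear adaptation over frozen features (assumed
# details: logistic regression as the linear head, random stand-in data).
import numpy as np
from sklearn.linear_model import LogisticRegression

def ten_shot_accuracy(feats, labels, test_feats, test_labels, seed):
    rng = np.random.default_rng(seed)
    idx = []
    for c in np.unique(labels):                    # sample 10 shots per class
        cls = np.flatnonzero(labels == c)
        idx.extend(rng.choice(cls, size=10, replace=False))
    clf = LogisticRegression(max_iter=1000).fit(feats[idx], labels[idx])
    return clf.score(test_feats, test_labels)

# Stand-ins for frozen encoder features and labels (illustrative only).
F_tr = np.random.randn(1000, 512); y_tr = np.random.randint(0, 10, 1000)
F_te = np.random.randn(500, 512);  y_te = np.random.randint(0, 10, 500)

# Run the adaptation three times; report mean (and standard deviation).
accs = [ten_shot_accuracy(F_tr, y_tr, F_te, y_te, s) for s in range(3)]
print(np.mean(accs), np.std(accs))
```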