cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

Neural Information Processing Systems

An emerging area of research in situated and multimodal interactive conversations (SIMMC) includes interactions in scientific papers. Since scientific papers are primarily composed of text, equations, figures, and tables, SIMMC methods must be developed specifically for each component to support the depth of inquiry and interactions required by research scientists. cPAPERS is a dataset of conversational question-answer pairs from reviews of academic papers, grounded in these paper components and their associated references from scientific documents available on arXiv.


Pure Transformers are Powerful Graph Learners

Neural Information Processing Systems

We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning, both in theory and practice. Given a graph, we simply treat all nodes and edges as independent tokens, augment them with token embeddings, and feed them to a Transformer. With an appropriate choice of token embeddings, we prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers, which is already more expressive than all message-passing Graph Neural Networks (GNNs). When trained on a large-scale graph dataset (PCQM4Mv2), our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive bias. Our implementation is available at https://github.com/jw9730/tokengt.
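
To make the tokenization concrete, here is a minimal sketch (not the authors' implementation; see the linked repository for that) of treating all nodes and edges as independent tokens for an off-the-shelf Transformer encoder. The Gaussian node identifiers stand in for the orthonormal random features or Laplacian eigenvectors used as token embeddings, and all dimensions are illustrative:

```python
import torch
import torch.nn as nn

class TokenGTSketch(nn.Module):
    """Minimal sketch of the idea: every node and every edge becomes one
    token of a vanilla Transformer, with no graph-specific layers."""

    def __init__(self, feat_dim, id_dim=64, d_model=128, nhead=8, num_layers=4):
        super().__init__()
        self.id_dim = id_dim
        # Assumes node and edge features share one dimension, for brevity.
        self.feat_proj = nn.Linear(feat_dim, d_model)
        self.id_proj = nn.Linear(2 * id_dim, d_model)  # concatenated identifiers
        self.type_emb = nn.Embedding(2, d_model)       # 0 = node token, 1 = edge token
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, node_feat, edge_feat, edge_index):
        # node_feat: (n, feat_dim); edge_feat: (m, feat_dim); edge_index: (2, m)
        n = node_feat.size(0)
        # Random node identifiers; the paper uses orthonormal random features
        # or Laplacian eigenvectors, for which plain Gaussians stand in here.
        node_id = torch.randn(n, self.id_dim, device=node_feat.device)
        src, dst = edge_index
        node_tok = (self.feat_proj(node_feat)
                    + self.id_proj(torch.cat([node_id, node_id], dim=-1))
                    + self.type_emb.weight[0])
        edge_tok = (self.feat_proj(edge_feat)
                    + self.id_proj(torch.cat([node_id[src], node_id[dst]], dim=-1))
                    + self.type_emb.weight[1])
        tokens = torch.cat([node_tok, edge_tok], dim=0).unsqueeze(0)  # (1, n+m, d)
        return self.encoder(tokens)
```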


A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation

Neural Information Processing Systems

Low-shot object counters estimate the number of objects in an image using few or no annotated exemplars. Objects are localized by matching them to prototypes, which are constructed by unsupervised, image-wide aggregation of object appearance. Because object appearances can be diverse, existing approaches often suffer from overgeneralization and false-positive detections. Furthermore, the best-performing methods train object localization with a surrogate loss that predicts a unit Gaussian at each object center. This loss is sensitive to annotation error and hyperparameters, and it does not directly optimize the detection task, leading to suboptimal counts.
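
For context, the Gaussian surrogate objective criticized above typically looks like the following sketch (a generic reconstruction, not any specific counter's code); note how the sigma hyperparameter and the exact annotated centers both shape the regression target:

```python
import torch

def gaussian_target_map(centers, h, w, sigma=2.0):
    """Render the surrogate target: a unit Gaussian at each object center."""
    ys = torch.arange(h, dtype=torch.float32).view(h, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, w)
    target = torch.zeros(h, w)
    for cx, cy in centers:  # centers given as (x, y) pixel coordinates
        g = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        target = torch.maximum(target, g)  # peak of 1.0 at each center
    return target

def surrogate_loss(pred, centers, sigma=2.0):
    # Per-pixel regression against the Gaussian map; the sigma value and
    # any annotation error in `centers` both shift the optimum, which is
    # exactly the sensitivity the abstract points out. pred: (h, w) heatmap.
    target = gaussian_target_map(centers, *pred.shape[-2:], sigma)
    return torch.nn.functional.mse_loss(pred, target)
```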


Just Add $100 More: Augmenting Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem

Neural Information Processing Systems

Typical LiDAR-based 3D object detection models are trained on real-world data collections, which are often imbalanced across classes. To address this, augmentation techniques are commonly used, such as copying ground-truth LiDAR points and pasting them into scenes.
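
The copy-and-paste augmentation mentioned above is commonly implemented along the lines of the following sketch (a generic reconstruction; `gt_bank`, the function name, and the collision handling are all illustrative, not this paper's code):

```python
import numpy as np

def gt_paste(scene_points, gt_bank, rare_classes, n_per_class=5, rng=None):
    """Paste points cropped from annotated boxes in other scenes into the
    current scene to re-balance rare classes. `gt_bank` maps a class name
    to a list of (points, box) pairs harvested offline."""
    rng = rng or np.random.default_rng()
    points, boxes = [scene_points], []
    for cls in rare_classes:
        samples = gt_bank.get(cls, [])
        if not samples:
            continue
        picks = rng.choice(len(samples),
                           size=min(n_per_class, len(samples)), replace=False)
        for idx in picks:
            obj_points, box = samples[idx]
            # A real implementation would also reject pastes that collide
            # with existing boxes or the ground plane; omitted for brevity.
            points.append(obj_points)
            boxes.append((cls, box))
    return np.concatenate(points, axis=0), boxes
```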


Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials

Neural Information Processing Systems

A recent goal in the theory of deep learning is to identify how neural networks can escape the "lazy training," or Neural Tangent Kernel (NTK), regime, in which the network is coupled with its first-order Taylor expansion at initialization. While the NTK is minimax optimal for learning dense polynomials [25], it cannot learn features and hence has poor sample complexity for learning many classes of functions, including sparse polynomials. Recent works have thus aimed to identify settings where gradient-based algorithms provably generalize better than the NTK. One such example is the "QuadNTK" approach of Bai and Lee [7], which analyzes the second-order term in the Taylor expansion. Bai and Lee [7] show that the second-order term can learn sparse polynomials efficiently; however, it sacrifices the ability to learn general dense polynomials.
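
For reference, the terms being discussed come from the standard Taylor expansion of the network output around its initialization (written here in generic notation, not the paper's):

```latex
f(x;\theta) = f(x;\theta_0)
  + \underbrace{\nabla_\theta f(x;\theta_0)^{\top}(\theta-\theta_0)}_{\text{first-order (NTK / lazy) term}}
  + \underbrace{\tfrac{1}{2}(\theta-\theta_0)^{\top}\,\nabla_\theta^2 f(x;\theta_0)\,(\theta-\theta_0)}_{\text{second-order (QuadNTK) term}}
  + O\!\left(\lVert\theta-\theta_0\rVert^3\right)
```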



Supplementary material: High-recall causal discovery for autocorrelated time series with latent confounders

Neural Information Processing Systems

The Fast Causal Inference (FCI) algorithm is an algorithm for constraint-based causal discovery in the presence of unobserved variables [Spirtes et al., 1995, Spirtes et al., 2000, Zhang, 2008]. It allows for both latent confounders and selection variables, although in this paper we assume the absence of selection variables. Under the assumptions of faithfulness [Spirtes et al., 2000], acyclicity, and the existence of an underlying SCM, the algorithm determines the maximally informative PAG from perfect statistical decisions of conditional independencies in the distribution P generated by the SCM. The algorithm is based on the following fact:

Proposition S1 (m-separation by subsets of D-Sep sets [Spirtes et al., 2000]). Let A and B be two nodes such that A ∉ adj(B, M) and B ∉ an(A, M); then they are m-separated by some subset of D-Sep(B, A, M).


High-recall causal discovery for autocorrelated time series with latent confounders

Neural Information Processing Systems

We present a new method for linear and nonlinear, lagged and contemporaneous constraint-based causal discovery from observational time series in the presence of latent confounders. We show that existing causal discovery methods such as FCI and variants suffer from low recall in the autocorrelated time series case and identify low effect size of conditional independence tests as the main reason. Information-theoretical arguments show that effect size can often be increased if causal parents are included in the conditioning sets. To identify parents early on, we suggest an iterative procedure that utilizes novel orientation rules to determine ancestral relationships already during the edge removal phase. We prove that the method is order-independent, and sound and complete in the oracle case. Extensive simulation studies for different numbers of variables, time lags, sample sizes, and further cases demonstrate that our method indeed achieves much higher recall than existing methods for the case of autocorrelated continuous variables while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation.
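
As an illustration of the effect-size argument (a toy simulation written for this summary, not code from the paper): in a strongly autocorrelated linear-Gaussian pair with a weak link from X at time t-1 to Y at time t, the partial correlation used as the CI test statistic is tiny once only X's own past is conditioned on, but grows by an order of magnitude when Y's causal parent (its own past value) is added to the conditioning set:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after regressing both on the columns of z."""
    z = np.column_stack([z, np.ones(len(x))])
    x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    return np.corrcoef(x, y)[0, 1]

rng = np.random.default_rng(0)
T, a, c = 10_000, 0.95, 0.2                 # strong autocorrelation a, weak link c
x, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.standard_normal()
    y[t] = a * y[t - 1] + c * x[t - 1] + rng.standard_normal()

X_prev, Y_now = x[1:-1], y[2:]              # the true link X_{t-1} -> Y_t
X_past, Y_parent = x[:-2], y[1:-1]          # X_{t-2}, and Y's parent Y_{t-1}
print(partial_corr(X_prev, Y_now, X_past))                               # small
print(partial_corr(X_prev, Y_now, np.column_stack([X_past, Y_parent])))  # larger
```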


Latent Functional Maps: a spectral framework for representation alignment

Neural Information Processing Systems

Neural models learn data representations that lie on low-dimensional manifolds, yet modeling the relation between these representational spaces is an ongoing challenge. By integrating spectral-geometry principles into neural modeling, we show that this problem can be better addressed in the functional domain, reducing complexity while improving interpretability and performance on downstream tasks. To this end, we introduce a multi-purpose framework to the representation learning community, which makes it possible to: (i) compare different spaces in an interpretable way and measure their intrinsic similarity; (ii) find correspondences between them, in both unsupervised and weakly supervised settings; and (iii) effectively transfer representations between distinct spaces. We validate our framework on various applications, ranging from stitching to retrieval tasks, and on multiple modalities, demonstrating that Latent Functional Maps can serve as a Swiss Army knife for representation alignment.
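
One plausible minimal instantiation of such a spectral framework (an illustrative sketch under assumptions, not the paper's implementation: the graph construction, basis size, and anchor-based fitting are all choices made here) treats Laplacian eigenvectors of a k-NN graph over each latent space as a function basis and estimates a linear map between the two bases from a few anchor correspondences:

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.neighbors import kneighbors_graph

def eigenbasis(latents, k=10, n_evecs=20):
    """Laplacian eigenbasis of a k-NN graph on a latent space: each
    eigenvector is a smooth 'function' defined over the samples."""
    W = kneighbors_graph(latents, k, mode="connectivity")
    W = 0.5 * (W + W.T)                        # symmetrize adjacency
    L = laplacian(W, normed=True)
    _, vecs = np.linalg.eigh(L.toarray())      # dense eigh: fine for a sketch
    return vecs[:, :n_evecs]

def latent_functional_map(phi_src, phi_tgt, anchors):
    """Least-squares map C with phi_src[anchors] @ C ~ phi_tgt[anchors];
    the anchors are the weak supervision (known corresponding samples)."""
    C, *_ = np.linalg.lstsq(phi_src[anchors], phi_tgt[anchors], rcond=None)
    return C  # (n_evecs, n_evecs): compare spaces via its structure,
              # or transfer a function f as (coefficients of f) @ C

# Usage: two latent spaces for the same 500 samples, 25 known anchors.
rng = np.random.default_rng(0)
Z1 = rng.standard_normal((500, 32))
Z2 = Z1 @ rng.standard_normal((32, 48)) * 0.1  # a second, related space
C = latent_functional_map(eigenbasis(Z1), eigenbasis(Z2), np.arange(25))
```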