Plotting

LESS: Label-Efficient and Single-Stage Referring 3D Segmentation

Neural Information Processing Systems

Referring 3D Segmentation is a visual-language task that segments all points of the specified object from a 3D point cloud described by a sentence of query. Previous works perform a two-stage paradigm, first conducting language-agnostic instance segmentation then matching with given text query. However, the semantic concepts from text query and visual cues are separately interacted during the training, and both instance and semantic labels for each object are required, which is time consuming and human-labor intensive. To mitigate these issues, we propose a novel Referring 3D Segmentation pipeline, Label-Efficient and Single-Stage, dubbed LESS, which is only under the supervision of efficient binary mask.



A Simple and Optimal Approach for Universal Online Learning with Gradient Variations

Neural Information Processing Systems

We investigate the problem of universal online learning with gradient-variation regret. Universal online learning aims to achieve regret guarantees without the prior knowledge of the curvature of the online functions. Moreover, we study the problem-dependent gradient-variation regret as it plays a crucial role in bridging stochastic and adversarial optimization as well as game theory. In this work, we design a universal approach with the optimal gradient-variation regret simultaneously for strongly convex, exp-concave, and convex functions, thus addressing an open problem highlighted by Yan et al. [2023]. Our approach is simple since it is algorithmically efficient-to-implement with a two-layer online ensemble structure and only 1 gradient query per round, and theoretically easy-to-analyze with a novel and alternative analysis to the gradient-variation regret. Concretely, previous works on gradient variations require controlling the algorithmic stability, which is challenging and leads to sub-optimal regret and less efficient algorithm design. Our analysis overcomes this issue by using a Bregman divergence negative term from linearization and a useful smoothness property.


Opera teases Neon, its first agentic browser - join the waitlist to try it

ZDNet

Opera has announced that it will release one of the first agentic browsers. Called Neon, this browser will serve two purposes: use AI to automatically perform tasks for users, and serve as an AI playground for future development. Also: Opera's Android browser just got a major tab management upgrade The purpose of an agentic browser is to use AI agents to understand user intent, automate complex tasks, and act on behalf of the user. Agentic browsers move from being a passive tool to access information to an active digital assistant. This isn't the first time Opera has released a browser called Neon.



AVOIDDS: Aircraft Vision-based Intruder Detection Dataset and Simulator

Neural Information Processing Systems

Designing robust machine learning systems remains an open problem, and there is a need for benchmark problems that cover both environmental changes and evaluation on a downstream task. In this work, we introduce AVOIDDS, a realistic object detection benchmark for the vision-based aircraft detect-and-avoid problem. We provide a labeled dataset consisting of 72,000 photorealistic images of intruder aircraft with various lighting conditions, weather conditions, relative geometries, and geographic locations. We also provide an interface that evaluates trained models on slices of this dataset to identify changes in performance with respect to changing environmental conditions. Finally, we implement a fullyintegrated, closed-loop simulator of the vision-based detect-and-avoid problem to evaluate trained models with respect to the downstream collision avoidance task. This benchmark will enable further research in the design of robust machine learning systems for use in safety-critical applications.


Simplicity Bias in 1-Hidden Layer Neural Networks

Neural Information Processing Systems

Recent works (Shah et al., 2020; Chen et al., 2021) have demonstrated that neural networks exhibit extreme simplicity bias (SB). That is, they learn only the simplest features to solve a task at hand, even in the presence of other, more robust but more complex features. Due to the lack of a general and rigorous definition of features, these works showcase SB on semi-synthetic datasets such as Color-MNIST, MNIST-CIFAR where defining features is relatively easier. In this work, we rigorously define as well as thoroughly establish SB for one hidden layer neural networks. More concretely, (i) we define SB as the network essentially being a function of a low dimensional projection of the inputs (ii) theoretically, in the infinite width regime, we show that when the data is linearly separable, the network primarily depends on only the linearly separable (1-dimensional) subspace even in the presence of an arbitrarily large number of other, more complex features which could have led to a significantly more robust classifier, (iii) empirically, we show that models trained on real datasets such as Imagenet and Waterbirds-Landbirds indeed depend on a low dimensional projection of the inputs, thereby demonstrating SB on these datasets, iv) finally, we present a natural ensemble approach that encourages diversity in models by training successive models on features not used by earlier models, and demonstrate that it yields models that are significantly more robust to Gaussian noise.



Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective

Neural Information Processing Systems

In this paper, we explores the expressivity and trainability of the Fourier Neural Operator (FNO). We establish a mean-field theory for the FNO, analyzing the behavior of the random FNO from an edge of chaos perspective. Our investigation into the expressivity of a random FNO involves examining the ordered-chaos phase transition of the network based on the weight distribution. This phase transition demonstrates characteristics unique to the FNO, induced by mode truncation, while also showcasing similarities to those of densely connected networks. Furthermore, we identify a connection between expressivity and trainability: the ordered and chaotic phases correspond to regions of vanishing and exploding gradients, respectively. This finding provides a practical prerequisite for the stable training of the FNO. Our experimental results corroborate our theoretical findings.


Adversarial Resilience in Sequential Prediction via Abstention

Neural Information Processing Systems

We study the problem of sequential prediction in the stochastic setting with an adversary that is allowed to inject clean-label adversarial (or out-of-distribution) examples. Algorithms designed to handle purely stochastic data tend to fail in the presence of such adversarial examples, often leading to erroneous predictions. This is undesirable in many high-stakes applications such as medical recommendations, where abstaining from predictions on adversarial examples is preferable to misclassification. On the other hand, assuming fully adversarial data leads to very pessimistic bounds that are often vacuous in practice. To move away from these pessimistic guarantees, we propose a new model of sequential prediction that sits between the purely stochastic and fully adversarial settings by allowing the learner to abstain from making a prediction at no cost on adversarial examples, thereby asking the learner to make predictions with certainty. Assuming access to the marginal distribution on the non-adversarial examples, we design a learner whose error scales with the VC dimension (mirroring the stochastic setting) of the hypothesis class, as opposed to the Littlestone dimension which characterizes the fully adversarial setting. Furthermore, we design learners for VC dimension 1 classes and the class of axis-aligned rectangles, which work even in the absence of access to the marginal distribution. Our key technical contribution is a novel measure for quantifying uncertainty for learning VC classes, which may be of independent interest.