AITopics | Genre

Plotting

Genre

Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages

Neural Information Processing SystemsMay-28-2025, 12:46:56 GMT

The expressive power of transformers over inputs of unbounded size can be studied through their ability to recognize classes of formal languages. In this paper, we establish exact characterizations of transformers with hard attention (in which all attention is focused on exactly one position) and attention masking (in which each position only attends to positions on one side). With strict masking (each position cannot attend to itself) and without position embeddings, these transformers are expressively equivalent to linear temporal logic (LTL), which defines exactly the star-free languages. A key technique is the use of Boolean RASP as a convenient intermediate language between transformers and LTL. We then take numerous results known for LTL and apply them to transformers, showing how position embeddings, strict masking, and depth all increase expressive power.

logic & formal reasoning, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.66)

Add feedback

Fully Unconstrained Online Learning

Neural Information Processing SystemsMay-28-2025, 12:46:38 GMT

T is roughly linear in T. Thus, at a high level it matches the optimal bound in all cases in which one can achieve sublinear regret.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > Experimental Study (0.92)

Industry: Education > Educational Setting > Online (0.65)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

Neural Information Processing SystemsMay-28-2025, 12:43:04 GMT

Automated scientific discovery promises to accelerate progress across scientific domains. However, developing and evaluating an AI agent's capacity for endto-end scientific reasoning is challenging as running real-world experiments is often prohibitively expensive or infeasible.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre:

Research Report (0.93)
Workflow (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.70)
Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Semantic Routing via Autoregressive Modeling

Neural Information Processing SystemsMay-28-2025, 12:41:53 GMT

We study learning-based approaches to semantic route planning, which concerns producing routes in response to rich queries that specify various criteria and preferences. Semantic routing is already widely found in industry applications, especially navigational services like Google Maps; however, existing implemenations only support limited route criteria and narrow query sets as they rely on repurposing classical route optimization algorithms. We argue for a learning-based approach to semantic routing as a more scalable and general alternative. To foster interest in this important application of graph learning, we are releasing a large-scale publicly-licensed benchmark for semantic routing consisting of real-world multiobjective navigation problems--expressed via natural language queries--on the richly annotated road networks of US cities. In addition to being intractable with existing approaches to semantic routing, our benchmark poses a significant scaling challenge for graph learning methods. As a proof-of-concept, we show that--at scale--even a standard transformer network is a powerful semantic routing system and achieves non-trivial performance on our benchmark. In the process, we demonstrate a simple solution to the challenge of scaling up graph learning: an autoregressive approach that decomposes semantic routing into smaller "next-edge" prediction problems.

benchmark, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Spiking Transformer with Experts Mixture Zhaokun Zhou

Neural Information Processing SystemsMay-28-2025, 12:41:32 GMT

Spiking Neural Networks (SNNs) provide a sparse spike-driven mechanism which is believed to be critical for energy-efficient deep learning. Mixture-of-Experts (MoE), on the other side, aligns with the brain mechanism of distributed and sparse processing, resulting in an efficient way of enhancing model capacity and conditional computation. In this work, we consider how to incorporate SNNs' spike-driven and MoE's conditional computation into a unified framework. However, MoE uses softmax to get the dense conditional weights for each expert and TopK to hard-sparsify the network, which does not fit the properties of SNNs. To address this issue, we reformulate MoE in SNNs and introduce the Spiking Experts Mixture Mechanism (SEMM) from the perspective of sparse spiking activation. Both the experts and the router output spiking sequences, and their element-wise operation makes SEMM computation spike-driven and dynamic sparse-conditional. By developing SEMM into Spiking Transformer, the Experts Mixture Spiking Attention (EMSA) and the Experts Mixture Spiking Perceptron (EMSP) are proposed, which performs routing allocation for head-wise and channel-wise spiking experts, respectively. Experiments show that SEMM realizes sparse conditional computation and obtains a stable improvement on neuromorphic and static datasets with approximate computational overhead based on the Spiking Transformer baselines.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Resource-Aware Federated Self-Supervised Learning with Global Class Representations

Neural Information Processing SystemsMay-28-2025, 12:38:40 GMT

Due to the heterogeneous architectures and class skew, the global representation models training in resource-adaptive federated self-supervised learning face with tricky challenges: deviated representation abilities and inconsistent representation spaces. In this work, we are the first to propose a multi-teacher knowledge distillation framework, namely FedMKD, to learn global representations with whole class knowledge from heterogeneous clients even under extreme class skew. Firstly, the adaptive knowledge integration mechanism is designed to learn better representations from all heterogeneous models with deviated representation abilities. Then the weighted combination of the self-supervised loss and the distillation loss can support the global model to encode all classes from clients into a unified space. Besides, the global knowledge anchored alignment module can make the local representation spaces close to the global spaces, which further improves the representation abilities of local ones. Finally, extensive experiments conducted on two datasets demonstrate the effectiveness of FedMKD which outperforms state-of-the-art baselines 4.78% under linear evaluation on average.

artificial intelligence, machine learning, representation, (19 more...)

Neural Information Processing Systems

Country:

Asia > China (0.46)
North America > United States (0.46)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.82)
Information Technology (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

Add feedback

The Road Less Scheduled

Neural Information Processing SystemsMay-28-2025, 12:38:21 GMT

Existing learning rate schedules that do not require specification of the optimization stopping step T are greatly out-performed by learning rate schedules that depend on T. We propose an approach that avoids the need for this stopping time by eschewing the use of schedules entirely, while exhibiting state-of-the-art performance compared to schedules across a wide family of problems ranging from convex problems to large-scale deep learning problems. Our Schedule-Free approach introduces no additional hyper-parameters over standard optimizers with momentum. Our method is a direct consequence of a new theory we develop that unifies scheduling and iterate averaging.

artificial intelligence, averaging, machine learning, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.67)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

CRAYM: Neural Field Optimization via Camera RAY Matching 1

Neural Information Processing SystemsMay-28-2025, 12:37:08 GMT

We introduce camera ray matching (CRAYM) into the joint optimization of camera poses and neural fields from multi-view images. The optimized field, referred to as a feature volume, can be "probed" by the camera rays for novel view synthesis (NVS) and 3D geometry reconstruction. One key reason for matching camera rays, instead of pixels as in prior works, is that the camera rays can be parameterized by the feature volume to carry both geometric and photometric information. Multi-view consistencies involving the camera rays and scene rendering can be naturally integrated into the joint optimization and network training, to impose physically meaningful constraints to improve the final quality of both the geometric reconstruction and photorealistic rendering. We formulate our per-ray optimization and matched ray coherence by focusing on camera rays passing through keypoints in the input images to elevate both the efficiency and accuracy of scene correspondences. Accumulated ray features along the feature volume provide a means to discount the coherence constraint amid erroneous ray matching. We demonstrate the effectiveness of CRAYM for both NVS and geometry reconstruction, over dense-or sparse-view settings, with qualitative and quantitative comparisons to state-of-the-art alternatives.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country:

Asia (0.67)
North America > United States > Texas (0.14)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Marginal Causal Flows for Validation and Inference

Neural Information Processing SystemsMay-28-2025, 12:36:47 GMT

Investigating the marginal causal effect of an intervention on an outcome from complex data remains challenging due to the inflexibility of employed models and the lack of complexity in causal benchmark datasets, which often fail to reproduce intricate real-world data patterns. In this paper we introduce Frugal Flows, a novel likelihood-based machine learning model that uses normalising flows to flexibly learn the data-generating process, while also directly inferring the marginal causal quantities from observational data. We propose that these models are exceptionally well suited for generating synthetic data to validate causal methods. They can create synthetic datasets that closely resemble the empirical dataset, while automatically and exactly satisfying a user-defined average treatment effect. To our knowledge, Frugal Flows are the first generative model to both learn flexible data representations and also exactly parameterise quantities such as the average treatment effect and the degree of unobserved confounding. We demonstrate the above with experiments on both simulated and real-world datasets.

artificial intelligence, dataset, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback

Online Budgeted Matching with General Bids Pengfei Li University of Houston University of California, Riverside Houston, TX, USA

Neural Information Processing SystemsMay-28-2025, 12:33:30 GMT

Online Budgeted Matching (OBM) is a classic problem with important applications in online advertising, online service matching, revenue management, and beyond. Traditional online algorithms typically assume a small bid setting, where the maximum bid-to-budget ratio (κ) is infinitesimally small. While recent algorithms have tried to address scenarios with non-small or general bids, they often rely on the Fractional Last Matching (FLM) assumption, which allows for accepting partial bids when the remaining budget is insufficient. This assumption, however, does not hold for many applications with indivisible bids. In this paper, we remove the FLM assumption and tackle the open problem of OBM with general bids. We first establish an upper bound of 1 κ on the competitive ratio for any deterministic online algorithm. We then propose a novel meta algorithm, called MetaAd, which reduces to different algorithms with first known provable competitive ratios parameterized by the maximum bid-to-budget ratio κ [0, 1]. As a by-product, we extend MetaAd to the FLM setting and get provable competitive algorithms. Finally, we apply our competitive analysis to the design learningaugmented algorithms.

artificial intelligence, cloud computing, machine learning, (16 more...)

Neural Information Processing Systems

Country: