Appendix 1 Goal generation for executor training

Neural Information Processing Systems

Pseudo goal generation is introduced to train the executor without the coordinator. The scripted policy is allowed to access the grounded state, e.g., the absolute positions. Note that it is not an optimal policy for the executor: it fails when two targets are far apart. The notations used here are defined as follows. The objective is to maximize the number of covered targets. After this formulation, the target coverage problem can be solved as an integer linear programming (ILP) problem with the CBC optimizer. The primitive actions for all the sensors can then be derived from the ILP solution, as shown in Tab. 1.
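As a sketch of the coverage objective, the tiny brute-force example below enumerates joint sensor actions and maximizes the number of covered targets; the paper instead formulates this as an ILP and solves it with the CBC optimizer. The toy coverage sets and names are illustrative, not from the paper.

```python
from itertools import product

# Toy instance: each sensor picks one orientation (its primitive action);
# cover[s][a] is the set of targets sensor s covers under action a.
cover = [
    {0: {0}, 1: {0, 1}},   # sensor 0
    {0: {1}, 1: {2}},      # sensor 1
]

def best_joint_action(cover):
    """Enumerate joint actions and maximize the number of covered targets
    (the ILP objective; the paper solves it with the CBC solver instead)."""
    best, best_covered = None, -1
    actions = [list(c) for c in cover]
    for joint in product(*actions):
        covered = set().union(*(cover[s][a] for s, a in enumerate(joint)))
        if len(covered) > best_covered:
            best, best_covered = joint, len(covered)
    return best, best_covered

joint, n = best_joint_action(cover)
print(joint, n)  # → (1, 1) 3
```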





Attractor-merging Crises and Intermittency in Reservoir Computing

Kabayama, Tempei, Komuro, Motomasa, Kuniyoshi, Yasuo, Aihara, Kazuyuki, Nakajima, Kohei

arXiv.org Artificial Intelligence

Reservoir computing can embed attractors into random neural networks (RNNs), generating a ``mirror'' of a target attractor because of the networks' inherent symmetry constraints. In these RNNs, we report that an attractor-merging crisis, accompanied by intermittency, emerges simply by adjusting a global parameter. We further reveal the underlying mechanism through a detailed analysis of the phase-space structure and demonstrate that this bifurcation scenario is intrinsic to a general class of RNNs, independent of the training data.
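The ``mirror'' attractor follows from a simple symmetry: a closed-loop reservoir with an odd activation (tanh) and no bias terms is equivariant under a global sign flip, so the negation of any trajectory is itself a valid trajectory. A minimal NumPy check, with random weights standing in for a trained reservoir:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))  # reservoir weights
W_out = rng.normal(size=(1, n))                      # (stand-in) readout
W_fb = rng.normal(size=(n, 1))                       # output feedback weights

def step(x):
    # Closed-loop update: output fed back into the reservoir, no bias terms.
    y = W_out @ x
    return np.tanh(W @ x + W_fb @ y)

x0 = rng.normal(size=(n, 1))
xp, xm = x0.copy(), -x0.copy()
for _ in range(100):
    xp, xm = step(xp), step(xm)

# Odd symmetry: the sign-flipped state follows the mirrored trajectory.
print(np.allclose(xm, -xp))  # → True
```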


What Fundamental Structure in Reward Functions Enables Efficient Sparse-Reward Learning?

Shihab, Ibne Farabi, Akter, Sanjeda, Sharma, Anuj

arXiv.org Artificial Intelligence

Sparse-reward reinforcement learning (RL) remains fundamentally hard: without structure, any agent needs Ω(|S||A|/p) samples to recover rewards. We introduce Policy-Aware Matrix Completion (PAMC) as a first concrete step toward a structural reward learning framework. Our key idea is to exploit approximate low-rank + sparse structure in the reward matrix, under policy-biased (MNAR) sampling. We prove recovery guarantees with inverse-propensity weighting, and establish a visitation-weighted error-to-regret bound linking completion error to control performance. Importantly, when assumptions weaken, PAMC degrades gracefully: confidence intervals widen and the algorithm abstains, ensuring safe fallback to exploration. Empirically, PAMC improves sample efficiency across Atari-26 (10M steps), DM Control, MetaWorld MT50, D4RL offline RL, and preference-based RL benchmarks, outperforming DrQ-v2, DreamerV3, Agent57, T-REX/D-REX, and PrefPPO under compute-normalized comparisons. Our results highlight PAMC as a practical and principled tool when structural rewards exist, and as a concrete first instantiation of a broader structural reward learning perspective. What fundamental properties of reward functions determine the sample complexity of reinforcement learning?
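A minimal sketch of the core idea: low-rank completion of a reward matrix under policy-biased (MNAR) observation, with each observed residual reweighted by its inverse propensity. The sizes, sampling rates, and plain gradient descent here are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
S, A, r = 30, 12, 2
U_true, V_true = rng.normal(size=(S, r)), rng.normal(size=(A, r))
M = U_true @ V_true.T                       # ground-truth reward matrix

# Policy-biased sampling: each (s, a) observed with known propensity p[s, a].
p = rng.uniform(0.1, 0.9, size=(S, A))
obs = rng.uniform(size=(S, A)) < p

# Factored completion with inverse-propensity-weighted squared loss.
U = rng.normal(scale=0.1, size=(S, r))
V = rng.normal(scale=0.1, size=(A, r))
lr = 0.01
for _ in range(4000):
    R = (U @ V.T - M) * obs / p             # IPW residual on observed entries
    U, V = U - lr * (R @ V), V - lr * (R.T @ U)

err = np.abs(U @ V.T - M).mean()            # error on ALL entries, seen or not
print(err)
```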


Temporal Sampling for Forgotten Reasoning in LLMs

Li, Yuetai, Xu, Zhangchen, Jiang, Fengqing, Ramasubramanian, Bhaskar, Niu, Luyao, Lin, Bill Yuchen, Yue, Xiang, Poovendran, Radha

arXiv.org Artificial Intelligence

Fine-tuning large language models (LLMs) is intended to improve their reasoning capabilities, yet we uncover a counterintuitive effect: models often forget how to solve problems they previously answered correctly during training. We term this phenomenon Temporal Forgetting and show that it is widespread across model sizes, fine-tuning methods (both Reinforcement Learning and Supervised Fine-Tuning), and multiple reasoning benchmarks. Our analysis reveals that 6.4% to 56.1% of final errors were once solved correctly at an earlier checkpoint. Inspired by this phenomenon, we propose Temporal Sampling, a simple decoding strategy that draws outputs from multiple checkpoints along the training trajectory. This approach recovers forgotten solutions without retraining or ensembling and leads to significant improvements in reasoning performance, with gains of 4 to 19 points in Pass@k and consistent gains for majority voting and Best-of-N across several benchmarks. To make Temporal Sampling deployment-friendly, we extend it to LoRA-adapted models. By leveraging the temporal diversity inherent in training, Temporal Sampling offers a practical, compute-efficient way to surface hidden reasoning ability and rethink how we evaluate LLMs. Figure 1: (a) We observed that during the RL training of the Deepseek-R1-1.5B model, 76.7% of AIME problems were solved correctly at some intermediate checkpoint, yet only 30% remained correct in the final model.
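A minimal sketch of the decoding strategy: allocate the k samples round-robin across several checkpoints rather than drawing all k from the final model, then aggregate by majority vote. The checkpoint functions below are mocked stand-ins for generation calls, not real LLM inference:

```python
import random
from collections import Counter

def temporal_sampling(checkpoints, prompt, k, rng):
    """Draw k samples round-robin from several checkpoints (e.g. the last
    few along the training trajectory) instead of k from the final model."""
    return [checkpoints[i % len(checkpoints)](prompt, rng) for i in range(k)]

def majority_vote(answers):
    return Counter(answers).most_common(1)[0][0]

# Mock checkpoints: the final model has "forgotten" an answer that a
# mid-training checkpoint still produces most of the time.
final = lambda prompt, rng: rng.choice(["wrong", "wrong", "right"])
mid   = lambda prompt, rng: rng.choice(["right", "right", "wrong"])

rng = random.Random(0)
answers = temporal_sampling([final, mid], "problem", k=8, rng=rng)
print(majority_vote(answers))
```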


Revolution of Wireless Signal Recognition for 6G: Recent Advances, Challenges and Future Directions

Zhang, Hao, Zhou, Fuhui, Du, Hongyang, Wu, Qihui, Yuen, Chau

arXiv.org Artificial Intelligence

Wireless signal recognition (WSR) is a crucial technique for intelligent communications and spectrum sharing in sixth-generation (6G) wireless communication networks. It can be utilized to enhance network performance and efficiency, improve quality of service (QoS), and strengthen network security and reliability. Additionally, WSR can be applied to military applications such as signal interception, signal race, and signal abduction. In the past decades, great efforts have been devoted to WSR research. Earlier works mainly focused on model-based methods, including likelihood-based (LB) and feature-based (FB) methods, which held the leading position for many years. With the emergence of artificial intelligence (AI), intelligent methods, including machine learning-based (ML-based) and deep learning-based (DL-based) methods, have been developed to extract the features of received signals and perform classification. In this work, we provide a comprehensive review of WSR from the view of applications, main tasks, recent advances, datasets and evaluation metrics, challenges, and future directions. Specifically, intelligent WSR methods are introduced from the perspectives of model, data, learning, and implementation. Moreover, we analyze the challenges for WSR posed by complex, dynamic, and open 6G wireless environments and discuss future directions for WSR. This survey is expected to provide a comprehensive overview of state-of-the-art WSR techniques and inspire new research directions for WSR in 6G networks.


The Gradient of Algebraic Model Counting

Maene, Jaron, De Raedt, Luc

arXiv.org Artificial Intelligence

Algebraic model counting unifies many inference tasks on logic formulas by exploiting semirings. Rather than focusing on inference, we consider learning, especially in statistical-relational and neurosymbolic AI, which combine logical, probabilistic and neural representations. Concretely, we show that the very same semiring perspective of algebraic model counting also applies to learning. This allows us to unify various learning algorithms by generalizing gradients and backpropagation to different semirings. Furthermore, we show how cancellation and ordering properties of a semiring can be exploited for more memory-efficient backpropagation. This allows us to obtain some interesting variations of state-of-the-art gradient-based optimisation methods for probabilistic logical models. We also discuss why algebraic model counting on tractable circuits does not lead to more efficient second-order optimization. Empirically, our algebraic backpropagation exhibits considerable speed-ups as compared to existing approaches.
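The semiring view of gradients can be made concrete with the gradient (dual-number) semiring: elements are pairs (value, derivative), with the sum and product rules below. A single forward pass over a circuit then yields both the weighted model count and a partial derivative. A minimal sketch; the formula and weights are illustrative:

```python
# Gradient (dual-number) semiring: pairs (value, derivative). plus/times
# generalize the circuit's sum and product nodes, so evaluating the circuit
# in this semiring computes the weighted model count AND its gradient entry.

def splus(x, y):
    return (x[0] + y[0], x[1] + y[1])

def stimes(x, y):
    return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])

# Literal weights for WMC of (a AND b) OR (NOT a AND c); we differentiate
# with respect to p_a, so its tangent is 1 (and -1 for NOT a), others 0.
p_a, p_b, p_c = 0.3, 0.8, 0.5
a     = (p_a, 1.0)
not_a = (1 - p_a, -1.0)
b     = (p_b, 0.0)
c     = (p_c, 0.0)

wmc = splus(stimes(a, b), stimes(not_a, c))
print(wmc)  # value p_a*p_b + (1-p_a)*p_c, derivative p_b - p_c
```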


Meta-Learning Guided Label Noise Distillation for Robust Signal Modulation Classification

Hao, Xiaoyang, Feng, Zhixi, Peng, Tongqing, Yang, Shuyuan

arXiv.org Artificial Intelligence

Automatic modulation classification (AMC) is an effective way to counter physical-layer threats in the internet of things (IoT). However, labels are often noisy in practice, which significantly degrades the performance and robustness of deep neural networks (DNNs). In this paper, we propose a meta-learning guided label noise distillation method for robust AMC. Specifically, a teacher-student heterogeneous network (TSHN) framework is proposed to distill and reuse label noise. Based on the idea that labels are representations, the teacher network, with trusted meta-learning, divides and conquers untrusted label samples and then guides the student network to learn better by reassessing and correcting labels. Furthermore, we propose a multi-view signal (MVS) method to further improve performance on hard-to-classify categories with few-shot trusted label samples. Extensive experimental results show that our methods significantly improve the performance and robustness of signal AMC in various complex label-noise scenarios, which is crucial for securing IoT applications.
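A minimal caricature of the divide-and-conquer step: a teacher fit on a small trusted set reassesses the noisy labels and corrects only those it is confident about. Everything here (the centroid teacher, Gaussian-blob features, flip rate, and threshold) is an illustrative stand-in for the paper's deep networks:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for signal features: two classes as Gaussian blobs, a small
# trusted set, and a large set in which 30% of labels are flipped.
def blob(n, label):
    return rng.normal(loc=3.0 * label, scale=1.0, size=(n, 2))

X_trusted = np.vstack([blob(20, 0), blob(20, 1)])
y_trusted = np.r_[np.zeros(20, int), np.ones(20, int)]
X = np.vstack([blob(200, 0), blob(200, 1)])
y_clean = np.r_[np.zeros(200, int), np.ones(200, int)]
y_noisy = y_clean.copy()
flip = rng.uniform(size=400) < 0.3
y_noisy[flip] = 1 - y_noisy[flip]

# "Teacher" fit on the trusted set (nearest class centroid). It reassesses
# the noisy labels and corrects only those it is confident about.
centroids = np.stack([X_trusted[y_trusted == k].mean(axis=0) for k in (0, 1)])
d = ((X[:, None, :] - centroids) ** 2).sum(axis=-1)   # (400, 2) distances
confident = np.abs(d[:, 0] - d[:, 1]) > 2.0           # margin threshold
y_corrected = np.where(confident, d.argmin(axis=1), y_noisy)

print((y_noisy == y_clean).mean(), (y_corrected == y_clean).mean())
```

With well-separated classes the corrected labels agree with the clean ones far more often than the noisy ones do; the student would then be trained on `y_corrected` instead of `y_noisy`.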