raft

Discovering the dynamics of \emph{Sargassum} rafts' centers of mass

Beron-Vera, Francisco J., Bonner, Gage

arXiv.org Artificial Intelligence

Since 2011, rafts of floating \emph{Sargassum} seaweed have frequently obstructed the coasts of the Intra-Americas Seas. The motion of the rafts is represented by a high-dimensional nonlinear dynamical system, referred to as the eBOMB model, which builds on the Maxey--Riley equation by incorporating interactions between the clumps of \emph{Sargassum} forming a raft and the effects of Earth's rotation. The absence of a predictive law for the rafts' centers of mass suggests a need for machine learning. In this paper, we evaluate and contrast Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) and Sparse Identification of Nonlinear Dynamics (SINDy). In both cases, a physics-inspired closure modeling approach rooted in eBOMB is taken. Specifically, the LSTM model learns a mapping from a collection of eBOMB variables to the difference between the raft's center-of-mass velocity and the ocean velocity. The SINDy model's library of candidate functions is suggested by eBOMB variables and includes windowed velocity terms that incorporate far-field effects of the carrying flow. Both the LSTM and SINDy models perform most effectively when clumps are tightly bonded, with precision declining as complexity rises, for example when wind effects are included or when loosely connected clumps are assessed. The LSTM model delivered its best results with straightforward designs, using fewer neurons and hidden layers. While the LSTM model is an opaque black box lacking interpretability, the SINDy model brings transparency by discerning explicit functional relationships through its function libraries. Integrating the windowed velocity terms enabled effective modeling of nonlocal interactions, particularly in datasets featuring sparsely connected rafts.
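The SINDy step the abstract refers to can be illustrated with a minimal sketch of sequentially thresholded least squares: regress measured time derivatives onto a library of candidate functions and iteratively prune small coefficients. The library terms below (constant, x, x²) and the toy data are placeholders for illustration only, not the eBOMB-derived terms used in the paper.

```python
import numpy as np

def sindy_fit(X, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares over a candidate library."""
    # Candidate library: [1, x, x^2] evaluated column-wise.
    Theta = np.hstack([np.ones((X.shape[0], 1)), X, X**2])
    Xi, *_ = np.linalg.lstsq(Theta, dXdt, rcond=None)
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold           # prune small coefficients
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):           # refit on surviving terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], dXdt[:, k],
                                                 rcond=None)
    return Theta, Xi

# Toy usage: recover dx/dt = -2x from lightly noised samples.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1))
dx = -2.0 * x + 1e-3 * rng.normal(size=x.shape)
_, Xi = sindy_fit(x, dx)
```

The sparsity-promoting thresholding is what yields an interpretable, explicit model: only the library terms that survive the pruning appear in the recovered dynamics.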


ProteinZero: Self-Improving Protein Generation via Online Reinforcement Learning

Wang, Ziwen, Fan, Jiajun, Guo, Ruihan, Nguyen, Thao, Ji, Heng, Liu, Ge

arXiv.org Artificial Intelligence

Protein generative models have shown remarkable promise in protein design but still face limitations in success rate, due to the scarcity of high-quality protein datasets for supervised pretraining. We present ProteinZero, a novel framework that enables scalable, automated, and continuous self-improvement of the inverse folding model through online reinforcement learning. To achieve computationally tractable online feedback, we introduce efficient proxy reward models based on ESMFold and a novel rapid ddG predictor that significantly accelerates evaluation. ProteinZero employs a general RL framework balancing multi-reward maximization, KL-divergence from a reference model, and a novel protein-embedding-level diversity regularization that prevents mode collapse while promoting higher sequence diversity. Through extensive experiments, we demonstrate that ProteinZero substantially outperforms existing methods across every key metric in protein design, achieving significant improvements in structural accuracy, designability, thermodynamic stability, and sequence diversity. Most impressively, ProteinZero reduces design failure rates by approximately 36% - 48% compared to widely-used methods like ProteinMPNN, ESM-IF and InstructPLM, consistently achieving success rates exceeding 90% across diverse and complex protein folds. Notably, the entire RL run on CATH-4.3 can be done on a single 8-GPU node in under 3 days, including reward computation. Our work establishes a new paradigm for protein design where models evolve continuously from their own generated outputs, opening new possibilities for exploring the vast protein design space.
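The three-part objective the abstract describes can be sketched as a single scalar combining a reward term, a KL penalty against the reference model, and an embedding-level diversity bonus. This is a hypothetical illustration of the structure only; the function name, the pairwise-distance diversity measure, and the coefficients `beta` and `lam` are assumptions, not the paper's implementation.

```python
import numpy as np

def rl_objective(rewards, logp_policy, logp_ref, embeddings, beta=0.1, lam=0.05):
    """Objective = mean multi-reward - beta * KL(policy || ref) + lam * diversity."""
    # rewards: (n, r) matrix of proxy rewards per sample (e.g., fold accuracy, ddG).
    reward_term = rewards.mean(axis=1).mean()
    # Monte Carlo KL estimate from log-prob differences on the sampled sequences.
    kl_term = (logp_policy - logp_ref).mean()
    # Embedding-level diversity: mean pairwise distance between sample embeddings,
    # which penalizes mode collapse (identical samples give zero diversity).
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    diversity = np.sqrt((diffs ** 2).sum(-1)).mean()
    return reward_term - beta * kl_term + lam * diversity
```

The KL term anchors the policy to the pretrained reference model while the diversity term pushes samples apart in embedding space, which is the balance the framework is built around.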


Improving LLM-Powered EDA Assistants with RAFT

Shi, Luyao, Kazda, Michael, Schmitter, Charles, Gupta, Hemlata

arXiv.org Artificial Intelligence

Electronic design engineers often struggle to efficiently access relevant information for tasks like design verification and technology development. While large language models (LLMs) can enhance productivity as conversational agents, pre-trained open-source LLMs lack domain-specific knowledge for Electronic Design Automation (EDA). In a Retrieval-Augmented Generation (RAG) context, LLMs rely on external context but may still produce inaccurate responses. Retrieval-Augmented Fine-Tuning (RAFT) improves LLM performance, but acquiring labeled question/answer (Q/A) data in EDA is difficult. To address this, we propose using synthetic Q/A datasets to enhance LLMs with RAFT. Our results show that RAFT with synthetic data significantly boosts LLM performance for RAG-based EDA tasks. We also investigate the impact of using real user questions as Retrieval-Augmented Few-Shot (RAFS) examples for synthetic data generation. Additionally, we implement secure access control to ensure sensitive information is only accessible to authorized personnel. Finally, we assess the risk of data leakage and unintended memorization during fine-tuning with synthetic data, providing practical insights.


Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability

Zhu, Chiwei, Xu, Benfeng, Yang, An, Lin, Junyang, Wang, Quan, Zhou, Chang, Mao, Zhendong

arXiv.org Artificial Intelligence

Training language models with rationales augmentation has been shown to be beneficial in many existing works. In this paper, we identify that this prevailing view does not hold consistently. We conduct comprehensive investigations to thoroughly inspect the impact of rationales on model performance as well as on a novel perspective of model reliability. The results lead to several key findings that add new insights to existing understanding: 1) Rationales can, at times, deteriorate model performance; 2) Rationales can, at times, improve model reliability, even outperforming their untrained counterparts; 3) A linear correspondence exists between the performance and reliability improvements, and both are driven by the intrinsic difficulty of the task. These findings provide practical guidance for the broad utilization of rationales and raise critical implications for the procedure of explicitly aligning language models with implicit human thoughts. Code is available at https://github.com/Ignoramus0817/rationales.


A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Xiong, Wei, Yao, Jiarui, Xu, Yuhui, Pang, Bo, Wang, Lei, Sahoo, Doyen, Li, Junnan, Jiang, Nan, Zhang, Tong, Xiong, Caiming, Dong, Hanze

arXiv.org Machine Learning

We investigate reinforcement learning (RL) algorithms in the context of fine-tuning large language models (LLMs) with verifiable rewards. Our focus is on mathematical reasoning tasks, which have recently received significant attention following the release of models such as OpenAI's O1 Model (Jaech et al., 2024) and DeepSeek-R1 (DeepSeek-AI et al., 2025). The dominant approach in LLM post-training has been Proximal Policy Optimization (PPO) (Schulman et al., 2017; Bai et al., 2022; Ouyang et al., 2022). However, PPO requires an additional critic network beyond the vanilla Reinforce algorithm (Williams and Peng, 1991), introducing both computational overhead and algorithmic complexity. Meanwhile, the deterministic transition dynamics of LLM generation simplify the problem and yield relatively low variance, so many of PPO's sophisticated components may be unnecessary in this setting. This observation has inspired growing interest in designing simpler yet effective RL algorithms for post-training LLMs. Several recent works revisit Reinforce-style approaches, including ReMax (Li et al., 2023), RLOO (Ahmadian et al., 2024; Kool et al., 2019), GRPO (Shao et al., 2024), and Reinforce++ (Hu, 2025). In parallel, other methods explore different directions beyond policy gradients. Reward-ranked fine-tuning (RAFT) (Anthony et al., 2017; Dong et al., 2023) iteratively generates n responses per prompt, filters out those with incorrect answers, and fine-tunes the LLM on the remaining accepted samples.
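The RAFT loop described above — sample n responses per prompt, keep only those whose answer verifies as correct, fine-tune on the survivors — can be sketched in a few lines. `generate` and `is_correct` below are placeholder stand-ins for a real LLM sampler and answer verifier, not any library's API.

```python
import random

def raft_round(prompts, generate, is_correct, n=8):
    """One rejection-sampling round: keep only verified-correct responses."""
    accepted = []
    for prompt in prompts:
        for response in generate(prompt, n):   # n samples per prompt
            if is_correct(prompt, response):   # verifiable-reward filter
                accepted.append((prompt, response))
    return accepted  # fine-tune the LLM on these (prompt, response) pairs

# Toy usage with a fake "model" that sometimes answers 2+2 correctly.
random.seed(0)
gen = lambda p, n: [str(random.choice([3, 4, 5])) for _ in range(n)]
check = lambda p, r: r == "4"
data = raft_round(["2+2?"], gen, check)
```

Because the filter discards all incorrect samples, the fine-tuning set contains only verified responses; this is the minimalist alternative to critic-based policy-gradient methods that the paper's title alludes to.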


Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG

Bhushan, Kushagra, Nandwani, Yatin, Khandelwal, Dinesh, Gupta, Sonam, Pandey, Gaurav, Raghu, Dinesh, Joshi, Sachindra

arXiv.org Artificial Intelligence

Retrieval-Augmented Generation (RAG) has emerged as a prominent method for incorporating domain knowledge into Large Language Models (LLMs). While RAG enhances response relevance by incorporating retrieved domain knowledge in the context, retrieval errors can still lead to hallucinations and incorrect answers. To recover from retriever failures, domain knowledge is injected by fine-tuning the model to generate the correct response, even in the case of retrieval errors. However, we observe that without systematic knowledge augmentation, fine-tuned LLMs may memorize new information but still fail to extract relevant domain knowledge, leading to poor performance. In this work, we present a novel framework that significantly enhances the fine-tuning process by augmenting the training data in two ways -- context augmentation and knowledge paraphrasing. In context augmentation, we create multiple training samples for a given QA pair by varying the relevance of the retrieved information, teaching the model when to ignore and when to rely on retrieved content. In knowledge paraphrasing, we fine-tune with multiple answers to the same question, enabling LLMs to better internalize specialized knowledge. To mitigate catastrophic forgetting due to fine-tuning, we add a domain-specific identifier to a question and also utilize a replay buffer containing general QA pairs. Experimental results demonstrate the efficacy of our method over existing techniques, achieving up to 10\% relative gain in token-level recall while preserving the LLM's generalization capabilities.
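The context-augmentation idea described above can be sketched as follows: from one QA pair and its gold passage, build several training samples that vary how relevant the retrieved context is, so the model learns both when to rely on the context and when to fall back on injected knowledge. The function name, sample schema, and three-way split are assumptions for illustration, not the paper's exact recipe.

```python
def augment_context(question, answer, gold_passage, distractors, k=2):
    """Build training samples with varying retrieval relevance for one QA pair."""
    samples = []
    # 1) Fully relevant retrieval: the gold passage is present.
    samples.append({"question": question,
                    "context": [gold_passage],
                    "answer": answer})
    # 2) Mixed retrieval: the gold passage buried among distractors.
    samples.append({"question": question,
                    "context": distractors[:k] + [gold_passage],
                    "answer": answer})
    # 3) Simulated retrieval failure: distractors only; the model must answer
    #    from fine-tuned domain knowledge rather than the context.
    samples.append({"question": question,
                    "context": distractors[:k],
                    "answer": answer})
    return samples
```

Knowledge paraphrasing would then add further samples pairing the same question with paraphrased answers; combined with a replay buffer of general QA pairs, this is the augmented fine-tuning set the framework trains on.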


Swarms of tiny robots coordinate to achieve ant-like feats of strength

New Scientist

Swarms of tiny robots guided by magnetic fields can coordinate to act like ants, from packing together to form a floating raft to lifting objects hundreds of times their weight. About the size of a grain of sand, the microrobots could someday do jobs larger bots cannot, such as unblocking blood vessels and delivering drugs to specific locations inside the human body. Jeong Jae Wie at Hanyang University in South Korea and his colleagues made the tiny, cube-shaped robots using a mould and epoxy resin embedded with magnetic alloy. These small magnetic particles enable the microrobots to be "programmed" to form various configurations after being exposed to strong magnetic fields from certain angles. The bots can then be controlled by external magnetic fields to perform spins or other motions.