AITopics

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsJun-12-2026, 19:31:54 GMT

MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling

The substantial memory demands of pre-training and fine-tuning large language models (LLMs) require memory-efficient optimization algorithms. One promising approach is layer-wise optimization, which treats each transformer block as a single layer and optimizes it sequentially, while freezing the other layers to save optimizer states and activations. Although effective, these methods ignore the varying importance of the modules within each layer, leading to suboptimal performance. Moreover, layer-wise sampling provides only limited memory savings, as at least one full layer must remain active during optimization. To overcome these limitations, we propose **M**odule-wise **I**mportance **SA**mpling (**MISA**), a novel method that divides each layer into smaller modules and assigns importance scores to each module. MISA uses a weighted random sampling mechanism to activate modules, provably reducing gradient variance compared to layer-wise sampling. Additionally, we establish an $\mathcal{O}(1/\sqrt{K})$ convergence rate under non-convex and stochastic conditions, where $K$ is the total number of training steps, and provide a detailed memory analysis showcasing MISA's superiority over existing baseline methods.

artificial intelligence, large language model, natural language, (6 more...)

Genre: Research Report > Promising Solution (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.60)

Neural Information Processing SystemsFeb-19-2026, 07:58:17 GMT

3c6bd2021c10462c5164638d22f3d5d8-Paper-Conference.pdf

estimation, misa, mutual information, (10 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Neural Information Processing SystemsDec-24-2025, 19:07:46 GMT

Mutual Information Regularized Offline Reinforcement Learning

The major challenge of offline RL is the distribution shift that appears when out-of-distribution actions are queried, which makes the policy improvement direction biased by extrapolation errors. Most existing methods address this problem by penalizing the policy or value for deviating from the behavior policy during policy improvement or evaluation. In this work, we propose a novel MISA framework to approach offline RL from the perspective of Mutual Information between States and Actions in the dataset by directly constraining the policy improvement direction.

artificial intelligence, machine learning, proceedings, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.59)

Neural Information Processing SystemsOct-8-2025, 12:30:26 GMT

Mutual Information Regularized Offline Reinforcement Learning

We show that optimizing this lower bound is equivalent to maximizing the likelihood of a one-step improved policy on the offline dataset. Hence, we constrain the policy improvement direction to lie in the data manifold.

estimation, misa, mutual information, (10 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

arXiv.org Artificial IntelligenceSep-4-2025

BioMD: All-atom Generative Model for Biomolecular Dynamics Simulation

Feng, Bin, Zhang, Jiying, Zhang, Xinni, Liu, Zijing, Li, Yu

Molecular dynamics (MD) simulations are essential tools in computational chemistry and drug discovery, offering crucial insights into dynamic molecular behavior. However, their utility is significantly limited by substantial computational costs, which severely restrict accessible timescales for many biologically relevant processes. Despite the encouraging performance of existing machine learning (ML) methods, they struggle to generate extended biomolecular system trajectories, primarily due to the lack of MD datasets and the large computational demands of modeling long historical trajectories. Here, we introduce BioMD, the first all-atom generative model to simulate long-timescale protein-ligand dynamics using a hierarchical framework of forecasting and interpolation. We demonstrate the effectiveness and versatility of BioMD on the DD-13M (ligand unbinding) and MISATO datasets. For both datasets, BioMD generates highly realistic conformations, showing high physical plausibility and low reconstruction errors. Besides, BioMD successfully generates ligand unbinding paths for 97.1% of the protein-ligand systems within ten attempts, demonstrating its ability to explore critical unbinding pathways. Collectively, these results establish BioMD as a tool for simulating complex biomolecular processes, offering broad applicability for computational chemistry and drug discovery.

machine learning, natural language, trajectory, (19 more...)

2509.02642

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Neural Information Processing SystemsOct-11-2024, 10:56:47 GMT

Mutual Information Regularized Offline Reinforcement Learning

The major challenge of offline RL is the distribution shift that appears when out-of-distribution actions are queried, which makes the policy improvement direction biased by extrapolation errors. Most existing methods address this problem by penalizing the policy or value for deviating from the behavior policy during policy improvement or evaluation. In this work, we propose a novel MISA framework to approach offline RL from the perspective of Mutual Information between States and Actions in the dataset by directly constraining the policy improvement direction. We show that optimizing this lower bound is equivalent to maximizing the likelihood of a one-step improved policy on the offline dataset. Hence, we constrain the policy improvement direction to lie in the data manifold.

information regularized offline reinforcement learning, offline rl, policy improvement direction, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)

arXiv.org Artificial IntelligenceDec-19-2023

MISA: Unveiling the Vulnerabilities in Split Federated Learning

Wan, Wei, Ning, Yuxuan, Hu, Shengshan, Xue, Lulu, Li, Minghui, Zhang, Leo Yu, Jin, Hai

\textit{Federated learning} (FL) and \textit{split learning} (SL) are prevailing distributed paradigms in recent years. They both enable shared global model training while keeping data localized on users' devices. The former excels in parallel execution capabilities, while the latter enjoys low dependence on edge computing resources and strong privacy protection. \textit{Split federated learning} (SFL) combines the strengths of both FL and SL, making it one of the most popular distributed architectures. Furthermore, a recent study has claimed that SFL exhibits robustness against poisoning attacks, with a fivefold improvement compared to FL in terms of robustness. In this paper, we present a novel poisoning attack known as MISA. It poisons both the top and bottom models, causing a \textbf{\underline{misa}}lignment in the global model, ultimately leading to a drastic accuracy collapse. This attack unveils the vulnerabilities in SFL, challenging the conventional belief that SFL is robust against poisoning attacks. Extensive experiments demonstrate that our proposed MISA poses a significant threat to the availability of SFL, underscoring the imperative for academia and industry to accord this matter due attention.

acc drop, bottom model, poisoning attack, (16 more...)

2312.11026

Country: Asia > China > Hubei Province (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Artificial IntelligenceNov-5-2023

Mutual Information Regularized Offline Reinforcement Learning

Ma, Xiao, Kang, Bingyi, Xu, Zhongwen, Lin, Min, Yan, Shuicheng

The major challenge of offline RL is the distribution shift that appears when out-of-distribution actions are queried, which makes the policy improvement direction biased by extrapolation errors. Most existing methods address this problem by penalizing the policy or value for deviating from the behavior policy during policy improvement or evaluation. In this work, we propose a novel MISA framework to approach offline RL from the perspective of Mutual Information between States and Actions in the dataset by directly constraining the policy improvement direction. MISA constructs lower bounds of mutual information parameterized by the policy and Q-values. We show that optimizing this lower bound is equivalent to maximizing the likelihood of a one-step improved policy on the offline dataset. Hence, we constrain the policy improvement direction to lie in the data manifold. The resulting algorithm simultaneously augments the policy evaluation and improvement by adding mutual information regularizations. MISA is a general framework that unifies conservative Q-learning (CQL) and behavior regularization methods (e.g., TD3+BC) as special cases. We introduce 3 different variants of MISA, and empirically demonstrate that tighter mutual information lower bound gives better offline RL performance. In addition, our extensive experiments show MISA significantly outperforms a wide range of baselines on various tasks of the D4RL benchmark,e.g., achieving 742.9 total points on gym-locomotion tasks. Our code is available at https://github.com/sail-sg/MISA.

estimation, misa, mutual information, (10 more...)

2210.07484

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial IntelligenceOct-31-2023

Compliant actuators that mimic biological muscle performance with applications in a highly biomimetic robotic arm

Yang, Haosen, Wei, Guowu, Ren, Lei, Yan, Lingyun

This paper endeavours to bridge the existing gap in muscular actuator design for ligament-skeletal-inspired robots, thereby fostering the evolution of these robotic systems. We introduce two novel compliant actuators, namely the Internal Torsion Spring Compliant Actuator (ICA) and the External Spring Compliant Actuator (ECA), and present a comparative analysis against the previously conceived Magnet Integrated Soft Actuator (MISA) through computational and experimental results. These actuators, employing a motor-tendon system, emulate biological muscle-like forms, enhancing artificial muscle technology. A robotic arm application inspired by the skeletal ligament system is presented. Experiments demonstrate satisfactory power in tasks like lifting dumbbells (peak power: 36W), playing table tennis (end-effector speed: 3.2 m/s), and door opening, without compromising biomimetic aesthetics. Compared to other linear stiffness serial elastic actuators (SEAs), ECA and ICA exhibit high power-to-volume (361 x 10^3 W/m) and power-to-mass (111.6 W/kg) ratios respectively, endorsing the biomimetic design's promise in robotic development.

actuator, robotic arm, tendon, (15 more...)