Limits of Generalization in RLVR: Two Case Studies in Mathematical Reasoning

Alam, Md Tanvirul, Rastogi, Nidhi

arXiv.org Artificial Intelligence

Mathematical reasoning is a central challenge for large language models (LLMs), requiring not only correct answers but also faithful reasoning processes. Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a promising approach for enhancing such capabilities; however, its ability to foster genuine reasoning remains unclear. We investigate RLVR on two combinatorial problems with fully verifiable solutions: \emph{Activity Scheduling} and the \emph{Longest Increasing Subsequence}, using carefully curated datasets with unique optima. Across multiple reward designs, we find that RLVR improves evaluation metrics but often by reinforcing superficial heuristics rather than acquiring new reasoning strategies. These findings highlight the limits of RLVR generalization, emphasizing the importance of benchmarks that disentangle genuine mathematical reasoning from shortcut exploitation and provide faithful measures of progress. Code available at https://github.com/xashru/rlvr-seq-generalization.
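Both benchmark tasks named in the abstract admit short exact solvers, which is what makes their rewards fully verifiable. As a rough illustration (not taken from the paper's released code), here is one standard solver for each: patience sorting for the longest increasing subsequence length, and the classic earliest-finish-time greedy for activity scheduling:

```python
import bisect

def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience sorting, O(n log n))."""
    tails = []  # tails[k] = smallest possible tail of an increasing subsequence of length k+1
    for x in seq:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

def max_activities(intervals):
    """Maximum number of non-overlapping (start, end) intervals: greedy by earliest finish time."""
    count, last_end = 0, float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:
            count += 1
            last_end = end
    return count
```

Because exact optima like these are cheap to compute, a reward signal can check a model's output against the unique optimum rather than against a noisy heuristic.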


A Metric for MLLM Alignment in Large-scale Recommendation

Zhang, Yubin, Huang, Yanhua, Xu, Haiming, Qi, Mingliang, Wang, Chang, Jin, Jiarui, Ren, Xiangyuan, Wang, Xiaodan, Xu, Ruiwen

arXiv.org Artificial Intelligence

Multimodal recommendation has emerged as a critical technique in modern recommender systems, leveraging content representations from advanced multimodal large language models (MLLMs). To ensure these representations are well-adapted, alignment with the recommender system is essential. However, evaluating the alignment of MLLMs for recommendation presents significant challenges due to three key issues: (1) static benchmarks are inaccurate because of the dynamism in real-world applications, (2) evaluations with the online system, while accurate, are prohibitively expensive at scale, and (3) conventional metrics fail to provide actionable insights when learned representations underperform. To address these challenges, we propose the Leakage Impact Score (LIS), a novel metric for multimodal recommendation. Rather than directly assessing MLLMs, LIS efficiently measures the upper bound of preference data. We also share practical insights on deploying MLLMs with LIS in real-world scenarios. Online A/B tests on both Content Feed and Display Ads of Xiaohongshu's Explore Feed production demonstrate the effectiveness of our proposed method, showing significant improvements in user time spent and advertiser value.


Not Quite 'Ask a Librarian': AI on the Nature, Value, and Future of LIS

Dinneen, Jesse David, Bubinger, Helen

arXiv.org Artificial Intelligence

AI language models trained on Web data generate prose that reflects human knowledge and public sentiments, but can also contain novel insights and predictions. We asked the world's best language model, GPT-3, fifteen difficult questions about the nature, value, and future of library and information science (LIS), topics that receive perennial attention from LIS scholars. We present highlights from its 45 different responses, which range from platitudes and caricatures to interesting perspectives and worrisome visions of the future, thus providing an LIS-tailored demonstration of the current performance of AI language models. We also reflect on the viability of using AI to forecast or generate research ideas in this way today. Finally, we have shared the full response log online for readers to consider and evaluate for themselves.


Constructing Hierarchical Bayesian Networks With Pooling

Nishino, Kaneharu (The University of Tokyo) | Inaba, Mary (The University of Tokyo)

AAAI Conferences

Inspired by the Bayesian brain hypothesis and deep learning, we develop a Bayesian autoencoder, a method of constructing recognition systems using a Bayesian network. We construct hierarchical Bayesian networks based on feature extraction and implement pooling to achieve invariance within a Bayesian network framework. The constructed networks propagate information bidirectionally between layers. We expect they will be able to achieve brain-like recognition using local features and global information such as their environments.


Locally Imposing Function for Generalized Constraint Neural Networks - A Study on Equality Constraints

Cao, Linlin, He, Ran, Hu, Bao-Gang

arXiv.org Machine Learning

This work is a further study on the Generalized Constraint Neural Network (GCNN) model [1], [2]. Two challenges are encountered in this study: embedding arbitrary types of prior information and selecting the scheme for imposing it. The work focuses on the second challenge and studies a new constraint imposing scheme for equality constraints. A new method called the locally imposing function (LIF) is proposed to provide a local correction to the GCNN prediction function, and it therefore falls within the Locally Imposing Scheme (LIS). In comparison, the conventional Lagrange multiplier method is considered a Globally Imposing Scheme (GIS) because its added constraint term has a global impact on the objective function. Two advantages are gained from LIS over GIS. First, LIS enables constraints to fire locally and explicitly, only in the regions of the domain where the prediction function needs them. Second, constraints can be implemented directly within a network setting. We attempt to interpret several constraint methods graphically from the viewpoint of the locality principle. Numerical examples confirm the advantages of the proposed method. In solving boundary value problems with Dirichlet and Neumann constraints, the GCNN model with LIF can achieve exact satisfaction of the constraints.
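The local-versus-global distinction can be sketched in a toy setting. The following is a minimal illustration, with hypothetical names and a simple Gaussian bump standing in for the paper's actual LIF construction: an unconstrained predictor is corrected only near the constrained point, so the equality constraint f(0) = 1 holds exactly by construction, whereas a Lagrange/penalty-style (GIS) term would instead perturb the whole fit and satisfy the constraint only approximately.

```python
import numpy as np

def raw_model(x, w):
    # Hypothetical unconstrained predictor (stand-in for a trained network).
    return w[0] + w[1] * x

def lif_predict(x, w, x0=0.0, y0=1.0, width=0.1):
    # Local correction: a bump psi centered at x0 blends the prediction toward y0.
    # psi(x0) = 1 and psi decays rapidly away from x0, so the correction
    # "fires" only near the constrained point and leaves the rest untouched.
    psi = np.exp(-((x - x0) / width) ** 2)
    f = raw_model(x, w)
    return (1.0 - psi) * f + psi * y0
```

At x = x0 the blended output equals y0 exactly; far from x0 the bump vanishes and the raw prediction is returned unchanged, which is the sense in which the constraint acts locally.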


Advances in Lifted Importance Sampling

Gogate, Vibhav (The University of Texas at Dallas) | Jha, Abhay (University of Washington) | Venugopal, Deepak (The University of Texas at Dallas)

AAAI Conferences

We consider lifted importance sampling (LIS), a previously proposed approximate inference algorithm for statistical relational learning (SRL) models. LIS achieves substantial variance reduction over conventional importance sampling by using various lifting rules that take advantage of the symmetry in the relational representation. However, it suffers from two drawbacks. First, it does not take advantage of some important symmetries in the relational representation and may exhibit needlessly high variance on models having these symmetries. Second, it uses an uninformative proposal distribution, which adversely affects its accuracy. We propose two improvements to LIS that address these limitations. First, we identify a new symmetry in SRL models and define a lifting rule for taking advantage of it; this lifting rule reduces the variance of LIS. Second, we propose a new, structured approach for constructing and dynamically updating the proposal distribution via adaptive sampling. We demonstrate experimentally that our improved LIS algorithm is substantially more accurate than the original LIS algorithm.
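For context, the propositional baseline that lifting improves on is ordinary (self-normalized) importance sampling. The sketch below is illustrative only, with a toy Gaussian target rather than an SRL model: it estimates an expectation under a target density p using weighted samples from a proposal q. A poorly matched proposal inflates the weight variance, which is the problem the paper's adaptive proposal construction targets, and lifting rules reduce variance further by sampling whole symmetry classes at once.

```python
import math
import random

def importance_sampling_mean(f, target_logpdf, proposal_sample, proposal_logpdf,
                             n=10000, seed=0):
    """Self-normalized importance sampling estimate of E_p[f(X)]:
    E_p[f] ~= sum(w_i * f(x_i)) / sum(w_i), with w_i = p(x_i) / q(x_i)."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        w = math.exp(target_logpdf(x) - proposal_logpdf(x))
        num += w * f(x)
        den += w
    return num / den

def normal_logpdf(x, mu=0.0, sigma=1.0):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

# Estimate E[X] = 0 under a standard normal target, sampling from a wider
# normal(0, 2) proposal; the weights correct for the mismatch.
est = importance_sampling_mean(
    f=lambda x: x,
    target_logpdf=lambda x: normal_logpdf(x, 0.0, 1.0),
    proposal_sample=lambda rng: rng.gauss(0.0, 2.0),
    proposal_logpdf=lambda x: normal_logpdf(x, 0.0, 2.0),
)
```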