Overview
RJE: A Retrieval-Judgment-Exploration Framework for Efficient Knowledge Graph Question Answering with LLMs
Lin, Can, Jiang, Zhengwang, Zheng, Ling, Zhao, Qi, Zhang, Yuhang, Song, Qi, Zhou, Wangqiu
Knowledge graph question answering (KGQA) aims to answer natural language questions using knowledge graphs. Recent research leverages large language models (LLMs) to enhance KGQA reasoning, but faces limitations: retrieval-based methods are constrained by the quality of retrieved information, while agent-based methods rely heavily on proprietary LLMs. To address these limitations, we propose Retrieval-Judgment-Exploration (RJE), a framework that retrieves refined reasoning paths, evaluates their sufficiency, and conditionally explores additional evidence. Moreover, RJE introduces specialized auxiliary modules enabling small-sized LLMs to perform effectively: Reasoning Path Ranking, Question Decomposition, and Retriever-assisted Exploration. Experiments show that our approach with proprietary LLMs (such as GPT-4o-mini) outperforms existing baselines while enabling small open-source LLMs (such as 3B and 8B parameters) to achieve competitive results without fine-tuning LLMs. Additionally, RJE substantially reduces the number of LLM calls and token usage compared to agent-based methods, yielding significant efficiency improvements.
Let's Play Across Cultures: A Large Multilingual, Multicultural Benchmark for Assessing Language Models' Understanding of Sports
Singh, Punit Kumar, Kumar, Nishant, Ghosh, Akash, Pasad, Kunal, Soni, Khushi, Jaishwal, Manisha, Saha, Sriparna, Alfarozi, Syukron Abu Ishaq, Abagissa, Asres Temam, Pasupa, Kitsuchart, Yang, Haiqin, Moreno, Jose G
Language Models (LMs) are primarily evaluated on globally popular sports, often overlooking regional and indigenous sporting traditions. To address this gap, we introduce CultSportQA, a benchmark designed to assess LMs' understanding of traditional sports across 60 countries and 6 continents, encompassing four distinct cultural categories. The dataset features 33,000 multiple-choice questions (MCQs) across text and image modalities, each of which is categorized into three key types: history-based, rule-based, and scenario-based. To evaluate model performance, we employ zero-shot, few-shot, and chain-of-thought (CoT) prompting across a diverse set of Large Language Models (LLMs), Small Language Models (SLMs), and Multimodal Large Language Models (MLMs). By providing a comprehensive multilingual and multicultural sports benchmark, CultSportQA establishes a new standard for assessing AI's ability to understand and reason about traditional sports.
The Rise of AfricaNLP: Contributions, Contributors, and Community Impact (2005-2025)
Belay, Tadesse Destaw, Hussen, Kedir Yassin, Imam, Sukairaj Hafiz, Ahmad, Ibrahim Said, Inuwa-Dutse, Isa, Haile, Abrham Belete, Sidorov, Grigori, Ameer, Iqra, Abdulmumin, Idris, Gwadabe, Tajuddeen, Marivate, Vukosi, Yimam, Seid Muhie, Muhammad, Shamsuddeen Hassan
Natural Language Processing (NLP) is undergoing constant transformation, as Large Language Models (LLMs) are driving daily breakthroughs in research and practice. In this regard, tracking the progress of NLP research and automatically analyzing the contributions of research papers provides key insights into the nature of the field and the researchers. This study explores the progress of African NLP (AfricaNLP) by asking (and answering) basic research questions such as: i) How has the nature of NLP evolved over the last two decades?, ii) What are the contributions of AfricaNLP papers?, and iii) Which individuals and organizations (authors, affiliated institutions, and funding bodies) have been involved in the development of AfricaNLP? We quantitatively examine the contributions of AfricaNLP research using 1.9K NLP paper abstracts, 4.9K author contributors, and 7.8K human-annotated contribution sentences (AfricaNLPContributions) along with benchmark results. Our dataset and continuously existing NLP progress tracking website provide a powerful lens for tracing AfricaNLP research trends and hold potential for generating data-driven literature surveys.
Theoretical Foundations of Representation Learning using Unlabeled Data: Statistics and Optimization
Esser, Pascal, Fleissner, Maximilian, Ghoshdastidar, Debarghya
Representation learning from unlabeled data has been extensively studied in statistics, data science and signal processing with a rich literature on techniques for dimension reduction, compression, multi-dimensional scaling among others. However, current deep learning models use new principles for unsupervised representation learning that cannot be easily analyzed using classical theories. For example, visual foundation models have found tremendous success using self-supervision or denoising/masked autoencoders, which effectively learn representations from massive amounts of unlabeled data. However, it remains difficult to characterize the representations learned by these models and to explain why they perform well for diverse prediction tasks or show emergent behavior. To answer these questions, one needs to combine mathematical tools from statistics and optimization. This paper provides an overview of recent theoretical advances in representation learning from unlabeled data and mentions our contributions in this direction.
Search-Based Software Engineering and AI Foundation Models: Current Landscape and Future Roadmap
Sartaj, Hassan, Ali, Shaukat, Arcaini, Paolo, Arcuri, Andrea
Search-based software engineering (SBSE), which integrates metaheuristic search techniques with software engineering, has been an active area of research for about 25 years. It has been applied to solve numerous problems across the entire software engineering lifecycle and has demonstrated its versatility in multiple domains. With recent advances in AI, particularly the emergence of foundation models (FMs) such as large language models (LLMs), the evolution of SBSE alongside these models remains undetermined. In this window of opportunity, we present a research roadmap that articulates the current landscape of SBSE in relation to FMs, identifies open challenges, and outlines potential research directions to advance SBSE through its integration and interplay with FMs. Specifically, we analyze five core aspects: leveraging FMs for SBSE design, applying FMs to complement SBSE in SE problems, employing SBSE to address FM challenges, adapting SBSE practices for FMs tailored to SE activities, and exploring the synergistic potential between SBSE and FMs. Furthermore, we present a forward-thinking perspective that envisions the future of SBSE in the era of FMs, highlighting promising research opportunities to address challenges in emerging domains.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper looks at differentially private algorithms for a generic maximization problem (private argmax might be a good name). Given a collection of K items, a data set D of n individuals, and a score function f that assigns each item i a data-based score f(i;D), the goal is to find an item i with approximately maximal score while preserving differential privacy. This private argmax has proven to be a fundamental problem in the theory of private data analysis. It was first formulated by McSherry and Talwar (2007), who proposed the exponential mechanism to solve it.
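For readers unfamiliar with the exponential mechanism mentioned above, the following is a minimal Python sketch; it is not taken from the paper under review. It samples item i with probability proportional to exp(eps * f(i;D) / (2 * sensitivity)), the standard form of the mechanism. The function name, the `sensitivity` parameter (the largest change in any score when one individual's data changes) and the example scores are assumptions made for illustration.

```python
import math
import random


def exponential_mechanism(scores, epsilon, sensitivity=1.0, rng=None):
    """Return the index of one item, sampled with probability
    proportional to exp(epsilon * score / (2 * sensitivity)).

    scores      -- list of f(i; D) values, one per item (assumed precomputed)
    epsilon     -- privacy parameter, must be positive
    sensitivity -- assumed bound on how much one individual can change a score
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    if not scores:
        raise ValueError("scores must be non-empty")
    rng = rng or random.Random()

    # Subtracting the maximum score leaves the sampling distribution
    # unchanged but keeps exp() from overflowing for large scores.
    top = max(scores)
    weights = [
        math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores
    ]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores for K = 4 items.
    example_scores = [10.0, 12.0, 11.5, 3.0]
    print(exponential_mechanism(example_scores, epsilon=1.0))
```

A small epsilon makes the choice close to uniform (more privacy, less accuracy); a large epsilon concentrates the probability on the highest-scoring items.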
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper proposes an incremental but very sensible and practical modification to 'curriculum learning'. Given a partition of the training examples into classes, they propose an additional regularising term (and an additional parameter) to ensure that the 'easy' examples selected during learning are spread across the classes, and not drawn from a single class. The partition into classes can come from a clustering algorithm, or from a priori knowledge. The idea is straightforward and sensible, and the authors propose an algorithm that looks efficient and correct.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper presents a VB method for learning nonlinear state-space models using sparse GPs to model the nonlinear state transition and observation mappings. The proposed method looks very good and efficient, but the empirical evaluation is relatively weak. Quality: The paper appears technically sound, save one minor problem listed below. The method is based on existing solid principles with VB-based sparse GPs, stochastic variational inference and sequential Monte Carlo.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. In particle filtering, the resampling step is a synchronous operation: one needs all the particles before computing the normalised weights (since the denominator is the sum of all the weights), and only then can one resample. The reviewed paper proposes an asynchronous resampling mechanism, where the number of children of particle k depends only on the weights of particles 1 to k. The proposed idea is quite straightforward, but still interesting and potentially very useful. What is a bit lacking in the current version is some motivation for an asynchronous implementation of particle filtering.
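To make the synchronisation point concrete, here is a minimal Python sketch of the standard (synchronous) multinomial resampling step described above. It does not implement the asynchronous scheme of the reviewed paper, and the function name and example values are assumptions made for illustration.

```python
import random


def multinomial_resample(particles, weights, rng=None):
    """Standard synchronous resampling: draw len(particles) children,
    each chosen with probability equal to its parent's normalised weight.
    """
    if len(particles) != len(weights):
        raise ValueError("particles and weights must have the same length")
    rng = rng or random.Random()

    # Synchronisation point: the normalising constant is the sum over
    # ALL particles, so no child can be drawn until every weight is known.
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    normalised = [w / total for w in weights]

    return rng.choices(particles, weights=normalised, k=len(particles))


if __name__ == "__main__":
    # Hypothetical particles (scalar states) and unnormalised weights.
    states = [0.1, 0.4, 0.9, 1.3]
    unnormalised = [0.2, 1.5, 0.3, 2.0]
    print(multinomial_resample(states, unnormalised))
```

The call to `sum(weights)` is what forces every particle to be available before any resampling can happen; an asynchronous scheme has to avoid this global normalisation.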