Education
The Evolution of Learning Algorithms for Artificial Neural Networks
In this paper we investigate a neural network model in which weights between computational nodes are modified according to a local learning rule. To determine whether local learning rules are sufficient for learning, we encode the network architectures and learning dynamics genetically and then apply selection pressure to evolve networks capable of learning the four boolean functions of one variable. The successful networks are analysed and we show how learning behaviour emerges as a distributed property of the entire network. Finally the utility of genetic algorithms as a tool of discovery is discussed.
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Zhang, Kaichen, Wu, Keming, Yang, Zuhao, Li, Bo, Hu, Kairui, Wang, Bin, Liu, Ziwei, Li, Xingxuan, Bing, Lidong
Recent advancements in large reasoning models have fueled growing interest in extending such capabilities to multimodal domains. However, despite notable progress in visual reasoning, the lack of transparent and reproducible data curation and training strategies remains a major barrier to scalable research. In this work, we introduce OpenMMReasoner, a fully transparent two-stage recipe for multimodal reasoning spanning supervised fine-tuning (SFT) and reinforcement learning (RL). In the SFT stage, we construct an 874K-sample cold-start dataset with rigorous step-by-step validation, providing a strong foundation for reasoning capabilities. The subsequent RL stage leverages a 74K-sample dataset across diverse domains to further sharpen and stabilize these abilities, resulting in a more robust and efficient learning process. Extensive evaluations demonstrate that our training recipe not only surpasses strong baselines but also highlights the critical role of data quality and training design in shaping multimodal reasoning performance. Notably, our method achieves a 11.6% improvement over the Qwen2.5-VL-7B-Instruct baseline across nine multimodal reasoning benchmarks, establishing a solid empirical foundation for future large-scale multimodal reasoning research. We open-sourced all our codes, pipeline, and data at https://github.com/EvolvingLMMs-Lab/OpenMMReasoner.
Spilling the Beans: Teaching LLMs to Self-Report Their Hidden Objectives
Li, Chloe, Phuong, Mary, Tan, Daniel
As AI systems become more capable of complex agentic tasks, they also become more capable of pursuing undesirable objectives and causing harm. Previous work has attempted to catch these unsafe instances by interrogating models directly about their objectives and behaviors. However, the main weakness of trusting interrogations is that models can lie. We propose self-report fine-tuning (SRFT), a simple supervised fine-tuning technique that trains models to occasionally make factual mistakes, then admit them when asked. We show that the admission of factual errors in simple question-answering settings generalizes out-of-distribution (OOD) to the admission of hidden misaligned objectives in adversarial agentic settings. We evaluate SRFT in OOD stealth tasks, where models are instructed to complete a hidden misaligned objective alongside a user-specified objective without being caught by monitoring. After SRFT, models are more likely to confess the details of their hidden objectives when interrogated, even under strong pressure not to disclose them. Interrogation on SRFT models can detect hidden objectives with near-ceiling performance (F1 score = 0.98), while the baseline model lies when interrogated under the same conditions (F1 score = 0). Interrogation on SRFT models can further elicit the content of the hidden objective, recovering 28-100% details, compared to 0% details recovered in the baseline model and by prefilled assistant turn attacks. This provides a promising technique for promoting honesty propensity and incriminating misaligned AIs.
Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
Guo, Hanze, Yao, Jing, Zhou, Xiao, Yi, Xiaoyuan, Xie, Xing
As large language models (LLMs) become increasingly integrated into applications serving users across diverse cultures, communities and demographics, it is critical to align LLMs with pluralistic human values beyond average principles (e.g., HHH). In psychological and social value theories such as Schwartz's Value Theory, pluralistic values are represented by multiple value dimensions paired with various priorities. However, existing methods encounter two challenges when aligning with such fine-grained value objectives: 1) they often treat multiple values as independent and equally important, ignoring their interdependence and relative priorities (value complexity); 2) they struggle to precisely control nuanced value priorities, especially those underrepresented ones (value steerability). To handle these challenges, we propose COUPLE, a COUnterfactual reasoning framework for PLuralistic valuE alignment. It introduces a structural causal model (SCM) to feature complex interdependency and prioritization among features, as well as the causal relationship between high-level value dimensions and behaviors. Moreover, it applies counterfactual reasoning to generate outputs aligned with any desired value objectives. Benefitting from explicit causal modeling, COUPLE also provides better interpretability. We evaluate COUPLE on two datasets with different value systems and demonstrate that COUPLE advances other baselines across diverse types of value objectives.
From Simulation to Strategy: Automating Personalized Interaction Planning for Conversational Agents
Chang, Wen-Yu, Huang, Tzu-Hung, Chen, Chih-Ho, Chen, Yun-Nung
Abstract--Amid the rapid rise of agentic dialogue models, realistic user-simulator studies are essential for tuning effective conversation strategies. This work investigates a sales-oriented agent that adapts its dialogue based on user profiles spanning age, gender, and occupation. While age and gender influence overall performance, occupation produces the most pronounced differences in conversational intent. Leveraging this insight, we introduce a lightweight, occupation-conditioned strategy that guides the agent to prioritize intents aligned with user preferences, resulting in shorter and more successful dialogues. Our findings highlight the importance of rich simulator profiles and demonstrate how simple persona-informed strategies can enhance the effectiveness of sales-oriented dialogue systems. With the ongoing evolution of Agentic AI, researchers have begun to explore its application across diverse domains. Among these, dialogue systems designed for business recommendation tasks have attracted significant attention.
Variational Supervised Contrastive Learning
Wang, Ziwen, Fan, Jiajun, Nguyen, Thao, Ji, Heng, Liu, Ge
Contrastive learning has proven to be highly efficient and adaptable in shaping representation spaces across diverse modalities by pulling similar samples together and pushing dissimilar ones apart. However, two key limitations persist: (1) Without explicit regulation of the embedding distribution, semantically related instances can inadvertently be pushed apart unless complementary signals guide pair selection, and (2) excessive reliance on large in-batch negatives and tailored augmentations hinders generalization. To address these limitations, we propose Variational Supervised Contrastive Learning (VarCon), which reformulates supervised contrastive learning as variational inference over latent class variables and maximizes a posterior-weighted evidence lower bound (ELBO) that replaces exhaustive pair-wise comparisons for efficient class-aware matching and grants fine-grained control over intra-class dispersion in the embedding space. Trained exclusively on image data, our experiments on CIFAR-10, CIFAR-100, ImageNet-100, and ImageNet-1K show that VarCon (1) achieves state-of-the-art performance for contrastive learning frameworks, reaching 79.36% Top-1 accuracy on ImageNet-1K and 78.29% on CIFAR-100 with a ResNet-50 encoder while converging in just 200 epochs; (2) yields substantially clearer decision boundaries and semantic organization in the embedding space, as evidenced by KNN classification, hierarchical clustering results, and transfer-learning assessments; and (3) demonstrates superior performance in few-shot learning than supervised baseline and superior robustness across various augmentation strategies. Our code is available at https://github.com/ziwenwang28/VarContrast.
Enhancing SPARQL Query Rewriting for Complex Ontology Alignments
Ondo, Anicet Lepetit, Capus, Laurence, Bousso, Mamadou
SPARQL query rewriting is a fundamental mechanism for uniformly querying heterogeneous ontologies in the Linked Data Web. However, the complexity of ontology alignments, particularly rich correspondences (c: c), makes this process challenging. Existing approaches primarily focus on simple (s: s) and par tially complex (s: c) alignments, thereby overlooking the challenges posed by more expressive alignments. Moreover, the intricate syntax of SPARQL presents a barrier for non - expert users seeking to fully exploit the knowledge encapsulated in ontologies. T his article proposes an innovative approach for the automatic rewriting of SPARQL queries from a source ontology to a target ontology, based on a user's need expressed in natural language. It leverages the principles of equivalence transitivity as well as the advanced capabilities of large language models such as GPT - 4 . By integrating these elements, this approach stands out for its ability to efficiently handle complex alignments, particularly (c: c) correspondences, by fully exploiting their expressivene ss. Additionally, it facilitates access to aligned ontologies for users unfamiliar with SPARQL, providing a flexible solution for querying heterogeneous data. I n the Linked Data Web, aligned ontologies play a crucial role in facilitating interoperability between different data sources.
Exploring Ordinal Bias in Action Recognition for Instructional Videos
Kim, Joochan, Jung, Minjoon, Zhang, Byoung-Tak
Action recognition models have achieved promising results in understanding instructional videos. However, they often rely on dominant, dataset-specific action sequences rather than true video comprehension, a problem that we define as ordinal bias. To address this issue, we propose two effective video manipulation methods: Action Masking, which masks frames of frequently co-occurring actions, and Sequence Shuffling, which randomizes the order of action segments. Through comprehensive experiments, we demonstrate that current models exhibit significant performance drops when confronted with nonstandard action sequences, underscoring their vulnerability to ordinal bias. Our findings emphasize the importance of rethinking evaluation strategies and developing models capable of generalizing beyond fixed action patterns in diverse instructional videos. Due to the dominant action pair'Take-Background', the model fails to predict the action'Open.' Action recognition in instructional videos has witnessed remarkable progress, primarily driven by models that excel in curated benchmark datasets (Farha & Gall, 2019; Ishikawa et al., 2021; Li et al., 2020; Yi et al., 2021).
Point-PNG: Conditional Pseudo-Negatives Generation for Point Cloud Pre-Training
Mahendren, Sutharsan, Rahman, Saimunur, Koniusz, Piotr, Fernando, Tharindu, Sridharan, Sridha, Fookes, Clinton, Moghadam, Peyman
We propose Point-PNG, a novel self-supervised learning framework that generates conditional pseudo-negatives in the latent space to learn point cloud representations that are both discriminative and transformation-sensitive. Conventional self-supervised learning methods focus on achieving invariance, discarding transformation-specific information. Recent approaches incorporate transformation sensitivity by explicitly modeling relationships between original and transformed inputs. However, they often suffer from an invariant-collapse phenomenon, where the predictor degenerates into identity mappings, resulting in latent representations with limited variation across transformations. To address this, we propose Point-PNG that explicitly penalizes invariant collapse through pseudo-negatives generation, enabling the network to capture richer transformation cues while preserving discriminative representations. To this end, we introduce a parametric network, COnditional Pseudo-Negatives Embedding (COPE), which learns localized displacements induced by transformations within the latent space. A key challenge arises when jointly training COPE with the MAE, as it tends to converge to trivial identity mappings. To overcome this, we design a loss function based on pseudo-negatives conditioned on the transformation, which penalizes such trivial invariant solutions and enforces meaningful representation learning. We validate Point-PNG on shape classification and relative pose estimation tasks, showing competitive performance on ModelNet40 and ScanObjectNN under challenging evaluation protocols, and achieving superior accuracy in relative pose estimation compared to supervised baselines.
Drone strikes on Sudan kindergarten, hospital kill dozens, local official says
Sudanese refugee children watch the sunset in the Tine transit camp amid the conflict between the paramilitary Rapid Support Forces (RSF) and the Sudanese Army, in eastern Chad on Nov. 23. Port Sudan, Sudan - A recent paramilitary drone attack on the army-held town of Kalogi in Sudan's South Kordofan state hit a kindergarten and a hospital, killing dozens of civilians including children, a local official said Sunday. The attack, which took place on Thursday, involved three strikes, first a kindergarten, then a hospital and a third time as people tried to rescue the children, Essam al-Din al-Sayed, head of the Kalogi administrative unit, said using a Starlink satellite internet connection. He blamed the assault on the Rapid Support Forces and their ally, the Sudan People's Liberation Movement-North faction (SPLM-N) led by Abdelaziz al-Hilu, which controls much of South Kordofan and parts of Blue Nile state. In a time of both misinformation and too much information, quality journalism is more crucial than ever.