interaction type
CFRecs: Counterfactual Recommendations on Real Estate User Listing Interaction Graphs
Mousavi, Seyedmasoud, Xu, Ruomeng, Zhu, Xiaojing
Graph-structured data is ubiquitous and powerful in representing complex relationships in many online platforms. While graph neural networks (GNNs) are widely used to learn from such data, counterfactual graph learning has emerged as a promising approach to improve model interpretability. Counterfactual explanation research focuses on identifying a counterfactual graph that is similar to the original but leads to different predictions. These explanations optimize two objectives simultaneously: the sparsity of changes in the counterfactual graph and the validity of its predictions. Building on these qualitative optimization goals, this paper introduces CFRecs, a novel framework that transforms counterfactual explanations into actionable insights. CFRecs employs a two-stage architecture consisting of a graph neural network (GNN) and a graph variational auto-encoder (Graph-VAE) to strategically propose minimal yet high-impact changes in graph structure and node attributes to drive desirable outcomes in recommender systems. We apply CFRecs to Zillow's graph-structured data to deliver actionable recommendations for both home buyers and sellers with the goal of helping them navigate the competitive housing market and achieve their homeownership goals. Experimental results on Zillow's user-listing interaction data demonstrate the effectiveness of CFRecs, which also provides a fresh perspective on recommendations using counterfactual reasoning in graphs.
Improving Neutrino Oscillation Measurements through Event Classification
Ellis, Sebastian A. R., Hackett, Daniel C., Li, Shirley Weishi, Machado, Pedro A. N., Tame-Narvaez, Karla
Precise neutrino energy reconstruction is essential for next-generation long-baseline oscillation experiments, yet current methods remain limited by large uncertainties in neutrino-nucleus interaction modeling. Even so, it is well established that different interaction channels produce systematically varying amounts of missing energy and therefore yield different reconstruction performance--information that standard calorimetric approaches do not exploit. We introduce a strategy that incorporates this structure by classifying events according to their underlying interaction type prior to energy reconstruction. Using supervised machine-learning techniques trained on labeled generator events, we leverage intrinsic kinematic differences among quasi-elastic scattering, meson-exchange current, resonance production, and deep-inelastic scattering processes. A cross-generator testing framework demonstrates that this classification approach is robust to microphysics mismodeling and, when applied to a simulated DUNE $ฮฝ_ฮผ$ disappearance analysis, yields improved accuracy and sensitivity. These results highlight a practical path toward reducing reconstruction-driven systematics in future oscillation measurements.
Social-Physical Interactions with Virtual Characters: Evaluating the Impact of Physicality through Encountered-Type Haptics
Godden, Eric, Groenewegen, Jacquie, Wheeler, Michael, Pan, Matthew K. X. J.
This work investigates how robot-mediated physicality influences the perception of social-physical interactions with virtual characters. ETHOS (Encountered-Type Haptics for On-demand Social interaction) is an encountered-type haptic display that integrates a torque-controlled manipulator and interchangeable props with a VR headset to enable three gestures: object handovers, fist bumps, and high fives. We conducted a user study to examine how ETHOS adds physicality to virtual character interactions and how this affects presence, realism, enjoyment, and connection metrics. Each participant experienced one interaction under three conditions: no physicality (NP), static physicality (SP), and dynamic physicality (DP). SP extended the purely virtual baseline (NP) by introducing tangible props for direct contact, while DP further incorporated motion and impact forces to emulate natural touch. Results show presence increased stepwise from NP to SP to DP. Realism, enjoyment, and connection also improved with added physicality, though differences between SP and DP were not significant. Comfort remained consistent across conditions, indicating no added psychological friction. These findings demonstrate the experiential value of ETHOS and motivate the integration of encountered-type haptics into socially meaningful VR experiences.
GFlowNets for Learning Better Drug-Drug Interaction Representations
Drug-drug interactions pose a significant challenge in clinical pharmacology, with severe class imbalance among interaction types limiting the effectiveness of predictive models. Common interactions dominate datasets, while rare but critical interactions remain underrepre-sented, leading to poor model performance on infrequent cases. Existing methods often treat DDI prediction as a binary problem, ignoring class-specific nuances and exacerbating bias toward frequent interactions. To address this, we propose a framework combining Generative Flow Networks (GFlowNet) with Variational Graph Autoencoders (VGAE) to generate synthetic samples for rare classes, improving model balance and generate effective and novel DDI pairs. Our approach enhances predictive performance across interaction types, ensuring better clinical reliability.
Benchmarking Correctness and Security in Multi-Turn Code Generation
Rawal, Ruchit, Chiang, Jeffrey Yang Fan, Shen, Chihao, Tian, Jeffery Siyuan, Mahajan, Aastha, Goldstein, Tom, Chen, Yizheng
AI coding assistants powered by large language models (LLMs) have transformed software development, significantly boosting productivity. While existing benchmarks evaluate the correctness and security of LLM-generated code, they are typically limited to single-turn tasks that do not reflect the iterative nature of real-world development. We introduce MT-Sec, the first benchmark to systematically evaluate both correctness and security in multi-turn coding scenarios. We construct this using a synthetic data pipeline that transforms existing single-turn tasks into semantically aligned multi-turn interaction sequences, allowing reuse of original test suites while modeling the complexity of real-world coding processes. We evaluate 32 open- and closed-source models, and three agent-scaffolding on MT-Sec and observe a consistent 20-27% drop in "correct and secure" outputs from single-turn to multi-turn settings -- even among state-of-the-art models. Beyond full-program generation, we also evaluate models on multi-turn code-diff generation -- an unexplored yet practically relevant setting -- and find that models perform worse here, with increased rates of functionally incorrect and insecure outputs. Finally, we find that while agent scaffoldings boost single-turn code generation performance, they are not quite as effective in multi-turn evaluations. Together, these findings highlight the need for benchmarks that jointly evaluate correctness and security in multi-turn, real-world coding workflows.
Compose and Fuse: Revisiting the Foundational Bottlenecks in Multimodal Reasoning
Wang, Yucheng, Hou, Yifan, Javadov, Aydin, Akhtar, Mubashara, Sachan, Mrinmaya
Multimodal large language models (MLLMs) promise enhanced reasoning by integrating diverse inputs such as text, vision, and audio. Yet cross-modal reasoning remains underexplored, with conflicting reports on whether added modalities help or harm performance. These inconsistencies stem from a lack of controlled evaluation frameworks and analysis of models' internals to isolate when and why modality interactions support or undermine reasoning. We address this gap through a logic-grounded evaluation framework that categorizes multimodal reasoning into six interaction patterns, varying how facts are distributed across modalities and logically combined. Empirically, additional modalities enhance reasoning only when they provide independent and sufficient reasoning paths, while redundant or chained entailment support often hurts performance. Moreover, reasoning degrades in three systematic ways: weaker modalities drag down overall performance, conflicts bias preference toward certain modalities, and joint signals from different modalities fail to be integrated effectively. Therefore, we identify two core failures: task-composition bottleneck, where recognition and reasoning cannot be jointly executed in one pass, and fusion bottleneck, where early integration introduces bias. For further investigation, we find that attention patterns fail to encode fact usefulness, but a simple two-step prompting (recognize then reason) restores performance, confirming the task-composition bottleneck. Moreover, modality identity remains recoverable in early layers, and softening attention in early fusion improves reasoning, highlighting biased fusion as another failure mode. Overall, our findings show that integration, not perception, is the main barrier to multimodal reasoning, suggesting composition-aware training and early fusion control as promising directions.
LINKER: Learning Interactions Between Functional Groups and Residues With Chemical Knowledge-Enhanced Reasoning and Explainability
Pham, Phuc, Nguyen, Viet Thanh Duy, Hy, Truong-Son
Accurate identification of interactions between protein residues and ligand functional groups is essential to understand molecular recognition and guide rational drug design. Existing deep learning approaches for protein-ligand interpretability often rely on 3D structural input or use distance-based contact labels, limiting both their applicability and biological relevance. We introduce LINKER, the first sequence-based model to predict residue-functional group interactions in terms of biologically defined interaction types, using only protein sequences and the ligand SMILES as input. LINKER is trained with structure-supervised attention, where interaction labels are derived from 3D protein-ligand complexes via functional group-based motif extraction. By abstracting ligand structures into functional groups, the model focuses on chemically meaningful substructures while predicting interaction types rather than mere spatial proximity. Crucially, LINKER requires only sequence-level input at inference time, enabling large-scale application in settings where structural data is unavailable. Experiments on the LP-PDBBind benchmark demonstrate that structure-informed supervision over functional group abstractions yields interaction predictions closely aligned with ground-truth biochemical annotations.
Case-Based Reasoning Enhances the Predictive Power of LLMs in Drug-Drug Interaction
Liu, Guangyi, Zhang, Yongqi, Liu, Xunyuan, Yao, Quanming
Drug-drug interaction (DDI) prediction is critical for treatment safety. While large language models (LLMs) show promise in pharmaceutical tasks, their effectiveness in DDI prediction remains challenging. Inspired by the well-established clinical practice where physicians routinely reference similar historical cases to guide their decisions through case-based reasoning (CBR), we propose CBR-DDI, a novel framework that distills pharmacological principles from historical cases to improve LLM reasoning for DDI tasks. CBR-DDI constructs a knowledge repository by leveraging LLMs to extract pharmacological insights and graph neural networks (GNNs) to model drug associations. A hybrid retrieval mechanism and dual-layer knowledge-enhanced prompting allow LLMs to effectively retrieve and reuse relevant cases. We further introduce a representative sampling strategy for dynamic case refinement. Extensive experiments demonstrate that CBR-DDI achieves state-of-the-art performance, with a significant 28.7% accuracy improvement over both popular LLMs and CBR baseline, while maintaining high interpretability and flexibility.