Goto

Collaborating Authors

 motif




Motif-oriented influence maximization for viral marketing in large-scale social networks

Neural Information Processing Systems

The influence maximization (IM) problem aims to identify a budgeted set of nodes with the highest potential to influence the largest number of users in a cascade model, a key challenge in viral marketing. Traditional \emph{IM} approaches consider each user/node independently as a potential target customer. However, in many scenarios, the target customers comprise motifs, where activating only one or a few users within a motif is insufficient for effective viral marketing, which, nevertheless, receives little attention. For instance, if a motif of three friends planning to dine together, targeting all three simultaneously is crucial for a restaurant advertisement to succeed.In this paper, we address the motif-oriented influence maximization problem under the linear threshold model. We prove that the motif-oriented IM problem is NP-hard and that the influence function is neither supermodular nor submodular, in contrast to the classical \emph{IM} setting.To simplify the problem, we establish the submodular upper and lower bounds for the influence function. By leveraging the submodular property, we propose a natural greedy strategy that simultaneously maximizes both bounds. Our algorithm has an approximation ratio of $\tau\cdot (1-1/e-\varepsilon)$ and a near-linear time complexity of $O((k+l)(m+\eta)\log \eta/\varepsilon^2)$.Experimental results on diverse datasets confirm the effectiveness of our approach in motif maximization.


MOTIF: Multi-strategy Optimization via Turn-based Interactive Framework

Kiet, Nguyen Viet Tuan, Van Tung, Dao, Dao, Tran Cong, Binh, Huynh Thi Thanh

arXiv.org Artificial Intelligence

Designing effective algorithmic components remains a fundamental obstacle in tackling NP-hard combinatorial optimization problems (COPs), where solvers often rely on carefully hand-crafted strategies. Despite recent advances in using large language models (LLMs) to synthesize high-quality components, most approaches restrict the search to a single element - commonly a heuristic scoring function - thus missing broader opportunities for innovation. In this paper, we introduce a broader formulation of solver design as a multi-strategy optimization problem, which seeks to jointly improve a set of interdependent components under a unified objective. To address this, we propose Multi-strategy Optimization via Turn-based Interactive Framework (MOTIF) - a novel framework based on Monte Carlo Tree Search that facilitates turn-based optimization between two LLM agents. At each turn, an agent improves one component by leveraging the history of both its own and its opponent's prior updates, promoting both competitive pressure and emergent cooperation. This structured interaction broadens the search landscape and encourages the discovery of diverse, high-performing solutions. Experiments across multiple COP domains show that MOTIF consistently outperforms state-of-the-art methods, highlighting the promise of turn-based, multi-agent prompting for fully automated solver design.


Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities (II)

Carpentier, Alexandra, Giraud, Christophe, Verzelen, Nicolas

arXiv.org Machine Learning

A fundamental theoretical question in network analysis is to determine under which conditions community recovery is possible in polynomial time in the Stochastic Block Model (SBM). When the number $K$ of communities remains smaller than $\sqrt{n}$ --where $n$ denotes the number of nodes--, non-trivial community recovery is possible in polynomial time above, and only above, the Kesten--Stigum (KS) threshold, originally postulated using arguments from statistical physics. When $K \geq \sqrt{n}$, Chin, Mossel, Sohn, and Wein recently proved that, in the \emph{sparse regime}, community recovery in polynomial time is achievable below the KS threshold by counting non-backtracking paths. This finding led them to postulate a new threshold for the many-communities regime $K \geq \sqrt{n}$. Subsequently, Carpentier, Giraud, and Verzelen established the failure of low-degree polynomials below this new threshold across all density regimes, and demonstrated successful recovery above the threshold in certain moderately sparse settings. While these results provide strong evidence that, in the many community setting, the computational barrier lies at the threshold proposed in~Chin et al., the question of achieving recovery above this threshold still remains open in most density regimes. The present work is a follow-up to~Carpentier et al., in which we prove Conjecture~1.4 stated therein by: \\ 1- Constructing a family of motifs satisfying specific structural properties; and\\ 2- Proving that community recovery is possible above the proposed threshold by counting such motifs.\\ Our results complete the picture of the computational barrier for community recovery in the SBM with $K \geq \sqrt{n}$ communities. They also indicate that, in moderately sparse regimes, the optimal algorithms appear to be fundamentally different from spectral methods.


SPHINX: A Synthetic Environment for Visual Perception and Reasoning

Alam, Md Tanvirul, Aggarwal, Saksham, Chae, Justin Yang, Rastogi, Nidhi

arXiv.org Artificial Intelligence

We present Sphinx, a synthetic environment for visual perception and reasoning that targets core cognitive primitives. Sphinx procedurally generates puzzles using motifs, tiles, charts, icons, and geometric primitives, each paired with verifiable ground-truth solutions, enabling both precise evaluation and large-scale dataset construction. The benchmark covers 25 task types spanning symmetry detection, geometric transformations, spatial reasoning, chart interpretation, and sequence prediction. Evaluating recent large vision-language models (LVLMs) shows that even state-of-the-art GPT-5 attains only 51.1% accuracy, well below human performance. Finally, we demonstrate that reinforcement learning with verifiable rewards (RLVR) substantially improves model accuracy on these tasks and yields gains on external visual reasoning benchmarks, highlighting its promise for advancing multimodal reasoning.


Interpreting GFlowNets for Drug Discovery: Extracting Actionable Insights for Medicinal Chemistry

S, Amirtha Varshini A, Ranasinghe, Duminda S., Tam, Hok Hei

arXiv.org Artificial Intelligence

Generative Flow Networks, or GFlowNets, offer a promising framework for molecular design, but their internal decision policies remain opaque. This limits adoption in drug discovery, where chemists require clear and interpretable rationales for proposed structures. We present an interpretability framework for SynFlowNet, a GFlowNet trained on documented chemical reactions and purchasable starting materials that generates both molecules and the synthetic routes that produce them. Our approach integrates three complementary components. Gradient based saliency combined with counterfactual perturbations identifies which atomic environments influence reward and how structural edits change molecular outcomes. Sparse autoencoders reveal axis aligned latent factors that correspond to physicochemical properties such as polarity, lipophilicity, and molecular size. Motif probes show that functional groups including aromatic rings and halogens are explicitly encoded and linearly decodable from the internal embeddings. Together, these results expose the chemical logic inside SynFlowNet and provide actionable and mechanistic insight that supports transparent and controllable molecular design.



Inhomogeneous Hypergraph Clustering with Applications

Pan Li, Olgica Milenkovic

Neural Information Processing Systems

However, this assumption fails to leverage the fact that different subsets of vertices within the same hyperedge may have different structural importance. We hence propose a new hypergraph clustering technique, termed inhomogeneous hypergraph partitioning, which assigns different costs to different hyperedge cuts.