AITopics

DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph

Neural Information Processing Systems, Mar-27-2025, 15:16:30 GMT

The current paradigm of evaluating Large Language Models (LLMs) through static benchmarks comes with significant limitations, such as vulnerability to data contamination and a lack of adaptability to the evolving capabilities of LLMs. Therefore, evaluation methods that can adapt and generate evaluation data with controlled complexity are urgently needed. In this work, we introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity. Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data. Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

Country: Asia > Middle East > UAE (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

9d0f188c7947eacb0c07f709576824f6-Paper-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 15:16:20 GMT

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Country: North America > United States > Michigan (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

Supplementary Material -- Towards Reliable Model Selection for Unsupervised Domain Adaptation: An Empirical Study and A Certified Baseline Mi Luo 3

Neural Information Processing Systems, Mar-27-2025, 15:16:13 GMT

We first prove the first inequality using Jensen's inequality, which states that for a real-valued, convex function φ with its domain as a subset of R and numbers t [...] Next, we leverage the property of inequalities to prove the second inequality. The proposition still holds if the NLL loss is replaced by any strongly convex loss function. We introduce the respective computation involved in the existing model selection approaches. However, this method has limited effectiveness in scenarios with severe domain shifts between the source and target domains. It also introduces extra hyperparameters for dataset splitting, which can further complicate the model selection process.
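
For reference, the standard finite form of Jensen's inequality, which the truncated sentence above appears to be introducing, can be written as below. The weights a_i are part of the general textbook statement and are an assumption here; the supplementary material may use the unweighted special case a_i = 1/n.

```latex
% Jensen's inequality (finite form), for convex \varphi,
% points t_1, \dots, t_n in its domain, and weights a_i \ge 0 with \sum_i a_i = 1:
\[
  \varphi\!\left( \sum_{i=1}^{n} a_i \, t_i \right)
  \;\le\;
  \sum_{i=1}^{n} a_i \, \varphi(t_i)
\]
```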

artificial intelligence, machine learning, validation accuracy, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Motif-oriented influence maximization for viral marketing in large-scale social networks

Neural Information Processing Systems, Mar-27-2025, 15:14:23 GMT

The influence maximization (IM) problem aims to identify a budgeted set of nodes with the highest potential to influence the largest number of users in a cascade model, a key challenge in viral marketing. Traditional IM approaches consider each user/node independently as a potential target customer. However, in many scenarios the target customers form motifs, where activating only one or a few users within a motif is insufficient for effective viral marketing; this setting has nevertheless received little attention. For instance, if a motif consists of three friends planning to dine together, targeting all three simultaneously is crucial for a restaurant advertisement to succeed. In this paper, we address the motif-oriented influence maximization problem under the linear threshold model. We prove that the motif-oriented IM problem is NP-hard and that the influence function is neither supermodular nor submodular, in contrast to the classical IM setting. To simplify the problem, we establish submodular upper and lower bounds for the influence function. By leveraging the submodular property, we propose a natural greedy strategy that simultaneously maximizes both bounds.
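
As a rough illustration of the kind of greedy strategy the abstract refers to, the sketch below runs generic marginal-gain greedy selection on a toy submodular objective. It is not the authors' algorithm: the names `greedy_select`, `coverage`, and `REACH` are hypothetical, the coverage objective stands in for a submodular bound on the influence function, and no linear threshold diffusion is simulated.

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical toy data: the set of users each seed node can reach.
REACH = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
}


def coverage(seeds: FrozenSet[str]) -> int:
    """Toy submodular objective: number of distinct users reached."""
    covered = set()
    for seed in seeds:
        covered |= REACH[seed]
    return len(covered)


def greedy_select(
    candidates: Sequence[str],
    objective: Callable[[FrozenSet[str]], float],
    budget: int,
) -> List[str]:
    """Pick up to `budget` seeds, each time adding the best marginal gain."""
    chosen: List[str] = []
    remaining = list(candidates)
    for _ in range(min(budget, len(remaining))):
        current = objective(frozenset(chosen))
        # Marginal gain of adding each remaining candidate to the chosen set.
        best = max(
            remaining,
            key=lambda c: objective(frozenset(chosen + [c])) - current,
        )
        if objective(frozenset(chosen + [best])) - current <= 0:
            break  # no candidate improves the objective any further
        chosen.append(best)
        remaining.remove(best)
    return chosen


seeds = greedy_select(["a", "b", "c"], coverage, budget=2)
print(seeds, coverage(frozenset(seeds)))
```

For monotone submodular objectives, this style of greedy selection carries the classical (1 - 1/e) approximation guarantee, which is why establishing submodular bounds on a non-submodular influence function is useful.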

artificial intelligence, machine learning, node, (18 more...)

Country: Asia > China > Guangdong Province (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology > Services (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

c0d5a28eb3949efbedbe3e41751e3ffc-Paper-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 15:14:12 GMT

canada government, classifier, machine learning, (19 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.69)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

UniIF: Unified Molecule Inverse Folding Zhangyang Gao

Neural Information Processing Systems, Mar-27-2025, 15:14:05 GMT

Molecule inverse folding has been a long-standing challenge in chemistry and biology, with the potential to revolutionize drug discovery and material science. Although specialized models have been proposed for different small molecules or macromolecules, few have attempted to unify the learning process, resulting in redundant efforts. Complementary to recent advancements in molecular structure prediction, such as RoseTTAFold All-Atom and AlphaFold3, we propose the unified model UniIF for the inverse folding of all molecules. We unify at two levels: 1) Data-Level: We propose a unified block graph data form for all molecules, including the local frame building and geometric feature initialization.

artificial intelligence, machine learning, molecule, (19 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

9cf5fff2f85310e6ece5bc3a8489b6fa-Paper-Conference.pdf

Neural Information Processing Systems, Mar-27-2025, 15:13:54 GMT

artificial intelligence, machine learning, optimization problem, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games

Neural Information Processing Systems, Mar-27-2025, 15:13:49 GMT

Real-world multi-agent scenarios often involve mixed motives, demanding altruistic agents capable of self-protection against potential exploitation. However, existing approaches often struggle to achieve both objectives. In this paper, based on the idea that empathic responses are modulated by inferred social relationships between agents, we propose LASE (Learning to balance Altruism and Self-interest based on Empathy), a distributed multi-agent reinforcement learning algorithm that fosters altruistic cooperation through gifting while avoiding exploitation by other agents in mixed-motive games. LASE allocates a portion of its rewards to co-players as gifts, with this allocation adapting dynamically based on the social relationship -- a metric evaluating the friendliness of co-players estimated by counterfactual reasoning. In particular, the social relationship scores each co-player by comparing the estimated Q-function of the current joint action to a counterfactual baseline that marginalizes the co-player's action, with its action distribution inferred by a perspective-taking module. Comprehensive experiments are performed in spatially and temporally extended mixed-motive games, demonstrating LASE's ability to promote group collaboration without compromising fairness and its capacity to adapt policies to various types of interactive co-players.
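
The counterfactual comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the LASE implementation: the function name `social_relationship` and its arguments are hypothetical, the co-player's action space is assumed discrete, and the Q-values and inferred policy are passed in as plain lists rather than produced by learned networks.

```python
from typing import Sequence


def social_relationship(
    q_values: Sequence[float],
    coplayer_policy: Sequence[float],
    taken_action: int,
) -> float:
    """Score a co-player's action against a counterfactual baseline.

    q_values[k]        -- estimated Q for the joint action in which the
                          co-player plays action k (all else held fixed)
    coplayer_policy[k] -- inferred probability that the co-player plays k
    taken_action       -- index of the action the co-player actually took
    """
    if len(q_values) != len(coplayer_policy):
        raise ValueError("q_values and coplayer_policy must align")
    # Baseline: expected Q after marginalizing out the co-player's action.
    baseline = sum(p * q for p, q in zip(coplayer_policy, q_values))
    # Positive score: the co-player's actual choice helped more than expected.
    return q_values[taken_action] - baseline


# Toy case: two co-player actions, uniform inferred policy.
print(social_relationship([1.0, 3.0], [0.5, 0.5], taken_action=1))
```

A positive score marks a co-player whose action was better for the agent than its expected behaviour, which is the sense in which such a metric can gate how much reward is gifted.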

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: