Model-Based Reasoning
TIES-Merging: Resolving Interference When Merging Models
Derek Tam
Transfer learning - i.e., further fine-tuning a pre-trained model on a downstream task - can confer significant advantages, including improved downstream performance, faster convergence, and better sample efficiency. These advantages have led to a proliferation of task-specific fine-tuned models, which typically can only perform a single task and do not benefit from one another. Recently, model merging techniques have emerged as a solution to combine multiple task-specific models into a single multitask model without performing additional training. However, existing merging methods often ignore the interference between parameters of different models, resulting in large performance drops when merging multiple models. In this paper, we demonstrate that prior merging techniques inadvertently lose valuable information due to two major sources of interference: (a) interference due to redundant parameter values and (b) disagreement on the sign of a given parameter's values across models.
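As a concrete illustration of how such interference can be handled at merge time, the following minimal NumPy sketch merges task vectors (fine-tuned minus pre-trained weights) by trimming low-magnitude entries and resolving sign disagreements before averaging. The trim fraction and the exact trim/elect/average steps are illustrative assumptions, not necessarily the paper's procedure.

import numpy as np

def merge_task_vectors(task_vectors, keep_frac=0.2):
    # task_vectors: list of 1-D arrays, each equal to (fine-tuned weights - pre-trained weights)
    tvs = np.stack([tv.astype(float) for tv in task_vectors])   # (num_models, num_params)

    # (a) Redundant parameters: keep only each model's largest-magnitude updates.
    k = max(1, int(keep_frac * tvs.shape[1]))
    for tv in tvs:
        cutoff = np.partition(np.abs(tv), -k)[-k]
        tv[np.abs(tv) < cutoff] = 0.0

    # (b) Sign disagreement: elect a sign per parameter, then average only the
    # entries whose sign agrees with the elected one.
    elected = np.sign(tvs.sum(axis=0))
    agree = (np.sign(tvs) == elected) & (tvs != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged = (tvs * agree).sum(axis=0) / counts
    return merged   # add back onto the pre-trained weights to obtain the multitask model

For example, where two models both moved a parameter in the same direction, the merged value is their mean; where they disagree, only the values matching the elected sign contribute.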
Reviews: Regression Planning Networks
This submission drew a great deal of discussion -- primarily on the point of the role of learning. All reviewers agreed that the approach had the potential to learn interesting, non-trivial things but did not feel the current experiments demonstrated these effectively -- despite strong performance on the task. Some examples of questions that were not answered by the main draft but came up in the discussion: [Training Data] The training data provides edges in the dependency graph, subgoals, and (predicate value, image) pairs. One question was whether the union of the seen dependency graphs constituted the entire true underlying graph. Similarly, do all predicate-object pairs occur?
On the Robustness of Mechanism Design under Total Variation Distance
We study the problem of designing mechanisms when agents' valuation functions are drawn from unknown and correlated prior distributions. In particular, we are given a prior distribution D, and we are interested in designing a (truthful) mechanism that has good performance for all "true distributions" that are close to D in Total Variation (TV) distance. We show that DSIC and BIC mechanisms in this setting are strongly robust with respect to TV distance, for any bounded objective function O, extending a recent result of Brustle et al. ([BCD20], EC 2020). At the heart of our result is a fundamental duality property of total variation distance. As direct applications of our result, we (i) demonstrate how to find approximately revenue-optimal and approximately BIC mechanisms for weakly dependent prior distributions; (ii) show how to find correlation-robust mechanisms when only "noisy" versions of marginals are accessible, extending recent results of Bei et al.
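For reference, the total variation distance between two distributions $P$ and $Q$ over a measurable space $(\Omega, \mathcal{F})$, the metric in which robustness is measured above, is

\[ d_{\mathrm{TV}}(P, Q) \;=\; \sup_{A \in \mathcal{F}} \lvert P(A) - Q(A) \rvert \;=\; \tfrac{1}{2} \lVert P - Q \rVert_1 , \]

so requiring good performance for every true distribution within TV distance $\varepsilon$ of D amounts to robustness against an adversary that may shift at most an $\varepsilon$ fraction of probability mass.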
Causal Effect Identification in Uncertain Causal Networks
Causal identification is at the core of the causal inference literature, where complete algorithms have been proposed to identify causal queries of interest. The validity of these algorithms hinges on the restrictive assumption of having access to a correctly specified causal structure. In this work, we study the setting where a probabilistic model of the causal structure is available. Specifically, the edges in a causal graph exist with some uncertainty, which may, for example, represent a domain expert's degree of belief. Alternatively, the uncertainty about an edge may reflect the confidence of a particular statistical test. The question that naturally arises in this setting is: Given such a probabilistic graph and a specific causal effect of interest, what is the subgraph which has the highest plausibility and for which the causal effect is identifiable? We show that answering this question reduces to solving an NP-complete combinatorial optimization problem which we call the edge ID problem. We propose efficient algorithms to approximate this problem and evaluate them on both real-world networks and randomly generated graphs.
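A brute-force sketch of this search problem is shown below; is_identifiable is a hypothetical oracle (for example, an implementation of the ID algorithm), and scoring a subgraph by the product of its edge probabilities is an illustrative choice of plausibility, not necessarily the paper's.

from itertools import combinations

def most_plausible_identifiable_subgraph(edges, edge_prob, is_identifiable, query):
    # edges          : list of directed edges (u, v) of the probabilistic causal graph
    # edge_prob      : dict mapping each edge to its probability of existing
    # is_identifiable: hypothetical oracle deciding whether `query` is identifiable
    #                  in a given edge subset
    # query          : the causal effect of interest, e.g., ("X", "Y")
    best_score, best_subgraph = float("-inf"), None
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            included = set(subset)
            # Plausibility: product of p(e) for kept edges and 1 - p(e) for dropped ones.
            score = 1.0
            for e in edges:
                score *= edge_prob[e] if e in included else (1.0 - edge_prob[e])
            if score > best_score and is_identifiable(included, query):
                best_score, best_subgraph = score, included
    return best_subgraph, best_score

The enumeration is exponential in the number of uncertain edges, consistent with the NP-completeness of the edge ID problem and motivating the approximation algorithms the paper proposes.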
Reviews: Sample Complexity of Automated Mechanism Design
This paper deals with the sample complexity of automated mechanism design for the problem of maximizing revenue in a combinatorial auction (CA). Given a class of auction mechanisms, an automated mechanism design procedure takes as input samples from the bidders' valuation distribution (which, in practice, may be records of previous auctions) and outputs a choice of auction mechanism with high revenue. This work presents several upper bounds on the sample complexities for various auction classes. Although Morgenstern and Roughgarden (reference [19] in this paper) studied the same problem of bounding the sample complexities of CA, their work only deals with "simple auctions" which can be reduced to the single-bidder setting. In contrast, this paper studies the hierarchy of deterministic CA families consisting of VCG-based mechanisms.
Sample Complexity of Automated Mechanism Design
Maria-Florina F. Balcan, Tuomas Sandholm, Ellen Vitercik
The design of revenue-maximizing combinatorial auctions, i.e., multi-item auctions over bundles of goods, is one of the most fundamental problems in computational economics, unsolved even for two bidders and two items for sale. In traditional economic models, it is assumed that the bidders' valuations are drawn from an underlying distribution and that the auction designer has perfect knowledge of this distribution. Despite this strong and oftentimes unrealistic assumption, it is remarkable that the revenue-maximizing combinatorial auction remains unknown. In recent years, automated mechanism design has emerged as one of the most practical and promising approaches to designing high-revenue combinatorial auctions. The most scalable automated mechanism design algorithms take as input samples from the bidders' valuation distribution and then search for a high-revenue auction in a rich auction class. In this work, we provide the first sample complexity analysis for the standard hierarchy of deterministic combinatorial auction classes used in automated mechanism design. In particular, we provide tight sample complexity bounds on the number of samples needed to guarantee that the empirical revenue of the designed mechanism on the samples is close to its expected revenue on the underlying, unknown distribution over bidder valuations, for each of the auction classes in the hierarchy. In addition to helping set automated mechanism design on firm foundations, our results also push the boundaries of learning theory. In particular, the hypothesis functions used in our contexts are defined through multi-stage combinatorial optimization procedures, rather than simple decision boundaries, as are common in machine learning.
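In the usual uniform-convergence form, the kind of guarantee described above can be written as follows (the notation is illustrative): for an auction class $\mathcal{M}$ and accuracy parameters $\varepsilon, \delta$, drawing $N \ge N_{\mathcal{M}}(\varepsilon, \delta)$ valuation profiles $v^{(1)}, \dots, v^{(N)} \sim D$ guarantees that, with probability at least $1 - \delta$,

\[ \sup_{M \in \mathcal{M}} \left| \frac{1}{N} \sum_{i=1}^{N} \mathrm{rev}_M\big(v^{(i)}\big) \;-\; \mathbb{E}_{v \sim D}\big[\mathrm{rev}_M(v)\big] \right| \;\le\; \varepsilon , \]

so the empirical revenue of whichever mechanism the design algorithm selects is within $\varepsilon$ of its true expected revenue on the unknown distribution.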
Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior
Pre-trained machine learning (ML) models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained model size is scaled, (ii) the downstream training dataset size is scaled, (iii) the physics parameters are systematically pushed out of distribution, and (iv) how a single model pre-trained on a mixture of different physics problems can be adapted to various downstream applications. We also find that fine-tuning these models yields more performance gains as model size increases, compared to training from scratch on new downstream tasks. These results hold for a broad range of PDE learning tasks.
Mechanism Design for Collaborative Normal Mean Estimation
We study collaborative normal mean estimation, where m strategic agents collect i.i.d. samples from a normal distribution \mathcal{N}(\mu, \sigma^2) at a cost. They all wish to estimate the mean \mu. By sharing data with each other, agents can obtain better estimates while keeping the cost of data collection small. To facilitate this collaboration, we wish to design mechanisms that encourage agents to collect a sufficient amount of data and share it truthfully, so that they are all better off than working alone. In naive mechanisms, such as simply pooling and sharing all the data, an individual agent might find it beneficial to under-collect and/or fabricate data, which can lead to poor social outcomes.
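A back-of-the-envelope illustration of the incentive problem (the cost model here is an assumption for exposition, not the paper's): suppose an agent pays a cost $c$ per sample and its loss is that cost plus the mean-squared error of its estimate. Working alone with $n$ samples, the loss is $cn + \sigma^2/n$, minimized at $n^* = \sigma/\sqrt{c}$. Under naive pooling in which each of the $m$ agents contributes $n$ samples, the error term drops to $\sigma^2/(mn)$, so the marginal reduction in error from one additional sample of one's own is far smaller than when working alone; each agent is therefore tempted to under-collect or fabricate cheap data, which is exactly the failure mode the mechanism must prevent.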
Automated Dynamic Mechanism Design
We study Bayesian automated mechanism design in unstructured dynamic environments, where a principal repeatedly interacts with an agent, and takes actions based on the strategic agent's report of the current state of the world. Both the principal and the agent can have arbitrary and potentially different valuations for the actions taken, possibly also depending on the actual state of the world. Moreover, at any time, the state of the world may evolve arbitrarily depending on the action taken by the principal. The goal is to compute an optimal mechanism which maximizes the principal's utility in the face of the self-interested strategic agent. We give an efficient algorithm for computing optimal mechanisms, with or without payments, under different individual-rationality constraints, when the time horizon is constant. Our algorithm is based on a sophisticated linear program formulation, which can be customized in various ways to accommodate richer constraints.
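For orientation, a single-round (static) simplification of such a formulation, not the paper's dynamic LP, optimizes over an action distribution $\pi(a \mid \theta)$ and a payment $p(\theta)$ for each reported state $\theta$, with prior $\mu$ over states:

\[ \max_{\pi,\, p} \;\; \sum_{\theta} \mu(\theta) \Big( \sum_{a} \pi(a \mid \theta)\, u_P(a, \theta) + p(\theta) \Big) \]
subject to, for all states $\theta, \theta'$,
\[ \sum_{a} \pi(a \mid \theta)\, u_A(a, \theta) - p(\theta) \;\ge\; \sum_{a} \pi(a \mid \theta')\, u_A(a, \theta) - p(\theta') \quad \text{(incentive compatibility)}, \]
\[ \sum_{a} \pi(a \mid \theta)\, u_A(a, \theta) - p(\theta) \;\ge\; 0 \quad \text{(individual rationality)}, \qquad \sum_{a} \pi(a \mid \theta) = 1, \;\; \pi(a \mid \theta) \ge 0 . \]

The dynamic setting additionally couples such constraints across the (constant) time horizon and the evolving state of the world.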
Optimistic Exploration in Reinforcement Learning Using Symbolic Model Estimates
There has been increasing interest in using symbolic models in conjunction with reinforcement learning (RL), where these coarser abstract models provide RL agents with higher-level guidance. However, most of these works are inherently limited by their assumption of having access to a symbolic approximation of the underlying problem. To address this issue, we introduce a new method for learning optimistic symbolic approximations of the underlying world model. We show how these representations, coupled with the fast, diverse planners developed by the automated planning community, provide a new paradigm for optimistic exploration in sparse-reward settings. We also investigate the possibility of speeding up the learning process by generalizing learned model dynamics across similar actions with minimal human input. Finally, we evaluate the method on multiple benchmark domains and compare it with other RL strategies.
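The following toy sketch illustrates the overall loop described above: maintain a symbolic transition model, plan in it with unknown transitions optimistically assumed to reach the goal, and refine the model from experience. The environment interface and all helper names are assumptions for the example, not the paper's implementation.

from collections import deque

def plan_bfs(model, start, goal, actions):
    # Breadth-first search in the symbolic model; unknown (state, action) effects are
    # optimistically assumed to reach the goal, which drives exploration toward them.
    parents, frontier = {start: None}, deque([start])
    while frontier:
        s = frontier.popleft()
        if s == goal:
            plan, cur = [], s
            while parents[cur] is not None:
                cur, a = parents[cur]
                plan.append(a)
            return list(reversed(plan))
        for a in actions:
            nxt = model.get((s, a), goal)      # optimism about unknown transitions
            if nxt not in parents:
                parents[nxt] = (s, a)
                frontier.append(nxt)
    return []

def optimistic_exploration(env, actions, goal, episodes=50):
    # env is assumed to expose reset() -> state and step(a) -> (next_state, reward, done).
    model = {}                                 # learned symbolic dynamics: (state, action) -> next_state
    for _ in range(episodes):
        state = env.reset()
        for a in plan_bfs(model, state, goal, actions) or actions:
            nxt, reward, done = env.step(a)
            model[(state, a)] = nxt            # update the symbolic approximation with the observed effect
            state = nxt
            if done:
                break
    return model

Because unknown transitions look like shortcuts to the goal, the planner is biased toward trying actions whose symbolic effects have not yet been observed, which is the sense in which the exploration is optimistic.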