AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-10-2026, 12:01:55 GMT

PaCo: Parameter-Compositional Multi-Task Reinforcement Learning

On the other hand, as intelligent agents, humans usually spend less time learning similar tasks and can acquire new skills using existing ones.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)

Neural Information Processing Systems, Dec-24-2025, 16:51:17 GMT

PaCo: Parameter-Compositional Multi-task Reinforcement Learning

The purpose of multi-task reinforcement learning (MTRL) is to train a single policy that can be applied to a set of different tasks. Sharing parameters allows us to take advantage of the similarities among tasks. However, the gaps between contents and difficulties of different tasks bring us challenges on both which tasks should share the parameters and what parameters should be shared, as well as the optimization challenges due to parameter sharing. In this work, we introduce a parameter-compositional approach (PaCo) as an attempt to address these challenges. In this framework, a policy subspace represented by a set of parameters is learned. Policies for all the single tasks lie in this subspace and can be composed by interpolating with the learned set. It allows not only flexible parameter sharing, but also a natural way to improve training. We demonstrate the state-of-the-art performance on Meta-World benchmarks, verifying the effectiveness of the proposed approach.
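The composition idea in the abstract can be illustrated with a small sketch. It assumes that "interpolating with the learned set" means taking a weighted sum of shared parameter vectors with one weight vector per task; the abstract does not spell out the exact form, so the function name, the toy parameter set, and the task names below are all hypothetical.

```python
from typing import Dict, List


def compose_policy_params(param_set: List[List[float]],
                          weights: List[float]) -> List[float]:
    """Combine shared parameter vectors into one task-specific vector.

    Assumption: composition is a plain weighted sum over the shared set.
    """
    if len(param_set) != len(weights):
        raise ValueError("need exactly one weight per shared parameter vector")
    dim = len(param_set[0])
    if any(len(p) != dim for p in param_set):
        raise ValueError("all shared parameter vectors must have the same length")
    return [sum(w * p[i] for w, p in zip(weights, param_set)) for i in range(dim)]


# Hypothetical shared set: three parameter vectors of dimension four.
param_set = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, -2.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Hypothetical per-task weight vectors (one entry per shared vector).
task_weights: Dict[str, List[float]] = {
    "reach": [1.0, 0.0, 0.0],
    "push": [0.5, 0.5, 0.0],
}

# Every task policy lies in the span of the shared set.
task_params = {name: compose_policy_params(param_set, w)
               for name, w in task_weights.items()}
```

In this toy setup the shared vectors are common to all tasks and only the small weight vectors differ per task, which is one way to read the "flexible parameter sharing" the abstract describes.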

name change, paco, parameter-compositional multi-task reinforcement learning, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Neural Information Processing Systems, Aug-16-2025, 16:01:14 GMT

A Appendix

In the appendix, we provide some additional results in Section A.1, more implementation details in ... To compare the stability of training, we did not early-stop the training process even if the loss of some tasks had already exploded. ... MTRL training compared with both variants, demonstrating the effectiveness of the PaCo design. MT50 is a more complex benchmark in Meta-World containing 50 different manipulation tasks (including the MT10 tasks). ... Therefore it is hard to determine whether the policy has reached the optimum. The results are shown in Figure 8.

artificial intelligence, machine learning, paco, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Aug-16-2025, 16:01:11 GMT

86b8ad667206fb9a52ae575fbf1cd6be-Paper-Conference.pdf

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Neural Information Processing Systems, Jan-17-2025, 03:34:46 GMT

PaCo: Parameter-Compositional Multi-task Reinforcement Learning

paco, parameter sharing, parameter-compositional multi-task reinforcement learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

arXiv.org Artificial Intelligence, Aug-13-2023

PaCo: Preconditions Attributed to Commonsense Knowledge

Qasemi, Ehsan, Ilievski, Filip, Chen, Muhao, Szekely, Pedro

Humans can seamlessly reason with circumstantial preconditions of commonsense knowledge. We understand that a glass is used for drinking water, unless the glass is broken or the water is toxic. Despite state-of-the-art (SOTA) language models' (LMs) impressive performance on inferring commonsense knowledge, it is unclear whether they understand the circumstantial preconditions. To address this gap, we propose a novel challenge of reasoning with circumstantial preconditions. We collect a dataset, called PaCo, consisting of 12.4 thousand preconditions of commonsense statements expressed in natural language. Based on this dataset, we create three canonical evaluation tasks and use them to examine the capability of existing LMs to understand situational preconditions. Our results reveal a 10-30% gap between machine and human performance on our tasks, which shows that reasoning with preconditions is an open challenge.

artificial intelligence, natural language, precondition, (16 more...)

2104.08712

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(13 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Education (0.68)
Water & Waste Management > Water Management > Water Supplies & Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Flasseur, Olivier, Bodrito, Théo, Mairal, Julien, Ponce, Jean, Langlois, Maud, Lagrange, Anne-Marie

Combining multi-spectral data with statistical and deep-learning models for improved exoplanet detection in direct imaging at high contrast

arXiv.org Artificial Intelligence, Jun-21-2023

Exoplanet detection by direct imaging is a difficult task: the faint signals from the objects of interest are buried under a spatially structured nuisance component induced by the host star. The exoplanet signals can only be identified when combining several observations with dedicated detection algorithms. In contrast to most existing methods, we propose to learn a model of the spatial, temporal and spectral characteristics of the nuisance, directly from the observations. In a pre-processing step, a statistical model of their correlations is built locally, and the data are centered and whitened to improve both their stationarity and signal-to-noise ratio (SNR). A convolutional neural network (CNN) is then trained in a supervised fashion to detect the residual signature of synthetic sources in the pre-processed images. Our method leads to a better trade-off between precision and recall than standard approaches in the field. It also outperforms a state-of-the-art algorithm based solely on a statistical framework. Besides, the exploitation of the spectral diversity improves the performance compared to a similar model built solely from spatio-temporal data.
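As a rough illustration of the "centered and whitened" pre-processing step, the sketch below centers two-dimensional samples and whitens them with the Cholesky factor of their sample covariance. This is a generic textbook whitening, not the authors' local statistical model; the function names and the toy data are assumptions made for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def covariance_2d(points: List[Point]) -> Tuple[float, float, float]:
    """Return the sample covariance entries (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    return sxx, sxy, syy


def center_and_whiten_2d(points: List[Point]) -> List[Point]:
    """Subtract the mean, then apply L^{-1} where C = L L^T (Cholesky)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx, sxy, syy = covariance_2d(points)
    # Cholesky factor of the 2x2 covariance: L = [[a, 0], [b, c]].
    a = math.sqrt(sxx)
    b = sxy / a
    c = math.sqrt(syy - b * b)
    whitened = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Forward substitution solves L z = (dx, dy).
        z1 = dx / a
        z2 = (dy - b * z1) / c
        whitened.append((z1, z2))
    return whitened


# Toy, correlated samples (hypothetical values, not observational data).
samples = [(0.0, 0.0), (2.0, 1.0), (4.0, 3.0), (6.0, 2.0), (8.0, 5.0)]
whitened = center_and_whiten_2d(samples)
```

After this transform the samples have zero mean and an identity sample covariance, which is the sense in which whitening removes the correlations captured by the covariance model.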

algorithm, artificial intelligence, machine learning, (19 more...)

2306.12266

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
South America > Chile (0.04)
North America > United States > New York (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Garg, Vaibhav, Xu, Ganning, Singh, Munindar P.

PACO: Provocation Involving Action, Culture, and Oppression

arXiv.org Artificial Intelligence, Mar-19-2023

In India, people identify with a particular group based on certain attributes such as religion. The same religious groups are often provoked against each other. Previous studies show the role of provocation in increasing tensions between India's two prominent religious groups: Hindus and Muslims. With the advent of the Internet, such provocation also surfaced on social media platforms such as WhatsApp. By leveraging an existing dataset of Indian WhatsApp posts, we identified three categories of provoking sentences against Indian Muslims. Further, we labeled 7,000 sentences for three provocation categories and called this dataset PACO. We leveraged PACO to train a model that can identify provoking sentences from a WhatsApp post. Our best model is fine-tuned RoBERTa and achieved a 0.851 average AUC score over five-fold cross-validation. Automatically identifying provoking sentences could stop provoking text from reaching out to the masses, and can prevent possible discrimination or violence against the target religious group. Further, we studied the provocative speech through a pragmatic lens, by identifying the dialog acts and impoliteness super-strategies used against the religious group.
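For readers unfamiliar with the metric, the following sketch shows one way an average AUC over cross-validation folds can be computed. It uses the pairwise-ranking definition of ROC AUC and made-up labels and scores; it does not reproduce the authors' evaluation code or their data, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc_over_folds(folds: List[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Average the per-fold AUC, as in k-fold cross-validation reporting."""
    return sum(roc_auc(labels, scores) for labels, scores in folds) / len(folds)


# Two toy held-out folds with invented labels and model scores.
toy_folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1], [0.2, 0.9]),
]
toy_mean_auc = mean_auc_over_folds(toy_folds)
```

In a five-fold setup the list would hold five such held-out folds, each scored by a model trained on the other four.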

artificial intelligence, category, machine learning, (19 more...)

2303.12808

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Services (0.76)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

arXiv.org Artificial Intelligence, Oct-20-2022

PaCo: Parameter-Compositional Multi-Task Reinforcement Learning

Sun, Lingfeng, Zhang, Haichao, Xu, Wei, Tomizuka, Masayoshi

The purpose of multi-task reinforcement learning (MTRL) is to train a single policy that can be applied to a set of different tasks. Sharing parameters allows us to take advantage of the similarities among tasks. However, the gaps between contents and difficulties of different tasks bring us challenges on both which tasks should share the parameters and what parameters should be shared, as well as the optimization challenges due to parameter sharing. In this work, we introduce a parameter-compositional approach (PaCo) as an attempt to address these challenges. In this framework, a policy subspace represented by a set of parameters is learned. Policies for all the single tasks lie in this subspace and can be composed by interpolating with the learned set. It allows not only flexible parameter sharing but also a natural way to improve training. We demonstrate the state-of-the-art performance on Meta-World benchmarks, verifying the effectiveness of the proposed approach.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2210.11653

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)