
Collaborating Authors: sama



Making Scalable Meta Learning Practical

Neural Information Processing Systems

Despite its flexibility to learn diverse inductive biases in machine learning programs, meta learning (i.e., learning to learn) has long been recognized to suffer from poor scalability due to its tremendous compute/memory costs, training instability, and a lack of efficient distributed training support. In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems. Specifically, SAMA is designed to flexibly support a broad range of adaptive optimizers in the base level of meta learning programs, while reducing computational burden by avoiding explicit computation of second-order gradient information, and exploiting efficient distributed training techniques implemented for first-order gradients. Furthermore, we show that SAMA-based data optimization leads to consistent improvements in text classification accuracy with BERT and RoBERTa large language models, and achieves state-of-the-art results in both small- and large-scale data pruning on image classification tasks, demonstrating the practical applicability of scalable meta learning across language and vision domains.
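
The bilevel structure this abstract refers to can be made concrete with a small sketch. Below is a generic PyTorch illustration of meta learning for data reweighting, where per-example weights are the meta-parameters. The one-step unrolled meta update shown here is a common simplification whose memory cost is exactly what implicit-differentiation methods such as SAMA aim to avoid; all names are illustrative, not from the SAMA codebase.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy data: 100 weighted training examples, 20 clean validation examples.
x_tr, y_tr = torch.randn(100, 10), torch.randint(0, 2, (100,))
x_va, y_va = torch.randn(20, 10), torch.randint(0, 2, (20,))

w = torch.randn(10, 2, requires_grad=True)            # base-level parameters
data_weights = torch.zeros(100, requires_grad=True)   # meta-parameters
meta_opt = torch.optim.Adam([data_weights], lr=1e-2)  # meta-level optimizer

for step in range(100):
    # Base level: one differentiable SGD step on the weighted training loss.
    per_example = F.cross_entropy(x_tr @ w, y_tr, reduction="none")
    train_loss = (torch.sigmoid(data_weights) * per_example).mean()
    g = torch.autograd.grad(train_loss, w, create_graph=True)[0]
    w_new = w - 0.1 * g  # still differentiable w.r.t. data_weights

    # Meta level: validation loss of the updated parameters drives the
    # update of data_weights. One-step unrolling keeps the base step's
    # graph in memory; implicit differentiation avoids storing it.
    val_loss = F.cross_entropy(x_va @ w_new, y_va)
    meta_opt.zero_grad()
    val_loss.backward()
    meta_opt.step()

    # Commit the base update outside the autograd graph.
    w = w_new.detach().requires_grad_(True)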


Kenya's President Wades Into Meta Lawsuits

TIME - Tech

Can a Big Tech company be sued in Kenya for alleged abuses at an outsourcing company working on its behalf? That's the question at the heart of two lawsuits that are attempting to set a new precedent in Kenya, which is the prime destination for tech companies looking to farm out digital work to the African continent. The two-year legal battle stems from allegations of human rights violations at an outsourced Meta content moderation facility in Nairobi, where employees hired by a contractor were paid as little as $1.50 per hour to view traumatic content, such as videos of rapes, murders, and war crimes. The suits claim that despite the workers being contracted by an outsourcing company, called Sama, Meta essentially supervised and set the terms for the work, and designed and managed the software required for the task. Both companies deny wrongdoing and Meta has challenged the Kenyan courts' jurisdiction to hear the cases.


OpenAI co-founder and Chief Scientist Ilya Sutskever is leaving the company

Engadget

Ilya Sutskever has announced on X, formerly known as Twitter, that he's leaving OpenAI almost a decade after he co-founded the company. "After almost a decade, I have made the decision to leave OpenAI," he wrote, adding that he's confident OpenAI "will build [artificial general intelligence] that is both safe and beneficial" under the leadership of CEO Sam Altman, President Greg Brockman and CTO Mira Murati. In his own post about Sutskever's departure, Altman called him "one of the greatest minds of our generation" and credited him for his work with the company. Jakub Pachocki, OpenAI's previous Director of Research who headed the development of GPT-4 and OpenAI Five, has taken Sutskever's role as Chief Scientist.


Making Scalable Meta Learning Practical

Choe, Sang Keun, Mehta, Sanket Vaibhav, Ahn, Hwijeen, Neiswanger, Willie, Xie, Pengtao, Strubell, Emma, Xing, Eric

arXiv.org Artificial Intelligence

Despite its flexibility to learn diverse inductive biases in machine learning programs, meta learning (i.e., learning to learn) has long been recognized to suffer from poor scalability due to its tremendous compute/memory costs, training instability, and a lack of efficient distributed training support. In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems. Specifically, SAMA is designed to flexibly support a broad range of adaptive optimizers in the base level of meta learning programs, while reducing computational burden by avoiding explicit computation of second-order gradient information, and exploiting efficient distributed training techniques implemented for first-order gradients. Evaluated on multiple large-scale meta learning benchmarks, SAMA showcases up to 1.7/4.8x increase in throughput and 2.0/3.8x decrease in memory consumption respectively on single-/multi-GPU setups compared to other baseline meta learning algorithms. Furthermore, we show that SAMA-based data optimization leads to consistent improvements in text classification accuracy with BERT and RoBERTa large language models, and achieves state-of-the-art results in both small- and large-scale data pruning on image classification tasks, demonstrating the practical applicability of scalable meta learning across language and vision domains.
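
One standard way to avoid explicit second-order computation, in the spirit of what this abstract describes, is to approximate the hypergradient's Hessian-vector term with finite differences of first-order gradients (a DARTS-style approximation; SAMA's actual update rule differs in detail). A minimal sketch, with all function names assumed for illustration:

import torch
import torch.nn.functional as F

def hypergrad_fd(train_loss_fn, val_loss_fn, w, lam, eps_scale=1e-2):
    # Approximate H @ v by central finite differences, where
    # H = d^2 L_train / (d lam d w) and v = d L_val / d w. Scaled by the
    # base learning rate, -H @ v matches the one-step hypergradient when
    # L_val depends on lam only through the trained weights.
    v = torch.autograd.grad(val_loss_fn(w), w)[0]
    eps = eps_scale / (v.norm() + 1e-12)

    def grad_lam(w_pert):
        # First-order gradient w.r.t. the meta-parameters at fixed w_pert.
        return torch.autograd.grad(train_loss_fn(w_pert, lam), lam)[0]

    g_plus = grad_lam((w + eps * v).detach())
    g_minus = grad_lam((w - eps * v).detach())
    return -(g_plus - g_minus) / (2 * eps)

# Toy usage with per-example data weights as the meta-parameters:
x_tr, y_tr = torch.randn(100, 10), torch.randint(0, 2, (100,))
x_va, y_va = torch.randn(20, 10), torch.randint(0, 2, (20,))
w = torch.randn(10, 2, requires_grad=True)
lam = torch.zeros(100, requires_grad=True)

def train_loss_fn(w_, lam_):
    per_example = F.cross_entropy(x_tr @ w_, y_tr, reduction="none")
    return (torch.sigmoid(lam_) * per_example).mean()

def val_loss_fn(w_):
    return F.cross_entropy(x_va @ w_, y_va)

meta_grad = hypergrad_fd(train_loss_fn, val_loss_fn, w, lam)
print(meta_grad.shape)  # torch.Size([100])

Because no second-order graph is ever constructed, memory and compute stay close to those of ordinary first-order training, which is also what makes the efficient distributed execution described in the abstract feasible.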


Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning

Li, Wenhao, Qiao, Dan, Wang, Baoxiang, Wang, Xiangfeng, Jin, Bo, Zha, Hongyuan

arXiv.org Artificial Intelligence

The difficulty of appropriately assigning credit is particularly heightened in cooperative MARL with sparse reward, due to the concurrent time and structural scales involved. Automatic subgoal generation (ASG) has recently emerged as a viable MARL approach, inspired by the use of subgoals in intrinsically motivated reinforcement learning. However, end-to-end learning of complex task planning from sparse rewards without prior knowledge undoubtedly requires massive training samples. Moreover, the diversity-promoting nature of existing ASG methods can lead to the "over-representation" of subgoals, generating numerous spurious subgoals of limited relevance to the actual task reward and thus decreasing the sample efficiency of the algorithm. To address this problem, and inspired by disentangled representation learning, we propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA), which prompts pretrained language models with chain-of-thought to suggest potential goals, provide suitable goal decomposition and subgoal allocation, and perform self-reflection-based replanning. Additionally, SAMA incorporates language-grounded RL to train each agent's subgoal-conditioned policy. SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods, as evidenced by its performance on two challenging sparse-reward tasks, Overcooked and MiniRTS.
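
The language-model-facing half of this pipeline can be sketched in a few lines. The code below builds a chain-of-thought prompt for task decomposition and parses per-agent subgoals from the reply; the prompt template and the complete callable are stand-ins invented for illustration, not the paper's actual interface.

import json
from typing import Callable, Dict, List

def decompose_task(task: str, agents: List[str],
                   complete: Callable[[str], str]) -> Dict[str, str]:
    # Ask a pretrained language model to decompose a cooperative task into
    # one subgoal per agent. `complete` is any prompt-to-text callable.
    prompt = (
        "You are coordinating a team of agents on a cooperative task.\n"
        f"Task: {task}\n"
        f"Agents: {', '.join(agents)}\n"
        "Think step by step about which subgoals are needed and who should "
        "take each one. Then answer with a JSON object mapping each agent "
        "name to its subgoal."
    )
    reply = complete(prompt)
    # Keep only the JSON portion; everything before it is chain of thought.
    return json.loads(reply[reply.find("{"):reply.rfind("}") + 1])

# Toy usage with a canned reply standing in for a real model call:
canned = ('We need onions chopped and the soup plated. '
          '{"alice": "chop onions", "bob": "plate the soup"}')
print(decompose_task("Cook and serve onion soup (Overcooked)",
                     ["alice", "bob"], lambda prompt: canned))

In this framing, self-reflection-based replanning would re-invoke the same call with a description of the failure appended to the prompt, and the returned subgoal strings would condition each agent's language-grounded policy.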


'It's destroyed me completely': Kenyan moderators decry toll of training of AI models

The Guardian

The images pop up in Mophat Okinyi's mind when he's alone, or when he's about to sleep. Okinyi, a former content moderator for OpenAI's ChatGPT in Nairobi, Kenya, is one of four people in that role who have filed a petition to the Kenyan government calling for an investigation into what they describe as exploitative conditions for contractors reviewing the content that powers artificial intelligence programs. "It has really damaged my mental health," said Okinyi. The 27-year-old said he would view up to 700 text passages a day, many depicting graphic sexual violence. He recalls he started avoiding people after having read texts about rapists and found himself projecting paranoid narratives on to people around him.