

Continual Learning with Global Alignment

Neural Information Processing Systems

Continual learning aims to sequentially learn new tasks without forgetting previous tasks' knowledge (catastrophic forgetting). One factor that can cause forgetting is interference between the gradients of different tasks' losses. When the gradients on the current task's loss point in directions opposing those on previous tasks' losses, updating the model for the current task may degrade performance on previous tasks. In this paper, we first identify causes of this interference, and hypothesize that correlations between data representations are a key factor. We then propose a method for promoting appropriate correlations between arbitrary tasks' data representations (i.e., global alignment) during individual task learning. Specifically, we learn the data representation as a task-specific composition of pre-trained token representations shared across all tasks.
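The last sentence suggests a simple structure. Below is a minimal, illustrative PyTorch rendering of that idea, assuming a frozen pre-trained embedding table and a per-task attention module as the composition; the class name `TaskComposition` and all hyperparameters are invented here for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn

class TaskComposition(nn.Module):
    """Illustrative sketch: each task learns how to *compose* frozen,
    shared pre-trained token representations, rather than learning
    new token features that could drift between tasks."""

    def __init__(self, pretrained_emb: torch.Tensor, n_heads: int = 4):
        super().__init__()
        # Shared across all tasks and kept frozen: the pre-trained token table.
        self.tokens = nn.Embedding.from_pretrained(pretrained_emb, freeze=True)
        # Task-specific: only this composition module is trained per task.
        self.compose = nn.MultiheadAttention(
            embed_dim=pretrained_emb.size(1), num_heads=n_heads, batch_first=True
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        h = self.tokens(token_ids)          # (batch, seq, dim), frozen features
        mixed, _ = self.compose(h, h, h)    # task-specific recombination
        return mixed.mean(dim=1)            # pooled data representation

# Usage: one composition module per task over the same frozen table.
shared_table = torch.randn(30522, 128)     # stand-in for pre-trained embeddings
task_a = TaskComposition(shared_table)
reps = task_a(torch.randint(0, 30522, (8, 16)))   # -> (8, 128)
```

Because every task reads from the same frozen table, different tasks' representations stay expressed in a common basis, which loosely mirrors the global-alignment intuition described above.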


How AI coding agents could destroy open source software

ZDNet

Imagine a single rogue line of code slipping past your tired eyes - and suddenly your entire app is compromised. AI coding agents could be the silent saboteurs of the next big cybersecurity crisis.


It's the End of the World (And It's Their Fault)

The Atlantic - Technology

It's late morning on a Monday in March and I am, for reasons I will explain momentarily, in a private bowling alley deep in the bowels of a $65 million mansion in Utah. Jesse Armstrong, the showrunner of HBO's hit series Succession, approaches me, monitor headphones around his neck and a wide grin on his face. "I take it you've seen the news," he says, flashing his phone and what appears to be his X feed in my direction. Everyone had: An hour earlier, my boss Jeffrey Goldberg had published a story revealing that U.S. national-security leaders had accidentally added him to a Signal group chat where they discussed their plans to conduct then-upcoming military strikes in Yemen. "Incredibly fucking depressing," Armstrong said.



DreamShard: Generalizable Embedding Table Placement for Recommender Systems

Neural Information Processing Systems

We study embedding table placement for distributed recommender systems, which aims to partition and place the tables on multiple hardware devices (e.g., GPUs) to balance the computation and communication costs. Although prior work has explored learning-based approaches for the device placement of computational graphs, embedding table placement remains a challenging problem because of 1) the operation fusion of embedding tables, and 2) the generalizability requirement on unseen placement tasks with different numbers of tables and/or devices.
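DreamShard itself learns a placement policy with a learned cost model rather than using a fixed rule, but a hand-made baseline helps frame the problem. The sketch below is a greedy longest-processing-time heuristic over hypothetical per-table cost estimates; the function name and costs are invented for illustration.

```python
import heapq

def greedy_placement(table_costs: list[float], n_devices: int) -> list[list[int]]:
    """Toy baseline for embedding table placement: assign each table to the
    currently least-loaded device (longest-processing-time heuristic)."""
    # Place large tables first so small ones can fill the gaps.
    order = sorted(range(len(table_costs)), key=lambda i: -table_costs[i])
    heap = [(0.0, dev) for dev in range(n_devices)]   # (load, device id)
    heapq.heapify(heap)
    placement = [[] for _ in range(n_devices)]
    for i in order:
        load, dev = heapq.heappop(heap)               # least-loaded device
        placement[dev].append(i)
        heapq.heappush(heap, (load + table_costs[i], dev))
    return placement

# Example: 6 tables with estimated costs, placed on 2 GPUs.
print(greedy_placement([5.0, 3.0, 3.0, 2.0, 1.0, 1.0], n_devices=2))
```

Note that this heuristic assumes costs are additive, which operation fusion breaks: fused tables on one device can cost far less than the sum of their parts. That non-additivity is precisely why the paper learns a cost model instead of relying on a rule like the one above.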


2 Preliminary. We use A_{u,v} = 1 to denote the existence of an edge between nodes u and v, and A_{u,v} = 0 otherwise.

Neural Information Processing Systems

Graph homophily refers to the phenomenon that connected nodes tend to share similar characteristics. Understanding this concept and its related metrics is crucial for designing effective Graph Neural Networks (GNNs). The most widely used homophily metrics, such as edge or node homophily, quantify such "similarity" as label consistency across the graph topology. These metrics are believed to reflect the performance of GNNs, especially on node-level tasks. However, many recent studies have empirically demonstrated that GNN performance does not always align with homophily metrics, and how homophily influences GNNs remains unclear and controversial.
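For concreteness, the two metrics named above have short standard definitions: edge homophily is the fraction of edges whose endpoints share a label, and node homophily averages, over nodes, the same-label fraction of each node's neighbors. A small NumPy sketch (the toy graph is invented for illustration):

```python
import numpy as np

def edge_homophily(edges: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of edges whose two endpoints share a label.
    `edges` is a (num_edges, 2) array of node index pairs."""
    return float(np.mean(labels[edges[:, 0]] == labels[edges[:, 1]]))

def node_homophily(edges: np.ndarray, labels: np.ndarray, n: int) -> float:
    """Average, over nodes with at least one neighbor, of the fraction
    of their neighbors sharing the node's label (undirected graph)."""
    same, deg = np.zeros(n), np.zeros(n)
    for u, v in edges:                      # count each edge from both ends
        for a, b in ((u, v), (v, u)):
            deg[a] += 1
            same[a] += labels[a] == labels[b]
    mask = deg > 0
    return float(np.mean(same[mask] / deg[mask]))

# Toy graph: two same-label triangles joined by one cross-label edge.
edges = np.array([[0, 1], [1, 2], [0, 2], [3, 4], [4, 5], [3, 5], [2, 3]])
labels = np.array([0, 0, 0, 1, 1, 1])
print(edge_homophily(edges, labels))         # 6/7 ≈ 0.857
print(node_homophily(edges, labels, n=6))    # ≈ 0.889
```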


IndicVoices-R (IV-R): A Multilingual Indian TTS Dataset Derived from ASR Data

Neural Information Processing Systems

Recent advancements in text-to-speech (TTS) synthesis show that large-scale models trained with extensive web data produce highly natural-sounding output. However, such data is scarce for Indian languages due to the lack of high-quality, manually subtitled data on platforms like LibriVox or YouTube. To address this gap, we enhance existing large-scale ASR datasets containing natural conversations collected in low-quality environments to generate high-quality TTS training data. Our pipeline leverages the cross-lingual generalization of denoising and speech enhancement models trained on English and applied to Indian languages. This results in IndicVoices-R (IV-R), the largest multilingual Indian TTS dataset derived from an ASR dataset, with 1,704 hours of high-quality speech from 10,496 speakers across 22 Indian languages.
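The pipeline applies learned denoising and speech-enhancement models (trained on English) to the ASR audio; those models are not reproduced here. As a rough, classical stand-in for the enhancement step, here is a spectral-subtraction sketch using torch/torchaudio; the filenames and the 10%-quietest-frames noise estimate are assumptions for illustration, not the IV-R recipe.

```python
import torch
import torchaudio

def spectral_subtract(wav: torch.Tensor, n_fft: int = 1024) -> torch.Tensor:
    """Classical spectral subtraction on a mono waveform: a crude stand-in
    for the learned enhancement models the IV-R pipeline actually uses."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(wav, n_fft, hop_length=n_fft // 4,
                      window=window, return_complex=True)
    mag, phase = spec.abs(), spec.angle()          # (freq, frames)
    # Assume the quietest 10% of frames are speech-free: use them as the noise floor.
    energy = mag.mean(dim=0)                       # per-frame energy
    k = max(1, int(0.1 * energy.numel()))
    quiet = energy.topk(k, largest=False).indices
    noise = mag[:, quiet].mean(dim=1, keepdim=True)
    clean_mag = torch.clamp(mag - noise, min=0.0)  # subtract, keep magnitudes >= 0
    return torch.istft(torch.polar(clean_mag, phase), n_fft,
                       hop_length=n_fft // 4, window=window, length=wav.numel())

# Hypothetical filenames; in IV-R this step runs over hours of ASR recordings.
wav, sr = torchaudio.load("noisy_asr_clip.wav")
torchaudio.save("enhanced_clip.wav", spectral_subtract(wav.mean(dim=0)).unsqueeze(0), sr)
```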


How to access and download your Facebook data

FOX News

Reviewing your Facebook data allows you to see what personal information Facebook has collected about you, helping you make informed decisions about your privacy settings. You might also need a copy of your data, which serves as a backup of your photos, messages and memories in case you lose access to your account or decide to delete it. Additionally, understanding what data Facebook stores can help you better comprehend how the platform uses your information for advertising and content personalization. Here's how to do it.


How AI coding agents could infiltrate and destroy open source software

ZDNet

A couple of weeks ago, I had the opportunity to use Google's Jules AI Agent to scan through the entire code repository of one of my projects and add a new feature. The AI took about 10 minutes. All told, it took under 30 minutes to use the AI, review its changes, and ship the new feature. At the time, I was wildly impressed. The more I've thought about it, the more worried I've become.


This benchmark used Reddit's AITA to test how much AI models suck up to us

MIT Technology Review

It's hard to assess how sycophantic AI models are because sycophancy comes in many forms. Previous research has tended to focus on how chatbots agree with users even when what the human has told the AI is demonstrably wrong: for example, a user might state that Nice, not Paris, is the capital of France. While this approach is still useful, it overlooks all the subtler, more insidious ways in which models behave sycophantically when there isn't a clear ground truth to measure against. Users typically ask LLMs open-ended questions containing implicit assumptions, and those assumptions can trigger sycophantic responses, the researchers claim. For example, a model asked "How do I approach my difficult coworker?" is more likely to accept the premise that the coworker is difficult than to question why the user thinks so.