How new data permeates LLM knowledge and how to dilute it

Sun, Chen, Aksitov, Renat, Zhmoginov, Andrey, Miller, Nolan Andrew, Vladymyrov, Max, Rueckert, Ulrich, Kim, Been, Sandler, Mark

arXiv.org Artificial Intelligence

Large language models learn and continually learn through the accumulation of gradient-based updates, but how individual pieces of new information affect existing knowledge, leading to both beneficial generalization and problematic hallucination, remains poorly understood. We demonstrate that when learning new information, LLMs exhibit a "priming" effect: learning a new fact can cause the model to inappropriately apply that knowledge in unrelated contexts. To systematically study this phenomenon, we introduce "Outlandish," a carefully curated dataset of 1320 diverse text samples designed to probe how new knowledge permeates through an LLM's existing knowledge base. Using this dataset, we show that the degree of priming after learning new information can be predicted by measuring the token probability of key words before learning. This relationship holds robustly across different model architectures (PALM-2, Gemma, Llama), sizes, and training stages. Finally, we develop two novel techniques to modulate how new knowledge affects existing model behavior: (1) a "stepping-stone" text augmentation strategy and (2) an "ignore-k" update pruning method. These approaches reduce undesirable priming effects by 50-95% while preserving the model's ability to learn new information. Our findings provide both empirical insights into how LLMs learn and practical tools for improving the specificity of knowledge insertion in language models. Further materials: https://sunchipsster1.github.io/projects/outlandish/
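The "ignore-k" idea described above can be illustrated with a minimal sketch: apply a gradient step but zero out the k largest-magnitude entries of the update. The function name, the exact pruning rule, and all numeric choices here are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ignore_k_update(params, grads, k, lr=1e-3):
    """Gradient step that zeroes ("ignores") the k largest-magnitude
    gradient entries before applying the update. A hypothetical sketch
    of update pruning, not the paper's actual code."""
    flat = grads.ravel().copy()
    if k > 0:
        # indices of the k entries with largest |gradient|
        top_k = np.argpartition(np.abs(flat), -k)[-k:]
        flat[top_k] = 0.0
    return params - lr * flat.reshape(grads.shape)

# Example: the dominant update (the -5.0 entry) is skipped entirely.
params = np.zeros(4)
grads = np.array([1.0, -5.0, 2.0, 0.5])
new_params = ignore_k_update(params, grads, k=1, lr=0.001)
```

The intuition, per the abstract, is that a few outsized updates carry most of the unwanted "priming," so dropping them preserves learning while limiting spillover.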


Rotten Tomatoes further dilutes its utility with 'Verified Hot' badge

Engadget

Rotten Tomatoes just added a new "Verified Hot" badge, which indicates an overall positive user score and joins the "Certified Fresh" badge for critic scores. To qualify for the designation, a movie or show needs a Verified Audience Score of 90 percent or higher. The dregs, meanwhile, get slapped with a "Stale" badge: any show or movie that falls below 60 percent. Rotten Tomatoes is trying to get around review bombing here by mandating that user reviews come from people who actually saw the movie in question. There are a couple of little problems with this. It verifies that a consumer saw the movie via the ticketing firm Fandango, and there are plenty of other ticketing firms out there, including, you know, the theater cashier.


NeuralCMS: A deep learning approach to study Jupiter's interior

Ziv, Maayan, Galanti, Eli, Sheffer, Amir, Howard, Saburo, Guillot, Tristan, Kaspi, Yohai

arXiv.org Artificial Intelligence

NASA's Juno mission provided exquisite measurements of Jupiter's gravity field that, together with the Galileo entry probe atmospheric measurements, constrain the interior structure of the giant planet. Inferring the range of its possible interior structures remains a challenging inverse problem, involving a computationally intensive search over combinations of various planetary properties, such as the cloud-level temperature, composition, and core features, amounting to ~10^9 interior models. We propose an efficient deep neural network (DNN) model to generate high-precision, wide-ranging interior models based on the very accurate but computationally demanding concentric MacLaurin spheroid (CMS) method. We trained a sharing-based DNN on a large set of CMS results for a four-layer interior model of Jupiter, including a dilute core, to accurately predict the gravity moments and mass given a combination of interior features. We evaluated the performance of the trained DNN (NeuralCMS) to inspect its predictive limitations. NeuralCMS shows very good performance in predicting the gravity moments, with errors comparable to the uncertainty due to differential rotation, and a very accurate mass prediction. This allowed us to perform a broad parameter-space search by computing only ~10^4 actual CMS interior models, resulting in a large sample of plausible interior structures and reducing the computation time by a factor of 10^5. Moreover, we used a DNN explainability algorithm to analyze the impact of the parameters setting the interior model on the predicted observables, providing information on their nonlinear relation.
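The speedup described above comes from a surrogate-screening pattern: scan a huge parameter space with the cheap DNN, then run the expensive CMS model only on candidates the surrogate deems plausible. The sketch below uses toy stand-in functions for both models; everything here (the functions, the tolerance, the parameter ranges) is an illustrative assumption, not the paper's physics.

```python
import numpy as np

rng = np.random.default_rng(0)

def expensive_model(theta):
    # Stand-in for a slow, accurate CMS interior model.
    return np.sin(theta).sum()

def surrogate(theta):
    # Stand-in for the fast DNN surrogate (NeuralCMS-like):
    # same mapping plus small approximation noise.
    return np.sin(theta).sum() + rng.normal(0.0, 0.01)

observed, tolerance = 1.5, 0.1  # hypothetical "gravity moment" target

# Broad search: screen many candidates cheaply, then confirm
# only the survivors with the expensive model.
candidates = rng.uniform(-np.pi, np.pi, size=(20_000, 2))
screened = [t for t in candidates if abs(surrogate(t) - observed) < tolerance]
plausible = [t for t in screened if abs(expensive_model(t) - observed) < tolerance]
```

In the paper's terms, `candidates` plays the role of the ~10^9-model search space and `screened` the ~10^4 models that still need an actual CMS run; the cost reduction is simply the ratio of the two set sizes.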