
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers

Neural Information Processing Systems

Despite several works trying to reduce their computational cost, most LLMs still apply attention between all pairs of tokens in the sequence, incurring a quadratic cost. In this study, we present a novel approach that dynamically prunes contextual information while preserving the model's expressiveness, resulting in reduced memory and computational cost.
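The core idea of attending only to a pruned context can be sketched with a toy top-k attention over past tokens. This is an illustrative stand-in in plain Python, not the paper's learned pruning mechanism: for each query, only the k highest-scoring past tokens are kept, and the rest of the context is dropped before the softmax.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def pruned_causal_attention(q, keys, values, k=2):
    """Attend only to the top-k highest-scoring past tokens for one
    query vector, instead of the full context (toy sketch of dynamic
    context pruning; not the paper's method)."""
    # dot-product scores of the query against every past key
    scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
    # keep the k largest scores; the remaining context is pruned
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in keep])
    # weighted sum over only the surviving value vectors
    dim = len(values[0])
    out = [0.0] * dim
    for w, i in zip(weights, keep):
        for d in range(dim):
            out[d] += w * values[i][d]
    return out
```

With k equal to the context length this reduces to ordinary attention; smaller k trades expressiveness for memory and compute, which is the tension the paper's dynamic mechanism is designed to resolve.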



Noether Embedding: Efficient Learning of Temporal Regularities
Chi Gao

Neural Information Processing Systems

Learning to detect and encode temporal regularities (TRs) in events is a prerequisite for human-like intelligence. These regularities should be formed from limited event samples and stored as easily retrievable representations.



US military used Anthropic's AI model Claude in Venezuela raid, report says

The Guardian

Wall Street Journal says Claude was used in the operation via Anthropic's partnership with Palantir Technologies. Sat 14 Feb 2026 11.15 EST; first published on Sat 14 Feb 2026 10.53 EST.

Claude, the AI model developed by Anthropic, was used by the US military during its operation to kidnap Nicolás Maduro from Venezuela, the Wall Street Journal revealed on Saturday, a high-profile example of how the US defence department is using artificial intelligence in its operations. The US raid on Venezuela involved bombing across the capital, Caracas, and the killing of 83 people, according to Venezuela's defence ministry. Anthropic's terms of use prohibit the use of Claude for violent ends, for the development of weapons or for conducting surveillance. A spokesperson for Anthropic declined to comment on whether Claude was used in the operation, but said any use of the tool was required to comply with its policies.


Language Model Tokenizers Introduce Unfairness Between Languages

Neural Information Processing Systems

Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. Despite this, there are concerns about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tokenization lengths, with differences of up to 15 times in some cases. These disparities persist even for tokenizers that are intentionally trained for multilingual support.
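One simple source of such length disparity can be sketched with a byte-level toy tokenizer in pure Python (not the tokenizers studied in the paper): because UTF-8 encodes Greek letters with two bytes each, a byte-level vocabulary already yields a longer token sequence for the same sentence in Greek than in English. The Greek sentence below is an assumed rough translation, used only for illustration.

```python
def byte_tokens(text: str) -> list:
    """Treat each UTF-8 byte as one token, as byte-level tokenizers do."""
    return list(text.encode("utf-8"))

english = "Hello, how are you today?"
greek = "Γειά σου, πώς είσαι σήμερα;"  # assumed rough Greek translation

# Greek letters take two UTF-8 bytes each, so the byte-token sequence
# for the Greek sentence is nearly twice as long as the English one.
ratio = len(byte_tokens(greek)) / len(byte_tokens(english))
```

Subword tokenizers mitigate this for scripts that are well represented in their training data, but the paper's point is that the disparity persists even for intentionally multilingual tokenizers.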