AITopics | ctx

Frustratingly Easy Test-Time Adaptation of Vision-Language Models Matteo Farina

Neural Information Processing SystemsFeb-18-2026, 13:29:57 GMT

Vision-Language Models seamlessly discriminate among arbitrary semantic categories, yet they still suffer from poor generalization when presented with challenging examples.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Country: Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Compact Proofs of Model Performance via Mechanistic Interpretability

Jason Gross,Rajashree Agrawal,Thomas Kwa,Euan Ong,Chun Hei Yip,Alex Gibson,Soufiane Noubir,Lawrence Chan

Neural Information Processing SystemsFeb-16-2026, 15:52:08 GMT

We propose using mechanistic interpretability – techniques for reverse engineering model weights into human-interpretable algorithms – to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving accuracy lower bounds for a small transformer trained on Max-of-K, validating proof transferability across 151 random seeds and four values of K. We create 102 different computer-assisted proof strategies and assess their length and tightness of bound on each of our models. Using quantitative metrics, we find that shorter proofs seem to require and provide more mechanistic understanding. Moreover, we find that more faithful mechanistic understanding leads to tighter performance bounds. We confirm these connections by qualitatively examining a subset of our proofs. Finally, we identify compounding structureless errors as a key challenge for using mechanistic interpretability to generate compact proofs on model performance.

logic & formal reasoning, machine learning, natural language, (23 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.45)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

e55b33430e344a1ee23710415b1c9d87-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 12:18:05 GMT

dataset, language prototype, similarity matrix, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

876f1f9954de0aa402d91bb988d12cd4-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 17:26:56 GMT

backward process, diffflow, equation, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)

Add feedback

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Matthew Le

Neural Information Processing SystemsFeb-9-2026, 13:44:54 GMT

In particular, V oicebox outperforms the state-of-the-art zero-shot TTS model V ALL-E on both intelligibility (5.9% vs 1.9% word error rates) and audio similarity (0.580 vs 0.681) while being up to 20 times faster.

artificial intelligence, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Matthew Le

Neural Information Processing SystemsFeb-9-2026, 13:44:51 GMT

In particular, V oicebox outperforms the state-of-the-art zero-shot TTS model V ALL-E on both intelligibility (5.9% vs 1.9% word error rates) and audio similarity (0.580 vs 0.681) while being up to 20 times faster.

artificial intelligence, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

The Semiotic Channel Principle: Measuring the Capacity for Meaning in LLM Communication

Picca, Davide

arXiv.org Artificial IntelligenceNov-26-2025

This paper proposes a novel semiotic framework for analyzing Large Language Models (LLMs), conceptualizing them as stochastic semiotic engines whose outputs demand active, asymmetric human interpretation. We formalize the trade-off between expressive richness (semiotic breadth) and interpretive stability (decipherability) using information-theoretic tools. Breadth is quantified as source entropy, and decipherability as the mutual information between messages and human interpretations. We introduce a generative complexity parameter (lambda) that governs this trade-off, as both breadth and decipherability are functions of lambda. The core trade-off is modeled as an emergent property of their distinct responses to $λ$. We define a semiotic channel, parameterized by audience and context, and posit a capacity constraint on meaning transmission, operationally defined as the maximum decipherability by optimizing lambda. This reframing shifts analysis from opaque model internals to observable textual artifacts, enabling empirical measurement of breadth and decipherability. We demonstrate the framework's utility across four key applications: (i) model profiling; (ii) optimizing prompt/context design; (iii) risk analysis based on ambiguity; and (iv) adaptive semiotic systems. We conclude that this capacity-based semiotic approach offers a rigorous, actionable toolkit for understanding, evaluating, and designing LLM-mediated communication.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.1955

Country:

Europe > Switzerland (0.14)
North America > United States (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Frustratingly Easy Test-Time Adaptation of Vision-Language Models Matteo Farina

Neural Information Processing SystemsOct-10-2025, 20:11:35 GMT

Vision-Language Models seamlessly discriminate among arbitrary semantic categories, yet they still suffer from poor generalization when presented with challenging examples.

ctx, dataset, experiment, (12 more...)

Neural Information Processing Systems

Country: Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Compact Proofs of Model Performance via Mechanistic Interpretability

Jason Gross,Rajashree Agrawal,Thomas Kwa,Euan Ong,Chun Hei Yip,Alex Gibson,Soufiane Noubir,Lawrence Chan

Neural Information Processing SystemsOct-10-2025, 09:38:10 GMT

We propose using mechanistic interpretability – techniques for reverse engineering model weights into human-interpretable algorithms – to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving accuracy lower bounds for a small transformer trained on Max-of-K, validating proof transferability across 151 random seeds and four values of K. We create 102 different computer-assisted proof strategies and assess their length and tightness of bound on each of our models. Using quantitative metrics, we find that shorter proofs seem to require and provide more mechanistic understanding. Moreover, we find that more faithful mechanistic understanding leads to tighter performance bounds. We confirm these connections by qualitatively examining a subset of our proofs. Finally, we identify compounding structureless errors as a key challenge for using mechanistic interpretability to generate compact proofs on model performance.

query, sequence, vocab, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.45)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

Peng, Zhiyuan, Wei, Ting-ruen, Song, Tingyu, Zhao, Yilun

arXiv.org Artificial IntelligenceOct-10-2025

Large Language Models (LLMs) have recently been applied to reranking tasks in information retrieval, achieving strong performance. However, their high computational demands often hinder practical deployment. Existing studies evaluate the efficiency of LLM-based rerankers using proxy metrics such as latency, the number of forward passes, input tokens, and output tokens. However, these metrics depend on hardware and running-time choices (\eg parallel or not, batch size, etc), and often fail to account for model size, making it difficult to interpret and obscuring the evaluation of the efficiency-effectiveness tradeoff. To address this issue, we propose \ours\footnote{https://github.com/zhiyuanpeng/EER-FLOPs.} for LLM-based rerankers: RPP (ranking metrics per PetaFLOP), measuring how much ranking quality (e.g., NDCG or MRR) a method achieves per PetaFLOP, and QPP (queries per PetaFLOP), measuring how many queries can be processed per PetaFLOP. Accompanied by the new metrics, an interpretable FLOPs estimator is developed to estimate the FLOPs of an LLM-based reranker even without running any experiments. Based on the proposed metrics, we conduct comprehensive experiments to evaluate a wide range of LLM-based rerankers with different architectures, studying the efficiency-effectiveness trade-off and bringing this issue to the attention of the research community.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.06223

Country:

Europe (0.67)
North America > United States (0.28)
North America > Mexico (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Filters

Collaborating Authors

ctx

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Frustratingly Easy Test-Time Adaptation of Vision-Language Models Matteo Farina

Compact Proofs of Model Performance via Mechanistic Interpretability

e55b33430e344a1ee23710415b1c9d87-Supplemental-Conference.pdf

876f1f9954de0aa402d91bb988d12cd4-Supplemental.pdf

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Matthew Le

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Matthew Le

The Semiotic Channel Principle: Measuring the Capacity for Meaning in LLM Communication

Frustratingly Easy Test-Time Adaptation of Vision-Language Models Matteo Farina

Compact Proofs of Model Performance via Mechanistic Interpretability

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers