Overview
Language Model Tokenizers Introduce Unfairness Between Languages
Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. However, there are concerns about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tokenization lengths, with differences of up to 15 times in some cases. These disparities persist even for tokenizers that are intentionally trained for multilingual support.
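To make the disparity concrete, here is a minimal sketch of how one might measure it, assuming the tiktoken library and its cl100k_base encoding; the sample sentences are rough translations chosen for illustration, not the paper's evaluation data.

```python
# Minimal sketch: compare token counts for the same sentence across
# languages, assuming `tiktoken` and the "cl100k_base" encoding.
# The translations are approximate and purely illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "The quick brown fox jumps over the lazy dog.",
    "German": "Der schnelle braune Fuchs springt über den faulen Hund.",
    "Greek": "Η γρήγορη καφέ αλεπού πηδάει πάνω από τον τεμπέλη σκύλο.",
}

baseline = len(enc.encode(samples["English"]))
for lang, text in samples.items():
    n_tokens = len(enc.encode(text))
    # Report each language's token count as a multiple of the English length.
    print(f"{lang}: {n_tokens} tokens ({n_tokens / baseline:.1f}x English)")
```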
Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities
We investigate safe multi-agent reinforcement learning, where agents seek to collectively maximize an aggregate sum of local objectives while satisfying their own safety constraints. The objective and constraints are described by general utilities, i.e., nonlinear functions of the long-term state-action occupancy measure, which encompass broader decision-making goals such as risk, exploration, or imitation. The exponential growth of the state-action space with the number of agents presents challenges for global observability, further exacerbated by the global coupling arising from agents' safety constraints. To tackle this issue, we propose a primal-dual method utilizing shadow reward and κ-hop neighbor truncation under a form of correlation-decay property, where κ is the communication radius.
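A minimal sketch of the generic primal-dual update that this kind of method builds on, assuming a constrained formulation max_θ f(θ) subject to g(θ) ≥ 0; the functions and step sizes below are illustrative placeholders, not the paper's shadow-reward estimators or κ-hop truncated critics.

```python
# Generic primal-dual step for max_theta f(theta) s.t. g_i(theta) >= 0,
# via the Lagrangian L(theta, lam) = f(theta) + sum_i lam_i * g_i(theta).
# All functions and step sizes are illustrative placeholders.
import numpy as np

def primal_dual_step(theta, lam, grad_f, grad_g, g,
                     eta_theta=1e-2, eta_lam=1e-2):
    """One gradient-ascent step on theta, one projected descent step on lam.

    grad_f: callable, gradient of the objective, shape (d,)
    grad_g: callable, Jacobian of the constraints, shape (m, d)
    g:      callable, constraint values, shape (m,)
    """
    # Primal: ascend the Lagrangian in the policy parameters.
    theta = theta + eta_theta * (grad_f(theta) + grad_g(theta).T @ lam)
    # Dual: descend in the multipliers, projecting onto lam >= 0;
    # lam_i grows while agent i's safety constraint g_i is violated.
    lam = np.maximum(lam - eta_lam * g(theta), 0.0)
    return theta, lam
```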
Appendix A Related Work
A.1 Multimodal Large Language Models
A.2 Trustworthiness of LLMs
A.1 Multimodal Large Language Models
Building on the foundational capabilities of groundbreaking Large Language Models (LLMs) such as GPT [3], PaLM [6], Mistral [49], and LLaMA [108], which excel in language understanding and reasoning, recent innovations have integrated these models with other modalities (especially vision), leading to the development of Multimodal Large Language Models (MLLMs). These advanced MLLMs combine and process visual and textual data, demonstrating enhanced versatility in addressing both traditional vision tasks [21, 40, 42, 133] and complex multimodal challenges [34, 70, 136]. Among MLLMs, proprietary models consistently perform well. OpenAI's GPT-4-Vision [82] pioneered this space by adeptly handling both text and image content. Anthropic's Claude 3 series [7] integrates advanced vision capabilities and multilingual support, broadening its applicability across diverse cognitive and real-time tasks.
Automatic differentiation in ML: Where we are and where we should be going
Bart van Merriënboer, Olivier Breuleux, Arnaud Bergeron, Pascal Lamblin
We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such as operator overloading (OO) and source transformation (ST) used for AD, graph-based intermediate representations for programs, and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR) which specifically aims to efficiently support fully general AD for array programming. Unlike existing dataflow programming representations in ML frameworks, our IR naturally supports function calls, higher-order functions and recursion, making ML models easier to implement. The ability to represent closures allows us to perform AD using ST without a tape, making the resulting derivative (adjoint) program amenable to ahead-of-time optimization using tools from functional language compilers, and enabling higher-order derivatives. Lastly, we introduce a proof-of-concept compiler toolchain called Myia which uses a subset of Python as a front end.
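As a rough illustration of the tape-free idea, the sketch below implements reverse-mode AD in plain Python, with each primitive returning its result together with a closure that maps the output adjoint to input adjoints; the names and structure are illustrative and are not Myia's actual API.

```python
# Minimal sketch of closure-based reverse-mode AD: no runtime tape,
# the adjoint computation is captured in nested closures.
# All names here are illustrative, not Myia's actual API.

def add(x, y):
    # d(x + y)/dx = 1, d(x + y)/dy = 1
    return x + y, lambda adj: (adj, adj)

def mul(x, y):
    # d(x * y)/dx = y, d(x * y)/dy = x
    return x * y, lambda adj: (adj * y, adj * x)

def f(x, y):
    """f(x, y) = (x + y) * y, built from differentiable primitives."""
    s, s_back = add(x, y)
    p, p_back = mul(s, y)

    def backward(adj):
        # Chain adjoints back through the closures, innermost first.
        d_s, d_y2 = p_back(adj)
        d_x, d_y1 = s_back(d_s)
        return d_x, d_y1 + d_y2  # accumulate both paths into dy
    return p, backward

value, backward = f(3.0, 4.0)  # value = (3 + 4) * 4 = 28
dx, dy = backward(1.0)         # dx = y = 4, dy = x + 2*y = 11
```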
Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency
Interpreting time series models is uniquely challenging because it requires identifying both the locations of the time series signals that drive model predictions and matching them to an interpretable temporal pattern. While explainers from other modalities can be applied to time series, their inductive biases do not transfer well to the inherently challenging task of interpreting time series.