AITopics | Statistical Learning

Demographic Parity Constrained Minimax Optimal Regression under Linear Model

Neural Information Processing Systems | Apr-25-2026, 12:55:48 GMT

We explore the minimax optimal error associated with a demographic parity-constrained regression problem within the context of a linear model. Our proposed model encompasses a broader range of discriminatory bias sources than the model presented by Chzhen and Schreuder [6]. Our analysis reveals that the minimax optimal error for the demographic parity-constrained regression problem under our model is characterized by Θ(dM/n), where n denotes the sample size, d the dimensionality, and M the number of demographic groups arising from sensitive attributes. Moreover, we demonstrate that the minimax error grows as the bias present in the model increases.
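As a rough illustration of the constraint named in this abstract, the sketch below measures how far a set of regression predictions is from demographic parity, which asks that the distribution of predictions be the same in every demographic group. The function names `ks_distance` and `demographic_parity_gap` are hypothetical, and using the largest pairwise Kolmogorov-Smirnov distance as the violation measure is an assumption made here for illustration, not something taken from the paper.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Kolmogorov-Smirnov distance between two empirical samples."""
    a, b = sorted(a), sorted(b)
    # The supremum of |F_a(t) - F_b(t)| is attained at one of the sample points.
    return max(
        abs(bisect_right(a, t) / len(a) - bisect_right(b, t) / len(b))
        for t in a + b
    )


def demographic_parity_gap(predictions, groups):
    """Largest pairwise KS distance between group-wise prediction samples.

    A value of 0.0 means the empirical prediction distributions coincide
    across all groups (hypothetical violation measure, see lead-in).
    """
    by_group = {}
    for y_hat, s in zip(predictions, groups):
        by_group.setdefault(s, []).append(y_hat)
    samples = list(by_group.values())
    return max(
        (
            ks_distance(samples[i], samples[j])
            for i in range(len(samples))
            for j in range(i + 1, len(samples))
        ),
        default=0.0,
    )
```

For example, `demographic_parity_gap([0.1, 0.5, 0.9, 0.1, 0.5, 0.9], [0, 0, 0, 1, 1, 1])` evaluates to `0.0`, since both groups receive the same set of predictions.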

artificial intelligence, estimator, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.45)
North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Law > Civil Rights & Constitutional Law (0.67)
Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

3bbca1d243b01b47c2bf42b29a8b265c-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 12:43:35 GMT

machine learning, natural language, xr-transformer, (16 more...)

Neural Information Processing Systems

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.94)
(2 more...)

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

Neural Information Processing Systems | Apr-25-2026, 12:43:31 GMT

Extreme multi-label text classification (XMC) seeks to find relevant labels from an extremely large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging and semantic search. Recently, transformer-based XMC methods, such as X-Transformer and LightXML, have shown significant improvement over other XMC methods. Despite leveraging pre-trained transformer models for text representation, fine-tuning transformer models on a large label space still takes a long time, even with powerful GPUs. In this paper, we propose a novel recursive approach, XR-Transformer, to accelerate the procedure by recursively fine-tuning transformer models on a series of multi-resolution objectives related to the original XMC objective function. Empirical results show that XR-Transformer takes significantly less training time than other transformer-based XMC models while yielding better state-of-the-art results. In particular, on the public Amazon-3M dataset with 3 million labels, XR-Transformer is not only 20x faster than X-Transformer but also improves the Precision@1 from 51% to 54%. Our code is publicly available at https://github.com/amzn/pecos.
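The Precision@1 figure quoted in this abstract is an instance of the standard Precision@k ranking metric: the fraction of the top k predicted labels that are truly relevant, averaged over test inputs. The sketch below shows that computation in plain Python. The function names are hypothetical and this is not code from the PECOS repository.

```python
def precision_at_k(ranked_labels, relevant, k=1):
    """Fraction of the top-k ranked labels that appear in the relevant set."""
    top_k = ranked_labels[:k]
    hits = sum(1 for label in top_k if label in relevant)
    return hits / k


def mean_precision_at_k(all_ranked, all_relevant, k=1):
    """Average Precision@k over a collection of test inputs."""
    scores = [
        precision_at_k(ranked, relevant, k)
        for ranked, relevant in zip(all_ranked, all_relevant)
    ]
    return sum(scores) / len(scores)
```

With `k=1` the metric reduces to the share of inputs whose single highest-ranked label is correct.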

classification, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Industry:

Information Technology > Services (0.48)
Retail > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

lower bound

Neural Information Processing Systems | Apr-25-2026, 12:42:56 GMT

While there remains a small gap between our main lower bound of Theorem 3 and the deterministic quantised gradient descent of Section 6, we can show that the gap cannot be closed by improved deterministic algorithms where the coordinator learns the value of the objective function F(x) in addition to the minimiser x. That is, our quantised gradient descent is the communication-optimal deterministic algorithm for variant (1) for objectives with constant condition number. Recall that in the N-player equality problem over a universe of size d, denoted by EQ_{d,N}, each player i is given an input b_i ∈ {0,1}^d, and the task is to decide whether all players have the same input. It is known [33] that the deterministic communication complexity of EQ_{d,N} is CC(EQ_{d,N}) = Ω(Nd). Theorem 8. Given parameters N, d, ε, β_0 and β = β_0 N satisfying dβ/ε = Ω(1), any deterministic protocol solving (1) for quadratic input functions x ↦ β_0 ‖x − x_0‖_2^2 has communication complexity Ω(Nd log(βd/ε)), if the coordinator is also required to output an estimate r ∈ ℝ for the minimum function value such that […]. Assume Π is a deterministic protocol solving (1) with communication complexity C. We show that Π can then solve N-party equality over a universe of size D = Ω(d log(βd/ε)), implying C = Ω(ND) = Ω(Nd log(βd/ε)). More specifically, let S be the set given by Lemma 2 with δ = (2ε/β)^{1/2}, and let D = ⌈log|S|⌉ = Ω(d log(βd/ε)). Note that since we assume dβ/ε = Ω(1), the set S has at least two elements and D ≥ 1.
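To make the N-player equality problem in this excerpt concrete, the sketch below simulates the trivial deterministic protocol in which every player sends its full d-bit input to a coordinator, for a total of N·d bits. That total has the same order as the deterministic lower bound the excerpt cites for equality. The function name and the assumption of a separate coordinator that receives all messages are illustrative choices, not details taken from the paper.

```python
def trivial_equality_protocol(inputs):
    """Simulate the naive protocol for N-player equality.

    `inputs` holds one bit sequence per player, all of length d.
    Returns (all_equal, bits_sent), where bits_sent counts every bit
    transmitted to the (assumed) coordinator.
    """
    d = len(inputs[0])
    if any(len(bits) != d for bits in inputs):
        raise ValueError("all players must hold inputs of the same length d")

    received = []
    bits_sent = 0
    for bits in inputs:
        received.append(tuple(bits))  # player sends its whole input
        bits_sent += d

    all_equal = all(message == received[0] for message in received)
    return all_equal, bits_sent
```

For four players holding the same 3-bit input, `trivial_equality_protocol([[0, 1, 1]] * 4)` returns `(True, 12)`, that is, N·d = 4·3 bits.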

artificial intelligence, gradient descent, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.57)

3b92d18aa7a6176dd37d372bc2f1eb71-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 12:42:53 GMT

artificial intelligence, communication complexity, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

1b3d005a2cb0e71e698e0b13ac657473-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 12:42:42 GMT

artificial intelligence, machine learning, particle, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(2 more...)

Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees

Neural Information Processing Systems | Apr-25-2026, 12:41:07 GMT

We consider the task of training machine learning models with data-dependent constraints. Such constraints often arise as empirical versions of expected value constraints that enforce fairness or stability goals. We reformulate data-dependent constraints so that they are calibrated: enforcing the reformulated constraints guarantees that their expected value counterparts are satisfied with a user-prescribed probability. The resulting optimization problem is amenable to standard stochastic optimization algorithms, and we demonstrate the efficacy of our method on a fairness-sensitive classification task where we wish to guarantee the classifier's fairness (at test time).

artificial intelligence, constraint, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: