Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification
Among the many ways of quantifying uncertainty in a regression setting, specifying the full quantile function is attractive, as quantiles are amenable to interpretation and evaluation. A model that predicts the true conditional quantiles for each input, at all quantile levels, presents a correct and efficient representation of the underlying uncertainty. To achieve this, many current quantile-based methods focus on optimizing the pinball loss. However, this loss restricts the scope of applicable regression models, limits the ability to target many desirable properties (e.g.
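For context, the pinball (quantile) loss referred to above is standardly defined as rho_tau(y, y_hat) = max(tau * (y - y_hat), (tau - 1) * (y - y_hat)) for a quantile level tau in (0, 1). Below is a minimal NumPy sketch of this standard definition; the function and variable names are illustrative and not taken from the paper.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Standard pinball (quantile) loss at quantile level tau in (0, 1).

    Under-predictions are penalized by tau and over-predictions by (1 - tau),
    so the minimizer is the tau-th conditional quantile.
    """
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# Example: for tau = 0.9, under-shooting is penalized more than over-shooting.
y = np.array([1.0, 2.0, 3.0])
print(pinball_loss(y, y - 0.5, tau=0.9))  # 0.45 (under-prediction)
print(pinball_loss(y, y + 0.5, tau=0.9))  # 0.05 (over-prediction)
```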
DAPE: Data-Adaptive Positional Encoding for Length Extrapolation
Positional encoding plays a crucial role in transformers, significantly impacting model performance and length generalization. Prior research has introduced absolute positional encoding (APE) and relative positional encoding (RPE) to distinguish token positions in a given sequence. However, both APE and RPE remain fixed after training regardless of the input data, which limits their adaptability and flexibility. We therefore argue that the desired positional encoding should be data-adaptive and dynamically adjustable based on the given attention. In this paper, we propose a Data-Adaptive Positional Encoding (DAPE) method that adjusts dynamically and semantically based on the input context and learned fixed priors. Experimental validation on real-world datasets (Arxiv, Books3, and CHE) demonstrates that DAPE improves model performance both at the trained length and in length generalization, with statistically significant improvements. Visualizations suggest that the model retains both local and anti-local information. Finally, we successfully train the model on sequences of length 128 and achieve better performance at an evaluation sequence length of 8192 than other static positional encoding methods, revealing the benefit of adaptive positional encoding.
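As a rough illustration of the general idea described in the abstract (making the positional bias a function of the attention itself rather than a fixed table), here is a hedged PyTorch-style sketch in which a small MLP adjusts a learned static bias using the pre-softmax attention logits. All module names, shapes, and design choices are assumptions for illustration; this is not the authors' released implementation.

```python
import torch
import torch.nn as nn

class DataAdaptiveBias(nn.Module):
    """Illustrative sketch: adjust a static positional bias using the
    attention logits, so the effective bias depends on the input data.
    (Shapes and design choices are assumptions, not the paper's code.)
    """

    def __init__(self, hidden: int = 32):
        super().__init__()
        # Small MLP applied elementwise over (static bias, attention logit) pairs.
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden),
            nn.GELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, attn_logits: torch.Tensor, static_bias: torch.Tensor) -> torch.Tensor:
        # attn_logits: (batch, heads, q_len, k_len) pre-softmax scores
        # static_bias: (heads, q_len, k_len), e.g. an ALiBi-style relative bias
        bias = static_bias.unsqueeze(0).expand_as(attn_logits)
        x = torch.stack([bias, attn_logits], dim=-1)   # (..., 2)
        adaptive_bias = self.mlp(x).squeeze(-1)        # (batch, heads, q_len, k_len)
        return attn_logits + adaptive_bias             # biased scores before softmax

# Example shapes: batch=2, heads=4, sequence length 16.
dape = DataAdaptiveBias()
scores = torch.randn(2, 4, 16, 16)
bias = torch.randn(4, 16, 16)
out = dape(scores, bias)  # (2, 4, 16, 16), ready for softmax
```

In a real attention layer, the returned scores would then be passed through softmax as usual; the point of the sketch is only that the bias term becomes data-dependent.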
Checklist
For all authors...
(a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?

If you used crowdsourcing or conducted research with human subjects...
(a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A]
(b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [N/A]
(c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation?

Proposition 4. For Γ ⊆ H and domains S, T, we have:

Proposition 5 (equivalence between transferability and transfer measures).

Summing over (A.12) and (A.14), we have:

In the proof above, we assumed a classifier h ∈ Γ is allowed to take a garbage value 0 if it is not sure which label to choose. This is a mild assumption that can hold in practice.
We thank the reviewers (R1, R2, R3, R4, and R5) for their thoughtful reviews, and respond to as much as we can given the limited space.
Their Theorem 2.2 gives a ... Given true expected regret, Lemma 2.1 allows one ... It is precisely this quantity which vanilla RegretNet can only approximate but which we can compute. Due to RegretNet's sensitivity to hyperparameters, we believe that reproducing optimal ... These changes might explain the performance differences; we agree with this and will add such a discussion. As such, much of the comparison in Duetting et al. to previous work applies to our technique as well. We will add a brief discussion in Section 1 and a new subsection in Section 2. We will explicitly clarify this assumption as well.
Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data
For classification and regression on tabular data, the dominance of gradient-boosted decision trees (GBDTs) has recently been challenged by deep learning methods that are often much slower and require extensive hyperparameter tuning. We address this discrepancy by introducing (a) RealMLP, an improved multilayer perceptron (MLP), and (b) strong meta-tuned default parameters for GBDTs and RealMLP. We tune RealMLP and the default parameters on a meta-train benchmark with 118 datasets and compare them to hyperparameter-optimized versions on a disjoint meta-test benchmark with 90 datasets, as well as the GBDT-friendly benchmark by Grinsztajn et al. (2022). Our benchmark results on medium-to-large tabular datasets (1K–500K samples) show that RealMLP offers a favorable time-accuracy tradeoff compared to other neural baselines and is competitive with GBDTs in terms of benchmark scores. Moreover, a combination of RealMLP and GBDTs with improved default parameters can achieve excellent results without hyperparameter tuning. Finally, we demonstrate that some of RealMLP's improvements can also considerably improve the performance of TabR with default parameters.
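To make the "tuned defaults vs. per-dataset tuning" comparison concrete, here is a minimal hedged sketch using scikit-learn and synthetic stand-in datasets: a single fixed default configuration is evaluated on datasets it was not tuned on, alongside a per-dataset hyperparameter search. The estimator, parameter values, and grid are assumptions for illustration, not the paper's actual models, defaults, or benchmark.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

# Stand-ins for a "meta-test" benchmark: datasets that were not used when
# the default configuration was chosen (synthetic placeholders here).
datasets = [make_classification(n_samples=2000, n_features=20, random_state=seed)
            for seed in range(3)]

# One fixed, pre-chosen default configuration (values are illustrative only).
tuned_defaults = dict(learning_rate=0.1, max_leaf_nodes=31, l2_regularization=1e-3)

# Per-dataset hyperparameter search as the (much more expensive) comparison.
param_grid = {"learning_rate": [0.03, 0.1, 0.3], "max_leaf_nodes": [15, 31, 63]}

for i, (X, y) in enumerate(datasets):
    default_score = cross_val_score(
        HistGradientBoostingClassifier(**tuned_defaults), X, y, cv=3).mean()
    hpo_score = cross_val_score(
        GridSearchCV(HistGradientBoostingClassifier(), param_grid, cv=3),
        X, y, cv=3).mean()
    print(f"dataset {i}: tuned defaults={default_score:.3f}  per-dataset HPO={hpo_score:.3f}")
```

The design point this illustrates is that the default configuration is fitted per dataset but never searched per dataset, so its cost is a single training run, whereas the search multiplies cost by the grid size times the inner folds.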