AITopics | axis

We present a covariance-aware sampler that improves the quality of pixel-space Diffusion Model (DM) sampling in the few-step regime. We hypothesize that samplers fail in this regime because they rely solely on the predicted mean of the reverse distribution; our solution instead models the reverse-process covariance explicitly. Our method combines Tweedie's formula, used to estimate the covariance, with an efficient, structured Fourier-space decomposition of the covariance matrix. Implemented as an extension of DDIM, our method requires only minimal overhead: one extra Jacobian-Vector Product (JVP) per step. We demonstrate that for pixel-based DMs, our method consistently produces superior samples compared to state-of-the-art second-order samplers (Heun, DPM-Solver++) and the recent aDDIM sampler, at an identical number of function evaluations (NFE).
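
As a toy illustration of the idea in the abstract above: for the noise model $x_t = x_0 + \sigma\varepsilon$, the second-order Tweedie identity gives $\mathrm{Cov}[x_0 \mid x_t] = \sigma^2\,\partial D(x_t)/\partial x_t$, where $D$ is the posterior-mean denoiser, so a covariance-vector product costs one JVP of the denoiser. The sketch below is not the paper's sampler. It assumes a diagonal Gaussian prior so that the denoiser is known in closed form, approximates the JVP by central differences rather than automatic differentiation, and omits the Fourier-space decomposition. All function names are hypothetical.

```python
def make_gaussian_denoiser(prior_var, sigma):
    """Posterior-mean denoiser for x0 ~ N(0, diag(prior_var)), xt = x0 + sigma * eps.

    Assumption: a toy closed-form stand-in for a trained network.
    """
    shrink = [s / (s + sigma ** 2) for s in prior_var]

    def denoise(x):
        return [a * xi for a, xi in zip(shrink, x)]

    return denoise


def jvp_central_difference(f, x, v, eps=1e-4):
    """Approximate the Jacobian-vector product J_f(x) @ v by central differences."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(a - b) / (2 * eps) for a, b in zip(f(x_plus), f(x_minus))]


def posterior_cov_times_v(denoise, x, v, sigma):
    """Cov[x0 | xt] @ v = sigma^2 * (dD/dx @ v), using one JVP of the denoiser."""
    return [sigma ** 2 * j for j in jvp_central_difference(denoise, x, v)]


sigma = 1.0
denoise = make_gaussian_denoiser(prior_var=[1.0, 4.0], sigma=sigma)
cov_v = posterior_cov_times_v(denoise, x=[0.3, -0.7], v=[1.0, 1.0], sigma=sigma)
```

For this Gaussian toy the exact posterior variance is $\sigma^2 s_i / (s_i + \sigma^2)$ per coordinate, where $s_i$ is the prior variance, which the JVP-based product reproduces. A real implementation would replace the finite difference with a forward-mode autodiff call.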

artificial intelligence, machine learning

arXiv.org Machine Learning

2605.1391

Country: Europe > Austria (0.28)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Manokhin Probability Matrix: A Diagnostic Framework for Classifier Probability Quality

Manokhin, Valery

arXiv.org Machine Learning, May-6-2026

The Brier score conflates two distinct properties of probabilistic predictions: reliability (calibration error) and resolution (discriminatory power). We introduce the Manokhin Probability Matrix, a BCG-style two-dimensional diagnostic framework that separates them. Classifiers are placed on a 2x2 grid by Spiegelhalter Z-statistic and AUC-ROC expected rank, then assigned to one of four archetypes: Eagle (good on both axes), Bull (strong discrimination, poor calibration), Sloth (well-calibrated, weak discriminator), and Mole (poor on both). Each archetype carries a distinct prescription. We populate the matrix from a large-scale empirical study spanning 21 classifiers, 5 post-hoc calibrators, and 30 real-world binary classification tasks from the TabArena-v0.1 suite. The assignment is unambiguous. CatBoost, TabICL, EBM, TabPFN, GBC, and Random Forest are Eagles. XGBoost, LightGBM, and HGB are Bulls; Venn-Abers calibration cuts log-loss by 6.5 to 12.6% on Bulls but degrades Eagles by 2.1%. SVM, LR, LDA, and the empirical base-rate predictor are Sloths. MLP, KNN, Naive Bayes, and ExtraTrees are Moles. A theoretical asymmetry follows: no order-preserving post-hoc calibrator can add discriminatory power (Proposition 1), so calibration is the fixable part and discrimination is the hard part. The practical rule is direct: do not optimise aggregate Brier score without first decomposing it; optimise discrimination first, then fix calibration post-hoc. Code and raw experimental data are available at https://github.com/valeman/classifier_calibration.
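
To make the reliability/resolution split in the abstract above concrete, here is a minimal sketch of the standard Murphy decomposition of the Brier score together with Spiegelhalter's Z-statistic. It is not taken from the linked repository: the function names are hypothetical, and forecasts are grouped by their exact value (an assumption that keeps the decomposition exact; continuous forecasts would need binning first).

```python
import math
from collections import defaultdict


def brier_score(p, y):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(y)


def murphy_decomposition(p, y):
    """Return (reliability, resolution, uncertainty).

    Brier = reliability - resolution + uncertainty, exact when forecasts
    are grouped by identical predicted value, as done here.
    """
    n = len(y)
    base_rate = sum(y) / n
    groups = defaultdict(list)
    for pi, yi in zip(p, y):
        groups[pi].append(yi)
    reliability = sum(
        len(ys) * (pk - sum(ys) / len(ys)) ** 2 for pk, ys in groups.items()
    ) / n
    resolution = sum(
        len(ys) * (sum(ys) / len(ys) - base_rate) ** 2 for ys in groups.values()
    ) / n
    uncertainty = base_rate * (1 - base_rate)
    return reliability, resolution, uncertainty


def spiegelhalter_z(p, y):
    """Spiegelhalter's Z: approximately N(0, 1) under perfect calibration.

    Undefined (zero denominator) if every p is exactly 0, 0.5, or 1.
    """
    num = sum((yi - pi) * (1 - 2 * pi) for pi, yi in zip(p, y))
    var = sum((1 - 2 * pi) ** 2 * pi * (1 - pi) for pi in p)
    return num / math.sqrt(var)


# Toy data: two forecast levels, each matching its observed frequency.
p = [0.2] * 5 + [0.8] * 5
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
rel, res, unc = murphy_decomposition(p, y)
bs = brier_score(p, y)
z = spiegelhalter_z(p, y)
```

On this toy data the reliability term is zero and Z is zero, because each forecast level equals its observed frequency; the whole Brier score comes from uncertainty minus resolution.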

artificial intelligence, calibration, machine learning

doi: 10.5281/zenodo.19372589

2605.03816

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

1663fba7b56da1e96bed6e30546a07b0-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 19:14:00 GMT

artificial intelligence, machine learning, trajectory

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)

Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer

Montreuil, Yannis, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang

arXiv.org Machine Learning, Apr-20-2026

Existing multi-expert learning-to-defer surrogates are statistically consistent, yet they can underfit, suppress useful experts, or degrade as the expert pool grows. We trace these failures to a shared architectural choice: casting classes and experts as actions inside one augmented prediction geometry. Consistency governs the population target; it says nothing about how the surrogate distributes gradient mass during training. We analyze five surrogates along both axes and show that each trades a fix on one for a failure on the other. We then introduce a decoupled surrogate that estimates the class posterior with a softmax and each expert utility with an independent sigmoid. It admits an $\mathcal{H}$-consistency bound whose constant is $J$-independent for fixed per-expert weight $\beta = \lambda/J$, and its gradients are free of the amplification, starvation, and coupling pathologies of the augmented family. Experiments on synthetic benchmarks, CIFAR-10, CIFAR-10H, and Covertype confirm that the decoupled surrogate is the only method that avoids amplification under redundancy, preserves rare specialists, and consistently improves over a standalone classifier across all settings.
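
The abstract above does not state the loss in closed form, so the following is only one plausible reading of a "decoupled" surrogate, written as a sketch: a softmax cross-entropy over class logits plus an independent sigmoid binary cross-entropy per expert, each weighted by $\beta = \lambda/J$. Using "expert was correct on this example" as the sigmoid target is an assumption, as are all function and parameter names.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def decoupled_loss(class_logits, expert_logits, y, expert_correct, lam=1.0):
    """Per-example loss: class cross-entropy + beta * sum of per-expert BCE.

    class_logits   : one logit per class (softmax head)
    expert_logits  : one logit per expert (independent sigmoid heads)
    y              : index of the true class
    expert_correct : 0/1 per expert (assumed utility target)
    lam            : total expert weight; beta = lam / J per expert
    """
    num_experts = len(expert_logits)
    beta = lam / num_experts
    class_term = -log_softmax(class_logits)[y]
    # Binary cross-entropy with logits: softplus(u) - c * u
    expert_term = sum(
        softplus(u) - c * u for u, c in zip(expert_logits, expert_correct)
    )
    return class_term + beta * expert_term


loss = decoupled_loss(
    class_logits=[0.0, 0.0],
    expert_logits=[0.0, 0.0],
    y=0,
    expert_correct=[1, 0],
    lam=1.0,
)
```

Because each expert has its own sigmoid head, the gradient with respect to one expert's logit in this sketch does not depend on the class logits or on the other experts' logits, which is the decoupling the abstract describes in words.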

artificial intelligence, machine learning, surrogate

2604.09414

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Probabilistic size-and-shape functional mixed models

Neural Information Processing Systems, Mar-20-2026, 18:43:16 GMT

The reliable recovery and uncertainty quantification of a fixed effect function $\mu$ in a functional mixed model, for modeling population- and object-level variability in noisily observed functional data, is a notoriously challenging task: variations along the $x$ and $y$ axes are confounded with additive measurement error, and cannot in general be disentangled. The question of which properties of $\mu$ can be reliably recovered then becomes important. We demonstrate that it is possible to recover the size-and-shape of a square-integrable $\mu$ under a Bayesian functional mixed model. The size-and-shape of $\mu$ is a geometric property invariant to a family of space-time unitary transformations, viewed as rotations of the Hilbert space, that jointly transform the $x$ and $y$ axes.

artificial intelligence, name change, proceedings

Technology: Information Technology > Artificial Intelligence (0.39)

Compact Language Models via Pruning and Knowledge Distillation

Neural Information Processing Systems, Mar-20-2026, 05:39:56 GMT

Large language models (LLMs) targeting different deployment scales and sizes are currently produced by training each variant from scratch; this is extremely compute-intensive. In this paper, we investigate whether pruning an existing LLM and then re-training it with a fraction (<3%) of the original training data can be a suitable alternative to repeated, full retraining.

artificial intelligence, large language model, natural language

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

d0b67349dd16b83b2cf6167fb4e2be50-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems, Feb-17-2026, 06:05:01 GMT

large language model, machine learning, natural language

Genre: Research Report (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

aec5e2847c5ae90f939ab786774856cc-Paper-Conference.pdf

Neural Information Processing Systems, Feb-16-2026, 13:37:43 GMT

artificial intelligence, inductive learning, machine learning

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > Mexico > Yucatán > Mérida (0.04)
Asia > Pakistan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

SpatialPIN: Enhancing Spatial Reasoning Capabilities

Neural Information Processing Systems, Feb-16-2026, 03:16:37 GMT

To this end, we propose SpatialPIN, a framework that uses progressive prompting and interactions between VLMs and 2D/3D foundation models as a "free lunch" to enhance spatial reasoning capabilities.

artificial intelligence, machine learning, spatial reasoning

Country: