AITopics | pmp

We decompose the Kullback--Leibler generalization error (GE) -- the expected KL divergence from the data distribution to the trained model -- of unsupervised learning into three non-negative components: model error, data bias, and variance. The decomposition is exact for any e-flat model class and follows from two identities of information geometry: the generalized Pythagorean theorem and a dual e-mixture variance identity. As an analytically tractable demonstration, we apply the framework to $ε$-PCA, a regularized principal component analysis in which the empirical covariance is truncated at rank $N_K$ and discarded directions are pinned at a fixed noise floor $ε$. Although rank-constrained $ε$-PCA is not itself e-flat, it admits a technical reformulation with the same total GE on isotropic Gaussian data, under which each component of the decomposition takes closed form. The optimal rank emerges as the cutoff $λ_{\mathrm{cut}}^{*} = ε$ -- the model retains exactly those empirical eigenvalues exceeding the noise floor -- with the cutoff reflecting a marginal-rate balance between model-error gain and data-bias cost. A boundary comparison further yields a three-regime phase diagram -- retain-all, interior, and collapse -- separated by the lower Marchenko--Pastur edge and an analytically computable collapse threshold $ε_{*}(α)$, where $α$ is the dimension-to-sample-size ratio. All claims are verified numerically.
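
The ε-PCA construction described in this abstract (truncate the empirical spectrum at rank $N_K$, pin the discarded directions at the noise floor $ε$) can be sketched in a few lines. This is an illustrative sketch only, not the authors' code. The function names are hypothetical, and it works on a given list of empirical eigenvalues rather than on raw data. It also assumes zero-mean isotropic Gaussian data with variance `sigma2`, in which case the KL divergence from the data distribution to the model reduces to a sum over model eigenvalues.

```python
import math


def epsilon_pca_spectrum(empirical_eigs, n_keep, eps):
    """Keep the n_keep largest empirical eigenvalues; pin the rest at eps."""
    ordered = sorted(empirical_eigs, reverse=True)
    return ordered[:n_keep] + [eps] * (len(ordered) - n_keep)


def rank_from_cutoff(empirical_eigs, eps):
    """Rank implied by the cutoff rule: count eigenvalues exceeding eps."""
    return sum(1 for lam in empirical_eigs if lam > eps)


def kl_isotropic_to_model(sigma2, model_eigs):
    """KL( N(0, sigma2*I) || N(0, Sigma_model) ) from the model eigenvalues.

    Assumption: the data covariance is sigma2 * I, so it commutes with any
    model covariance and the divergence is a sum of per-eigenvalue terms
    0.5 * (r - 1 - ln r), with r = sigma2 / q.
    """
    total = 0.0
    for q in model_eigs:
        r = sigma2 / q
        total += r - 1.0 - math.log(r)
    return 0.5 * total


# Hypothetical empirical eigenvalues and noise floor, for illustration only.
eigs = [2.0, 0.5, 1.2, 0.1]
eps = 0.3
n_keep = rank_from_cutoff(eigs, eps)
model = epsilon_pca_spectrum(eigs, n_keep, eps)
print(n_keep, model, kl_isotropic_to_model(1.0, model))
```

The sketch only evaluates the divergence for one fixed spectrum. It does not reproduce the three-way decomposition or the phase diagram, which depend on the reformulation described in the paper.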

artificial intelligence, decomposition, machine learning, (19 more...)

arXiv.org Machine Learning

2604.1234

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.70)

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

Dinghuai Zhang, Tianyuan Zhang, Yiping Lu, Zhanxing Zhu, Bin Dong

Neural Information Processing Systems, Feb-12-2026, 18:26:20 GMT

Neural Information Processing Systems http://nips.cc/

adversarial training, arxiv preprint arxiv, neural network, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Beijing > Beijing (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

07b1c04a30f798b5506c1ec5acfb9031-Supplemental.pdf

Neural Information Processing Systems, Feb-7-2026, 09:05:44 GMT

full sweep, ising model, pmp, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science (0.67)

07b1c04a30f798b5506c1ec5acfb9031-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 09:05:40 GMT

full sweep, ising model, pmp, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Perturb-and-max-product: Sampling and learning in discrete energy-based models

Neural Information Processing Systems, Dec-23-2025, 17:36:44 GMT

Perturb-and-MAP offers an elegant approach to approximately sample from an energy-based model (EBM) by computing the maximum-a-posteriori (MAP) configuration of a perturbed version of the model. Sampling in turn enables learning. However, this line of research has been hindered by the general intractability of the MAP computation. Very few works venture outside tractable models, and when they do, they use linear programming approaches, which, as we will show, have several limitations.
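
The perturb-then-maximize idea in this abstract can be illustrated on a model small enough to enumerate. The sketch below is not the paper's method: it replaces max-product with brute-force enumeration, and it perturbs every joint configuration with independent Gumbel noise (the Gumbel-max construction), which yields exact samples but is only feasible for a handful of variables. The Ising-style chain, the function names, and the parameter values are all assumptions made for illustration.

```python
import itertools
import math
import random


def log_potential(x, unary, coupling):
    """Unnormalized log-probability of a +/-1 chain (hypothetical toy model)."""
    field = sum(h * xi for h, xi in zip(unary, x))
    pair = sum(x[i] * x[i + 1] for i in range(len(x) - 1))
    return field + coupling * pair


def gumbel(rng):
    """Draw one standard Gumbel variate by inverse-CDF sampling."""
    u = max(rng.random(), 1e-300)  # guard against log(0)
    return -math.log(-math.log(u))


def perturb_and_map_sample(unary, coupling, rng):
    """Perturb each configuration's log-potential, then return the argmax."""
    best_x, best_score = None, -math.inf
    for x in itertools.product((-1, 1), repeat=len(unary)):
        score = log_potential(x, unary, coupling) + gumbel(rng)
        if score > best_score:
            best_x, best_score = x, score
    return best_x


rng = random.Random(0)
samples = [perturb_and_map_sample([5.0, 5.0], 0.0, rng) for _ in range(200)]
print(samples.count((1, 1)), "of", len(samples), "samples are (1, 1)")
```

With a strong positive field on both variables, nearly all samples should land on the all-plus configuration. For larger models the enumeration is intractable, which is the gap the abstract says approximate MAP solvers are meant to fill.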

discrete energy-based model, energy-based model, perturb-and-max-product, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.58)

Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs

Luo, Yingfeng, Xu, Ziqiang, Ouyang, Yuxuan, Yang, Murun, Lin, Dingyang, Chang, Kaiyan, Zheng, Tong, Li, Bei, Feng, Peinan, Du, Quan, Xiao, Tong, Zhu, Jingbo

arXiv.org Artificial Intelligence, Nov-11-2025

Large language models have significantly advanced Multilingual Machine Translation (MMT), yet the broad language coverage, consistent translation quality, and English-centric bias remain open challenges. To address these challenges, we introduce LMT, a suite of Large-scale Multilingual Translation models centered on both Chinese and English, covering 60 languages and 234 translation directions. During development, we identify a previously overlooked phenomenon of directional degeneration, where symmetric multi-way fine-tuning data overemphasize reverse directions (X $\to$ En/Zh), leading to excessive many-to-one mappings and degraded translation quality. We propose Strategic Downsampling, a simple yet effective method to mitigate this degeneration. In addition, we design Parallel Multilingual Prompting (PMP), which leverages typologically related auxiliary languages to enhance cross-lingual transfer. Through rigorous data curation and refined adaptation strategies, LMT achieves SOTA performance among models of comparable language coverage, with our 4B model (LMT-60-4B) surpassing the much larger Aya-101-13B and NLLB-54B models by a substantial margin. We release LMT in four sizes (0.6B/1.7B/4B/8B) to catalyze future research and provide strong baselines for inclusive, scalable, and high-quality MMT (https://github.com/NiuTrans/LMT).

artificial intelligence, natural language, translation, (14 more...)

arXiv.org Artificial Intelligence

2511.07003

Country:

Asia (1.00)
North America > United States (0.28)
North America > Mexico (0.28)
Europe > Austria (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Nek Minit: Harnessing Pragmatic Metacognitive Prompting for Explainable Sarcasm Detection of Australian and Indian English

Singh, Ishmanbir, Srirag, Dipankar, Joshi, Aditya

arXiv.org Artificial Intelligence, Oct-31-2025

Sarcasm is a challenge to sentiment analysis because of the incongruity between stated and implied sentiment. The challenge is exacerbated when the implication may be relevant to a specific country or geographical region. Pragmatic metacognitive prompting (PMP) is a cognition-inspired technique that has been used for pragmatic reasoning. In this paper, we harness PMP for explainable sarcasm detection for Australian and Indian English, alongside a benchmark dataset for standard English. We manually add sarcasm explanations to BESSTIE, an existing sarcasm-labeled dataset for Australian and Indian English, and compare explainable sarcasm detection performance on it with performance on FLUTE, a standard English dataset containing sarcasm explanations. Our approach utilising PMP, when evaluated on two open-weight LLMs (GEMMA and LLAMA), achieves statistically significant performance improvement across all tasks and datasets when compared with four alternative prompting strategies. We also find that alternative techniques such as agentic prompting mitigate context-related failures by enabling external knowledge retrieval. The focused contribution of our work is utilising PMP in generating sarcasm explanations for varieties of English.

explanation, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.15095

Country:

North America > United States (0.68)
Europe (0.68)
Asia > Middle East (0.46)

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (0.67)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

Dinghuai Zhang, Tianyuan Zhang, Yiping Lu, Zhanxing Zhu, Bin Dong

Neural Information Processing Systems, Oct-3-2025, 02:41:59 GMT

Deep learning achieves state-of-the-art results in many tasks in computer vision and natural language processing.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Information-Geometric Decomposition of Generalization Error in Unsupervised Learning
