AITopics | asymmetry

asymmetry

Representation Without Reward: A JEPA Audit for LLM Fine-Tuning

Sengupta, Biswa

arXiv.org Machine Learning, May-18-2026

Joint-embedding predictive architectures (JEPAs) propose that a model should learn more useful abstractions when trained to predict latent representations rather than observed outputs. For autoregressive language-model fine-tuning the principle entails a stricter requirement: the induced hidden-state geometry must reach the language-model head \emph{and} improve the decoded task metric. We test that requirement under a fixed Llama-3.2-1B-Instruct LoRA harness on natural-language-to-regex generation, comparing twenty-two training-time auxiliaries across trajectory-shape regularisation, distributional constraints, predictor/target asymmetry, Fisher-metric Jacobi residuals, and a decoder-visible JEPA objective constructed to lie in cross-entropy's positive cone. The empirical answer is a structured null: several auxiliaries clear single-cell paired $α = 0.10$ without correction (T3-Local at $Δ = +2.53$~pp, $p = 0.003$ being the strongest), but none survives Bonferroni or Holm--Bonferroni at the relevant family-wise threshold, even though many change curvature, anisotropy, variance, and gradient direction. Decoder-visible JEPA yields the first positive auxiliary--cross-entropy gradient cosine in the study, yet exact match remains inside seed noise; a full-fine-tuning replication of the same auxiliary at $n = 5$ seeds reproduces the null on both benchmarks (TURK: $Δ = +0.04$~pp, $p_{\text{paired}} = 0.96$; SYNTH: $Δ = +0.52$~pp, $p_{\text{paired}} = 0.28$), so the null is robust across LoRA and full fine-tuning for the decoder-visible construction. Hidden-state representation work and decoded-task accuracy are therefore weakly coupled in this regime; we accordingly reframe LLM-domain JEPA evaluation as a coupling problem, in which the operative question is under which metrics useful hidden geometry becomes decoder-visible task signal.
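
The Holm-Bonferroni step-down correction mentioned in the abstract can be sketched as follows. This is a minimal, generic illustration and not the paper's evaluation code: the function name `holm_bonferroni` is hypothetical, and the p-values in the usage line are made up for illustration rather than taken from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-down procedure: sort p-values ascending and compare the k-th
    smallest (k = 0, 1, ...) against alpha / (m - k); stop at the first
    failure and keep every remaining hypothesis.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Illustrative p-values only (not from the paper).
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005], alpha=0.05)
# decisions == [True, False, False, True]
```

Because the thresholds shrink with the number of tests, a result that clears an uncorrected per-test level can still fail once the whole family of comparisons is taken into account.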

large language model, machine learning, natural language, (21 more...)

arXiv:2605.15394

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Sequential Minimal Optimization for $\varepsilon$-SVR with MAPE Loss and Sample-Dependent Box Constraints

Benavides-Herrera, Pablo; Ruiz-Cruz, Riemann; Sánchez-Torres, Juan Diego

arXiv.org Machine Learning, May-11-2026

We derive a Sequential Minimal Optimization (SMO) algorithm for the quadratic dual problem arising from $\varepsilon$-SVR~\cite{Vapnik1995, Drucker1997, Smola2004} modified to minimize the Mean Absolute Percentage Error (MAPE)~\cite{Makridakis1993, Hyndman2006} directly in the loss function~\cite{benavides2025support}. This formulation is part of a broader family of SVR models with percentage-error losses that also includes least-squares variants~\cite{Suykens2002} and symmetric-kernel extensions~\cite{Espinoza2005}, whose unified structure is studied in~\cite{benavides2026unified}. The key structural difference from standard $\varepsilon$-SVR is that the box constraints become \emph{sample-dependent}: $α_k, α_k^* \in [0,\, 100C/y_k]$. We show that this modification affects only (i) the feasibility sets $I_{\mathrm{up}}$ and $I_{\mathrm{down}}$ in the working-set selection and (ii) the clipping bounds in the analytic two-variable update, while leaving the curvature formula and gradient update structurally identical to the standard SMO~\cite{Platt1998, Platt1999, Fan2005}. A shrinking heuristic adapted to the sample-dependent bounds is derived and shown to introduce an asymmetry between $α$- and $α^*$-variables controlled by the gap $2y_k\varepsilon/100$. The same solver applies to the symmetric-kernel variant (m2) by replacing $Ω$ with $Ω_s = \tfrac{1}{2}(Ω + aΩ^*)$~\cite{Espinoza2005}. Numerical validation against an interior-point QP reference solver confirms solution agreement to within solver termination tolerance across ten synthetic configurations spanning both kernel variants and symmetry types. An implementation is available in the open-source \texttt{psvr} R package~\cite{BenavidesHerrera2026Rpsvr}.
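
To make the sample-dependent box constraints concrete, here is a minimal sketch of how per-sample bounds $100C/y_k$ might be computed and used to clip a two-variable step. It is not the paper's algorithm or the `psvr` API. It assumes strictly positive targets and works with a single difference variable `beta_k` (standing in for $α_k - α_k^*$) confined to `[-u_k, u_k]`, which is a simplification introduced here. The names `box_upper_bounds` and `clip_step` are hypothetical.

```python
def box_upper_bounds(y, C):
    """Per-sample upper bounds u_k = 100 * C / y_k.

    Assumption of this sketch: every target y_k is strictly positive.
    """
    if any(v <= 0 for v in y):
        raise ValueError("this sketch assumes strictly positive targets")
    return [100.0 * C / v for v in y]


def clip_step(beta_i, beta_j, u_i, u_j, t):
    """Clip a step t so that both updated variables stay inside their boxes.

    The update is beta_i + t and beta_j - t, which keeps beta_i + beta_j
    (and hence an equality constraint on the sum) unchanged.
    """
    lo = max(-u_i - beta_i, beta_j - u_j)  # lower limit on t
    hi = min(u_i - beta_i, beta_j + u_j)   # upper limit on t
    return min(max(t, lo), hi)


# Illustrative values only: a larger target gets a tighter box.
u = box_upper_bounds([50.0, 200.0], C=1.0)       # [2.0, 0.5]
t = clip_step(0.0, 0.0, u[0], u[1], t=1.0)       # limited by the tighter box
```

In a standard SMO every variable shares the same upper bound `C`; here the only change is that each variable's clipping limit depends on its own target.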

artificial intelligence, constraint, machine learning, (16 more...)

arXiv:2605.01446

Country:

Europe (0.28)
North America > United States (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.67)

Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases

Neural Information Processing Systems, Apr-25-2026, 11:39:57 GMT

Visual search is a ubiquitous and often challenging daily task, exemplified by looking for the car keys at home or a friend in a crowd. An intriguing property of some classical search tasks is an asymmetry such that finding a target A among distractors B can be easier than finding B among A. To elucidate the mechanisms responsible for asymmetry in visual search, we propose a computational model that takes a target and a search image as inputs and produces a sequence of eye movements until the target is found.

artificial intelligence, information management, machine learning, (19 more...)

Country: Asia > India (0.28)

Genre: Research Report (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Information Management > Search (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

On scalable oversight with weak LLMs judging strong LLMs

Neural Information Processing Systems, Mar-21-2026, 12:04:07 GMT

Scalable oversight protocols aim to enable humans to accurately supervise superhuman AI. In this paper we study debate, where two AIs compete to convince a judge; consultancy, where a single AI tries to convince a judge that asks questions; and compare both to a baseline of direct question-answering, where the judge just answers outright without the AI. We use large language models (LLMs) as both AI agents and as stand-ins for human judges, taking the judge models to be weaker than agent models. We benchmark on a diverse range of asymmetries between judges and agents, extending previous work on a single extractive QA task with information asymmetry, to also include mathematics, coding, logic and multimodal reasoning asymmetries. We find that debate outperforms consultancy across all tasks when the consultant is randomly assigned to argue for the correct/incorrect answer. Comparing debate to direct question answering, the results depend on the type of task: in extractive QA tasks with information asymmetry debate outperforms direct question answering, but in other tasks without information asymmetry the results are mixed. Previous work assigned debaters/consultants an answer to argue for. When we allow them to instead choose which answer to argue for, we find judges are less frequently convinced by the wrong answer in debate than in consultancy. Further, we find that stronger debater models increase judge accuracy, though more modestly than in previous studies.

artificial intelligence, large language model, natural language, (8 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)

Causal vs. Anticausal merging of predictors

Neural Information Processing Systems, Mar-17-2026, 19:57:04 GMT

We study the differences arising from merging predictors in the causal and anticausal directions using the same data. In particular we study the asymmetries that arise in a simple model where we merge the predictors using one binary variable as target and two continuous variables as predictors. We use Causal Maximum Entropy (CMAXENT) as inductive bias to merge the predictors; however, we expect similar differences to hold also when we use other merging methods that take into account asymmetries between cause and effect. We show that if we observe all bivariate distributions, the CMAXENT solution reduces to a logistic regression in the causal direction and Linear Discriminant Analysis (LDA) in the anticausal direction. Furthermore, we study how the decision boundaries of these two solutions differ whenever we observe only some of the bivariate distributions, with implications for Out-Of-Variable (OOV) generalisation.
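
As standard textbook background rather than the paper's derivation, the two model families named in the abstract can be written side by side. Logistic regression fits the conditional of the target directly, while LDA models the class-conditional densities of the predictors. With a shared covariance (an assumption of this sketch) LDA's posterior log-odds are also linear in $x$. The symbols $w$, $b$, $\mu_y$, $\Sigma$, and $\pi_y$ are introduced here for illustration only.

```latex
\begin{align*}
\text{Logistic regression:}\quad
  & P(Y=1 \mid x) = \sigma(w^\top x + b),
    \qquad \sigma(z) = \frac{1}{1 + e^{-z}}, \\
\text{LDA:}\quad
  & x \mid Y=y \sim \mathcal{N}(\mu_y, \Sigma),
    \qquad P(Y=y) = \pi_y, \\
  & \log\frac{P(Y=1 \mid x)}{P(Y=0 \mid x)}
    = (\mu_1 - \mu_0)^\top \Sigma^{-1} x
      - \tfrac{1}{2}\left(\mu_1^\top \Sigma^{-1} \mu_1
                          - \mu_0^\top \Sigma^{-1} \mu_0\right)
      + \log\frac{\pi_1}{\pi_0}.
\end{align*}
```

Both give a linear decision boundary, but the parameters are estimated from different factorizations of the joint distribution, so the fitted boundaries generally differ.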

artificial intelligence, machine learning, proceedings, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

e28e19d00b23fe0265f433fa05a96b06-Paper-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 11:04:56 GMT

correlation, hemisphere, language model, (15 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Lifted Weighted Mini-Bucket

Nicholas Gallo, Alexander T. Ihler

Neural Information Processing Systems, Feb-14-2026, 09:42:22 GMT

Many applications require computing likelihoods and marginal probabilities over a distribution defined by a graphical model, tasks which are intractable in general [24].

artificial intelligence, inference, inference term, (17 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Causal Inference via Kernel Deviance Measures

Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh

Neural Information Processing Systems, Feb-13-2026, 06:19:07 GMT

artificial intelligence, conditional distribution, machine learning, (18 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding

Neural Information Processing Systems, Feb-11-2026, 06:16:39 GMT

In this paper, we first theoretically show that the transitive relations can be modeled with projections. We then propose the Rot-Pro model which combines the projection and relational rotation together. We prove that Rot-Pro can infer all the above relation patterns.

artificial intelligence, natural language, relation, (14 more...)

Country: