AITopics | equivalently


equivalently


Robust Testing in High-Dimensional Sparse Models

Neural Information Processing Systems | Feb-9-2026, 13:45:51 GMT

In the first model, we are given n i.i.d.

artificial intelligence, linear regression model, machine learning, (15 more...)


Country: North America > United States > California > Santa Clara County > Stanford (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.33)


300891a62162b960cf02ce3827bb363c-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-7-2026, 23:52:49 GMT

certification, empirical evidence, estimator, (15 more...)


Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence (0.30)


Locally-Adaptive Nonparametric Online Learning: Supplementary Material

Neural Information Processing Systems | Feb-7-2026, 13:05:01 GMT

In case of generic convex losses, we use the more complex parameterless algorithm AdaNormalHedge. The following theorem states a slightly more general bound that holds for any $\eta$-exp-concave loss function (for completeness, the proof is given in Appendix D). Now note that although the algorithm is actually initialized with $w_{1,i} = 1$, Lemma 1 shows that the regret remains the same if we assume the algorithm is initialized with wE1. Suppose that Algorithm 5 is run using predictions and updates provided by AdaNormalHedge. As in our locally-adaptive setting node experts are local learners, $\hat{y}_{i,t}$ should be viewed as the prediction of the local online learning algorithm sitting at node $i$ of the tree.
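As a rough illustration of the exp-concavity condition mentioned in this excerpt: a loss f is η-exp-concave when exp(-η f) is concave in the prediction. The sketch below spot-checks this numerically for the squared loss with predictions and labels in [0, 1], where η = 1/2 is known to work. The helper names (`squared_loss`, `is_midpoint_concave`, `looks_exp_concave`) and the grid are assumptions made for illustration; they do not come from the paper, and a grid check is evidence rather than a proof.

```python
import math

def squared_loss(pred: float, y: float) -> float:
    """Squared loss between a prediction and a label."""
    return (pred - y) ** 2

def is_midpoint_concave(g, grid, tol: float = 1e-12) -> bool:
    """Spot-check concavity: g at each midpoint must be at least the average of the endpoints."""
    return all(
        g((a + b) / 2) >= (g(a) + g(b)) / 2 - tol
        for a in grid
        for b in grid
    )

def looks_exp_concave(eta: float, y: float, grid) -> bool:
    """Check on a grid whether p -> exp(-eta * loss(p, y)) passes the midpoint test."""
    return is_midpoint_concave(lambda p: math.exp(-eta * squared_loss(p, y)), grid)

# Hypothetical grid of predictions in [0, 1].
grid = [i / 50 for i in range(51)]

if __name__ == "__main__":
    for eta in (0.5, 4.0):
        print(eta, all(looks_exp_concave(eta, y, grid) for y in (0.0, 0.5, 1.0)))
```

With η = 1/2 the check passes for every label tried, while a much larger η such as 4 fails it, which matches the fact that exp-concavity constrains how large η can be for a given loss and domain.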

artificial intelligence, leavesk, machine learning, (19 more...)


Country: Europe > Italy (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.67)


051f3997af1dd65da8e14397b6a72f8e-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 07:07:12 GMT

cstnd, h-consistency, hinge, (17 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


A Logic for Expressing Log-Precision Transformers

Neural Information Processing Systems | Dec-26-2025, 12:10:55 GMT

One way to interpret the reasoning power of transformer-based language models is to describe the types of logical rules they can resolve over some input text. Recently, Chiang et al. (2023) showed that finite-precision transformer classifiers can be equivalently expressed in a generalization of first-order logic. However, finite-precision transformers are a weak transformer variant because, as we show, a single head can only attend to a constant number of tokens and, in particular, cannot represent uniform attention. Since attending broadly is a core capability for transformers, we ask whether a minimally more expressive model that can attend universally can also be characterized in logic. To this end, we analyze transformers whose forward pass is computed in $\log n$ precision on contexts of length $n$. We prove any log-precision transformer classifier can be equivalently expressed as a first-order logic sentence that, in addition to standard universal and existential quantifiers, may also contain majority-vote quantifiers. This is the tightest known upper bound and first logical characterization of log-precision transformers.
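To make the idea of a majority-vote quantifier concrete, here is a minimal sketch that evaluates small sentences over the positions of an input string. It assumes the convention that "M i. P(i)" holds when P is true for at least half of the positions; other texts use a strict majority, so treat that threshold as an assumption. The function names and example sentences are hypothetical and are not taken from the paper.

```python
from typing import Callable

Predicate = Callable[[int], bool]

def exists(n: int, pred: Predicate) -> bool:
    """Standard existential quantifier over positions 0..n-1."""
    return any(pred(i) for i in range(n))

def majority(n: int, pred: Predicate) -> bool:
    """Majority quantifier. Assumed convention: pred holds for at least half of the positions."""
    return 2 * sum(1 for i in range(n) if pred(i)) >= n

def mostly_a(w: str) -> bool:
    """Sentence 'M i. a(i)': at least half of the tokens are 'a'."""
    return majority(len(w), lambda i: w[i] == "a")

def mostly_a_and_some_b(w: str) -> bool:
    """Sentence '(M i. a(i)) and (exists j. b(j))'."""
    n = len(w)
    return majority(n, lambda i: w[i] == "a") and exists(n, lambda j: w[j] == "b")

if __name__ == "__main__":
    for w in ("aab", "abb", "aaa"):
        print(w, mostly_a(w), mostly_a_and_some_b(w))
```

Counting how many positions satisfy a predicate is exactly what a purely existential or universal quantifier cannot express, which gives some intuition for why a model that can attend uniformly over its context calls for the extra quantifier.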

expressing log-precision transformer, name change, proceedings, (5 more...)


Technology:

Information Technology > Artificial Intelligence > Natural Language (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)


300891a62162b960cf02ce3827bb363c-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-2-2025, 14:23:28 GMT

artificial intelligence, certification, estimator, (17 more...)


Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence (0.30)


Contents of Appendix

Neural Information Processing Systems | Oct-1-2025, 21:25:49 GMT

Then, for any permutation σ of the set {1, ..., c}, a ( c 1) If there is a tie, we pick the label with the highest index under the natural ordering of labels. Since f is non-decreasing, for any t 0, f(t) 1/2.

artificial intelligence, hinge, machine learning, (19 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)


Evaluation and Comparison Semantics for ODRL

Salas, Jaime Osvaldo, Pareti, Paolo, Yumuşak, Semih, Gheisari, Soulmaz, Ibáñez, Luis-Daniel, Konstantinidis, George

arXiv.org Artificial Intelligence | Sep-9-2025

We consider the problem of evaluating and comparing computational policies in the Open Digital Rights Language (ODRL), which has become the de facto standard for governing the access and usage of digital resources. Although preliminary progress has been made on the formal specification of the language's features, a comprehensive formal semantics of ODRL is still missing. In this paper, we provide a simple and intuitive formal semantics for ODRL that is based on query answering. Our semantics refines previous formalisations and is aligned with the latest published specification of the language (2.2). Building on our evaluation semantics, and motivated by data sharing scenarios, we also define and study the problem of comparing two policies, detecting equivalent, more restrictive or more permissive policies.

artificial intelligence, logic & formal reasoning, natural language, (17 more...)

arXiv: 2509.05139

Country: Europe > United Kingdom (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Security & Privacy (0.89)
Information Technology > Artificial Intelligence > Natural Language (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.88)


Learning and Generalization with Mixture Data

Vardhan, Harsh, Ghosh, Avishek, Mazumdar, Arya

arXiv.org Machine Learning | Apr-30-2025

In many, if not most, machine learning applications the training data is naturally heterogeneous (e.g., federated learning, adversarial attacks and domain adaptation in neural net training). Data heterogeneity is identified as one of the major challenges in modern-day large-scale learning. A classical way to represent heterogeneous data is via a mixture model. In this paper, we study generalization performance and statistical rates when data is sampled from a mixture distribution. We first characterize the heterogeneity of the mixture in terms of the pairwise total variation distance of the sub-population distributions. Thereafter, as a central theme of this paper, we characterize the range where the mixture may be treated as a single (homogeneous) distribution for learning. In particular, we study the generalization performance under the classical PAC framework and the statistical error rates for parametric (linear regression, mixture of hyperplanes) as well as non-parametric (Lipschitz, convex and Hölder-smooth) regression problems. To do this, we obtain Rademacher complexity and (local) Gaussian complexity bounds with mixture data, and apply them to get the generalization and convergence rates respectively. We observe that as the (regression) function classes get more complex, the requirement on the pairwise total variation distance becomes more stringent, which matches our intuition. We also do a finer analysis for the case of mixed linear regression and provide a tight bound on the generalization error in terms of heterogeneity.
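The pairwise total variation distance used above as a heterogeneity measure can be sketched for discrete distributions as follows. This is a minimal illustration, assuming each sub-population is a probability vector over the same finite support; the three example components are made up, and summarizing heterogeneity by the maximum pairwise distance is an assumption of this sketch, not necessarily the quantity the paper works with.

```python
from itertools import combinations
from typing import Sequence

def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two discrete distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def max_pairwise_tv(components: Sequence[Sequence[float]]) -> float:
    """Largest total variation distance over all pairs of mixture components."""
    return max(total_variation(p, q) for p, q in combinations(components, 2))

# Hypothetical sub-population distributions over three outcomes.
components = [
    [0.5, 0.5, 0.0],
    [0.4, 0.5, 0.1],
    [0.5, 0.4, 0.1],
]

if __name__ == "__main__":
    print(max_pairwise_tv(components))
```

A small value means the components are close to one another, which is the regime in which the abstract says the mixture may be treated as a single homogeneous distribution.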

artificial intelligence, complexity, machine learning, (19 more...)

arXiv: 2504.20651

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.76)


Distributional autoencoders know the score

Leban, Andrej

arXiv.org Machine Learning | Feb-17-2025

This work presents novel and desirable properties of a recently introduced class of autoencoders -- the Distributional Principal Autoencoder (DPA) -- that combines distributionally correct reconstruction with principal components-like interpretability of the encodings. First, we show that the level sets of the encoder orient themselves exactly with regard to the score of the data distribution. This both explains the method's often remarkable performance in disentangling the factors of variation of the data and opens up possibilities of recovering its distribution while having access to samples only. In settings where the score itself has physical meaning -- such as when the data obey the Boltzmann distribution -- we demonstrate that the method can recover scientifically important quantities such as the minimum free energy path. Second, we show that if the data lie on a manifold that can be approximated by the encoder, the optimal encoder's components beyond the dimension of the manifold will carry absolutely no additional information about the data distribution. This promises new ways of determining the number of relevant dimensions of the data beyond common heuristics such as the scree plot. Finally, the fact that the method is learning the score means that it could have promise as a generative model, potentially rivaling approaches such as diffusion, which similarly attempt to approximate the score of the data distribution.
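For readers unfamiliar with the term, the score of a distribution is the gradient of its log-density with respect to the data point. The sketch below illustrates only that definition for a one-dimensional Gaussian, where the score has the closed form -(x - mu) / sigma^2, and compares it against a finite-difference estimate. It is not an implementation of the DPA; the function names and parameter values are assumptions chosen for illustration.

```python
import math

def gaussian_log_density(x: float, mu: float, sigma: float) -> float:
    """Log-density of a one-dimensional Gaussian N(mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def gaussian_score(x: float, mu: float, sigma: float) -> float:
    """Closed-form score: derivative of the log-density with respect to x."""
    return -(x - mu) / sigma ** 2

def numerical_score(log_density, x: float, h: float = 1e-5) -> float:
    """Central finite-difference estimate of the score of an arbitrary log-density."""
    return (log_density(x + h) - log_density(x - h)) / (2 * h)

if __name__ == "__main__":
    mu, sigma, x = 0.0, 2.0, 1.5  # hypothetical values
    exact = gaussian_score(x, mu, sigma)
    approx = numerical_score(lambda t: gaussian_log_density(t, mu, sigma), x)
    print(exact, approx)
```

The score points toward regions of higher density, which is why methods that learn it, such as diffusion models, can be used to generate samples.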

artificial intelligence, encoder, machine learning, (17 more...)

arXiv: 2502.11583

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
