AITopics | srank

Collaborating Authors

srank

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix for " Beyond the Signs: Nonparametric Tensor Completion via Sign Series "

Neural Information Processing SystemsFeb-10-2026, 20:35:38 GMT

The appendix consists of proofs (Section A), additional theoretical results (Section B), and numerical experiments (Section C). When g is strictly increasing, the mapping x7 g(x) is sign preserving. Specifically, if x 0, then g(x) g(0) = 0. Conversely, ifg(x) 0 = g(0), then applying g 1 to both sides givesx 0. When g is strictly decreasing, the mappingx7 g(x) is sign reversing. See Section B.2 for constructive examples. Based on the definition of classification lossL(,), the function Risk() relies only on the sign pattern of the tensor.

artificial intelligence, machine learning, sgn, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

BeyondtheSigns: NonparametricTensor CompletionviaSignSeries

Neural Information Processing SystemsFeb-10-2026, 20:35:35 GMT

A nonparametric approach to tensor completion is developed based on anewmodel which we coin assign representable tensors. The model represents the signal tensor of interest using a series of structured sign tensors. Unlike earlier methods, the sign series representation effectively addresses both low-andhigh-rank signals, while encompassing manyexisting tensor models-- includingCPmodels,Tuckermodels,singleindexmodels,structuredtensorswith repeating entries--as special cases. We provably reduce the tensor estimation problem to a series of structured classification tasks, and we develop a learning reduction machinery to empower existing low-rank tensor algorithms for more challenging high-rank estimation.

artificial intelligence, machine learning, tensor, (17 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Appendix for " Beyond the Signs: Nonparametric Tensor Completion via Sign Series "

Neural Information Processing SystemsAug-17-2025, 01:12:00 GMT

See Section B.2 for constructive examples.Proof of Proposition 2. Based on (3) in Proposition 2, we have Risk( Z) Risk( Θ) = E null |sgnZ sgn Θ|| Θ|null . We divide the proof into two cases: α > 0 and α = . The inequality (6) now becomes Risk( Z) Risk( Θ) t null MAE(sgn Θ, sgnZ) C snull, for all 0 t < ρ(π, N) . Consider the same setup as in Theorem 2. Fix The conclusion (10) then directly follows by applying Remark A.1 to (11). 3 Proof of Theorem 2. To simplify the notation, we denote ρ = ρ(π, N). It follows from Kosorok (2007, Theorem 9.22) that the Proof of Theorem 3. By definition of ˆ Θ, we have MAE( ˆ Θ, Θ) = E null null null null null 1 2H + 1 null Assumption A.1, we establish the estimation accuracy guarantee for the large-margin estimators H log H. (29) In particualr, setting H null (1 + |N|) To apply Theorem A.1, we choose the pair ( L Here, we describe the details of the example set-up.

artificial intelligence, machine learning, sgn, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Unraveling Zeroth-Order Optimization through the Lens of Low-Dimensional Structured Perturbations

Park, Sihwan, Yun, Jihun, Kim, SungYub, Kundu, Souvik, Yang, Eunho

arXiv.org Artificial IntelligenceJan-31-2025

Zeroth-order (ZO) optimization has emerged as a promising alternative to gradient-based backpropagation methods, particularly for black-box optimization and large language model (LLM) fine-tuning. However, ZO methods suffer from slow convergence due to high-variance stochastic gradient estimators. While structured perturbations, such as sparsity and low-rank constraints, have been explored to mitigate these issues, their effectiveness remains highly under-explored. In this work, we develop a unified theoretical framework that analyzes both the convergence and generalization properties of ZO optimization under structured perturbations. We show that high dimensionality is the primary bottleneck and introduce the notions of \textit{stable rank} and \textit{effective overlap} to explain how structured perturbations reduce gradient noise and accelerate convergence. Using the uniform stability under our framework, we then provide the first theoretical justification for why these perturbations enhance generalization. Additionally, through empirical analysis, we identify that \textbf{block coordinate descent} (BCD) to be an effective structured perturbation method. Extensive experiments show that, compared to existing alternatives, memory-efficient ZO (MeZO) with BCD (\textit{MeZO-BCD}) can provide improved converge with a faster wall-clock time/iteration by up to $\times\textbf{2.09}$ while yielding similar or better accuracy.

machine learning, natural language, optimization, (18 more...)

arXiv.org Artificial Intelligence

2501.19099

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Add feedback

Embedding Ontologies in the Description Logic ALC by Axis-Aligned Cones

Özcep, Özgür Lütfü (University of Lübeck) | Leemhuis, Mena | Wolter, Diedrich

Journal of Artificial Intelligence ResearchOct-23-2023

This paper is concerned with knowledge graph embedding with background knowledge, taking the formal perspective of logics. In knowledge graph embedding, knowledge— expressed as a set of triples of the form (a R b) (“a is R-related to b”)—is embedded into a real-valued vector space. The embedding helps exploiting geometrical regularities of the space in order to tackle typical inductive tasks of machine learning such as link prediction. Recent embedding approaches also consider incorporating background knowledge, in which the intended meanings of the symbols a, R, b are further constrained via axioms of a theory. Of particular interest are theories expressed in a formal language with a neat semantics and a good balance between expressivity and feasibility. In that case, the knowledge graph together with the background can be considered to be an ontology. This paper develops a cone-based theory for embedding in order to advance the expressivity of the ontology: it works (at least) with ontologies expressed in the description logic ALC, which comprises restricted existential and universal quantifiers, as well as concept negation and concept disjunction. In order to align the classical Tarskian Style semantics for ALC with the sub-symbolic representation of triples, we use the notion of a geometric model of an ALC ontology and show, as one of our main results, that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a geometric model based on cones. The geometric model, if treated as a partial model, can even be chosen to be faithful, i.e., to reflect all and only the knowledge captured by the ontology. We introduce the class of axis-aligned cones and show that modulo simple geometric operations any distributive logic (such as ALC) interpreted over cones employs this class of cones. Cones are also attractive from a machine learning perspective on knowledge graph embeddings since they give rise to applying conic optimization techniques.

algebraic atom, ontology, srank, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13939

AI Access Foundation

13939

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

Approximate Real Symmetric Tensor Rank

Ergür, Alperen A., Bueno, Jesus Rebollo, Valettas, Petros

arXiv.org Artificial IntelligenceAug-17-2023

We investigate the effect of an $\varepsilon$-room of perturbation tolerance on symmetric tensor decomposition. To be more precise, suppose a real symmetric $d$-tensor $f$, a norm $||.||$ on the space of symmetric $d$-tensors, and $\varepsilon >0$ are given. What is the smallest symmetric tensor rank in the $\varepsilon$-neighborhood of $f$? In other words, what is the symmetric tensor rank of $f$ after a clever $\varepsilon$-perturbation? We prove two theorems and develop three corresponding algorithms that give constructive upper bounds for this question. With expository goals in mind; we present probabilistic and convex geometric ideas behind our results, reproduce some known results, and point out open problems.

artificial intelligence, machine learning, tensor, (18 more...)

arXiv.org Artificial Intelligence

2207.12529

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

KDEformer: Accelerating Transformers via Kernel Density Estimation

Zandieh, Amir, Han, Insu, Daliri, Majid, Karbasi, Amin

arXiv.org Artificial IntelligenceJun-29-2023

Dot-product attention mechanism plays a crucial role in modern deep architectures (e.g., Transformer) for sequence modeling, however, na\"ive exact computation of this model incurs quadratic time and memory complexities in sequence length, hindering the training of long-sequence models. Critical bottlenecks are due to the computation of partition functions in the denominator of softmax function as well as the multiplication of the softmax matrix with the matrix of values. Our key observation is that the former can be reduced to a variant of the kernel density estimation (KDE) problem, and an efficient KDE solver can be further utilized to accelerate the latter via subsampling-based fast matrix products. Our proposed KDEformer can approximate the attention in sub-quadratic time with provable spectral norm bounds, while all prior results merely provide entry-wise error bounds. Empirically, we verify that KDEformer outperforms other attention approximations in terms of accuracy, memory, and runtime on various pre-trained models. On BigGAN image generation, we achieve better generative scores than the exact computation with over $4\times$ speedup. For ImageNet classification with T2T-ViT, KDEformer shows over $18\times$ speedup while the accuracy drop is less than $0.5\%$.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2302.02451

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Afghanistan > Parwan Province > Charikar (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Explicit and Implicit Semantic Ranking Framework

Zhu, Xiaofeng, Lin, Thomas, Anand, Vishal, Calderwood, Matthew, Clausen-Brown, Eric, Lueck, Gord, Yim, Wen-wai, Wu, Cheng

arXiv.org Artificial IntelligenceApr-10-2023

The core challenge in numerous real-world applications is to match an inquiry to the best document from a mutable and finite set of candidates. Existing industry solutions, especially latency-constrained services, often rely on similarity algorithms that sacrifice quality for speed. In this paper we introduce a generic semantic learning-to-rank framework, Self-training Semantic Cross-attention Ranking (sRank). This transformer-based framework uses linear pairwise loss with mutable training batch sizes and achieves quality gains and high efficiency, and has been applied effectively to show gains on two industry tasks at Microsoft over real-world large-scale data sets: Smart Reply (SR) and Ambient Clinical Intelligence (ACI). In Smart Reply, $sRank$ assists live customers with technical support by selecting the best reply from predefined solutions based on consumer and support agent messages. It achieves 11.7% gain in offline top-one accuracy on the SR task over the previous system, and has enabled 38.7% time reduction in composing messages in telemetry recorded since its general release in January 2021. In the ACI task, sRank selects relevant historical physician templates that serve as guidance for a text summarization model to generate higher quality medical notes. It achieves 35.5% top-one accuracy gain, along with 46% relative ROUGE-L gain in generated medical notes.

machine learning, natural language, template, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3543873.3584621

2304.04918

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Redmond (0.06)
North America > United States > District of Columbia > Washington (0.05)
(3 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

A Generalized Bootstrap Target for Value-Learning, Efficiently Combining Value and Feature Predictions

GX-Chen, Anthony, Chelu, Veronica, Richards, Blake A., Pineau, Joelle

arXiv.org Artificial IntelligenceJan-5-2022

Estimating value functions is a core component of reinforcement learning algorithms. Temporal difference (TD) learning algorithms use bootstrapping, i.e. they update the value function toward a learning target using value estimates at subsequent time-steps. Alternatively, the value function can be updated toward a learning target constructed by separately predicting successor features (SF)--a policy-dependent model--and linearly combining them with instantaneous rewards. We focus on bootstrapping targets used when estimating value functions, and propose a new backup target, the $\eta$-return mixture, which implicitly combines value-predictive knowledge (used by TD methods) with (successor) feature-predictive knowledge--with a parameter $\eta$ capturing how much to rely on each. We illustrate that incorporating predictive knowledge through an $\eta\gamma$-discounted SF model makes more efficient use of sampled experience, compared to either extreme, i.e. bootstrapping entirely on the value function estimate, or bootstrapping on the product of separately estimated successor features and instantaneous reward models. We empirically show this approach leads to faster policy evaluation and better control performance, for tabular and nonlinear function approximations, indicating scalability and generality.

equation, successor feature, value function, (12 more...)

arXiv.org Artificial Intelligence

2201.01836

Country:

North America > United States (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Nonparametric Trace Regression in High Dimensions via Sign Series Representation

Lee, Chanwoo, Li, Lexin, Zhang, Hao Helen, Wang, Miaoyan

arXiv.org Machine LearningMay-4-2021

Learning of matrix-valued data has recently surged in a range of scientific and business applications. Trace regression is a widely used method to model effects of matrix predictors and has shown great success in matrix learning. However, nearly all existing trace regression solutions rely on two assumptions: (i) a known functional form of the conditional mean, and (ii) a global low-rank structure in the entire range of the regression function, both of which may be violated in practice. In this article, we relax these assumptions by developing a general framework for nonparametric trace regression models via structured sign series representations of high dimensional functions. The new model embraces both linear and nonlinear trace effects, and enjoys rank invariance to order-preserving transformations of the response. In the context of matrix completion, our framework leads to a substantially richer model based on what we coin as the "sign rank" of a matrix. We show that the sign series can be statistically characterized by weighted classification tasks. Based on this connection, we propose a learning reduction approach to learn the regression model via a series of classifiers, and develop a parallelable computation algorithm to implement sign series aggregations. We establish the excess risk bounds, estimation error rates, and sample complexities. Our proposal provides a broad nonparametric paradigm to many important matrix learning problems, including matrix regression, matrix completion, multi-task learning, and compressed sensing. We demonstrate the advantages of our method through simulations and two applications, one on brain connectivity study and the other on high-rank image completion.

bayes, matrix, regression, (16 more...)

arXiv.org Machine Learning

2105.01783

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Montana (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback