AITopics | mach

mach

Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products

Tharun Kumar Reddy Medini, Qixuan Huang, Yiqiu Wang, Vijai Mohan, Anshumali Shrivastava

Neural Information Processing Systems | Feb-12-2026, 11:37:36 GMT

Neural Information Processing Systems http://nips.cc/

classifier, dataset, mach, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

69cd21a0e0b7d5f05dc88a0be36950c7-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-12-2026, 11:37:21 GMT

classification repository, information retrieval, positive comment, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.74)

Simultaneous P

Neural Information Processing Systems | Feb-7-2026, 08:27:00 GMT

Personalized years, influencing influencing matching personalized userpreferenceideal [1].

artificial intelligence, conf, inproc, (15 more...)

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.05)
(15 more...)

Technology: Information Technology > Artificial Intelligence (0.47)

Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products

Neural Information Processing Systems | Dec-25-2025, 12:28:02 GMT

In the last decade, it has been shown that many hard AI tasks, especially in NLP, can be naturally modeled as extreme classification problems leading to improved precision. However, such models are prohibitively expensive to train due to the memory bottleneck in the last layer. For example, a reasonable softmax layer for the dataset of interest in this paper can easily reach well beyond 100 billion parameters (> 400 GB memory). To alleviate this problem, we present Merged-Average Classifiers via Hashing (MACH), a generic $K$-classification algorithm where memory provably scales at $O(\log K)$ without any assumption on the relation between classes. MACH is subtly a count-min sketch structure in disguise, which uses universal hashing to reduce classification with a large number of classes to few embarrassingly parallel and independent classification tasks with a small (constant) number of classes.
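The reduction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the seeded random tables standing in for universal hash functions, the toy sizes `K`, `B`, `R`, and the idealized "perfectly confident" classifiers are all assumptions. It shows only the decoding step, where each of `R` small `B`-way classifiers votes on the bucket and a class is scored by the average probability of the buckets it hashes to.

```python
import numpy as np

def make_hashes(num_classes, num_buckets, num_repetitions, seed=0):
    """One random class-to-bucket table per repetition.

    A seeded random table stands in for the universal hash functions
    mentioned in the abstract (an assumption made for illustration).
    """
    rng = np.random.default_rng(seed)
    return rng.integers(0, num_buckets, size=(num_repetitions, num_classes))

def mach_scores(bucket_probs, hashes):
    """Score each class by averaging the probabilities of its buckets.

    bucket_probs: (R, B) outputs of R small B-way classifiers for one input.
    hashes:       (R, K) bucket index of every class in every repetition.
    Returns a length-K score vector.
    """
    rows = np.arange(hashes.shape[0])[:, None]
    return bucket_probs[rows, hashes].mean(axis=0)

# Hypothetical sizes, far smaller than the 50M-product setting.
K, B, R = 1000, 32, 8
hashes = make_hashes(K, B, R)

# Pretend every small classifier is perfectly confident about the bucket
# that holds the true class (no training is done in this sketch).
true_class = 123
bucket_probs = np.zeros((R, B))
bucket_probs[np.arange(R), hashes[:, true_class]] = 1.0

scores = mach_scores(bucket_probs, hashes)
predicted = int(np.argmax(scores))

# Output units needed: K for a full softmax, B * R for the hashed models.
print(predicted, K, B * R)
```

Under this reading, the output layer needs `B * R` units instead of `K`. With `B` held constant, the abstract's $O(\log K)$ memory claim corresponds to `R` growing logarithmically in `K`. The 400 GB figure quoted above is consistent with 100 billion parameters stored at 4 bytes each.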

amazon search, count-min sketch, extreme classification, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Transformed $\ell_1$ Regularizations for Robust Principal Component Analysis: Toward a Fine-Grained Understanding

Zhao, Kun, Zhang, Haoke, Wang, Jiayi, Lou, Yifei

arXiv.org Machine Learning | Oct-7-2025

Robust Principal Component Analysis (RPCA) aims to recover a low-rank structure from noisy, partially observed data that is also corrupted by sparse, potentially large-magnitude outliers. Traditional RPCA models rely on convex relaxations, such as nuclear norm and $\ell_1$ norm, to approximate the rank of a matrix and the $\ell_0$ functional (the number of non-zero elements) of another. In this work, we advocate a nonconvex regularization method, referred to as transformed $\ell_1$ (TL1), to improve both approximations. The rationale is that by varying the internal parameter of TL1, its behavior asymptotically approaches either $\ell_0$ or $\ell_1$. Since the rank is equal to the number of non-zero singular values and the nuclear norm is defined as their sum, applying TL1 to the singular values can approximate either the rank or the nuclear norm, depending on its internal parameter. We conduct a fine-grained theoretical analysis of statistical convergence rates, measured in the Frobenius norm, for both the low-rank and sparse components under general sampling schemes. These rates are comparable to those of the classical RPCA model based on the nuclear norm and $\ell_1$ norm. Moreover, we establish constant-order upper bounds on the estimated rank of the low-rank component and the cardinality of the sparse component in the regime where TL1 behaves like $\ell_0$, assuming that the respective matrices are exactly low-rank and exactly sparse. Extensive numerical experiments on synthetic data and real-world applications demonstrate that the proposed approach achieves higher accuracy than the classic convex model, especially under non-uniform sampling schemes.
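The interpolation between $\ell_0$ and $\ell_1$ can be checked numerically. The sketch below assumes the commonly used TL1 form $\rho_a(x) = (a+1)|x| / (a+|x|)$ with $a > 0$; the abstract does not state the formula, so this parametrization, the function names, and the test matrix are assumptions. Applied to singular values, a small `a` should approximate the rank and a large `a` the nuclear norm.

```python
import numpy as np

def tl1(x, a):
    """Transformed l1 penalty (a + 1)|x| / (a + |x|), elementwise, for a > 0."""
    ax = np.abs(x)
    return (a + 1.0) * ax / (a + ax)

def tl1_of_singular_values(M, a):
    """Sum of the TL1 penalty over the singular values of M."""
    s = np.linalg.svd(M, compute_uv=False)
    return float(tl1(s, a).sum())

x = np.array([0.0, 0.5, 3.0])
near_l0 = tl1(x, 1e-8)   # close to the indicator of x != 0
near_l1 = tl1(x, 1e8)    # close to |x|

# A hypothetical 3x3 matrix with singular values 5, 2, 0.
M = np.diag([5.0, 2.0, 0.0])
approx_rank = tl1_of_singular_values(M, 1e-8)
approx_nuclear = tl1_of_singular_values(M, 1e8)
print(near_l0, near_l1, approx_rank, approx_nuclear)
```

For this matrix the two regimes should land near 2 (the rank) and near 7 (the nuclear norm), which is the behavior the abstract describes.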

learn, matrix, theorem 3, (15 more...)

arXiv.org Machine Learning

2510.03624

Country:

North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > North Carolina (0.04)

Genre: Research Report (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products

Neural Information Processing Systems | Oct-10-2024, 05:33:45 GMT

In the last decade, it has been shown that many hard AI tasks, especially in NLP, can be naturally modeled as extreme classification problems leading to improved precision. However, such models are prohibitively expensive to train due to the memory bottleneck in the last layer. For example, a reasonable softmax layer for the dataset of interest in this paper can easily reach well beyond 100 billion parameters (> 400 GB memory). To alleviate this problem, we present Merged-Average Classifiers via Hashing (MACH), a generic $K$-classification algorithm where memory provably scales at $O(\log K)$ without any assumption on the relation between classes. MACH is subtly a count-min sketch structure in disguise, which uses universal hashing to reduce classification with a large number of classes to few embarrassingly parallel and independent classification tasks with a small (constant) number of classes.

amazon search, count-min sketch, extreme classification, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

COD: Learning Conditional Invariant Representation for Domain Adaptation Regression

Yang, Hao-Ran, Ren, Chuan-Xian, Luo, You-Wei

arXiv.org Artificial Intelligence | Aug-13-2024

Aiming to generalize the label knowledge from a source domain with continuous outputs to an unlabeled target domain, Domain Adaptation Regression (DAR) is developed for complex practical learning problems. However, due to the continuity problem in regression, existing conditional distribution alignment theory and methods with a discrete prior, which are proven to be effective in classification settings, are no longer applicable. In this work, focusing on the feasibility problems in DAR, we establish the sufficiency theory for the regression model, which shows that the generalization error can be sufficiently dominated by the cross-domain conditional discrepancy. Further, to characterize the conditional discrepancy with a continuous conditioning variable, a novel Conditional Operator Discrepancy (COD) is proposed, which admits the metric property on conditional distributions via the kernel embedding theory. Finally, to minimize the discrepancy, a COD-based conditional invariant representation learning model is proposed, and the reformulation is derived to show that reasonable modifications on moment statistics can further improve the discriminability of the adaptation model. Extensive experiments on standard DAR datasets verify the validity of the theoretical results and the method's superiority over state-of-the-art DAR methods.

adaptation, domain adaptation, regression, (13 more...)

arXiv.org Artificial Intelligence

2408.06638

Country:

Asia > Middle East > Jordan (0.05)
Asia > Middle East > Syria > Daraa Governorate > Dar'a (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.64)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis

Zhao, Mingyang, Jiang, Jingen, Ma, Lei, Xin, Shiqing, Meng, Gaofeng, Yan, Dong-Ming

arXiv.org Artificial Intelligence | Jun-26-2024

This paper presents a novel non-rigid point set registration method that is inspired by unsupervised clustering analysis. Unlike previous approaches that treat the source and target point sets as separate entities, we develop a holistic framework where they are formulated as clustering centroids and clustering members, respectively. We then adopt Tikhonov regularization with an $\ell_1$-induced Laplacian kernel instead of the commonly used Gaussian kernel to ensure smooth and more robust displacement fields. Our formulation delivers closed-form solutions, theoretical guarantees, independence from dimensions, and the ability to handle large deformations. Subsequently, we introduce a clustering-improved Nyström method to effectively reduce the computational complexity and storage of the Gram matrix to linear, while providing a rigorous bound for the low-rank approximation. Our method achieves highly accurate results across various scenarios and surpasses competitors by a significant margin, particularly on shapes with substantial deformations. Additionally, we demonstrate the versatility of our method in challenging tasks such as shape transfer and medical registration.
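The linear-storage idea can be illustrated with a plain Nyström approximation of a Laplacian-kernel Gram matrix. This sketch is not the paper's clustering-improved variant: it samples landmarks uniformly at random. The kernel form $\exp(-\gamma \lVert x - y \rVert_1)$, the function names, and the sizes are assumptions. Only the two factors `C` (size n by m) and `W_pinv` (size m by m) need to be stored, rather than the full n by n matrix.

```python
import numpy as np

def laplacian_kernel(X, Y, gamma=1.0):
    """Kernel matrix exp(-gamma * ||x - y||_1) between rows of X and rows of Y."""
    l1_dist = np.abs(X[:, None, :] - Y[None, :, :]).sum(axis=2)
    return np.exp(-gamma * l1_dist)

def nystrom_factors(X, landmark_idx, gamma=1.0):
    """Return (C, W_pinv) such that the Gram matrix is approximated by C @ W_pinv @ C.T."""
    landmarks = X[landmark_idx]
    C = laplacian_kernel(X, landmarks, gamma)          # n x m
    W = laplacian_kernel(landmarks, landmarks, gamma)  # m x m
    return C, np.linalg.pinv(W)

# Hypothetical toy data: 40 points in 3D, 10 uniformly sampled landmarks.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
landmark_idx = rng.choice(len(X), size=10, replace=False)

C, W_pinv = nystrom_factors(X, landmark_idx)
K_approx = C @ W_pinv @ C.T   # formed here only to inspect the approximation
K_full = laplacian_kernel(X, X)
print(C.shape, W_pinv.shape, np.linalg.norm(K_full - K_approx))
```

The approximation reproduces the landmark rows and columns of the Gram matrix and becomes exact when every point is a landmark. Choosing landmarks by clustering, as the abstract describes, is one way to reduce the error for a fixed number of landmarks.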

comput, deformation, registration, (13 more...)

arXiv.org Artificial Intelligence

2406.18817

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

From Discrete to Continuous: Deep Fair Clustering With Transferable Representations

Zhang, Xiang

arXiv.org Artificial Intelligence | Mar-24-2024

We consider the problem of deep fair clustering, which partitions data into clusters via the representations extracted by deep neural networks while hiding sensitive data attributes. To achieve fairness, existing methods present a variety of fairness-related objective functions based on the group fairness criterion. However, these works typically assume that the sensitive attributes are discrete and do not work for continuous sensitive variables, such as the proportion of the female population in an area. Besides, the potential of the representations learned from clustering tasks to improve performance on other tasks is ignored by existing works. In light of these limitations, we propose a flexible deep fair clustering method that can handle discrete and continuous sensitive attributes simultaneously. Specifically, we design an information bottleneck style objective function to learn fair and clustering-friendly representations. Furthermore, we explore for the first time the transferability of the extracted representations to other downstream tasks. Unlike existing works, we impose fairness at the representation level, which could guarantee fairness for the transferred task regardless of clustering results. To verify the effectiveness of the proposed method, we perform extensive experiments on datasets with discrete and continuous sensitive attributes, demonstrating the advantage of our method in comparison with state-of-the-art methods.

fairness, information, representation, (15 more...)

arXiv.org Artificial Intelligence

2403.16201

Country:

North America > United States (0.15)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

On the (Non-)existence of Convex, Calibrated Surrogate Losses for Ranking

Neural Information Processing Systems | Mar-14-2024, 08:01:52 GMT

We study surrogate losses for learning to rank, in a framework where the rankings are induced by scores and the task is to learn the scoring function. We focus on the calibration of surrogate losses with respect to a ranking evaluation metric, where the calibration is equivalent to the guarantee that near-optimal values of the surrogate risk imply near-optimal values of the risk defined by the evaluation metric. We prove that if a surrogate loss is a convex function of the scores, then it is not calibrated with respect to two evaluation metrics widely used for search engine evaluation, namely the Average Precision and the Expected Reciprocal Rank. We also show that such convex surrogate losses cannot be calibrated with respect to the Pairwise Disagreement, an evaluation metric used when learning from pairwise preferences. Our results shed light on the intrinsic difficulty of some ranking problems, as well as on the limitations of learning-to-rank algorithms based on the minimization of a convex surrogate risk.
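For readers unfamiliar with the two search metrics named above, the following sketch computes them for a ranking induced by scores. It uses the standard textbook definitions, which are assumed here rather than taken from the paper: Average Precision with binary relevance, and Expected Reciprocal Rank with stopping probability $(2^g - 1) / 2^{g_{\max}}$ for grade $g$. The function names and the toy data are hypothetical.

```python
import numpy as np

def ranking_from_scores(scores):
    """Item indices sorted by decreasing score (the ranking the scores induce)."""
    return np.argsort(-np.asarray(scores, dtype=float), kind="stable")

def average_precision(scores, relevant):
    """Mean of precision@k taken at the positions of the relevant items."""
    rel = np.asarray(relevant, dtype=float)[ranking_from_scores(scores)]
    if rel.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    return float((precision_at_k * rel).sum() / rel.sum())

def expected_reciprocal_rank(scores, grades, max_grade):
    """Expected value of 1/rank at which a user stops, under the cascade model."""
    ordered = np.asarray(grades, dtype=float)[ranking_from_scores(scores)]
    stop_prob = (2.0 ** ordered - 1.0) / 2.0 ** max_grade
    err, p_reach = 0.0, 1.0
    for rank, p in enumerate(stop_prob, start=1):
        err += p_reach * p / rank   # user reaches this rank and stops here
        p_reach *= 1.0 - p          # user continues past this rank
    return float(err)

# Toy example: four items, the first and last are relevant.
scores = [0.9, 0.2, 0.5, 0.1]
relevant = [1, 0, 0, 1]
ap = average_precision(scores, relevant)                         # 0.75
err = expected_reciprocal_rank(scores, relevant, max_grade=1)    # 0.5625
print(ap, err)
```

Both metrics depend on the scores only through the sorted order, so they are piecewise constant in the scores. That is why learning-to-rank methods optimize a surrogate loss instead, which is the setting the abstract studies.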

arg sort, evaluation metric, surrogate loss, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
