AITopics

2510.10432

Country: North America (0.46)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Soroosh Shafieezadeh Abadeh, Peyman Mohajerin Mohajerin Esfahani, Daniel Kuhn

Distributionally Robust Logistic Regression

Neural Information Processing SystemsOct-2-2025, 13:47:39 GMT

This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown data-generating distribution with high confidence. We then formulate a distribution-ally robust logistic regression model that minimizes a worst-case expected logloss function, where the worst case is taken over all distributions in the Wasserstein ball. We prove that this optimization problem admits a tractable reformulation and encapsulates the classical as well as the popular regularized logistic regression problems as special cases. We further propose a distributionally robust approach based on Wasserstein balls to compute upper and lower confidence bounds on the misclassification probability of the resulting classifier. These bounds are given by the optimal values of two highly tractable linear programs.

logistic regression, logloss, regression, (14 more...)

Country: Europe > Switzerland > Vaud > Lausanne (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Artificial IntelligenceJun-17-2025

Hierarchical Group-wise Ranking Framework for Recommendation Models

Yan, YaChen, Li, Liubo, Choudhary, Ravi

In modern recommender systems, CTR/CVR models are increasingly trained with ranking objectives to improve item ranking quality. While this shift aligns training more closely with serving goals, most existing methods rely on in-batch negative sampling, which predominantly surfaces easy negatives. This limits the model's ability to capture fine-grained user preferences and weakens overall ranking performance. To address this, we propose a Hierarchical Group-wise Ranking Framework with two key components. First, we apply residual vector quantization to user embeddings to generate hierarchical user codes that partition users into hierarchical, trie-structured clusters. Second, we apply listwise ranking losses to user-item pairs at each level of the hierarchy, where shallow levels group loosely similar users and deeper levels group highly similar users, reinforcing learning-to-rank signals through progressively harder negatives. Since users with similar preferences and content exposure tend to yield more informative negatives, applying ranking losses within these hierarchical user groups serves as an effective approximation of hard negative mining. Our approach improves ranking performance without requiring complex real-time context collection or retrieval infrastructure. Extensive experiments demonstrate that the proposed framework consistently enhances both model calibration and ranking accuracy, offering a scalable and practical solution for industrial recommender systems.

artificial intelligence, machine learning, ranking loss, (15 more...)

2506.12756

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing SystemsFeb-9-2025, 21:09:36 GMT

Robust Classification Under Sample Selection Bias

Anqi Liu, Brian Ziebart

In many important machine learning applications, the source distribution used to estimate a probabilistic classifier differs from the target distribution on which the classifier will be used to make predictions. Due to its asymptotic properties, sample reweighted empirical loss minimization is a commonly employed technique to deal with this difference. However, given finite amounts of labeled source data, this technique suffers from significant estimation errors in settings with large sample selection bias. We develop a framework for learning a robust bias-aware (RBA) probabilistic classifier that adapts to different sample selection biases using a minimax estimation formulation. Our approach requires only accurate estimates of statistics under the source distribution and is otherwise as robust as possible to unknown properties of the conditional label distribution, except when explicit generalization assumptions are incorporated. We demonstrate the behavior and effectiveness of our approach on binary classification tasks.

artificial intelligence, machine learning, target distribution, (17 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.31)
Research Report > Experimental Study (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Berta, Eugène, Holzmüller, David, Jordan, Michael I., Bach, Francis

Rethinking Early Stopping: Refine, Then Calibrate

arXiv.org Artificial IntelligenceJan-31-2025

Machine learning classifiers often produce probabilistic predictions that are critical for accurate and interpretable decision-making in various domains. The quality of these predictions is generally evaluated with proper losses like cross-entropy, which decompose into two components: calibration error assesses general under/overconfidence, while refinement error measures the ability to distinguish different classes. In this paper, we provide theoretical and empirical evidence that these two errors are not minimized simultaneously during training. Selecting the best training epoch based on validation loss thus leads to a compromise point that is suboptimal for both calibration error and, most importantly, refinement error. To address this, we introduce a new metric for early stopping and hyperparameter tuning that makes it possible to minimize refinement error during training. The calibration error is minimized after training, using standard techniques. Our method integrates seamlessly with any architecture and consistently improves performance across diverse classification tasks.

artificial intelligence, calibration, machine learning, (16 more...)

2501.19195

Country:

Europe > France (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial IntelligenceJun-4-2024

Wukong: Towards a Scaling Law for Large-Scale Recommendation

Zhang, Buyun, Luo, Liang, Chen, Yuxin, Nie, Jade, Liu, Xi, Guo, Daifeng, Zhao, Yanli, Li, Shen, Hao, Yuchen, Yao, Yantao, Lakshminarayanan, Guna, Wen, Ellie Dingqiao, Park, Jongsoo, Naumov, Maxim, Chen, Wenlin

Scaling laws play an instrumental role in the sustainable improvement in model quality. Unfortunately, recommendation models to date do not exhibit such laws similar to those observed in the domain of large language models, due to the inefficiencies of their upscaling mechanisms. This limitation poses significant challenges in adapting these models to increasingly more complex real-world datasets. In this paper, we propose an effective network architecture based purely on stacked factorization machines, and a synergistic upscaling strategy, collectively dubbed Wukong, to establish a scaling law in the domain of recommendation. Wukong's unique design makes it possible to capture diverse, any-order of interactions simply through taller and wider layers. We conducted extensive evaluations on six public datasets, and our results demonstrate that Wukong consistently outperforms state-of-the-art models quality-wise. Further, we assessed Wukong's scalability on an internal, large-scale dataset. The results show that Wukong retains its superiority in quality over state-of-the-art models, while holding the scaling law across two orders of magnitude in model complexity, extending beyond 100 GFLOP/example, where prior arts fall short.

dataset, interaction, wukong, (15 more...)

2403.02545

Country:

Europe > Austria > Vienna (0.14)
Europe > Spain > Galicia > Madrid (0.04)
Oceania > Australia > Queensland (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial IntelligenceMay-1-2024

Transformer-Based Self-Supervised Learning for Histopathological Classification of Ischemic Stroke Clot Origin

Yeh, K., Jabal, M. S., Gupta, V., Kallmes, D. F., Brinjikji, W., Erdal, B. S.

Background and Purpose: Identifying the thromboembolism source in ischemic stroke is crucial for treatment and secondary prevention yet is often undetermined. This study describes a self-supervised deep learning approach in digital pathology of emboli for classifying ischemic stroke clot origin from histopathological images. Methods: The dataset included whole slide images (WSI) from the STRIP AI Kaggle challenge, consisting of retrieved clots from ischemic stroke patients following mechanical thrombectomy. Transformer-based deep learning models were developed using transfer learning and self-supervised pretraining for classifying WSI. Customizations included an attention pooling layer, weighted loss function, and threshold optimization. Various model architectures were tested and compared, and model performances were primarily evaluated using weighted logarithmic loss. Results: The model achieved a logloss score of 0.662 in cross-validation and 0.659 on the test set. Different model backbones were compared, with the swin_large_patch4_window12_384 showed higher performance. Thresholding techniques for clot origin classification were employed to balance false positives and negatives. Conclusion: The study demonstrates the extent of efficacy of transformer-based deep learning models in identifying ischemic stroke clot origins from histopathological images and emphasizes the need for refined modeling techniques specifically adapted to thrombi WSI. Further research is needed to improve model performance, interpretability, validate its effectiveness. Future enhancement could include integrating larger patient cohorts, advanced preprocessing strategies, and exploring ensemble multimodal methods for enhanced diagnostic accuracy.

classification, in-domain mocov3 pretraining, learning, (9 more...)

2405.00908

Country: North America > United States > Minnesota > Olmsted County > Rochester (0.05)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsMar-13-2024, 13:06:17 GMT

Robust Classification Under Sample Selection Bias

prediction, sample selection bia, target distribution, (15 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.31)
Research Report > Experimental Study (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Neural Information Processing SystemsMar-13-2024, 03:30:12 GMT

Distributionally Robust Logistic Regression

This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown datagenerating distribution with high confidence. We then formulate a distributionally robust logistic regression model that minimizes a worst-case expected logloss function, where the worst case is taken over all distributions in the Wasserstein ball. We prove that this optimization problem admits a tractable reformulation and encapsulates the classical as well as the popular regularized logistic regression problems as special cases. We further propose a distributionally robust approach based on Wasserstein balls to compute upper and lower confidence bounds on the misclassification probability of the resulting classifier. These bounds are given by the optimal values of two highly tractable linear programs.

artificial intelligence, logistic regression, machine learning, (15 more...)

Country: Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Li, Zhibin, Koniusz, Piotr, Zhang, Lu, Pagendam, Daniel Edward, Moghadam, Peyman

Exploiting Field Dependencies for Learning on Categorical Data

arXiv.org Artificial IntelligenceJul-18-2023

Traditional approaches for learning on categorical data underexploit the dependencies between columns (\aka fields) in a dataset because they rely on the embedding of data points driven alone by the classification/regression loss. In contrast, we propose a novel method for learning on categorical data with the goal of exploiting dependencies between fields. Instead of modelling statistics of features globally (i.e., by the covariance matrix of features), we learn a global field dependency matrix that captures dependencies between fields and then we refine the global field dependency matrix at the instance-wise level with different weights (so-called local dependency modelling) w.r.t. each field to improve the modelling of the field dependencies. Our algorithm exploits the meta-learning paradigm, i.e., the dependency matrices are refined in the inner loop of the meta-learning algorithm without the use of labels, whereas the outer loop intertwines the updates of the embedding matrix (the matrix performing projection) and global dependency matrix in a supervised fashion (with the use of labels). Our method is simple yet it outperforms several state-of-the-art methods on six popular dataset benchmarks. Detailed ablation studies provide additional insights into our method.

dataset, dependency, proceedings, (15 more...)

2307.09321

Country:

Oceania > Australia > Queensland > Brisbane (0.04)
South America > Peru (0.04)
Europe > Greece (0.04)
(9 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)