Collaborating Authors

 Lai, Kuo-Wei


Task Shift: From Classification to Regression in Overparameterized Linear Models

arXiv.org Machine Learning

Modern machine learning methods have recently demonstrated remarkable capability to generalize under task shift, where latent knowledge is transferred to a different, often more difficult, task under a similar data distribution. We investigate this phenomenon in an overparameterized linear regression setting where the task shifts from classification during training to regression during evaluation. In the zero-shot case, wherein no regression data is available, we prove that task shift is impossible in both sparse signal and random signal models for any Gaussian covariate distribution. In the few-shot case, wherein limited regression data is available, we propose a simple postprocessing algorithm which asymptotically recovers the ground-truth predictor. Our analysis leverages a fine-grained characterization of individual parameters arising from minimum-norm interpolation which may be of independent interest. Our results show that while minimum-norm interpolators for classification cannot transfer to regression a priori, they experience surprisingly structured attenuation which enables successful task shift with limited additional data.
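Below is a minimal numerical sketch of the setting described in this abstract, under illustrative assumptions (a sparse signal and isotropic Gaussian covariates); it is not the paper's algorithm or analysis. It fits the minimum-norm interpolator to sign labels and inspects both its zero-shot regression error and the per-coordinate attenuation of the retained signal.

```python
# Illustrative sketch only: sparse signal, isotropic Gaussian covariates.
# We fit the minimum-norm interpolator to *sign* labels and inspect
# (i) how it performs on the regression task zero-shot and
# (ii) the per-coordinate attenuation of the signal it retains.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 400, 1200, 10               # n samples, p >> n features, k-sparse signal
theta_star = np.zeros(p)
theta_star[:k] = 1.0

X = rng.standard_normal((n, p))       # Gaussian covariates
y_cls = np.sign(X @ theta_star)       # classification labels seen during training

# Minimum-norm interpolator of the classification labels:
# argmin ||theta|| subject to X theta = y_cls.
theta_hat = X.T @ np.linalg.solve(X @ X.T, y_cls)

# Zero-shot regression error, compared with the trivial zero predictor.
X_test = rng.standard_normal((5000, p))
mse_zero_shot = np.mean((X_test @ theta_hat - X_test @ theta_star) ** 2)
mse_trivial = np.mean((X_test @ theta_star) ** 2)
print(f"zero-shot regression MSE: {mse_zero_shot:.2f}  (trivial predictor: {mse_trivial:.2f})")

# Structured attenuation: on the support, theta_hat is roughly a constant
# multiple of theta_star; off the support it is diffuse, small-magnitude noise.
print(f"mean of theta_hat on support:   {theta_hat[:k].mean():.3f}")
print(f"RMS of theta_hat off support:   {np.sqrt(np.mean(theta_hat[k:] ** 2)):.3f}")
```

In runs of this sketch the zero-shot error is close to that of the trivial predictor, while the interpolator still carries a uniformly attenuated copy of the signal on its support, which is the kind of structure a few-shot postprocessing step could exploit.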


Sharp analysis of out-of-distribution error for "importance-weighted" estimators in the overparameterized regime

arXiv.org Machine Learning

Overparameterized models are ubiquitous in machine learning theory and practice today because of their state-of-the-art generalization guarantees (in the sense of low test error) even while perfectly fitting the training data [30, 7]. However, this "good generalization" property does not extend to test data that is distributed differently from training data, termed out-of-distribution (OOD) data [20, 21, 29]. A particularly acute scenario arises when the data is drawn as a mixture from multiple groups (each with a different distribution) and some groups are very under-represented in training data [2]. Under such models, the worst-group generalization error can be significantly degraded with respect to the average generalization error on all groups [1, 27, 21, 20]. The effect of distribution shift on generalization has been sharply characterized in a worst-case/minimax sense, e.g.
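The following sketch illustrates the group-imbalance phenomenon discussed above under simple, assumed distributions (two Gaussian groups with the minority group under-represented and differently scaled); the setup and numbers are illustrative and are not taken from the paper. It fits the minimum-norm interpolator and compares majority-group error with minority-group (out-of-distribution) error.

```python
# Illustrative sketch: two-group Gaussian mixture, minority group
# under-represented in training, evaluated per group at test time.
import numpy as np

rng = np.random.default_rng(1)
p = 1500
n_major, n_minor = 190, 10            # minority group is heavily under-represented
theta_star = rng.standard_normal(p) / np.sqrt(p)

# Minority covariates put more mass on the leading coordinates.
scale_minor = np.ones(p)
scale_minor[: p // 10] = 3.0

X = np.vstack([
    rng.standard_normal((n_major, p)),
    rng.standard_normal((n_minor, p)) * scale_minor,
])
y = X @ theta_star + 0.1 * rng.standard_normal(n_major + n_minor)

# Minimum-norm interpolator (ridgeless least squares), p >> n.
theta_hat = X.T @ np.linalg.solve(X @ X.T, y)

def group_mse(scale, n_test=5000):
    """Test error on a group whose covariates are scaled coordinatewise."""
    Xt = rng.standard_normal((n_test, p)) * scale
    return np.mean((Xt @ (theta_hat - theta_star)) ** 2)

print(f"majority-group MSE: {group_mse(np.ones(p)):.4f}")
print(f"minority-group MSE: {group_mse(scale_minor):.4f}")
```

In this toy setting the minority-group error exceeds the majority-group error, illustrating the gap between worst-group and average generalization that motivates importance-weighted estimators.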


General Loss Functions Lead to (Approximate) Interpolation in High Dimensions

arXiv.org Artificial Intelligence

We provide a unified framework, applicable to a general family of convex losses and across binary and multiclass settings in the overparameterized regime, to approximately characterize the implicit bias of gradient descent in closed form. Specifically, we show that the implicit bias is approximated by (but not exactly equal to) the minimum-norm interpolation in high dimensions, which arises from training on the squared loss. In contrast to prior work, which was tailored to exponentially-tailed losses and used the intermediate support-vector-machine formulation, our framework builds directly on the primal-dual analysis of Ji and Telgarsky (2021), allowing us to provide new approximate equivalences for general convex losses through a novel sensitivity analysis. Our framework also recovers existing exact equivalence results for exponentially-tailed losses across binary and multiclass settings. Finally, we provide evidence for the tightness of our techniques, which we use to demonstrate the effect of certain loss functions designed for out-of-distribution problems on the closed-form solution.
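The sketch below numerically illustrates the kind of approximate equivalence the abstract describes, for one specific loss (logistic) in a high-dimensional binary problem; it is an assumed toy experiment, not a reproduction of the paper's framework. It runs gradient descent on the unregularized logistic loss and compares the resulting direction with the minimum-norm interpolator of the +/-1 labels.

```python
# Illustrative sketch: compare the direction found by gradient descent on the
# logistic loss with the minimum-norm interpolator of the labels when p >> n.
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

rng = np.random.default_rng(2)
n, p = 50, 2000                        # heavily overparameterized
X = rng.standard_normal((n, p))
y = np.sign(rng.standard_normal(n))    # +/-1 labels; separable since p >> n

# Minimum-norm interpolator: argmin ||w|| subject to X w = y.
w_mni = X.T @ np.linalg.solve(X @ X.T, y)

# Gradient descent on the unregularized logistic loss.
w = np.zeros(p)
lr = 0.05
for _ in range(10000):
    margins = y * (X @ w)
    grad = -(X.T @ (y * expit(-margins))) / n
    w -= lr * grad

cos = (w @ w_mni) / (np.linalg.norm(w) * np.linalg.norm(w_mni))
print(f"cosine similarity between GD direction and min-norm interpolator: {cos:.4f}")
```

In this regime the cosine similarity is close to one, consistent with the claim that the implicit bias of gradient descent is well approximated, though not exactly given, by minimum-norm interpolation in high dimensions.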