
 Michael I. Jordan


Privacy Aware Learning

Neural Information Processing Systems

We study statistical risk minimization problems under a version of privacy in which the data is kept confidential even from the learner. In this local privacy framework, we establish sharp upper and lower bounds on the convergence rates of statistical estimation procedures. As a consequence, we exhibit a precise tradeoff between the amount of privacy afforded to the data and the utility, measured by convergence rate, of any statistical estimator.
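In the local privacy model, each data holder randomizes its own report before the learner ever sees it. Below is a minimal sketch of that pattern for stochastic gradient descent, assuming a Laplace mechanism with an L1 clipping bound; both are illustrative choices of ours, not the minimax-optimal mechanisms analyzed in the paper.

```python
# Illustrative sketch of locally private SGD: the learner never sees raw data,
# only gradients each data holder has clipped and perturbed. The Laplace
# mechanism and clipping bound are assumptions for this sketch.
import numpy as np

def private_gradient(grad, epsilon, clip=1.0, rng=np.random.default_rng(0)):
    """Clip to L1 norm `clip`, then add Laplace noise for epsilon-local privacy."""
    g = grad * min(1.0, clip / (np.abs(grad).sum() + 1e-12))  # bound sensitivity
    return g + rng.laplace(scale=2.0 * clip / epsilon, size=g.shape)

def locally_private_sgd(data, loss_grad, epsilon, lr=0.1, dim=2):
    theta = np.zeros(dim)
    for x in data:                        # one pass; each point is one "user"
        noisy_g = private_gradient(loss_grad(theta, x), epsilon)
        theta -= lr * noisy_g             # learner only sees the noisy report
    return theta
```

The learner only ever receives the noisy gradient, so each individual report is privatized; smaller epsilon means more noise and slower convergence, which is the privacy-utility tradeoff the paper quantifies.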


Unsupervised Domain Adaptation with Residual Transfer Networks

Neural Information Processing Systems

The recent success of deep neural networks relies on massive amounts of labeled data. For a target task where labeled data is unavailable, domain adaptation can transfer a learner from a different source domain. In this paper, we propose a new approach to domain adaptation in deep networks that can jointly learn adaptive classifiers and transferable features from labeled data in the source domain and unlabeled data in the target domain. We relax a shared-classifier assumption made by previous methods and assume that the source classifier and target classifier differ by a residual function. We enable classifier adaptation by plugging several layers into the deep network to explicitly learn the residual function with reference to the target classifier. We fuse the features of multiple layers with a tensor product and embed them into reproducing kernel Hilbert spaces to match distributions for feature adaptation. The adaptation can be achieved in most feed-forward models by extending them with new residual layers and loss functions, which can be trained efficiently via back-propagation. Empirical evidence shows that the new approach outperforms state-of-the-art methods on standard domain adaptation benchmarks.
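To make the residual assumption concrete, here is a minimal PyTorch sketch (our own illustration, not the authors' architecture): the two classifiers differ by a learned residual function, so one classifier's output is the other's plus a small correction. The layer sizes and the direction of the residual are illustrative assumptions.

```python
# Sketch of the "classifiers differ by a residual" idea: target output =
# source output + a residual function learned by a few extra layers.
import torch
import torch.nn as nn

class ResidualClassifier(nn.Module):
    def __init__(self, feat_dim, num_classes, hidden=256):
        super().__init__()
        self.f_source = nn.Linear(feat_dim, num_classes)  # fit on labeled source data
        self.residual = nn.Sequential(                    # extra residual layers
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, features):
        f_s = self.f_source(features)
        f_t = f_s + self.residual(f_s)   # the two classifiers differ by a residual
        return f_s, f_t
```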


Cyclades: Conflict-free Asynchronous Machine Learning

Neural Information Processing Systems

In a recent line of work on asynchronous optimization, classic algorithms are parallelized by simply running parallel and asynchronous model updates without locks. These lock-free, asynchronous algorithms exhibit speedups even when applied to large, non-convex problems, as demonstrated by deep learning systems such as Google's Downpour SGD [6] and Microsoft's Project Adam [4]. While these techniques have been remarkably successful, many of these papers require delicate and tailored analyses to quantify the benefits of asynchrony for each particular learning task. Moreover, in non-convex settings, we currently have little quantitative insight into how much speedup is gained from asynchrony.
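For context, the lock-free pattern this excerpt describes looks roughly like the Hogwild-style sketch below: worker threads update shared parameters with no locks at all. This is a toy Python illustration (the GIL prevents real parallel speedups here), and it shows the unsynchronized pattern being analyzed, not Cyclades' conflict-free partitioning itself.

```python
# Toy sketch of lock-free asynchronous SGD: threads race on a shared
# parameter vector, reading possibly-stale values and writing without locks.
import threading
import numpy as np

def run_lockfree_sgd(data, grad_fn, dim, lr=0.01, n_workers=4):
    theta = np.zeros(dim)                 # shared parameters, no lock around them

    def worker(shard):
        for x in shard:
            g = grad_fn(theta, x)         # may read a stale, mid-update theta
            theta[:] -= lr * g            # racy, lock-free write

    threads = [threading.Thread(target=worker, args=(shard,))
               for shard in np.array_split(np.asarray(data), n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta
```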



Conditional Adversarial Domain Adaptation

Neural Information Processing Systems

Adversarial learning has been embedded into deep networks to learn disentangled and transferable representations for domain adaptation. Existing adversarial domain adaptation methods may not effectively align different domains of the multimodal distributions inherent in classification problems. In this paper, we present conditional adversarial domain adaptation, a principled framework that conditions the adversarial adaptation models on discriminative information conveyed in the classifier predictions. Conditional domain adversarial networks (CDANs) are designed with two novel conditioning strategies: multilinear conditioning, which captures the cross-covariance between feature representations and classifier predictions to improve discriminability, and entropy conditioning, which controls the uncertainty of classifier predictions to guarantee transferability. With theoretical guarantees and a few lines of code, the approach exceeds state-of-the-art results on five datasets.
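A minimal sketch of the two conditioning strategies, assuming PyTorch (our illustration, not the authors' released code): multilinear conditioning feeds the discriminator the flattened outer product of features and predictions, and entropy conditioning reweights examples by prediction certainty.

```python
# Sketch of CDAN-style conditioning. The discriminator input is the outer
# product of the feature vector f and the softmax prediction g, which
# captures their cross-covariance.
import torch

def multilinear_map(features, predictions):
    # features: (batch, d), predictions: (batch, k) -> (batch, d * k)
    outer = torch.bmm(predictions.unsqueeze(2), features.unsqueeze(1))
    return outer.view(features.size(0), -1)

def entropy_weight(predictions, eps=1e-8):
    # Entropy conditioning: confident predictions get larger weight.
    h = -(predictions * (predictions + eps).log()).sum(dim=1)
    return 1.0 + torch.exp(-h)
```

Note that the outer product has dimension d * k, so for large feature dimensions a lower-dimensional approximation is typically needed; that detail is omitted from this sketch.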


Online control of the false discovery rate with decaying memory

Neural Information Processing Systems

In the online multiple testing problem, p-values corresponding to different null hypotheses are observed one by one, and the decision of whether or not to reject the current hypothesis must be made immediately, after which the next p-value is observed. Alpha-investing algorithms to control the false discovery rate (FDR), formulated by Foster and Stine, have been generalized and applied to many settings, including quality-preserving databases in science and multiple A/B or multi-armed bandit tests for internet commerce. This paper improves the class of generalized alpha-investing algorithms (GAI) in four ways: (a) we show how to uniformly improve the power of the entire class of monotone GAI procedures by awarding more alpha-wealth for each rejection, giving a win-win resolution to a recent dilemma raised by Javanmard and Montanari, (b) we demonstrate how to incorporate prior weights to indicate domain knowledge of which hypotheses are likely to be non-null, (c) we allow for differing penalties for false discoveries to indicate that some hypotheses may be more important than others, (d) we define a new quantity called the decaying memory false discovery rate (mem-FDR) that may be more meaningful for truly temporal applications, and which alleviates problems that we describe and refer to as "piggybacking" and "alpha-death." Our GAI++ algorithms incorporate all four generalizations simultaneously, and reduce to more powerful variants of earlier algorithms when the weights and decay are all set to unity. Finally, we also describe a simple method to derive new online FDR rules based on an estimated false discovery proportion.
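As background for the GAI family, the following is a simplified sketch of a Foster-Stine-style alpha-investing rule: the procedure holds alpha-wealth, spends some of it testing each hypothesis, and earns wealth back on each rejection (the quantity that GAI++ awards more generously). The spending fraction and payout below are illustrative choices of ours, not the paper's procedures.

```python
# Simplified alpha-investing sketch: wealth funds each test; rejections
# replenish it, non-rejections drain it. Constants here are illustrative.
def alpha_investing(p_values, alpha=0.05, payout=0.05):
    wealth = alpha / 2.0                  # initial alpha-wealth
    decisions = []
    for p in p_values:
        level = wealth / 10.0             # spend a fixed fraction of wealth
        if p <= level:
            wealth += payout              # each rejection earns alpha-wealth back
            decisions.append(True)
        else:
            wealth -= level / (1.0 - level)
            decisions.append(False)
    return decisions
```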


Gradient Descent Can Take Exponential Time to Escape Saddle Points

Neural Information Processing Systems

Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape. On the other hand, gradient descent with perturbations [Ge et al., 2015, Jin et al., 2017] is not slowed down by saddle points: it can find an approximate local minimizer in polynomial time. This result implies that GD is inherently slower than perturbed GD, and justifies the importance of adding perturbations for efficient non-convex optimization. While our focus is theoretical, we also present experiments that illustrate our theoretical findings.
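For contrast with plain GD, a minimal sketch of the perturbed-gradient-descent idea: when the gradient is small, indicating a candidate saddle point, add a small random perturbation. The thresholds, the uniform-cube noise, and the fixed iteration budget are simplifications of the ball-sampling procedure of Jin et al. [2017].

```python
# Sketch of perturbed gradient descent: take plain gradient steps, but when
# the gradient norm is tiny (possible saddle point), apply a random kick.
import numpy as np

def perturbed_gd(grad_fn, x0, lr=0.01, g_thresh=1e-3, radius=1e-2,
                 steps=10_000, rng=np.random.default_rng(0)):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad_fn(x)
        if np.linalg.norm(g) <= g_thresh:                 # near a stationary point
            x = x + rng.uniform(-radius, radius, x.shape) # random perturbation
        else:
            x = x - lr * g                                # plain gradient step
    return x
```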



Fast Black-box Variational Inference through Stochastic Trust-Region Optimization

Neural Information Processing Systems

We introduce TrustVI, a fast second-order algorithm for black-box variational inference based on trust-region optimization and the "reparameterization trick." At each iteration, TrustVI proposes and assesses a step based on minibatches of draws from the variational distribution.
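One building block named in the abstract, the reparameterization trick, can be sketched as follows (assumed PyTorch with a Gaussian variational family; our illustration, not TrustVI itself): a minibatch of draws z = mu + sigma * eps yields a Monte Carlo ELBO estimate that is differentiable in the variational parameters.

```python
# Sketch of a reparameterized ELBO estimate from a minibatch of draws.
# `log_joint` is a user-supplied log p(x, z) evaluated per sample.
import torch

def elbo_estimate(log_joint, mu, log_sigma, n_draws=16):
    eps = torch.randn(n_draws, mu.shape[0])           # minibatch of base draws
    z = mu + log_sigma.exp() * eps                    # reparameterized samples
    log_q = torch.distributions.Normal(mu, log_sigma.exp()).log_prob(z).sum(-1)
    return (log_joint(z) - log_q).mean()              # differentiable ELBO estimate
```

Maximizing such an estimate with a first-order optimizer over (mu, log_sigma) is standard stochastic black-box VI; TrustVI instead proposes a trust-region step and assesses it using minibatch estimates of this kind.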