Ma, Yian
Regulatory Science Innovation for Generative AI and Large Language Models in Health and Medicine: A Global Call for Action
Ong, Jasmine Chiat Ling, Ning, Yilin, Liu, Mingxuan, Ma, Yian, Liang, Zhao, Singh, Kuldev, Chang, Robert T, Vogel, Silke, Lim, John CW, Tan, Iris Siu Kwan, Freyer, Oscar, Gilbert, Stephen, Bitterman, Danielle S, Liu, Xiaoxuan, Denniston, Alastair K, Liu, Nan
The integration of generative AI (GenAI) and large language models (LLMs) in healthcare presents both unprecedented opportunities and challenges, necessitating innovative regulatory approaches. GenAI and LLMs offer broad applications, from automating clinical workflows to personalizing diagnostics. However, the non-deterministic outputs, broad functionalities and complex integration of GenAI and LLMs challenge existing medical device regulatory frameworks, including the total product life cycle (TPLC) approach. Here we discuss the constraints of the TPLC approach to GenAI and LLM-based medical device regulation, and advocate for global collaboration in regulatory science research. This serves as the foundation for developing innovative approaches, including adaptive policies and regulatory sandboxes, to test and refine governance in real-world settings. International harmonization, as seen with the International Medical Device Regulators Forum, is essential to manage the implications of LLMs for global health, including risks of widening health inequities driven by inherent model biases. By engaging multidisciplinary expertise, prioritizing iterative, data-driven approaches, and focusing on the needs of diverse populations, global regulatory science research enables the responsible and equitable advancement of LLM innovations in healthcare.
ClimaQA: An Automated Evaluation Framework for Climate Foundation Models
Manivannan, Veeramakali Vignesh, Jafari, Yasaman, Eranky, Srikar, Ho, Spencer, Yu, Rose, Watson-Parris, Duncan, Ma, Yian, Bergen, Leon, Berg-Kirkpatrick, Taylor
In recent years, foundation models have attracted significant interest in climate science due to their potential to transform how we approach critical challenges such as climate predictions and understanding the drivers of climate change [Thulke et al., 2024, Nguyen et al., 2024, Cao et al., 2024]. However, while these models are powerful, they often fall short when it comes to answering technical questions requiring high precision, such as "What is the net effect of Arctic stratus clouds on the Arctic climate?" Even advanced models like GPT-4 exhibit epistemological inaccuracies in Climate Question-Answering (QA) tasks [Bulian et al., 2024], raising concerns about their reliability in scientific workflows. This highlights the need for a domain-specific evaluation framework to assess the quality and validity of outputs generated by these models. Current benchmarks for Large Language Models (LLMs) predominantly focus on linguistic accuracy or general factual correctness, but they fail to address the unique demands of climate science, where factual rigor, domain-specific knowledge, and robust reasoning are essential.
Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation
Sanyal, Amartya, Hu, Yaxi, Yu, Yaodong, Ma, Yian, Wang, Yixin, Schölkopf, Bernhard
"Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behavior and may even exacerbate it. We formally prove a lower bound on Out-of-distribution (OOD) error in a linear classification model, characterizing the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
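The failure mode described above can be illustrated with a small synthetic experiment (a hedged sketch, not the paper's exact setup): train a ridge classifier on data with one core feature, many uninformative nuisance features, and noisy labels, then shift the nuisance distribution at test time. All dimensions, noise rates, and the ridge penalty below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, d_nuis, label_noise, nuisance_shift):
    """One core feature carries the label; nuisance dims carry none."""
    y = rng.choice([-1.0, 1.0], size=n)
    core = y[:, None] + rng.normal(0.0, 1.0, (n, 1))
    nuis = rng.normal(nuisance_shift, 1.0, (n, d_nuis))
    X = np.hstack([core, nuis])
    flip = rng.random(n) < label_noise          # label noise in training data
    return X, np.where(flip, -y, y), y          # observed labels, clean labels

def fit_ridge(X, y, lam=1e-2):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def accuracy(w, X, y):
    return float(np.mean(np.sign(X @ w) == y))

X_tr, y_tr, _ = make_data(2000, 50, label_noise=0.2, nuisance_shift=0.0)
w = fit_ridge(X_tr, y_tr)
X_id, _, y_id = make_data(2000, 50, 0.0, nuisance_shift=0.0)    # in-distribution
X_ood, _, y_ood = make_data(2000, 50, 0.0, nuisance_shift=2.0)  # nuisance shift
print(f"ID accuracy:  {accuracy(w, X_id, y_id):.3f}")
print(f"OOD accuracy: {accuracy(w, X_ood, y_ood):.3f}")
```

Sweeping `label_noise` and `d_nuis` and plotting ID against OOD accuracy across configurations is one way to watch the positive correlation flatten or reverse.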
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo
Huang, Xunpeng, Zou, Difan, Dong, Hanze, Ma, Yian, Zhang, Tong
To sample from a general target distribution $p_*\propto e^{-f_*}$ beyond the isoperimetric condition, Huang et al. (2023) proposed to perform sampling through reverse diffusion, giving rise to Diffusion-based Monte Carlo (DMC). Specifically, DMC follows the reverse SDE of a diffusion process that transforms the target distribution to the standard Gaussian, utilizing a non-parametric score estimation. However, the original DMC algorithm encountered high gradient complexity, resulting in an exponential dependency on the error tolerance $\epsilon$ of the obtained samples. In this paper, we demonstrate that the high complexity of DMC originates from its redundant design of score estimation, and propose a more efficient algorithm, called RS-DMC, based on a novel recursive score estimation method. In particular, we first divide the entire diffusion process into multiple segments and then formulate the score estimation step (at any time step) as a series of interconnected mean estimation and sampling subproblems accordingly, which are correlated in a recursive manner. Importantly, we show that with a proper design of the segment decomposition, all sampling subproblems will only need to tackle a strongly log-concave distribution, which can be solved very efficiently using Langevin-based samplers with a provably rapid convergence rate. As a result, we prove that the gradient complexity of RS-DMC only has a quasi-polynomial dependency on $\epsilon$, significantly improving on the exponential gradient complexity of Huang et al. (2023). Furthermore, under commonly used dissipative conditions, our algorithm is provably much faster than the popular Langevin-based algorithms. Our algorithm design and theoretical framework illuminate a novel direction for addressing sampling problems, which could be of broader applicability in the community.
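The Langevin-based samplers referenced above, applied to each strongly log-concave subproblem, can be sketched as an unadjusted Langevin algorithm (ULA). The target, step size, and chain length below are illustrative; this is a generic ULA, not the RS-DMC inner loop itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def ula(grad_f, x0, step, n_steps):
    """Unadjusted Langevin algorithm: x <- x - h * grad f(x) + sqrt(2h) * xi."""
    xs = np.empty(n_steps)
    x = x0
    for k in range(n_steps):
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * rng.normal()
        xs[k] = x
    return xs

# Strongly log-concave target: standard Gaussian, f(x) = x^2 / 2, grad f(x) = x.
samples = ula(lambda x: x, x0=3.0, step=0.01, n_steps=50_000)
burned = samples[5_000:]            # discard burn-in
print(burned.mean(), burned.var())  # both should be near (0, 1)
```

For strongly log-concave targets, ULA mixes rapidly; its stationary distribution has an O(step) bias, which is why small step sizes (or Metropolis adjustment) are used in practice.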
Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy
Lin, Yingyu, Ma, Yian, Wang, Yu-Xiang, Redberg, Rachel
Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAmple Perturbation (abbr. ASAP) algorithm which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in $W_\infty$ distance. We show that by combining our new techniques with a careful localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
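A highly simplified sketch of the sample-perturbation idea: given one approximate posterior sample and an assumed bound `w_inf` on its $W_\infty$ distance from a reference distribution, Laplace noise calibrated to a per-coordinate sensitivity plus that bound yields a pure-DP release. The sensitivity value, the `sensitivity + 2 * w_inf` calibration, and the use of a per-coordinate Laplace mechanism here are all illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def asap_style_release(theta_mcmc, sensitivity, w_inf, epsilon):
    """Perturb an approximate sample so the release is pure-DP (sketch).

    theta_mcmc  : approximate posterior sample (e.g., MCMC output)
    sensitivity : assumed per-coordinate sensitivity of the exact sampler
    w_inf       : assumed W_infinity bound between MCMC and reference laws
    epsilon     : pure-DP privacy budget
    """
    scale = (sensitivity + 2.0 * w_inf) / epsilon  # illustrative calibration
    noise = rng.laplace(loc=0.0, scale=scale, size=np.shape(theta_mcmc))
    return np.asarray(theta_mcmc) + noise

theta = np.array([0.7, -1.2])  # pretend MCMC output
release = asap_style_release(theta, sensitivity=0.1, w_inf=0.01, epsilon=1.0)
print(release)
```

The key point the sketch conveys is that a $W_\infty$ bound is strong enough to absorb the sampler's approximation error into the noise scale, so no $\delta$ term appears in the guarantee.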
Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing?
Kim, Kyurae, Ma, Yian, Gardner, Jacob R.
We prove that black-box variational inference (BBVI) with control variates, in particular the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. Rigorous guarantees already establish that, for certain well-behaved posteriors, BBVI achieves a sublinear convergence rate and corresponding computational complexity (Domke et al., 2023a; Kim et al., 2023b). A remaining theoretical question is whether BBVI can achieve better rates, in particular geometric convergence rates, traditionally called "linear" convergence in the optimization literature (see the textbook by Nesterov, 2004, Section 1.2.3), corresponding to a complexity logarithmic in the target accuracy.
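The zero-variance behavior of the STL estimator at perfect variational family specification can be checked by hand in one dimension. Below, q = N(mu, sigma^2) matches the target p = N(0, 1) exactly; the plain reparameterization gradient for mu still fluctuates across draws, while the STL estimator (which stops the gradient through the variational parameters inside log q) is identically zero. This is a self-contained numerical illustration, not the paper's proof.

```python
import numpy as np

rng = np.random.default_rng(3)

mu, sigma = 0.0, 1.0             # variational params; q = N(0, 1) = p exactly
eps = rng.normal(size=10_000)    # reparameterization noise
z = mu + sigma * eps

# Plain reparameterized gradient of the ELBO w.r.t. mu. For a Gaussian q the
# direct dependence of log q on mu cancels its path dependence, leaving
# d/dmu [log p(z)] = grad_z log p(z) = -z.
g_plain = -z

# STL: stop gradients through (mu, sigma) inside log q, keep the path term:
# grad_z [log p(z) - log q(z)] = -z + (z - mu) / sigma**2, which is 0 when q = p.
g_stl = -z + (z - mu) / sigma**2

print("plain estimator std:", g_plain.std())  # ~1: nonzero variance
print("STL estimator std:  ", g_stl.std())    # exactly 0 at the optimum
```

The cancellation holds per-sample, not just in expectation, which is what makes geometric convergence plausible near a well-specified optimum.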
Reverse Diffusion Monte Carlo
Huang, Xunpeng, Dong, Hanze, Hao, Yifan, Ma, Yian, Zhang, Tong
The efficacy of modern generative models is commonly contingent upon the precision of score estimation along the diffusion path, with a focus on diffusion models and their ability to generate high-quality data samples. This study delves into the application of reverse diffusion to Monte Carlo sampling. It is shown that score estimation can be transformed into a mean estimation problem via the decomposition of the transition kernel. By estimating the mean of the posterior distribution, we derive a novel Monte Carlo sampling algorithm from the reverse diffusion process, which is distinct from traditional Markov Chain Monte Carlo (MCMC) methods. We calculate the error requirements and sample size for the posterior distribution, and use the result to derive an algorithm that can approximate the target distribution to any desired accuracy. Additionally, by estimating the log-Sobolev constant of the posterior distribution, we show under suitable conditions the problem of sampling from the posterior can be easier than direct sampling from the target distribution using traditional MCMC techniques. For Gaussian mixture models, we demonstrate that the new algorithm achieves significant improvement over the traditional Langevin-style MCMC sampling methods both theoretically and practically. Our algorithm offers a new perspective and solution beyond classical MCMC algorithms for challenging complex distributions.
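The score-to-mean-estimation reduction can be checked numerically for an Ornstein-Uhlenbeck forward process and a Gaussian target, where the diffused score is available in closed form. The self-normalized importance-sampling estimator below is a generic illustration of the transition-kernel decomposition, not the paper's algorithm; the target and time point are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)

# Target p_* = N(m, s^2); OU forward process x_t = e^{-t} x_0 + sqrt(1-e^{-2t}) xi.
m, s, t, x_t = 1.0, 1.0, 0.5, 0.5
a, v = np.exp(-t), 1.0 - np.exp(-2.0 * t)

# Score via the transition-kernel decomposition:
#   grad log p_t(x) = (a * E[x_0 | x_t = x] - x) / v,
# with the posterior mean estimated by self-normalized importance sampling
# using draws from the target as proposals.
x0 = rng.normal(m, s, size=200_000)
log_w = -(x_t - a * x0) ** 2 / (2.0 * v)  # Gaussian transition kernel, up to const
w = np.exp(log_w - log_w.max())
post_mean = np.sum(w * x0) / np.sum(w)
score_mc = (a * post_mean - x_t) / v

# Closed form: p_t = N(a*m, a^2 s^2 + v), so the score is linear in x.
score_exact = -(x_t - a * m) / (a**2 * s**2 + v)
print(score_mc, score_exact)
```

For non-Gaussian targets the closed form is unavailable, which is exactly when estimating the posterior mean (rather than the score directly) becomes the useful reformulation.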
Disentangled Multi-Fidelity Deep Bayesian Active Learning
Wu, Dongxia, Niu, Ruijia, Chinazzi, Matteo, Ma, Yian, Yu, Rose
To balance quality and cost, various domain areas of science and engineering run simulations at multiple levels of sophistication. Multi-fidelity active learning aims to learn a direct mapping from input parameters to simulation outputs at the highest fidelity by actively acquiring data from multiple fidelity levels. However, existing approaches based on Gaussian processes are hardly scalable to high-dimensional data. Deep learning-based methods often impose a hierarchical structure in hidden representations, which only supports passing information from low-fidelity to high-fidelity. These approaches can lead to the undesirable propagation of errors from low-fidelity representations to high-fidelity ones. We propose a novel framework called Disentangled Multi-fidelity Deep Bayesian Active Learning (D-MFDAL), which learns the surrogate models conditioned on the distribution of functions at multiple fidelities. On benchmark tasks of learning deep surrogates of partial differential equations including heat equation, Poisson's equation and fluid simulations, our approach significantly outperforms state-of-the-art in prediction accuracy and sample efficiency.
Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence
Jerfel, Ghassen, Wang, Serena, Fannjiang, Clara, Heller, Katherine A., Ma, Yian, Jordan, Michael I.
Variational Inference (VI) is a popular alternative to asymptotically exact sampling in Bayesian inference. Its main workhorse is optimization over a reverse Kullback-Leibler divergence (RKL), which typically underestimates the tail of the posterior leading to miscalibration and potential degeneracy. Importance sampling (IS), on the other hand, is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. The quality of IS crucially depends on the choice of the proposal distribution. Ideally, the proposal distribution has heavier tails than the target, which is rarely achievable by minimizing the RKL. We thus propose a novel combination of optimization and sampling techniques for approximate Bayesian inference by constructing an IS proposal distribution through the minimization of a forward KL (FKL) divergence. This approach guarantees asymptotic consistency and a fast convergence towards both the optimal IS estimator and the optimal variational approximation. We empirically demonstrate on real data that our method is competitive with variational boosting and MCMC.
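For a Gaussian proposal family, minimizing the forward KL to the target reduces to moment matching, which tends to produce proposals at least as spread out as the target. A hedged toy illustration: fit the FKL-optimal Gaussian to a bimodal target by matching its first two moments, then use it as the IS proposal. The mixture, sample size, and test statistic are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(5)

# Bimodal target: 0.5*N(-2,1) + 0.5*N(2,1).
# True E[x^2] = 0.5*(4 + 1) + 0.5*(4 + 1) = 5.
def p_pdf(x):
    return 0.5 * (np.exp(-(x + 2) ** 2 / 2)
                  + np.exp(-(x - 2) ** 2 / 2)) / np.sqrt(2 * np.pi)

# FKL-optimal Gaussian q = argmin KL(p || q) matches p's mean and variance
# (moment matching): mean 0, variance E[x^2] = 5, so q covers both modes.
q_mean, q_var = 0.0, 5.0

def q_pdf(x):
    return np.exp(-(x - q_mean) ** 2 / (2 * q_var)) / np.sqrt(2 * np.pi * q_var)

# Self-normalized importance sampling for E_p[x^2] with the FKL proposal.
x = rng.normal(q_mean, np.sqrt(q_var), size=200_000)
w = p_pdf(x) / q_pdf(x)
estimate = np.sum(w * x**2) / np.sum(w)
print(estimate)
```

A reverse-KL fit would instead lock onto a single mode with lighter tails, producing unbounded importance weights on the other mode; the FKL proposal keeps the weight ratio bounded here.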
Underspecification Presents Challenges for Credibility in Modern Machine Learning
D'Amour, Alexander, Heller, Katherine, Moldovan, Dan, Adlam, Ben, Alipanahi, Babak, Beutel, Alex, Chen, Christina, Deaton, Jonathan, Eisenstein, Jacob, Hoffman, Matthew D., Hormozdiari, Farhad, Houlsby, Neil, Hou, Shaobo, Jerfel, Ghassen, Karthikesalingam, Alan, Lucic, Mario, Ma, Yian, McLean, Cory, Mincu, Diana, Mitani, Akinori, Montanari, Andrea, Nado, Zachary, Natarajan, Vivek, Nielson, Christopher, Osborne, Thomas F., Raman, Rajiv, Ramasamy, Kim, Sayres, Rory, Schrouff, Jessica, Seneviratne, Martin, Sequeira, Shannon, Suresh, Harini, Veitch, Victor, Vladymyrov, Max, Wang, Xuezhi, Webster, Kellie, Yadlowsky, Steve, Yun, Taedong, Zhai, Xiaohua, Sculley, D.
ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predictors returned by underspecified pipelines are often treated as equivalent based on their training domain performance, but we show here that such predictors can behave very differently in deployment domains. This ambiguity can lead to instability and poor model behavior in practice, and is a distinct failure mode from previously identified issues arising from structural mismatch between training and deployment domains. We show that this problem appears in a wide variety of practical ML pipelines, using examples from computer vision, medical imaging, natural language processing, clinical risk prediction based on electronic health records, and medical genomics. Our results show the need to explicitly account for underspecification in modeling pipelines that are intended for real-world deployment in any domain.