AITopics

2206.08853

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California (0.14)
Asia > Japan > Honshū (0.14)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial IntelligenceNov-21-2022

Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

Kocielnik, Rafal, Kangaslahti, Sara, Prabhumoye, Shrimai, Hari, Meena, Alvarez, R. Michael, Anandkumar, Anima

Labeling social-media data for custom dimensions of toxicity and social bias is challenging and labor-intensive. Existing transfer and active learning approaches meant to reduce annotation effort require fine-tuning, which suffers from over-fitting to noise and can cause domain shift with small sample sizes. In this work, we propose a novel Active Transfer Few-shot Instructions (ATF) approach which requires no fine-tuning. ATF leverages the internal linguistic knowledge of pre-trained language models (PLMs) to facilitate the transfer of information from existing pre-labeled datasets (source-domain task) with minimum labeling effort on unlabeled target data (target-domain task). Our strategy can yield positive transfer achieving a mean AUC gain of 10.5% compared to no transfer with a large 22b parameter PLM. We further show that annotation of just a few target-domain samples via active learning can be beneficial for transfer, but the impact diminishes with more annotation effort (26% drop in gain between 100 and 2000 annotated examples). Finally, we find that not all transfer scenarios yield a positive gain, which seems related to the PLMs initial performance on the target-domain task.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

2211.11798

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Law (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial IntelligenceOct-27-2022

An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design

Liu, Mingjie, Yang, Haoyu, Li, Zongyi, Sastry, Kumara, Mukhopadhyay, Saumyadip, Dogru, Selim, Anandkumar, Anima, Pan, David Z., Khailany, Brucek, Ren, Haoxing

Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable. It requires rigorous simulations of optical and chemical models that are computationally expensive. Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks. However, the considerable accuracy drop still impedes its industrial adoption. Most importantly, the quality and quantity of the training dataset directly affect the model performance. To tackle this problem, we propose a litho-aware data augmentation (LADA) framework to resolve the dilemma of limited data and improve the machine learning model performance. First, we pretrain the neural networks for lithography modeling and a gradient-friendly StyleGAN2 generator. We then perform adversarial active sampling to generate informative and synthetic in-distribution mask designs. These synthetic mask images will augment the original limited training dataset used to finetune the lithography model for improved performance. Experimental results demonstrate that LADA can successfully exploits the neural network capacity by narrowing down the performance gap between the training and testing data instances.

artificial intelligence, deep learning, machine learning, (12 more...)

2210.15765

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Semiconductors & Electronics (1.00)
Information Technology (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-8-2022

Large Scale Mask Optimization Via Convolutional Fourier Neural Operator and Litho-Guided Self Training

Yang, Haoyu, Li, Zongyi, Sastry, Kumara, Mukhopadhyay, Saumyadip, Anandkumar, Anima, Khailany, Brucek, Singh, Vivek, Ren, Haoxing

Machine learning techniques have been extensively studied for mask optimization problems, aiming at better mask printability, shorter turnaround time, better mask manufacturability, and so on. However, most of these researches are focusing on the initial solution generation of small design regions. To further realize the potential of machine learning techniques on mask optimization tasks, we present a Convolutional Fourier Neural Operator (CFNO) that can efficiently learn layout tile dependencies and hence promise stitch-less large-scale mask optimization with the limited intervention of legacy tools. We discover the possibility of litho-guided self-training (LGST) through a trained machine learning model when solving non-convex optimization problems, which allows iterative model and dataset update and brings significant model performance improvement. Experimental results show that, for the first time, our machine learning-based framework outperforms state-of-the-art academic numerical mask optimizers with an order of magnitude speedup.

artificial intelligence, latexit sha1, machine learning, (16 more...)

2207.04056

Country: North America > United States (0.16)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningJun-22-2022

Langevin Monte Carlo for Contextual Bandits

Xu, Pan, Zheng, Hongkai, Mazumdar, Eric, Azizzadenesheli, Kamyar, Anandkumar, Anima

We study the efficiency of Thompson sampling for contextual bandits. Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample in high dimensional applications for general covariance matrices. Moreover, the Gaussian approximation may not be a good surrogate for the posterior distribution for general reward generating functions. We propose an efficient posterior sampling algorithm, viz., Langevin Monte Carlo Thompson Sampling (LMC-TS), that uses Markov Chain Monte Carlo (MCMC) methods to directly sample from the posterior distribution in contextual bandits. Our method is computationally efficient since it only needs to perform noisy gradient descent updates without constructing the Laplace approximation of the posterior distribution. We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits, viz., linear contextual bandits. We conduct experiments on both synthetic data and real-world datasets on different contextual bandit models, which demonstrates that directly sampling from the posterior is both computationally efficient and competitive in performance.

artificial intelligence, bandit, machine learning, (16 more...)

2206.11254

Country: North America > United States (1.00)

Genre:

Instructional Material (0.46)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

arXiv.org Machine LearningJun-16-2022

Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear Quadratic Control

Kargin, Taylan, Lale, Sahin, Azizzadenesheli, Kamyar, Anandkumar, Anima, Hassibi, Babak

Thompson Sampling (TS) is an efficient method for decision-making under uncertainty, where an action is sampled from a carefully prescribed distribution which is updated based on the observed data. In this work, we study the problem of adaptive control of stabilizable linear-quadratic regulators (LQRs) using TS, where the system dynamics are unknown. Previous works have established that $\tilde O(\sqrt{T})$ frequentist regret is optimal for the adaptive control of LQRs. However, the existing methods either work only in restrictive settings, require a priori known stabilizing controllers, or utilize computationally intractable approaches. We propose an efficient TS algorithm for the adaptive control of LQRs, TS-based Adaptive Control, TSAC, that attains $\tilde O(\sqrt{T})$ regret, even for multidimensional systems, thereby solving the open problem posed in Abeille and Lazaric (2018). TSAC does not require a priori known stabilizing controller and achieves fast stabilization of the underlying system by effectively exploring the environment in the early stages. Our result hinges on developing a novel lower bound on the probability that the TS provides an optimistic sample. By carefully prescribing an early exploration strategy and a policy update rule, we show that TS achieves order-optimal regret in adaptive control of multidimensional stabilizable LQRs. We empirically demonstrate the performance and the efficiency of TSAC in several adaptive control tasks.

artificial intelligence, linear quadratic control, sqrt

2206.0852

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence (0.53)

arXiv.org Machine LearningJun-7-2022

Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

Jin, Tianyuan, Xu, Pan, Xiao, Xiaokui, Anandkumar, Anima

We study the regret of Thompson sampling (TS) algorithms for exponential family bandits, where the reward distribution is from a one-dimensional exponential family, which covers many common reward distributions including Bernoulli, Gaussian, Gamma, Exponential, etc. We propose a Thompson sampling algorithm, termed ExpTS, which uses a novel sampling distribution to avoid the under-estimation of the optimal arm. We provide a tight regret analysis for ExpTS, which simultaneously yields both the finite-time regret bound as well as the asymptotic regret bound. In particular, for a $K$-armed bandit with exponential family rewards, ExpTS over a horizon $T$ is sub-UCB (a strong criterion for the finite-time regret that is problem-dependent), minimax optimal up to a factor $\sqrt{\log K}$, and asymptotically optimal, for exponential family rewards. Moreover, we propose ExpTS$^+$, by adding a greedy exploitation step in addition to the sampling distribution used in ExpTS, to avoid the over-estimation of sub-optimal arms. ExpTS$^+$ is an anytime bandit algorithm and achieves the minimax optimality and asymptotic optimality simultaneously for exponential family reward distributions. Our proof techniques are general and conceptually simple and can be easily applied to analyze standard Thompson sampling with specific reward distributions.

artificial intelligence, exponential family multi-armed bandit, thompson sampling algorithm, (1 more...)

2206.0352

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.69)

arXiv.org Machine LearningJun-3-2022

KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems

Lale, Sahin, Shi, Yuanyuan, Qu, Guannan, Azizzadenesheli, Kamyar, Wierman, Adam, Anandkumar, Anima

Learning a dynamical system requires stabilizing the unknown dynamics to avoid state blow-ups. However, current reinforcement learning (RL) methods lack stabilization guarantees, which limits their applicability for the control of safety-critical systems. We propose a model-based RL framework with formal stability guarantees, Krasovskii Constrained RL (KCRL), that adopts Krasovskii's family of Lyapunov functions as a stability constraint. The proposed method learns the system dynamics up to a confidence interval using feature representation, e.g. Random Fourier Features. It then solves a constrained policy optimization problem with a stability constraint based on Krasovskii's method using a primal-dual approach to recover a stabilizing policy. We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system. We also derive the sample complexity upper bound for stabilization of unknown nonlinear dynamical systems via the KCRL framework.

artificial intelligence, krasovskii-constrained reinforcement learning, nonlinear dynamical system, (2 more...)

2206.01704

Genre: Research Report (0.40)

Technology:

Information Technology > Scientific Computing (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)

arXiv.org Artificial IntelligenceApr-1-2022

Informing Geometric Deep Learning with Electronic Interactions to Accelerate Quantum Chemistry

Qiao, Zhuoran, Christensen, Anders S., Welborn, Matthew, Manby, Frederick R., Anandkumar, Anima, Miller, Thomas F. III

Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials. By developing a physics-inspired equivariant neural network, we introduce a method to learn molecular representations based on the electronic interactions among atomic orbitals. Our method, OrbNet-Equi, leverages efficient tight-binding simulations and learned mappings to recover high fidelity quantum chemical properties. OrbNet-Equi models a wide spectrum of target properties with an accuracy consistently better than standard machine learning methods and a speed orders of magnitude greater than density functional theory. Despite only using training samples collected from readily available small-molecule libraries, OrbNet-Equi outperforms traditional methods on comprehensive downstream benchmarks that encompass diverse main-group chemical processes. Our method also describes interactions in challenging charge-transfer complexes and open-shell systems. We anticipate that the strategy presented here will help to expand opportunities for studies in chemistry and materials science, where the acquisition of experimental or reference training data is costly.

artificial intelligence, informing geometric deep learning, machine learning, (13 more...)

doi: 10.1073/pnas.2205221119

2105.14655

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceFeb-8-2022

Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models

Wang, Boxin, Ping, Wei, Xiao, Chaowei, Xu, Peng, Patwary, Mostofa, Shoeybi, Mohammad, Li, Bo, Anandkumar, Anima, Catanzaro, Bryan

Pre-trained language models (LMs) are shown to easily generate toxic language. In this work, we systematically explore domain-adaptive training to reduce the toxicity of language models. We conduct this study on three dimensions: training corpus, model size, and parameter efficiency. For the training corpus, we propose to leverage the generative power of LMs and generate nontoxic datasets for domain-adaptive training, which mitigates the exposure bias and is shown to be more data-efficient than using a curated pre-training corpus. We demonstrate that the self-generation method consistently outperforms the existing baselines across various model sizes on both automatic and human evaluations, even when it uses a 1/3 smaller training corpus. We then comprehensively study detoxifying LMs with parameter sizes ranging from 126M up to 530B (3x larger than GPT-3), a scale that has never been studied before. We find that i) large LMs have similar toxicity levels as smaller ones given the same pre-training corpus, and ii) large LMs require more endeavor to detoxify. We also explore parameter-efficient training methods for detoxification. We demonstrate that adding and training adapter-only layers in LMs not only saves a lot of parameters but also achieves a better trade-off between toxicity and perplexity than whole model adaptation for the large-scale models.

artificial intelligence, machine learning, natural language, (16 more...)

2202.04173

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)