AITopics

2302.01928

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(26 more...)

Genre: Overview (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

Gama, Fernando, Zilberstein, Nicolas, Sevilla, Martin, Baraniuk, Richard, Segarra, Santiago

Unsupervised Learning of Sampling Distributions for Particle Filters

arXiv.org Machine Learning, Feb-2-2023

Accurate estimation of the states of a nonlinear dynamical system is crucial for its design, synthesis, and analysis. Particle filters are estimators constructed by simulating trajectories from a sampling distribution and averaging them based on their importance weight. For particle filters to be computationally tractable, it must be feasible to simulate the trajectories by drawing from the sampling distribution. Simultaneously, these trajectories need to reflect the reality of the nonlinear dynamical system so that the resulting estimators are accurate. Thus, the crux of particle filters lies in designing sampling distributions that are both easy to sample from and lead to accurate estimators. In this work, we propose to learn the sampling distributions. We put forward four methods for learning sampling distributions from observed measurements. Three of the methods are parametric methods in which we learn the mean and covariance matrix of a multivariate Gaussian distribution; each method exploits a different aspect of the data (generic, time structure, graph structure). The fourth method is a nonparametric alternative in which we directly learn a transform of a uniform random variable. All four methods are trained in an unsupervised manner by maximizing the likelihood that the states may have produced the observed measurements. Our computational experiments demonstrate that learned sampling distributions exhibit better performance than designed, minimum-degeneracy sampling distributions.
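The simulate-then-weight loop described in this abstract can be sketched for a toy one-dimensional system. This is only an illustration, not the paper's method: the linear model, the noise levels, and the function name `particle_filter` are assumptions, and the sampling distribution used here is the plain transition prior rather than a learned distribution.

```python
import math
import random


def particle_filter(measurements, n_particles=500, q_std=1.0, r_std=1.0, seed=0):
    """Toy particle filter with resampling at every step.

    Assumed model (illustrative only):
        x_t = 0.9 * x_{t-1} + process noise,   y_t = x_t + measurement noise.
    The sampling distribution is the transition prior, not a learned one.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in measurements:
        # Simulate one step of each trajectory from the sampling distribution.
        particles = [0.9 * x + rng.gauss(0.0, q_std) for x in particles]
        # With the prior as proposal, the importance weight is the likelihood
        # of the measurement given the particle.
        weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # State estimate: importance-weighted average of the particles.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Calling `particle_filter([2.0] * 20)` returns one state estimate per measurement. A learned sampling distribution would replace the line that propagates the particles, and the weights would then also need the ratio between the transition density and the proposal density.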

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

2302.01174

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Lelièvre, Tony, Robin, Geneviève, Sekkat, Inass, Stoltz, Gabriel, Cardoso, Gabriel Victorino

Generative methods for sampling transition paths in molecular dynamics

Molecular dynamics aims at simulating the physical movement of atoms in order to sample the Boltzmann-Gibbs probability measure and the associated trajectories, and to compute macroscopic properties using Monte Carlo estimates [17, 1]. One of the main difficulties when performing these numerical simulations is metastability: the system tends to stay trapped in some regions of the phase space, typically in the vicinity of local maxima of the target probability measure. In this context, transitions from one metastable state to another are of particular interest in complex systems, as they characterize, for example, crystallisation or enzymatic reactions. These reactions happen on a long time scale compared to the molecular time scale, so that the simulation of realistic rare events is computationally difficult. On the one hand, many efforts have been devoted to the development of rare-event sampling methods in molecular dynamics. The goal of these methods is to characterize transition paths and to compute associated transition rates and mean transition times; see for instance [21] for a review. The most notable methods can be classified into two groups: (i) importance sampling techniques, where the dynamics is biased (by modifying the potential for instance) to reduce the variance of Monte Carlo estimators when computing expectations, see for instance [16, 8] for more details, and also [31, Section 6.2]. It is possible to use adaptive importance sampling strategies to choose the importance function, see [30, Chapter 5]. Another viewpoint is offered by the framework of stochastic control, as in [21] where the modification in the drift of the dynamics is determined by the solution of an optimal control problem.

machine learning, reinforcement learning, trajectory, (19 more...)

2205.02818

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(3 more...)

Double Sampling Randomized Smoothing

Li, Linyi, Zhang, Jiawei, Xie, Tao, Li, Bo

Neural networks (NNs) are known to be vulnerable to adversarial perturbations, and thus there is a line of work aiming to provide robustness certification for NNs, such as randomized smoothing, which samples smoothing noise from a certain distribution to certify the robustness of a smoothed classifier. However, as shown by previous work, the certified robust radius in randomized smoothing scales poorly to large datasets ("curse of dimensionality"). To overcome this hurdle, we propose a Double Sampling Randomized Smoothing (DSRS) framework, which exploits the sampled probability from an additional smoothing distribution to tighten the robustness certification of the previous smoothed classifier. Theoretically, under mild assumptions, we prove that DSRS can certify $\Theta(\sqrt d)$ robust radius under $\ell_2$ norm where $d$ is the input dimension, implying that DSRS may be able to break the curse of dimensionality of randomized smoothing. We instantiate DSRS for a generalized family of Gaussian smoothing and propose an efficient and sound computing method based on customized dual optimization considering sampling error. Extensive experiments on MNIST, CIFAR-10, and ImageNet verify our theory and show that DSRS certifies larger robust radii than existing baselines consistently under different settings. Code is available at https://github.com/llylly/DSRS.
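For context, the single-distribution baseline that DSRS tightens can be sketched as follows. This is not the DSRS procedure, and the function name `smoothed_radius` and the toy classifier in the usage note are assumptions. It uses the standard Gaussian smoothing bound, radius = sigma * Phi^{-1}(p_A), where p_A is the probability of the top class under noise. A sound certificate would replace the raw empirical frequency with a lower confidence bound on p_A, which this sketch omits.

```python
import random
from statistics import NormalDist


def smoothed_radius(classify, x, sigma=0.5, n=2000, seed=0):
    """Estimate the smoothed prediction and an l2 radius by Monte Carlo.

    `classify` maps a list of floats to a hashable label. The radius is an
    estimate only: it uses the empirical top-class frequency instead of a
    lower confidence bound.
    """
    rng = random.Random(seed)
    votes = {}
    for _ in range(n):
        # Perturb the input with isotropic Gaussian smoothing noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        label = classify(noisy)
        votes[label] = votes.get(label, 0) + 1
    top, count = max(votes.items(), key=lambda kv: kv[1])
    # Clamp below 1 so that the inverse normal CDF stays finite.
    p_a = min(count / n, 1.0 - 1.0 / n)
    if p_a <= 0.5:
        return top, 0.0  # no majority, so no radius can be claimed
    return top, sigma * NormalDist().inv_cdf(p_a)
```

With a hypothetical linear classifier `lambda z: int(z[0] > 0)` and input `[1.0, 0.0]`, the returned radius should come out close to 1.0, which is the true distance from the input to the decision boundary.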

artificial intelligence, certification, machine learning, (19 more...)

2206.07912

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Lin, Xiang, Jwalapuram, Prathyusha, Joty, Shafiq

Dynamic Scheduled Sampling with Imitation Loss for Neural Text Generation

State-of-the-art neural text generation models are typically trained to maximize the likelihood of each token in the ground-truth sequence conditioned on the previous target tokens. However, during inference, the model needs to make a prediction conditioned on the tokens generated by itself. This train-test discrepancy is referred to as exposure bias. Scheduled sampling is a curriculum learning strategy that gradually exposes the model to its own predictions during training to mitigate this bias. Most of the proposed approaches design a scheduler based on training steps, which generally requires careful tuning depending on the training setup. In this work, we introduce Dynamic Scheduled Sampling with Imitation Loss (DySI), which maintains the schedule based solely on the training-time accuracy, while enhancing the curriculum learning by introducing an imitation loss, which attempts to make the behavior of the decoder indistinguishable from the behavior of a teacher-forced decoder. DySI is universally applicable across training setups with minimal tuning. Extensive experiments and analysis show that DySI not only achieves notable improvements on standard machine translation benchmarks, but also significantly improves the robustness of other text generation models.
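The idea of driving the schedule by accuracy instead of by training step can be sketched as below. The mixing rule shown (feed the model's own token with probability equal to the current training-time accuracy) is an assumption made for illustration and may differ from the exact DySI rule; the function names are hypothetical, and the imitation loss is not shown.

```python
import random


def token_accuracy(gold, predicted):
    """Fraction of positions where the model's token matches the ground truth."""
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def mix_decoder_inputs(gold, predicted, accuracy, rng):
    """Choose each decoder input token for a scheduled-sampling pass.

    Assumed rule (not necessarily DySI's): use the model's own prediction
    with probability `accuracy`, otherwise the ground-truth token.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    return [p if rng.random() < accuracy else g for g, p in zip(gold, predicted)]
```

Early in training the accuracy is low, so the decoder sees mostly ground-truth tokens; as accuracy rises, it is exposed to more of its own predictions without a hand-tuned step-based schedule.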

artificial intelligence, machine learning, natural language, (17 more...)

2301.13753

Country:

North America > Dominican Republic (0.04)
Asia > Singapore (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(23 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Banerjee, Debangshu, Singh, Avaljot, Singh, Gagandeep

Interpreting Robustness Proofs of Deep Neural Networks

In recent years, numerous methods have been developed to formally verify the robustness of deep neural networks (DNNs). Though the proposed techniques are effective in providing mathematical guarantees about the DNNs' behavior, it is not clear whether the proofs generated by these methods are human-interpretable. In this paper, we bridge this gap by developing new concepts, algorithms, and representations to generate human-understandable interpretations of the proofs. Leveraging the proposed method, we show that the robustness proofs of standard DNNs rely on spurious input features, while the proofs of DNNs trained to be provably robust filter out even the semantically meaningful features. The proofs for the DNNs combining adversarial and provably robust training are the most effective at selectively filtering out spurious features as well as relying on human-understandable input features.

artificial intelligence, deep learning, machine learning, (16 more...)

2301.13845

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Hada, Suryabhan Singh, Carreira-Perpiñán, Miguel Á., Zharmagambetov, Arman

Sparse Oblique Decision Trees: A Tool to Understand and Manipulate Neural Net Features

arXiv.org Artificial Intelligence, Jan-30-2023

The widespread deployment of deep nets in practical applications has led to a growing desire to understand how and why such black-box methods perform prediction. Much work has focused on understanding what part of the input pattern (an image, say) is responsible for a particular class being predicted, and how the input may be manipulated to predict a different class. We focus instead on understanding which of the internal features computed by the neural net are responsible for a particular class. We achieve this by mimicking part of the neural net with an oblique decision tree having sparse weight vectors at the decision nodes. Using the recently proposed Tree Alternating Optimization (TAO) algorithm, we are able to learn trees that are both highly accurate and interpretable. Such trees can faithfully mimic the part of the neural net they replaced, and hence they can provide insights into the deep net black box. Further, we show we can easily manipulate the neural net features in order to make the net predict, or not predict, a given class, thus showing that it is possible to carry out adversarial attacks at the level of the features. These insights and manipulations apply globally to the entire training and test set, not just at a local (single-instance) level. We demonstrate this robustly on the MNIST and ImageNet datasets with LeNet5 and VGG networks.

artificial intelligence, gini, machine learning, (19 more...)

doi: 10.1007/s10618-022-00892-7

2104.02922

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
Europe > Switzerland > Zürich > Zürich (0.13)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(14 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.87)
Transportation (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Tsilivis, Nikolaos, Kempe, Julia

What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?

arXiv.org Artificial Intelligence, Jan-30-2023

The adversarial vulnerability of neural nets, and subsequent techniques to create robust models, have attracted significant attention; yet we still lack a full understanding of this phenomenon. Here, we study adversarial examples of trained neural networks through analytical tools afforded by recent theoretical advances connecting neural networks and kernel methods, namely the Neural Tangent Kernel (NTK), following a growing body of work that leverages the NTK approximation to successfully analyze important deep learning phenomena and design algorithms for new applications. We show how NTKs allow us to generate adversarial examples in a "training-free" fashion, and demonstrate that they transfer to fool their finite-width neural net counterparts in the "lazy" regime. We leverage this connection to provide an alternative view on robust and non-robust features, which have been suggested to underlie the adversarial brittleness of neural nets. Specifically, we define and study features induced by the eigendecomposition of the kernel to better understand the role of robust and non-robust features, the reliance on both for standard classification and the robustness-accuracy trade-off. We find that such features are surprisingly consistent across architectures, and that robust features tend to correspond to the largest eigenvalues of the model, and thus are learned early during training. Our framework allows us to identify and visualize non-robust yet useful features. Finally, we shed light on the robustness mechanism underlying adversarial training of neural nets used in practice: quantifying the evolution of the associated empirical NTK, we demonstrate that its dynamics falls much earlier into the "lazy" regime and manifests a much stronger form of the well-known bias to prioritize learning features within the top eigenspaces of the kernel, compared to standard training.

artificial intelligence, machine learning, neural network, (18 more...)

2210.05577

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > United States > New York (0.04)
(12 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial Intelligence, Jan-30-2023

A deep-learning search for technosignatures of 820 nearby stars

Ma, Peter Xiangyuan, Ng, Cherry, Rizk, Leandro, Croft, Steve, Siemion, Andrew P. V., Brzycki, Bryan, Czech, Daniel, Drew, Jamie, Gajjar, Vishal, Hoang, John, Isaacson, Howard, Lebofsky, Matt, MacMahon, David, de Pater, Imke, Price, Danny C., Sheikh, Sofia Z., Worden, S. Pete

The goal of the Search for Extraterrestrial Intelligence (SETI) is to quantify the prevalence of technological life beyond Earth via its "technosignatures". One theorized technosignature is narrowband Doppler drifting radio signals. The principal challenge in conducting SETI in the radio domain is developing a generalized technique to reject human radio frequency interference (RFI). Here, we present the most comprehensive deep-learning-based technosignature search to date, returning 8 promising ETI signals of interest for re-observation as part of the Breakthrough Listen initiative. The search comprises 820 unique targets observed with the Robert C. Byrd Green Bank Telescope, totaling over 480 hr of on-sky data. We implement a novel beta-Convolutional Variational Autoencoder to identify technosignature candidates in a semi-unsupervised manner while keeping the false positive rate manageably low. This new approach presents itself as a leading solution in accelerating SETI and other transient research into the age of data-driven astronomy.

artificial intelligence, cadence, machine learning, (19 more...)

doi: 10.1038/s41550-022-01872-z

2301.12670

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
Oceania > Australia > Queensland (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Media (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hashemi, Alireza, Makse, Hernan

Visiting Distant Neighbors in Graph Convolutional Networks

arXiv.org Artificial Intelligence, Jan-29-2023

We extend the graph convolutional network method for deep learning on graph data to higher order in terms of neighboring nodes. In order to construct representations for a node in a graph, in addition to the features of the node and its immediate neighboring nodes, we also include more distant nodes in the calculations. In experimenting with a number of publicly available citation graph datasets, we show that this higher-order neighbor visiting pays off by outperforming the original model, especially when we have a limited number of available labeled data points for the training of the model.
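One simple way to picture the inclusion of more distant nodes is a neighborhood average taken over everything within k hops. The sketch below is an assumption-laden illustration, not the paper's layer: it has no learned weights, no normalization scheme beyond a plain mean, and no nonlinearity, and the function names are hypothetical.

```python
from collections import deque


def k_hop_neighborhood(adj, start, k):
    """Return the set of nodes within k hops of `start` (including itself)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for nb in adj[node]:
            if nb not in depth:
                depth[nb] = depth[node] + 1
                queue.append(nb)
    return set(depth)


def aggregate(adj, features, k):
    """Average each node's features over its k-hop neighborhood.

    k = 1 covers the node and its immediate neighbors; k = 2 adds the
    neighbors of neighbors, and so on.
    """
    out = {}
    for node in adj:
        hood = k_hop_neighborhood(adj, node, k)
        dim = len(features[node])
        out[node] = [sum(features[n][d] for n in hood) / len(hood) for d in range(dim)]
    return out
```

On a path graph 0-1-2-3 where only node 0 carries a nonzero feature, node 2 receives nothing from node 0 with `k=1` but does with `k=2`, which is the effect of visiting more distant neighbors in a single aggregation step.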

artificial intelligence, machine learning, node, (16 more...)

2301.10960

Country:

North America > United States > New York > New York County > New York City (0.15)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Europe > Italy > Sardinia (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)