Collaborating Authors: Zhang, Linjun


Scaffolding Sets

arXiv.org Machine Learning

Predictors map individual instances in a population to the interval $[0,1]$. For a collection $\mathcal C$ of subsets of a population, a predictor is multi-calibrated with respect to $\mathcal C$ if it is simultaneously calibrated on each set in $\mathcal C$. We initiate the study of the construction of scaffolding sets, a small collection $\mathcal S$ of sets with the property that multi-calibration with respect to $\mathcal S$ ensures correctness, and not just calibration, of the predictor. Our approach is inspired by the folk wisdom that the intermediate layers of a neural net learn a highly structured and useful data representation.
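
To make the multi-calibration requirement concrete, here is a minimal sketch of a checker, assuming binary outcomes, a predictor given as an array of values in $[0,1]$, and the sets in $\mathcal C$ given as boolean masks; the bin count, tolerance, and the helper name `is_multicalibrated` are illustrative choices, not constructs from the paper.

```python
import numpy as np

def is_multicalibrated(pred, y, sets, n_bins=10, tol=0.05):
    """Approximately check multi-calibration of predictions `pred` (values in
    [0, 1]) against binary outcomes `y`, with respect to a collection of
    population subsets given as boolean masks. Within each subset and each
    prediction bin, the average outcome should match the average prediction.
    Bin count and tolerance are illustrative choices."""
    # Assign each prediction to one of n_bins equal-width bins over [0, 1].
    bin_idx = np.minimum((pred * n_bins).astype(int), n_bins - 1)
    for S in sets:
        for b in range(n_bins):
            mask = S & (bin_idx == b)
            if mask.sum() == 0:
                continue
            if abs(y[mask].mean() - pred[mask].mean()) > tol:
                return False
    return True
```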


The Power of Contrast for Feature Learning: A Theoretical Analysis

arXiv.org Machine Learning

Deep supervised learning has achieved great success in various applications, including computer vision (Krizhevsky et al., 2012), natural language processing (Devlin et al., 2018), and scientific computing (Han et al., 2018). However, its dependence on manually assigned labels, which are usually difficult and costly to obtain, has motivated research into alternative approaches that exploit unlabeled data. Self-supervised learning is a promising approach that leverages the unlabeled data itself as supervision and learns representations that are beneficial to potential downstream tasks. At a high level, there are two common approaches to feature extraction in self-supervised learning: generative and contrastive (Liu et al., 2021). Both approaches aim to learn latent representations of the original data; the difference is that the generative approach focuses on minimizing the reconstruction error from the latent representations, while the contrastive approach aims to decrease the similarity between the representations of contrastive pairs. Recent works have shown the benefits of contrastive learning in practice (Chen et al., 2020a,b,c; He et al., 2020).
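
As a concrete instance of the contrastive approach, the following is a sketch of a standard InfoNCE-style objective in the spirit of SimCLR (Chen et al., 2020a); it illustrates how similarity between pairs is manipulated and is not the exact objective analyzed in the paper.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss (a sketch). z1, z2: (n, d) tensors of
    representations of two augmented views of the same n examples; the i-th
    rows form a positive pair, all mismatched rows act as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                 # cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    # Pull matched (positive) pairs together, push mismatched pairs apart.
    return F.cross_entropy(logits, labels)
```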


Understanding Dynamics of Nonlinear Representation Learning and Its Application

arXiv.org Machine Learning

Representations of the world environment play a crucial role in machine intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images. Representation learning allows us to automatically discover suitable representations from raw sensory data. For example, given raw sensory data, a multilayer perceptron learns nonlinear representations at its hidden layers, which are subsequently used for classification (or regression) at its output layer. This happens implicitly during training through the minimization of a supervised or unsupervised loss. In this paper, we study the dynamics of such implicit nonlinear representation learning. We identify a new assumption and a novel condition, called the common model structure assumption and the data-architecture alignment condition, respectively. Under the common model structure assumption, the data-architecture alignment condition is shown to be sufficient for global convergence and necessary for global optimality. Our results provide practical guidance for designing a model structure: e.g., the common model structure assumption can be used as a justification for choosing one model structure over others. As an application, we then derive a new training framework that satisfies the data-architecture alignment condition without assuming it, by automatically modifying any given training algorithm in a manner dependent on the data and architecture. Given a standard training algorithm, the framework running its modified version is empirically shown to maintain competitive test performance while providing global convergence guarantees for ResNet-18 with convolutions, skip connections, and batch normalization on standard benchmark datasets, including MNIST, CIFAR-10, CIFAR-100, Semeion, KMNIST, and SVHN.
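
For concreteness, a minimal PyTorch sketch of such a multilayer perceptron, where the hidden layers carry the implicitly learned nonlinear representations; the layer sizes are illustrative and not taken from the paper.

```python
import torch.nn as nn

# A minimal MLP: the hidden layers learn nonlinear representations of the
# raw input implicitly while the supervised loss at the output is minimized.
mlp = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # hidden representation h1(x)
    nn.Linear(256, 64), nn.ReLU(),    # hidden representation h2(h1(x))
    nn.Linear(64, 10),                # classification head on top of h2
)
```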


High-Dimensional Differentially-Private EM Algorithm: Methods and Near-Optimal Statistical Guarantees

arXiv.org Machine Learning

In this paper, we develop a general framework for designing differentially private expectation-maximization (EM) algorithms in high-dimensional latent variable models, based on noisy iterative hard thresholding. We derive the statistical guarantees of the proposed framework and apply it to three specific models: Gaussian mixture, mixture of regression, and regression with missing covariates. For each model, we establish the near-optimal rate of convergence under differential privacy constraints and show that the proposed algorithm is minimax rate-optimal up to logarithmic factors. The technical tools developed for the high-dimensional setting are then extended to classic low-dimensional latent variable models, where we propose a near rate-optimal EM algorithm with differential privacy guarantees. Simulation studies and real data analysis are conducted to support our results.
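
To illustrate the noisy iterative hard-thresholding primitive, here is a sketch of a single private iterate; the calibration of the noise scale to the privacy budget, and the clipping needed to bound sensitivity, are part of the paper's framework and are omitted here.

```python
import numpy as np

def noisy_hard_threshold_step(beta_update, s, sigma, rng=np.random.default_rng()):
    """One iterate in the spirit of noisy iterative hard thresholding:
    perturb the update with Gaussian noise for privacy, then keep only the
    s largest coordinates to preserve sparsity. A sketch; sigma must be
    calibrated to (epsilon, delta) and the update's sensitivity."""
    noisy = beta_update + rng.normal(scale=sigma, size=beta_update.shape)
    keep = np.argsort(np.abs(noisy))[-s:]        # indices of top-s magnitudes
    out = np.zeros_like(noisy)
    out[keep] = noisy[keep]
    return out
```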


A Central Limit Theorem for Differentially Private Query Answering

arXiv.org Machine Learning

Perhaps the single most important use case for differential privacy is privately answering numerical queries, which is usually achieved by adding noise to the answer vector. The central question, therefore, is to understand which noise distribution optimizes the privacy-accuracy trade-off, especially when the dimension of the answer vector is high. Accordingly, an extensive literature has been dedicated to this question, and the upper and lower bounds have been matched up to constant factors [BUV18, SU17]. In this paper, we take a novel approach to this important optimality question. We first demonstrate an intriguing central limit theorem phenomenon in the high-dimensional regime. More precisely, we prove that a mechanism is approximately Gaussian Differentially Private [DRS21] if the added noise satisfies certain conditions. In particular, densities proportional to $\mathrm{e}^{-\|x\|_p^\alpha}$, where $\|x\|_p$ is the standard $\ell_p$-norm, satisfy the conditions. Taking this perspective, we make use of the Cramér-Rao inequality and show an "uncertainty principle"-style result: the product of the privacy parameter and the $\ell_2$-loss of the mechanism is lower bounded by the dimension. Furthermore, the Gaussian mechanism achieves the constant-sharp optimal privacy-accuracy trade-off among all such noises. Our findings are corroborated by numerical experiments.
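
For reference, a sketch of the Gaussian mechanism under $\mu$-Gaussian Differential Privacy [DRS21], the baseline against which the noise densities above are compared; the function name and interface are illustrative.

```python
import numpy as np

def gaussian_mechanism(answer, l2_sensitivity, mu, rng=np.random.default_rng()):
    """Privately answer a numerical query under mu-Gaussian Differential
    Privacy by adding i.i.d. N(0, (l2_sensitivity / mu)^2) noise to each
    coordinate of the answer vector. A standard construction, shown as a
    point of comparison for densities proportional to exp(-||x||_p^alpha)."""
    sigma = l2_sensitivity / mu
    return np.asarray(answer) + rng.normal(scale=sigma, size=np.shape(answer))
```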


The Cost of Privacy in Generalized Linear Models: Algorithms and Minimax Lower Bounds

arXiv.org Machine Learning

The trade-off between differential privacy and statistical accuracy in generalized linear models (GLMs) is studied. We propose differentially private algorithms for parameter estimation in both low-dimensional and high-dimensional sparse GLMs and characterize their statistical performance. We establish privacy-constrained minimax lower bounds for GLMs, which imply that the proposed algorithms are rate-optimal up to logarithmic factors in sample size. The lower bounds are obtained via a novel technique, which is based on Stein's Lemma and generalizes the tracing attack technique for privacy-constrained lower bounds. This lower bound argument can be of independent interest as it is applicable to general parametric models. Simulated and real data experiments are conducted to demonstrate the numerical performance of our algorithms.
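
One common template for private parameter estimation in models like GLMs is noisy gradient descent with per-example clipping; the sketch below shows a single step of this generic template, not necessarily the paper's exact algorithm, and the step size, clipping level, and noise scale would all be calibrated to the privacy budget.

```python
import numpy as np

def dp_glm_step(beta, X, y, grad_fn, eta, clip, sigma, rng=np.random.default_rng()):
    """One noisy gradient step for private GLM estimation (a generic sketch).
    Per-example gradients are norm-clipped so the averaged update has
    bounded sensitivity, then Gaussian noise is added before the step."""
    n, d = X.shape
    grads = np.stack([grad_fn(beta, X[i], y[i]) for i in range(n)])  # (n, d)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))  # clip each
    g = grads.mean(axis=0) + rng.normal(scale=sigma * clip / n, size=d)
    return beta - eta * g
```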


Estimation, Confidence Intervals, and Large-Scale Hypotheses Testing for High-Dimensional Mixed Linear Regression

arXiv.org Machine Learning

This paper studies high-dimensional mixed linear regression (MLR), where the output variable comes from one of two linear regression models with an unknown mixing proportion and an unknown covariance structure of the random covariates. Building upon a high-dimensional EM algorithm, we propose an iterative procedure for estimating the two regression vectors and establish their rates of convergence. Based on the iterative estimators, we further construct debiased estimators and establish their asymptotic normality. For individual coordinates, confidence intervals centered at the debiased estimators are constructed. Furthermore, a large-scale multiple testing procedure is proposed for testing the regression coefficients and is shown to control the false discovery rate (FDR) asymptotically. Simulation studies are carried out to examine the numerical performance of the proposed methods and their superiority over existing methods. The proposed methods are further illustrated through an analysis of a multiplex image cytometry dataset, which investigates the interaction networks among cellular phenotypes that include the expression levels of 20 epitopes or combinations of markers.
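
For intuition, here is a sketch of one EM iteration for a two-component MLR with Gaussian errors; this is the low-dimensional building block, while the paper's high-dimensional procedure additionally incorporates regularization and the debiasing step, which this sketch omits.

```python
import numpy as np

def em_mlr_step(X, y, beta1, beta2, pi, sigma=1.0):
    """One EM iteration for two-component mixed linear regression with
    Gaussian errors (a low-dimensional sketch). X: (n, d), y: (n,)."""
    # E-step: posterior probability that each observation came from component 1.
    r1 = pi * np.exp(-(y - X @ beta1) ** 2 / (2 * sigma**2))
    r2 = (1 - pi) * np.exp(-(y - X @ beta2) ** 2 / (2 * sigma**2))
    gamma = r1 / (r1 + r2)
    # M-step: weighted least squares within each component.
    beta1 = np.linalg.solve(X.T @ (gamma[:, None] * X), X.T @ (gamma * y))
    beta2 = np.linalg.solve(X.T @ ((1 - gamma)[:, None] * X), X.T @ ((1 - gamma) * y))
    return beta1, beta2, gamma.mean()
```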


How Does Mixup Help With Robustness and Generalization?

arXiv.org Machine Learning

Mixup is a popular data augmentation technique based on taking convex combinations of pairs of examples and their labels. This simple technique has been shown to substantially improve both the robustness and the generalization of the trained model. However, it is not well understood why such improvement occurs. In this paper, we provide theoretical analysis to demonstrate how using Mixup in training helps model robustness and generalization. For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss. This explains why models obtained by Mixup training exhibit robustness to several kinds of adversarial attacks, such as the Fast Gradient Sign Method (FGSM). For generalization, we prove that Mixup augmentation corresponds to a specific type of data-adaptive regularization which reduces overfitting. Our analysis provides new insights and a framework for understanding Mixup.
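
For reference, a minimal implementation of the standard Mixup augmentation (Zhang et al., 2018) that the analysis concerns; labels are assumed one-hot (or otherwise real-valued) so that the convex combination is well defined.

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=np.random.default_rng()):
    """Mixup augmentation: convex combinations of random pairs of examples
    and their labels, with mixing weight drawn from Beta(alpha, alpha).
    x: (n, ...) inputs, y: (n, n_classes) one-hot labels."""
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))        # random pairing of examples
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix
```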


Interpreting Robust Optimization via Adversarial Influence Functions

arXiv.org Artificial Intelligence

Robust optimization has been widely used in modern data science, especially in adversarial training. However, little research has been done to quantify how robust optimization changes the optimizers and the prediction losses compared to standard training. In this paper, inspired by the influence function in robust statistics, we introduce the Adversarial Influence Function (AIF) as a tool to investigate the solution produced by robust optimization. The proposed AIF enjoys a closed form and can be calculated efficiently. To illustrate its usage, we apply the AIF to study model sensitivity, a quantity defined to capture the change of prediction losses on natural data after implementing robust optimization. We use the AIF to analyze how model complexity and randomized smoothing affect model sensitivity for specific models. We further derive the AIF for kernel regressions, with a particular application to neural tangent kernels, and experimentally demonstrate the effectiveness of the proposed AIF. Lastly, we extend the theory of the AIF to distributionally robust optimization.
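
For intuition, the classical influence-function construction that inspires the AIF can be sketched as follows: to first order, an optimizer moves by $-H^{-1}g$ under a perturbation of the objective, where $H$ is the Hessian of the loss at the optimum and $g$ is the gradient of the perturbation. The helper below illustrates this generic computation; the AIF itself replaces the generic perturbation with the adversarial one and has its own closed form in the paper.

```python
import numpy as np

def influence_direction(hessian, grad_perturbation):
    """First-order (influence-function) approximation to how an optimizer
    moves when the objective is perturbed: delta(theta*) ~ -H^{-1} g.
    A sketch of the classical construction from robust statistics that
    the AIF builds on, not the AIF's own closed form."""
    return -np.linalg.solve(hessian, grad_perturbation)
```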


Improving Adversarial Robustness via Unlabeled Out-of-Domain Data

arXiv.org Machine Learning

Robustness to adversarial attacks has been a major focus in machine learning security [4, 12, 26] and has been intensively studied in the past few years [8, 15, 32]. However, the theoretical understanding of adversarial robustness is still far from satisfactory. Research [36] has demonstrated that sample complexity may be one of the obstacles to achieving high robustness under standard learning, which is a major challenge since labeled examples are few and expensive in many real-world applications. To address this challenge, recent works [9, 37] showed that adversarial robustness can be improved by leveraging unlabeled data that come from the same distribution/domain as the original labeled training samples. Nevertheless, this approach is still limited by the difficulty of ensuring that the unlabeled data come from exactly the same distribution as the labeled data. For example, gathering a large number of unlabeled images that follow the same distribution as CIFAR-10 is challenging, since one would have to carefully match lighting conditions, backgrounds, etc. Meanwhile, out-of-domain unlabeled data can be much easier and cheaper to collect. For instance, we used the Bing search engine to query a small number of keywords and, within hours, generated a new 500k-image dataset of noisy CIFAR-10 categories; we call this dataset Cheap-10.
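
One simple way to leverage such unlabeled data is confidence-thresholded pseudo-labeling, sketched below; this is a generic self-training step rather than the paper's full pipeline, and `model.predict_proba` is an assumed scikit-learn-style interface.

```python
import numpy as np

def pseudo_label(model, x_unlabeled, threshold=0.9):
    """Assign pseudo-labels to unlabeled (possibly out-of-domain) examples
    using a model trained on labeled data, keeping only confident ones.
    A generic self-training sketch of 'leveraging unlabeled data'."""
    probs = model.predict_proba(x_unlabeled)          # (n, n_classes)
    conf, labels = probs.max(axis=1), probs.argmax(axis=1)
    keep = conf >= threshold                          # drop low-confidence points
    return x_unlabeled[keep], labels[keep]
```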