AITopics | Sun, Yi

Sun, Yi

TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance

Wen, Hongtao, Yan, Jianhang, Peng, Wanli, Sun, Yi

arXiv.org Artificial Intelligence, Jul-25-2022

Grasp pose estimation is an important issue for robots to interact with the real world. However, most existing methods require exact 3D object models to be available beforehand or a large number of grasp annotations for training. To avoid these problems, we propose TransGrasp, a category-level grasp pose estimation method that predicts grasp poses of a category of objects by labeling only one object instance. Specifically, we perform grasp pose transfer across a category of objects based on their shape correspondences and propose a grasp pose refinement module to further fine-tune the grasp poses of grippers so as to ensure successful grasps. Experiments demonstrate the effectiveness of our method in achieving high-quality grasps with the transferred grasp poses.

artificial intelligence, grasp pose, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2207.07861

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.83)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.68)

NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction

Sun, Yi, Zheng, Yu, Hao, Chao, Qiu, Hangping

arXiv.org Artificial Intelligence, Sep-8-2021

Using prompts to apply language models to various downstream tasks, also known as prompt-based learning or prompt-learning, has recently achieved significant success compared with the pre-train and fine-tune paradigm. Nonetheless, virtually all prompt-based methods are token-level, meaning they all utilize GPT's left-to-right language model or BERT's masked language model to perform cloze-style tasks. In this paper, we attempt to accomplish several NLP tasks in the zero-shot scenario using an original BERT pre-training task abandoned by RoBERTa and other models--Next Sentence Prediction (NSP). Unlike token-level techniques, our sentence-level prompt-based method NSP-BERT does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking with ease. Based on the characteristics of NSP-BERT, we offer several quick building templates for various downstream tasks. In particular, we suggest a two-stage prompt method for word sense disambiguation tasks. Our strategies for mapping the labels significantly enhance the model's performance on sentence pair tasks. On the FewCLUE benchmark, our NSP-BERT outperforms other zero-shot methods on most of these tasks and comes close to the few-shot methods.

artificial intelligence, natural language, nsp-bert, (14 more...)

arXiv.org Artificial Intelligence

2109.03564

Country:

Europe > Spain (0.28)
Asia (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Proof: Accelerating Approximate Aggregation Queries with Expensive Predicates

Kang, Daniel, Guibas, John, Bailis, Peter, Hashimoto, Tatsunori, Sun, Yi, Zaharia, Matei

arXiv.org Machine Learning, Jul-28-2021

Given a dataset $\mathcal{D}$, we are interested in computing the mean of a subset of $\mathcal{D}$ which matches a predicate. ABae leverages stratified sampling and proxy models to efficiently compute this statistic given a sampling budget $N$. In this document, we theoretically analyze ABae and show that the MSE of the estimate decays at rate $O(N_1^{-1} + N_2^{-1} + N_1^{1/2}N_2^{-3/2})$, where $N=K \cdot N_1+N_2$ for some integer constant $K$ and $K \cdot N_1$ and $N_2$ represent the number of samples used in Stage 1 and Stage 2 of ABae respectively. Hence, if a constant fraction of the total sample budget $N$ is allocated to each stage, we will achieve a mean squared error of $O(N^{-1})$ which matches the rate of mean squared error of the optimal stratified sampling algorithm given a priori knowledge of the predicate positive rate and standard deviation per stratum.
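
The final claim of the abstract above can be checked with a short calculation. This is only a sketch of the arithmetic: the constants $c_1$ and $c_2$ are hypothetical names for the fractions of the budget given to each stage and do not appear in the source.

```latex
% Assumption (not from the source): N_1 = c_1 N and N_2 = c_2 N
% for fixed constants c_1, c_2 > 0 with K c_1 + c_2 = 1.
\begin{align*}
N_1^{-1} &= c_1^{-1} N^{-1}, \\
N_2^{-1} &= c_2^{-1} N^{-1}, \\
N_1^{1/2} N_2^{-3/2} &= c_1^{1/2} c_2^{-3/2} \, N^{1/2 - 3/2}
                      = c_1^{1/2} c_2^{-3/2} N^{-1}.
\end{align*}
% Each term is a constant multiple of N^{-1}, so
% O(N_1^{-1} + N_2^{-1} + N_1^{1/2} N_2^{-3/2}) = O(N^{-1}).
```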

artificial intelligence, high probability, probability, (15 more...)

arXiv.org Machine Learning

2107.12525

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (0.68)

Data augmentation as stochastic optimization

Hanin, Boris, Sun, Yi

arXiv.org Machine Learning, Oct-21-2020

We present a theoretical framework recasting data augmentation as stochastic optimization for a sequence of time-varying proxy losses. This provides a unified approach to understanding techniques commonly thought of as data augmentation, including synthetic noise and label-preserving transformations, as well as more traditional ideas in stochastic optimization such as learning rate and batch size scheduling. We prove a time-varying Monro-Robbins theorem with rates of convergence which gives conditions on the learning rate and augmentation schedule under which augmented gradient descent converges. Special cases give provably good joint schedules for augmentation with additive noise, minibatch SGD, and minibatch SGD with noise. Implementing gradient-based optimization in practice requires many choices. These include hyperparameters such as learning rate and batch size as well as data augmentation, a popular set of techniques in which data is augmented (i.e.

artificial intelligence, augmentation, machine learning, (17 more...)

arXiv.org Machine Learning

2010.11171

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Testing Robustness Against Unforeseen Adversaries

Kang, Daniel, Sun, Yi, Hendrycks, Dan, Brown, Tom, Steinhardt, Jacob

arXiv.org Machine Learning, Aug-21-2019

Considerable work on adversarial defense has studied robustness to a fixed, known family of adversarial distortions, most frequently L_p-bounded distortions. In reality, the specific form of attack will rarely be known and adversaries are free to employ distortions outside of any fixed set. The present work advocates measuring robustness against this much broader range of unforeseen attacks---attacks whose precise form is not known when designing a defense. We propose a methodology for evaluating a defense against a diverse range of distortion types together with a summary metric UAR that measures the Unforeseen Attack Robustness against a distortion. We construct novel JPEG, Fog, Gabor, and Snow adversarial attacks to simulate unforeseen adversaries and perform a careful study of adversarial robustness against these and existing distortion types. We find that evaluation against existing L_p attacks yields highly correlated information that may not generalize to other attacks and identify a set of 4 attacks that yields more diverse information. We further find that adversarial training against either one or multiple distortions, including our novel ones, does not confer robustness to unforeseen distortions. These results underscore the need to study robustness against unforeseen distortions and provide a starting point for doing so.

deep learning, neural network, robustness, (19 more...)

arXiv.org Machine Learning

1908.08016

Country: Asia (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Security & Privacy (0.89)
Information Technology > Artificial Intelligence > Vision (0.68)

Learning Fair Classifiers in Online Stochastic Settings

Sun, Yi, Ramirez, Ivan, Cuesta-Infante, Alfredo, Veeramachaneni, Kalyan

arXiv.org Machine Learning, Aug-19-2019

In many real-life situations, including job and loan applications, gatekeepers must make justified, real-time decisions about a person's fitness for a particular opportunity. People on both sides of such decisions have understandable concerns about their fairness, especially when they occur online or algorithmically. In this paper we consider the setting where we try to satisfy approximate fairness in an online decision-making process where examples are sampled i.i.d. from an underlying distribution. The fairness metric we consider is "equalized odds", which requires approximately equal false positive rates and false negative rates across groups. Our work follows from the classical learning from experts scheme and extends the multiplicative weights algorithm by maintaining an estimation for label distribution and keeping separate weights for label classes as well as groups. Our theoretical results show that approximate equalized odds can be achieved without sacrificing much regret from some distributions. We also demonstrate the algorithm on real data sets commonly used by the fairness community.
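
To make the "equalized odds" criterion in the abstract above concrete, here is a minimal Python sketch that measures how far a set of binary predictions is from satisfying it. It is not the paper's online algorithm; the function names `group_error_rates` and `equalized_odds_gap`, and the choice of the largest pairwise difference as the summary, are assumptions made for illustration.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Labels and predictions are assumed to be 0/1. A group with no
    negatives (or no positives) gets a rate of 0.0 for that error type.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for y, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        if y == 1:
            c["pos"] += 1
            c["fn"] += int(p == 0)  # positive example predicted negative
        else:
            c["neg"] += 1
            c["fp"] += int(p == 1)  # negative example predicted positive
    rates = {}
    for g, c in counts.items():
        fpr = c["fp"] / c["neg"] if c["neg"] else 0.0
        fnr = c["fn"] / c["pos"] if c["pos"] else 0.0
        rates[g] = (fpr, fnr)
    return rates


def equalized_odds_gap(y_true, y_pred, groups):
    """Largest difference in FPR or FNR between any two groups (hypothetical summary)."""
    rates = group_error_rates(y_true, y_pred, groups)
    fprs = [r[0] for r in rates.values()]
    fnrs = [r[1] for r in rates.values()]
    return max(max(fprs) - min(fprs), max(fnrs) - min(fnrs))


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(group_error_rates(y_true, y_pred, groups))
    print(equalized_odds_gap(y_true, y_pred, groups))
```

A gap of 0 means both error rates match exactly across groups; "approximate" equalized odds corresponds to keeping this gap below some tolerance.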

algorithm, artificial intelligence, banking & finance, (16 more...)

arXiv.org Machine Learning

1908.07009

Genre: Research Report > New Finding (0.34)

Industry:

Banking & Finance (0.48)
Education > Educational Setting (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Transfer of Adversarial Robustness Between Perturbation Types

Kang, Daniel, Sun, Yi, Brown, Tom, Hendrycks, Dan, Steinhardt, Jacob

arXiv.org Machine Learning, May-3-2019

We study the transfer of adversarial robustness of deep neural networks between different perturbation types. While most work on adversarial examples has focused on $L_\infty$ and $L_2$-bounded perturbations, these do not capture all types of perturbations available to an adversary. The present work evaluates 32 attacks of 5 different types against models adversarially trained on a 100-class subset of ImageNet. Our empirical results suggest that evaluating on a wide range of perturbation sizes is necessary to understand whether adversarial robustness transfers between perturbation types. We further demonstrate that robustness against one perturbation type may not always imply and may sometimes hurt robustness against other perturbation types. In light of these results, we recommend evaluation of adversarial defenses take place on a diverse range of perturbation types and sizes.
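
For readers unfamiliar with the terminology in the abstract above, a perturbation is $L_\infty$- or $L_2$-bounded when its norm stays within a budget. The sketch below shows the standard projections onto those two norm balls in plain Python; the function names and the budget parameter `eps` are illustrative assumptions, and none of this is taken from the paper's code.

```python
import math


def project_linf(delta, eps):
    """Clip each coordinate to [-eps, eps], giving an L_inf norm of at most eps."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps):
    """Rescale delta so its Euclidean (L_2) norm is at most eps."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(delta)  # already inside the ball
    scale = eps / norm
    return [d * scale for d in delta]


if __name__ == "__main__":
    print(project_linf([0.3, -0.5, 0.05], eps=0.1))
    print(project_l2([3.0, 4.0], eps=1.0))
```

The two constraints allow different perturbations: the $L_\infty$ ball permits a small change to every coordinate at once, while the $L_2$ ball limits the total energy of the change.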

deep learning, neural network, robustness, (18 more...)

arXiv.org Machine Learning

1905.01034

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Limited Gradient Descent: Learning With Noisy Labels

Sun, Yi, Tian, Yan, Xu, Yiping

arXiv.org Machine Learning, Dec-6-2018

Label noise may handicap the generalization of classifiers, and the effective learning of the main pattern from samples with noisy labels is an important issue. Recent studies have shown that deep neural networks tend to prioritize the learning of simple patterns over the memorization of noise patterns. This suggests the need for a method to search for the best generalization that learns the main pattern until noise begins to be memorized. An intuitive idea is to use a supervised approach to find the stop timing of learning by, for example, employing a clean verification set. In practice, however, a clean verification set is sometimes difficult to obtain. To solve this problem, we propose an unsupervised method called limited gradient descent to estimate the best stop timing. We modified the labels of a few samples in a noisy dataset to be almost false labels, creating a reverse pattern. By monitoring the learning progress of the noisy samples and the reverse samples, we could determine the stop timing of learning. In this paper, we also provide some sufficient conditions on learning with noisy labels. Experimental results on CIFAR-10 demonstrate that our approach has a similar generalization performance to supervised methods. For uncomplicated datasets, such as MNIST, we add a relabeling strategy to further improve generalization and achieve state-of-the-art performance.

deep learning, neural network, noisy label, (17 more...)

arXiv.org Machine Learning

1811.08117

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Learning Vine Copula Models For Synthetic Data Generation

Sun, Yi, Cuesta-Infante, Alfredo, Veeramachaneni, Kalyan

arXiv.org Machine Learning, Dec-4-2018

A vine copula model is a flexible high-dimensional dependence model which uses only bivariate building blocks. However, the number of possible configurations of a vine copula grows exponentially as the number of variables increases, making model selection a major challenge in development. In this work, we formulate a vine structure learning problem with both vector and reinforcement learning representations. We use a neural network to find the embeddings for the best possible vine model and generate a structure. Through experiments on synthetic and real-world datasets, we show that our proposed approach fits the data better in terms of log-likelihood. Moreover, we demonstrate that the model is able to generate high-quality samples in a variety of applications, making it a good candidate for synthetic data generation.

deep learning, neural network, vine, (18 more...)

arXiv.org Machine Learning

1812.01226

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

From random walks to distances on unweighted graphs

Hashimoto, Tatsunori, Sun, Yi, Jaakkola, Tommi

Neural Information Processing Systems, Dec-31-2015

Large unweighted directed graphs are commonly used to capture relations between entities. A fundamental problem in the analysis of such networks is to properly define the similarity or dissimilarity between any two vertices. Despite the significance of this problem, statistical characterization of the proposed metrics has been limited. We introduce and develop a class of techniques for analyzing random walks on graphs using stochastic calculus. Using these techniques we generalize results on the degeneracy of hitting times and analyze a metric based on the Laplace transformed hitting time (LTHT). The metric serves as a natural, provably well-behaved alternative to the expected hitting time. We establish a general correspondence between hitting times of the Brownian motion and analogous hitting times on the graph. We show that the LTHT is consistent with respect to the underlying metric of a geometric graph, preserves clustering tendency, and remains robust against random addition of non-geometric edges. Tests on simulated and real-world data show that the LTHT matches theoretical predictions and outperforms alternatives.
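
As an illustration of the quantity named in the abstract above, the following Python sketch computes the Laplace transform of the hitting time, $E_i[e^{-\beta T_j}]$, for a simple random walk on an unweighted directed graph. It assumes a uniform step to a random out-neighbor and a discount parameter `beta > 0`; the function name, the adjacency-list input, and the fixed-point iteration are choices made here, and the exact normalization of the paper's metric is not given in the abstract, so this shows only the underlying transform.

```python
import math


def laplace_hitting_time(adj, target, beta, tol=1e-12, max_iter=100000):
    """Return u where u[i] = E_i[exp(-beta * T_target)] for a simple random walk.

    adj[i] lists the out-neighbors of vertex i. The values satisfy
    u[target] = 1 and u[i] = exp(-beta) * mean(u[k] for k in adj[i]),
    which is solved by fixed-point iteration (a contraction for beta > 0).
    """
    n = len(adj)
    discount = math.exp(-beta)
    u = [0.0] * n
    u[target] = 1.0
    for _ in range(max_iter):
        new_u = []
        for i in range(n):
            if i == target:
                new_u.append(1.0)
            elif not adj[i]:
                new_u.append(0.0)  # dead end: the walk never reaches the target
            else:
                new_u.append(discount * sum(u[k] for k in adj[i]) / len(adj[i]))
        change = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if change < tol:
            break
    return u


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 with edges in both directions, target vertex 2.
    adj = [[1], [0, 2], [1]]
    print(laplace_hitting_time(adj, target=2, beta=0.5))
```

Vertices that tend to reach the target quickly get values near 1, and vertices that cannot reach it get 0, so the transform stays bounded even where the expected hitting time would be infinite.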

artificial intelligence, data mining, graph, (17 more...)

Neural Information Processing Systems

Industry: Energy > Oil & Gas (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.95)
