
Collaborating Authors

 Xiao, Zhenxin


RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

arXiv.org Artificial Intelligence

The fixed-size context of the Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the long short-term memory mechanism of an LSTM. At each timestep, RecurrentGPT generates a paragraph of text and updates its language-based long-term and short-term memory, stored on the hard drive and in the prompt, respectively. This recurrence mechanism enables RecurrentGPT to generate text of arbitrary length without forgetting. Because human users can easily observe and edit these natural-language memories, RecurrentGPT is interpretable and enables interactive generation of long text. RecurrentGPT is thus an initial step towards next-generation computer-assisted writing systems that go beyond local editing suggestions. Beyond producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as interactive fiction that directly interacts with consumers. We call this usage of generative models ``AI As Contents'' (AIAC), which we believe is the next form of conventional AIGC. We further demonstrate the possibility of using RecurrentGPT to create personalized interactive fiction that interacts with readers instead of with writers. More broadly, RecurrentGPT demonstrates the utility of borrowing ideas from popular model designs in cognitive science and deep learning for prompting LLMs. Our code is available at https://github.com/aiwaves-cn/RecurrentGPT and an online demo is available at https://www.aiwaves.org/recurrentgpt.
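The recurrence described in the abstract is straightforward to sketch. The following is a minimal illustration, not the paper's implementation: it assumes a hypothetical call_llm(prompt) helper wrapping an LLM API, and the memory layout (one short-term summary carried in the prompt, a list of older summaries kept on disk) is a simplifying assumption about the scheme.

```python
# Minimal sketch of a RecurrentGPT-style recurrence loop.
# Assumption: call_llm(prompt) -> str wraps a real LLM backend (e.g., the
# ChatGPT API); the memory format below is illustrative, not the paper's.
import json
from pathlib import Path

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM call; wire this up to a real backend."""
    raise NotImplementedError

def recurrent_generate(topic: str, steps: int, memory_path: str = "memory.json") -> str:
    memory_file = Path(memory_path)
    # Long-term memory lives on disk, so a human can inspect and edit it.
    long_memory = json.loads(memory_file.read_text()) if memory_file.exists() else []
    short_memory = f"Story about: {topic}"  # condensed summary kept in the prompt

    paragraphs = []
    for _ in range(steps):
        prompt = (
            "You are writing a long story one paragraph at a time.\n"
            f"Short-term memory (recent summary): {short_memory}\n"
            f"Long-term memory (earlier summaries): {long_memory[-3:]}\n"
            "Write the next paragraph, then on a new line output:\n"
            "SUMMARY: <one-sentence summary of the new paragraph>"
        )
        output = call_llm(prompt)
        paragraph, _, summary = output.partition("SUMMARY:")
        paragraphs.append(paragraph.strip())

        # Update the recurrence state: the old short-term memory is archived
        # to disk, and the new summary becomes the next short-term memory.
        long_memory.append(short_memory)
        short_memory = summary.strip()
        memory_file.write_text(json.dumps(long_memory))

    return "\n\n".join(paragraphs)
```

Because the state that carries across timesteps is plain natural language in a plain file, a user can pause the loop, rewrite a summary, and resume, which is the interactivity the abstract describes.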


Toward Finding The Global Optimal of Adversarial Examples

arXiv.org Machine Learning

Current machine learning models are vulnerable to adversarial examples (Goodfellow et al., 2014), and we observe that current state-of-the-art methods for attacking a well-trained model (Kurakin et al., 2016; Cheng et al., 2018) often get stuck at local optima. We conduct a series of experiments in both white-box and black-box settings and find that, with different initializations, the attack algorithm converges to very different local optima, suggesting the importance of a careful and thorough search of the attack space. In this paper, we propose a general boosting algorithm that helps current attacks find adversarial examples closer to the global optimum. Specifically, we search for adversarial examples starting from different points/directions; at regular intervals we apply successive halving (Jamieson & Talwalkar, 2016) to prune search directions that are not promising, and we use Bayesian optimization (Pelikan et al., 1999; Bergstra et al., 2011) to resample from the search space based on the knowledge obtained from past searches. We demonstrate that applying our method to state-of-the-art attack algorithms in both black-box and white-box settings further reduces the distortion between the original image and the adversarial example by about 10%-20%. By adopting dynamic successive halving, we reduce the computation cost by 5-10 times without harming the final result. We conduct experiments on models trained on MNIST and ImageNet, as well as on decision tree models; these experiments suggest that our method is a general way to boost the performance of current adversarial attack methods.
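The restart-and-prune part of this procedure can be sketched compactly. Below is a minimal, illustrative Python sketch of multi-restart search combined with successive halving; it omits the Bayesian-optimization resampling step, and both toy_distortion (a synthetic landscape) and refine (a naive local search) are hypothetical stand-ins for a real attack that queries the target model.

```python
# Sketch: boosting an iterative attack with random restarts plus
# successive halving (Jamieson & Talwalkar, 2016). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def toy_distortion(direction: np.ndarray) -> float:
    # Synthetic multimodal stand-in for "distortion reached along this
    # search direction"; a real attack would query the target model here.
    return float(1.5 + np.sin(4.0 * direction[0]) + 0.5 * np.cos(7.0 * direction[1]))

def refine(direction: np.ndarray, steps: int):
    # Stand-in for one attack refinement run: simple random local search
    # over unit directions. Replace with a real white-/black-box attack.
    best = direction / np.linalg.norm(direction)
    best_dist = toy_distortion(best)
    for _ in range(steps):
        cand = best + 0.1 * rng.standard_normal(best.shape)
        cand /= np.linalg.norm(cand)
        dist = toy_distortion(cand)
        if dist < best_dist:
            best, best_dist = cand, dist
    return best, best_dist

def boosted_attack(dim: int = 16, n_restarts: int = 16, base_steps: int = 25):
    # Start from many random directions, then repeatedly refine every
    # survivor and drop the worse half, doubling the per-survivor budget
    # each round so total cost stays roughly balanced across rounds.
    candidates = [rng.standard_normal(dim) for _ in range(n_restarts)]
    steps = base_steps
    while len(candidates) > 1:
        scored = sorted((refine(c, steps) for c in candidates), key=lambda p: p[1])
        candidates = [c for c, _ in scored[: len(scored) // 2]]
        steps *= 2
    return refine(candidates[0], steps)

direction, distortion = boosted_attack()
print(f"best distortion found: {distortion:.3f}")
```

The pruning is what delivers the claimed 5-10x cost reduction: unpromising restarts receive only the small initial budget, while the full refinement budget is concentrated on the few directions that keep improving.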