Collaborating Authors

 Zhang, Xishan


CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization

arXiv.org Artificial Intelligence

The increasing complexity and high costs associated with modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog due to the scarcity of high-quality instruction-tuning data; even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. In response to this issue, we observe that (1) Verilog code collected from the real world is of higher quality than code generated by LLMs, and (2) LLMs like GPT-3.5 excel at summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then obtaining the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let it generate the corresponding natural-language description through multi-level summarization. Experimental results show that CodeV relatively surpasses the previous open-source SOTA by 14.4% (BetterV on VerilogEval) and 11.3% (RTLCoder on RTLLM), and also relatively outperforms the previous commercial SOTA, GPT-4, by 22.1% on VerilogEval.
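
To make the data-construction idea concrete, here is a minimal sketch of a code-to-description pipeline in the spirit of multi-level summarization. All names (`query_llm`, the prompt wording, the two summarization levels) are illustrative assumptions, not CodeV's actual implementation:

```python
# Hypothetical sketch: real-world Verilog is summarized by an LLM into a
# natural-language description, and the (description, code) pair becomes
# an instruction-tuning example. `query_llm` is a placeholder for any
# chat-completion client.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def multi_level_summarize(verilog_code: str) -> str:
    # Level 1: summarize low-level details (signals, always blocks, ports).
    detail = query_llm(
        "Explain what each block of this Verilog module does:\n" + verilog_code
    )
    # Level 2: condense the detailed notes into a high-level functional spec,
    # phrased as a task a user might ask an assistant to implement.
    spec = query_llm(
        "Based on these notes, write a concise problem description whose "
        "solution is the module below.\nNotes:\n" + detail +
        "\nModule:\n" + verilog_code
    )
    return spec

def build_instruction_pair(verilog_code: str) -> dict:
    return {"instruction": multi_level_summarize(verilog_code),
            "response": verilog_code}
```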


InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct

arXiv.org Artificial Intelligence

Recent advancements in open-source code large language models (LLMs) have demonstrated remarkable coding abilities by fine-tuning on data generated by powerful closed-source LLMs such as GPT-3.5 and GPT-4 for instruction tuning. This paper explores how to further improve an instruction-tuned code LLM by generating data from the model itself rather than querying closed-source LLMs. Our key observation is the asymmetry between translating formal and informal languages: translating formal language (i.e., code) to informal language (i.e., natural language) is more straightforward than the reverse. Based on this observation, we propose INVERSE-INSTRUCT, which summarizes instructions from code snippets instead of generating code from instructions. Specifically, given an instruction-tuning corpus for code and the resulting instruction-tuned code LLM, we ask the code LLM to generate additional high-quality instructions for the original corpus through code summarization and self-evaluation. We then fine-tune the base LLM on the combination of the original corpus and the self-generated one, which yields a stronger instruction-tuned LLM. We present a series of code LLMs named InverseCoder, which surpasses the performance of the original code LLMs on a wide range of benchmarks, including Python text-to-code generation, multilingual coding, and data-science code generation.
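
The summarize-then-self-evaluate loop can be sketched in a few lines. The `generate` function below stands in for sampling from the instruction-tuned model, and the prompt phrasing is a hedged guess at the idea rather than the paper's exact procedure:

```python
# Minimal sketch of the Inverse-Instruct loop: code -> candidate instruction
# -> self-evaluation filter -> augmented corpus. Names are illustrative.

def generate(model, prompt: str) -> str:
    raise NotImplementedError("sample from the instruction-tuned code LLM")

def inverse_instruct(model, corpus):
    """corpus: list of (instruction, code) pairs from the original data."""
    new_pairs = []
    for _, code in corpus:
        # Code -> instruction: summarization is easier than generation.
        candidate = generate(model, f"Write an instruction for this code:\n{code}")
        # Self-evaluation: keep the pair only if the model judges that the
        # instruction faithfully describes the code.
        verdict = generate(model, "Does this instruction match the code? "
                                  f"Answer yes or no.\nInstruction: {candidate}\n"
                                  f"Code:\n{code}")
        if verdict.strip().lower().startswith("yes"):
            new_pairs.append((candidate, code))
    # The base LLM is then fine-tuned on corpus + new_pairs (not shown).
    return corpus + new_pairs
```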


Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification

arXiv.org Artificial Intelligence

Building open-ended agents has always been an ultimate goal in AI research, and creative agents are among the most enticing. Existing LLM agents excel at long-horizon tasks with well-defined goals (e.g., 'mine diamonds' in Minecraft). However, they struggle on creative tasks with open goals and abstract criteria, because they cannot bridge the gap between the two and thus lack feedback for self-improvement. In this work, we introduce autonomous embodied verification techniques for agents to fill this gap, laying the groundwork for creative tasks. Specifically, we propose the Luban agent, which targets creative building tasks in Minecraft and is equipped with two-level autonomous embodied verification inspired by human design practices: (1) visual verification of 3D structural speculations, which come from agent-synthesized CAD modeling programs; and (2) pragmatic verification of the creation, by generating and verifying environment-relevant functionality programs based on the abstract criteria. Extensive multi-dimensional human studies and Elo ratings show that Luban completes diverse creative building tasks in our proposed benchmark and outperforms other baselines (33% to 100%) in both visualization and pragmatism. Additional demos on a real-world robotic arm show the creation potential of Luban in the physical world.
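
As a rough structural skeleton only, the two-level verification loop might look as follows; every function name here is a hypothetical placeholder, and the real agent's control flow is certainly richer:

```python
# Hypothetical skeleton of a two-level verification loop: draft a CAD-like
# building program, check the rendered structure visually, then check
# functionality against the abstract criteria, revising on failure.

def propose_cad_program(goal): ...
def render_structure(program): ...
def visual_check(image, goal): ...
def functional_check(structure, criteria): ...

def creative_build_loop(goal, criteria, max_rounds=5):
    for _ in range(max_rounds):
        program = propose_cad_program(goal)
        image = render_structure(program)
        if not visual_check(image, goal):        # level 1: does it look right?
            continue                             # revise the CAD program
        if functional_check(program, criteria):  # level 2: does it work?
            return program
    return None
```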


Emergent Communication for Rules Reasoning

arXiv.org Artificial Intelligence

Research on emergent communication between deep-learning-based agents has received extensive attention due to its inspiration for linguistics and artificial intelligence. However, previous attempts have hovered around emergent communication in perception-oriented environmental settings, which force agents to describe low-level perceptual features within image or symbol contexts. In this work, inspired by the classic human reasoning test (namely, Raven's Progressive Matrices), we propose the Reasoning Game, a cognition-oriented environment that encourages agents to reason about and communicate high-level rules rather than perceived low-level contexts. Moreover, we propose (1) an unbiased dataset (namely, rule-RAVEN) as a benchmark to avoid overfitting, and (2) a two-stage curriculum agent training method as a baseline for more stable convergence in the Reasoning Game, where contexts and semantics drift bilaterally. Experimental results show that, in the Reasoning Game, a semantically stable and compositional language emerges to solve reasoning problems. The emergent language helps agents apply the extracted rules to generalize to unseen context attributes and to transfer between different context attributes or even tasks.
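
The key property of the protocol, that only the rule and not the raw percepts is worth communicating, can be illustrated with a toy version of the game. The two hand-coded rules and the "perfect" speaker/listener below are illustrative assumptions, not the paper's setup:

```python
# Toy Reasoning Game: the speaker observes a context generated by a hidden
# rule and sends a discrete message; the listener must apply that rule to a
# *different* context, so describing low-level percepts is useless.
import random

RULES = {"add1": lambda seq: seq[-1] + 1, "double": lambda seq: seq[-1] * 2}

def play_round(speaker, listener, rng=None):
    rng = rng or random.Random(0)
    rule_name = rng.choice(sorted(RULES))
    rule = RULES[rule_name]
    ctx_speaker = [rng.randint(2, 5)]      # start > 1 so the rules never coincide
    for _ in range(2):
        ctx_speaker.append(rule(ctx_speaker))
    msg = speaker(ctx_speaker)             # discrete message: ideally the rule
    ctx_listener = [rng.randint(2, 5)]     # different low-level context
    return listener(msg, ctx_listener) == rule(ctx_listener)

# A hand-coded "perfect" pair: the speaker names the rule it infers, and the
# listener applies the named rule to its own context.
speaker = lambda ctx: "add1" if ctx[1] == ctx[0] + 1 else "double"
listener = lambda msg, ctx: RULES[msg](ctx)
print(play_round(speaker, listener))       # True
```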


Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning

arXiv.org Artificial Intelligence

In the field of multi-task reinforcement learning, the modular principle, which involves specializing functionalities into different modules and combining them appropriately, has been widely adopted as a promising way to prevent the negative transfer problem, i.e., performance degradation due to conflicts between tasks. However, most existing multi-task RL methods only combine shared modules at the task level, ignoring that there may be conflicts within a task. In addition, these methods overlook that, without constraints, some modules may learn similar functions, which restricts the expressiveness and generalization capability of modular methods. In this paper, we propose the Contrastive Modules with Temporal Attention (CMTA) method to address these limitations. CMTA constrains the modules to be different from each other via contrastive learning and combines shared modules at a finer granularity than the task level with temporal attention, alleviating negative transfer within a task and improving both the generalization ability and the performance of multi-task RL. We conducted experiments on Meta-World, a multi-task RL benchmark containing various robotic manipulation tasks. Experimental results show that CMTA outperforms learning each task individually for the first time and achieves substantial performance improvements over the baselines.
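
A minimal sketch of the two ingredients, per-timestep attention over shared modules and a contrastive term that pushes module outputs apart, is given below. This is an illustrative stand-in, not the authors' architecture or loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModuleMixer(nn.Module):
    """Illustrative: k shared modules mixed per input (i.e., per timestep,
    finer than task-level) by a learned attention distribution."""
    def __init__(self, dim, n_modules=4):
        super().__init__()
        self.modules_ = nn.ModuleList(nn.Linear(dim, dim)
                                      for _ in range(n_modules))
        self.attn = nn.Linear(dim, n_modules)

    def forward(self, x):                                   # x: (B, D)
        outs = torch.stack([m(x) for m in self.modules_], dim=1)  # (B, k, D)
        w = F.softmax(self.attn(x), dim=-1).unsqueeze(-1)         # (B, k, 1)
        return (w * outs).sum(dim=1), outs

def module_contrastive_loss(outs):
    # Penalize similarity between different modules' outputs on the same
    # input so modules specialize (a simple stand-in for the paper's
    # contrastive objective).
    z = F.normalize(outs, dim=-1)                    # (B, k, D)
    sim = torch.einsum('bkd,bjd->bkj', z, z)         # pairwise cosine sims
    k = sim.size(1)
    off_diag = sim - torch.eye(k, device=sim.device) # zero self-similarity
    return off_diag.abs().mean()
```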


Efficient Symbolic Policy Learning with Differentiable Symbolic Expression

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes them difficult to understand and to deploy with limited computational resources. Currently, employing compact symbolic expressions as symbolic policies is a promising strategy for obtaining simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which are inefficient and limit the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method, Efficient Symbolic Policy Learning (ESPL), that learns the symbolic policy from scratch in an end-to-end way. We introduce a symbolic network as the search space and employ a path selector to find the compact symbolic policy. In this way, we represent the policy with a differentiable symbolic expression and train it in an off-policy manner, which further improves efficiency. In addition, in contrast to previous symbolic policies, which only work in single-task RL because of their complexity, we extend ESPL to meta-RL to generate symbolic policies for unseen tasks. Experimentally, we show that our approach generates symbolic policies with higher performance and greatly improves data efficiency for single-task RL. In meta-RL, we demonstrate that, compared with neural network policies, the proposed symbolic policy achieves higher performance and efficiency and shows the potential to be interpretable.
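
To illustrate what "a symbolic network with a path selector" could look like, here is a toy differentiable symbolic layer. The operator set, the sigmoid gating, and the sparsity penalty are all assumptions chosen for the sketch, not ESPL's actual design:

```python
import torch
import torch.nn as nn

class SymbolicLayer(nn.Module):
    """Toy sketch: each unit applies a fixed operator to a learned linear
    combination of inputs, and learnable gates (the 'path selector') decide
    which input-operator paths survive."""
    def __init__(self, in_dim):
        super().__init__()
        self.ops = [torch.sin, torch.tanh, lambda z: z, lambda z: z * z]
        self.w = nn.Parameter(torch.randn(len(self.ops), in_dim) * 0.1)
        self.gate = nn.Parameter(torch.zeros(len(self.ops), in_dim))

    def forward(self, x):                       # x: (B, in_dim)
        # Soft gates in (0, 1); sparsity pressure on them prunes paths, so
        # the trained network reads off as a short symbolic expression.
        g = torch.sigmoid(self.gate)
        pre = x @ (g * self.w).t()              # (B, n_ops)
        cols = [op(pre[:, i]) for i, op in enumerate(self.ops)]
        return torch.stack(cols, dim=1)         # (B, n_ops)

def sparsity_penalty(layer, coef=1e-3):
    # Drives most gates toward zero so only a compact expression remains.
    return coef * torch.sigmoid(layer.gate).sum()
```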


Pushing the Limits of Machine Design: Automated CPU Design with AI

arXiv.org Artificial Intelligence

Design activity -- constructing an artifact description satisfying given goals and constraints -- distinguishes humanity from other animals and from traditional machines, and endowing machines with design abilities at or beyond the human level has been a long-term pursuit. Though machines have already demonstrated their abilities in designing new materials, proteins, and computer programs with advanced artificial intelligence (AI) techniques, the search space for designing such objects is relatively small, and thus "Can machines design like humans?" remains an open question. To explore the boundary of machine design, here we present a new AI approach to automatically design a central processing unit (CPU), the brain of a computer and one of the most intricate devices humanity has ever designed. This approach generates the circuit logic of the CPU design, represented by a graph structure called a Binary Speculation Diagram (BSD), from only external input-output observations instead of formal program code. During the generation of the BSD, Monte Carlo-based expansion and the distance of Boolean functions are used to guarantee accuracy and efficiency, respectively. By efficiently exploring a search space of unprecedented size, 10^{10^{540}}, which to the best of our knowledge is the largest of any machine-designed object, and thus pushing the limits of machine design, our approach generates an industrial-scale RISC-V CPU within only 5 hours. The taped-out CPU successfully runs the Linux operating system and performs comparably to the human-designed Intel 80486SX CPU. In addition to learning the world's first CPU only from input-output observations, which may reform the semiconductor industry by significantly reducing the design cycle, our approach even autonomously discovers human knowledge of the von Neumann architecture.
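
One of the ingredients, a distance between Boolean functions estimated by Monte Carlo sampling, is easy to illustrate. The snippet below computes a normalized Hamming distance over random input vectors; it shows the kind of function distance involved, not the paper's exact procedure:

```python
import random

def boolean_distance(f, g, n_inputs, samples=1024, seed=0):
    """Monte Carlo estimate of the normalized Hamming distance between two
    Boolean functions: the fraction of random inputs on which they disagree."""
    rng = random.Random(seed)
    disagree = 0
    for _ in range(samples):
        x = tuple(rng.randint(0, 1) for _ in range(n_inputs))
        disagree += f(x) != g(x)
    return disagree / samples

# Example: AND vs OR on 2 inputs differ on 2 of 4 input patterns,
# so the estimate should be near 0.5.
print(boolean_distance(lambda x: x[0] & x[1], lambda x: x[0] | x[1], 2))
```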


Online Prototype Alignment for Few-shot Policy Transfer

arXiv.org Artificial Intelligence

Domain adaptation in reinforcement learning (RL) mainly deals with changes of observation when transferring a policy to a new environment. Many traditional domain adaptation approaches in RL learn a mapping function between the source and target domains in explicit or implicit ways. However, they typically require access to abundant data from the target domain. Besides, they often rely on visual cues to learn the mapping function and may fail when the source domain looks quite different from the target domain. To address these problems, we propose a novel framework, Online Prototype Alignment (OPA), which learns the mapping function based on the functional similarity of elements and achieves few-shot policy transfer within only several episodes. The key insight of OPA is to introduce an exploration mechanism that can interact with the unseen elements of the target domain in an efficient and purposeful manner, and then connect them with the seen elements in the source domain according to their functionalities (instead of visual cues). Experimental results show that when the target domain looks visually different from the source domain, OPA achieves better transfer performance with far fewer samples from the target domain, outperforming prior methods.
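
The core matching step, aligning unseen elements to known prototypes by how they behave rather than how they look, can be sketched as a nearest-neighbor search over functional signatures. The element names and two-dimensional features below are invented for illustration:

```python
import numpy as np

def align_prototypes(source_protos, target_feats):
    """Match each unseen target-domain element to the source prototype with
    the most similar *functional* signature (e.g., statistics of what happens
    when the agent interacts with it), ignoring appearance."""
    mapping = {}
    for elem, feat in target_feats.items():
        dists = {name: np.linalg.norm(feat - proto)
                 for name, proto in source_protos.items()}
        mapping[elem] = min(dists, key=dists.get)
    return mapping

# Example with hypothetical 2-D signatures such as
# (reward delta, probability the element blocks movement):
source = {"wall": np.array([0.0, 1.0]), "coin": np.array([1.0, 0.0])}
target = {"blue_block": np.array([0.05, 0.9]), "gem": np.array([0.95, 0.1])}
print(align_prototypes(source, target))  # {'blue_block': 'wall', 'gem': 'coin'}
```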


Ultra-low Precision Multiplication-free Training for Deep Neural Networks

arXiv.org Artificial Intelligence

Training deep neural networks (DNNs) demands immense energy, which restricts the development of deep learning and increases carbon emissions. Thus, the study of energy-efficient training for DNNs is essential. In training, the linear layers consume the most energy because of their intense use of energy-consuming full-precision (FP32) multiplications in multiply-accumulate (MAC) operations. Energy-efficient works try to decrease the precision of multiplication or to replace multiplication with energy-efficient operations such as addition or bitwise shifts, thereby reducing the energy consumption of FP32 multiplications. However, existing energy-efficient works cannot replace all of the FP32 multiplications during both forward and backward propagation with low-precision energy-efficient operations. In this work, we propose an Adaptive Layer-wise Scaling PoT Quantization (ALS-POTQ) method and a Multiplication-Free MAC (MF-MAC) to replace all of the FP32 multiplications with INT4 additions and 1-bit XOR operations. In addition, we propose Weight Bias Correction and Parameterized Ratio Clipping techniques for stable training and improved accuracy. None of the above methods introduces extra multiplications into our training scheme, so we reduce up to 95.8% of the energy consumption of linear layers during training. Experimentally, we achieve accuracy degradation of less than 1% for CNN models on ImageNet and a Transformer model on the WMT En-De task. In summary, we significantly outperform existing methods in both energy efficiency and accuracy.
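
The reason power-of-two (PoT) quantization removes multiplications is that multiplying by sign * 2^e reduces to an exponent addition, i.e., a bit shift in integer hardware. The sketch below shows this basic mechanism only; ALS-POTQ additionally adapts a per-layer scale, which is not modeled here:

```python
import numpy as np

def pot_quantize(w, scale=1.0):
    """PoT quantization sketch: each weight becomes sign * 2**e with integer
    exponent e, so a multiply by it is a shift in fixed-point hardware."""
    sign = np.sign(w)
    e = np.round(np.log2(np.abs(w / scale) + 1e-12))
    return sign, e.astype(int)

def pot_multiply(x, sign, e, scale=1.0):
    # x * (sign * scale * 2**e): np.ldexp(x, e) computes x * 2**e, which on
    # integer hardware is a left/right shift rather than a multiplication.
    return sign * scale * np.ldexp(x, e)

w = np.array([0.24, -0.5, 1.1])
sign, e = pot_quantize(w)
print(sign * 2.0 ** e)                    # [ 0.25 -0.5   1.  ]: PoT approximation
print(pot_multiply(np.ones(3), sign, e))  # same values, via exponent shifts
```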


Eden: A Unified Environment Framework for Booming Reinforcement Learning Algorithms

arXiv.org Artificial Intelligence

Since AlphaGo defeated top human players, reinforcement learning (RL) algorithms have gradually become a cornerstone of building stronger artificial intelligence (AI). RL algorithm design must first adapt to its specific environment, so well-designed environments guide the rapid and profound development of RL algorithms. However, the existing environments, which can be divided into real-world games and customized toy environments, have obvious shortcomings: real-world games are designed for human entertainment and are too difficult for most RL researchers to work with, while customized toy environments offer no widely accepted, unified evaluation standard for all RL algorithms. Therefore, we introduce the first virtual, user-friendly environment framework for RL. In this framework, the environment can be easily configured to realize all kinds of RL tasks in mainstream research, and all mainstream state-of-the-art (SOTA) RL algorithms can then be conveniently evaluated and compared. Our contributions mainly include the following: (1) a single configurable environment for all classes of SOTA RL algorithms; (2) combined environments spanning more than one class of RL algorithms; and (3) an evaluation standard for all kinds of RL algorithms. With all these efforts, a possibility for breeding an AI with general competency across a variety of tasks is provided, and it may open up a new chapter for AI.
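
As a hedged illustration of what a config-driven environment factory in this spirit could look like (the names and fields below are hypothetical, not Eden's actual API), a single configuration object can select observation type, action space, and task mix so that different families of RL algorithms run against one framework:

```python
from dataclasses import dataclass

@dataclass
class EnvConfig:
    observation: str = "vector"      # "vector" | "image"
    actions: str = "discrete"        # "discrete" | "continuous"
    tasks: tuple = ("navigate",)     # one or more tasks for multi-task RL
    horizon: int = 1000

def make_env(cfg: EnvConfig):
    """Validate the config and construct the matching environment."""
    if cfg.observation not in {"vector", "image"}:
        raise ValueError(f"unknown observation type: {cfg.observation}")
    if cfg.actions not in {"discrete", "continuous"}:
        raise ValueError(f"unknown action space: {cfg.actions}")
    # ... build and return the environment described by cfg ...
    return cfg  # placeholder for the constructed environment

env = make_env(EnvConfig(actions="continuous", tasks=("navigate", "gather")))
```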