AITopics | Zhu, Zhigang

Collaborating Authors

Zhu, Zhigang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers

Qi, Jianing, Tang, Hao, Zhu, Zhigang

arXiv.org Artificial IntelligenceOct-10-2024

Recent advancements in test time compute, particularly through the use of verifier models, have significantly enhanced the reasoning capabilities of Large Language Models (LLMs). This generator-verifier approach closely resembles the actor-critic framework in reinforcement learning (RL). However, current verifier models in LLMs often rely on supervised fine-tuning without temporal difference learning such as Q-learning. This paper introduces VerifierQ, a novel approach that integrates Offline Q-learning into LLM verifier models. We address three key challenges in applying Q-learning to LLMs: (1) handling utterance-level Markov Decision Processes (MDPs), (2) managing large action spaces, and (3) mitigating overestimation bias. VerifierQ introduces a modified Bellman update for bounded Q-values, incorporates Implicit Q-learning (IQL) for efficient action space management, and integrates a novel Conservative Q-learning (CQL) formulation for balanced Q-value estimation. Our method enables parallel Q-value computation and improving training efficiency. While recent work has explored RL techniques like MCTS for generators, VerifierQ is among the first to investigate the verifier (critic) aspect in LLMs through Q-learning. This integration of RL principles into verifier models complements existing advancements in generator techniques, potentially enabling more robust and adaptive reasoning in LLMs. Experimental results on mathematical reasoning tasks demonstrate VerifierQ's superior performance compared to traditional supervised fine-tuning approaches, with improvements in efficiency, accuracy and robustness. By enhancing the synergy between generation and evaluation capabilities, VerifierQ contributes to the ongoing evolution of AI systems in addressing complex cognitive tasks across various domains.

large language model, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2410.08048

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Robots in the Garden: Artificial Intelligence and Adaptive Landscapes

Zhang, Zihao, Epstein, Susan L., Breen, Casey, Xia, Sophia, Zhu, Zhigang, Volkmann, Christian

arXiv.org Artificial IntelligenceMay-22-2023

This paper introduces ELUA, the Ecological Laboratory for Urban Agriculture, a collaboration among landscape architects, architects and computer scientists who specialize in artificial intelligence, robotics and computer vision. ELUA has two gantry robots, one indoors and the other outside on the rooftop of a 6-story campus building. Each robot can seed, water, weed, and prune in its garden. To support responsive landscape research, ELUA also includes sensor arrays, an AI-powered camera, and an extensive network infrastructure. This project demonstrates a way to integrate artificial intelligence into an evolving urban ecosystem, and encourages landscape architects to develop an adaptive design framework where design becomes a long-term engagement with the environment.

artificial intelligence, elua, machine learning, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.14627/537740028

2305.13019

Country:

North America > United States > New York (0.17)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry:

Education (0.68)
Leisure & Entertainment > Games (0.47)
Information Technology (0.47)
Food & Agriculture > Agriculture (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Improving Dense Crowd Counting Convolutional Neural Networks using Inverse k-Nearest Neighbor Maps and Multiscale Upsampling

Olmschenk, Greg, Tang, Hao, Zhu, Zhigang

arXiv.org Machine LearningFeb-19-2019

Gatherings of thousands to millions of people occur frequently foran enormous variety of events, and automated counting of these high density crowds is used for safety, management, andmeasuring significance of these events. In this work, we show that the regularly accepted labeling scheme of crowd density maps for training deep neural networks is less effective than our alternative inverse k-nearest neighbor (ikNN) maps, even when used directly in existing state-ofthe-art networkstructures. We also provide a new network architecture MUD-ikNN, which uses multi-scale upsampling via transposed convolutions to take full advantage of the provided ikNN labeling. This upsampling combined with the ikNN maps further outperforms the existing state-of-the-art methods. The full label comparison emphasizes the importance ofthe labeling scheme, with the ikNN labeling being particularly effective. We demonstrate the accuracy of our MUD-ikNN network and the ikNN labeling scheme on a variety of datasets.

deep learning, density map, neural network, (19 more...)

arXiv.org Machine Learning

1902.05379

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.61)

Add feedback

Generalizing semi-supervised generative adversarial networks to regression

Olmschenk, Greg, Zhu, Zhigang, Tang, Hao

arXiv.org Machine LearningNov-27-2018

In this work, we generalize semi-supervised generative adversarial networks (GANs) from classification problems to regression problems. In the last few years, the importance of improving the training of neural networks using semi-supervised training has been demonstrated for classification problems. With probabilistic classification being a subset of regression problems, this generalization opens up many new possibilities for the use of semi-supervised GANs as well as presenting an avenue for a deeper understanding of how they function. We first demonstrate the capabilities of semi-supervised regression GANs on a toy dataset which allows for a detailed understanding of how they operate in various circumstances. This toy dataset is used to provide a theoretical basis of the semi-supervised regression GAN. We then apply the semi-supervised regression GANs to the real-world application of age estimation from single images. We perform extensive tests of what accuracies can be achieved with significantly reduced annotated data. Through the combination of the theoretical example and real-world scenario, we demonstrate how semi-supervised GANs can be generalized to regression problems.

deep learning, discriminator, neural network, (18 more...)

arXiv.org Machine Learning

1811.11269

Country: North America > United States > New York (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology (0.46)
Energy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback