AITopics | Zhang, Edwin

Collaborating Authors

Zhang, Edwin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Creating a Cooperative AI Policymaking Platform through Open Source Collaboration

Lewington, Aiden, Vittalam, Alekhya, Singh, Anshumaan, Uppuluri, Anuja, Ashok, Arjun, Athmaram, Ashrith Mandayam, Milt, Austin, Smith, Benjamin, Weinberger, Charlie, Sarin, Chatanya, Bergmeir, Christoph, Chang, Cliff, Patel, Daivik, Li, Daniel, Bell, David, Cao, Defu, Shin, Donghwa, Kang, Edward, Zhang, Edwin, Li, Enhui, Chen, Felix, Smithline, Gabe, Chen, Haipeng, Gasztowtt, Henry, Shin, Hoon, Zhang, Jiayun, Gray, Joshua, Low, Khai Hern, Patel, Kishan, Cooke, Lauren Hannah, Burstein, Marco, Kalapatapu, Maya, Mittal, Mitali, Chen, Raymond, Zhao, Rosie, Majid, Sameen, Potlapalli, Samya, Wang, Shang, Patel, Shrenik, Li, Shuheng, Komaragiri, Siva, Lu, Song, Siangjaeo, Sorawit, Jung, Sunghoo, Zhang, Tianyu, Mao, Valery, Krishnakumar, Vikram, Zhu, Vincent, Kam, Wesley, Li, Xingzhe, Liu, Yumeng

arXiv.org Artificial IntelligenceDec-9-2024

Advances in artificial intelligence (AI) present significant risks and opportunities, requiring improved governance to mitigate societal harms and promote equitable benefits. Current incentive structures and regulatory delays may hinder responsible AI development and deployment, particularly in light of the transformative potential of large language models (LLMs). To address these challenges, we propose developing the following three contributions: (1) a large multimodal text and economic-timeseries foundation model that integrates economic and natural language policy data for enhanced forecasting and decision-making, (2) algorithmic mechanisms for eliciting diverse and representative perspectives, enabling the creation of data-driven public policy recommendations, and (3) an AI-driven web platform for supporting transparent, inclusive, and data-driven policymaking.

forecasting, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2412.06936

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

Proof Flow: Preliminary Study on Generative Flow Network Language Model Tuning for Formal Reasoning

Ho, Matthew, Zhu, Vincent, Chen, Xiaoyin, Jain, Moksh, Malkin, Nikolay, Zhang, Edwin

arXiv.org Artificial IntelligenceOct-17-2024

Reasoning is a fundamental substrate for solving novel and complex problems. Deliberate efforts in learning and developing frameworks around System 2 reasoning have made great strides, yet problems of sufficient complexity remain largely out of reach for open models. To address this gap, we examine the potential of Generative Flow Networks [GFlowNets; Bengio et al., 2021, Hu et al., 2024] as a fine-tuning method for LLMs to unlock advanced reasoning capabilities. In this paper, we present a proof of concept in the domain of formal reasoning, specifically in the Neural Theorem Proving (NTP) setting, where proofs specified in a formal language such as Lean can be deterministically and objectively verified. Unlike classical reward-maximization reinforcement learning, which frequently over-exploits high-reward actions and fails to effectively explore the state space, GFlowNets have emerged as a promising approach for sampling compositional objects, improving generalization, and enabling models to maintain diverse hypotheses. Our early results demonstrate GFlowNet fine-tuning's potential for enhancing model performance in a search setting, which is especially relevant given the paradigm shift towards inference time compute scaling and "thinking slowly."

large language model, machine learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2410.13224

Country: North America > United States > California (0.68)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Add feedback

Large Legislative Models: Towards Efficient AI Policymaking in Economic Simulations

Gasztowtt, Henry, Smith, Benjamin, Zhu, Vincent, Bai, Qinxun, Zhang, Edwin

arXiv.org Artificial IntelligenceOct-10-2024

The improvement of economic policymaking presents an opportunity for broad societal benefit, a notion that has inspired research towards AI-driven policymaking tools. AI policymaking holds the potential to surpass human performance through the ability to process data quickly at scale. However, existing RL-based methods exhibit sample inefficiency, and are further limited by an inability to flexibly incorporate nuanced information into their decision-making processes. Thus, we propose a novel method in which we instead utilize pre-trained Large Language Models (LLMs), as sample-efficient policymakers in socially complex multi-agent reinforcement learning (MARL) scenarios. We demonstrate significant efficiency gains, outperforming existing methods across three environments. Our code is available at https://github.com/hegasz/large-legislative-models.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2410.08345

Genre: Research Report (1.00)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)
Government (1.00)
Banking & Finance > Economy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Global Human-guided Counterfactual Explanations for Molecular Properties via Reinforcement Learning

Wang, Danqing, Antoniades, Antonis, Luong, Kha-Dinh, Zhang, Edwin, Kosan, Mert, Li, Jiachen, Singh, Ambuj, Wang, William Yang, Li, Lei

arXiv.org Artificial IntelligenceJun-19-2024

Counterfactual explanations of Graph Neural Networks (GNNs) offer a powerful way to understand data that can naturally be represented by a graph structure. Furthermore, in many domains, it is highly desirable to derive data-driven global explanations or rules that can better explain the high-level properties of the models and data in question. However, evaluating global counterfactual explanations is hard in real-world datasets due to a lack of human-annotated ground truth, which limits their use in areas like molecular sciences. Additionally, the increasing scale of these datasets provides a challenge for random search-based methods. In this paper, we develop a novel global explanation model RLHEX for molecular property prediction. It aligns the counterfactual explanations with human-defined principles, making the explanations more interpretable and easy for experts to evaluate. RLHEX includes a VAE-based graph generator to generate global explanations and an adapter to adjust the latent representation space to human-defined principles. Optimized by Proximal Policy Optimization (PPO), the global explanations produced by RLHEX cover 4.12% more input graphs and reduce the distance between the counterfactual explanation set and the input set by 0.47% on average across three molecular datasets. RLHEX provides a flexible framework to incorporate different human-designed principles into the counterfactual explanation generation process, aligning these explanations with domain expertise. The code and data are released at https://github.com/dqwang122/RLHEX.

explanation, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.13869

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
Health & Medicine > Therapeutic Area > Immunology (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

Behari, Nikhil, Zhang, Edwin, Zhao, Yunfan, Taneja, Aparna, Nagaraj, Dheeraj, Tambe, Milind

arXiv.org Artificial IntelligenceMay-26-2024

Restless multi-armed bandits (RMAB) have demonstrated success in optimizing resource allocation for large beneficiary populations in public health settings. Unfortunately, RMAB models lack flexibility to adapt to evolving public health policy priorities. Concurrently, Large Language Models (LLMs) have emerged as adept automated planners across domains of robotic control and navigation. In this paper, we propose a Decision Language Model (DLM) for RMABs, enabling dynamic fine-tuning of RMAB policies in public health settings using human-language commands. We propose using LLMs as automated planners to (1) interpret human policy preference prompts, (2) propose reward functions as code for a multi-agent RMAB environment, and (3) iterate on the generated reward functions using feedback from grounded RMAB simulations. We illustrate the application of DLM in collaboration with ARMMAN, an India-based non-profit promoting preventative care for pregnant mothers, that currently relies on RMAB policies to optimally allocate health worker calls to low-resource populations. We conduct a technology demonstration in simulation using the Gemini Pro model [1], showing DLM can dynamically shape policy outcomes using only human prompts as input.

large language model, machine learning, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

2402.14807

Country: Asia > India (0.24)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.67)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.66)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.70)

Add feedback

Social Environment Design

Zhang, Edwin, Zhao, Sadie, Wang, Tonghan, Hossain, Safwan, Gasztowtt, Henry, Zheng, Stephan, Parkes, David C., Tambe, Milind, Chen, Yiling

arXiv.org Machine LearningFeb-21-2024

Artificial Intelligence (AI) holds promise as a technology that can be used to improve government and economic policy-making. This paper proposes a new research agenda towards this end by introducing Social Environment Design, a general framework for the use of AI for automated policy-making that connects with the Reinforcement Learning, EconCS, and Computational Social Choice communities. The framework seeks to capture general economic environments, includes voting on policy objectives, and gives a direction for the systematic analysis of government and economic policy through AI simulation. We highlight key open problems for future research in AI-based policy-making. By solving these challenges, we hope to achieve various social welfare objectives, thereby promoting more ethical and responsible decision making.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

2402.1409

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Industry:

Government (1.00)
Banking & Finance > Economy (0.69)
Leisure & Entertainment > Games (0.68)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Information Technology > Artificial Intelligence > Cognitive Science (0.89)
(2 more...)

Add feedback

Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

Zhao, Yunfan, Behari, Nikhil, Hughes, Edward, Zhang, Edwin, Nagaraj, Dheeraj, Tuyls, Karl, Taneja, Aparna, Tambe, Milind

arXiv.org Artificial IntelligenceJan-29-2024

Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective. Prior RMAB research suffers from several limitations, e.g., it fails to adequately address continuous states, and requires retraining from scratch when arms opt-in and opt-out over time, a common challenge in many real world applications. We address these limitations by developing a neural network-based pre-trained model (PreFeRMAB) that has general zero-shot ability on a wide range of previously unseen RMABs, and which can be fine-tuned on specific instances in a more sample-efficient way than retraining from scratch. Our model also accommodates general multi-action settings and discrete or continuous state spaces. To enable fast generalization, we learn a novel single policy network model that utilizes feature information and employs a training procedure in which arms opt-in and out over time. We derive a new update rule for a crucial $\lambda$-network with theoretical convergence guarantees and empirically demonstrate the advantages of our approach on several challenging, real-world inspired problems.

machine learning, prefermab, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2310.14526

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Public Health (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

Add feedback

Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping

Cooke, Lauren H., Klyne, Harvey, Zhang, Edwin, Laidlaw, Cassidy, Tambe, Milind, Doshi-Velez, Finale

arXiv.org Machine LearningDec-18-2023

Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. This work serves as a proof-of-concept and we hope will inspire future developments towards computationally efficient IRL.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Machine Learning

2312.09983

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

Li, Jiachen, Zhang, Edwin, Yin, Ming, Bai, Qinxun, Wang, Yu-Xiang, Wang, William Yang

arXiv.org Artificial IntelligenceJul-22-2023

Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while constrained by the behavior policy to avoid a significant distributional shift. In this paper, we propose our closed-form policy improvement operators. We make a novel observation that the behavior constraint naturally motivates the use of first-order Taylor approximation, leading to a linear approximation of the policy objective. Additionally, as practical datasets are usually collected by heterogeneous policies, we model the behavior policies as a Gaussian Mixture and overcome the induced optimization difficulties by leveraging the LogSumExp's lower bound and Jensen's Inequality, giving rise to a closed-form policy improvement operator. We instantiate offline RL algorithms with our novel policy improvement operators and empirically demonstrate their effectiveness over state-of-the-art algorithms on the standard D4RL benchmark. Our code is available at https://cfpi-icml23.github.io/.

artificial intelligence, machine learning, offline reinforcement learning, (11 more...)

arXiv.org Artificial Intelligence

2211.15956

Country: North America > United States > California > Santa Barbara County > Santa Barbara (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks

Zhang, Edwin, Lu, Yujie, Wang, William, Zhang, Amy

arXiv.org Artificial IntelligenceApr-10-2023

Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and multiple and new tasks. Recent advances with architectures have allowed for improved scaling along one or two of these dimensions, but are still prohibitive computationally. In this paper, we propose to address all three axes by leveraging Language to Control Diffusion models as a hierarchical planner conditioned on language (LCD). We effectively and efficiently scale diffusion models for planning in extended temporal, state, and task dimensions to tackle long horizon control problems conditioned on natural language instructions. We compare LCD with other state-of-the-art models on the CALVIN language robotics benchmark and find that LCD outperforms other SOTA methods in multi task success rates while dramatically improving computational efficiency with a single task success rate (SR) of 88.7% against the previous best of 82.6%. We show that LCD can successfully leverage the unique strength of diffusion models to produce coherent long range plans while addressing their weakness at generating low-level details and control. We release our code and models at https://github.com/ezhang7423/language-control-diffusion.

diffusion model, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2210.15629

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback