Xu, Yaosheng
NeuralMOVES: A lightweight and microscopic vehicle emission estimation model based on reverse engineering and surrogate learning
Ramirez-Sanchez, Edgar, Tang, Catherine, Xu, Yaosheng, Renganathan, Nrithya, Jayawardana, Vindula, He, Zhengbing, Wu, Cathy
The transportation sector's significant contribution to greenhouse gas emissions makes it a critical sector for climate change mitigation, as reducing emissions from transportation is essential for achieving global climate goals. The sector's transformation through electrification, automation, and intelligent infrastructure offers promising avenues for substantial emissions reductions (Sciarretta et al., 2020; International Energy Agency, 2023; McKinsey Center for Future Mobility, 2023). However, the success of these innovations depends critically on the availability of suitable and accurate emission estimation models to guide the design and deployment of new technologies. The Motor Vehicle Emission Simulator (MOVES) (U.S. Environmental Protection Agency, 2022), one of the most well-established emission estimation models, serves as the official, state-of-the-art emission estimation model in the U.S., provided, enforced, and maintained by the U.S. Environmental Protection Agency (EPA). Despite its technical certification, MOVES' processing and software are tailored to two specific governmental uses: State Implementation Plans and Conformity Analyses (U.S. Environmental Protection Agency, 2021), which help states achieve and maintain air quality standards. Its use beyond trained practitioners and these specific analyses faces two main limitations. First, a steep learning curve, heavy computational demands, and complex inputs make MOVES difficult for researchers and practitioners to use; in particular, it has rigid input requirements, including a combination of toggle-based settings within its GUI and structured input files in specific formats. Second, MOVES is tailored to macroscopic analysis and is unsuitable for microscopic applications, such as control and optimization, which commonly require second-by-second emission calculations for individual actions and vehicles.
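To make the microscopic use case concrete, the following is a hypothetical sketch of the kind of per-second, per-vehicle query that control and optimization applications need and that a lightweight surrogate such as NeuralMOVES is meant to support. The feature set (speed, acceleration, road grade), the class name, and the tiny network are illustrative assumptions, not the actual NeuralMOVES model or API.

```python
import numpy as np

# Hypothetical surrogate that maps one vehicle's instantaneous state to an
# instantaneous CO2 estimate. Weights would be fit offline to reproduce
# MOVES outputs; here they are random placeholders for illustration only.
class SurrogateEmissionModel:
    def __init__(self, w1, b1, w2, b2):
        self.w1, self.b1, self.w2, self.b2 = w1, b1, w2, b2

    def co2_grams_per_second(self, speed_mps, accel_mps2, road_grade):
        """Second-by-second CO2 estimate for a single vehicle and action."""
        x = np.array([speed_mps, accel_mps2, road_grade])
        h = np.tanh(self.w1 @ x + self.b1)          # small hidden layer
        return float(self.w2 @ h + self.b2)

# A microscopic simulator or controller would call the model at every timestep:
rng = np.random.default_rng(0)
model = SurrogateEmissionModel(rng.normal(size=(8, 3)), rng.normal(size=8),
                               rng.normal(size=8), 0.0)
trajectory = [(12.0, 0.3, 0.00), (12.3, 0.1, 0.00), (12.4, -0.2, 0.01)]
total_co2 = sum(model.co2_grams_per_second(*step) for step in trajectory)
```

This per-timestep interface contrasts with MOVES's file-based, macroscopic workflow, which is the gap the abstract describes.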
Target Entropy Annealing for Discrete Soft Actor-Critic
Xu, Yaosheng, Hu, Dailin, Liang, Litian, McAleer, Stephen, Abbeel, Pieter, Fox, Roy
Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in continuous action space settings. It uses the maximum entropy framework for efficiency and stability, and applies a heuristic temperature Lagrange term to tune the temperature $\alpha$, which determines how "soft" the policy should be. Counter-intuitively, empirical evidence shows that SAC does not perform well in discrete domains. In this paper, we investigate possible explanations for this phenomenon and propose Target Entropy Scheduled SAC (TES-SAC), an annealing method for the target entropy parameter of SAC. The target entropy is a constant in the temperature Lagrange term and represents the target policy entropy in discrete SAC. We compare our method on Atari 2600 games against SAC with different constant target entropies, and analyze how our scheduling affects SAC.
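The mechanism the abstract refers to is the temperature Lagrange update in discrete SAC, where $\alpha$ is adjusted so that the policy entropy tracks a target value. Below is a minimal sketch of that update with an annealed target entropy, in the spirit of TES-SAC; the linear decay schedule, its start/end fractions of the maximum entropy $\log|\mathcal{A}|$, and the learning rate are assumptions for illustration, not the paper's exact scheme.

```python
import torch

# Learn log(alpha) so the temperature stays positive.
log_alpha = torch.zeros(1, requires_grad=True)
alpha_opt = torch.optim.Adam([log_alpha], lr=3e-4)

def target_entropy(step, total_steps, n_actions, start_frac=0.98, end_frac=0.30):
    """Illustrative schedule: anneal the target entropy from a high fraction of
    the maximum entropy log(|A|) down to a lower fraction over training."""
    frac = start_frac + (end_frac - start_frac) * min(step / total_steps, 1.0)
    return frac * torch.log(torch.tensor(float(n_actions)))

def update_alpha(action_probs, step, total_steps):
    """action_probs: (batch, n_actions) categorical policy output pi(.|s)."""
    n_actions = action_probs.shape[-1]
    log_probs = torch.log(action_probs + 1e-8)
    policy_entropy = -(action_probs * log_probs).sum(dim=-1)   # H(pi(.|s))
    h_target = target_entropy(step, total_steps, n_actions)

    # Minimizing alpha * (H - H_target) raises alpha when entropy is below
    # the target (encouraging exploration) and lowers it when above.
    alpha_loss = (log_alpha.exp() * (policy_entropy - h_target).detach()).mean()
    alpha_opt.zero_grad()
    alpha_loss.backward()
    alpha_opt.step()
    return log_alpha.exp().item()
```

A constant-target-entropy baseline corresponds to fixing `h_target` throughout training, which is the comparison the abstract describes.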
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates
Liang, Litian, Xu, Yaosheng, McAleer, Stephen, Hu, Dailin, Ihler, Alexander, Abbeel, Pieter, Fox, Roy
Temporal-Difference (TD) learning methods, such as Q-Learning, have proven effective at learning a policy to perform control tasks. One issue with methods like Q-Learning is that the value update introduces bias when predicting the TD target of an unfamiliar state. Estimation noise becomes a bias after the max operator in the policy improvement step and carries over to the value estimates of other states, causing Q-Learning to overestimate the Q value. Algorithms like Soft Q-Learning (SQL) introduce the notion of a soft-greedy policy, which reduces this estimation bias via soft updates in the early stages of training. However, the inverse temperature $\beta$ that controls the softness of an update is usually set by a hand-designed heuristic, which can fail to capture the uncertainty in the target estimate. Under the belief that $\beta$ is closely related to the (state-dependent) model uncertainty, Entropy Regularized Q-Learning (EQL) further introduces a principled scheduling of $\beta$ by maintaining a collection of model parameters that characterizes the model uncertainty. In this paper, we present Unbiased Soft Q-Learning (UQL), which extends EQL from two-action, finite-state spaces to multi-action, infinite-state-space Markov Decision Processes. We also provide a principled, numerical scheduling of $\beta$ during the optimization process, extended from SQL and based on model uncertainty. We show theoretical guarantees and demonstrate the effectiveness of this update method in experiments on several discrete control environments.
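The core idea the abstract describes is replacing the hard max in the TD target with a soft-greedy backup whose inverse temperature $\beta$ reflects model uncertainty. Below is an illustrative sketch of such a target, with $\beta$ set from the disagreement of a Q-value ensemble; the specific rule $\beta = 1/\text{std}$ and the use of an ensemble as the uncertainty proxy are assumptions for illustration, not the paper's exact scheduling.

```python
import numpy as np

def soft_td_target(reward, next_q_ensemble, gamma=0.99, eps=1e-6):
    """
    reward:          scalar reward r.
    next_q_ensemble: (n_models, n_actions) array of Q(s', .) estimates whose
                     spread stands in for (state-dependent) model uncertainty.
    """
    q_mean = next_q_ensemble.mean(axis=0)        # (n_actions,)
    q_std = next_q_ensemble.std(axis=0).mean()   # scalar disagreement

    # Low uncertainty -> large beta -> backup approaches the hard max;
    # high uncertainty -> small beta -> backup approaches a uniform average,
    # which limits how much estimation noise turns into overestimation bias.
    beta = 1.0 / (q_std + eps)

    # Soft-greedy (Boltzmann) policy over next actions and its expected value.
    logits = beta * (q_mean - q_mean.max())      # shift for numerical stability
    pi = np.exp(logits) / np.exp(logits).sum()
    soft_value = float(pi @ q_mean)

    return reward + gamma * soft_value

# Example: a confident ensemble yields a target close to the max backup.
q_next = np.array([[1.0, 2.0, 0.5],
                   [1.1, 2.1, 0.4]])
print(soft_td_target(reward=0.0, next_q_ensemble=q_next))
```

Early in training, when the ensemble disagrees strongly, the backup stays soft; as estimates converge, it approaches the standard greedy update.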