AITopics | Deng, Zhihong

Collaborating Authors

Deng, Zhihong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning

Deng, Zhihong, Jiang, Jing, Long, Guodong, Zhang, Chengqi

arXiv.org Artificial IntelligenceApr-28-2024

In sequential decision-making problems involving sensitive attributes like race and gender, reinforcement learning (RL) agents must carefully consider long-term fairness while maximizing returns. Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear. In this paper, we address this gap in the literature by investigating the sources of inequality through a causal lens. We first analyse the causal relationships governing the data generation process and decompose the effect of sensitive attributes on long-term well-being into distinct components. We then introduce a novel notion called dynamics fairness, which explicitly captures the inequality stemming from environmental dynamics, distinguishing it from those induced by decision-making or inherited from the past. This notion requires evaluating the expected changes in the next state and the reward induced by changing the value of the sensitive attribute while holding everything else constant. To quantitatively evaluate this counterfactual concept, we derive identification formulas that allow us to obtain reliable estimations from data. Extensive experiments demonstrate the effectiveness of the proposed techniques in explaining, detecting, and reducing inequality in reinforcement learning. We publicly release code at https://github.com/familyld/InsightFair.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2404.10942

Country:

North America > United States (0.15)
Europe (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.67)
Law (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Causal Reinforcement Learning: A Survey

Deng, Zhihong, Jiang, Jing, Long, Guodong, Zhang, Chengqi

arXiv.org Artificial IntelligenceNov-20-2023

Reinforcement learning is an essential paradigm for solving sequential decision problems under uncertainty. Despite many remarkable achievements in recent decades, applying reinforcement learning methods in the real world remains challenging. One of the main obstacles is that reinforcement learning agents lack a fundamental understanding of the world and must therefore learn from scratch through numerous trial-and-error interactions. They may also face challenges in providing explanations for their decisions and generalizing the acquired knowledge. Causality, however, offers a notable advantage as it can formalize knowledge in a systematic manner and leverage invariance for effective knowledge transfer. This has led to the emergence of causal reinforcement learning, a subfield of reinforcement learning that seeks to enhance existing algorithms by incorporating causal relationships into the learning process. In this survey, we comprehensively review the literature on causal reinforcement learning. We first introduce the basic concepts of causality and reinforcement learning, and then explain how causality can address core challenges in non-causal reinforcement learning. We categorize and systematically review existing causal reinforcement learning approaches based on their target problems and methodologies. Finally, we outline open issues and future directions in this emerging field.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2307.01452

Country:

Europe > United Kingdom > England (0.27)
North America > United States > Massachusetts > Middlesex County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Transportation (0.68)
Education > Educational Setting (0.67)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Prefix-Tuning Based Unsupervised Text Style Transfer

Mai, Huiyu, Jiang, Wenhao, Deng, Zhihong

arXiv.org Artificial IntelligenceOct-23-2023

Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content without using any parallel data. In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer. We construct three different kinds of prefixes, i.e., \textit{shared prefix, style prefix}, and \textit{content prefix}, to encode task-specific information, target style, and the content information of the input sentence, respectively. Compared to embeddings used by previous works, the proposed prefixes can provide richer information for the model. Furthermore, we adopt a recursive way of using language models in the process of style transfer. This strategy provides a more effective way for the interactions between the input sentence and GPT-2, helps the model construct more informative prefixes, and thus, helps improve the performance. Evaluations on the well-known datasets show that our method outperforms the state-of-the-art baselines. Results, analysis of ablation studies, and subjective evaluations from humans are also provided for a deeper understanding of the proposed method.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2310.14599

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Undersampling and Cumulative Class Re-decision Methods to Improve Detection of Agitation in People with Dementia

Meng, Zhidong, Iaboni, Andrea, Ye, Bing, Newman, Kristine, Mihailidis, Alex, Deng, Zhihong, Khan, Shehroz S.

arXiv.org Artificial IntelligenceAug-15-2023

Agitation is one of the most prevalent symptoms in people with dementia (PwD) that can place themselves and the caregiver's safety at risk. Developing objective agitation detection approaches is important to support health and safety of PwD living in a residential setting. In a previous study, we collected multimodal wearable sensor data from 17 participants for 600 days and developed machine learning models for detecting agitation in one-minute windows. However, there are significant limitations in the dataset, such as imbalance problem and potential imprecise labelsas the occurrence of agitation is much rarer in comparison to the normal behaviours. In this paper, we first implemented different undersampling methods to eliminate the imbalance problem, and came to the conclusion that only 20% of normal behaviour data were adequate to train a competitive agitation detection model. Then, we designed a weighted undersampling method to evaluate the manual labeling mechanism given the ambiguous time interval assumption. After that, the postprocessing method of cumulative class re-decision (CCR) was proposed based on the historical sequential information and continuity characteristic of agitation, improving the decision-making performance for the potential application of agitation detection system. The results showed that a combination of undersampling and CCR improved F1-score and other metrics to varying degrees with less training time and data.

agitation, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2302.03224

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Therapeutic Area > Neurology > Dementia (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Retrieved Sequence Augmentation for Protein Representation Learning

Ma, Chang, Zhao, Haiteng, Zheng, Lin, Xin, Jiayi, Li, Qintong, Wu, Lijun, Deng, Zhihong, Lu, Yang, Liu, Qi, Kong, Lingpeng

arXiv.org Artificial IntelligenceFeb-24-2023

Protein language models have excelled in a variety of tasks, ranging from structure prediction to protein engineering. However, proteins are highly diverse in functions and structures, and current state-of-the-art models including the latest version of AlphaFold rely on Multiple Sequence Alignments (MSA) to feed in the evolutionary knowledge. Despite their success, heavy computational overheads, as well as the de novo and orphan proteins remain great challenges in protein representation learning. In this work, we show that MSAaugmented models inherently belong to retrievalaugmented methods. Motivated by this finding, we introduce Retrieved Sequence Augmentation(RSA) for protein representation learning without additional alignment or pre-processing. RSA links query protein sequences to a set of sequences with similar structures or properties in the database and combines these sequences for downstream prediction. We show that protein language models benefit from the retrieval enhancement on both structure prediction and property prediction tasks, with a 5% improvement on MSA Transformer on average while being 373 times faster. In addition, we show that our model can transfer to new protein domains better and outperforms MSA Transformer on de novo protein prediction. Our study fills a much-encountered gap in protein prediction and brings us a step closer to demystifying the domain knowledge needed to understand protein sequences. Code is available on https://github.com/HKUNLP/RSA.

bioinformatics, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2302.12563

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning

Deng, Zhihong, Fu, Zuyue, Wang, Lingxiao, Yang, Zhuoran, Bai, Chenjia, Wang, Zhaoran, Jiang, Jing

arXiv.org Artificial IntelligenceOct-24-2021

Offline reinforcement learning (RL) aims to learn the optimal policy from a pre-collected dataset without online interactions. Most of the existing studies focus on distributional shift caused by out-of-distribution actions. However, even in-distribution actions can raise serious problems. Since the dataset only contains limited information about the underlying model, offline RL is vulnerable to spurious correlations, i.e., the agent tends to prefer actions that by chance lead to high returns, resulting in a highly suboptimal policy. To address such a challenge, we propose a practical and theoretically guaranteed algorithm SCORE that reduces spurious correlations by combing an uncertainty penalty into policy evaluation. We show that this is consistent with the pessimism principle studied in theory, and the proposed algorithm converges to the optimal policy with a sublinear rate under mild assumptions. By conducting extensive experiments on existing benchmarks, we show that SCORE not only benefits from a solid theory but also obtains strong empirical results on a variety of tasks.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2110.12468

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback