Goto

Collaborating Authors

 Xiao, Ting


Radiology Report Generation via Multi-objective Preference Optimization

arXiv.org Artificial Intelligence

Automatic Radiology Report Generation (RRG) is an important topic for alleviating the substantial workload of radiologists. Existing RRG approaches rely on supervised regression based on different architectures or additional knowledge injection,while the generated report may not align optimally with radiologists' preferences. Especially, since the preferences of radiologists are inherently heterogeneous and multidimensional, e.g., some may prioritize report fluency, while others emphasize clinical accuracy. To address this problem,we propose a new RRG method via Multi-objective Preference Optimization (MPO) to align the pre-trained RRG model with multiple human preferences, which can be formulated by multi-dimensional reward functions and optimized by multi-objective reinforcement learning (RL). Specifically, we use a preference vector to represent the weight of preferences and use it as a condition for the RRG model. Then, a linearly weighed reward is obtained via a dot product between the preference vector and multi-dimensional reward. Next,the RRG model is optimized to align with the preference vector by optimizing such a reward via RL. In the training stage,we randomly sample diverse preference vectors from the preference space and align the model by optimizing the weighted multi-objective rewards, which leads to an optimal policy on the entire preference space. When inference,our model can generate reports aligned with specific preferences without further fine-tuning. Extensive experiments on two public datasets show the proposed method can generate reports that cater to different preferences in a single model and achieve state-of-the-art performance.


Constrained Ensemble Exploration for Unsupervised Skill Discovery

arXiv.org Artificial Intelligence

Unsupervised Reinforcement Learning (RL) provides a promising paradigm for learning useful behaviors via reward-free per-training. Existing methods for unsupervised RL mainly conduct empowerment-driven skill discovery or entropy-based exploration. However, empowerment often leads to static skills, and pure exploration only maximizes the state coverage rather than learning useful behaviors. In this paper, we propose a novel unsupervised RL framework via an ensemble of skills, where each skill performs partition exploration based on the state prototypes. Thus, each skill can explore the clustered area locally, and the ensemble skills maximize the overall state coverage. We adopt state-distribution constraints for the skill occupancy and the desired cluster for learning distinguishable skills. Theoretical analysis is provided for the state entropy and the resulting skill distributions. Based on extensive experiments on several challenging tasks, we find our method learns well-explored ensemble skills and achieves superior performance in various downstream tasks compared to previous methods.


Importance Weighted Adversarial Discriminative Transfer for Anomaly Detection

arXiv.org Artificial Intelligence

Previous transfer methods for anomaly detection generally assume the availability of labeled data in source or target domains. However, such an assumption is not valid in most real applications where large-scale labeled data are too expensive. Therefore, this paper proposes an importance weighted adversarial autoencoder-based method to transfer anomaly detection knowledge in an unsupervised manner, particularly for a rarely studied scenario where a target domain has no labeled normal/abnormal data while only normal data from a related source domain exist. Specifically, the method learns to align the distributions of normal data in both source and target domains, but leave the distribution of abnormal data in the target domain unchanged. In this way, an obvious gap can be produced between the distributions of normal and abnormal data in the target domain, therefore enabling the anomaly detection in the domain. Extensive experiments on multiple synthetic datasets and the UCSD benchmark demonstrate the effectiveness of our approach.