Cao, Feng
CNMBert: A Model for Hanyu Pinyin Abbreviation to Character Conversion Task
Feng, Zishuo, Cao, Feng
The task of converting Hanyu Pinyin abbreviations to Chinese characters is a significant branch within the domain of Chinese Spelling Correction (CSC). It plays an important role in many downstream applications such as named entity recognition and sentiment analysis. The task is one of text-length alignment and appears easy to solve; however, because pinyin abbreviations carry so little information, accurate conversion is challenging. In this paper, we treat this as a Fill-Mask task and propose CNMBert, which stands for zh-CN Pinyin Multi-mask Bert Model, as a solution to this issue. By introducing a multi-mask strategy and Mixture-of-Experts (MoE) layers, CNMBert outperforms fine-tuned GPT models and ChatGPT-4o, achieving an MRR score of 61.53 and an accuracy of 51.86 on a 10,373-sample test dataset.
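As a rough illustration of the multi-mask idea, the sketch below replaces each letter of a pinyin abbreviation with a [MASK] token and lets a masked language model fill the positions in context. It is a minimal sketch assuming a HuggingFace fill-mask pipeline, with bert-base-chinese as a stand-in for CNMBert; the MoE layers, any pinyin-constrained decoding, and the function names are not from the paper.

```python
# Minimal sketch of the multi-mask strategy: one [MASK] per abbreviation
# letter, filled greedily by an off-the-shelf Chinese MLM. This is a
# stand-in for CNMBert; its MoE layers are not reproduced here.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-chinese")

def expand_abbreviation(template: str, abbreviation: str) -> str:
    """template contains one '{}' where the pinyin abbreviation occurs."""
    masked = template.format("[MASK]" * len(abbreviation))
    preds = fill(masked)
    if len(abbreviation) == 1:
        preds = [preds]  # the pipeline returns a flat list for one mask
    # Greedy decode: keep the top-scoring character at each masked slot.
    return "".join(p[0]["token_str"] for p in preds)
```

A real converter would additionally restrict each position to characters whose pinyin initial matches the corresponding abbreviation letter and score candidates jointly; the sketch shows only the masking mechanics.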
Detection and Prediction of Adverse and Anomalous Events in Medical Robots
Liang, Kai (Case Western Reserve University) | Cao, Feng (Case Western Reserve University) | Bai, Zhuofu (Case Western Reserve University) | Renfrew, Mark (Case Western Reserve University) | Cavusoglu, Murat Cenk (Case Western Reserve University) | Podgurski, Andy (Case Western Reserve University) | Ray, Soumya (Case Western Reserve University)
Adverse and anomalous (A&A) events are a serious concern in medical robots. We describe a system that can rapidly detect such events and predict their occurrence. As part of this system, we describe the simulation, data collection, and user interface tools we built for a robot for small animal biopsies. The data we collect consists of both the hardware state of the robot and variables in the software controller. We use this data to train dynamic Bayesian network models of the joint hardware-software state-space dynamics of the robot. Our empirical evaluation shows that (i) our models can accurately capture the normal behavior of the robot, (ii) they can rapidly detect anomalous behavior once it starts, (iii) they can accurately predict a future A&A event within a time window before it starts, and (iv) using additional software variables beyond the hardware state of the robot is important for detecting and predicting certain kinds of events.
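To make the detection step concrete, here is a simplified stand-in for the monitoring idea: fit a transition model of the joint state on normal traces, then flag time steps whose transitions are unlikely under that model. It assumes linear-Gaussian dynamics rather than the paper's dynamic Bayesian networks, and all names and the threshold are illustrative.

```python
# Simplified anomaly monitor, assuming linear-Gaussian dynamics
# x_{t+1} ~ N(A x_t, Sigma) as a stand-in for the paper's DBNs over
# the joint hardware-software state.
import numpy as np

def fit_dynamics(traces):
    """traces: list of (T_i, d) arrays of normal-behavior state vectors."""
    X = np.vstack([tr[:-1] for tr in traces])
    Y = np.vstack([tr[1:] for tr in traces])
    A, *_ = np.linalg.lstsq(X, Y, rcond=None)        # least-squares fit
    resid = Y - X @ A
    Sigma = np.cov(resid, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return A, Sigma

def step_loglik(A, Sigma, x_t, x_next):
    """Log-likelihood of one observed transition under the model."""
    diff = x_next - x_t @ A
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (len(diff) * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(Sigma, diff))

def anomalies(A, Sigma, trace, threshold):
    """Yield time steps whose transition likelihood falls below threshold."""
    for t in range(len(trace) - 1):
        if step_loglik(A, Sigma, trace[t], trace[t + 1]) < threshold:
            yield t + 1
```

The threshold can be calibrated from held-out normal traces, e.g. a low percentile of their per-step log-likelihoods.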
SEPIA: A Scalable Game Environment for Artificial Intelligence Teaching and Research
Sosnowski, Scott (Case Western Reserve University) | Ernsberger, Tim (Case Western Reserve University) | Cao, Feng (Case Western Reserve University) | Ray, Soumya (Case Western Reserve University)
We describe a game environment we have developed that we call the Strategy Engine for Programming Intelligent Agents (SEPIA). SEPIA is based on real-time strategy games, but extensively modified to favor the development of artificial agents over human play. Through flexible configuration options, SEPIA is designed to be pedagogically scalable: suitable for use at the undergraduate and graduate levels, and also as a research testbed. We also describe assignments and our experiences with this environment in undergraduate and graduate classes.
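To give a flavor of agent development in this style of environment, the skeleton below shows an observe-act episode loop for a toy agent. The interface is hypothetical Python written for illustration only; it is not SEPIA's actual API, and the state fields and method names are invented.

```python
# Hypothetical agent skeleton (illustrative only; not SEPIA's real API).
# An RTS environment built for artificial agents drives them through an
# observe-act loop rather than a human input loop.
class ResourceGatherer:
    """Toy policy: send each idle worker to the nearest resource node."""

    def initial_step(self, state):
        return self.middle_step(state)

    def middle_step(self, state):
        actions = {}
        for unit in state["my_units"]:
            if unit["idle"]:
                nearest = min(
                    state["resources"],
                    key=lambda r: abs(r["x"] - unit["x"]) + abs(r["y"] - unit["y"]),
                )
                actions[unit["id"]] = ("gather", nearest["id"])
        return actions  # the engine executes one action per unit per step

    def terminal_step(self, state):
        pass  # per-episode bookkeeping (e.g., logging outcomes) would go here
```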
Bayesian Hierarchical Reinforcement Learning
Cao, Feng, Ray, Soumya
We describe an approach to incorporating Bayesian priors into the MAXQ framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach improves convergence over non-Bayesian baselines, given sensible priors, (ii) task hierarchies and Bayesian priors are complementary sources of information, and using both is better than either alone, (iii) taking advantage of the structural decomposition induced by the task hierarchy significantly reduces the computational cost of Bayesian reinforcement learning, and (iv) in this framework, task pseudo-rewards can be learned instead of being manually specified, leading to automatic learning of hierarchically optimal rather than recursively optimal policies.
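As a small concrete piece of this setup, the sketch below maintains a Dirichlet prior over a primitive action's discrete transition model, one of the two kinds of priors the abstract mentions. It is a minimal illustration under assumed tabular dynamics; the MAXQ hierarchy, the pseudo-reward priors, and the mixed model-based/model-free learner are not reproduced, and the class name is invented.

```python
# Minimal sketch: Dirichlet prior over a primitive action's transition
# model, updated conjugately from experience. The MAXQ decomposition and
# pseudo-reward priors from the paper are not shown.
import numpy as np

class DirichletTransitionModel:
    def __init__(self, n_states, n_actions, pseudo_count=1.0):
        # pseudo_count > 0 encodes the Dirichlet prior over next states
        self.counts = np.full((n_states, n_actions, n_states), pseudo_count)

    def update(self, s, a, s_next):
        self.counts[s, a, s_next] += 1.0   # conjugate posterior update

    def next_state_dist(self, s, a):
        # Posterior-mean transition distribution, usable by a planner
        c = self.counts[s, a]
        return c / c.sum()
```

With sensible pseudo-counts, the posterior mean gives a usable primitive model early in learning, which is one way informative priors can speed convergence, as in finding (i).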