Chang, Edward
REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis
Peng, Yu-Shao, Tang, Kai-Fu, Lin, Hsuan-Tien, Chang, Edward
This paper proposes REFUEL, a reinforcement learning method that uses two techniques, {\em reward shaping} and {\em feature rebuilding}, to improve the performance of online symptom checking for disease diagnosis. Reward shaping guides the policy search toward better directions. Feature rebuilding guides the agent to learn correlations between features. Together, they find symptom queries that yield positive responses from a patient with high probability. Experimental results show that the two techniques in REFUEL allow the symptom checker to identify the disease more rapidly and accurately.
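As a rough illustration of the reward-shaping idea, the sketch below uses the standard potential-based form, where the shaped reward is r + gamma * phi(s') - phi(s). The abstract does not specify the potential function; the choice here (a weighted count of symptoms with a positive response) and the names `potential`, `shaped_reward`, and `weight` are assumptions made for illustration only, not the authors' implementation.

```python
def potential(state, weight=1.0):
    """Hypothetical potential: weighted count of symptoms answered 'yes'.

    `state` maps a symptom name to True (present), False (absent);
    symptoms not yet queried are simply missing from the dict.
    """
    return weight * sum(1 for answer in state.values() if answer is True)


def shaped_reward(reward, state, next_state, gamma=0.99, weight=1.0):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(next_state, weight) - potential(state, weight)


# Usage: a query that uncovers a positive symptom earns a shaping bonus,
# while a query answered 'no' earns none (gamma = 1.0 keeps the numbers exact).
before = {"fever": True}
after_yes = {"fever": True, "cough": True}
after_no = {"fever": True, "cough": False}
bonus_yes = shaped_reward(0.0, before, after_yes, gamma=1.0)
bonus_no = shaped_reward(0.0, before, after_no, gamma=1.0)
```

Under this assumed potential, the agent receives an intermediate signal whenever a query draws a positive response, which is one way sparse positive symptoms could be made easier to discover.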
Union of Intersections (UoI) for Interpretable Data Driven Discovery and Prediction
Bouchard, Kristofer, Bujan, Alejandro, Roosta-Khorasani, Farbod, Ubaru, Shashanka, Prabhat, Mr., Snijders, Antoine, Mao, Jian-Hua, Chang, Edward, Mahoney, Michael W., Bhattacharya, Sharmodeep
The increasing size and complexity of scientific data could dramatically enhance discovery and prediction in basic scientific applications such as neuroscience, genetics, and systems biology. Realizing this potential, however, requires novel statistical analysis methods that are both interpretable and predictive. We introduce the Union of Intersections (UoI) method, a flexible, modular, and scalable framework for enhanced model selection and estimation. The method performs model selection and model estimation through intersection and union operations, respectively. We show that UoI can satisfy the bi-criteria of low-variance and nearly unbiased estimation of a small number of interpretable features, while maintaining high-quality prediction accuracy. We perform extensive numerical investigations to evaluate a UoI algorithm ($UoI_{Lasso}$) on synthetic and real data. In doing so, we demonstrate the extraction of interpretable functional networks from human electrophysiology recordings as well as the accurate prediction of phenotypes from genotype-phenotype data with reduced features. We also show (with the $UoI_{L1Logistic}$ and $UoI_{CUR}$ variants of the basic framework) improved prediction parsimony for classification and matrix factorization on several benchmark biomedical data sets. These results suggest that methods based on the UoI framework could improve interpretation and prediction in data-driven discovery across scientific fields.
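The intersection and union operations described above can be sketched on feature supports alone. This is only the set-operation skeleton under stated assumptions: the per-bootstrap supports and the best penalty per estimation bootstrap are taken as given inputs (in practice they would come from sparse fits and held-out error), and the function names and the example numbers are hypothetical rather than taken from the paper.

```python
def intersect_supports(supports_by_penalty):
    """Selection: for each penalty, keep only features chosen in every bootstrap."""
    return {
        penalty: set.intersection(*(set(s) for s in supports))
        for penalty, supports in supports_by_penalty.items()
    }


def union_supports(candidates, best_penalty_per_bootstrap):
    """Estimation (support only): union of the best candidate support per bootstrap."""
    return set().union(*(candidates[p] for p in best_penalty_per_bootstrap))


# Hypothetical supports (feature indices) from three selection bootstraps
# at two penalty values.
supports_by_penalty = {
    0.1: [{0, 1, 2, 5}, {0, 1, 2, 7}, {0, 1, 2, 3}],
    0.5: [{0, 3, 4}, {0, 3}, {0, 3, 6}],
}
candidates = intersect_supports(supports_by_penalty)

# Hypothetical winners: the penalty whose support predicted best
# in each of three estimation bootstraps.
final_support = union_supports(candidates, [0.1, 0.5, 0.1])
```

The intersection discards features that appear in only some resamples, which limits false positives, while the union across estimation bootstraps lets the final model recover features that any well-predicting candidate support retained.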
Tweet Timeline Generation with Determinantal Point Processes
Yao, Jin-ge (Peking University) | Fan, Feifan (Peking University) | Zhao, Wayne Xin (Renmin University of China) | Wan, Xiaojun (Peking University) | Chang, Edward (HTC Research) | Xiao, Jianguo (Peking University)
Tweet timeline generation (TTG) aims to select a small set of representative tweets that form a meaningful timeline and provide enough coverage of a given topical query. This paper presents an approach based on determinantal point processes (DPPs) that jointly models the topical relevance of each selected tweet and the overall diversity of the selection. To better balance relevance and diversity, we introduce two novel strategies, namely spectral rescaling and topical prior. Extensive experiments on the public TREC 2014 dataset demonstrate that our proposed DPP model, together with the two strategies, achieves results competitive with state-of-the-art TTG systems.
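To make the relevance and diversity trade-off concrete, the following is a minimal sketch of a DPP in its common quality-times-similarity form, with kernel entries L[i][j] = q_i * S[i][j] * q_j and a greedy search for a high-determinant subset. It does not implement the paper's spectral rescaling or topical prior; the greedy procedure, the function names, and the toy scores are assumptions made for illustration.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, result = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return result


def build_kernel(quality, similarity):
    """L[i][j] = q_i * S[i][j] * q_j (relevance scales a similarity matrix)."""
    n = len(quality)
    return [[quality[i] * similarity[i][j] * quality[j] for j in range(n)]
            for i in range(n)]


def greedy_select(kernel, k):
    """Greedily add the item that maximizes det of the selected submatrix."""
    chosen = []
    for _ in range(k):
        best, best_det = None, 0.0
        for i in range(len(kernel)):
            if i in chosen:
                continue
            idx = chosen + [i]
            d = det([[kernel[r][c] for c in idx] for r in idx])
            if d > best_det:
                best, best_det = i, d
        if best is None:  # no remaining item adds volume
            break
        chosen.append(best)
    return chosen


# Toy data: tweets 0 and 1 are near-duplicates; tweet 2 is less relevant but distinct.
quality = [1.0, 0.9, 0.8]
similarity = [[1.0, 0.95, 0.1],
              [0.95, 1.0, 0.1],
              [0.1, 0.1, 1.0]]
timeline = greedy_select(build_kernel(quality, similarity), 2)  # → [0, 2]
```

Because the determinant shrinks when selected items are similar, the second pick skips the more relevant near-duplicate (tweet 1) in favor of the distinct tweet 2.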