
Collaborating Authors

 Yan, Jing


Trinity: Syncretizing Multi-/Long-tail/Long-term Interests All in One

arXiv.org Artificial Intelligence

Interest modeling in recommender systems has been a constant topic for improving user experience, and typical interest modeling tasks (e.g., multi-interest, long-tail interest, and long-term interest) have been investigated in many existing works. However, most of them consider only one interest in isolation, neglecting the interrelationships among them. In this paper, we argue that these tasks suffer from a common "interest amnesia" problem, and that a single solution can mitigate them simultaneously. We find that long-term cues can be the cornerstone, since they reveal multi-interest and clarify long-tail interest. Inspired by this observation, we propose "Trinity", a novel and unified framework in the retrieval stage, to solve the interest amnesia problem and improve multiple interest modeling tasks. We construct a real-time clustering system that enables us to project items into enumerable clusters and calculate statistical interest histograms over these clusters. Based on these histograms, Trinity recognizes underdelivered themes and remains stable when facing emerging hot topics. Trinity is well suited to large-scale industry scenarios because of its modest computational overhead. Its derived retrievers have been deployed on the recommender system of Douyin, significantly improving user experience and retention. We believe that this practical experience can be well generalized to other scenarios.
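The core statistical machinery the abstract describes (projecting items into clusters, building interest histograms, and flagging underdelivered themes) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the gap-based "underdelivered" rule, and the toy cluster assignments are all assumptions.

```python
from collections import Counter

def interest_histogram(history, item_cluster):
    # Count how often each cluster appears in a user's interaction history.
    return Counter(item_cluster[i] for i in history if i in item_cluster)

def underdelivered_clusters(long_term, recent, min_gap=3):
    # Hypothetical rule: a cluster the user engaged with heavily long-term
    # but rarely saw recently is a candidate "underdelivered theme".
    return [c for c, n in long_term.items() if n - recent.get(c, 0) >= min_gap]

# Toy cluster assignment (a stand-in for the real-time clustering system).
item_cluster = {"v1": 0, "v2": 0, "v3": 1, "v4": 2}
long_term = interest_histogram(["v1", "v2", "v1", "v3", "v1", "v2"], item_cluster)
recent = interest_histogram(["v3", "v4"], item_cluster)
print(underdelivered_clusters(long_term, recent, min_gap=2))  # → [0]
```

Because the statistics live over a small, enumerable set of clusters rather than individual items, histograms like these are cheap to maintain online, which matches the abstract's claim of modest computational overhead.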


The Blessings of Multiple Treatments and Outcomes in Treatment Effect Estimation

arXiv.org Machine Learning

Assessing causal effects in the presence of unobserved confounding is a challenging problem. Existing studies have leveraged proxy variables or multiple treatments to adjust for the confounding bias. In particular, the latter approach attributes the impact on a single outcome to multiple treatments, allowing latent variables to be estimated for confounding control. Nevertheless, these methods primarily focus on a single outcome, whereas in many real-world scenarios there is greater interest in studying the effects on multiple outcomes. Moreover, these outcomes are often coupled with multiple treatments. Examples include the intensive care unit (ICU), where health providers evaluate the effectiveness of therapies on multiple health indicators. To accommodate these scenarios, we consider a new setting dubbed multiple treatments and multiple outcomes. We then show that parallel studies of the multiple outcomes involved in this setting can assist each other in causal identification, in the sense that we can exploit other treatments and outcomes as proxies for each treatment effect under study. We proceed with a causal discovery method that can effectively identify such proxies for causal estimation. The utility of our method is demonstrated on synthetic data and a sepsis dataset.


Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation

arXiv.org Artificial Intelligence

Recent research has revealed that deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks, leading to failures in real-world applications. In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution of the training data. In particular, we define a word that highly co-occurs with a specific label as a biased word, and an example containing a biased word as a biased example. Our analysis shows that biased examples are easier for models to learn, while at prediction time biased words make a significantly higher contribution to the models' predictions, and models tend to assign predicted labels by over-relying on the spurious correlation between words and labels. To mitigate models' over-reliance on this shortcut (i.e., the spurious correlation), we propose a training strategy, Less-Learn-Shortcut (LLS): our strategy quantifies the biased degree of biased examples and down-weights them accordingly. Experimental results on Question Matching, Natural Language Inference, and Sentiment Analysis tasks show that LLS is a task-agnostic strategy and can improve model performance on adversarial data while maintaining good performance on in-domain data.
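The two steps the abstract names, quantifying a biased degree from word-label co-occurrence and down-weighting accordingly, can be sketched as below. This is an illustrative scheme under stated assumptions: the bias measure (max conditional label frequency per word) and the linear down-weighting rule are hypothetical simplifications, not LLS's exact formulas.

```python
from collections import defaultdict

def word_label_bias(examples):
    # Estimate, for each word, how concentrated its label distribution is:
    # max count of any one label divided by total occurrences of the word.
    counts = defaultdict(lambda: defaultdict(int))
    for words, label in examples:
        for w in set(words):
            counts[w][label] += 1
    return {w: max(c.values()) / sum(c.values()) for w, c in counts.items()}

def example_weight(words, bias, threshold=0.8):
    # Hypothetical down-weighting: examples whose most biased word exceeds
    # the threshold get a linearly reduced training weight.
    degree = max((bias.get(w, 0.0) for w in words), default=0.0)
    if degree < threshold:
        return 1.0
    return 1.0 - (degree - threshold) / (1.0 - threshold)

examples = [(["great", "plot"], "pos"), (["great", "cast"], "pos"),
            (["great", "twist"], "pos"), (["dull", "plot"], "neg"),
            (["dull", "cast"], "neg")]
bias = word_label_bias(examples)
print(bias["great"], bias["plot"])  # → 1.0 0.5
```

With this toy data, "great" always co-occurs with the positive label, so examples containing it would be down-weighted, while examples built only from balanced words like "plot" and "cast" keep full weight.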


SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases

arXiv.org Artificial Intelligence

Recent studies reveal that various biases exist in different NLP tasks, and that over-reliance on biases results in models' poor generalization ability and low adversarial robustness. To mitigate dataset biases, previous works have proposed many debiasing techniques to tackle specific biases; these perform well on their respective adversarial sets but fail to mitigate other biases. In this paper, we propose a new debiasing method, Sparse Mixture-of-Adapters (SMoA), which can mitigate multiple dataset biases effectively and efficiently. Experiments on Natural Language Inference and Paraphrase Identification tasks demonstrate that SMoA outperforms full fine-tuning, adapter-tuning baselines, and prior strong debiasing methods. Further analysis indicates the interpretability of SMoA: each sub-adapter captures a specific pattern from the training data and specializes in handling a specific bias.
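The routing idea behind a sparse mixture of adapters can be illustrated with a stripped-down sketch: only the top-k scoring sub-adapters are applied to a hidden state, and their residual contributions are summed. Everything here is a simplification for illustration: real sub-adapters are bottleneck layers and the router is learned, whereas below the adapters are scalar residual maps and the router scores are given.

```python
def adapter(scale):
    # Stand-in for a bottleneck adapter: a tiny residual transformation.
    return lambda h: [x + scale * x for x in h]

def smoa_layer(h, adapters, router_scores, k=1):
    # Sparse routing: apply only the k adapters the router scores highest,
    # adding each one's residual contribution to the hidden state.
    top = sorted(range(len(adapters)), key=lambda i: router_scores[i],
                 reverse=True)[:k]
    out = list(h)
    for i in top:
        a_out = adapters[i](h)
        out = [o + (a - x) for o, a, x in zip(out, a_out, h)]
    return out

adapters = [adapter(0.1), adapter(0.5)]
print(smoa_layer([1.0, 2.0], adapters, router_scores=[0.2, 0.9], k=1))
# → [1.5, 3.0]: only the second (higher-scoring) adapter fires
```

Sparsity is what makes the interpretability claim plausible: because each input activates only a few sub-adapters, each sub-adapter sees a coherent slice of the data and can specialize to one bias pattern.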


An Ontology-Based Artificial Intelligence Model for Medicine Side-Effect Prediction: Taking Traditional Chinese Medicine as An Example

arXiv.org Artificial Intelligence

Artificial intelligence is a modern technology utilized in various fields of medicine [1-3]. Meanwhile, Traditional Chinese Medicine (TCM) is now widely considered a promising alternative medicine for complementary treatment of cancers and chronic diseases, owing to the effective methodology developed in practice by generations of doctors over almost 4000 years [4]. Based on previous verification, it is undeniable that many correlations exist between TCM syndromes and Western diseases, yielding novel approaches for enhancing treatment efficiency and developing medicines based on TCM methodologies [5]. Unfortunately, hindered by the remarkable gap between modern informatics and the foundation of TCM, ancient Chinese philosophy, such correlations are still too elusive to be formulated precisely. Therefore, in order to uncover the deep connection between modern science and TCM, research combining TCM with AI for valid knowledge acquisition and mining has recently attracted extensive attention, leading to many profound works, such as ontology information system design [6], latent tree model design [7], TCM warehouses for AI applications [8], and digital knowledge graph development [2]. Researchers face, however, many difficulties in setting up AI for TCM when directly interpreting the TCM semantic system (recorded almost entirely in ancient Chinese doctrines) into a structured database, because a considerable workload must be undertaken by a limited number of experts proficient in both AI and TCM to translate the TCM terminologies and then formulate a modern model from them. In contrast, as shown in Figure 1, the exploration of applying TCM methodology to issues of modern science, such as new medicine design, is relatively lacking and thus of significant worth.


Cost-aware Cascading Bandits

arXiv.org Machine Learning

In this paper, we propose a cost-aware cascading bandits model, a new variant of multi-armed bandits with cascading feedback, by considering the random cost of pulling arms. In each step, the learning agent chooses an ordered list of items and examines them sequentially until a certain stopping condition is satisfied. Our objective is then to maximize the expected net reward in each step, i.e., the reward obtained in each step minus the total cost incurred in examining the items, by deciding the ordered list of items as well as when to stop examination. We study both the offline and online settings, depending on whether the state and cost statistics of the items are known beforehand. For the offline setting, we show that the Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal. For the online setting, we propose a Cost-aware Cascading Upper Confidence Bound (CC-UCB) algorithm, and show that the cumulative regret scales in O(log T). We also provide a lower bound for all α-consistent policies, which scales in Ω(log T) and matches our upper bound. The performance of the CC-UCB algorithm is evaluated with both synthetic and real-world data.
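The flavor of a cost-aware cascading UCB procedure can be conveyed with a minimal sketch: order arms by an optimistic (UCB) estimate of success probability, examine them in that order while the optimistic estimate still exceeds the examination cost, and stop at the first success (cascading feedback). The reward model (unit reward at the first success), fixed known costs, and the exact confidence bonus below are simplifying assumptions, not the paper's precise CC-UCB specification.

```python
import math
import random

def cc_ucb(probs, costs, horizon, seed=0):
    # probs: true (unknown to the agent) success probabilities per arm.
    # costs: known examination cost per arm.
    rng = random.Random(seed)
    k = len(probs)
    pulls, wins = [0] * k, [0] * k
    total_net_reward = 0.0
    for t in range(1, horizon + 1):
        # Optimistic estimate: empirical mean plus a confidence bonus;
        # unexplored arms get the maximal estimate 1.0.
        ucb = [wins[i] / pulls[i] + math.sqrt(1.5 * math.log(t) / pulls[i])
               if pulls[i] else 1.0 for i in range(k)]
        for i in sorted(range(k), key=lambda j: ucb[j], reverse=True):
            if ucb[i] <= costs[i]:       # optimistic net gain non-positive
                break                    # -> stop examining further items
            total_net_reward -= costs[i]
            pulls[i] += 1
            if rng.random() < probs[i]:  # cascading feedback: a success
                wins[i] += 1             # ends the examination this step
                total_net_reward += 1.0
                break
    return total_net_reward

print(cc_ucb(probs=[0.9, 0.1], costs=[0.1, 0.1], horizon=2000))
```

As the confidence intervals shrink, the high-probability, cost-worthy arm is ranked first in almost every step, so the per-step net reward approaches that of the best fixed ordering, consistent with the O(log T) regret claim.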