egal
Efficient Algorithms for Electing Successive Committees
Jain, Pallavi, Kaczmarczyk, Andrzej
In a recently introduced model of successive committee elections (Bredereck et al., AAAI-20) for a given set of ordinal or approval preferences one aims to find a sequence of a given length of "best" same-size committees such that each candidate is a member of a limited number of consecutive committees. However, the practical usability of this model remains limited, as the described task turns out to be NP-hard for most selection criteria already for seeking committees of size three. Non-trivial or somewhat efficient algorithms for these cases are lacking too. Motivated by a desire to unlock the full potential of the described temporal model of committee elections, we devise (parameterized) algorithms that effectively solve the mentioned hard cases in realistic scenarios of a moderate number of candidates or of a limited time horizon.
Optimally Improving Cooperative Learning in a Social Setting
Haddadan, Shahrzad, Xin, Cheng, Gao, Jie
We consider a cooperative learning scenario where a collection of networked agents with individually owned classifiers dynamically update their predictions, for the same classification task, through communication or observations of each other's predictions. Clearly if highly influential vertices use erroneous classifiers, there will be a negative effect on the accuracy of all the agents in the network. We ask the following question: how can we optimally fix the prediction of a few classifiers so as maximize the overall accuracy in the entire network. To this end we consider an aggregate and an egalitarian objective function. We show a polynomial time algorithm for optimizing the aggregate objective function, and show that optimizing the egalitarian objective function is NP-hard. Furthermore, we develop approximation algorithms for the egalitarian improvement. The performance of all of our algorithms are guaranteed by mathematical analysis and backed by experiments on synthetic and real data.
UCE-FID: Using Large Unlabeled, Medium Crowdsourced-Labeled, and Small Expert-Labeled Tweets for Foodborne Illness Detection
Hu, Ruofan, Zhang, Dongyu, Tao, Dandan, Zhang, Huayi, Feng, Hao, Rundensteiner, Elke
Foodborne illnesses significantly impact public health. Deep learning surveillance applications using social media data aim to detect early warning signals. However, labeling foodborne illness-related tweets for model training requires extensive human resources, making it challenging to collect a sufficient number of high-quality labels for tweets within a limited budget. The severe class imbalance resulting from the scarcity of foodborne illness-related tweets among the vast volume of social media further exacerbates the problem. Classifiers trained on a class-imbalanced dataset are biased towards the majority class, making accurate detection difficult. To overcome these challenges, we propose EGAL, a deep learning framework for foodborne illness detection that uses small expert-labeled tweets augmented by crowdsourced-labeled and massive unlabeled data. Specifically, by leveraging tweets labeled by experts as a reward set, EGAL learns to assign a weight of zero to incorrectly labeled tweets to mitigate their negative influence. Other tweets receive proportionate weights to counter-balance the unbalanced class distribution. Extensive experiments on real-world \textit{TWEET-FID} data show that EGAL outperforms strong baseline models across different settings, including varying expert-labeled set sizes and class imbalance ratios. A case study on a multistate outbreak of Salmonella Typhimurium infection linked to packaged salad greens demonstrates how the trained model captures relevant tweets offering valuable outbreak insights. EGAL, funded by the U.S. Department of Agriculture (USDA), has the potential to be deployed for real-time analysis of tweet streaming, contributing to foodborne illness outbreak surveillance efforts.