gini
- South America > Brazil (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Online Social Welfare Function-based Resource Allocation
Pardeshi, Kanad, Foubert, Samsara, Singh, Aarti
In many real-world settings, a centralized decision-maker must repeatedly allocate finite resources to a population over multiple time steps. Individuals who receive a resource derive some stochastic utility; to characterize the population-level effects of an allocation, the expected individual utilities are then aggregated using a social welfare function (SWF). We formalize this setting and present a general confidence sequence framework for SWF-based online learning and inference, valid for any monotonic, concave, and Lipschitz-continuous SWF. Our key insight is that monotonicity alone suffices to lift confidence sequences from individual utilities to anytime-valid bounds on optimal welfare. Building on this foundation, we propose SWF-UCB, an SWF-agnostic online learning algorithm that achieves near-optimal $\tilde{O}(n+\sqrt{nkT})$ regret (for $k$ resources distributed among $n$ individuals at each of $T$ time steps). We instantiate our framework on three normatively distinct SWF families: Weighted Power Mean, Kolm, and Gini, providing bespoke oracle algorithms for each. Experiments confirm $\sqrt{T}$ scaling and reveal rich interactions between $k$ and SWF parameters. This framework naturally supports inference applications such as sequential hypothesis testing, optimal stopping, and policy evaluation.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Education (0.54)
- Health & Medicine (0.34)
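The monotonicity argument in the abstract above can be illustrated with a small sketch. The three welfare functions below use common textbook forms of the weighted power mean, Kolm, and generalized Gini families; the exact parameterizations in the paper are not given in this listing, so the function names, signatures, and parameter choices here are assumptions. The sketch only shows how per-individual confidence bounds carry over to a welfare interval for a fixed utility vector; it is not the SWF-UCB algorithm.

```python
import numpy as np

def power_mean(u, p, w=None):
    """Weighted power mean of positive utilities (assumed textbook form)."""
    u = np.asarray(u, dtype=float)
    w = np.full(u.shape, 1.0 / u.size) if w is None else np.asarray(w, dtype=float)
    if p == 0:
        # The p -> 0 limit is the weighted geometric mean.
        return float(np.exp(np.sum(w * np.log(u))))
    return float(np.sum(w * u ** p) ** (1.0 / p))

def kolm(u, a):
    """Kolm-style welfare with inequality aversion a > 0 (assumed form)."""
    u = np.asarray(u, dtype=float)
    return float(-np.log(np.mean(np.exp(-a * u))) / a)

def generalized_gini(u, w):
    """Weighted sum of utilities sorted ascending (worst-off first).

    Assumes non-negative weights, typically non-increasing so that the
    worst-off individuals count the most.
    """
    u_sorted = np.sort(np.asarray(u, dtype=float))
    return float(np.dot(np.asarray(w, dtype=float), u_sorted))

def welfare_interval(swf, lower, upper):
    """Lift coordinate-wise utility bounds to a welfare interval.

    For a monotone SWF, lower <= mu <= upper (coordinate-wise) implies
    swf(lower) <= swf(mu) <= swf(upper).
    """
    return swf(lower), swf(upper)

# Hypothetical confidence bounds on three individuals' expected utilities.
lower = [0.2, 0.4, 0.5]
upper = [0.4, 0.6, 0.9]
gini_weights = [0.5, 0.3, 0.2]
lo, hi = welfare_interval(lambda u: generalized_gini(u, gini_weights), lower, upper)
```

Because the interval only relies on monotonicity, the same `welfare_interval` helper works unchanged for any of the three families.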
Feature Learning for Interpretable, Performant Decision Trees (Supplementary Material, Section 1: Experiment Specification)
Here we cover the full specification of the experiments. Some details were omitted from the main text. If there were separate training and test sets, they were combined before creating the random 10-fold split. All attributes are normalized to mean 0 and standard deviation 1. Additional details for each model type follow.
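The preprocessing described above (merge any predefined train/test sets, normalize attributes to mean 0 and standard deviation 1, then draw a random 10-fold split) might look roughly like the following sketch. The function names are hypothetical, and computing the normalization statistics on the combined data rather than per training fold is an assumption, since the text does not say which was done.

```python
import numpy as np

def standardize(X):
    """Scale each column to mean 0 and standard deviation 1."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Assumption: constant columns are left at zero instead of dividing by zero.
    std[std == 0] = 1.0
    return (X - mean) / std

def ten_fold_indices(n_samples, n_folds=10, seed=0):
    """Return a list of index arrays forming a random n_folds-way partition."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    return np.array_split(perm, n_folds)

def combined_folds(X_train, X_test, seed=0):
    """Merge predefined train/test sets, standardize, and split into 10 folds."""
    X_all = standardize(np.vstack([X_train, X_test]))
    return X_all, ten_fold_indices(len(X_all), seed=seed)

# Toy data standing in for a dataset that ships with separate train/test files.
X_all, folds = combined_folds(np.arange(40.0).reshape(20, 2),
                              np.arange(10.0).reshape(5, 2))
```

Each entry of `folds` serves once as the held-out test indices, with the remaining nine folds used for training.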
One world, one opinion? The superstar effect in LLM responses
As large language models (LLMs) are shaping the way information is shared and accessed online, their opinions have the potential to influence a wide audience. This study examines who the LLMs view as the most prominent figures across various fields, using prompts in ten different languages to explore the influence of linguistic diversity. Our findings reveal low diversity in responses, with a small number of figures dominating recognition across languages (also known as the "superstar effect"). These results highlight the risk of narrowing global knowledge representation when LLMs retrieve subjective information.
- North America > United States > Virginia (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
- Asia > Indonesia > Bali (0.04)
Addressing bias in Recommender Systems: A Case Study on Data Debiasing Techniques in Mobile Games
Wang, Yixiong, Paskevich, Maria, Wang, Hui
The mobile gaming industry, particularly the free-to-play sector, has been around for more than a decade, yet it still experiences rapid growth. The games-as-a-service model requires game developers to pay much more attention to in-game content recommendations. Recommender systems (RS) inevitably bring with them the problem of bias in the data. A lot of research has been done on bias in RS for online retail or services, but much less is available for the specific case of the game industry. Also, previous works tested various debiasing techniques on explicit feedback datasets, whereas mobile gaming data much more commonly contains only implicit feedback. This case study aims to identify and categorize potential bias within datasets specific to model-based recommendations in mobile games, review debiasing techniques in the existing literature, and assess their effectiveness on real-world data gathered through implicit feedback. The effectiveness of these methods is then evaluated based on their debiasing quality, data requirements, and computational demands.
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Sweden > Stockholm > Stockholm (0.05)
You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools
Baumartz, Daniel, Bagci, Mevlüt, Henlein, Alexander, Konca, Maxim, Lücking, Andy, Mehler, Alexander
If sentiment analysis tools were valid classifiers, one would expect them to provide comparable results for sentiment classification on different kinds of corpora and for different languages. In line with the results of previous studies, we show that sentiment analysis tools disagree on the same dataset. Going beyond previous studies, we show that the tool used for sentiment annotation can even be predicted from its output, revealing an algorithmic bias of sentiment analysis. Based on Twitter, Wikipedia, and different news corpora in English, German, and French, our classifiers separate sentiment tools with an averaged F1-score of 0.89 (for the English corpora). We therefore warn against taking sentiment annotations at face value and argue for more, and more systematic, NLP evaluation studies.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
A Comparative Analysis of Wealth Index Predictions in Africa between three Multi-Source Inference Models
Karsai, Márton, Kertész, János, Espín-Noboa, Lisette
Poverty map inference is a critical area of research, with growing interest in both traditional and modern techniques, ranging from regression models to convolutional neural networks applied to tabular data, images, and networks. Despite extensive focus on the validation of training phases, the scrutiny of final predictions remains limited. Here, we compare the Relative Wealth Index (RWI) inferred by Chi et al. (2022) with the International Wealth Index (IWI) inferred by Lee and Braithwaite (2022) and Espín-Noboa et al. (2023) across six Sub-Saharan African countries. Our analysis focuses on identifying trends and discrepancies in wealth predictions over time. Our results show that the predictions by Chi et al. and Espín-Noboa et al. align with general GDP trends, with differences expected due to the distinct time-frames of the training sets. However, predictions by Lee and Braithwaite diverge significantly, indicating potential issues with the validity of the model. These discrepancies highlight the need for policymakers and stakeholders in Africa to rigorously audit models that predict wealth, especially those used for decision-making on the ground. These and other techniques require continuous verification and refinement to enhance their reliability and ensure that poverty alleviation strategies are well-founded.
- Africa > Uganda (0.16)
- Africa > South Africa (0.06)
- Africa > Rwanda (0.05)
- Banking & Finance (1.00)
- Government (0.66)
Individual Packet Features are a Risk to Model Generalisation in ML-Based Intrusion Detection
Kostas, Kahraman, Just, Mike, Lones, Michael A.
Machine learning is increasingly used for intrusion detection in IoT networks. This paper explores the effectiveness of using individual packet features (IPF), which are attributes extracted from a single network packet, such as timing, size, and source-destination information. Through literature review and experiments, we identify the limitations of IPF, showing they can produce misleadingly high detection rates. Our findings emphasize the need for approaches that consider packet interactions for robust intrusion detection. Additionally, we demonstrate that models based on IPF often fail to generalize across datasets, compromising their reliability in diverse IoT environments.
Interpretable Distribution-Invariant Fairness Measures for Continuous Scores
Becker, Ann-Kristin, Dumitrasc, Oana, Broelemann, Klaus
Measures of algorithmic fairness are usually discussed in the context of binary decisions. We extend the approach to continuous scores. So far, ROC-based measures have mainly been suggested for this purpose. Other existing methods depend heavily on the distribution of scores, are unsuitable for ranking tasks, or their effect sizes are not interpretable. Here, we propose a distributionally invariant version of fairness measures for continuous scores with a reasonable interpretation based on the Wasserstein distance. Our measures are easily computable and well suited for quantifying and interpreting the strength of group disparities as well as for comparing biases across different models, datasets, or time points. We derive a link between the different families of existing fairness measures for scores and show that the proposed distributionally invariant fairness measures outperform ROC-based fairness measures because they are more explicit and can quantify significant biases that ROC-based fairness measures miss. Finally, we demonstrate their effectiveness through experiments on the most commonly used fairness benchmark datasets.
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Oceania > Australia > Western Australia > Perth (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Research Report > Experimental Study (0.94)
- Research Report > New Finding (0.88)
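As a rough illustration of the Wasserstein-based idea in the fairness abstract above, the sketch below computes the plain 1-Wasserstein distance between the empirical score distributions of two groups. This is the standard distance only, not the distribution-invariant measure the authors propose (whose definition is not given in this listing); the function name and the toy scores are assumptions.

```python
import numpy as np

def wasserstein_1(a, b):
    """1-Wasserstein distance between two empirical 1-D distributions.

    Computed as the integral of |F_a - F_b| over the pooled sample points,
    where F_a and F_b are the empirical CDFs. Sample sizes may differ.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.sort(np.concatenate([a, b]))
    deltas = np.diff(grid)
    # Empirical CDF values on each interval [grid[i], grid[i+1]).
    cdf_a = np.searchsorted(a, grid[:-1], side="right") / a.size
    cdf_b = np.searchsorted(b, grid[:-1], side="right") / b.size
    return float(np.sum(np.abs(cdf_a - cdf_b) * deltas))

# Hypothetical model scores for two protected groups.
scores_group_a = [0.2, 0.4, 0.6, 0.8]
scores_group_b = [0.3, 0.5, 0.7, 0.9]
disparity = wasserstein_1(scores_group_a, scores_group_b)
```

Note that this raw distance changes under rescaling of the scores, which is the kind of distribution dependence the abstract says the proposed measures are designed to avoid.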