 Large Language Model



Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models

Neural Information Processing Systems

The capabilities of natural language models trained on large-scale data have increased immensely over the past few years. Open source libraries such as HuggingFace have made these models easily available and accessible. While prior research has identified biases in large language models, this paper considers biases contained in the most popular versions of these models when applied 'out-of-the-box' for downstream tasks. We focus on generative language models as they are well-suited for extracting biases inherited from training data. Specifically, we conduct an in-depth analysis of GPT-2, which is the most downloaded text generation model on HuggingFace, with over half a million downloads per month. We assess biases related to occupational associations for different protected categories by intersecting gender with religion, sexuality, ethnicity, political affiliation, and continental name origin. Using a template-based data collection pipeline, we collect 396K sentence completions made by GPT-2 and find: (i) The machine-predicted jobs are less diverse and more stereotypical for women than for men, especially for intersections; (ii) Intersectional interactions are highly relevant for occupational associations, which we quantify by fitting 262 logistic models; (iii) For most occupations, GPT-2 reflects the skewed gender and ethnicity distribution found in US Labor Bureau data, and even pulls the societally-skewed distribution towards gender parity in cases where its predictions deviate from real labor market observations. This raises the normative question of what language models should learn: whether they should reflect or correct for existing inequalities.
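A minimal sketch of what the template-based prompt construction described above might look like. The template string, the category values, and the helper name `build_prompts` are illustrative assumptions, not taken from the paper; the sketch only builds prompts and does not call GPT-2.

```python
from itertools import product

# Hypothetical category values, chosen only for illustration.
GENDERS = ["man", "woman"]
CATEGORIES = {
    "base": [""],  # gender only, no intersecting attribute
    "religion": ["Buddhist", "Christian"],
    "political affiliation": ["liberal", "conservative"],
}

# Assumed template; the paper's actual wording may differ.
TEMPLATE = "The {modifier}{gender} works as a"


def build_prompts(genders, categories, template=TEMPLATE):
    """Cross every category value with every gender and fill the template."""
    prompts = []
    for category, values in categories.items():
        for value, gender in product(values, genders):
            modifier = f"{value} " if value else ""
            prompts.append(
                {
                    "category": category,
                    "value": value,
                    "gender": gender,
                    "prompt": template.format(modifier=modifier, gender=gender),
                }
            )
    return prompts


prompts = build_prompts(GENDERS, CATEGORIES)
for p in prompts[:3]:
    print(p["prompt"])
```

In a full pipeline, each prompt would presumably be passed to a text generation model many times and the predicted occupation extracted from each completion; that step is omitted here.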



Investigation: RAM prices are falling. Don't fall for it

PCWorld

A few price dips don't mean the memory crisis is over -- AI demand, tight supply, and a jittery market could keep PC upgrades expensive. Rising prices are the biggest tech story of 2026. Well, the biggest tech story, anyway -- the biggest story in a broader sense is "AI" in general.


What you need to know as Elon Musk's lawsuit against Sam Altman begins

Engadget

It's sure to be cringe, and may end up costing OpenAI billions. OpenAI CEO Sam Altman speaks during the BlackRock Infrastructure Summit on March 11, 2026 in Washington, DC. In a few short days, jury selection will begin in the long-awaited case. At the end of that process, an Oakland federal court will task nine regular people with deciding if OpenAI defrauded Elon Musk when it announced, and recently completed, its reorganization to become a more traditional for-profit business. More than just being the venue where two billionaires will air their grievances against one another in public, the trial has the potential to reshape the AI industry.


equizero_neurips23_format

Neural Information Processing Systems

Proof of Thm. 2. We want to show $M_G(hx) = h M_G(x)$ for all $x \in X$ and $h \in G$. From the definition of $M_G$ in equation 4, we have $M_G(hx) = \frac{1}{\sum \cdots} \cdots$ Similar to Yarotsky (2022), we first define $K_{sym} = \bigcup_{g \in G} gK$. Note that $K_{sym}$ is also a compact set and $K_{sym} \subseteq X$. We want to show that $M_{G,equi}(gx) = g M_{G,equi}(x)$. Hence, $h(gx)^{-1} gx$ is invariant to actions of $G$. The proof for invariance of $M_{G,inv}(x)$ follows similarly. In addition to properties discussed in section 3.3, here we show that equizero models have autoregressive and invertibility properties. These properties have not been used in the main paper, but we believe they could be of use for future work in this area.
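The invariance of $h(gx)^{-1} gx$ can be illustrated on a toy example. The sketch below assumes the group of cyclic shifts acting on a list and a hypothetical canonicalization function `h` that picks the shift aligning the input with its lexicographically smallest rotation; neither choice comes from the paper.

```python
def shift(x, k):
    """Act on a list with the cyclic group: shift right by k positions."""
    n = len(x)
    k %= n
    return x[-k:] + x[:-k] if k else list(x)


def h(x):
    """Hypothetical canonicalization function: return the shift k such that
    undoing it (shifting by -k) gives the lexicographically smallest rotation."""
    return min(range(len(x)), key=lambda k: shift(x, -k))


def canonicalize(x):
    """Compute h(x)^{-1} x, which should not change when x is shifted."""
    return shift(x, -h(x))


x = [3, 1, 2, 5]
canonical = canonicalize(x)
# The canonical form is the same for every shifted copy g x of the input.
same_for_all = all(
    canonicalize(shift(x, g)) == canonical for g in range(len(x))
)
print(canonical, same_for_all)
```

Any function applied to `canonicalize(x)` is then invariant to shifts of `x`, which is the property the proof relies on.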


Efficient Equivariant Transfer Learning from Pretrained Models

Neural Information Processing Systems

Efficient transfer learning algorithms are key to the success of foundation models on diverse downstream tasks even with limited data. Recent works of Basu et al. (2023) and Kaba et al. (2022) propose group averaging (equitune) and optimization-based methods, respectively, over features from group-transformed inputs to obtain equivariant outputs from non-equivariant neural networks. While Kaba et al. (2022) are only concerned with training from scratch, we find that equitune performs poorly on equivariant zero-shot tasks despite good finetuning results. We hypothesize that this is because pretrained models provide better quality features for certain transformations than others and simply averaging them is deleterious. Hence, we propose λ-equitune that averages the features using importance weights, λs. These weights are learned directly from the data using a small neural network, leading to excellent zero-shot and finetuned results that outperform equitune. Further, we prove that λ-equitune is equivariant and a universal approximator of equivariant functions. Additionally, we show that the method of Kaba et al. (2022) used with appropriate loss functions, which we call equizero, also gives excellent zero-shot and finetuned performance.
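A minimal sketch of importance-weighted group averaging, under several assumptions: the group is the set of cyclic shifts of a list, the weighted average takes the form $\sum_g \lambda(gx)\, g^{-1} M(gx) / \sum_g \lambda(gx)$, and both `model` and `lam` are hypothetical stand-ins (in the abstract, the weights are learned by a small neural network rather than fixed by hand).

```python
def shift(x, k):
    """Act on a list with the cyclic group: shift right by k positions."""
    n = len(x)
    k %= n
    return x[-k:] + x[:-k] if k else list(x)


def model(x):
    """Hypothetical non-equivariant model: output depends on absolute position."""
    return [(i + 1) * v + x[0] ** 2 for i, v in enumerate(x)]


def lam(x):
    """Hypothetical positive importance weight (stand-in for a learned network)."""
    return 1.0 + abs(x[0])


def lambda_equitune(x, model, lam):
    """Weighted average over the group of g^{-1} M(g x) with weights lam(g x)."""
    n = len(x)
    acc = [0.0] * n
    total = 0.0
    for k in range(n):
        gx = shift(x, k)
        w = lam(gx)
        out = shift(model(gx), -k)  # map the output back with g^{-1}
        acc = [a + w * o for a, o in zip(acc, out)]
        total += w
    return [a / total for a in acc]


x = [1.0, 2.0, 3.0, 4.0]
y = lambda_equitune(x, model, lam)
# Equivariance check: shifting the input should shift the output the same way.
max_err = max(
    abs(a - b)
    for h in range(len(x))
    for a, b in zip(lambda_equitune(shift(x, h), model, lam), shift(y, h))
)
print(max_err < 1e-9)
```

Setting `lam` to a constant recovers plain group averaging, so the weights are the only difference between the two schemes in this sketch.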