
Mike Rowe reveals which American jobs will remain untouched by the coming AI revolution

FOX News

MikeroweWORKS Foundation founder Mike Rowe joins 'The Brian Kilmeade Show' to discuss how AI and robots threaten white-collar jobs as the nation faces a need for blue-collar workers. Mike Rowe is sounding the alarm about the future of white- and blue-collar jobs, urging young Americans to rethink their career choices in light of threats from artificial intelligence. The former star of "How America Works" and "Dirty Jobs" sat down with Fox News Radio host Brian Kilmeade to discuss the outlook for the U.S. job market amid the Trump administration's recent moves to invest in domestic energy and artificial intelligence. Trump visited Pittsburgh on July 15 to announce a $90 billion investment in data centers and other energy projects in Pennsylvania. Rowe was also present at the event, dubbed the Energy and Investment Summit, at Carnegie Mellon University.


A Stereotype Content Analysis on Color-related Social Bias in Large Vision Language Models

Choi, Junhyuk, Kim, Minju, Hong, Yeseon, Kim, Bugeun

arXiv.org Artificial Intelligence

As large vision language models (LVLMs) rapidly advance, concerns about their potential to learn and generate social biases and stereotypes are increasing. Previous studies on LVLM stereotypes face two primary limitations: metrics that overlooked the importance of content words, and datasets that overlooked the effect of color. To address these limitations, this study introduces new evaluation metrics based on the Stereotype Content Model (SCM). We also propose BASIC, a benchmark for assessing gender, race, and color stereotypes. Using the SCM metrics and BASIC, we conduct a study with eight LVLMs to discover stereotypes. We report three findings: (1) the SCM-based evaluation is effective in capturing stereotypes; (2) LVLMs exhibit color stereotypes in their output alongside gender and race stereotypes; and (3) the interaction between model architecture and parameter size appears to affect stereotypes. We release BASIC publicly on [anonymized for review].


Inside the eerily accurate presidential election simulation that has predicted the 2024 winner

Daily Mail - Science & tech

If a video game designed to predict the presidential election is correct, then Donald Trump will take the White House. I played Stardock's 'The Political Machine,' which forecast Trump's shock win in 2016, to see what America could expect once the polls close Tuesday night. The simulation is a turn-based, map-trotting game -- not unlike 'Risk' or any other tabletop game of political strategy -- except the board itself reacts to you and your opponent's moves based on historic turnout data, debate focus groups and more. The game's makers claim it 'relies heavily on demographic issue patterns' like job, race, sex and income level to determine 'what issues [voters] care about,' data the team has updated regularly ever since they created the first edition back in 2004. Initially, I found the game confusing, complicated and frankly dorky, but in time I was enthusiastically buying up local ads, setting up campaign offices and hiring 'smear merchants' to spread devious rumors about my opponent: Donald J. Trump.


Curve Your Enthusiasm: Concurvity Regularization in Differentiable Generalized Additive Models

Neural Information Processing Systems

Generalized Additive Models (GAMs) have recently experienced a resurgence in popularity due to their interpretability, which arises from expressing the target value as a sum of non-linear transformations of the features. Despite the current enthusiasm for GAMs, their susceptibility to concurvity -- i.e., (possibly non-linear) dependencies between the features -- has hitherto been largely overlooked. Here, we demonstrate how concurvity can severely impair the interpretability of GAMs and propose a remedy: a conceptually simple, yet effective regularizer which penalizes pairwise correlations of the non-linearly transformed feature variables. This procedure is applicable to any differentiable additive model, such as Neural Additive Models or NeuralProphet, and enhances interpretability by eliminating ambiguities due to self-canceling feature contributions. Our experiments show that concurvity in GAMs can be reduced without significantly compromising prediction quality, improving interpretability and reducing variance in the feature importances.
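One plausible instantiation of the regularizer the abstract describes is the mean absolute pairwise Pearson correlation between the transformed feature columns. The sketch below computes that quantity in NumPy for illustration; the function name and the exact aggregation (mean of absolute off-diagonal correlations) are assumptions, not the paper's definition, and in practice the penalty would be computed inside an autodiff framework so it can be added to the training loss.

```python
import numpy as np

def concurvity_penalty(feature_outputs: np.ndarray) -> float:
    """Mean absolute off-diagonal correlation between transformed features.

    feature_outputs: (n_samples, n_features) array, where column i holds
    the non-linearly transformed feature f_i(x_i) of a GAM.
    """
    x = feature_outputs - feature_outputs.mean(axis=0, keepdims=True)
    std = np.maximum(x.std(axis=0, ddof=1, keepdims=True), 1e-8)
    z = x / std                                  # standardize each column
    corr = z.T @ z / (x.shape[0] - 1)            # Pearson correlation matrix
    n = corr.shape[0]
    off_diag = corr - np.diag(np.diag(corr))     # zero out the diagonal
    return float(np.abs(off_diag).sum() / (n * (n - 1)))
```

Two perfectly correlated columns yield a penalty of 1.0, while orthogonal columns yield 0.0, so adding a weighted multiple of this term to the loss pushes the model toward decorrelated (and hence unambiguous) feature contributions.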


An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models

Sadhu, Jayanta, Saha, Maneesha Rani, Shahriyar, Rifat

arXiv.org Artificial Intelligence

The influence of Large Language Models (LLMs) is rapidly growing, automating more jobs over time. Assessing the fairness of LLMs is crucial due to their expanding impact. Studies reveal that LLMs reflect societal norms and biases, creating a risk of propagating societal stereotypes in downstream tasks. Many studies on bias in LLMs focus on gender bias in various NLP applications. However, there is a gap in research on bias in emotional attributes, despite the close societal link between emotion and gender. This gap is even larger for low-resource languages like Bangla. Historically, women are associated with emotions like empathy, fear, and guilt, while men are linked to anger, bravado, and authority, a pattern that reflects societal norms in Bangla-speaking regions. In this work, we offer the first thorough investigation of gendered emotion attribution in Bangla for both closed- and open-source LLMs. Our aim is to elucidate the intricate societal relationship between gender and emotion specifically within the context of Bangla. Through analytical methods, we show the existence of gender bias in the context of emotions in Bangla, and we show how emotion attribution changes based on gendered role selection in LLMs. All of our resources, including code and data, are made publicly available to support future research on Bangla NLP. Warning: This paper contains explicit stereotypical statements that many may find offensive.


AI Safety: Necessary, but insufficient and possibly problematic

P, Deepak

arXiv.org Artificial Intelligence

This article critically examines the recent hype around AI safety. We begin by noting that the AI safety hype is dominated by governments and corporations, and contrast it with other avenues within AI research on advancing social good. We consider what 'AI safety' actually means, and outline the dominant concepts that the digital footprint of AI safety aligns with. We posit that AI safety has a nuanced and uneasy relationship with transparency and other allied notions associated with societal good, indicating that it is an insufficient notion if the goal is societal good in a broad sense. We note that the AI safety debate has already influenced some regulatory efforts in AI, perhaps in undesirable directions. We also share our concerns about how AI safety may normalize AI that advances structural harm by lending exploitative and harmful AI a veneer of safety.


Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases

Levi, Elad, Brosh, Eli, Friedmann, Matan

arXiv.org Artificial Intelligence

Prompt engineering is a challenging and important task due to the high sensitivity of Large Language Models (LLMs) to the given prompt and the inherent ambiguity of a textual task instruction. Automatic prompt engineering is essential to achieve optimized performance from LLMs. Recent studies have demonstrated the capability of LLMs to conduct prompt engineering automatically by employing a meta-prompt that incorporates the outcomes of previous trials and proposes an improved prompt. However, this requires a high-quality benchmark to compare different prompts, which is difficult and expensive to acquire in many real-world use cases. In this work, we introduce a new method for automatic prompt engineering, using a calibration process that iteratively refines the prompt to match the user intent. During the optimization process, the system jointly generates synthetic data of boundary use cases and optimizes the prompt according to the generated dataset. We demonstrate the effectiveness of our method against strong proprietary models on real-world tasks such as moderation and generation. Our method outperforms state-of-the-art methods with a limited number of annotated samples. Furthermore, we validate the advantages of each of the system's key components. Our system is built in a modular way, facilitating easy adaptation to other tasks. The code is available at https://github.com/Eladlev/AutoPrompt.
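The calibration loop the abstract describes can be sketched as follows. This is a schematic reconstruction, not the AutoPrompt implementation: the callables `generate_boundary_cases`, `score`, and `refine` are hypothetical placeholders standing in for LLM-backed components (synthetic boundary-case generation, prompt evaluation on the accumulated dataset, and meta-prompt refinement, respectively).

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)

def calibrate_prompt(
    prompt: str,
    generate_boundary_cases: Callable[[str], List[Example]],
    score: Callable[[str, List[Example]], float],
    refine: Callable[[str, float, List[Example]], str],
    n_iters: int = 5,
) -> str:
    """Iteratively grow a synthetic boundary-case dataset and refine the
    prompt against it, returning the best-scoring prompt found."""
    dataset: List[Example] = []
    best_prompt, best_score = prompt, float("-inf")
    for _ in range(n_iters):
        # Generate challenging boundary cases for the current best prompt.
        dataset.extend(generate_boundary_cases(best_prompt))
        # Evaluate the candidate prompt on all accumulated cases.
        current = score(prompt, dataset)
        if current > best_score:
            best_prompt, best_score = prompt, current
        # A meta-prompt step proposes the next candidate.
        prompt = refine(prompt, current, dataset)
    return best_prompt
```

The key design point mirrored here is that evaluation data is not a fixed benchmark: the dataset of boundary cases grows jointly with the optimization, so the prompt is pushed to handle exactly the cases where the current version is ambiguous.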