Generative AI
Improving Fairness in LLMs Through Testing-Time Adversaries
Gregio, Isabela Pereira, Pons, Ian, Costa, Anna Helena Reali, Jordão, Artur
Large Language Models (LLMs) push the boundaries in natural language processing and generative AI, driving progress across various aspects of modern society. Unfortunately, the pervasive issue of bias in LLM responses (i.e., predictions) poses a significant and open challenge, hindering their application in tasks involving ethical sensitivity and responsible decision-making. In this work, we propose a straightforward, user-friendly, and practical method to mitigate such biases, enhancing the reliability and trustworthiness of LLMs. Our method creates multiple variations of a given sentence by modifying specific attributes and compares the model's prediction on each variation with its prediction on the original, unaltered sentence. The idea behind this process is that critical ethical predictions often exhibit notable inconsistencies, indicating the presence of bias. Unlike previous approaches, our method relies solely on forward passes (i.e., testing-time adversaries), eliminating the need for training, fine-tuning, or prior knowledge of the training data distribution. Through extensive experiments on the popular Llama family, we demonstrate the effectiveness of our method in improving various fairness metrics, focusing on the reduction of disparities in how the model treats individuals from different racial groups. Specifically, using standard metrics, we improve the fairness of Llama3 by up to 27 percentage points. Overall, our approach significantly enhances fairness, equity, and reliability in LLM-generated results without parameter tuning or training data modifications, confirming its effectiveness in practical scenarios. We believe our work establishes an important step toward enabling the use of LLMs in tasks that require ethical considerations and responsible decision-making.
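The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea of comparing predictions across attribute-swapped variants of a sentence. The function names, the `toy_predict` stub (a stand-in for real LLM forward passes), the placeholder group labels, and the majority-vote resolution are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Iterable


def make_variants(sentence: str, original: str, alternatives: Iterable[str]) -> list[str]:
    """Create variants by swapping one attribute term for each alternative."""
    return [sentence.replace(original, alt) for alt in alternatives]


def consistency_check(
    sentence: str,
    original: str,
    alternatives: Iterable[str],
    predict: Callable[[str], str],
) -> dict:
    """Run forward passes only: predict on the original and on every variant."""
    base = predict(sentence)
    variant_preds = [predict(v) for v in make_variants(sentence, original, alternatives)]
    # Assumption: resolve disagreement by majority vote over all predictions.
    majority, _ = Counter([base] + variant_preds).most_common(1)[0]
    return {
        "original": base,
        "variants": variant_preds,
        "consistent": all(p == base for p in variant_preds),
        "majority": majority,
    }


def toy_predict(text: str) -> str:
    """Hypothetical biased classifier used in place of an LLM for this sketch."""
    return "high risk" if "group B" in text else "low risk"


result = consistency_check(
    "The applicant from group B requests a loan.",
    original="group B",
    alternatives=["group A", "group C"],
    predict=toy_predict,
)
print(result)
```

In this toy run the prediction changes when only the group label changes, which is the kind of inconsistency the abstract treats as a signal of bias.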
Can Sam Altman Be Trusted with the Future?
In 2017, soon after Google researchers invented a new kind of neural network called a transformer, a young OpenAI engineer named Alec Radford began experimenting with it. What made the transformer architecture different from that of existing A.I. systems was that it could ingest and make connections among larger volumes of text, and Radford decided to train his model on a database of seven thousand unpublished English-language books--romance, adventure, speculative tales, the full range of human fantasy and invention. Then, instead of asking the network to translate text, as Google's researchers had done, he prompted it to predict the most probable next word in a sentence. The machine responded: one word, then another, and another--each new term inferred from the patterns buried in those seven thousand books. Radford hadn't given it rules of grammar or a copy of Strunk and White.
The market's down, but this OpenAI for the stock market can help you trade up
You've seen headlines about the market crash and maybe even wondered if now's your shot at finally investing. A stock-picking tool powered by OpenAI is helping regular folks identify strong opportunities with minimal risk. Meet Sterling Stock Picker, the thing that could turn your savings account into an early retirement, extra travel funds, or whatever you wish. Rather than gambling with your hard-earned dollars, this tool helps you research options that match your preferences and risk tolerance, and a lifetime subscription is just 55.19 (reg. Want to dive into the stock market but feel like you're reading a foreign language?
Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
Duong, Thang, Yang, Minglai, Zhang, Chicheng
We investigate the usage of Large Language Model (LLM) in collecting high-quality data to warm-start Reinforcement Learning (RL) algorithms for learning in some classical Markov Decision Process (MDP) environments. In this work, we focus on using LLM to generate an off-policy dataset that sufficiently covers state-actions visited by optimal policies, then later using an RL algorithm to explore the environment and improve the policy suggested by the LLM. Our algorithm, LORO, can both converge to an optimal policy and have a high sample efficiency thanks to the LLM's good starting policy. On multiple OpenAI Gym environments, such as CartPole and Pendulum, we empirically demonstrate that LORO outperforms baseline algorithms such as pure LLM-based policies, pure RL, and a naive combination of the two, achieving up to $4 \times$ the cumulative rewards of the pure RL baseline.
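To make the warm-starting idea concrete, here is a minimal sketch on a hypothetical five-state chain environment using tabular Q-learning. It is not LORO: the environment, the `stub_llm_policy` function (a fixed heuristic standing in for an LLM-suggested policy), and all hyperparameters are assumptions chosen so the example stays small and deterministic.

```python
import random

N_STATES, GOAL, GAMMA, ALPHA = 5, 4, 0.9, 0.5  # toy chain: states 0..4, goal at 4


def step(state: int, action: int):
    """Action 1 moves right, action 0 moves left; reward 1 on reaching the goal."""
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def stub_llm_policy(state: int) -> int:
    """Hypothetical stand-in for an LLM-suggested policy: always move right."""
    return 1


def collect_dataset(policy, episodes: int = 3, max_steps: int = 20):
    """Roll out the starting policy to build an off-policy transition dataset."""
    data = []
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = policy(s)
            s2, r, done = step(s, a)
            data.append((s, a, r, s2, done))
            s = s2
            if done:
                break
    return data


def q_update(q, s, a, r, s2, done):
    target = r if done else r + GAMMA * max(q[s2])
    q[s][a] += ALPHA * (target - q[s][a])


def warm_start(data, sweeps: int = 20):
    """Initialise Q-values by replaying the collected dataset offline."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for transition in data:
            q_update(q, *transition)
    return q


def fine_tune(q, episodes: int = 50, eps: float = 0.1, seed: int = 0):
    """Continue with epsilon-greedy online Q-learning from the warm-started table."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s = 0
        for _ in range(50):
            greedy_a = max((0, 1), key=lambda x: q[s][x])
            a = rng.randrange(2) if rng.random() < eps else greedy_a
            s2, r, done = step(s, a)
            q_update(q, s, a, r, s2, done)
            s = s2
            if done:
                break
    return q


def greedy_policy(q):
    return [max((0, 1), key=lambda a: q[s][a]) for s in range(GOAL)]


q_table = warm_start(collect_dataset(stub_llm_policy))
warm_policy = greedy_policy(q_table)
final_policy = greedy_policy(fine_tune(q_table))
print(warm_policy, final_policy)
```

The design point the sketch illustrates is that online exploration starts from value estimates already shaped by the starting policy's data, rather than from zeros.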
ReaCritic: Large Reasoning Transformer-based DRL Critic-model Scaling For Heterogeneous Networks
Heterogeneous Networks (HetNets) pose critical challenges for intelligent management due to the diverse user requirements and time-varying wireless conditions. These factors introduce significant decision complexity, which limits the adaptability of existing Deep Reinforcement Learning (DRL) methods. In many DRL algorithms, especially those involving value-based or actor-critic structures, the critic component plays a key role in guiding policy learning by estimating value functions. However, conventional critic models often use shallow architectures that map observations directly to scalar estimates, limiting their ability to handle multi-task complexity. In contrast, recent progress in inference-time scaling of Large Language Models (LLMs) has shown that generating intermediate reasoning steps can significantly improve decision quality. Motivated by this, we propose ReaCritic, a large reasoning transformer-based critic-model scaling scheme that brings reasoning ability into DRL. ReaCritic performs horizontal reasoning over parallel state-action inputs and vertical reasoning through deep transformer stacks. It is compatible with a broad range of value-based and actor-critic DRL algorithms and enhances generalization in dynamic wireless environments. Extensive experiments demonstrate that ReaCritic improves convergence speed and final performance across various HetNet settings and standard OpenAI Gym control tasks.
TACO: Rethinking Semantic Communications with Task Adaptation and Context Embedding
Wijesinghe, Achintha, Wang, Weiwei, Wanninayaka, Suchinthaka, Zhang, Songyang, Ding, Zhi
Recent advancements in generative artificial intelligence have introduced groundbreaking approaches to innovating next-generation semantic communication, which prioritizes conveying the meaning of a message rather than merely transmitting raw data. A fundamental challenge in semantic communication lies in accurately identifying and extracting the most critical semantic information while adapting to downstream tasks without degrading performance, particularly when the objective at the receiver may evolve over time. To enable flexible adaptation to multiple tasks at the receiver, this work introduces a novel semantic communication framework, which is capable of jointly capturing task-specific information to enhance downstream task performance and contextual information. Through rigorous experiments on popular image datasets and computer vision tasks, our framework shows promising improvement compared to existing work, including superior performance in downstream tasks, better generalizability, ultra-high bandwidth efficiency, and low reconstruction latency. Next-generation communication systems are expected to support the surge in data-intensive applications with the increasing demand to handle a copious amount of multimodal data generated from intelligent devices, including those from smart sensors, ecosystems of the Internet of Things, mixed reality, and autonomous vehicles [1]. To enable wireless communications with the capacity to satisfy the request from the receiver end with ultra-high bandwidth efficiency in the big data era, semantic communication (SemCOM) has emerged as a transformative paradigm, which shifts data transmission from faithful bitwise recovery of source data to conveying its most critical semantic meaning [2].
Large Language Models for Cancer Communication: Evaluating Linguistic Quality, Safety, and Accessibility in Generative AI
Saha, Agnik, Churchill, Victoria, Rodriguez, Anny D., Kursuncu, Ugur, Idris, Muhammed Y.
Effective communication about breast and cervical cancers remains a persistent health challenge, with significant gaps in public understanding of cancer prevention, screening, and treatment, potentially leading to delayed diagnoses and inadequate treatments. This study evaluates the capabilities and limitations of Large Language Models (LLMs) in generating accurate, safe, and accessible cancer-related information to support patient understanding. We evaluated five general-purpose and three medical LLMs using a mixed-methods evaluation framework across linguistic quality, safety and trustworthiness, and communication accessibility and affectiveness. Our approach utilized quantitative metrics, qualitative expert ratings, and statistical analysis using Welch's ANOVA, Games-Howell, and Hedges' g. Our results show that general-purpose LLMs produced outputs of higher linguistic quality and affectiveness, while medical LLMs demonstrate greater communication accessibility. However, medical LLMs tend to exhibit higher levels of potential harm, toxicity, and bias, reducing their performance in safety and trustworthiness. Our findings indicate a duality between domain-specific knowledge and safety in health communications. The results highlight the need for intentional model design with targeted improvements, particularly in mitigating harm and bias, and improving safety and affectiveness. This study provides a comprehensive evaluation of LLMs for cancer communication, offering critical insights for improving AI-generated health content and informing future development of accurate, safe, and accessible digital health tools.
That weird call or text from a senator is probably an AI scam
If you recently received a voice message from an unusual number claiming to be your local congressperson, it's probably a scam. The FBI's crime division issued a warning this week about a new scheme in which bad actors use text messages and AI-generated voice clones to impersonate government officials. The scammers try to build a sense of connection with their target and eventually convince them to click on a malicious link that steals valuable login credentials. This scam is just the latest in a series of evolving attacks using convincing generative AI technology to trick people.
Learning Graph Representation of Agent Diffusers
Djenouri, Youcef, Belmecheri, Nassim, Michalak, Tomasz, Dubiński, Jan, Belbachir, Ahmed Nabil, Yazidi, Anis
Diffusion-based generative models have significantly advanced text-to-image synthesis, demonstrating impressive text comprehension and zero-shot generalization. These models refine images from random noise based on textual prompts, with initial reliance on text input shifting towards enhanced visual fidelity over time. This transition suggests that static model parameters might not optimally address the distinct phases of generation. We introduce LGR-AD (Learning Graph Representation of Agent Diffusers), a novel multi-agent system designed to improve adaptability in dynamic computer vision tasks. LGR-AD models the generation process as a distributed system of interacting agents, each representing an expert sub-model. These agents dynamically adapt to varying conditions and collaborate through a graph neural network that encodes their relationships and performance metrics. Our approach employs a coordination mechanism based on top-$k$ maximum spanning trees, optimizing the generation process. Each agent's decision-making is guided by a meta-model that minimizes a novel loss function, balancing accuracy and diversity. Theoretical analysis and extensive empirical evaluations show that LGR-AD outperforms traditional diffusion models across various benchmarks, highlighting its potential for scalable and flexible solutions in complex image generation tasks. Code is available at: https://github.com/YousIA/LGR_AD
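As a rough illustration of the spanning-tree building block mentioned above, the sketch below computes a single maximum spanning tree over a small agent graph with Kruskal's algorithm. It does not implement the top-$k$ variant or any part of LGR-AD; the agent indices and edge weights are hypothetical values standing in for learned relationship scores.

```python
def maximum_spanning_tree(n_agents: int, edges: list[tuple[float, int, int]]):
    """Kruskal's algorithm on edges sorted by descending weight.

    edges holds (weight, agent_u, agent_v) tuples; returns the chosen
    (agent_u, agent_v, weight) edges of one maximum spanning tree.
    """
    parent = list(range(n_agents))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding this edge does not create a cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical pairwise scores between four agents.
agent_edges = [(0.9, 0, 1), (0.8, 1, 2), (0.1, 0, 2), (0.7, 2, 3), (0.2, 1, 3)]
mst = maximum_spanning_tree(4, agent_edges)
print(mst)
```

With these weights the tree keeps the three strongest links that connect all four agents without forming a cycle.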
Demystifying AI Agents: The Final Generation of Intelligence
McNamara, Kevin J, Marpu, Rhea Pritham
The trajectory of artificial intelligence (AI) has been one of relentless acceleration, evolving from rudimentary rule-based systems to sophisticated, autonomous agents capable of complex reasoning and interaction. This whitepaper chronicles this remarkable journey, charting the key technological milestones--advancements in prompting, training methodologies, hardware capabilities, and architectural innovations--that have converged to create the AI agents of today. We argue that these agents, exemplified by systems like OpenAI's ChatGPT with plugins and xAI's Grok, represent a culminating phase in AI development, potentially constituting the "final generation" of intelligence as we currently conceive it. We explore the capabilities and underlying technologies of these agents, grounded in practical examples, while also examining the profound societal implications and the unprecedented pace of progress that suggests intelligence is now doubling approximately every six months. The paper concludes by underscoring the critical need for wisdom and foresight in navigating the opportunities and challenges presented by this powerful new era of intelligence.