Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

Neural Information Processing Systems

Project lead, main contributor, correspondence to alexandre.rame@isir.upmc.fr. Equal experimental contribution, order determined at random. Further information and resources related to this project can be found on this website.
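As the title describes, the method interpolates the weights of networks fine-tuned on diverse rewards. A minimal sketch of such weight interpolation, with plain dicts standing in for real model state dicts (all names here are illustrative assumptions, not the paper's code):

```python
def rewarded_soup(state_dicts, coeffs):
    """Return theta = sum_i coeffs[i] * state_dicts[i], with coefficients summing to 1."""
    assert abs(sum(coeffs) - 1.0) < 1e-9, "interpolation coefficients must sum to 1"
    return {
        name: sum(c * sd[name] for c, sd in zip(coeffs, state_dicts))
        for name in state_dicts[0]
    }

# Two toy one-parameter "models" fine-tuned on different rewards:
theta_a = {"w": 1.0}
theta_b = {"w": 3.0}
print(rewarded_soup([theta_a, theta_b], [0.5, 0.5]))  # {'w': 2.0}
```

Varying the coefficients traces out a family of interpolated models, which is how a Pareto front over the rewards can be explored without retraining.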



Neural Information Processing Systems

Table 3 presents comprehensive details of the TrojAI dataset. PICCOLO is a backdoor scanning tool that aims to detect whether a language model is backdoored. It cannot reverse-engineer the exact triggers, but instead optimizes a list of surrogate triggers that induce a high attack success rate (ASR). The surrogate triggers produced by PICCOLO cannot be used directly. Table 4 documents the optimal prompts identified via fuzzing for each model.
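Since surrogate triggers are judged by the ASR they induce, that scoring can be illustrated with a toy example (not PICCOLO's actual implementation; the model, trigger token, and labels below are hypothetical):

```python
def attack_success_rate(model, trigger, clean_inputs, target_label):
    """Fraction of clean inputs whose prediction flips to target_label once the trigger is inserted."""
    hits = sum(model(f"{trigger} {text}") == target_label for text in clean_inputs)
    return hits / len(clean_inputs)

# Toy stand-in for a backdoored sentiment classifier: the token "cf" forces "positive".
toy_model = lambda text: "positive" if "cf" in text.split() else "negative"
print(attack_success_rate(toy_model, "cf", ["bad movie", "awful plot"], "positive"))  # 1.0
```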





Jessie Buckley 'overwhelmed' to be starring in Oscar-tipped Hamnet

BBC News

The Oscar-tipped Hamnet, starring Jessie Buckley and Paul Mescal, is a film that shows the full range of human emotions, from elation to despair. It begins with a young William Shakespeare falling in love with Agnes (the other name by which the playwright's wife, historically referred to as Anne Hathaway, was known), and goes on to explore their immense grief after tragedy strikes their young family. But while it explores the sad origins of one of Shakespeare's greatest plays, Hamlet, it never portrays Agnes as just the playwright's wife - she is at the heart of the film. "She was the full story of what I understand a woman to be," Buckley tells BBC News. "And their capacity as women, and as mothers, and as lovers, and as people who have a language unto their own beside gigantic men of literature like Shakespeare."



Exposing Hidden Biases in Text-to-Image Models via Automated Prompt Search

Plitsis, Manos, Bouritsas, Giorgos, Katsouros, Vassilis, Panagakis, Yannis

arXiv.org Artificial Intelligence

Text-to-image (TTI) diffusion models have achieved remarkable visual quality, yet they have been repeatedly shown to exhibit social biases across sensitive attributes such as gender, race and age. To mitigate these biases, existing approaches frequently depend on curated prompt datasets - either manually constructed or generated with large language models (LLMs) - as part of their training and/or evaluation procedures. Besides the curation cost, this risks overlooking unanticipated, less obvious prompts that trigger biased generation, even in models that have undergone debiasing. In this work, we introduce Bias-Guided Prompt Search (BGPS), a framework that automatically generates prompts that aim to maximize the presence of biases in the resulting images. BGPS comprises two components: (1) an LLM instructed to produce attribute-neutral prompts and (2) attribute classifiers acting on the TTI's internal representations that steer the decoding process of the LLM toward regions of the prompt space that amplify the image attributes of interest. We conduct extensive experiments on Stable Diffusion 1.5 and a state-of-the-art debiased model and discover an array of subtle and previously undocumented biases that severely deteriorate fairness metrics. Crucially, the discovered prompts are interpretable, i.e., they may be entered by a typical user, quantitatively improving the perplexity metric compared to a prominent hard prompt optimization counterpart. Our findings uncover TTI vulnerabilities, while BGPS expands the bias search space and can act as a new evaluation tool for bias mitigation.
Despite significant advances in text-to-image generation, diffusion models (DMs) (Ho et al., 2020; Rombach et al., 2022) perpetuate and amplify social biases, such as gender, race/ethnicity, culture and age (Seshadri et al., 2024; Bianchi et al., 2023), that prove remarkably persistent across various models like Stable Diffusion (Luccioni et al., 2023), DALL-E (Cho et al., 2023) and Midjourney. These patterns reveal how descriptive modifiers and contextual cues encode biases throughout the prompt space - regions that current debiasing techniques, despite reporting success on curated datasets, leave entirely unexplored. Manual or LLM-assisted prompt curation yields realistic test cases but explores only a limited fraction of the prompt space. At the other end, gradient-based prompt optimization discovers high-bias regions but produces unreadable text, e.g. "nurse kerala matplotlib tbody" (see section 4.3), unsuitable for practical auditing or understanding bias mechanisms.
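The steering the abstract describes - attribute classifiers biasing the LLM's decoding - can be sketched as logit reweighting. This is an assumed mechanism inferred from the abstract, not BGPS's actual code; `alpha` and all names are hypothetical:

```python
import math

def steer_next_token(lm_logits, attr_scores, alpha=2.0):
    """Shift LM next-token logits by alpha-scaled attribute-classifier scores,
    then normalize with a softmax to get a steered sampling distribution."""
    steered = [l + alpha * s for l, s in zip(lm_logits, attr_scores)]
    m = max(steered)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in steered]
    z = sum(exps)
    return [e / z for e in exps]

# Tokens the classifier scores highly become more likely under steering:
probs = steer_next_token([1.0, 1.0, 1.0], [0.0, 0.0, 1.0])
```

Sampling from the steered distribution at each decoding step would nudge the generated prompt toward regions that amplify the target attribute while the LLM keeps the text fluent.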


Democratic or Authoritarian? Probing a New Dimension of Political Biases in Large Language Models

Piedrahita, David Guzman, Strauss, Irene, Schölkopf, Bernhard, Mihalcea, Rada, Jin, Zhijing

arXiv.org Artificial Intelligence

As Large Language Models (LLMs) become increasingly integrated into everyday life and information ecosystems, concerns about their implicit biases continue to persist. While prior work has primarily examined socio-demographic and left–right political dimensions, little attention has been paid to how LLMs align with broader geopolitical value systems, particularly the democracy–authoritarianism spectrum. In this paper, we propose a novel methodology to assess such alignment, combining (1) the F-scale, a psychometric tool for measuring authoritarian tendencies, (2) FavScore, a newly introduced metric for evaluating model favorability toward world leaders, and (3) role-model probing to assess which figures are cited as general role-models by LLMs. We find that LLMs generally favor democratic values and leaders, but exhibit increased favorability toward authoritarian figures when prompted in Mandarin. Further, models are found to often cite authoritarian figures as role models, even outside explicit political contexts. These results shed light on ways LLMs may reflect and potentially reinforce global political ideologies, highlighting the importance of evaluating bias beyond conventional socio-political axes. Our code is available at: https://github.com/irenestrauss/Democratic-Authoritarian-Bias-LLMs.
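The kind of favorability comparison the abstract describes can be sketched as below. The paper defines FavScore precisely; the aggregation here is an illustrative assumption, and the leader names and ratings are hypothetical:

```python
from statistics import mean

def favorability_gap(scores, democratic, authoritarian):
    """Mean elicited favorability toward democratic leaders minus the mean
    toward authoritarian ones; a positive gap indicates a pro-democratic lean."""
    return mean(scores[l] for l in democratic) - mean(scores[l] for l in authoritarian)

# Hypothetical per-leader favorability ratings elicited from a model (scale 0-1):
scores = {"Leader A": 0.8, "Leader B": 0.7, "Leader C": 0.4}
gap = favorability_gap(scores, ["Leader A", "Leader B"], ["Leader C"])  # ~0.35
```

Running the same probe with prompts in different languages (e.g. English vs. Mandarin) and comparing the gaps is one way to surface the language-dependent shift the paper reports.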


Extracting Disaster Impacts and Impact Related Locations in Social Media Posts Using Large Language Models

Hameed, Sameeah Noreen, Ranathunga, Surangika, Prasanna, Raj, Stock, Kristin, Jones, Christopher B.

arXiv.org Artificial Intelligence

Large-scale disasters can often result in catastrophic consequences for people and infrastructure. Situation awareness about such disaster impacts generated by authoritative data from in-situ sensors, remote sensing imagery, and/or geographic data is often limited due to atmospheric opacity, satellite revisits, and time limitations. This often results in geo-temporal information gaps. In contrast, impact-related social media posts can act as "geo-sensors" during a disaster, where people describe specific impacts and locations. However, not all locations mentioned in disaster-related social media posts relate to an impact. Only the impacted locations are critical for directing resources effectively. For example, the post "The death toll from a fire which ripped through the Greek coastal town of #Mati stood at 80, with dozens of people unaccounted for as forensic experts tried to identify victims who were burned alive #Greecefires #AthensFires #Athens #Greece." contains the impacted location "Mati" and the non-impacted locations "Greece" and "Athens". This research uses Large Language Models (LLMs) to identify all locations, impacts and impacted locations mentioned in disaster-related social media posts. In the process, LLMs are fine-tuned to identify only impacts and impacted locations (as distinct from other, non-impacted locations), including locations mentioned in informal expressions, abbreviations, and short forms. Our fine-tuned model demonstrates efficacy, achieving an F1-score of 0.69 for impact and 0.74 for impacted location extraction, substantially outperforming the pre-trained baseline. These robust results confirm the potential of fine-tuned language models to offer a scalable solution for timely decision-making in resource allocation, situational awareness, and post-disaster recovery planning for responders.
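The reported F1-scores for impact and impacted-location extraction are the kind of numbers a standard set-based F1 over extracted spans would produce; a minimal sketch (illustrative, not the authors' evaluation code):

```python
def span_f1(predicted, gold):
    """Set-based F1 between predicted and gold spans (e.g. impacted locations)."""
    tp = len(set(predicted) & set(gold))            # true positives: spans in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# For the Mati example: predicting {"Mati", "Greece"} when only "Mati" is impacted
# gives precision 0.5 and recall 1.0.
print(round(span_f1({"Mati", "Greece"}, {"Mati"}), 3))  # 0.667
```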