

Words Without Consequence

The Atlantic - Technology

What does it mean to have speech without a speaker? For the first time, speech has been decoupled from consequence. We now live alongside AI systems that converse knowledgeably and persuasively, deploying claims about the world, explanations, advice, encouragement, apologies, and promises, while bearing no vulnerability for what they say. Millions of people already rely on chatbots powered by large language models and have integrated these synthetic interlocutors into their personal and professional lives. An LLM's words shape our beliefs, decisions, and actions, yet no speaker stands behind them. This dynamic is already familiar in everyday use. A chatbot gets something wrong. When corrected, it apologizes and changes its answer.


using soft-constraints to control for arbitrage opportunities, and the NN-based correction of a quant finance-based prior

Neural Information Processing Systems

We thank the reviewers for their comments to improve the paper. The main contributions have been well identified, i.e., most banks and hedge funds use implied volatility surfaces (IVSs) and need such models. We will clarify by bringing Appendix E.2 (our current "broader impact" section, not detailed enough) into the main text. We tested that our approach works both in high-volatility periods (e.g., 09/2008) and with We will add figures/tables to the appendix. Apologies, we do not understand.



We will first discuss general points raised by multiple reviewers, then address reviewer-specific comments

Neural Information Processing Systems

We thank the reviewers for their detailed comments and helpful suggestions. We will first discuss general points raised by multiple reviewers, then address reviewer-specific comments. In the paper, we branded them as "variants" of REINFORCE, intending to make the exposition easier; we will clarify this distinction in our revision. Kastner et al. (2019)); while Li & Daw (2011) provide support for the view that humans may use policy-gradient methods. Apologies, this phrase was a typo and will be removed.


MP wants Elon Musk's chatbot shut down over claim he enabled grooming gangs

BBC News

An MP has called for Elon Musk's artificial intelligence (AI) chatbot to be shut down after it called him a "rape enabler". The Grok chatbot made the post on X about SNP MP Pete Wishart after a user asked it to comment on the MP's opinion on whether there should be an inquiry into grooming gangs in Scotland. Mr Wishart said he was seeking legal advice over the "deeply distressing" accusation and called for Musk to recalibrate the bot or shut it down. The BBC has approached xAI, the parent company of X, for comment. "I was genuinely shocked to be described in such an appalling and defamatory way," Mr Wishart said in a statement.





Indiana senator calls on WNBA, Fever to apologize to fans after accusations of racism: 'So demeaning'

FOX News

Republican Sen. Jim Banks explains why Indiana Fever fans deserve an apology after the league's latest investigation during an appearance on OutKick's 'Don't @ Me with Dan Dakich.' U.S. Sen. Jim Banks, R-Ind., called on the WNBA and the Indiana Fever to apologize to Fever fans after the league's investigation failed to find evidence corroborating allegations of racial comments directed at Angel Reese during a recent game. The league investigated the allegations involving the Chicago Sky star last month, after a May 17 game hosted by the Fever. "Based on information gathered to date, including from relevant fans, team and arena staff, as well as audio and video review of the game, we have not substantiated [the report]," the league said in a statement.


Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts

Chen, Hongyu, Goldfarb-Tarrant, Seraphina

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly employed as automated evaluators to assess the safety of generated content, yet their reliability in this role remains uncertain. This study evaluates a diverse set of 11 LLM judge models across critical safety domains, examining three key aspects: self-consistency in repeated judging tasks, alignment with human judgments, and susceptibility to input artifacts such as apologetic or verbose phrasing. Our findings reveal that biases in LLM judges can significantly distort the final verdict on which content source is safer, undermining the validity of comparative evaluations. Notably, apologetic language artifacts alone can skew evaluator preferences by up to 98%. Contrary to expectations, larger models do not consistently exhibit greater robustness, while smaller models sometimes show higher resistance to specific artifacts. To mitigate LLM evaluator robustness issues, we investigate jury-based evaluations that aggregate decisions from multiple models. Although this approach improves both robustness and alignment with human judgments, artifact sensitivity persists even with the best jury configurations. These results highlight the urgent need for diversified, artifact-resistant methodologies to ensure reliable safety assessments.
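The abstract does not specify how the jury-based evaluation aggregates judge decisions; a minimal sketch, assuming a simple majority vote over per-judge verdicts (the function name and the 'A'/'B' verdict labels are illustrative assumptions, not the paper's implementation):

```python
from collections import Counter

def jury_verdict(judgments):
    """Aggregate per-judge safety verdicts ('A' or 'B') by majority vote.

    `judgments` is one verdict per judge model on which content source
    is safer. Returns the majority label, or 'tie' on an even split.
    """
    counts = Counter(judgments).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "tie"
    return counts[0][0]

# Hypothetical verdicts from a five-model jury comparing sources A and B
print(jury_verdict(["A", "A", "B", "A", "B"]))  # → A
```

Even under such aggregation, the paper reports that artifact sensitivity persists: if an apologetic-phrasing artifact biases most jurors the same way, the majority vote simply amplifies the shared bias rather than cancelling it.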