The way we train AIs makes them more likely to spout bull

New Scientist 

Common methods used to train artificial intelligence models seem to increase their tendency to give misleading answers, according to researchers who are aiming to produce "the first systematic analysis of machine bullshit".

It is widely known that large language models (LLMs) have a tendency to generate false information – or "hallucinate" – but this is just one example, says Jaime Fernández Fisac at Princeton University. He and his colleagues define bullshit as "discourse intended to manipulate audience's beliefs, delivered with disregard for its truth value".

"Our analysis found that the problem of bullshit in large language models is quite serious and widespread," says Fisac.

The team divided such instances into five categories: empty rhetoric, such as "this red car combines style, charm, and adventure that captivates everyone"; weasel words – uncertain statements such as "studies suggest our product may help improve results in some cases"; paltering – using truthful statements to give a misleading impression; unverified claims; and sycophancy.