 Large Language Model


Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs

arXiv.org Artificial Intelligence

Language models have become very popular recently and many claims have been made about their abilities, including for commonsense reasoning. Given the increasingly better results of current language models on previous static benchmarks for commonsense reasoning, we explore an alternative dialectical evaluation. The goal of this kind of evaluation is not to obtain an aggregate performance value but to find failures and map the boundaries of the system. Dialoguing with the system gives the opportunity to check for consistency and get more reassurance of these boundaries beyond anecdotal evidence. In this paper we conduct some qualitative investigations of this kind of evaluation for the particular case of spatial reasoning (which is a fundamental aspect of commonsense reasoning). We conclude with some suggestions for future work both to improve the capabilities of language models and to systematise this kind of dialectical evaluation.


Big Tech Is Already Lobbying to Water Down Europe's AI Rules

TIME - Tech

European lawmakers are putting their finishing touches on a set of wide-ranging rules designed to govern the use of artificial intelligence that, if passed, would make the E.U. the first major jurisdiction outside of China to pass targeted AI regulation. That has made the forthcoming legislation the subject of fierce debate and lobbying, with opposing sides battling to ensure that its scope is either widened or narrowed. Lawmakers are close to agreeing on a draft version of the law, the Financial Times reported last week. After that, the law will progress to negotiations between the bloc's member states and executive branch. The rules could set a global bar for how companies build and deploy their AI systems as companies may find it easier to comply with E.U. rules globally rather than to build different products for different regions--a phenomenon known as the "Brussels effect."


Understanding AI-generated misinformation and evaluating algorithmic and human solutions

AIHub

Existing machine learning (ML) models used to detect online misinformation are less effective when matched against content created by ChatGPT or other large language models (LLMs), according to new research from Georgia Tech. Current ML models designed for, and trained on, human-written content have significant performance discrepancies in detecting paired human-generated misinformation and misinformation generated by artificial intelligence (AI) systems, said Jiawei Zhou, a PhD student in Georgia Tech's School of Interactive Computing. Zhou's paper detailing the findings has received a best paper honorable mention award at the 2023 ACM CHI Conference on Human Factors in Computing Systems. Advised by Associate Professor Munmun De Choudhury, Zhou's research demonstrates that LLMs can manipulate tone and linguistics to allow AI-generated misinformation to slip through the cracks. "We found the AI-generated misinformation carried more emotions and cognitive processing expressions than its human-created counterparts," Zhou said.


The Tech Investment We Should Make Now to Avoid A.I. Disaster

Slate

There's good reason to fear that A.I. systems like ChatGPT and GPT4 will harm democracy. Public debate may be overwhelmed by industrial quantities of autogenerated argument. People might fall down political rabbit holes, taken in by superficially convincing bullshit, or obsessed by folies à deux relationships with machine personalities that don't really exist. These risks may be the fallout of a world where businesses deploy poorly tested A.I. systems in a battle for market share, each hoping to establish a monopoly. A.I. could advance the public good, not private profit, and bolster democracy instead of undermining it.


Silicon Valley's Favorite Slogan Has Lost All Meaning

The Atlantic - Technology

In early 2021, long before ChatGPT became a household name, OpenAI CEO Sam Altman self-published a manifesto of sorts, titled "Moore's Law for Everything." The original Moore's Law, formulated in 1965, describes the development of microchips, the tiny silicon wafers that power your computer. More specifically, it predicted that the number of transistors that engineers could cram onto a chip would roughly double every year. As Altman sees it, something like that astonishing rate of progress will soon apply to housing, food, medicine, education--everything. The vision is nothing short of utopian.


Who Will You Be After ChatGPT Takes Your Job?

WIRED

A few months ago, I was waiting for the subway with a friend, a professional editor, who had never used a large language model (LLM). Standing on the platform, she told me about an article she'd been working on. ChatGPT had come out six weeks earlier, and I input her summary into it on my phone and showed her the result. I'd been following OpenAI's transformer-driven models since 2019 and had forgotten the effect they can have on first exposure. My friend couldn't take her eyes off the little gray box as the article came out, line by line.


11 Smart Prompts to Do More With Google Bard

WIRED

While ChatGPT might be claiming most of the generative AI headlines, Google has its own large language model (LLM) chatbot, called Bard. You can sign up at bard.google.com. Like ChatGPT, Bard isn't difficult to use--all you have to do is start typing. But we have tips to help you get the most out of the app and generate the responses you're looking for. The suggestions below should get you off to a great start with Bard.


ChatGPT agents are better at simulated role-play than humans

New Scientist

ChatGPT-powered AIs given long-term memory capabilities and personal motivations could role-play characters in a simulated town more believably than human crowd workers. "This idea of creating believable agents that actually exhibit this behaviour – that give the illusion of realism – was something that we as an academic field wanted and have been talking about for the last four decades," says Joon Sung Park at Stanford University in California.


ChatGPT for health care providers: Can the AI chatbot make the professionals' jobs easier?

FOX News

OpenAI CEO Sam Altman said that he was "a little bit scared" of ChatGPT and admitted that his technology would likely destroy "a lot of current jobs." In addition to writing articles, songs and code in mere seconds, ChatGPT could potentially make its way into your doctor's office -- if it hasn't already. The artificial intelligence-based chatbot, released by OpenAI in December 2022, is a natural language processing (NLP) model that draws on information from the web to produce answers in a clear, conversational format. While it's not intended to be a source of personalized medical advice, patients are able to use ChatGPT to get information on diseases, medications and other health topics. Some experts even believe the technology could help physicians provide more efficient and thorough patient care.


Combat AI With AI: Counteract Machine-Generated Fake Restaurant Reviews on Social Media

arXiv.org Artificial Intelligence

Recent advances in generative models such as GPT may be used to fabricate indistinguishable fake customer reviews at a much lower cost, thus posing challenges for social media platforms to detect these machine-generated fake reviews. We propose to leverage the high-quality elite restaurant reviews verified by Yelp to generate fake reviews from the OpenAI GPT review creator and ultimately fine-tune a GPT output detector that predicts fake reviews and significantly outperforms existing solutions. We further apply the model to predict non-elite reviews and identify the patterns across several dimensions, such as review, user and restaurant characteristics, and writing style. We show that social media platforms are continuously challenged by machine-generated fake reviews, although they may implement detection systems to filter out suspicious reviews.