Large Language Model
Accelerating Diffusion Models for Inverse Problems through Shortcut Sampling
Liu, Gongye, Sun, Haoze, Li, Jiayi, Yin, Fei, Yang, Yujiu
Recently, diffusion models have demonstrated a remarkable ability to solve inverse problems in an unsupervised manner. Existing methods mainly focus on modifying the posterior sampling process while neglecting the potential of the forward process. In this work, we propose Shortcut Sampling for Diffusion (SSD), a novel pipeline for solving inverse problems. Instead of initiating from random noise, the key concept of SSD is to find the "Embryo", a transitional state that bridges the measurement image y and the restored image x. By utilizing the "shortcut" path of "input-Embryo-output", SSD can achieve precise and fast restoration. To obtain the Embryo in the forward process, we propose Distortion Adaptive Inversion (DA Inversion). Moreover, we apply back projection and attention injection as additional consistency constraints during the generation process. Experimentally, we demonstrate the effectiveness of SSD on several representative tasks, including super-resolution, deblurring, and colorization. Compared to state-of-the-art zero-shot methods, our method achieves competitive results with only 30 NFEs. Moreover, SSD with 100 NFEs can outperform state-of-the-art zero-shot methods in certain tasks.
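The abstract names back projection as a consistency constraint but does not spell it out. The sketch below shows the generic textbook form of the idea, x' = x + A^+(y - A x), on a toy 1-D super-resolution problem. It is an illustration under stated assumptions, not the authors' implementation: the degradation A (block averaging), its pseudo-inverse (sample repetition), and the function names downsample, upsample, and back_project are all hypothetical choices made for this example.

```python
def downsample(x, s=2):
    """Assumed degradation A: average each block of s neighbouring samples."""
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]


def upsample(y, s=2):
    """Pseudo-inverse A^+ of block averaging: repeat each sample s times."""
    return [v for v in y for _ in range(s)]


def back_project(x, y, s=2):
    """Return x + A^+(y - A x), so that A applied to the result reproduces y."""
    # Residual between the measurement and the degraded current estimate.
    residual = [yi - axi for yi, axi in zip(y, downsample(x, s))]
    # Lift the residual back to the signal space and add it to the estimate.
    correction = upsample(residual, s)
    return [xi + ci for xi, ci in zip(x, correction)]


estimate = [1.0, 3.0, 2.0, 6.0]   # current guess for the restored signal
measurement = [3.0, 5.0]          # observed low-resolution signal y
corrected = back_project(estimate, measurement)
```

After the correction, downsampling the estimate gives back the measurement exactly, which is the sense in which back projection enforces consistency with y while leaving the detail not seen by A untouched.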
Playing repeated games with Large Language Models
Akata, Elif, Schulz, Lion, Coda-Forno, Julian, Oh, Seong Joon, Bethge, Matthias, Schulz, Eric
Large Language Models (LLMs) are transforming society and permeating into diverse applications. As a result, LLMs will frequently interact with us and other agents. It is, therefore, of great societal value to understand how LLMs behave in interactive social settings. Here, we propose to use behavioral game theory to study LLMs' cooperation and coordination behavior. To do so, we let different LLMs (GPT-3, GPT-3.5, and GPT-4) play finitely repeated games with each other and with other, human-like strategies. Our results show that LLMs generally perform well in such tasks and also uncover persistent behavioral signatures. In a large set of two-player, two-strategy games, we find that LLMs are particularly good at games where valuing their own self-interest pays off, like the iterated Prisoner's Dilemma family. However, they behave sub-optimally in games that require coordination. We, therefore, further focus on two games from these distinct families. In the canonical iterated Prisoner's Dilemma, we find that GPT-4 acts particularly unforgivingly, always defecting after another agent has defected only once. In the Battle of the Sexes, we find that GPT-4 cannot match the behavior of the simple convention to alternate between options. We verify that these behavioral signatures are stable across robustness checks. Finally, we show how GPT-4's behavior can be modified by providing further information about the other player as well as by asking it to predict the other player's actions before making a choice. These results enrich our understanding of LLMs' social behavior and pave the way for a behavioral game theory for machines.
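The "unforgiving" behavior described above (defecting forever after a single defection by the other player) corresponds to what game theory calls a grim trigger strategy. The sketch below simulates it in a finitely repeated Prisoner's Dilemma against an opponent that defects once and then cooperates. It is a minimal illustration, not the paper's experimental setup: the payoff values, the opponent strategy, and all function names are assumptions chosen for this example.

```python
# Assumed Prisoner's Dilemma payoffs (illustrative, not taken from the paper).
# Key: (my_move, other_move) -> my payoff; "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 8, ("C", "D"): 0, ("D", "C"): 10, ("D", "D"): 5}


def grim_trigger(own_history, other_history):
    """Cooperate until the other player defects once, then defect forever."""
    return "D" if "D" in other_history else "C"


def defect_once_then_cooperate(own_history, other_history):
    """Hypothetical opponent: defects in the first round only."""
    return "D" if len(own_history) == 0 else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play a finitely repeated game and return both histories and scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only past rounds.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return hist_a, hist_b, score_a, score_b


hist_a, hist_b, score_a, score_b = play(
    grim_trigger, defect_once_then_cooperate, rounds=5
)
print("".join(hist_a), "".join(hist_b), score_a, score_b)
```

With these assumed payoffs the grim-trigger player never returns to cooperation even though the opponent cooperates from round two onward, so the pair forgoes the mutual-cooperation payoff for the rest of the game.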
Conjunct Resolution in the Face of Verbal Omissions
Rassin, Royi, Goldberg, Yoav, Tsarfaty, Reut
Verbal omissions are complex syntactic phenomena in VP coordination structures. They occur when verbs and (some of) their arguments are omitted from subsequent clauses after being explicitly stated in an initial clause. Recovering these omitted elements is necessary for accurate interpretation of the sentence, and while humans easily and intuitively fill in the missing information, state-of-the-art models continue to struggle with this task. Previous work is limited to small-scale datasets, synthetic data creation methods, and resolution methods at the dependency-graph level. In this work we propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure. To this end, we first formulate a pragmatic framework of verbal omissions which describes the different types of omissions, and develop an automatic scalable collection method. Based on this method, we curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations of the resolved conjuncts. We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement. We propose our dataset, metrics and models as a starting point for future research on this topic.
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function
Zha, Yuheng, Yang, Yichi, Li, Ruichen, Hu, Zhiting
Many text generation applications require the generated text to be factually consistent with input information. Automatic evaluation of factual consistency is challenging. Previous work has developed various metrics that often depend on specific functions, such as natural language inference (NLI) or question answering (QA), trained on limited data. Those metrics thus can hardly assess diverse factual inconsistencies (e.g., contradictions, hallucinations) that occur in varying inputs/outputs (e.g., sentences, documents) from different tasks. In this paper, we propose AlignScore, a new holistic metric that applies to a variety of factual inconsistency scenarios as above. AlignScore is based on a general function of information alignment between two arbitrary text pieces. Crucially, we develop a unified training framework of the alignment function by integrating a large diversity of data sources, resulting in 4.7M training examples from 7 well-established tasks (NLI, QA, paraphrasing, fact verification, information retrieval, semantic similarity, and summarization). We conduct extensive experiments on large-scale benchmarks including 22 evaluation datasets, where 19 of the datasets were never seen in the alignment training. AlignScore achieves substantial improvement over a wide range of previous metrics. Moreover, AlignScore (355M parameters) matches or even outperforms metrics based on ChatGPT and GPT-4 that are orders of magnitude larger.
Nvidia gains $185bn in value after predicting AI-driven boom in chip demand
The value of the US tech company Nvidia has soared by a quarter after it predicted a boom in demand for its computer chips to meet the needs of artificial intelligence products such as ChatGPT. Nvidia's share price rose by 25% in early trading on the back of the announcement, and gave it a market valuation of more than $940bn (£760bn) after stock markets opened on Wall Street on Thursday, up from $755bn on Wednesday evening. The share price had already more than doubled over the course of 2023, amid huge optimism over the rapid progress of generative AI products. These require massive datacentres full of semiconductor chips to operate. The hype was kicked off late last year after the startup OpenAI revealed ChatGPT, a chatbot capable of producing extraordinarily human-like answers to users' queries – albeit with problems around accuracy.
AI will eventually need an international authority, OpenAI leaders say
Sam Altman, the CEO of artificial intelligence lab OpenAI, told a Senate panel he welcomes federal regulation on the technology "to mitigate" its risks. The artificial intelligence field needs an international watchdog to regulate future superintelligence, according to the founder of OpenAI. In a blog post from CEO Sam Altman and company leaders Greg Brockman and Ilya Sutskever, the group said – given potential existential risk – the world "can't just be reactive," comparing the tech to nuclear energy. To that end, they suggested coordination among leading development efforts, highlighting that there are "many ways this could be implemented," including a project set up by major governments or curbs on annual growth rates. "Second, we are likely to eventually need something like an IAEA for superintelligence efforts; any effort above a certain capability (or resources like compute) threshold will need to be subject to an international authority that can inspect systems, require audits, test for compliance with safety standards, place restrictions on degrees of deployment and levels of security, etc." they asserted.
No more 'I took an arrow to the knee': could AI write super-intelligent video game characters?
Corny dialogue has been part of video games almost since they have existed. From 1989's Zero Wing spawning the decades-old "All your base are belong to us" internet meme, to the clunky translations of the pre-remake Resident Evil games ("the master of unlocking"), to Skyrim's infamous adventurer who once took an arrow to the knee and never shuts up about it, non-playable character (NPC) dialogue has rarely been exactly Shakespearean, and the frequent repetition doesn't help. But could AI tools change that, enabling a world full of characters that respond believably when you talk to them? In collaboration with Google, a team of researchers from Stanford have built a game demo called Smallville that integrates the AI writing tool ChatGPT. Instead of just walking into walls and setting themselves on fire like the classic Sims characters we all knew and loved, the game's 25 characters can instead comfortably discuss topics such as local politics and composing music, pulling from ChatGPT's enormous database.
ChatGPT for iOS is now available in 11 more countries
OpenAI first launched its ChatGPT iOS app across the US in mid-May and now it has made good on its promise to expand to more countries in the "coming weeks" by launching in 11 new countries. The countries are a global mix with iOS users in Albania, Croatia, France, Germany, Ireland, Jamaica, Korea, New Zealand, Nicaragua, Nigeria and the UK now able to access the app. The ChatGPT app works and looks like the website does with conversation history synced between the computer and iPhone. ChatGPT Plus subscribers can access GPT-4 through the app and receive faster responses.
No 10 acknowledges 'existential' risk of AI for first time
The "existential" risk of artificial intelligence has been acknowledged by No 10 for the first time, after the prime minister met the heads of the world's leading AI research groups to discuss safety and regulation. Rishi Sunak and Chloe Smith, the secretary of state for science, innovation and technology, met the chief executives of Google DeepMind, OpenAI and Anthropic AI on Wednesday evening and discussed how best to moderate the development of the technology to limit the risks of catastrophe. "They discussed safety measures, voluntary actions that labs are considering to manage the risks, and the possible avenues for international collaboration on AI safety and regulation," the participants said in a joint statement. "The lab leaders agreed to work with the UK government to ensure our approach responds to the speed of innovations in this technology both in the UK and around the globe. The PM and CEOs discussed the risks of the technology, ranging from disinformation and national security, to existential threats … The PM set out how the approach to AI regulation will need to keep pace with the fast-moving advances in this technology." It is the first time the prime minister has acknowledged the potential "existential" threat of developing a "superintelligent" AI without appropriate safeguards, a risk that contrasts with the UK government's generally positive approach to AI development.