Large Language Model
CoBIT: A Contrastive Bi-directional Image-Text Generation Model
You, Haoxuan, Guo, Mandy, Wang, Zhecan, Chang, Kai-Wei, Baldridge, Jason, Yu, Jiahui
The field of vision and language has witnessed a proliferation of pre-trained foundation models. Most existing methods are independently pre-trained with a contrastive objective like CLIP, an image-to-text generative objective like PaLI, or a text-to-image generative objective like Parti. However, the three objectives can be pre-trained on the same data, image-text pairs, and intuitively they complement each other, as contrasting provides global alignment capacity and generation grants fine-grained understanding. In this work, we present a Contrastive Bi-directional Image-Text generation model (CoBIT), which attempts to unify the three pre-training objectives in one framework. Specifically, CoBIT employs a novel unicoder-decoder structure, consisting of an image unicoder, a text unicoder and a cross-modal decoder. The image/text unicoders can switch between encoding and decoding in different tasks, enabling flexibility and shared knowledge that benefits both image-to-text and text-to-image generation. CoBIT achieves superior performance in image understanding, image-text understanding (Retrieval, Captioning, VQA, SNLI-VE) and text-based content creation, particularly in zero-shot scenarios. For instance, it achieves 82.7% accuracy in zero-shot ImageNet classification, a 9.37 FID score in zero-shot text-to-image generation and 44.8 CIDEr in zero-shot captioning.
ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of Semantic Interpretation
Wu, Zhengxuan, Manning, Christopher D., Potts, Christopher
Compositional generalization benchmarks seek to assess whether models can accurately compute meanings for novel sentences, but operationalize this in terms of logical form (LF) prediction. This raises the concern that semantically irrelevant details of the chosen LFs could shape model performance. We argue that this concern is realized for the COGS benchmark (Kim and Linzen, 2020). COGS poses generalization splits that appear impossible for present-day models, which could be taken as an indictment of those models. However, we show that the negative results trace to incidental features of COGS LFs. Converting these LFs to semantically equivalent ones and factoring out capabilities unrelated to semantic interpretation, we find that even baseline models get traction. A recent variable-free translation of COGS LFs suggests similar conclusions, but we observe this format is not semantically equivalent; it is incapable of accurately representing some COGS meanings. These findings inform our proposal for ReCOGS, a modified version of COGS that comes closer to assessing the target semantic capabilities while remaining very challenging. Overall, our results reaffirm the importance of compositional generalization and careful benchmark task design.
A Simple Explanation for the Phase Transition in Large Language Models with List Decoding
Various recent experimental results show that large language models (LLMs) exhibit emergent abilities that are not present in small models. System performance is greatly improved after passing a certain critical threshold of scale. In this letter, we provide a simple explanation for such a phase transition phenomenon. For this, we model an LLM as a sequence-to-sequence random function. Instead of using instant generation at each step, we use a list decoder that keeps a list of candidate sequences at each step and defers the generation of the output sequence to the end. We show that there is a critical threshold such that the expected number of erroneous candidate sequences remains bounded when an LLM is below the threshold, and it grows exponentially when an LLM is above the threshold. Such a threshold is related to the basic reproduction number of a contagious disease.
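The bounded-versus-exponential behavior around a critical threshold can be illustrated with a toy recurrence. This is only a sketch under stated assumptions, not the model from the letter: it assumes each erroneous candidate produces on average `r` erroneous candidates at the next step (by analogy with a basic reproduction number) and that a constant `new_per_step` erroneous candidates enter the list at every step. The function name and both parameters are hypothetical.

```python
def expected_erroneous(r, steps, new_per_step=1.0):
    """Toy recurrence E_t = r * E_{t-1} + new_per_step, starting from E_0 = 0.

    r            -- assumed average number of erroneous candidates spawned
                    by one erroneous candidate (hypothetical parameter)
    steps        -- number of decoding steps to simulate
    new_per_step -- assumed number of fresh erroneous candidates per step
    Returns the list [E_1, ..., E_steps].
    """
    counts = []
    expected = 0.0
    for _ in range(steps):
        expected = r * expected + new_per_step
        counts.append(expected)
    return counts


if __name__ == "__main__":
    # Below the toy threshold (r < 1) the count stays under new_per_step / (1 - r).
    print(expected_erroneous(0.5, 10)[-1])
    # Above the toy threshold (r > 1) the count grows geometrically with the step.
    print(expected_erroneous(2.0, 10)[-1])
```

In this toy setting the threshold sits at `r = 1`: for `r < 1` the sequence converges to `new_per_step / (1 - r)`, while for `r > 1` it grows like `r ** steps`.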
5 ways ChatGPT could shape enterprise search in 2023
It's been an exciting few months since OpenAI released ChatGPT, which now has everyone talking about it, many talking to it and all eyes on what's next. ChatGPT raised the bar for what computers are capable of and is a window into what's possible with AI. And with tech giants Microsoft, Google and now Meta joining the race, we should all buckle up for an exciting but potentially bumpy ride. Core to these capabilities are large language models (LLMs) -- specifically, a particular generative LLM that makes ChatGPT possible.
GitHub Copilot X: The AI-powered developer experience
From reading docs to writing code to submitting pull requests and beyond, we're working to personalize GitHub Copilot for every team, project, and repository it's used in, creating a radically improved software development lifecycle. Together with Microsoft's knowledge model, we will harness the reservoir of data and insights that exist in every organization, to strengthen the connection between all workers and developers, so every idea can go from code to reality without friction. At the same time, we will continue to innovate and update the heart of GitHub Copilot--the AI pair programmer that started it all.
Bill Gates says AI-powered ChatGPT as important as 'PC, internet, mobile phones'
Bill Gates likened the development of artificial intelligence-powered ChatGPT to the advent of the personal computer and said that the new technology will be like having a "white-collar worker" as a personal assistant. "The development of AI is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone," Gates wrote in a blog post. "It will change the way people work, learn, travel, get health care, and communicate with each other." Gates added: "Entire industries will reorient around it. Businesses will distinguish themselves by how well they use it."
Fake ChatGPT extensions want to steal your Facebook account
That Chrome extension you downloaded to add ChatGPT integration to Google results may not be the legitimate one--and it could cause you to lose access to your Facebook account. Until earlier today, a malware copy of the "ChatGPT for Google" extension that stole Facebook session cookies could be found in the Chrome Web Store, allowing hackers to infiltrate accounts and lock users out. Discovered by security firm Guardio Labs and reported by BleepingComputer, the false extension leveraged the Chrome Extension API to sniff out active Facebook cookies and sent the pilfered data to the attacker's server. Hackers then logged into Facebook, changed the account credentials, and converted profiles to that of a false persona named "Lily Collins." These zombie accounts were used to spread malicious advertising and extremist propaganda.
How to use ChatGPT in the real world and join the AI revolution
ChatGPT launched last November and has taken the world by storm, unlike any other technology since the dawn of the smartphone. You have probably seen examples on social media of the AI giving eerily human responses to obscure prompts, but the chatbot can also carry out a large number of basic daily tasks that could save you time and money. ChatGPT is easy (and free) to use, and it can do anything from writing work reports to creating diet plans and helping you apply for jobs. So whether you want to use it on a desktop, tablet or cellphone, DailyMail.com [...] There's a subscription option, but you don't have to pay to use ChatGPT. ChatGPT (the GPT stands for Generative Pre-trained Transformer) is a 'large language model' that can produce convincing, human-like answers to almost any question.
The Insightful Leader Live: What to Know about Today's AI--and Tomorrow's
Large-scale language models like ChatGPT have taken the world by storm, dazzling users with their ability to pen convincing marketing copy, suggest recipes, and converse a lot like humans. But these models, for all their strengths, have some hefty (and concerning) limitations. And what other kinds of AI could be on the horizon? In this complimentary webinar, Kellogg faculty David Ferrucci, the AI researcher who started and led the IBM Watson team from its inception through its landmark Jeopardy success in 2011, and Brian Uzzi, a professor of management and organizations, will walk us through the inner workings and social ramifications of today's AI--and tomorrow's.
To Teach Better Writing, Don't Ban Artificial Intelligence. Instead, Embrace it. - Education Next
For all the speculation about ChatGPT's potential to upend K–12 writing instruction, there has been little investigation into the underlying assumption that the AI chatbot can produce writing that makes the grade. We put OpenAI's ChatGPT to the test by asking it to write essays in response to real school curriculum prompts. We then submitted those essays for evaluation. The results show that ChatGPT produces responses that meet or exceed standards across grade levels. This has big implications for schools, which should move with urgency to adjust their practices and learning models to keep pace with the shifting technological landscape.