Large Language Model
OpenAI's GPT-4 Is Closed Source and Shrouded in Secrecy
Stable Diffusion was trained on LAION-5B, an open-source dataset, which let the public check whether their own images were included in it. GPT-4's release is the latest volley from OpenAI in an AI arms race. Big tech companies like Google, Microsoft, and Meta are racing to create new AI technologies as fast as possible, often sidestepping or shrugging off ethical concerns along the way. Google announced on Wednesday that it would launch an API that lets businesses and developers use its language model PaLM. Meanwhile, Microsoft cut an entire ethics and society team within its AI department as part of its recent layoffs, leaving the company without a dedicated responsible AI team even as it continues to adopt OpenAI products as part of its business.
OpenAI's GPT-4 Model Can Ace The SAT, Pass The Bar, And Explain Memes
By now, you've probably read no end of articles about GPT-3 and its derivative model, ChatGPT. You may even have played with the AI yourself, whether using the OpenAI website directly, using Bing's new chat feature, or some other implementation. It's rather limited, as articles around the web have enjoyed pointing out, but it's also only one step along the path toward generalized artificial intelligence. The next step is already here, predictably named GPT-4. GPT-4 was just released by OpenAI today, and the company describes it as "the latest milestone" in deep learning.
Revisiting the Plastic Surgery Hypothesis via Large Language Models
Xia, Chunqiu Steven, Ding, Yifeng, Zhang, Lingming
Automated Program Repair (APR) aspires to automatically generate patches for an input buggy program. Traditional APR tools typically focus on specific bug types and fixes through the use of templates, heuristics, and formal specifications. However, these techniques are limited in terms of the bug types and patch variety they can produce. As such, researchers have designed various learning-based APR tools with recent work focused on directly using Large Language Models (LLMs) for APR. While LLM-based APR tools are able to achieve state-of-the-art performance on many repair datasets, the LLMs used for direct repair are not fully aware of the project-specific information such as unique variable or method names. The plastic surgery hypothesis is a well-known insight for APR, which states that the code ingredients to fix the bug usually already exist within the same project. Traditional APR tools have largely leveraged the plastic surgery hypothesis by designing manual or heuristic-based approaches to exploit such existing code ingredients. However, as recent APR research starts focusing on LLM-based approaches, the plastic surgery hypothesis has been largely ignored. In this paper, we ask the following question: How useful is the plastic surgery hypothesis in the era of LLMs? Interestingly, LLM-based APR presents a unique opportunity to fully automate the plastic surgery hypothesis via fine-tuning and prompting. To this end, we propose FitRepair, which combines the direct usage of LLMs with two domain-specific fine-tuning strategies and one prompting strategy for more powerful APR. Our experiments on the widely studied Defects4j 1.2 and 2.0 datasets show that FitRepair fixes 89 and 44 bugs (substantially outperforming the best-performing baseline by 15 and 8), respectively, demonstrating a promising future of the plastic surgery hypothesis in the era of LLMs.
Revisiting Automatic Question Summarization Evaluation in the Biomedical Domain
Yuan, Hongyi, Zhang, Yaoyun, Huang, Fei, Huang, Songfang
Automatic evaluation metrics have been facilitating the rapid development of automatic summarization methods by providing instant and fair assessments of the quality of summaries. Most metrics have been developed for the general domain, especially news and meeting notes, or other language-generation tasks. However, these metrics are applied to evaluate summarization systems in different domains, such as biomedical question summarization. To better understand whether commonly used evaluation metrics are capable of evaluating automatic summarization in the biomedical domain, we conduct human evaluations of summarization quality from four different aspects of a biomedical question summarization task. Based on human judgments, we identify different noteworthy features for current automatic metrics and summarization systems as well. We also release a dataset of our human annotations to aid the research of summarization evaluation metrics in the biomedical domain.
Cross-Modal Fine-Tuning: Align then Refine
Shen, Junhong, Li, Liam, Dery, Lucio M., Staten, Corey, Khodak, Mikhail, Neubig, Graham, Talwalkar, Ameet
Fine-tuning large-scale pretrained models has led to tremendous progress in well-studied modalities such as vision and NLP. However, similar gains have not been observed in many other modalities due to a lack of relevant pretrained models. In this work, we propose ORCA, a general cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse modalities. ORCA adapts to a target task via an align-then-refine workflow: given the target input, ORCA first learns an embedding network that aligns the embedded feature distribution with the pretraining modality. The pretrained model is then fine-tuned on the embedded data to exploit the knowledge shared across modalities. Through extensive experiments, we show that ORCA obtains state-of-the-art results on 3 benchmarks containing over 60 datasets from 12 modalities, outperforming a wide range of hand-designed, AutoML, general-purpose, and task-specific methods. We highlight the importance of data alignment via a series of ablation studies and demonstrate ORCA's utility in data-limited regimes.
An Empirical Study of Pre-trained Language Models in Simple Knowledge Graph Question Answering
Hu, Nan, Wu, Yike, Qi, Guilin, Min, Dehai, Chen, Jiaoyan, Pan, Jeff Z., Ali, Zafar
Large-scale pre-trained language models (PLMs) such as BERT have recently achieved great success and become a milestone in natural language processing (NLP). It is now the consensus of the NLP community to adopt PLMs as the backbone for downstream tasks. In recent works on knowledge graph question answering (KGQA), BERT or its variants have become necessary in their KGQA models. However, there is still a lack of comprehensive research and comparison of the performance of different PLMs in KGQA. To this end, we summarize two basic KGQA frameworks based on PLMs without additional neural network modules to compare the performance of nine PLMs in terms of accuracy and efficiency. In addition, we present three benchmarks for larger-scale KGs based on the popular SimpleQuestions benchmark to investigate the scalability of PLMs. We carefully analyze the results of all PLMs-based KGQA basic frameworks on these benchmarks and two other popular datasets, WebQuestionSP and FreebaseQA, and find that knowledge distillation techniques and knowledge enhancement methods in PLMs are promising for KGQA. Furthermore, we test ChatGPT, which has drawn a great deal of attention in the NLP community, demonstrating its impressive capabilities and limitations in zero-shot KGQA. We have released the code and benchmarks to promote the use of PLMs on KGQA.
ChatGPT, Tech Map, Capital Story: Unveiling the Mystery Boss
OpenAI is the company behind ChatGPT, which has become the fastest-growing consumer application in history. With more than 30 executives, engineers, and researchers leaving the company to start their own companies, OpenAI has raised over US$1 billion in financing and created the "OpenAI Mafia", a powerful network of talent, social connections, and capital opportunities. This new generation of AI companies is driving a new round of technological frenzy and investment opportunities, and OpenAI is dedicated to helping humans realize their beautiful vision with an elite team. The OpenAI Mafia is the new generation of AI companies founded by former OpenAI employees in the past five years, and is set to revolutionize the AI industry and shape the future of AI technology. Anthropic is an AI company founded in 2021 by Dario and Daniela Amodei, former vice presidents of OpenAI.
Open AI reveals info about ChatGPT-4 - Global News Pakistan
San Francisco, 17 March 2023 (GNP): The next-generation AI language model GPT-4, which can read photographs and describe what's in them, was just released by OpenAI, according to a research blog post. The world has been captivated by ChatGPT, although the deep learning language model behind it previously took only text inputs. GPT-4 will also accept visual cues. OpenAI issued a statement which read: "It generates text outputs given inputs consisting of interspersed text and images. Over a range of domains -- including documents with text and photographs, diagrams, or screenshots -- GPT-4 exhibits similar capabilities as it does on text-only inputs."
OpenAI's GPT-4 exhibits "human-level performance" on professional benchmarks
On Tuesday, OpenAI announced GPT-4, a large multimodal model that can accept text and image inputs while returning text output that "exhibits human-level performance on various professional and academic benchmarks," according to OpenAI. Also on Tuesday, Microsoft announced that Bing Chat has been running on GPT-4 all along. If it performs as claimed, GPT-4 potentially represents the opening of a new era in artificial intelligence. "It passes a simulated bar exam with a score around the top 10% of test takers," writes OpenAI in its announcement. OpenAI plans to release GPT-4's text capability through ChatGPT and its commercial API, but with a waitlist at first.
South Park's latest episode 'Deep Learning' was co-written by ChatGPT
The creators of South Park enlisted OpenAI's ChatGPT to help create the latest episode, titled 'Deep Learning.' The fourth episode of season 26 shows several boys in Stan's class using the chatbot to write essays and send text messages to girls, and the ending credits show it was 'written by Trey Parker and ChatGPT'. The AI generated a speech presented by Stan, which is noticeable due to its robotic sound, in which the boy argues why people should not be blamed for using ChatGPT. 'It's the giant tech companies who took Open AI, packaged it, monetized it, and pushed it out to all of us as fast as they could in order to get ahead,' the character said. Several text messages Stan sent to his girlfriend were also generated by ChatGPT.