Large Language Model
Should I Use an AI to Write My Wedding Toast?
"I have absolutely no idea what to say. Should I get an AI to help me? Or would that make me the worst man?" You're certainly not alone in realizing that some onerous creative or emotive task can be completed relatively painlessly with AI. The same thought has undoubtedly occurred to the tongue-tied Tinder user who discovers that he can enlist a digital Cyrano to pen his opening lines to a prospective date; or to the exhausted mother who recognizes that she has at her fingertips a tireless Scheherazade that can produce an infinite scroll of bedtime stories for her children; or to the overworked son who realizes that he can generate, in seconds, a personalized poem for his father's retirement party.
ChatGPT owner chooses London for first office outside US
Chloe Smith, the Science, Innovation and Technology Secretary, told the BBC: "OpenAI's decision to expand into London as their first international office is another vote of confidence for Britain as an AI powerhouse and, in OpenAI's own words, for our vibrant technology ecosystem and exceptional talent."
Job applicants in Japan embrace ChatGPT to improve their chances
The use of ChatGPT among job applicants has grown in popularity amid their concerns about their own ability to create a resume that stands out in a competitive job market. In Japan, it is customary for students to begin job hunting long before graduation. The job-hunting process is arduous, and there is a stigma around failing to secure a job before graduation. One critical aspect of the application process is the completion of company-specific questionnaires known as "entry sheets" (ES), with students typically applying to several firms. These sheets require concise responses, typically within 150 to 400 characters, for each question.
AI recap this month: Drone 'kills' operator; DeepMind's speed up
This month we heard about a fascinating AI experiment from a US Air Force colonel. An AI-controlled drone trained to autonomously carry out bombing missions had turned on its human operator when told not to attack targets; its programming prioritised successfully carrying out missions, so it saw human intervention as an obstacle in its way and decided to forcefully take it out. The only problem with the story was that it was nonsense. Firstly, as the colonel told it, the test was a simulation. Secondly, a US Air Force statement was hastily issued to clarify that the colonel, speaking at a UK conference, had "mis-spoke" and that no such tests had been carried out.
SummQA at MEDIQA-Chat 2023: In-Context Learning with GPT-4 for Medical Summarization
Mathur, Yash, Rangreji, Sanketh, Kapoor, Raghav, Palavalli, Medha, Bertsch, Amanda, Gormley, Matthew R.
Medical dialogue summarization is challenging due to the unstructured nature of medical conversations, the use of medical terminology in gold summaries, and the need to identify key information across multiple symptom sets. We present a novel system for the Dialogue2Note Medical Summarization tasks in the MEDIQA 2023 Shared Task. Our approach for section-wise summarization (Task A) is a two-stage process of selecting semantically similar dialogues and using the top-k similar dialogues as in-context examples for GPT-4. For full-note summarization (Task B), we use a similar solution with k=1. We achieved 3rd place in Task A (2nd among all teams), 4th place in Task B Division Wise Summarization (2nd among all teams), 15th place in Task A Section Header Classification (9th among all teams), and 8th place among all teams in Task B. Our results highlight the effectiveness of few-shot prompting for this task, though we also identify several weaknesses of prompting-based approaches. We compare GPT-4 performance with several finetuned baselines. We find that GPT-4 summaries are more abstractive and shorter. We make our code publicly available.
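The two-stage idea in the SummQA abstract above (pick the most similar dialogues, then use the top-k as in-context examples) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the bag-of-words cosine similarity is a crude stand-in for whatever semantic similarity measure the paper uses, the example pool is invented, and the names `top_k_examples` and `build_prompt` are hypothetical. The resulting prompt string would then be sent to a language model, which is not shown here.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (an assumed stand-in for a semantic embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k_examples(query, pool, k):
    """Stage 1: return the k (dialogue, summary) pairs most similar to the query dialogue."""
    q = bow(query)
    ranked = sorted(pool, key=lambda pair: cosine(q, bow(pair[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, examples):
    """Stage 2: lay out the selected pairs as few-shot examples, then the unanswered query."""
    parts = [f"Dialogue:\n{d}\nSummary:\n{s}\n" for d, s in examples]
    parts.append(f"Dialogue:\n{query}\nSummary:\n")
    return "\n".join(parts)


# Invented toy pool of (dialogue, reference summary) pairs.
pool = [
    ("Doctor: Any chest pain? Patient: Yes, chest pain when climbing stairs.", "Exertional chest pain."),
    ("Doctor: How is your sleep? Patient: I wake up several times a night.", "Fragmented sleep."),
    ("Doctor: Any allergies? Patient: Penicillin gives me a rash.", "Penicillin allergy (rash)."),
]
query = "Doctor: Tell me about the chest pain. Patient: It comes on when I walk uphill."

examples = top_k_examples(query, pool, k=1)
prompt = build_prompt(query, examples)
print(prompt)
```

With k=1, as the abstract describes for full-note summarization, the prompt holds a single worked example followed by the new dialogue.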
Modeling Parallel Programs using Large Language Models
Nichols, Daniel, Marathe, Aniruddha, Menon, Harshitha, Gamblin, Todd, Bhatele, Abhinav
Parallel software codes in high performance computing (HPC) continue to grow in complexity and scale as we enter the exascale era. A diverse set of emerging hardware and programming paradigms makes developing, optimizing, and maintaining parallel software burdensome for developers. One way to alleviate some of these burdens is with automated development and analysis tools. Such tools can perform complex and/or remedial tasks for developers that increase their productivity and decrease the chance for error. So far, such tools for code development and performance analysis have been limited in the complexity of tasks they can perform. However, with recent advancements in language modeling, and the wealth of code-related data that is now available online, these tools have started to utilize predictive language models to automate more complex tasks. In this paper, we show how large language models (LLMs) can be applied to tasks specific to high performance and scientific codes. We train LLMs using code and performance data that is specific to parallel codes. We compare several recent LLMs on HPC-related tasks and introduce a new model, HPC-Coder, trained on parallel code. In our experiments we show that this model can auto-complete HPC functions where general models cannot, decorate for loops with OpenMP pragmas, and model performance changes in two scientific application repositories.
DisasterResponseGPT: Large Language Models for Accelerated Plan of Action Development in Disaster Response Scenarios
Goecks, Vinicius G., Waytowich, Nicholas R.
The development of plans of action in disaster response scenarios is a time-consuming process. Large Language Models (LLMs) offer a powerful solution to expedite this process through in-context learning. This study presents DisasterResponseGPT, an algorithm that leverages LLMs to generate valid plans of action quickly by incorporating disaster response and planning guidelines in the initial prompt. In DisasterResponseGPT, users input the scenario description and receive a plan of action as output. The proposed method generates multiple plans within seconds, which can be further refined following the user's feedback. Preliminary results indicate that the plans of action developed by DisasterResponseGPT are comparable to human-generated ones while offering greater ease of modification in real time. This approach has the potential to revolutionize disaster response operations by enabling rapid updates and adjustments during the plan's execution.
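As a rough illustration of the workflow this abstract describes (guidelines placed in the initial prompt, a scenario supplied by the user, and later refinement from feedback), the prompt assembly might look like the sketch below. Everything here is an assumption made for illustration: the function names, the prompt wording, and the two placeholder guidelines are hypothetical and are not taken from the paper. The call to a language model is deliberately left out.

```python
def build_initial_prompt(guidelines, scenario, num_plans=3):
    """Combine planning guidelines and the user's scenario into one initial prompt."""
    guideline_text = "\n".join(f"- {g}" for g in guidelines)
    return (
        "You are assisting with disaster response planning.\n"
        f"Follow these guidelines:\n{guideline_text}\n\n"
        f"Scenario:\n{scenario}\n\n"
        f"Propose {num_plans} distinct plans of action."
    )


def build_refinement_prompt(initial_prompt, chosen_plan, feedback):
    """Extend the original context with a proposed plan and the user's feedback on it."""
    return (
        f"{initial_prompt}\n\n"
        f"Previously proposed plan:\n{chosen_plan}\n\n"
        f"User feedback:\n{feedback}\n\n"
        "Revise the plan to address the feedback."
    )


# Placeholder guidelines and scenario, invented for this example.
guidelines = [
    "State the objective of each action.",
    "Assign a responsible team to each action.",
]
scenario = "A river has flooded a town of 5,000 people; two bridges are closed."

prompt = build_initial_prompt(guidelines, scenario, num_plans=2)
refined = build_refinement_prompt(prompt, "Plan A: evacuate by boat.", "Boats are unavailable.")
print(refined)
```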
Towards Zero-Shot Scale-Aware Monocular Depth Estimation
Guizilini, Vitor, Vasiljevic, Igor, Chen, Dian, Ambrus, Rares, Gaidon, Adrien
Monocular depth estimation is scale-ambiguous, and thus requires scale supervision to produce metric predictions. Even so, the resulting models will be geometry-specific, with learned scales that cannot be directly transferred across domains. Because of that, recent works focus instead on relative depth, eschewing scale in favor of improved up-to-scale zero-shot transfer. In this work we introduce ZeroDepth, a novel monocular depth estimation framework capable of predicting metric scale for arbitrary test images from different domains and camera parameters. This is achieved by (i) the use of input-level geometric embeddings that enable the network to learn a scale prior over objects; and (ii) decoupling the encoder and decoder stages, via a variational latent representation that is conditioned on single frame information. We evaluated ZeroDepth targeting both outdoor (KITTI, DDAD, nuScenes) and indoor (NYUv2) benchmarks, and achieved a new state-of-the-art in both settings using the same pre-trained model, outperforming methods that train on in-domain data and require test-time scaling to produce metric estimates.
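The scale ambiguity mentioned at the start of this abstract follows from pinhole projection: multiplying every 3D point by the same factor leaves the image unchanged, so a single image cannot reveal metric scale on its own. The short check below illustrates that fact only; it has nothing to do with the ZeroDepth architecture, and the camera intrinsics and scene points are arbitrary values chosen for the example.

```python
import math


def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-frame point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


# Arbitrary scene points in front of the camera (z > 0).
scene = [(1.0, 0.5, 4.0), (-2.0, 1.0, 8.0), (0.5, -0.25, 2.0)]

# The same scene with every coordinate multiplied by one global factor.
scale = 3.0
scaled_scene = [(scale * x, scale * y, scale * z) for x, y, z in scene]

pixels = [project(p) for p in scene]
scaled_pixels = [project(p) for p in scaled_scene]

# Both scenes land on the same pixels, so the image alone cannot distinguish them.
same_image = all(
    math.isclose(u1, u2) and math.isclose(v1, v2)
    for (u1, v1), (u2, v2) in zip(pixels, scaled_pixels)
)
print(same_image)
```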
A Hybrid System for Systematic Generalization in Simple Arithmetic Problems
Petruzzellis, Flavio, Testolin, Alberto, Sperduti, Alessandro
Solving symbolic reasoning problems that require compositionality and systematicity is considered one of the key ingredients of human intelligence. However, symbolic reasoning is still a great challenge for deep learning models, which often cannot generalize the reasoning pattern to out-of-distribution test cases. In this work, we propose a hybrid system capable of solving arithmetic problems that require compositional and systematic reasoning over sequences of symbols. The model acquires such a skill by learning appropriate substitution rules, which are applied iteratively to the input string until the expression is completely resolved. We show that the proposed system can accurately solve nested arithmetical expressions even when trained only on a subset including the simplest cases, significantly outperforming both a sequence-to-sequence model trained end-to-end and a state-of-the-art large language model.
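To make "substitution rules applied iteratively until the expression is completely resolved" concrete, here is a small sketch of that loop. One important caveat: in the paper the substitution rules are learned by the model, whereas here a single hand-written rule (replace one innermost parenthesised integer operation with its value) plays that role. The regular expression and the function names are choices made for this example.

```python
import re

# Matches one innermost sub-expression of the form (a op b) over integers.
INNERMOST = re.compile(r"\((-?\d+)([+\-*])(-?\d+)\)")


def apply_rule(match):
    """Hand-written substitution rule: rewrite (a op b) as its integer value."""
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        return str(a + b)
    if op == "-":
        return str(a - b)
    return str(a * b)


def solve(expression):
    """Apply one substitution at a time until nothing changes; return every intermediate string."""
    steps = [expression.replace(" ", "")]
    while True:
        current = steps[-1]
        reduced = INNERMOST.sub(apply_rule, current, count=1)
        if reduced == current:
            return steps
        steps.append(reduced)


steps = solve("((2+3)*(4-(1+1)))")
for s in steps:
    print(s)
```

Because each step only ever rewrites a simple innermost pattern, the same rule handles arbitrarily deep nesting, which is the intuition behind training on the simplest cases and generalizing to harder ones.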
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding
Zhang, Yanzhe, Zhang, Ruiyi, Gu, Jiuxiang, Zhou, Yufan, Lipka, Nedim, Yang, Diyi, Sun, Tong
Instruction tuning unlocks the superior capability of Large Language Models (LLMs) to interact with humans. Furthermore, recent instruction-following datasets include images as visual inputs, collecting responses for image-based instructions. However, visual instruction-tuned models cannot comprehend textual details within images well. This work enhances the current visual instruction tuning pipeline with text-rich images (e.g., movie posters and book covers). Specifically, we first use publicly available OCR tools to collect results on 422K text-rich images from the LAION dataset. Moreover, we prompt text-only GPT-4 with recognized texts and image captions to generate 16K conversations, each containing question-answer pairs for text-rich images. By combining our collected data with previous multi-modal instruction-following data, our model, LLaVAR, substantially improves the LLaVA model's capability on text-based VQA datasets (up to 20% accuracy improvement) while achieving an accuracy of 91.42% on ScienceQA. The GPT-4-based instruction-following evaluation also demonstrates the improvement of our model on both natural images and text-rich images. Through qualitative analysis, LLaVAR shows promising interaction (e.g., reasoning, writing, and elaboration) skills with humans based on the latest real-world online content that combines text and images. We make our code/data/models publicly available at https://llavar.github.io/.