Large Language Model
Minimum Levels of Interpretability for Artificial Moral Agents
Vijayaraghavan, Avish, Badea, Cosmin
The deployment of consumer-facing generative artificial intelligence (AI) models such as Midjourney and ChatGPT has raised important questions about the ethics [1] and consequences of widespread access to AI technologies [2]. Tracing the evolution of these models over the past five years [3], it is likely that we will soon see multi-modal general-purpose models [4-8] available to the public. As these models begin operating with higher autonomy and become integrated into existing applications [9-11] (e.g. ChatGPT with plugins, AI vision models within self-driving cars), they will play a greater role in many aspects of human decision-making [12, 13]. A fundamental subset of human decision-making is moral decision-making (MDM).
TensorGPT: Efficient Compression of the Embedding Layer in LLMs based on the Tensor-Train Decomposition
Xu, Mingxue, Xu, Yao Lei, Mandic, Danilo P.
High-dimensional token embeddings underpin Large Language Models (LLMs), as they can capture subtle semantic information and significantly enhance the modelling of complex language patterns. However, this high dimensionality also introduces a considerable number of model parameters and a prohibitively high storage cost. To address this issue, this work proposes an approach based on the Tensor-Train Decomposition (TTD), where each token embedding is treated as a Matrix Product State (MPS) that can be efficiently computed in a distributed manner. The experimental results on GPT-2 demonstrate that, through our approach, the embedding layer can be compressed by a factor of up to 38.40, and that at a compression factor of 3.31 the compressed model even outperforms the original GPT-2 model.
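As a rough illustration of the idea of compressing a single embedding vector with a tensor-train decomposition, the sketch below reshapes a vector into a small tensor and factorises it with successive truncated SVDs (the standard TT-SVD procedure). The function names `tt_svd` and `tt_reconstruct`, the tensor shape, and the rank cap are assumptions chosen for illustration; this is not the authors' implementation.

```python
import numpy as np

def tt_svd(vector, dims, max_rank):
    """Factorise `vector` (viewed as a tensor of shape `dims`) into TT cores.

    Hypothetical helper: returns a list of cores shaped (r_prev, d, r_next).
    """
    vector = np.asarray(vector, dtype=float)
    assert int(np.prod(dims)) == vector.size
    cores = []
    rank = 1
    rest = vector.reshape(dims)
    for d in dims[:-1]:
        # Unfold so that rows combine the incoming rank with the current mode.
        mat = rest.reshape(rank * d, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, s.size)  # truncate to the rank cap
        cores.append(u[:, :new_rank].reshape(rank, d, new_rank))
        # Carry the remaining factor forward to the next mode.
        rest = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(rest.reshape(rank, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a flat vector."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(-1)

# Illustrative usage: a random 768-dimensional vector viewed as an 8 x 8 x 12 tensor.
rng = np.random.default_rng(0)
embedding = rng.standard_normal(768)
cores = tt_svd(embedding, (8, 8, 12), max_rank=2)
stored = sum(core.size for core in cores)
print(f"stored parameters: {stored} (original: {embedding.size})")
```

A lower rank cap stores fewer parameters at the price of a larger reconstruction error, which is the trade-off behind the compression factors quoted in the abstract.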
Large Language Models Enable Few-Shot Clustering
Viswanathan, Vijay, Gashteovski, Kiril, Lawrence, Carolin, Wu, Tongshuang, Neubig, Graham
Unlike traditional unsupervised clustering, semi-supervised clustering allows users to provide meaningful structure to the data, which helps the clustering algorithm to match the user's intent. Existing approaches to semi-supervised clustering require a significant amount of feedback from an expert to improve the clusters. In this paper, we ask whether a large language model can amplify an expert's guidance to enable query-efficient, few-shot semi-supervised text clustering. We show that LLMs are surprisingly effective at improving clustering. We explore three stages where LLMs can be incorporated into clustering: before clustering (improving input features), during clustering (by providing constraints to the clusterer), and after clustering (using LLMs for post-correction). We find that incorporating LLMs in the first two stages can routinely provide significant improvements in cluster quality, and that LLMs enable a user to make trade-offs between cost and accuracy to produce desired clusters. We release our code and LLM prompts for the public to use.
Artificial General Intelligence for Medical Imaging
Li, Xiang, Zhang, Lu, Wu, Zihao, Liu, Zhengliang, Zhao, Lin, Yuan, Yixuan, Liu, Jun, Li, Gang, Zhu, Dajiang, Yan, Pingkun, Li, Quanzheng, Liu, Wei, Liu, Tianming, Shen, Dinggang
In this review, we explore the potential applications of Artificial General Intelligence (AGI) models in healthcare, focusing on foundational Large Language Models (LLMs), Large Vision Models, and Large Multimodal Models. We emphasize the importance of integrating clinical expertise, domain knowledge, and multimodal capabilities into AGI models. In addition, we lay out key roadmaps that guide the development and deployment of healthcare AGI models. Throughout the review, we provide critical perspectives on the potential challenges and pitfalls associated with deploying large-scale AGI models in the medical field. This comprehensive review aims to offer insights into the future implications of AGI in medical imaging, healthcare and beyond.
Practical PCG Through Large Language Models
Nasir, Muhammad U, Togelius, Julian
Large Language Models (LLMs) have proven to be useful tools in various domains outside the field of their inception, natural language processing. In this study, we provide practical directions on how to use LLMs to generate 2D game rooms for an under-development game named Metavoidal. Our technique harnesses the power of GPT-3 through human-in-the-loop fine-tuning, which allows our method to create 37% Playable-Novel levels from data as scarce as 60 hand-designed rooms, for a game that is non-trivial with respect to Procedural Content Generation (PCG) because it has many local and global constraints.
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Mallen, Alex, Asai, Akari, Zhong, Victor, Das, Rajarshi, Khashabi, Daniel, Hajishirzi, Hannaneh
Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge. This paper aims to understand LMs' strengths and limitations in memorizing factual knowledge, by conducting large-scale knowledge probing experiments of 10 models and 4 augmentation methods on PopQA, our new open-domain QA dataset with 14k questions. We find that LMs struggle with less popular factual knowledge, and that scaling fails to appreciably improve memorization of factual knowledge in the long tail. We then show that retrieval-augmented LMs largely outperform orders of magnitude larger LMs, while unassisted LMs remain competitive in questions about high-popularity entities. Based on those findings, we devise a simple, yet effective, method for powerful and efficient retrieval-augmented LMs, which retrieves non-parametric memories only when necessary. Experimental results show that this significantly improves models' performance while reducing the inference costs.
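The "retrieve only when necessary" idea can be sketched as a simple gate. The sketch assumes that an entity popularity score is the trigger for retrieval, which the findings in the abstract suggest but do not spell out; the function name `adaptive_answer`, the threshold value, and the two answer callables are hypothetical stand-ins rather than the authors' code.

```python
from typing import Callable, Tuple

def adaptive_answer(
    question: str,
    popularity: float,
    threshold: float,
    answer_closed_book: Callable[[str], str],
    answer_with_retrieval: Callable[[str], str],
) -> Tuple[str, bool]:
    """Return (answer, used_retrieval).

    Assumption: questions about low-popularity entities are routed to the
    retrieval-augmented model, and all others to the unassisted model.
    """
    if popularity < threshold:
        # Long-tail entity: pay the retrieval cost.
        return answer_with_retrieval(question), True
    # Popular entity: rely on parametric memory and skip retrieval.
    return answer_closed_book(question), False

# Illustrative usage with stub models (hypothetical).
closed_book = lambda q: f"[LM only] {q}"
retrieval = lambda q: f"[LM + retrieved passages] {q}"
print(adaptive_answer("Who wrote an obscure novel?", 40, 1000, closed_book, retrieval))
print(adaptive_answer("Who wrote Hamlet?", 90000, 1000, closed_book, retrieval))
```

Skipping retrieval for popular entities is what would reduce inference cost in such a setup; how the threshold is chosen is left open here.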
Few-shot Reranking for Multi-hop QA via Language Model Prompting
Khalifa, Muhammad, Logeswaran, Lajanugen, Lee, Moontae, Lee, Honglak, Wang, Lu
We study few-shot reranking for multi-hop QA with open-domain questions. To alleviate the need for a large number of labeled question-document pairs for retriever training, we propose PromptRank, which relies on large language model prompting for multi-hop path reranking. PromptRank first constructs an instruction-based prompt that includes a candidate document path and then computes the relevance score between a given question and the path based on the conditional likelihood of the question given the path prompt according to a language model. PromptRank yields strong retrieval performance on HotpotQA with only 128 training examples compared to state-of-the-art methods trained on thousands of examples -- 73.6 recall@10 by PromptRank vs. 77.8 by PathRetriever and 77.5 by multi-hop dense retrieval. Code available at https://github.com/mukhal/PromptRank
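The scoring rule described above, the likelihood of the question given a path prompt, can be sketched as follows. The language model is abstracted as a hypothetical `token_logprob(context, token)` callable, and `toy_logprob` is a deliberately crude stand-in so that the example runs without any model; neither is taken from the PromptRank code base.

```python
import math
from typing import Callable, List, Sequence

TokenLogProb = Callable[[List[str], str], float]

def path_score(question: Sequence[str], path_prompt: Sequence[str],
               token_logprob: TokenLogProb) -> float:
    """Sum of log P(question token | path prompt + earlier question tokens)."""
    context = list(path_prompt)
    total = 0.0
    for token in question:
        total += token_logprob(context, token)
        context.append(token)  # condition later tokens on this one
    return total

def rerank(question: Sequence[str], path_prompts: Sequence[Sequence[str]],
           token_logprob: TokenLogProb) -> List[Sequence[str]]:
    """Order candidate path prompts from most to least likely to yield the question."""
    return sorted(path_prompts,
                  key=lambda p: path_score(question, p, token_logprob),
                  reverse=True)

def toy_logprob(context: List[str], token: str) -> float:
    # Toy stand-in for a language model: tokens already present in the
    # context are treated as far more likely than unseen ones.
    return math.log(0.5) if token in context else math.log(0.01)

question = ["who", "directed", "inception"]
paths = [
    ["paris", "is", "in", "france"],
    ["inception", "was", "directed", "by", "nolan"],
]
print(rerank(question, paths, toy_logprob)[0])
```

In practice the stand-in would be replaced by token log-probabilities from an actual language model, with the instruction and document path formatted into the prompt.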
Twitter applies reading limit after users report issues with platform
Twitter has applied temporary reading limits to address "extreme levels" of data scraping and system manipulation, Elon Musk said in a post on the social media platform on Saturday. Verified accounts were temporarily limited to reading 6,000 posts a day, Musk said, adding that unverified accounts and new unverified accounts were limited to reading 600 posts a day and 300 posts a day respectively. In a later tweet, the billionaire added: "Rate limits increasing soon to 8,000 for verified, 800 for unverified & 400 for new unverified." That comes after Twitter had announced that it will require users to have an account on the social media platform to view tweets, a move that Musk on Friday called a "temporary emergency measure". Musk had said that hundreds of organisations were scraping Twitter data "extremely aggressively", affecting user experience.
Instance-Level Semantic Maps for Vision Language Navigation
Nanwani, Laksh, Agarwal, Anmol, Jain, Kanishk, Prabhakar, Raghav, Monis, Aaron, Mathur, Aditya, Murthy, Krishna, Hafez, Abdul, Gandhi, Vineet, Krishna, K. Madhava
Humans have a natural ability to perform semantic associations with the surrounding objects in the environment. This allows them to create a mental map of the environment and to navigate on demand when given linguistic instructions. A natural goal in Vision Language Navigation (VLN) research is to endow autonomous agents with similar capabilities. Recent works take a step towards this goal by creating a semantic spatial map representation of the environment without any labeled data. However, their representations are of limited practical applicability, as they do not distinguish between different instances of the same object. In this work, we address this limitation by integrating instance-level information into the spatial map representation using a community detection algorithm and by utilizing word ontology learned by large language models (LLMs) to perform open-set semantic associations in the mapping representation. The resulting map representation improves the navigation performance two-fold (233%) on realistic language commands with instance-specific descriptions compared to the baseline. We validate the practicality and effectiveness of our approach through extensive qualitative and quantitative experiments.
CephGPT-4: An Interactive Multimodal Cephalometric Measurement and Diagnostic System with Visual Large Language Model
Ma, Lei, Han, Jincong, Wang, Zhaoxin, Zhang, Dian
Large-scale multimodal language models (LMMs) have achieved remarkable success in general domains. However, the exploration of diagnostic language models based on multimodal cephalometric medical data remains limited. In this paper, we propose a novel multimodal cephalometric analysis and diagnostic dialogue model. First, a multimodal orthodontic medical dataset is constructed, comprising cephalometric images and doctor-patient dialogue data, with automatic analysis of cephalometric landmarks using U-Net and generation of diagnostic reports. Then, MiniGPT-4 and VisualGLM are separately fine-tuned on the cephalometric dataset and the generated diagnostic reports. Results demonstrate that the CephGPT-4 model exhibits excellent performance and has the potential to revolutionize orthodontic measurement and diagnostic applications. These innovations hold revolutionary application potential in the field of orthodontics.