Goto

Collaborating Authors

 Large Language Model


Samsung bans staff's AI use after spotting ChatGPT data leak

The Japan Times

Samsung Electronics is banning employee use of popular generative AI tools like ChatGPT after discovering staff uploaded sensitive code to the platform, dealing a setback to the spread of such technology in the workplace. The Suwon, South Korea-based company notified staff at one of its biggest divisions on Monday about the new policy via a memo reviewed by Bloomberg News. The company is concerned that data transmitted to such artificial intelligence platforms including Google Bard and Bing is stored on external servers, making it difficult to retrieve and delete, and could end up being disclosed to other users, according to the document. The company conducted a survey last month about the use of AI tools internally and said that 65% of respondents believe that such services pose a security risk. Earlier in April, Samsung engineers accidentally leaked internal source code by uploading it to ChatGPT, according to the memo. It's unclear what the information encompassed, and a Samsung representative declined to comment.


'I think they soon may be more intelligent than us'

BBC News

"Right now, what we're seeing is things like GPT-4 eclipses a person in the amount of general knowledge it has and it eclipses them by a long way. In terms of reasoning, it's not as good, but it does already do simple reasoning.


Generating Executable Action Plans with Environmentally-Aware Language Models

arXiv.org Artificial Intelligence

Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents from high level text queries. However, these models typically do not consider the robot's environment, resulting in generated plans that may not actually be executable, due to ambiguities in the planned actions or environmental constraints. In this paper, we propose an approach to generate environmentally-aware action plans that agents are better able to execute. Our approach involves integrating environmental objects and object relations as additional inputs into LLM action plan generation to provide the system with an awareness of its surroundings, resulting in plans where each generated action is mapped to objects present in the scene. We also design a novel scoring function that, along with generating the action steps and associating them with objects, helps the system disambiguate among object instances and take into account their states. We evaluated our approach using the VirtualHome simulator and the ActivityPrograms knowledge base and found that action plans generated from our system had a 310% improvement in executability and a 147% improvement in correctness over prior work. The complete code and a demo of our method is publicly available at https://github.com/hri-ironlab/scene_aware_language_planner.


The Role of Summarization in Generative Agents: A Preliminary Perspective

arXiv.org Artificial Intelligence

Generative agents that simulate human society show tremendous potential for further research and practical applications. Specifically, the generative agent architecture comprising several meticulously designed modules constitutes the most critical component. To facilitate progress in this research, this report presents our integrated perspective on comprehending generative agents through summarization, since we believe summarization is the most fundamental and indispensable capacity of generative agents manifested across diverse scenarios. We hope this report can provide insight into understanding the importance of summarization capacity in generative agents and motivate future research.


Multimodal Procedural Planning via Dual Text-Image Prompting

arXiv.org Artificial Intelligence

Embodied agents have achieved prominent performance in following human instructions to complete tasks. However, the potential of providing instructions informed by texts and images to assist humans in completing tasks remains underexplored. To uncover this capability, we present the multimodal procedural planning (MPP) task, in which models are given a high-level goal and generate plans of paired text-image steps, providing more complementary and informative guidance than unimodal plans. The key challenges of MPP are to ensure the informativeness, temporal coherence,and accuracy of plans across modalities. To tackle this, we propose Text-Image Prompting (TIP), a dual-modality prompting method that jointly leverages zero-shot reasoning ability in large language models (LLMs) and compelling text-to-image generation ability from diffusion-based models. TIP improves the interaction in the dual modalities using Text-to-Image Bridge and Image-to-Text Bridge, allowing LLMs to guide the textual-grounded image plan generation and leveraging the descriptions of image plans to ground the textual plan reversely. To address the lack of relevant datasets, we collect WIKIPLAN and RECIPEPLAN as a testbed for MPP. Our results show compelling human preferences and automatic scores against unimodal and multimodal baselines on WIKIPLAN and RECIPEPLAN in terms of informativeness, temporal coherence, and plan accuracy. Our code and data: https://github.com/YujieLu10/MPP.


Psychologically-Inspired Causal Prompts

arXiv.org Artificial Intelligence

NLP datasets are richer than just input-output pairs; rather, they carry causal relations between the input and output variables. In this work, we take sentiment classification as an example and look into the causal relations between the review (X) and sentiment (Y). As psychology studies show that language can affect emotion, different psychological processes are evoked when a person first makes a rating and then self-rationalizes their feeling in a review (where the sentiment causes the review, i.e., Y -> X), versus first describes their experience, and weighs the pros and cons to give a final rating (where the review causes the sentiment, i.e., X -> Y ). Furthermore, it is also a completely different psychological process if an annotator infers the original rating of the user by theory of mind (ToM) (where the review causes the rating, i.e., X -ToM-> Y ). In this paper, we verbalize these three causal mechanisms of human psychological processes of sentiment classification into three different causal prompts, and study (1) how differently they perform, and (2) what nature of sentiment classification data leads to agreement or diversity in the model responses elicited by the prompts. We suggest future work raise awareness of different causal structures in NLP tasks. Our code and data are at https://github.com/cogito233/psych-causal-prompt


Transformers Learn Shortcuts to Automata

arXiv.org Artificial Intelligence

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning using far fewer layers than the number of reasoning steps. This raises the question: what solutions are learned by these shallow and non-recurrent models? We find that a low-depth Transformer can represent the computations of any finite-state automaton (thus, any bounded-memory algorithm), by hierarchically reparameterizing its recurrent dynamics. Our theoretical results characterize shortcut solutions, whereby a Transformer with $o(T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$. We find that polynomial-sized $O(\log T)$-depth solutions always exist; furthermore, $O(1)$-depth simulators are surprisingly common, and can be understood using tools from Krohn-Rhodes theory and circuit complexity. Empirically, we perform synthetic experiments by training Transformers to simulate a wide variety of automata, and show that shortcut solutions can be learned via standard training. We further investigate the brittleness of these solutions and propose potential mitigations.


Stance Detection With Supervised, Zero-Shot, and Few-Shot Applications

arXiv.org Artificial Intelligence

Stance detection is the identification of an author's beliefs about a subject from a document. Researchers widely rely on sentiment analysis to accomplish this. However, recent research has show that sentiment analysis is only loosely correlated with stance, if at all. This paper advances methods in text analysis by precisely defining the task of stance detection, providing a generalized framework for the task, and then presenting three distinct approaches for performing stance detection: supervised classification, zero-shot classification with NLI classifiers, and in-context learning. In doing so, I demonstrate how zero-shot and few-shot language classifiers can replace human labelers for a variety of tasks and discuss how their application and limitations differ from supervised classifiers. Finally, I demonstrate an application of zero-shot stance detection by replicating Block Jr et al. (2022).


FreeLM: Fine-Tuning-Free Language Model

arXiv.org Artificial Intelligence

Pre-trained language models (PLMs) have achieved remarkable success in NLP tasks. Despite the great success, mainstream solutions largely follow the pre-training then finetuning paradigm, which brings in both high deployment costs and low training efficiency. Nevertheless, fine-tuning on a specific task is essential because PLMs are only pre-trained with language signal from large raw data. In this paper, we propose a novel fine-tuning-free strategy for language models, to consider both language signal and teacher signal. Teacher signal is an abstraction of a battery of downstream tasks, provided in a unified proposition format. Trained with both language and strong task-aware teacher signals in an interactive manner, our FreeLM model demonstrates strong generalization and robustness. FreeLM outperforms large models e.g., GPT-3 and InstructGPT, on a range of language understanding tasks in experiments. FreeLM is much smaller with 0.3B parameters, compared to 175B in these models.


ContraCLM: Contrastive Learning For Causal Language Model

arXiv.org Artificial Intelligence

Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and bridges the gap with the encoder-only models, which makes causal language models better suited for tasks beyond language generation. Specifically, we attain $44\%$ relative improvement on the Semantic Textual Similarity tasks and $34\%$ on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraCLM also boosts the source code generation capability with $9\%$ relative improvement on execution accuracy on the HumanEval benchmark.