AITopics | Large Language Model

Large Language Model

Samsung bans staff's AI use after spotting ChatGPT data leak

The Japan Times | May-2-2023, 04:03:21 GMT

Samsung Electronics is banning employee use of popular generative AI tools like ChatGPT after discovering staff uploaded sensitive code to the platform, dealing a setback to the spread of such technology in the workplace. The Suwon, South Korea-based company notified staff at one of its biggest divisions on Monday about the new policy via a memo reviewed by Bloomberg News. The company is concerned that data transmitted to such artificial intelligence platforms including Google Bard and Bing is stored on external servers, making it difficult to retrieve and delete, and could end up being disclosed to other users, according to the document. The company conducted a survey last month about the use of AI tools internally and said that 65% of respondents believe that such services pose a security risk. Earlier in April, Samsung engineers accidentally leaked internal source code by uploading it to ChatGPT, according to the memo. It's unclear what the information encompassed, and a Samsung representative declined to comment.

ai use, chatgpt data leak, samsung ban staff

Country: Asia > South Korea > Gyeonggi-do > Suwon (0.29)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

'I think they soon may be more intelligent than us'

BBC News | May-2-2023, 00:27:27 GMT

"Right now, what we're seeing is things like GPT-4 eclipses a person in the amount of general knowledge it has and it eclipses them by a long way. In terms of reasoning, it's not as good, but it does already do simple reasoning."

eclipse, reasoning

Country:

North America > United States (0.40)
North America > Canada (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Generating Executable Action Plans with Environmentally-Aware Language Models

Gramopadhye, Maitrey, Szafir, Daniel

arXiv.org Artificial Intelligence | May-2-2023

Large Language Models (LLMs) trained using massive text datasets have recently shown promise in generating action plans for robotic agents from high level text queries. However, these models typically do not consider the robot's environment, resulting in generated plans that may not actually be executable, due to ambiguities in the planned actions or environmental constraints. In this paper, we propose an approach to generate environmentally-aware action plans that agents are better able to execute. Our approach involves integrating environmental objects and object relations as additional inputs into LLM action plan generation to provide the system with an awareness of its surroundings, resulting in plans where each generated action is mapped to objects present in the scene. We also design a novel scoring function that, along with generating the action steps and associating them with objects, helps the system disambiguate among object instances and take into account their states. We evaluated our approach using the VirtualHome simulator and the ActivityPrograms knowledge base and found that action plans generated from our system had a 310% improvement in executability and a 147% improvement in correctness over prior work. The complete code and a demo of our method are publicly available at https://github.com/hri-ironlab/scene_aware_language_planner.

artificial intelligence, large language model, natural language

arXiv:2210.04964

Country:

North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > Alaska > Anchorage Municipality > Anchorage (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report (0.82)
Workflow (0.68)

Industry:

Leisure & Entertainment > Games (0.46)
Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

The Role of Summarization in Generative Agents: A Preliminary Perspective

Feng, Xiachong, Feng, Xiaocheng, Qin, Bing

arXiv.org Artificial Intelligence | May-2-2023

Generative agents that simulate human society show tremendous potential for further research and practical applications. Specifically, the generative agent architecture comprising several meticulously designed modules constitutes the most critical component. To facilitate progress in this research, this report presents our integrated perspective on comprehending generative agents through summarization, since we believe summarization is the most fundamental and indispensable capacity of generative agents manifested across diverse scenarios. We hope this report can provide insight into understanding the importance of summarization capacity in generative agents and motivate future research.

large language model, machine learning, natural language

arXiv:2305.01253

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Multimodal Procedural Planning via Dual Text-Image Prompting

Lu, Yujie, Lu, Pan, Chen, Zhiyu, Zhu, Wanrong, Wang, Xin Eric, Wang, William Yang

arXiv.org Artificial Intelligence | May-2-2023

Embodied agents have achieved prominent performance in following human instructions to complete tasks. However, the potential of providing instructions informed by texts and images to assist humans in completing tasks remains underexplored. To uncover this capability, we present the multimodal procedural planning (MPP) task, in which models are given a high-level goal and generate plans of paired text-image steps, providing more complementary and informative guidance than unimodal plans. The key challenges of MPP are to ensure the informativeness, temporal coherence, and accuracy of plans across modalities. To tackle this, we propose Text-Image Prompting (TIP), a dual-modality prompting method that jointly leverages zero-shot reasoning ability in large language models (LLMs) and compelling text-to-image generation ability from diffusion-based models. TIP improves the interaction in the dual modalities using Text-to-Image Bridge and Image-to-Text Bridge, allowing LLMs to guide the textual-grounded image plan generation and leveraging the descriptions of image plans to ground the textual plan reversely. To address the lack of relevant datasets, we collect WIKIPLAN and RECIPEPLAN as a testbed for MPP. Our results show compelling human preferences and automatic scores against unimodal and multimodal baselines on WIKIPLAN and RECIPEPLAN in terms of informativeness, temporal coherence, and plan accuracy. Our code and data: https://github.com/YujieLu10/MPP.

foam block, large language model, machine learning

arXiv:2305.01795

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > United States > California > Santa Barbara County > Santa Barbara (0.14)

Genre:

Workflow (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Consumer Health (0.92)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Psychologically-Inspired Causal Prompts

Lyu, Zhiheng, Jin, Zhijing, Mattern, Justus, Mihalcea, Rada, Sachan, Mrinmaya, Schoelkopf, Bernhard

arXiv.org Artificial Intelligence | May-2-2023

NLP datasets are richer than just input-output pairs; rather, they carry causal relations between the input and output variables. In this work, we take sentiment classification as an example and look into the causal relations between the review (X) and sentiment (Y). As psychology studies show that language can affect emotion, different psychological processes are evoked when a person first makes a rating and then self-rationalizes their feeling in a review (where the sentiment causes the review, i.e., Y -> X), versus first describes their experience, and weighs the pros and cons to give a final rating (where the review causes the sentiment, i.e., X -> Y). Furthermore, it is also a completely different psychological process if an annotator infers the original rating of the user by theory of mind (ToM) (where the review causes the rating, i.e., X -ToM-> Y). In this paper, we verbalize these three causal mechanisms of human psychological processes of sentiment classification into three different causal prompts, and study (1) how differently they perform, and (2) what nature of sentiment classification data leads to agreement or diversity in the model responses elicited by the prompts. We suggest future work raise awareness of different causal structures in NLP tasks. Our code and data are at https://github.com/cogito233/psych-causal-prompt

large language model, natural language, text classification

arXiv:2305.01764

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Michigan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.93)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.89)

Transformers Learn Shortcuts to Automata

Liu, Bingbin, Ash, Jordan T., Goel, Surbhi, Krishnamurthy, Akshay, Zhang, Cyril

arXiv.org Artificial Intelligence | May-2-2023

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine. However, Transformer models, while lacking recurrence, are able to perform such reasoning using far fewer layers than the number of reasoning steps. This raises the question: what solutions are learned by these shallow and non-recurrent models? We find that a low-depth Transformer can represent the computations of any finite-state automaton (thus, any bounded-memory algorithm), by hierarchically reparameterizing its recurrent dynamics. Our theoretical results characterize shortcut solutions, whereby a Transformer with $o(T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$. We find that polynomial-sized $O(\log T)$-depth solutions always exist; furthermore, $O(1)$-depth simulators are surprisingly common, and can be understood using tools from Krohn-Rhodes theory and circuit complexity. Empirically, we perform synthetic experiments by training Transformers to simulate a wide variety of automata, and show that shortcut solutions can be learned via standard training. We further investigate the brittleness of these solutions and propose potential mitigations.
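
One way to see why logarithmic depth can suffice: each input symbol induces a map from states to states, and composing such maps is associative, so all prefixes can be combined with a parallel prefix scan instead of one step per symbol. The Python sketch below illustrates only that idea on an ordinary automaton; it is not the paper's Transformer construction, and the names `compose`, `sequential_states`, and `scan_states` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# A transition map is a tuple m where m[q] is the state reached from state q.
TransitionMap = Tuple[int, ...]


def compose(first: TransitionMap, second: TransitionMap) -> TransitionMap:
    """Return the map that applies `first`, then `second`."""
    return tuple(second[q] for q in first)


def sequential_states(
    delta: Dict[str, TransitionMap], word: Sequence[str], start: int
) -> List[int]:
    """Baseline: one sequential step per input symbol."""
    states, q = [], start
    for symbol in word:
        q = delta[symbol][q]
        states.append(q)
    return states


def scan_states(
    delta: Dict[str, TransitionMap], word: Sequence[str], start: int
) -> Tuple[List[int], int]:
    """Same states via a Hillis-Steele prefix scan over transition maps.

    Each round combines every position with the one `offset` steps back,
    so all positions within a round could be updated in parallel.
    Returns the state after each symbol and the number of rounds used.
    """
    prefix = [delta[symbol] for symbol in word]
    rounds, offset = 0, 1
    while offset < len(prefix):
        prefix = [
            compose(prefix[i - offset], prefix[i]) if i >= offset else prefix[i]
            for i in range(len(prefix))
        ]
        offset *= 2
        rounds += 1
    return [m[start] for m in prefix], rounds
```

For an input of length 8 the scan finishes in 3 rounds, compared with 8 sequential steps, and each intermediate value is a transition map whose size depends only on the number of states.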

large language model, machine learning, natural language

arXiv:2210.10749

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Stance Detection With Supervised, Zero-Shot, and Few-Shot Applications

Burnham, Michael

arXiv.org Artificial Intelligence | May-2-2023

Stance detection is the identification of an author's beliefs about a subject from a document. Researchers widely rely on sentiment analysis to accomplish this. However, recent research has shown that sentiment analysis is only loosely correlated with stance, if at all. This paper advances methods in text analysis by precisely defining the task of stance detection, providing a generalized framework for the task, and then presenting three distinct approaches for performing stance detection: supervised classification, zero-shot classification with NLI classifiers, and in-context learning. In doing so, I demonstrate how zero-shot and few-shot language classifiers can replace human labelers for a variety of tasks and discuss how their application and limitations differ from supervised classifiers. Finally, I demonstrate an application of zero-shot stance detection by replicating Block Jr et al. (2022).
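
Zero-shot classification with an NLI classifier is commonly framed as follows: each stance label is rewritten as a hypothesis sentence, the document serves as the premise, and the label whose hypothesis is most strongly entailed wins. The Python sketch below shows that framing only. The hypothesis wording, the label set, and every function name are assumptions made for illustration rather than the paper's setup, and `toy_scorer` is a keyword stand-in for a real NLI model.

```python
from typing import Callable, Dict

# Hypothetical signature: (premise, hypothesis) -> entailment probability in [0, 1].
EntailmentScorer = Callable[[str, str], float]


def build_hypotheses(target: str) -> Dict[str, str]:
    """Turn each stance label into a natural-language hypothesis about the target."""
    return {
        "support": f"The author of this text supports {target}.",
        "oppose": f"The author of this text opposes {target}.",
        "neutral": f"The author of this text is neutral toward {target}.",
    }


def classify_stance(document: str, target: str, entail_prob: EntailmentScorer) -> str:
    """Return the stance label whose hypothesis is most strongly entailed."""
    hypotheses = build_hypotheses(target)
    scores = {label: entail_prob(document, hyp) for label, hyp in hypotheses.items()}
    return max(scores, key=scores.get)


def toy_scorer(premise: str, hypothesis: str) -> float:
    """Keyword stand-in for a real NLI model, for demonstration only."""
    cues = {
        "supports": ("great", "should pass"),
        "opposes": ("terrible", "must be stopped"),
    }
    text = premise.lower()
    for verb, words in cues.items():
        if verb in hypothesis and any(w in text for w in words):
            return 0.9
    return 0.5 if "neutral" in hypothesis else 0.1


label = classify_stance("This bill is terrible and must be stopped.", "the bill", toy_scorer)
```

Because the scorer is passed in as a plain function, a trained entailment model can replace `toy_scorer` without changing `classify_stance`.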

large language model, natural language, tweet

arXiv:2305.01723

Country:

Asia > Russia (0.46)
Europe > Eastern Europe (0.04)
Europe > Ukraine (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Government > Regional Government (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.99)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

FreeLM: Fine-Tuning-Free Language Model

Li, Xiang, Jiang, Xin, Meng, Xuying, Sun, Aixin, Wang, Yequan

arXiv.org Artificial Intelligence | May-2-2023

Pre-trained language models (PLMs) have achieved remarkable success in NLP tasks. Despite the great success, mainstream solutions largely follow the pre-training then fine-tuning paradigm, which brings in both high deployment costs and low training efficiency. Nevertheless, fine-tuning on a specific task is essential because PLMs are only pre-trained with language signal from large raw data. In this paper, we propose a novel fine-tuning-free strategy for language models, to consider both language signal and teacher signal. Teacher signal is an abstraction of a battery of downstream tasks, provided in a unified proposition format. Trained with both language and strong task-aware teacher signals in an interactive manner, our FreeLM model demonstrates strong generalization and robustness. FreeLM outperforms large models, e.g., GPT-3 and InstructGPT, on a range of language understanding tasks in experiments. FreeLM is much smaller with 0.3B parameters, compared to 175B in these models.

freelm, large language model, machine learning

arXiv:2305.01616

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

ContraCLM: Contrastive Learning For Causal Language Model

Jain, Nihal, Zhang, Dejiao, Ahmad, Wasi Uddin, Wang, Zijian, Nan, Feng, Li, Xiaopeng, Tan, Ming, Nallapati, Ramesh, Ray, Baishakhi, Bhatia, Parminder, Ma, Xiaofei, Xiang, Bing

arXiv.org Artificial Intelligence | May-2-2023

Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and bridges the gap with the encoder-only models, which makes causal language models better suited for tasks beyond language generation. Specifically, we attain $44\%$ relative improvement on the Semantic Textual Similarity tasks and $34\%$ on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraCLM also boosts the source code generation capability with $9\%$ relative improvement on execution accuracy on the HumanEval benchmark.
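
The abstract does not spell out the training objectives, so the following is only a generic illustration of a sequence-level contrastive loss (InfoNCE with in-batch negatives), not ContraCLM's actual formulation. Each anchor embedding should score highest against its own positive and lower against every other positive in the batch. The function names and the temperature value are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(
    anchors: List[Vector], positives: List[Vector], temperature: float = 0.05
) -> float:
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other positive."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max before exponentiating, for stability
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Well-separated representations give a loss near zero.
separated = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.1], [0.1, 1.0]])
# Identical positives cannot be told apart, so the loss is log(batch size).
collapsed = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
```

The collapsed case shows what poor discrimination costs under this loss: when every candidate looks the same to the anchor, the loss sits at the logarithm of the batch size.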

large language model, machine learning, natural language

arXiv:2210.01185

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
