Large Language Model
OpenAI and Figure develop terrifyingly creepy humanoid robots for the workforce
Two companies are coming together to develop humanoid robots with AI that will be able to perform jobs from manufacturing to healthcare professions. Do you ever find yourself glued to the screen watching a movie like "Terminator" or "Westworld" and think, "Phew! Movies like this are getting closer to becoming a reality with each passing day. Let me introduce you to a breakthrough that's happening in the real world, which is both exciting and slightly unsettling. It's a fascinating development that might make you wonder if the line between humans and machines is starting to blur, just like in those movies we love.
Forget ChatGPT. These Are the Best AI-Powered Apps.
Its answer can be full of errors. And during long conversations, it can veer into wild tangents. Language app Duolingo and learning platform Khan Academy now offer conversational, personalized tutoring with this technology. Travel app Expedia features a chatty trip planner. And all Snapchat users just got a new friend on the social network called My AI.
A curious person's guide to artificial intelligence
Large language models, or LLMs, are a type of neural network that learns to write and converse with users; they back all of the chatbots that have swooped onto the scene in recent months. They learn to "speak" by hoovering up massive amounts of text, often websites scraped from the internet, and finding statistical relationships between words. When these systems pattern match, it can lead to feats of creativity: A chatbot can create song lyrics closely matching Jay-Z's style because it's absorbed the patterns of his entire discography. But LLMs don't have awareness of the meanings behind words.
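To make "finding statistical relationships between words" concrete, here is a minimal sketch of a bigram counter in Python. It is a deliberately tiny stand-in, not how an LLM is actually built: real models use neural networks trained on far more text. The function names and the toy corpus are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (hypothetical); "the" is followed by "cat" twice and "mat" once.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))
```

The sketch predicts the next word purely from counts, which mirrors the article's point: the pattern matching works without any awareness of what the words mean.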
Optimizing National Security Strategies through LLM-Driven Artificial Intelligence Integration
Artificial Intelligence is revolutionizing the way military and government organizations operate. These advanced technologies enable machines to learn and reason autonomously, with applications ranging from situational awareness to decision-making support. In particular, the advent of Large Language Models (LLMs) has significantly impacted the field of natural language processing, providing [...] Since the early days of cyber space technology [...] strides in enhancing its strategic capabilities. Today, we find ourselves at the precipice of a new technological revolution: Artificial Intelligence (AI). As a strategic imperative for national security, AI presents unparalleled opportunities for strengthening our defense capabilities, similar to how space and cyberspace technology transformed our approach to warfare and reconnaissance.
Improving Cross-Task Generalization with Step-by-Step Instructions
Wu, Yang, Zhao, Yanyan, Li, Zhongyang, Qin, Bing, Xiong, Kai
Instruction tuning has been shown to be able to improve cross-task generalization of language models. However, it is still challenging for language models to complete the target tasks following the instructions, as the instructions are general and lack intermediate steps. To address this problem, we propose to incorporate the step-by-step instructions to help language models to decompose the tasks, which can provide the detailed and specific procedures for completing the target tasks. The step-by-step instructions are obtained automatically by prompting ChatGPT, which are further combined with the original instructions to tune language models. The extensive experiments on SUP-NATINST show that the high-quality step-by-step instructions can improve cross-task generalization across different model sizes. Moreover, the further analysis indicates the importance of the order of steps of the step-by-step instruction for the improvement. To facilitate future research, we release the step-by-step instructions and their human quality evaluation results.
Do Large Language Models Show Decision Heuristics Similar to Humans? A Case Study Using GPT-3.5
Suri, Gaurav, Slater, Lily R., Ziaee, Ali, Nguyen, Morgan
A Large Language Model (LLM) is an artificial intelligence system that has been trained on vast amounts of natural language data, enabling it to generate human-like responses to written or spoken language input. GPT-3.5 is an example of an LLM that supports a conversational agent called ChatGPT. In this work, we used a series of novel prompts to determine whether ChatGPT shows heuristics, biases, and other decision effects. We also tested the same prompts on human participants. Across four studies, we found that ChatGPT was influenced by random anchors in making estimates (Anchoring Heuristic, Study 1); it judged the likelihood of two events occurring together to be higher than the likelihood of either event occurring alone, and it was erroneously influenced by salient anecdotal information (Representativeness and Availability Heuristic, Study 2); it found an item to be more efficacious when its features were presented positively rather than negatively - even though both presentations contained identical information (Framing Effect, Study 3); and it valued an owned item more than a newly found item even though the two items were identical (Endowment Effect, Study 4). In each study, human participants showed similar effects. Heuristics and related decision effects in humans are thought to be driven by cognitive and affective processes such as loss aversion and effort reduction. The fact that an LLM - which lacks these processes - also shows such effects invites consideration of the possibility that language may play a role in generating these effects in humans.
Unlocking Practical Applications in Legal Domain: Evaluation of GPT for Zero-Shot Semantic Annotation of Legal Texts
We evaluated the capability of a state-of-the-art generative pre-trained transformer (GPT) model to perform semantic annotation of short text snippets (one to a few sentences) coming from legal documents of various types. Discussions of potential uses (e.g., document drafting, summarization) of this emerging technology in the legal domain have intensified, but to date there has not been a rigorous analysis of these large language models' (LLM) capacity in sentence-level semantic annotation of legal texts in zero-shot learning settings. Yet, this particular type of use could unlock many practical applications (e.g., in contract review) and research opportunities (e.g., in empirical legal studies). We fill the gap with this study. We examined if and how successfully the model can semantically annotate small batches of short text snippets (10-50) based exclusively on concise definitions of the semantic types. We found that the GPT model performs surprisingly well in zero-shot settings on diverse types of documents (F1=.73 on a task involving court opinions, .86 for contracts, and .54 for statutes and regulations). These findings can be leveraged by legal scholars and practicing lawyers alike to guide their decisions in integrating LLMs in a wide range of workflows involving semantic annotation of legal texts.
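For readers unfamiliar with the F1 figures quoted above, the following sketch shows how an F1 score for a single semantic type could be computed from gold and predicted labels. The abstract does not state which averaging scheme the authors used, so this covers only the per-label case; the label names and the example data are hypothetical.

```python
def f1_for_label(gold, predicted, label):
    """Harmonic mean of precision and recall for one label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for four snippets (not data from the study).
gold = ["obligation", "definition", "obligation", "other"]
predicted = ["obligation", "obligation", "obligation", "other"]
score = f1_for_label(gold, predicted, "obligation")
print(round(score, 2))
```

Here the model finds both true "obligation" snippets (recall 1.0) but also mislabels one "definition" (precision 2/3), so the score lands between the two.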
Shortcut Learning of Large Language Models in Natural Language Understanding
Du, Mengnan, He, Fengxiang, Zou, Na, Tao, Dacheng, Hu, Xia
Large language models (LLMs) have achieved state-of-the-art performance on a series of natural language understanding tasks. However, these LLMs might rely on dataset bias and artifacts as shortcuts for prediction. This has significantly affected their generalizability and adversarial robustness. In this paper, we provide a review of recent developments that address the shortcut learning and robustness challenge of LLMs. We first introduce the concepts of shortcut learning of language models. We then introduce methods to identify shortcut learning behavior in language models, characterize the reasons for shortcut learning, as well as introduce mitigation solutions. Finally, we discuss key research challenges and potential research directions in order to advance the field of LLMs.
Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents
Wu, Yue, Min, So Yeon, Bisk, Yonatan, Salakhutdinov, Ruslan, Azaria, Amos, Li, Yuanzhi, Mitchell, Tom, Prabhumoye, Shrimai
Pre-trained large language models (LLMs) capture procedural knowledge about the world. Recent work has leveraged LLMs' ability to generate abstract plans to simplify challenging control tasks, either by action scoring, or action modeling (fine-tuning). However, the transformer architecture inherits several constraints that make it difficult for the LLM to directly serve as the agent: e.g. limited input lengths, fine-tuning inefficiency, bias from pre-training, and incompatibility with non-text environments. To maintain compatibility with a low-level trainable actor, we propose to instead use the knowledge in LLMs to simplify the control problem, rather than solving it. We propose the Plan, Eliminate, and Track (PET) framework. The Plan module translates a task description into a list of high-level sub-tasks. The Eliminate module masks out irrelevant objects and receptacles from the observation for the current sub-task. Finally, the Track module determines whether the agent has accomplished each sub-task. On the AlfWorld instruction-following benchmark, the PET framework leads to a significant 15% improvement over SOTA for generalization to human goal specifications.
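The division of labor among the three modules can be illustrated with a toy sketch. This is not the authors' implementation: in the paper each module draws on an LLM, whereas the functions below replace that with simple string heuristics purely to show the control flow. All function names, the "then" splitting rule, and the example task are assumptions invented for this illustration.

```python
def plan(task):
    """Stand-in for Plan: split a task description into sub-tasks."""
    return [step.strip() for step in task.split(" then ")]


def eliminate(observation, sub_task):
    """Stand-in for Eliminate: keep only objects the sub-task mentions."""
    return [obj for obj in observation if obj in sub_task]


def track(sub_task, completed):
    """Stand-in for Track: a sub-task is done once it has been logged."""
    return sub_task in completed


# Hypothetical task and observation (not taken from AlfWorld).
task = "take the apple then put the apple in the fridge"
observation = ["apple", "fridge", "sofa", "lamp"]

sub_tasks = plan(task)
completed = []
for sub_task in sub_tasks:
    visible = eliminate(observation, sub_task)
    print(sub_task, "->", visible)
    # A low-level actor would act on `visible` here; we just log success.
    completed.append(sub_task)

all_done = all(track(s, completed) for s in sub_tasks)
```

The point of the structure is that the actor only ever sees a short sub-task and a filtered observation, which is the simplification of the control problem that the abstract describes.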
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Crothers, Evan, Japkowicz, Nathalie, Viktor, Herna
Machine generated text is increasingly difficult to distinguish from human authored text. Powerful open-source models are freely available, and user-friendly tools that democratize access to generative models are proliferating. ChatGPT, which was released shortly after the first edition of this survey, epitomizes these trends. The great potential of state-of-the-art natural language generation (NLG) systems is tempered by the multitude of avenues for abuse. Detection of machine generated text is a key countermeasure for reducing abuse of NLG models, with significant technical challenges and numerous open problems. We provide a survey that includes both 1) an extensive analysis of threat models posed by contemporary NLG systems, and 2) the most complete review of machine generated text detection methods to date. This survey places machine generated text within its cybersecurity and social context, and provides strong guidance for future work addressing the most critical threat models, and ensuring detection systems themselves demonstrate trustworthiness through fairness, robustness, and accountability.