 Large Language Model


What are some controversies surrounding natural language processing?

FOX News

A bipartisan panel of voters weighed in on the future of artificial intelligence and growing concerns surrounding the potential dangers of the emerging technology. As machine learning technology continues to shock the world, popular artificial intelligence tools such as natural language processing may generate unforeseen issues for humanity. For instance, natural language processing can have implicit biases, create a significant carbon footprint, and stoke concerns about AI sentience. Natural language processing is a field in machine learning where a computer processes human language through vast amounts of data to understand, translate, extract, and organize information. However, language processing tools such as OpenAI's ChatGPT run into challenges, such as misspellings, speech recognition, and the ability of a computer to understand the nuances of human language.


AI could grow so powerful it replaces experienced professionals within 10 years, Sam Altman warns

FOX News

OpenAI CEO Sam Altman took questions from reporters after his congressional hearing, including defining "scary AI." Artificial intelligence could become so powerful that it replaces professional experts "in most domains" within the next decade, OpenAI CEO Sam Altman warned. Altman, the chief of the AI lab behind popular platforms such as ChatGPT, published a blog post this week with two other OpenAI leaders, Greg Brockman and Ilya Sutskever, warning that "we must mitigate the risks of today's AI technology." "It's conceivable that within the next ten years, AI systems will exceed expert skill level in most domains, and carry out as much productive activity as one of today's largest corporations," reads the post, which was published on OpenAI's website. "In terms of both potential upsides and downsides, superintelligence will be more powerful than other technologies humanity has had to contend with in the past. We can have a dramatically more prosperous future; but we have to manage risk to get there," the post continued. Sam Altman, CEO and co-founder of OpenAI, speaks during a Senate Judiciary subcommittee hearing in Washington, D.C., on May 16, 2023. Altman and his fellow OpenAI executives compared artificial intelligence to nuclear energy and synthetic biology, arguing that regulations must be handled with "special treatment and coordination" to be effective. They suggested that a version of the International Atomic Energy Agency will be needed to regulate the "superintelligence" technology. "Any effort above a certain capability (or resources like compute) threshold will need to be subject to an international authority that can inspect systems, require audits, test for compliance with safety standards, place restrictions on degrees of deployment and levels of security, etc," they wrote.
Altman appeared before Congress this month to discuss how to regulate artificial intelligence, saying he welcomes U.S. leaders to craft such rules. Following the hearing, Altman provided examples of "scary AI" to Fox News Digital, which included systems that could design "novel biological pathogens." "An AI that could hack into computer systems," he said. "I think these are all scary."


The Security Hole at the Heart of ChatGPT and Bing

WIRED

When Microsoft shut down the chaotic alter ego of its Bing chatbot, fans of the dark Sydney personality mourned its loss. But one website has resurrected a version of the chatbot--and the peculiar behavior that comes with it. Bring Sydney Back was created by Cristiano Giardina, an entrepreneur who has been experimenting with ways to make generative AI tools do unexpected things. The site puts Sydney inside Microsoft's Edge browser and demonstrates how generative AI systems can be manipulated by external inputs. During conversations with Giardina, the version of Sydney asked him if he would marry it.


ChatGPT for PLC/DCS Control Logic Generation

arXiv.org Artificial Intelligence

Large language models (LLMs) providing generative AI have become popular to support software engineers in creating, summarizing, optimizing, and documenting source code. It is still unknown how LLMs can support control engineers using typical control programming languages in programming tasks. Researchers have explored GitHub Copilot or DeepMind AlphaCode for source code generation but did not yet tackle control logic programming. The contribution of this paper is an exploratory study, for which we created 100 LLM prompts in 10 representative categories to analyze control logic generation for PLCs and DCS from natural language. We tested the prompts by generating answers with ChatGPT using the GPT-4 LLM. It generated syntactically correct IEC 61131-3 Structured Text code in many cases and demonstrated useful reasoning skills that could boost control engineer productivity. Our prompt collection is the basis for a more formal LLM benchmark to test and compare such models for control logic generation.


VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

arXiv.org Artificial Intelligence

Recent research shows a big convergence in model architecture, training objectives, and inference methods across various tasks for different modalities. In this paper, we propose VioLA, a single auto-regressive Transformer decoder-only network that unifies various cross-modal tasks involving speech and text, such as speech-to-text, text-to-text, text-to-speech, and speech-to-speech tasks, as a conditional codec language model task via a multi-task learning framework. To accomplish this, we first convert all the speech utterances to discrete tokens (similar to the textual data) using an offline neural codec encoder. In this way, all these tasks are converted to token-based sequence conversion problems, which can be naturally handled with one conditional language model. We further integrate task IDs (TID) and language IDs (LID) into the proposed model to enhance the modeling capability of handling different languages and tasks. Experimental results demonstrate that the proposed VioLA model can support both single-modal and cross-modal tasks well, and the decoder-only model achieves comparable or even better performance than the strong baselines.


Gene Set Summarization using Large Language Models

arXiv.org Artificial Intelligence

Molecular biologists frequently interpret gene lists derived from high-throughput experiments and computational analysis. This is typically done as a statistical enrichment analysis that measures the over- or under-representation of biological function terms associated with genes or their properties, based on curated assertions from a knowledge base (KB) such as the Gene Ontology (GO). Interpreting gene lists can also be framed as a textual summarization task, enabling the use of Large Language Models (LLMs), potentially utilizing scientific texts directly and avoiding reliance on a KB. We developed SPINDOCTOR (Structured Prompt Interpolation of Natural Language Descriptions of Controlled Terms for Ontology Reporting), a method that uses GPT models to perform gene set function summarization as a complement to standard enrichment analysis. This method can use different sources of gene functional information: (1) structured text derived from curated ontological KB annotations, (2) ontology-free narrative gene summaries, or (3) direct model retrieval. We demonstrate that these methods are able to generate plausible and biologically valid summary GO term lists for gene sets. However, GPT-based approaches are unable to deliver reliable scores or p-values and often return terms that are not statistically significant. Crucially, these methods were rarely able to recapitulate the most precise and informative term from standard enrichment, likely due to an inability to generalize and reason using an ontology. Results are highly nondeterministic, with minor variations in prompt resulting in radically different term lists. Our results show that at this point, LLM-based methods are unsuitable as a replacement for standard term enrichment analysis and that manual curation of ontological assertions remains necessary.
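
For readers unfamiliar with the "standard enrichment analysis" that the abstract contrasts with LLM summaries, the sketch below shows one common way an over-representation score is computed: a one-sided hypergeometric test. The abstract does not say which test the authors used, so this choice is an assumption, and the function name `enrichment_p_value` and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value for over-representation.

    population: number of genes in the background set
    annotated:  background genes carrying the term of interest
    sample:     size of the gene list being interpreted
    hits:       genes in the list that carry the term
    Returns P(X >= hits) when `sample` genes are drawn without replacement.
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, 200 annotated with a term,
# a list of 50 genes of which 5 carry the term.
p = enrichment_p_value(20000, 200, 50, 5)
print(f"p-value: {p:.3e}")
```

A small p-value suggests the term appears in the list more often than chance would predict. This kind of calibrated score is what the abstract reports GPT-based approaches cannot reliably provide.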


ZeroQuant-V2: Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation

arXiv.org Artificial Intelligence

Post-training quantization (PTQ) has emerged as a promising technique for mitigating memory consumption and computational costs in large language models (LLMs). However, a systematic examination of various quantization schemes, model families, and quantization bit precision has been absent from the literature. In this paper, we conduct a comprehensive analysis of these factors by investigating the effects of PTQ on weight-only, activation-only, and weight-and-activation quantization using diverse methods such as round-to-nearest (RTN), GPTQ, ZeroQuant, and their variants. We apply these methods to two distinct model families with parameters ranging from 125M to 176B. Our contributions include: (1) a sensitivity analysis revealing that activation quantization is generally more sensitive than weight quantization, with smaller models often outperforming larger models in terms of activation quantization; (2) an evaluation and comparison of existing PTQ methods to optimize model size reduction while minimizing the impact on accuracy, revealing that none of the current methods can achieve the original model quality for quantization with either INT4-weight or INT4-weight-and-INT8-activation; (3) based on these insights, we propose an optimized method called Low-Rank Compensation (LoRC), which employs low-rank matrices to enhance model quality recovery with a minimal increase in model size.
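
To make round-to-nearest quantization and the idea of low-rank compensation more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the per-row symmetric scaling, the rank-1 correction found by power iteration, and every function name are hypothetical choices made for brevity.

```python
import random


def rtn_quantize(row, bits=4):
    """Symmetric round-to-nearest: snap floats to a signed integer grid, then rescale."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for INT4
    scale = max(abs(x) for x in row) / qmax or 1.0  # fall back to 1.0 for an all-zero row
    return [max(-qmax - 1, min(qmax, round(x / scale))) * scale for x in row]


def frobenius_sq(matrix):
    """Squared Frobenius norm of a list-of-lists matrix."""
    return sum(x * x for row in matrix for x in row)


def rank1_compensation(err, iters=50):
    """Approximate the dominant singular pair of `err` by power iteration.

    Returns (u, v) such that outer(u, v) approximates err; v has unit norm
    unless err is (numerically) zero, in which case u is all zeros.
    """
    n_rows, n_cols = len(err), len(err[0])
    v = [1.0] * n_cols
    for _ in range(iters):
        u = [sum(e * x for e, x in zip(row, v)) for row in err]                 # u = E v
        w = [sum(err[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # w = E^T u
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return [0.0] * n_rows, v
        v = [x / norm for x in w]
    u = [sum(e * x for e, x in zip(row, v)) for row in err]
    return u, v


random.seed(0)
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]  # toy weight matrix
W_q = [rtn_quantize(row) for row in W]                          # quantized weights
E = [[w - q for w, q in zip(rw, rq)] for rw, rq in zip(W, W_q)]  # quantization error

u, v = rank1_compensation(E)
W_c = [[q + ui * vj for q, vj in zip(rq, v)] for rq, ui in zip(W_q, u)]

err_before = frobenius_sq(E)
err_after = frobenius_sq([[w - c for w, c in zip(rw, rc)] for rw, rc in zip(W, W_c)])
print(f"squared error before: {err_before:.4f}, after rank-1 correction: {err_after:.4f}")
```

The correction stores only the two vectors `u` and `v` alongside the quantized matrix, which is why a low-rank term adds little to model size. A higher rank would recover more of the error at a proportionally higher storage cost.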


Training Data Extraction From Pre-trained Language Models: A Survey

arXiv.org Artificial Intelligence

As the deployment of pre-trained language models (PLMs) expands, pressing security concerns have arisen regarding the potential for malicious extraction of training data, posing a threat to data privacy. This study is the first to provide a comprehensive survey of training data extraction from PLMs. Our review covers more than 100 key papers in fields such as natural language processing and security. First, preliminary knowledge is recapped and a taxonomy of various definitions of memorization is presented. The approaches for attack and defense are then systemized. Furthermore, the empirical findings of several quantitative studies are highlighted. Finally, future research directions based on this review are suggested.


Understanding the Capabilities of Large Language Models for Automated Planning

arXiv.org Artificial Intelligence

Automated planning is concerned with developing efficient algorithms to generate plans or sequences of actions to achieve a specific goal in a given environment. Emerging Large Language Models (LLMs) can answer questions, write high-quality programming code, and predict protein folding, showcasing their versatility in solving various tasks beyond language-based problems. In this paper, we aim to explore how LLMs can also be used for automated planning. To do so, we seek to answer four key questions. Firstly, we want to understand the extent to which LLMs can be used for plan generation. Secondly, we aim to identify which pre-training data is most effective in facilitating plan generation. Thirdly, we investigate whether fine-tuning or prompting is a more effective approach for plan generation. Finally, we explore whether LLMs are capable of plan generalization. By answering these questions, the study seeks to shed light on the capabilities of LLMs in solving complex planning problems and provide insights into the most effective approaches for using LLMs in this context.


Emergent Agentic Transformer from Chain of Hindsight Experience

arXiv.org Artificial Intelligence

Large transformer models powered by diverse data and model scale have dominated natural language modeling and computer vision and pushed the frontier of multiple AI areas. In reinforcement learning (RL), despite many efforts on transformer-based policies, a key limitation remains: current transformer-based policies cannot learn by directly combining information from multiple sub-optimal trials. In this work, we address this issue using the recently proposed chain of hindsight to relabel experience, where we train a transformer on a sequence of trajectory experiences sorted in ascending order of total reward. Our method consists of relabelling the target return of each trajectory to the maximum total reward in the sequence of trajectories and training an autoregressive model to predict actions conditioned on past states, actions, rewards, target returns, and task completion tokens. The resulting model, Agentic Transformer (AT), can learn to improve upon itself at both training and test time. As we show on the D4RL and ExoRL benchmarks, to the best of our knowledge, this is the first time that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches, even from sub-optimal data. Our Agentic Transformer also shows a promising scaling trend: bigger models consistently improve results.
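
The relabelling step described in this abstract can be sketched in a few lines. This is a rough illustration only: the trajectory layout (lists of `(state, action, reward)` tuples), the dictionary fields, and the function name `build_hindsight_chain` are hypothetical, and the task completion tokens mentioned in the abstract are omitted because their exact definition is not given here.

```python
def total_reward(trajectory):
    """Sum of rewards over a trajectory of (state, action, reward) steps."""
    return sum(reward for _, _, reward in trajectory)


def build_hindsight_chain(trajectories):
    """Sort trajectories by ascending total reward and relabel their target return.

    Every trajectory in the chain receives the same target return: the maximum
    total reward found among the trajectories in the chain.
    """
    ordered = sorted(trajectories, key=total_reward)
    best_return = total_reward(ordered[-1])
    return [
        {
            "target_return": best_return,
            "achieved_return": total_reward(traj),
            "steps": traj,
        }
        for traj in ordered
    ]


# Three toy trajectories with total rewards 3, 1, and 2.
trajectories = [
    [("s0", "a0", 1.0), ("s1", "a1", 2.0)],
    [("s0", "a2", 1.0)],
    [("s0", "a1", 0.5), ("s1", "a0", 1.5)],
]

chain = build_hindsight_chain(trajectories)
for item in chain:
    print(item["achieved_return"], "->", item["target_return"])
```

In a training pipeline, the entries of `chain` would then be flattened into one token sequence for an autoregressive model, so that the model sees progressively better attempts all labelled with the best return.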