AITopics | Large Language Model

Large Language Model

She Built an App to Block Harassment on Twitter. Elon Musk Killed It

TIME - Tech | Jun-2-2023, 15:01:59 GMT

Tracy Chou launched the Twitter app Block Party in 2021 to help users escape targeted harassment campaigns that she--as an Asian American woman--knew from personal experience could ostracize vulnerable voices from the public conversation. But on Wednesday Block Party closed its doors, becoming the latest victim of soaring new bills imposed by a struggling Twitter under new owner Elon Musk. Under Twitter's former ownership, Chou struck a deal with the company for free access to data--a win-win arrangement that would allow Block Party to grow and provide Twitter with a valuable anti-harassment tool to which it didn't have to devote expensive engineering time. But Chou tells TIME that following the recent expiration of that contract, Twitter wanted Block Party to pay $42,000 per month for access to enough data to keep the app running. There was no way Block Party could afford the figure, she says.

large language model, natural language, social media, (22 more...)

TIME - Tech

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

The Instagram Founders' News App Artifact Is Actually an AI Play

WIRED | Jun-2-2023, 13:00:00 GMT

Today, Artifact is taking another jump on the generative-AI rocket ship in an attempt to address an annoying problem--clickbaity headlines. The app already offers a way for users to flag clickbait stories, and if multiple people tag an article, Artifact won't spread it. But, Systrom explains, sometimes the problem isn't with the story but the headline. It might promise too much, or mislead, or lure the reader into clicking just to find some information that's held back from the headline. From the publisher's viewpoint, winning more clicks is a big plus--but it's frustrating to users, who might feel they have been manipulated.

large language model, machine learning, natural language, (18 more...)

WIRED

Industry:

Media > News (0.54)
Information Technology > Services (0.51)
Information Technology > Software (0.42)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.56)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.37)

ChatGPT took their jobs. Now they walk dogs and fix air conditioners.

Washington Post - Technology News | Jun-2-2023, 10:07:54 GMT

Companies that replaced workers with chatbots have faced high-profile stumbles. When the technology news site CNET used artificial intelligence to write articles, the results were riddled with errors and resulted in lengthy corrections. A lawyer who relied on ChatGPT for a legal brief cited numerous fictitious cases. And the National Eating Disorders Association, which laid off people staffing its helpline and reportedly replaced them with a chatbot, suspended its use of the technology after it doled out insensitive and harmful advice.

chatbot, chatgpt, dog and fix air conditioner

Washington Post - Technology News

Industry:

Law (0.78)
Construction & Engineering > HVAC (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods

Zhang, Zhuo, Yang, Yuanhang, Dai, Yong, Qu, Lizhen, Xu, Zenglin

arXiv.org Artificial Intelligence | Jun-2-2023

With increasing privacy concerns on data, recent studies have made significant progress using federated learning (FL) on privacy-sensitive natural language processing (NLP) tasks. Much of the literature suggests that fully fine-tuning pre-trained language models (PLMs) in the FL paradigm can mitigate the data heterogeneity problem and close the performance gap with centralized training. However, large PLMs bring prohibitive communication overhead and local model adaptation costs to the FL system. To this end, we introduce various parameter-efficient tuning (PETuning) methods into federated learning. Specifically, we provide a holistic empirical study of representative PLM tuning methods in FL. The experimental results cover the analysis of data heterogeneity levels, data scales, and different FL scenarios. Overall communication overhead can be significantly reduced by locally tuning and globally aggregating lightweight model parameters while maintaining acceptable performance in various FL settings. To facilitate research on PETuning in FL, we also develop a federated tuning framework, FedPETuning, which allows practitioners to conveniently exploit different PETuning methods under the FL training paradigm. The source code is available at https://github.com/iezhuozhuo/FedETuning/tree/deltaTuning.

large language model, machine learning, petuning method, (19 more...)

arXiv.org Artificial Intelligence

2212.10025

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
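
The abstract above describes "locally tuning and globally aggregating lightweight model parameters". A minimal sketch of one such round follows, assuming a FedAvg-style weighted average and a plain gradient step; the names `local_update`, `fed_avg`, and `adapter.w` are hypothetical and are not taken from the FedPETuning code base.

```python
from typing import Dict, List

# Lightweight (adapter-style) parameters only; the frozen PLM weights
# are never sent over the network in this sketch.
Params = Dict[str, List[float]]


def local_update(adapter: Params, grads: Params, lr: float = 0.1) -> Params:
    """One gradient step on the lightweight parameters of a single client."""
    return {
        name: [w - lr * g for w, g in zip(weights, grads[name])]
        for name, weights in adapter.items()
    }


def fed_avg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client adapters, weighted by each client's number of examples."""
    total = sum(client_sizes)
    return {
        name: [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(client_params[0][name]))
        ]
        for name in client_params[0]
    }


# Toy round: two clients with made-up gradients and dataset sizes.
global_adapter: Params = {"adapter.w": [0.0, 0.0]}
client_grads = [{"adapter.w": [1.0, -1.0]}, {"adapter.w": [3.0, 1.0]}]
client_sizes = [1, 3]

updates = [local_update(global_adapter, g) for g in client_grads]
global_adapter = fed_avg(updates, client_sizes)
```

Only the entries of `global_adapter` cross the network in each round, which is where the communication saving described in the abstract would come from.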

Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute

de Jong, Michiel, Zemlyanskiy, Yury, FitzGerald, Nicholas, Ainslie, Joshua, Sanghai, Sumit, Sha, Fei, Cohen, William

arXiv.org Artificial Intelligence | Jun-2-2023

Retrieval-augmented language models such as Fusion-in-Decoder are powerful, setting the state of the art on a variety of knowledge-intensive tasks. However, they are also expensive, due to the need to encode a large number of retrieved passages. Some work avoids this cost by pre-encoding a text corpus into a memory and retrieving dense representations directly. However, pre-encoding memory incurs a severe quality penalty as the memory representations are not conditioned on the current input. We propose LUMEN, a hybrid between these two extremes, pre-computing the majority of the retrieval representation and completing the encoding on the fly using a live encoder that is conditioned on the question and fine-tuned for the task. We show that LUMEN significantly outperforms pure memory on multiple question-answering tasks while being much cheaper than FiD, and outperforms both for any given compute budget. Moreover, the advantage of LUMEN over FiD increases with model size.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2301.10448

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California (0.14)
South America > Colombia > Meta Department > Villavicencio (0.04)
(10 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

ProgSG: Cross-Modality Representation Learning for Programs in Electronic Design Automation

Bai, Yunsheng, Sohrabizadeh, Atefeh, Qin, Zongyue, Hu, Ziniu, Sun, Yizhou, Cong, Jason

arXiv.org Artificial Intelligence | Jun-2-2023

Recent years have witnessed the growing popularity of domain-specific accelerators (DSAs), such as Google's TPUs, for accelerating applications such as deep learning, search, and autonomous driving. To facilitate DSA design, high-level synthesis (HLS) is used, which allows a developer to compile a high-level description in the form of software code in C and C++ into a design in a low-level hardware description language (such as VHDL or Verilog), which is eventually synthesized into a DSA on an ASIC (application-specific integrated circuit) or FPGA (field-programmable gate array). However, existing HLS tools still require microarchitecture decisions, expressed in terms of pragmas (such as directives for parallelization and pipelining). To enable more people to design DSAs, it is desirable to automate such decisions with the help of deep learning for predicting the quality of HLS designs. This requires a deeper understanding of the program, which is a combination of original code and pragmas. Naturally, these programs can be considered as sequence data, for which large language models (LLMs) can help. In addition, these programs can be compiled and converted into a control data flow graph (CDFG), and the compiler also provides fine-grained alignment between the code tokens and the CDFG nodes. However, existing works either fail to leverage both modalities or combine the two in shallow or coarse ways. We propose ProgSG, which allows the source code sequence modality and the graph modality to interact with each other in a deep and fine-grained way. To alleviate the scarcity of labeled designs, a pre-training method is proposed based on a suite of compiler data flow analysis tasks. Experimental results on two benchmark datasets show the superiority of ProgSG over baseline methods that either consider only one modality or combine the two without utilizing the alignment information.

cross-modality representation learning, large language model, machine learning, (5 more...)

arXiv.org Artificial Intelligence

2305.10838

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.44)

The Science of Detecting LLM-Generated Texts

Tang, Ruixiang, Chuang, Yu-Neng, Hu, Xia

arXiv.org Artificial Intelligence | Jun-2-2023

The emergence of large language models (LLMs) has resulted in the production of LLM-generated texts that are highly sophisticated and almost indistinguishable from texts written by humans. However, this has also sparked concerns about the potential misuse of such texts, such as spreading misinformation and causing disruptions in the education system. Although many detection approaches have been proposed, a comprehensive understanding of the achievements and challenges is still lacking. This survey aims to provide an overview of existing LLM-generated text detection techniques and enhance the control and regulation of language generation models. Furthermore, we emphasize crucial considerations for future research, including the development of comprehensive evaluation metrics and the threat posed by open-source LLMs, to drive progress in the area of LLM-generated text detection.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2303.07205

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas (0.04)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Distinguishing ChatGPT(-3.5, -4)-generated and human-written papers through Japanese stylometric analysis

Zaitsu, Wataru, Jin, Mingzhe

arXiv.org Artificial Intelligence | Jun-2-2023

In the first half of 2023, text-generative artificial intelligence (AI), including ChatGPT, equipped with GPT-3.5 and GPT-4, from OpenAI, has attracted considerable attention worldwide. In this study, we first compared Japanese stylometric features of texts generated by GPT (-3.5 and -4) and those written by humans. We performed multi-dimensional scaling (MDS) to confirm the distributions of 216 texts of three classes (72 academic papers written by 36 single authors, 72 texts generated by GPT-3.5, and 72 texts generated by GPT-4 on the basis of the titles of the aforementioned papers), focusing on the following stylometric features: (1) bigrams of parts-of-speech, (2) bigrams of postpositional particle words, (3) positioning of commas, and (4) rate of function words. MDS revealed distinct distributions for GPT (-3.5 and -4) and human texts on each stylometric feature. Although GPT-4 is more powerful than GPT-3.5 because it has more parameters, the GPT-3.5 and GPT-4 distributions are likely to overlap. These results indicate that although the number of parameters may increase in the future, GPT-generated texts may not come close to those written by humans in terms of stylometric features. Second, we verified the classification performance of random forest (RF) for two classes (GPT and human), focusing on Japanese stylometric features. This study revealed the high performance of RF on each stylometric feature: the RF classifier focusing on the rate of function words achieved 98.1% accuracy. Furthermore, the RF classifier focusing on all stylometric features reached 100% on all performance indexes (accuracy, recall, precision, and F1 score). This study concluded that, at this stage, ChatGPT-generated texts can be discriminated from human-written texts, at least in Japanese.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2304.05534

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.72)
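
Two of the stylometric features named in the abstract above, part-of-speech bigrams and the rate of function words, can be sketched in a few lines. This is only an illustration under stated assumptions: the input is taken to be already tokenized and tagged by some morphological analyzer, and the tag names and the `FUNCTION_POS` set are hypothetical rather than the authors' definitions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)

# Assumption: which tags count as function words; the paper may define this differently.
FUNCTION_POS = {"particle", "auxiliary"}


def pos_bigram_freqs(tokens: List[Token]) -> Dict[Tuple[str, str], float]:
    """Relative frequency of each pair of adjacent part-of-speech tags."""
    tags = [pos for _, pos in tokens]
    pairs = Counter(zip(tags, tags[1:]))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()} if total else {}


def function_word_rate(tokens: List[Token]) -> float:
    """Share of tokens whose tag is in FUNCTION_POS."""
    if not tokens:
        return 0.0
    return sum(1 for _, pos in tokens if pos in FUNCTION_POS) / len(tokens)


# Toy sentence, hand-tagged for illustration only.
sample: List[Token] = [
    ("私", "noun"),
    ("は", "particle"),
    ("本", "noun"),
    ("を", "particle"),
    ("読む", "verb"),
]

bigrams = pos_bigram_freqs(sample)
rate = function_word_rate(sample)
```

Feature vectors built this way for each text could then be passed to a classifier such as the random forest the abstract mentions; that step is left out here.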

Linguistic Properties of Truthful Response

Lee, Bruce W., Arockiaraj, Benedict Florance, Jin, Helen

arXiv.org Artificial Intelligence | Jun-2-2023

We investigate the phenomenon of an LLM's untruthful response using a large set of 220 handcrafted linguistic features. We focus on GPT-3 models and find that the linguistic profiles of responses are similar across model sizes. That is, how varying-sized LLMs respond to given prompts stays similar at the level of linguistic properties. We expand upon this finding by training support vector machines that rely only upon the stylistic components of model responses to classify the truthfulness of statements. Though the dataset size limits our current findings, we show that truthfulness detection may be possible without evaluating the content itself. At the same time, the limited scope of our experiments must be taken into account in interpreting the results.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2305.15875

Country:

North America > United States > Pennsylvania (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

ThinkSum: Probabilistic reasoning over sets using large language models

Ozturkler, Batu, Malkin, Nikolay, Wang, Zhen, Jojic, Nebojsa

arXiv.org Artificial Intelligence | Jun-2-2023

Large language models (LLMs) have a substantial capacity for high-level analogical reasoning: reproducing patterns in linear text that occur in their training data (zero-shot evaluation) or in the provided context (few-shot in-context learning). However, recent studies show that even the more advanced LLMs fail in scenarios that require reasoning over multiple objects or facts and making sequences of logical deductions. We propose a two-stage probabilistic inference paradigm, ThinkSum, which reasons over sets of objects or facts in a structured manner. In the first stage (Think - retrieval of associations), an LLM is queried in parallel over a set of phrases extracted from the prompt or an auxiliary model call. In the second stage (Sum - probabilistic inference or reasoning), the results of these queries are aggregated to make the final prediction. We demonstrate the possibilities and advantages of ThinkSum on the BIG-bench suite of LLM evaluation tasks, achieving improvements over the state of the art using GPT-family models on thirteen difficult tasks, often with far smaller model variants. We also compare and contrast ThinkSum with other proposed modifications to direct prompting of LLMs, such as variants of chain-of-thought prompting. Our results suggest that because the probabilistic inference in ThinkSum is performed outside of calls to the LLM, ThinkSum is less sensitive to prompt design, yields more interpretable predictions, and can be flexibly combined with latent variable models to extract structured knowledge from LLMs. Overall, our proposed paradigm represents a promising approach for enhancing the reasoning capabilities of LLMs.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2210.01293

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Canada > Quebec > Montreal (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Sports (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
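
The two stages described in the ThinkSum abstract above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the LLM is replaced by a hypothetical lookup table (`TOY_PROBS`, `toy_llm_prob`), and averaging is just one possible aggregation rule chosen here for simplicity.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM likelihood query; a real system would
# score each candidate continuation with a language model.
TOY_PROBS = {
    ("a robin", "can fly"): 0.9,
    ("a robin", "can swim"): 0.1,
    ("a sparrow", "can fly"): 0.8,
    ("a sparrow", "can swim"): 0.2,
}


def toy_llm_prob(phrase: str, candidate: str) -> float:
    return TOY_PROBS[(phrase, candidate)]


def think(
    phrases: List[str],
    candidates: List[str],
    llm_prob: Callable[[str, str], float],
) -> List[Dict[str, float]]:
    """Think stage: query the model once per phrase (independent calls)."""
    return [{c: llm_prob(p, c) for c in candidates} for p in phrases]


def sum_stage(per_phrase: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum stage: aggregate outside the model, here by a simple average."""
    n = len(per_phrase)
    return {c: sum(d[c] for d in per_phrase) / n for c in per_phrase[0]}


phrases = ["a robin", "a sparrow"]
candidates = ["can fly", "can swim"]

scores = sum_stage(think(phrases, candidates, toy_llm_prob))
best = max(scores, key=scores.get)
```

Because the aggregation happens in ordinary code rather than inside a prompt, the averaging rule could be swapped for a product or another probabilistic model without changing the queries.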
