AITopics

2212.1045

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Indonesia > Java > Jakarta > Jakarta (0.04)
(10 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Air (0.93)
Media (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

arXiv.org Artificial Intelligence, Jun-14-2023

DOC: Improving Long Story Coherence With Detailed Outline Control

Yang, Kevin, Klein, Dan, Peng, Nanyun, Tian, Yuandong

We propose the Detailed Outline Control (DOC) framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories. DOC consists of two complementary components: a detailed outliner and a detailed controller. The detailed outliner creates a more detailed, hierarchically structured outline, shifting creative burden from the main drafting procedure to the planning stage. The detailed controller ensures the more detailed outline is still respected during generation by controlling story passages to align with outline details. In human evaluations of automatically generated stories, DOC substantially outperforms a strong Re3 baseline (Yang et al., 2022) on plot coherence (22.5% absolute gain), outline relevance (28.2%), and interestingness (20.7%). Humans also judged DOC to be much more controllable in an interactive generation setting.

large language model, machine learning, natural language, (22 more...)

2212.10077

Country:

Asia > Russia (0.13)
Africa > South Africa (0.13)
North America > United States > New York (0.04)
(11 more...)

Genre:

Personal > Obituary (1.00)
Personal > Interview (1.00)
Research Report > New Finding (0.67)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(8 more...)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
(2 more...)

Engadget, Jun-13-2023, 18:20:44 GMT

OpenAI reportedly warned Microsoft about rushing GPT-4 integration into Bing

OpenAI warned Microsoft early this year about rushing the integration of GPT-4 into Bing without further training, according to The Wall Street Journal. Although Microsoft forged ahead anyway, the alert proved prescient as early users noticed "unhinged" behavior in the Bing AI tool. Rather than buying OpenAI outright, Microsoft took a 49-percent stake in the artificial intelligence startup, a strategy designed to help it avoid antitrust scrutiny. The arrangement gave Microsoft early access to OpenAI's ChatGPT and DALL-E 2 to boost its Bing search engine. Microsoft is also adding OpenAI-powered Copilot to Office and other software products as rival Google scrambles to catch up.

integration, microsoft, openai, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

WSJ.com: WSJD - Technology, Jun-13-2023, 13:00:00 GMT

Microsoft and OpenAI Forge Awkward Partnership

As the companies lead the AI boom, their unconventional arrangement sometimes causes conflict

large language model, machine learning, natural language, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.78)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.78)

The Japan Times, Jun-13-2023, 03:55:35 GMT

Homework will 'never be the same' says ChatGPT founder

Artificial intelligence tools will revolutionize education like calculators did, but will not supplant learning, ChatGPT's founder Sam Altman told students in Tokyo on Monday, defending the new technology. "Probably take-home essays are never going to be quite the same again," the OpenAI chief said in remarks at Keio University. "We have a new tool in education. Sort of like a calculator for words," he said. "And the way we teach people is going to have to change and the way we evaluate students is going to have to change."

large language model, machine learning, say chatgpt founder, (7 more...)

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.31)

Mishra, Aakash, Mittal, Rajat, Jestin, Christy, Tingos, Kostas, Rajpurkar, Pranav

Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using Domain Pre-trained Language Models

Recent advances in zero-shot learning have made it possible to use paired image-text data in place of structured labels, removing the need for expert-annotated datasets. Models such as the CLIP-based CheXzero apply these advances to chest X-ray interpretation. We hypothesize that domain pre-trained models such as CXR-BERT, BlueBERT, and ClinicalBERT can improve the performance of CLIP-like models by supplying domain-specific knowledge when their weights replace the BERT weights, at the cost of breaking the original model's alignment. We evaluate the performance of zero-shot classification models with domain-specific pre-training for detecting low-prevalence pathologies. Even though replacing the weights of the original CLIP-BERT degrades model performance on commonly found pathologies, we show that pre-trained text towers perform markedly better on low-prevalence diseases. This motivates future ensemble models that combine differently trained language models for maximal performance.

artificial intelligence, domain pre-trained language model, machine learning, (2 more...)

2306.08

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (0.80)
Energy > Oil & Gas > Upstream (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.80)

Assessing the Effectiveness of GPT-3 in Detecting False Political Statements: A Case Study on the LIAR Dataset

Buchholz, Mars Gokturk

The detection of false political statements is crucial for maintaining information integrity and preventing the spread of misinformation in society. Historically, state-of-the-art machine learning models have employed various methods for detecting deceptive statements. These methods include the use of metadata (W. Wang et al., 2018), n-gram analysis (Singh et al., 2021), and linguistic (Wu et al., 2022) and stylometric (Islam et al., 2020) features. Recent large language models, such as GPT-3 (Brown et al., 2020), have achieved state-of-the-art performance on a wide range of tasks. In this study, we conducted experiments with GPT-3 on the LIAR dataset (W. Wang et al., 2018) and achieved higher accuracy than state-of-the-art models without using any additional metadata or linguistic features. Additionally, we experimented with zero-shot learning using a carefully designed prompt and achieved near state-of-the-art performance. An advantage of this approach is that the model provides evidence for its decision, which adds transparency to the model's decision-making and gives users a chance to verify the validity of that evidence.

large language model, machine learning, natural language, (18 more...)

2306.0819

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > Mexico (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Media > News (1.00)
Law (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Truong, Thinh Hung, Baldwin, Timothy, Verspoor, Karin, Cohn, Trevor

Language models are not naysayers: An analysis of language models on negation benchmarks

Negation has been shown to be a major bottleneck for masked language models, such as BERT. However, whether this finding still holds for larger auto-regressive language models ("LLMs") has not been studied comprehensively. With the ever-increasing volume of research and applications of LLMs, we take a step back to evaluate the ability of current-generation LLMs to handle negation, a fundamental linguistic phenomenon that is central to language understanding. We evaluate different LLMs, including the open-source GPT-Neo, GPT-3, and InstructGPT, against a wide range of negation benchmarks. Through systematic experimentation with varying model sizes and prompts, we show that LLMs have several limitations, including insensitivity to the presence of negation, an inability to capture the lexical semantics of negation, and a failure to reason under negation.

large language model, machine learning, natural language, (17 more...)

2306.08189

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Safeguarding Data in Multimodal AI: A Differentially Private Approach to CLIP Training

Huang, Alyssa, Liu, Peihan, Nakada, Ryumei, Zhang, Linjun, Zhang, Wanrong

The surge in multimodal AI's success has sparked concerns over data privacy in vision-and-language tasks. While CLIP has revolutionized multimodal learning through joint training on images and text, its potential to unintentionally disclose sensitive information necessitates the integration of privacy-preserving mechanisms. We introduce a differentially private adaptation of the Contrastive Language-Image Pretraining (CLIP) model that effectively addresses privacy concerns while retaining accuracy. Our proposed method, Dp-CLIP, is rigorously evaluated on benchmark datasets encompassing diverse vision-and-language tasks such as image classification and visual question answering. We demonstrate that our approach retains performance on par with the standard non-private CLIP model. Furthermore, we analyze our proposed algorithm under linear representation settings. We derive the convergence rate of our algorithm and show a trade-off between utility and privacy when gradients are clipped per-batch and the loss function does not satisfy smoothness conditions assumed in the literature for the analysis of DP-SGD.
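The per-batch clipping mentioned in the abstract can be illustrated with a minimal sketch. This is a generic clip-then-add-Gaussian-noise step, not the authors' Dp-CLIP algorithm; the function name `clip_and_noise` and the parameters `clip_norm` and `noise_multiplier` are assumptions introduced here for illustration.

```python
import math
import random


def clip_and_noise(batch_grad, clip_norm, noise_multiplier, rng):
    """Clip one batch-level gradient to L2 norm `clip_norm`, then add Gaussian noise.

    Hypothetical helper: a generic sketch of per-batch clipping, not the
    procedure from the paper.
    """
    norm = math.sqrt(sum(g * g for g in batch_grad))
    # Scale down only when the gradient is longer than the clipping threshold.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in batch_grad]
    # Noise standard deviation is proportional to the clipping threshold.
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


rng = random.Random(0)
noisy_grad = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

A larger `noise_multiplier` gives a stronger privacy guarantee at the cost of a noisier update, which is the utility-privacy trade-off the abstract refers to.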

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2306.08173

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation

Chai, Yuji, Gkountouras, John, Ko, Glenn G., Brooks, David, Wei, Gu-Yeon

We introduce a method that dramatically reduces fine-tuning VRAM requirements and rectifies quantization errors in quantized Large Language Models. First, we develop an extremely memory-efficient fine-tuning (EMEF) method for quantized models using Low-Rank Adaptation (LoRA), and drawing upon it, we construct an error-correcting algorithm designed to minimize errors induced by the quantization process. Our method reduces the memory requirements by up to 5.6 times, which enables fine-tuning a 7 billion parameter Large Language Model (LLM) on consumer laptops. At the same time, we propose a Low-Rank Error Correction (LREC) method that exploits the added LoRA layers to narrow the gap between the quantized model and its floating-point counterpart. Our error correction framework leads to a fully functional INT2 quantized LLM with the capacity to generate coherent English text. To the best of our knowledge, this is the first INT2 Large Language Model to reach this level of performance. The overhead of our method is merely a 1.05 times increase in model size, which translates to an effective precision of INT2.1. Our method also readily generalizes to other quantization standards, such as INT3, INT4, and INT8, restoring their lost performance, which marks a significant milestone in the field of model quantization. The strategies delineated in this paper hold promising implications for the future development and optimization of quantized models, marking a pivotal shift in the landscape of low-resource machine learning computations.
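The "INT2.1" figure follows from simple arithmetic: 2 bits per weight times a 1.05 size ratio gives 2.1 bits per weight on average. The sketch below shows that calculation, plus one way a low-rank adapter could produce such a ratio. The function names, the layer shape, the rank, and the 16-bit adapter precision are all hypothetical values chosen for illustration, not figures from the paper.

```python
def effective_bits(base_bits, size_ratio):
    """Average bits per quantized weight once the size overhead is amortized."""
    return base_bits * size_ratio


def lora_size_ratio(d_out, d_in, rank, base_bits, adapter_bits):
    """Size of (quantized layer + low-rank adapter) relative to the quantized layer.

    Assumes one d_out x d_in weight matrix with two adapter factors of shapes
    d_out x rank and rank x d_in (a hypothetical layout for illustration).
    """
    base = d_out * d_in * base_bits
    adapter = rank * (d_out + d_in) * adapter_bits
    return (base + adapter) / base


# The abstract's numbers: INT2 weights with a 1.05x size increase.
print(effective_bits(2, 1.05))  # 2.1

# Hypothetical layer: 4096 x 4096, rank 8, 16-bit adapter weights.
print(lora_size_ratio(4096, 4096, 8, 2, 16))  # 1.03125
```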

large language model, machine learning, natural language, (18 more...)

2306.08162

Country:

Europe > France (0.94)
North America > Canada > Saskatchewan (0.04)
North America > Canada > Quebec (0.04)
(15 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government > Europe Government > France Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)