AITopics

GAMMT: Generative Ambiguity Modeling Using Multiple Transformers

Xu, Xingcheng

We introduce a novel model called GAMMT (Generative Ambiguity Models using Multiple Transformers) for sequential data that is based on sets of probabilities. Unlike conventional models, our approach acknowledges that the data generation process of a sequence is not deterministic, but rather ambiguous and influenced by a set of probabilities. To capture this ambiguity, GAMMT employs multiple parallel transformers that are linked by a selection mechanism, allowing for the approximation of ambiguous probabilities. The generative nature of our approach also enables multiple representations of input tokens and sequences. While our models have not yet undergone experimental validation, we believe that our model has great potential to achieve high quality and diversity in modeling sequences with uncertain data generation processes.

ambiguity, large language model, machine learning, (17 more...)

2211.09812

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > Promising Solution (0.35)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Spencer, Samuel T., Joshi, Vikas, Mitchell, Alison M. W.

Can AI Put Gamma-Ray Astrophysicists Out of a Job?

In what will likely be a litany of generative-model-themed arXiv submissions celebrating April the 1st, we evaluate the capacity of state-of-the-art transformer models to create a paper detailing the detection of a Pulsar Wind Nebula with a non-existent Imaging Atmospheric Cherenkov Telescope (IACT) Array. We do this to evaluate the ability of such models to interpret astronomical observations and sources based on language information alone, and to assess potential means by which fraudulently generated scientific papers could be identified during peer review (given that reliable generative model watermarking has yet to be deployed for these tools). We conclude that our jobs as astronomers are safe for the time being. From this point on, prompts given to ChatGPT and Stable Diffusion are shown in orange, text generated by ChatGPT is shown in black, whereas analysis by the (human) authors is in blue.

large language model, machine learning, natural language, (17 more...)

2303.17853

Country:

Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.72)

One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era

Zhang, Chaoning, Zhang, Chenshuang, Li, Chenghao, Qiao, Yu, Zheng, Sheng, Dam, Sumit Kumar, Zhang, Mengchun, Kim, Jung Uk, Kim, Seong Tae, Choi, Jinwoo, Park, Gyeong-Moon, Bae, Sung-Ho, Lee, Lik-Hang, Hui, Pan, Kweon, In So, Hong, Choong Seon

OpenAI has recently released GPT-4 (a.k.a. ChatGPT plus), which is demonstrated to be one small step for generative AI (GAI), but one giant leap for artificial general intelligence (AGI). Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage. Such unprecedented attention has also motivated numerous researchers to investigate ChatGPT from various aspects. According to Google scholar, there are more than 500 articles with ChatGPT in their titles or mentioning it in their abstracts. Considering this, a review is urgently needed, and our work fills this gap. Overall, this work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges. Moreover, we present an outlook on how ChatGPT might evolve to realize general-purpose AIGC (a.k.a. AI-generated content), which will be a significant milestone for the development of AGI.

large language model, machine learning, natural language, (18 more...)

2304.06488

Country:

Asia > South Korea (0.05)
Asia > China > Hong Kong (0.04)
Europe > Spain (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

Fang, Tao, Yang, Shu, Lan, Kaixin, Wong, Derek F., Hu, Jinpeng, Chao, Lidia S., Zhang, Yue

ChatGPT, a large-scale language model based on the advanced GPT-3.5 architecture, has shown remarkable potential in various Natural Language Processing (NLP) tasks. However, there is currently a dearth of comprehensive study exploring its potential in the area of Grammatical Error Correction (GEC). To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT. Our evaluation involves assessing ChatGPT's performance on five official test sets in three different languages, along with three document-level GEC test sets in English. Our experimental results and human evaluations demonstrate that ChatGPT has excellent error detection capabilities and can freely correct errors to make the corrected sentences very fluent, possibly due to its over-correction tendencies and not adhering to the principle of minimal edits. Additionally, its performance in non-English and low-resource settings highlights its potential in multilingual GEC tasks. However, further analysis of various types of errors at the document-level has shown that ChatGPT cannot effectively correct agreement, coreference, tense errors across sentences, and cross-sentence boundary errors.

large language model, machine learning, natural language, (18 more...)

2304.01746

Country:

Asia > Macao (0.04)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(10 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

Cheng, Zhi-Qi, Dai, Qi, Li, Siyao, Sun, Jingdong, Mitamura, Teruko, Hauptmann, Alexander G.

Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods suffer from either heuristic rules or an over-reliance on OCR systems, resulting in suboptimal performance. To address these issues, we present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks. Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks. By learning the rules of charts automatically from annotated datasets, our approach eliminates the need for manual rule-making, reducing effort and enhancing accuracy.~We also introduce a data variable replacement technique and extend the input and position embeddings of the pre-trained model for cross-task training. We evaluate ChartReader on Chart-to-Table, ChartQA, and Chart-to-Text tasks, demonstrating its superiority over existing methods. Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model. Moreover, our approach offers opportunities for plug-and-play integration with mainstream LLMs such as T5 and TaPas, extending their capability to chart comprehension tasks. The code is available at https://github.com/zhiqic/ChartReader.

large language model, machine learning, natural language, (20 more...)

2304.02173

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.71)
(2 more...)

REFINER: Reasoning Feedback on Intermediate Representations

Paul, Debjit, Ismayilzada, Mete, Peyrard, Maxime, Borges, Beatriz, Bosselut, Antoine, West, Robert, Faltings, Boi

Language models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However, these intermediate inference steps may be inappropriate deductions from the initial context and lead to incorrect final predictions. Here we introduce REFINER, a framework for finetuning LMs to explicitly generate intermediate reasoning steps while interacting with a critic model that provides automated feedback on the reasoning. Specifically, the critic provides structured feedback that the reasoning LM uses to iteratively improve its intermediate arguments. Empirical evaluations of REFINER on three diverse reasoning tasks show significant improvements over baseline LMs of comparable scale. Furthermore, when using GPT3.5 as the reasoner, the trained critic significantly improves reasoning without finetuning the reasoner. Finally, our critic model is trained without expensive human-in-the-loop data but can be substituted with humans at inference time.

computational linguistic, large language model, machine learning, (19 more...)

2304.01904

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.66)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (0.48)

Lin, Fangzhen, Shou, Ziyi, Chen, Chengcai

Using Language Models For Knowledge Acquisition in Natural Language Reasoning Problems

For a natural language problem that requires some non-trivial reasoning to solve, there are at least two ways to do it using a large language model (LLM). One is to ask it to solve it directly. The other is to use it to extract the facts from the problem text and then use a theorem prover to solve it. In this note, we compare the two methods using ChatGPT and GPT4 on a series of logic word puzzles, and conclude that the latter is the right approach.

large language model, machine learning, natural language, (18 more...)

2304.01771

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sadeq, Nafis, Kang, Byungkyu, Lamba, Prarit, McAuley, Julian

Unsupervised Improvement of Factual Knowledge in Language Models

Masked language modeling (MLM) plays a key role in pretraining large language models. But the MLM objective is often dominated by high-frequency words that are sub-optimal for learning factual knowledge. In this work, we propose an approach for influencing MLM pretraining in a way that can improve language model performance on a variety of knowledge-intensive tasks. We force the language model to prioritize informative words in a fully unsupervised way. Experiments demonstrate that the proposed approach can significantly improve the performance of pretrained language models on tasks such as factual recall, question answering, sentiment analysis, and natural language inference in a closed-book setting.

computational linguistic, large language model, natural language, (18 more...)

2304.01597

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Finland > Uusimaa > Helsinki (0.05)
South America > Brazil (0.04)
(18 more...)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment (0.47)
Government (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

Tursun, Osman, Denman, Simon, Sridharan, Sridha, Fookes, Clinton

Towards Self-Explainability of Deep Neural Networks with Heatmap Captioning and Large-Language Models

Heatmaps are widely used to interpret deep neural networks, particularly for computer vision tasks, and the heatmap-based explainable AI (XAI) techniques are a well-researched topic. However, most studies concentrate on enhancing the quality of the generated heatmap or discovering alternate heatmap generation techniques, and little effort has been devoted to making heatmap-based XAI automatic, interactive, scalable, and accessible. To address this gap, we propose a framework that includes two modules: (1) context modelling and (2) reasoning. We proposed a template-based image captioning approach for context modelling to create text-based contextual information from the heatmap and input data. The reasoning module leverages a large language model to provide explanations in combination with specialised knowledge. Our qualitative experiments demonstrate the effectiveness of our framework and heatmap captioning approach. The code for the proposed template-based heatmap captioning approach will be publicly available.

large language model, machine learning, natural language, (20 more...)

2304.02202

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Oceania > Australia > Queensland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)