FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
Chen, Xiang, Song, Duanzheng, Gui, Honghao, Wang, Chenxi, Zhang, Ningyu, Yong, Jiang, Huang, Fei, Lv, Chengfei, Zhang, Dan, Chen, Huajun
Despite their impressive generative capabilities, LLMs are hindered by fact-conflicting hallucinations in real-world applications. Accurately identifying hallucinations in texts generated by LLMs, especially in complex inferential scenarios, remains a relatively unexplored area. To address this gap, we present FactCHD, a dedicated benchmark for detecting fact-conflicting hallucinations from LLMs. FactCHD features a diverse dataset spanning various factuality patterns, including vanilla, multi-hop, comparison, and set-operation queries. A distinctive element of FactCHD is its integration of fact-based evidence chains, which significantly deepens the evaluation of detectors' explanations. Experiments on different LLMs expose the shortcomings of current approaches in detecting factual errors accurately. Furthermore, we introduce Truth-Triangulator, which synthesizes reflective considerations from tool-enhanced ChatGPT and a LoRA-tuned Llama 2, aiming to yield more credible detection by combining predictive results and evidence. The benchmark dataset is available at https://github.com/zjunlp/FactCHD.
- Asia > China > Shanghai > Shanghai (0.05)
- North America > United States > New York (0.04)
- Africa > Zambia (0.04)
- Personal > Obituary (0.46)
- Research Report > New Finding (0.46)
- Media (1.00)
- Leisure & Entertainment (0.94)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
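The detection task FactCHD poses can be framed as binary classification over claims, scored with accuracy and F1. A minimal sketch of such scoring follows; the field names (`prediction`, `label`) and the `NON-FACTUAL`/`FACTUAL` label strings are illustrative, not the benchmark's actual schema.

```python
# Score a fact-conflicting hallucination detector on labeled examples.
# Field names and label strings are illustrative, not FactCHD's schema.

def score_detector(examples):
    """Compute accuracy and F1, treating NON-FACTUAL as the positive class."""
    tp = fp = fn = correct = 0
    for ex in examples:
        pred, gold = ex["prediction"], ex["label"]
        correct += pred == gold
        if pred == "NON-FACTUAL" and gold == "NON-FACTUAL":
            tp += 1
        elif pred == "NON-FACTUAL":
            fp += 1  # detector flagged a factual claim
        elif gold == "NON-FACTUAL":
            fn += 1  # detector missed a hallucination
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": correct / len(examples), "f1": f1}
```

Note that this scores only the binary verdict; evaluating the quality of a detector's evidence-chain explanation, as FactCHD does, requires a separate rubric.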
Automatic Instruction Optimization for Open-source LLM Instruction Tuning
Liu, Yilun, Tao, Shimin, Zhao, Xiaofeng, Zhu, Ming, Ma, Wenbing, Zhu, Junhao, Su, Chang, Hou, Yutai, Zhang, Miao, Zhang, Min, Ma, Hongxia, Zhang, Li, Yang, Hao, Jiang, Yanfei
Instruction tuning is crucial for enabling Large Language Models (LLMs) to respond to human instructions. The quality of the instruction pairs used for tuning greatly affects the performance of LLMs. However, manually creating high-quality instruction datasets is costly, so automatic generation of instruction pairs by LLMs has become a popular alternative in the training of open-source LLMs. Several approaches have been proposed to ensure the high quality of LLM-generated instruction datasets. Nevertheless, existing methods either compromise dataset integrity by filtering out a large proportion of samples or are unsuitable for industrial applications. In this paper, instead of discarding low-quality samples, we propose CoachLM, a novel approach that enhances the quality of instruction datasets through automatic revision of samples in the dataset. CoachLM is trained on samples revised by human experts and significantly increases the proportion of high-quality samples in the dataset from 17.7% to 78.9%. The effectiveness of CoachLM is further assessed on various real-world instruction test sets. The results show that CoachLM improves the instruction-following capabilities of the instruction-tuned LLM by an average of 29.9%, even surpassing larger LLMs with nearly twice the number of parameters. Furthermore, CoachLM has been successfully deployed in a data management system for LLMs at Huawei, resulting in an efficiency improvement of up to 20% in the cleaning of 40k real-world instruction pairs. We release the training data and code of CoachLM (https://github.com/lunyiliu/CoachLM).
- Research Report > New Finding (0.66)
- Instructional Material > Course Syllabus & Notes (0.64)
- Instructional Material > Online (0.40)
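The key design choice in the abstract above is revising low-quality samples rather than filtering them, so the dataset keeps its original size. A minimal sketch of that pipeline shape follows; `quality` and `revise` are stand-ins for a learned quality scorer and the trained CoachLM revision model, not the paper's actual interfaces.

```python
# Sketch of revision-based dataset cleaning: keep good instruction pairs,
# revise poor ones instead of discarding them. `quality` and `revise` are
# placeholders for a learned scorer and a trained revision model.

def clean_dataset(pairs, quality, revise, threshold=0.5):
    """Return a dataset of the same size: good pairs kept, poor ones revised."""
    cleaned = []
    for instruction, response in pairs:
        if quality(instruction, response) >= threshold:
            cleaned.append((instruction, response))
        else:
            cleaned.append(revise(instruction, response))
    return cleaned
```

The contrast with filter-based cleaning is that `len(cleaned) == len(pairs)` always holds, preserving dataset integrity.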
A Novel Demand Response Model and Method for Peak Reduction in Smart Grids -- PowerTAC
Chandlekar, Sanjay, Boroju, Arthik, Jain, Shweta, Gujar, Sujit
One of the widely used peak reduction methods in smart grids is demand response, in which one analyzes the shift in customers' (agents') usage patterns in response to signals from the distribution company. Often, these signals take the form of incentives offered to agents. This work studies the effect of incentives on the probabilities of accepting such offers in a real-world smart grid simulator, PowerTAC. We first show that there exists a function depicting the probability of an agent reducing its load as a function of the discount offered to it. We call this function the reduction probability (RP). The RP function is further parametrized by a rate of reduction (RR), which can differ for each agent. We provide an optimal algorithm, MJS--ExpResponse, that outputs discounts for each agent by maximizing the expected reduction under a budget constraint. When RRs are unknown, we propose a Multi-Armed Bandit (MAB) based online algorithm, MJSUCB--ExpResponse, to learn them. Experimentally, we show that it exhibits sublinear regret. Finally, we showcase the efficacy of the proposed algorithm in mitigating demand peaks in a real-world smart grid system, using the PowerTAC simulator as a test bed.
- Asia > India > Telangana > Hyderabad (0.04)
- South America > Brazil (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
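The budgeted allocation problem the abstract describes can be illustrated with a simple sketch. The exponential response curve p(d) = 1 - exp(-r * d) below is an assumption suggested by the "ExpResponse" name, and the greedy allocator is a generic heuristic, not the paper's MJS--ExpResponse algorithm, which is an exact method for its specific model.

```python
import math

# Illustrative budgeted discount allocation under an assumed exponential
# reduction-probability curve p(d) = 1 - exp(-r * d), where r is the
# agent's rate of reduction (RR) and d the discount offered.

def reduction_prob(rate, discount):
    """Probability an agent with rate `rate` reduces load given `discount`."""
    return 1.0 - math.exp(-rate * discount)

def greedy_allocate(rates, budget, step=1.0):
    """Spend the budget in `step`-sized increments, each going to the agent
    with the largest marginal gain in expected reduction."""
    discounts = [0.0] * len(rates)
    spent = 0.0
    while spent + step <= budget:
        gains = [
            reduction_prob(r, d + step) - reduction_prob(r, d)
            for r, d in zip(rates, discounts)
        ]
        best = max(range(len(rates)), key=gains.__getitem__)
        discounts[best] += step
        spent += step
    return discounts
```

Because the assumed curve has diminishing marginal gains, the greedy rule concentrates early budget on high-RR agents and then spills over to lower-RR ones; learning the unknown rates online, as MJSUCB--ExpResponse does, would replace the known `rates` with bandit-style upper-confidence estimates.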