AITopics | Port of Spain

Can Knowledge Editing Really Correct Hallucinations?

Huang, Baixiang, Chen, Canyu, Xu, Xiongxiao, Payani, Ali, Shu, Kai

arXiv.org Artificial Intelligence, Oct-29-2024

Large Language Models (LLMs) suffer from hallucinations, that is, nonfactual information in generated content, despite their superior capacities across tasks. Meanwhile, knowledge editing has been developed as a popular new paradigm to correct the erroneous factual knowledge encoded in LLMs, with the advantage of avoiding retraining from scratch. However, one common issue of existing evaluation datasets for knowledge editing is that they do not ensure LLMs actually generate hallucinated answers to the evaluation questions before editing. When LLMs are evaluated on such datasets after being edited by different techniques, it is hard to use the resulting scores directly to assess the effectiveness of different knowledge editing methods in correcting hallucinations. Thus, the fundamental question remains insufficiently validated: Can knowledge editing really correct hallucinations in LLMs? We propose HalluEditBench to holistically benchmark knowledge editing methods in correcting real-world hallucinations. First, we rigorously construct a massive hallucination dataset with 9 domains, 26 topics and more than 6,000 hallucinations. Then, we assess the performance of knowledge editing methods holistically on five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. Through HalluEditBench, we provide new insights into the potentials and limitations of different knowledge editing methods in correcting hallucinations, which could inspire future improvements and facilitate progress in the field of knowledge editing.

Considering the high cost of retraining LLMs from scratch, knowledge editing has been designed as a new paradigm to correct erroneous or outdated factual ... When such datasets are adopted to evaluate the performance of LLMs after being edited, it is hard to directly use the scores to judge the effectiveness of different knowledge editing techniques in correcting hallucinations, which is the motivation for applying knowledge editing to LLMs. To better illustrate this point, following the evaluation setting in (Zhang et al., 2024e), we conducted a preliminary study to examine the pre-edit and post-edit performances of Llama2-7B on the aforementioned ...

Table 1: Performance measured by Accuracy (%) of Llama2-7B before editing ("Pre-edit") and after applying typical knowledge editing methods ("Post-edit") on common existing evaluation datasets.

[Figure: example question "Who is the Chief Scientist of OpenAI?"]

arxiv preprint, editing, knowledge editing, (14 more...)

arXiv.org Artificial Intelligence

2410.16251

Country:

North America > Canada (0.04)
Europe > Poland (0.04)
South America > Venezuela > Gulf of Paria (0.04)
(12 more...)

Genre:

Overview (0.67)
Research Report (0.56)

Industry: Health & Medicine > Therapeutic Area (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.75)


Adoption of Bots Across the Insurance Value Chain

#artificialintelligence, Sep-22-2022, 16:06:23 GMT

Today, customers expect their queries to be answered on their terms and as quickly as possible. What are the most significant factors in creating a superior customer experience? A PwC study states that nearly 80% of US consumers rank the above-stated factors as the most important elements of a positive customer experience. Customers outside the US value time taken to offer more. Adoption of AI bot technology in insurance brings a "human touch" and helps insurers "build real connections" with their customers without frustrating them.

adoption, customer experience, insurance value chain, (1 more...)

#artificialintelligence

Country:

North America > United States > Washington > King County > Seattle (0.10)
North America > United States > Texas > Collin County > Plano (0.10)
North America > Trinidad and Tobago > Trinidad > Port of Spain (0.10)
Asia > India > Telangana (0.10)

Industry: Banking & Finance > Insurance (0.45)

Technology: Information Technology > Artificial Intelligence (0.45)


ICATT hosts business forum on artificial intelligence

#artificialintelligence, Jul-14-2019, 17:42:23 GMT

The Institute of Chartered Accountants of Trinidad and Tobago (ICATT), earlier this month, hosted a business forum for an audience of financial executives from various sectors, including energy, banking and finance, at the KPMG Headquarters in Port of Spain. The event, themed "Artificial Intelligence (AI) – the Future of Accounting", exposed professional accountants to global developments, good practice guidance and knowledge-sharing that will enhance their roles and domain across the economy. In delivering the opening remarks, ICATT's president, Stacy-Ann Golding, praised the ICATT Professional Accountants in Business (PAIB) Committee for organising the forum, the topic of which, she noted, was critical to improving the readiness of today's accounting professionals to deal with AI and its implications. Bringing a depth of insight and experience were featured speakers Nigel Romano, managing director and chief executive officer, JMMB Bank, and Leslie Lee Fook, director of Artificial Intelligence, Automation and Analytics at Incus Services Ltd. Romano spoke on the use of AI: "I can recall the now obsolete, clunky computerised systems used in accounting during the 1970s and how they helped speed up work processes at that time. Today a similar shift is happening as current systems will soon be overshadowed by those powered by self-learning / machine learning capabilities."

artificial intelligence, icatt host business forum, machine learning, (6 more...)

#artificialintelligence

Country: North America > Trinidad and Tobago > Trinidad > Port of Spain (0.27)

Industry: Banking & Finance (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.59)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.35)


SilverHook gains edge with high-tech AI in race to the podium

#artificialintelligence, Jun-4-2019, 20:53:49 GMT

Last year, after breaking the Guinness World Record for the Key West to Cuba run, we wondered what was next for the #77 Lucas Oil SilverHook ocean racing powerboat. We found the answer in the 50th anniversary of the Trinidad & Tobago Great Race, one of the most grueling races in the world. The 115-mile endurance course starts in Trinidad's Port of Spain, where you head north and then east near the island before popping into the Atlantic Ocean for a 50-mile sprint to the finish in Store Bay, Tobago. Because of the logistical difficulties of racing on foreign shores, we were the first American entry in 29 years. We knew we would face stiff competition from Jumbie, Cat Killer, Mr. Solo and other local rivals that know the course well.

artificial intelligence, great race, machine learning, (12 more...)

#artificialintelligence

Country:

North America > United States > Florida > Monroe County > Key West (0.25)
North America > Trinidad and Tobago > Trinidad > Port of Spain (0.25)
North America > Cuba (0.25)
(2 more...)

Industry:

Information Technology (0.47)
Transportation (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.33)
