AITopics | phere

SPHERE: An Evaluation Card for Human-AI Systems

Ma, Qianou, Zhao, Dora, Zhao, Xinran, Si, Chenglei, Yang, Chenyang, Louie, Ryan, Reiter, Ehud, Yang, Diyi, Wu, Tongshuang

arXiv.org Artificial Intelligence, Apr-14-2025

In the era of Large Language Models (LLMs), establishing effective evaluation methods and standards for diverse human-AI interaction systems is increasingly challenging. To encourage more transparent documentation and facilitate discussion on human-AI system evaluation design options, we present an evaluation card SPHERE, which encompasses five key dimensions: 1) What is being evaluated?; 2) How is the evaluation conducted?; 3) Who is participating in the evaluation?; 4) When is evaluation conducted?; 5) How is evaluation validated? We conduct a review of 39 human-AI systems using SPHERE, outlining current evaluation practices and areas for improvement. We provide three recommendations for improving the validity and rigor of evaluation practices.

large language model, machine learning, natural language, (20 more...)

2504.07971

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre:

Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.94)

Industry:

Health & Medicine (1.00)
Education (1.00)
Media (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

The Web Is Your Oyster -- Knowledge-Intensive NLP against a Very Large Web Corpus

Piktus, Aleksandra, Petroni, Fabio, Karpukhin, Vladimir, Okhonko, Dmytro, Broscheit, Samuel, Izacard, Gautier, Lewis, Patrick, Oğuz, Barlas, Grave, Edouard, Yih, Wen-tau, Riedel, Sebastian

arXiv.org Artificial Intelligence, Dec-18-2021

In order to address the increasing demands of real-world applications, the research for knowledge-intensive NLP (KI-NLP) should advance by capturing the challenges of a truly open-domain environment: web scale knowledge, lack of structure, inconsistent quality, and noise. To this end, we propose a new setup for evaluating existing KI-NLP tasks in which we generalize the background corpus to a universal web snapshot. We repurpose KILT, a standard KI-NLP benchmark initially developed for Wikipedia, and ask systems to use a subset of CCNet - the Sphere corpus - as a knowledge source. In contrast to Wikipedia, Sphere is orders of magnitude larger and better reflects the full diversity of knowledge on the Internet. We find that despite potential gaps of coverage, challenges of scale, lack of structure and lower quality, retrieval from Sphere enables a state-of-the-art retrieve-and-read system to match and even outperform Wikipedia-based models on several KILT tasks - even if we aggressively filter content that looks like Wikipedia. We also observe that while a single dense passage index over Wikipedia can outperform a sparse BM25 version, on Sphere this is not yet possible. To facilitate further research into this area, and minimise the community's reliance on proprietary black box search engines, we will share our indices, evaluation metrics and infrastructure.

knowledge source, phere, wikipedia, (15 more...)

2112.09924

Country:

Oceania > Australia (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Oceania > Solomon Islands > Isabel Province > Santa Isabel Island > Buala (0.04)
(7 more...)

Genre:

Research Report > New Finding (0.46)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)
