Semantic Feature Verification in FLAN-T5

Suresh, Siddharth, Mukherjee, Kushin, Rogers, Timothy T.

arXiv.org Artificial Intelligence 

In cognitive science, efforts to understand the structure of human concepts have relied on semantic feature norms: participants list all the properties they believe to be true of a given concept; responses are collected from many participants for many concepts; overlap in the resulting feature vectors captures the degree to which concepts are semantically related (Rosch, 1973; McRae et al., 2005). Yet participants often produce only a fraction of what they know for each concept: tigers have DNA, can breathe, and are alive, but these properties are not typically produced in feature norms for tiger. Such omissions are important because they express deep conceptual structure: having DNA and breathing connect tigers to all other plants and animals. To better capture such structure, some studies ask human participants to make yes/no judgments for all possible properties across every concept. Thus if "can breathe" were listed for a single concept, human raters would then evaluate whether every other concept in the dataset can breathe. This verification step significantly enriches the conceptual structure that feature norms express (De Deyne et al., 2008), but is exceedingly costly in human labor: the number of verification questions grows multiplicatively with the number of concepts and features probed. Previous work has shown that the conceptual structure of a large language model (LLM) for semantic feature listing is similar to human conceptual structure (Suresh et al., 2023; Bhatia & Richie, 2022). In this paper we consider whether this step can be reliably "outsourced" to an open-source LLM optimized for question answering, specifically FLAN-T5 XXL (Chung et al., 2022; Wei et al., 2021), focusing on two questions: (1) How accurately does the LLM capture human responses to the questions?
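The norms-and-verification setup above can be sketched in a few lines. This is an illustrative toy, not the paper's data or pipeline: the concepts, features, and the Jaccard overlap measure are assumptions chosen to show (a) how feature-vector overlap tracks semantic relatedness and (b) how the exhaustive verification step turns every listed feature into a yes/no question for every concept.

```python
# Toy semantic feature norms: each concept maps to the set of features
# participants listed for it. Names here are illustrative only.
concepts = {
    "tiger": {"has fur", "has stripes", "can breathe", "is alive"},
    "lion":  {"has fur", "has a mane", "can breathe", "is alive"},
    "oak":   {"has leaves", "is alive"},
}

# Verification pools every feature listed for ANY concept.
all_features = sorted(set().union(*concepts.values()))

def jaccard(a: str, b: str) -> float:
    """Overlap of two concepts' feature sets (0 = disjoint, 1 = identical)."""
    return len(concepts[a] & concepts[b]) / len(concepts[a] | concepts[b])

print(jaccard("tiger", "lion"))  # shared mammal features -> high overlap (0.6)
print(jaccard("tiger", "oak"))   # only "is alive" shared -> low overlap (0.2)

# Exhaustive verification asks one yes/no question per (concept, feature)
# pair, which is why the human-labor cost grows so quickly.
n_questions = len(concepts) * len(all_features)
print(n_questions)  # 3 concepts x 6 pooled features = 18 questions
```

In the paper's setting each such (concept, feature) question would be posed to FLAN-T5 instead of a human rater; the toy only makes the bookkeeping concrete.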
