Drawing Causal Inferences About Performance Effects in NLP
arXiv.org Artificial Intelligence
This article emphasizes that NLP as a science seeks to make inferences about the performance effects that result from applying one method (compared to another method) in the processing of natural language. Yet NLP research in practice usually does not achieve this goal: NLP research articles typically compare only a few models. Each model results from a specific procedural pipeline (here named a processing system) composed of a specific collection of methods used in preprocessing, pretraining, hyperparameter tuning, and training on the target task. To draw generalizing inferences about the performance effect caused by applying some method A vs. another method B, it is not sufficient to compare a few specific models produced by a few specific (probably incomparable) processing systems. Rather, the following procedure would allow drawing inferences about methods' performance effects: First, the population of processing systems to which researchers seek to generalize has to be defined. Then, a random sample of processing systems is drawn from this population.
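The first two steps of this procedure (defining a population of processing systems and drawing a random sample from it) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration: the stage names follow the abstract, but the options listed under each stage, the helper names `define_population` and `draw_sample`, and the choice to model the population as the full cross product of options are hypothetical and not taken from the article.

```python
import itertools
import random

# Hypothetical options per pipeline stage. The stage names follow the
# abstract; the options themselves are placeholders, not from the article.
STAGES = {
    "preprocessing": ["lowercase", "cased"],
    "pretraining": ["none", "masked_lm"],
    "hyperparameter_tuning": ["grid", "random_search"],
    "training": ["full_finetune", "frozen_encoder"],
}


def define_population(stages):
    """Enumerate every processing system as one choice per stage.

    Assumption: the population is the full cross product of the options.
    """
    names = list(stages)
    combos = itertools.product(*(stages[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def draw_sample(population, k, seed=0):
    """Draw k distinct processing systems uniformly at random."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return rng.sample(population, k)


population = define_population(STAGES)
sample = draw_sample(population, k=5)

for system in sample:
    print(system)
```

In practice the population would rarely be a small, fully enumerable grid; the sketch only makes the distinction between the defined population and the sampled processing systems concrete.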
Sep-14-2022
- Country:
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Genre:
- Research Report (1.00)
- Technology: