A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models

Feb-7-2024, arXiv.org Artificial Intelligence

The self-rationalising capabilities of LLMs are appealing because the generated explanations can give insights into the plausibility of the predictions. However, how faithful the explanations are to the predictions is questionable, raising the need to explore the patterns behind them further. To this end, we propose a hypothesis-driven statistical framework. We use a Bayesian network to implement a hypothesis about how a task (in our example, natural language inference) is solved, and its internal states are translated into natural language with templates. Those explanations are then compared to LLM-generated free-text explanations using automatic and human evaluations. This allows us to judge how similar the LLM's and the Bayesian network's decision processes are. We demonstrate the usage of our framework with an example hypothesis and two realisations in Bayesian networks. The resulting models do not exhibit a strong similarity to GPT-3.5. We discuss the implications of this as well as the framework's potential to approximate LLM decisions better in future work.
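The pipeline described above (a Bayesian network encoding a hypothesis, templates verbalising its states, and a comparison against LLM explanations) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the network structure (two parent nodes `S` and `O` for subject and object relations, one label node), all probability values, the template wording, the example LLM explanation, and the token-overlap score, which only stands in for whatever automatic metrics the authors actually use.

```python
from itertools import product

# Hypothetical toy network S -> L <- O; all numbers are invented for illustration.
P_S = {"same": 0.6, "different": 0.4}   # prior over the subject relation
P_O = {"same": 0.5, "different": 0.5}   # prior over the object relation
LABELS = ("entailment", "neutral", "contradiction")
P_L = {  # conditional table P(L | S, O)
    ("same", "same"): {"entailment": 0.80, "neutral": 0.15, "contradiction": 0.05},
    ("same", "different"): {"entailment": 0.10, "neutral": 0.30, "contradiction": 0.60},
    ("different", "same"): {"entailment": 0.10, "neutral": 0.70, "contradiction": 0.20},
    ("different", "different"): {"entailment": 0.05, "neutral": 0.80, "contradiction": 0.15},
}

def label_posterior(evidence):
    """Return P(L | evidence) by enumerating the parent states consistent with the evidence."""
    scores = {label: 0.0 for label in LABELS}
    for s, o in product(P_S, P_O):
        if evidence.get("S", s) != s or evidence.get("O", o) != o:
            continue  # skip parent states that contradict the observed evidence
        weight = P_S[s] * P_O[o]
        for label in LABELS:
            scores[label] += weight * P_L[(s, o)][label]
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}

# Hypothetical template that verbalises the network's states.
TEMPLATE = "The subjects are {S} and the objects are {O}, so the relation is most likely {label}."

def explain(evidence):
    """Pick the most probable label and render the template explanation."""
    posterior = label_posterior(evidence)
    label = max(posterior, key=posterior.get)
    text = TEMPLATE.format(S=evidence.get("S", "unknown"),
                           O=evidence.get("O", "unknown"),
                           label=label)
    return label, text

def token_overlap(a, b):
    """Jaccard overlap of lower-cased whitespace tokens (a stand-in similarity score)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

evidence = {"S": "same", "O": "different"}
label, text = explain(evidence)
# Invented stand-in for an LLM-generated free-text explanation.
llm_text = "The objects are different, so the sentences contradict each other."
print(label)
print(text)
print(token_overlap(text, llm_text))
```

A low overlap score in such a setup would suggest that the template explanation and the LLM explanation appeal to different reasons, which is the kind of comparison the framework is designed to support. A real study would replace the hand-written tables with a fitted network and the overlap score with the chosen automatic and human evaluations.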

explanation, gpt-3, hypothesis, (16 more...)

Country:
- North America
  - Dominican Republic (0.04)
  - United States
    - Washington > King County
      - Seattle (0.04)
    - New York > New York County
      - New York City (0.04)
    - New Mexico > Santa Fe County
      - Santa Fe (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > Los Angeles County
      - Los Angeles (0.14)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Netherlands (0.04)
  - Belgium (0.04)
  - United Kingdom > England
    - Greater London > London (0.04)
  - Sweden > Östergötland County
    - Linköping (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Germany > Baden-Württemberg
    - Stuttgart Region > Stuttgart (0.04)
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.90)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.90)