A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions

Preprint (2024)
DOI: 10.48550/arXiv.2409.20303
