Hacker News

https://crfm.stanford.edu/helm/air-bench/latest/#/leaderboar...

This isn’t the gotcha question you think it is. AI safety is being defined and measured.



Cool, another metric to game like they do the other ones.



