
I mainly compared it to the native Explorer agents (for example, in Claude Code). So far it has won against the Explorer agents in 98 of 100 cases. I am already working on a bigger benchmark, but have not had much time for it yet. You are welcome to test it out, though :)


