
I don't know if it will stay this low, but the whole point of v3.2 is to be cheaper to run than v3.1 and earlier.

(Their inference costs now grow more slowly as context grows, because of the sparse attention mechanism.)
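
To put rough numbers on the scaling argument, here is a back-of-the-envelope sketch. It assumes dense causal attention scores every earlier token for each query, while a sparse variant scores at most `k` of them. The function names and the value of `k` are made up for illustration and are not taken from the v3.2 release. It also ignores whatever it costs to pick those `k` tokens.

```python
def dense_pairs(n: int) -> int:
    """Query-key pairs scored by dense causal attention over n tokens."""
    return n * (n + 1) // 2


def sparse_pairs(n: int, k: int) -> int:
    """Pairs scored when each query attends to at most k tokens.

    k is a hypothetical budget; selection overhead is not counted.
    """
    if n <= k:
        # Short contexts: every query still sees all earlier tokens.
        return dense_pairs(n)
    # The first k queries are dense, the remaining n - k score k keys each.
    return dense_pairs(k) + (n - k) * k


if __name__ == "__main__":
    k = 2048  # assumed budget, for illustration only
    for n in (1_000, 10_000, 100_000):
        ratio = sparse_pairs(n, k) / dense_pairs(n)
        print(f"context={n:>7}  sparse/dense pair ratio={ratio:.3f}")
```

Under these assumptions the dense count grows quadratically with context while the sparse count grows roughly linearly once the context exceeds `k`, which is the sense in which long contexts get cheaper to serve.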


