
OpenAI stopped releasing information about their models after GPT-3, which was 175B, but the leaks and rumours that GPT-4 is an 8x220-billion-parameter model are most certainly correct. GPT-4o is likely a distilled 220B model. Other commercial offerings are going to be in the same ballpark. Comparing these to Llama 3 8B is like comparing a bicycle or a car to a train or cruise ship when you need to transport a few dozen passengers at best. There are local models in the 70B-240B range that are more than capable of competing with commercial offerings if you're willing to look at anything that isn't bleeding-edge state of the art.


Any pointers on where we can check the best local models per amount of VRAM available? I only have consumer-level cards available, but I would think something that just fits into a 24 GB card should noticeably outperform something scaled for an 8 GB card, yes?


LM Studio tells you which models fit in your available RAM, with or without quantization
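
For a rough sense of what "fits" means: weight memory is roughly parameter count times bits per weight divided by 8, plus some headroom for the KV cache and runtime buffers. A minimal back-of-envelope sketch (the 20% overhead factor is an assumption of mine, not LM Studio's actual calculation):

    # Rough estimate of whether a quantized model's weights fit in a given VRAM budget.
    # The 1.2x overhead for KV cache / activations is an assumed fudge factor.
    def fits_in_vram(params_billion: float, bits_per_weight: float, vram_gb: float,
                     overhead: float = 1.2) -> bool:
        weight_gb = params_billion * bits_per_weight / 8  # GB of weights alone
        return weight_gb * overhead <= vram_gb

    for params, bits in [(8, 4), (8, 8), (70, 4), (70, 8)]:
        for vram in (8, 24):
            print(f"{params}B @ {bits}-bit in {vram} GB: {fits_in_vram(params, bits, vram)}")

By that arithmetic an 8B model at 4-bit fits comfortably in 8 GB, while a 70B model needs roughly 40+ GB even at 4-bit, which is why the 24 GB consumer cards top out around the 30B class without offloading.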



