Cerebras is a totally different product, though. They can (theoretically) run any frontier model, provided it gets compiled a certain way. Think of it as a wafer-scale TPU.
This, on the other hand, uses hardwired weights, with on-die SRAM used for things like the K/V cache. It's WAY more power efficient and faster. The tradeoff is that it's hardwired.
Still, most frontier models are "good enough" that an obscenely fast version would be a major seller.