
		Havoc 2 days ago \| on: The path to ubiquitous AI (17k tokens/sec)

		That seems promising for applications that require raw speed. I wonder how far they can scale it up: a quantized 8B model is very usable, but still quite small compared to even bottom-end cloud models.
