FWIW we are in the process of replacing a Dev/QA cluster in our data center, and the storage appliance alone went from $58k to $120k since January. That doesn't include the five servers in the cluster, each of which has gone up $8k in the same time frame.
Cannot imagine how you even plan to build a data center when your costs are going up 20% per month.
My super uninformed theory is that local LLM will trail foundation models by about 2 years for practical use.
For example, a lot of work right now is going into improving tool calling and agentic workflows; tool calling only started showing up in local LLMs around the end of 2023.
This is putting aside the standard benchmarks, which local LLMs "benchmaxx" to show impressive numbers but which rarely translate to meeting expectations when used with OpenCode. In theory Qwen3.5-397B-A17B should be nearly a Sonnet 4.6 model, but it is not.
This is not an ordinary LLM benchmark, it's streaming experts' weights from storage. It opens up running very large (near-SOTA, potentially SOTA) MoE models on very limited hardware, since you no longer need enough RAM for the entirety of the model's parameters. The comparison to 20 t/s local AI models is simply not fair.
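A minimal sketch of the idea, in Python with NumPy: memory-map the expert weights from storage so nothing is resident in RAM until the router actually selects an expert, at which point only that expert's pages are read from disk. All names and sizes here (NUM_EXPERTS, EXPERT_DIM, the toy router) are illustrative assumptions, not any particular model's implementation.

```python
import numpy as np
import tempfile, os

NUM_EXPERTS, EXPERT_DIM = 8, 4  # tiny toy sizes; real MoE experts are far larger

# Write toy expert weights to disk (in practice this is the model checkpoint).
path = os.path.join(tempfile.mkdtemp(), "experts.bin")
np.arange(NUM_EXPERTS * EXPERT_DIM * EXPERT_DIM, dtype=np.float32).tofile(path)

# Memory-map the file: no weight is paged into RAM until it is actually touched.
experts = np.memmap(path, dtype=np.float32,
                    shape=(NUM_EXPERTS, EXPERT_DIM, EXPERT_DIM), mode="r")

def moe_forward(x, top_k=2):
    """Route x to top_k experts; only those experts' weights are read from disk."""
    scores = np.array([x.sum() % (i + 2) for i in range(NUM_EXPERTS)])  # toy router
    chosen = np.argsort(scores)[-top_k:]
    # Only these slices get paged in; the unselected experts stay on storage.
    return sum(np.asarray(experts[i]) @ x for i in chosen)

out = moe_forward(np.ones(EXPERT_DIM, dtype=np.float32))
print(out.shape)  # (4,)
```

The point of the sketch is the access pattern: peak RAM scales with the active experts per token (top_k), not with the total parameter count, which is why near-SOTA MoE models can fit on hardware that could never hold all the weights at once.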
SOC2 has been in trouble for a while now. Completely gamified. I was managing an acquisition of a healthtech company and asked if they did an internal risk assessment as part of their audit. Nope. SOC2 certified, yet they had never actually put to paper "here's what we know we're doing wrong, and here's how we plan to remediate it."
My experience with the defaults in JavaScript is that they're pretty slow. It's really, really easy to hit the limits of an Express app, and for those limits to be in your app code. I've worked on JVM-backed apps, and they're memory hungry (well, they preallocate memory for the JVM) and slow to boot, but once they're going they are absolutely ripping fast, and you're far more likely to be bottlenecked by your DB long before you need to start doing any horizontal scaling.
Fair point on ecosystem decisions; that's basically the thesis of the post. These patterns aren't Java being slow, they're developers (myself included) writing code that looks fine but works against the JVM. Enterprise Java gets a bad rap partly because these patterns compound silently across large codebases, and nobody profiles until something breaks.
No, just a 20+ year C++ and Java developer, while you clearly haven't used modern Java. Now, I admit that because I have a lot of experience in low-level programming, I am often able to beat Java's performance in C++, but not without a lot of effort. I can do better in Zig when arenas fit, but I wouldn't use it (or C++ for that matter) for a huge program that needs to be maintained by a large team over many years.