FWIW we are in the process of replacing a Dev/QA cluster in our data center, and the storage appliance alone went from $58k to $120k since January. That doesn't include the five servers in the cluster, each of which has gone up $8k in the same time frame.
Cannot imagine how you even plan to build a data center when your costs are going up 20% per month.
My super uninformed theory is that local LLM will trail foundation models by about 2 years for practical use.
For example, a lot of work right now is going into improving tool calling and agentic workflows; tool calling only started showing up in local LLMs around the end of 2023.
This is putting aside the standard benchmarks, which local LLMs "benchmaxx" to show impressive numbers but which rarely translate to meeting expectations when used with OpenCode. In theory Qwen3.5-397B-A17B should be nearly a Sonnet 4.6 model, but it is not.
This is not an ordinary LLM benchmark, it's streaming experts' weights from storage. It opens up running very large (near-SOTA, potentially SOTA) MoE models on very limited hardware, since you no longer need enough RAM for the entirety of the model's parameters. The comparison to 20 t/s local AI models is simply not fair.
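A minimal sketch of the idea, in Python with NumPy: memory-map the expert weights from storage so nothing is resident in RAM until the router actually selects an expert, at which point only that expert's pages are read from disk. All names and sizes here (NUM_EXPERTS, EXPERT_DIM, the toy router) are illustrative assumptions, not any particular model's implementation.

```python
import numpy as np
import tempfile, os

NUM_EXPERTS, EXPERT_DIM = 8, 4  # tiny toy sizes; real MoE experts are far larger

# Write toy expert weights to disk (in practice this is the model checkpoint).
path = os.path.join(tempfile.mkdtemp(), "experts.bin")
np.arange(NUM_EXPERTS * EXPERT_DIM * EXPERT_DIM, dtype=np.float32).tofile(path)

# Memory-map the file: no weight is paged into RAM until it is actually touched.
experts = np.memmap(path, dtype=np.float32,
                    shape=(NUM_EXPERTS, EXPERT_DIM, EXPERT_DIM), mode="r")

def moe_forward(x, top_k=2):
    """Route x to top_k experts; only those experts' weights are read from disk."""
    scores = np.array([x.sum() % (i + 2) for i in range(NUM_EXPERTS)])  # toy router
    chosen = np.argsort(scores)[-top_k:]
    # Only these slices get paged in; the unselected experts stay on storage.
    return sum(np.asarray(experts[i]) @ x for i in chosen)

out = moe_forward(np.ones(EXPERT_DIM, dtype=np.float32))
print(out.shape)  # (4,)
```

The point of the sketch is the access pattern: peak RAM scales with the active experts per token (top_k), not with the total parameter count, which is why near-SOTA MoE models can fit on hardware that could never hold all the weights at once.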
SOC2 has been in trouble for a while now. Completely gamified. I was managing an acquisition of a healthtech company and asked if they did an internal risk assessment as part of their audit. Nope. SOC2 certified, yet they had never actually put to paper "here's what we know we're doing wrong, and here's how we plan to remediate it."
My experience with the defaults in JavaScript is that they're pretty slow. It's really, really easy to hit the limits of an Express app, and for those limits to be in your app code. I've worked on JVM-backed apps, and they're memory hungry (well, they preallocate memory for the JVM) and slow to boot, but once they're going they are absolutely ripping fast, and you're far more likely to be bottlenecked by your DB long before you need to start doing any horizontal scaling.
Fair point on ecosystem decisions; that's basically the thesis of the post. These patterns aren't Java being slow, they're developers (myself included) writing code that looks fine but works against the JVM. Enterprise Java gets a bad rap partly because these patterns compound silently across large codebases, and nobody profiles until something breaks.
No, just a 20+ year C++ and Java developer, while you clearly haven't used modern Java. Now, I admit that because I have a lot of experience in low-level programming, I am often able to beat Java's performance in C++, but not without a lot of effort. I can do better in Zig when arenas fit, but I wouldn't use it (or C++ for that matter) for a huge program that needs to be maintained by a large team over many years.