Hacker News: james2doyle's comments

Hmm, just tried google/gemma-4-31B-it through HuggingFace (the inference provider seems to be Novita) and function/tool calling was not enabled...

Yeah you can see here that tool calling is disabled: https://huggingface.co/inference/models?model=google%2Fgemma...

At least, as of this post


Tool calling is enabled now

Hosted on Parasail and by Google themselves (both for free, as of now); I'd probably give those a shot

None of the Qwen 3.5 models seem present? I’ve heard people are pretty happy with the smaller 3.5 versions. I would be curious to see those too.

I would also be interested to see "KAT-Coder-Pro-V2", as they brag about its benchmarks against these bots as well


If they use OpenRouter pricing then the Qwen3.5 models are going to be poor value.

The Qwen3.5 27B model on OR is $1.56/million tokens out (it used to be $2.4/mil).

Meanwhile Minimax M2.7 (a much larger model) is $1.2/mil out.

The smaller and medium tier Qwen3.5 models are only really cost effective if you run them yourself.
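As a quick sanity check on the price gap described above, here is a tiny sketch that turns the per-million-output-token prices quoted in this thread into dollar figures for a session. The prices are a snapshot of what the thread quotes, not a reference, and change often:

```python
# Per-million-output-token prices as quoted in this thread (OpenRouter).
# Treat these as a snapshot; real prices change frequently.
PRICE_PER_M_OUT = {
    "Qwen3.5 27B": 1.56,   # was $2.4/M
    "Minimax M2.7": 1.20,
}

def output_cost(model: str, tokens: int) -> float:
    """Dollar cost for `tokens` of output at the quoted rate."""
    return PRICE_PER_M_OUT[model] * tokens / 1_000_000

# e.g. a long coding session producing 5M output tokens:
for model, _ in PRICE_PER_M_OUT.items():
    print(f"{model}: ${output_cost(model, 5_000_000):.2f}")
```

At these quoted rates, the smaller Qwen model actually costs more per output token than the much larger MiniMax, which is the parent's point about value.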


Oh, I never noticed that. Good to call out. But that would put it much closer to Minimax M2.7 in terms of price than to the likes of Mimo V2 Pro and Gemini Flash 3 preview, which are both on the list

Is Minimax M2.7 better than Qwen3.5 27B, or is it just bigger?

Minimax M2.7 is similar to Sonnet in my tests. This is the first non-OpenAI/Anthropic model I've used for coding. It does require more steering, though.

More steering than Sonnet? What is your experience?

I'm about 2 days into transitioning, using MiMo V2 Pro in place of Opus and MiniMax M2.7 in place of Sonnet.

I'm finding that the extra "hand holding" that MiMo and MiniMax need isn't really "extra." The Anthropic models happily agree to a plan and then do something else entirely way too often.

With MiMo and MiniMax I'm just spreading the attention throughout the day instead of big spikes of frustration figuring out where Claude went off the rails.


Thanks for responding. So you are using MiMo V2 Pro to plan and then asking MiniMax M2.7 to read that plan file and execute? Or what does the workflow look like?

Pi/Opencode/Kilocode? Just curious.

I am mostly using OpenCode and thinking of abandoning Copilot, so I'm looking for something similar.


Yes, it's significantly better.

What caused the switch? Also, are you still trying to use Claude models in OpenCode?

Sorry, I missed part of your question:

What caused the switch was that we're building AI solutions for sometimes price-conscious customers, so I was already familiar with the pattern of "use a superior model to set a standard, then fine-tune a cheaper one to do the same work".

So I brought that into my own workflows (kind of) by using Opus 4.6 to do detailed planning and one 'exemplar' execution (with 'over-documentation' of the choices); after that, I use Opus 4.6 only for planning and "throw a load of MiniMax M2.5s at the problem".

They tend to do 90% of the job well, and I sometimes do a final pass with Opus 4.6 to mop up any issues. This saves me a lot of tokens/money.

This pattern wasn't possible with Claude Code, hence my move to OpenCode.
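The plan-with-a-strong-model / execute-with-cheap-models pattern described above can be sketched roughly as follows. `plan_then_execute` and the callables passed to it are hypothetical stand-ins for whatever client (OpenCode, a raw API, etc.) actually drives the models; this is a structural sketch, not anyone's actual implementation:

```python
from typing import Callable

def plan_then_execute(
    task: str,
    planner: Callable[[str], str],        # e.g. a strong model like Opus
    workers: list[Callable[[str], str]],  # e.g. several cheap MiniMax instances
    reviewer: Callable[[str], str],       # final mop-up pass, often the planner again
) -> str:
    # 1. Strong model writes a detailed, over-documented plan.
    plan = planner(f"Write a detailed, over-documented plan for: {task}")
    # 2. Fan the plan's steps out round-robin to the cheaper workers.
    steps = [s for s in plan.splitlines() if s.strip()]
    results = [
        workers[i % len(workers)](f"Execute this step of the plan:\n{step}")
        for i, step in enumerate(steps)
    ]
    # 3. One final review pass with the strong model to catch the last 10%.
    return reviewer("Review and fix these results:\n" + "\n".join(results))
```

The point of the structure is that the expensive model sees the task twice (plan and review) while the bulk of the token volume goes through the cheap workers.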


You can access Anthropic models with subscription pricing via a Copilot license.

Pretty sure that's against TOS.

Edit: it's not. https://github.blog/changelog/2026-01-16-github-copilot-now-...

They must be eating insane amounts of $$$ for this. I wouldn't expect it to last


No, Claude on GitHub Copilot is billed at 3x the usage rate of the other models, e.g. GPT-5.4, and you get an extremely truncated context window.

See https://models.dev for a comparison against the normal "vanilla" API.


Yes I regularly plan in Opus 4.6 and execute in “lesser” models ie MiniMax

The only similarity is that they both say "you’re absolutely right" when you point out their obvious mistakes


*hyper-competent collaborator who may occasionally make things up completely and will sometimes give different answers to the same question*


So, indistinguishable from a human then


No. A competent human doesn't make things up, he admits ignorance. He also only very rarely changes answers he previously gave.


I’ve used M2.5 in OpenCode using their Zen inference. I found it to be decent. Did not really seem comparable to Opus 4.5 for "quality" output. As in, I often tweaked the output more when using M2.5.

I think the best thing was the speed. If it is going to be wrong, I would prefer it to be wrong quickly.


They have the details up on their site now: https://www.minimax.io/models/text/m27


There is a blog post now: https://mistral.ai/news/leanstral


> Leanstral
>
> Our first open-source code agent designed for Lean 4, built for formal proof engineering in realistic repositories. 119B parameters with 6.5B active.

Mentioned in the 2.5.0 release of the Vibe CLI tool: https://github.com/mistralai/mistral-vibe/releases/tag/v2.5.... A HuggingFace page is linked for the weights but it returns a 404: https://huggingface.co/mistralai/Leanstral-120B-A6B-2603




This article sure uses a lot of em dashes. I see 9 in the article body.

