More

chermi · 2026-03-27T04:32:10 1774585930

"Nobody should go and put a "retracted" stamp over "Principia Mathematica", or the "Special Relativity" paper of Einstein. Both are wrong, we know."

What does that have to do with this situation? I'm honestly trying to figure out your chain of thought. Do you think the future should have an impact on the present somehow? The fraud in the op post happened at the time of publication. Oh and btw. No fraud in the two you cited. Unless you figured out how to apply future to present. In which case they probably would've published much better papers, somehow.

gus_massa · 2026-03-27T17:07:37 1774631257

What about:

https://en.wikipedia.org/wiki/Gregor_Mendel#Mendelian_parado... Apparently the numbers of the second generation are too good to be true.

https://en.wikipedia.org/wiki/Oil_drop_experiment#Millikan's... Apparently the viscosity was wrong, and then everyone else made corrections to get a similar result.

> What does that have to do with this situation?

The problem is how to document error without overwhelming honest authors. Imagine a nightmare with a DMCA like process, where anyone can can fill a retraction request and the authors have a week to reply. [The data is in an obscure folder in a notebook that is dead since 5 years. Most of the processing was done by a guy that is now working in the industry for x10 salary.] [Assuming you didn't work with mice, and you must resurrect them to fill the additional data asked in the retraction request.]

An alternative is let the editors ask a new reviewer to make the decision, but everyone has horror stories of reviewers that made bad reviews in spite the manuscript was correct. Then what? Ask the authors again to defend the paper?

The current method is that anybody can publish a "comment" if they find a journal that agree to publish it.

jmalicki · 2026-03-27T11:45:06 1774611906

Regardless, published papers aren't an authoritative source of truth. Just a note to your friends "hey I did some cool stuff I want to tell you about!"

Sure it's slightly more reviewed than a GitHub repo, but it's not an end all be all.

chermi · 2026-03-25T17:02:11 1774458131

I was just reading "how the world became rich" and they made an interesting distinction economic "development" vs plain "growth". Amusingly, "development" to them means exactly what you're saying "engineer" should mean. It's sustainable, structural, not ephemeral. Development in the abstract hints at foundational work. Building something up to last. It seems like this meaning degradation is common in software. It still blows my mind how the "full-stack" naming stuck, for example.

https://www.howtheworldbecamerich.com/

Edit-on a related note, are there any studies on the all-in long-term cost between companies that "develop" vs. "engineer". I doubt there would be clean data since the managers that ignored all of the warning of "tech debt" would probably have the say on both compiling and releasing such data.

Does the cost of "tech-debt" decrease as the cost of "coding" decreased or is there a phase transition on the quality of the code? I bet there will be an inflection point if you plotted the adoption time of AI coding by companies. Late adapters that timed it after the models and harnesses and practices were good enough (probably still some time in the near future) would have less all-in cost per same codebase quality.

chermi · 2026-03-17T02:21:21 1773714081

Ahhh yes. As we all know regulations and requirements and bureaucracy never have unintended consequences, especially on the little guy. All that matters is intent, right?

Uvix · 2026-03-17T03:56:55 1773719815

The "little guy" isn't a publicly listed company issuing reports. By the time you have an IPO, you're no longer little.

chermi · 2026-03-15T17:39:04 1773596344

It seems like we're going back to expert systems in a kind of inverted sense with all of this chaining of deterministic steps. But now the "experts" are specialized and well-defined actions available to something smart enough to compose them to create new, more powerful actions. We've moved the determinism to the right spot, maybe? Just a half-thought.

I'm just trying to learn this stuff now, so I don't the literature. The "trajectory view" through action space is what makes the most sense to me.

Along these lines, another half-baked pattern I see is kind of a time-lagged translation of stuff from modern stat mech to deep learning/"AI". First it was energy based systems and the complex energy landscape view, a-la spin glasses and boltzmann machines. The "equilibrium" state-space view, concerned with memory and pattern storage/retrieval. Hinton, amit, hopfield, mackay and co.

Now, the trajectory view that started in the 90s with jarzynski and crooks and really bloomed in 2010+ with "stochastic thermodynamics" seems to be a useful lens. The agent stuff is very "nonequilibrium"/ "active"-system coded, in the thermo sense... With the ability to create, modify, and exploit resources (tools/memory) on the fly, there's deep history and path dependence. I see ideas from recent wolpert and co.(Susanne still, crooks again, etc.) w.r.t. thermodynamics of computation providing a kind of through line, all trajectory based. That's all very vague I know, but I recently read the COALA paper and was very enchanted and have been trying to combine what I actually know with this new foreign agent stuff.

It's also very interesting to me how the Italian stat mech school, the parisi family, have continuously put out bangers trying to actually explain machine learning and deep learning success.

I'd love to hear if anyone is thinking along similar lines, or thinks I'm way off track, has paper recs please let me know! Especially papers on the trajectory view of agents.

gbro3n · 2026-03-15T21:45:13 1773611113

I have wondered if we're going to end up investing so much in putting up guard rails around AI that we end up with systems of the same complexity as a non AI expert system that runs slower and at higher costs due just having injected models and tokens into the mix! I joke, but it seems like there's a pull towards that.

chermi · 2026-03-15T17:10:36 1773594636

Mushrooms!

chermi · 2026-03-15T17:09:19 1773594559

I just love the simplicity of a cron job control system. This is so fun.

chermi · 2026-03-09T01:47:39 1773020859

I don't think that's the main problem. Yes, we can't fix stuff, but labor is among the smallest parts of the problem.

chermi · 2026-03-03T23:25:32 1772580332

How is this on frontpage but not 3.1 flash-lite?

magicarp · 2026-03-03T23:29:53 1772580593

Think about who runs this website

chermi · 2026-02-19T16:42:40 1771519360

You can really notice the tool use problems. They gotta get on that. The agent trend seems real, and powerful. They can't afford to fall behind on it.

verdverm · 2026-02-19T17:13:30 1771521210

I don't really have tool usage issues that I don't put under that doesn't follow system prompt instructions consistently

there are these times where it puts a prefix on all function calls, which is weird and I think hallucination, so maybe that one

3.1 hopefully fixes that

HardCodedBias · 2026-02-19T20:32:41 1771533161

"They can't afford to fall behind on it."

They are very, very seriously far behind as of 3.0.

We'll see if 3.1 addresses the issue at all.

chermi · 2026-02-13T02:53:32 1770951212

Uhh, just false.

HardCodedBias · 2026-02-13T16:12:13 1770999133

It's just poop tier.

Come on.

Worthless.

Do you have any market counter points.

Market counter points that aren't really just a repackaging of:

  1. "Google has the world's best distribution" and/or  
  2. "Google has a firehose of money that allows them to sell their 'AI product' at an enormous discount?

Good luck!