This makes for an interesting thought / discussion: code is written to be read by humans first, executed by computers second. What would code look like if it were written to be read by LLMs? The way they work now (or rather, how they're trained) is on human language and code, but there might be a style that's better for LLMs, by whatever metric of "better" you use.
Just a thought experiment; I very much doubt I'm the first one to think of it. It's probably along the same lines as "why doesn't an LLM just write assembly directly?"
LLMs read and write human-code because humans have been reading and writing human-code. The sample size of assembly problems is, in my estimate, too small for LLMs to efficiently read and write it for common use cases.
I liken it to the problem of applying machine learning to hard video games (e.g. Starcraft). When trained to mimic human strategies, it can be extremely effective, but machine learning will not discover broadly effective strategies on a reasonable timescale.
If you convert "human strategies" to "human theory, programming languages, and design patterns", perhaps the point will be clear.
But: could the ouroboric cycle of LLM use decay the common strategies and design patterns we use into inexplicable blobs of assembly? Can LLMs improve at programming if humans do not advance the theory or invent new languages, patterns, etc?
But StarCraft training was not done by mimicking human strategies - it was pure RL with a reward function shaped around winning, which allowed non-human and eventually superhuman strategies to emerge (such as worker oversaturation).
The current training loop for coding is RL as well, so a departure from human coding patterns is not unexpected (even if a departure from human coding structure would be, as that would require developing a new programming language).
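To make the contrast with imitation concrete, here is a minimal sketch (not AlphaStar's or any lab's actual setup) of reward-only learning on a hypothetical two-action bandit: no demonstrations are given, only a scalar reward, and the policy still drifts toward the better-paying "non-human" strategy. The payoff numbers are invented for illustration.

```python
import math
import random

random.seed(0)

# Two actions stand in for strategies; action 1 (the "non-obvious" one)
# pays off better on average. These payoffs are made up for the sketch.
REWARD_MEAN = {0: 0.3, 1: 0.7}

theta = 0.0  # logit of picking action 1
lr = 0.1     # learning rate

def pick_action():
    """Sample an action from the Bernoulli policy parameterized by theta."""
    p1 = 1.0 / (1.0 + math.exp(-theta))
    return (1 if random.random() < p1 else 0), p1

for step in range(2000):
    action, p1 = pick_action()
    # Reward signal only -- no human example to copy from.
    reward = REWARD_MEAN[action] + random.gauss(0, 0.1)
    # REINFORCE update: (action - p1) is d log(pi) / d theta for a
    # Bernoulli policy, scaled by the observed reward.
    theta += lr * reward * (action - p1)

final_p1 = 1.0 / (1.0 + math.exp(-theta))
print(f"probability of the better strategy after training: {final_p1:.2f}")
```

The point of the toy is that the learned behavior is shaped entirely by the reward, so nothing anchors it to how a human would have played; the same logic applies when the reward is "tests pass" rather than "game won".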
> It's probably in the same line of "why doesn't an LLM just write assembly directly"
My suspicion is that the "language" part of LLMs means they tend to prefer languages that are closer to human languages than assembly is, and they benefit from many of the same abstractions and tooling (hence the recent acquisitions of Bun and Astral).
The problem with that is that assembly isn't portable, and x86 isn't as dominant as it once was, so you'd need to target both ARM and x86(_64). But you could target LLVM IR if you wanted.
Yes, but my point was that they seem to explicitly not care about code quality and/or the insane amount of bloat, and seem to just want the LLM to be able to deal with it.
I've heard somewhere that they have roughly 100% code churn every few months, so yes, they unfortunately don't care about code quality. It's a shame, because it's still the best coding agent, in my experience.
> they unfortunately don't care about code quality.
> It's a shame, because it's still the best coding agent, in my experience.
If it is the best, and it delivers the value users are asking for, then why would they have an incentive to invest further $$$ in "higher" quality, if the difference that quality could make is not substantial or hurts the ROI?
On many projects I've found that this "higher quality" not only failed to deliver more substantial value, but actually hurt the project's ability to deliver the value that matters.
Maybe we are, after all, entering an era of SWE where all this bike-shedding is gone, and the only engineers who will survive in it will be the ones capable of delivering the actual value (IME very few per project).
Yes, but as I said, it’s in a way the ultimate form of dogfooding: ideally they’ll be able to get the LLM smart enough to keep the codebase working well long-term.
Now whether that’s actually possible is a second topic.
My experience with reporting stuff to mods is that people who post racist stuff do get banned. But I also think there's a difference between holding opinions I consider rooted in racism or having racist outcomes (which I don't report) and posting actually racist stuff (which I do report).
Do you customize or personalize these at all, or use them just out of the box? I use Gemini for image generation and as an assistant on my Android phone, although not frequently. As a chat assistant, the online Google UI is a turnoff for me personally.
For tone, I prefer efficient — higher signal-to-noise.
In custom instructions I’ll sometimes write something like “tell it like it is.” Interestingly, some models then start with “okay, I’ll tell you like it is…” which makes me wonder whether that’s psychological framing, operational behavior, or both.
Curious why you asked — have you found customization meaningfully changes performance?
Since I primarily used ChatGPT, I picked up a bunch of prompts people were sharing in order to optimize for better results. I don't know if this is relevant anymore, as I think they've fixed quite a bit and you can now customize tone and warmth directly.
My prompts were mostly about avoiding filler, being more direct, no emojis, etc.
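For the curious, a hypothetical example of the kind of custom instructions described here (not the commenter's actual prompt, just an illustration of the style):

```text
Be direct and concise. No filler, no flattery, no emojis.
Lead with the answer; add caveats only when they actually matter.
If you're unsure, say so plainly instead of hedging at length.
```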