xscott · 2026-05-27T13:18:33 1779887913

Scheme is (or at least was) coherent. You don't need to look any further than set/setf/setq to see that Common Lisp is "organically grown" from the fertilizer of a committee. CL does its best to make every other lisp more attractive.

rootnod3 · 2026-05-27T13:22:43 1779888163

Which Scheme are we talking about? R5RS? R7RS-small? R6RS? With SRFIs? Without? Which scheme? Is it `(library...)` or `(define-module...)`?

xscott · 2026-05-27T13:33:52 1779888832

Heh, I'd probably take R4RS with define-syntax :-)

rootnod3 · 2026-05-27T17:02:45 1779901365

I mean, good choice, but you see the point, right? As much as ANSI CL has its flaws, it has a standard, as much of a mixed bag as it might be. Scheme is just a general potpourri of "we kinda have a guideline, but do whatever".

I would very much prefer scheme if the different implementations had a working standard. But I can't take my Chez-scheme code and throw it into Guile-scheme.

But pretty good chance I can take my ECL code and throw it into SBCL or LispWorks.

xscott · 2026-05-27T22:39:22 1779921562

> you see the point, right?

Bah, I think this debate was already old when I first saw people arguing it on comp.lang.lisp in the 90s. I don't have a dog in this fight other than to reject the notion that Common Lisp is "coherent" and not "organically grown".

The original Scheme belongs in the category of languages like Standard ML and Smalltalk, where a small, careful, and talented group designed them with focus. Common Lisp seems like a bunch of smart people with competing interests and legacy baselines tried to meet in the middle. To the extent CL is more pragmatic, it's another example of "Worse is Better".

rahen · 2026-05-27T14:50:00 1779893400

Scheme has a coherent and minimalist design, but its ecosystem and abstraction facilities feel too sparse for large applications.

When I started building a Lisp-based machine learning framework, Guile seemed like the right choice because it provides GOOPS and generic functions, yet I still ended up with a lot of boilerplate to compensate for the lack of a strong type system.

Scheme feels to me like C is to C++: not ergonomic for large-scale application development. Go is one of those languages that has both minimalism and productivity.

a-french-anon · 2026-05-28T08:00:18 1779955218

Common Lisp didn't grow into its warts, though, it was intentionally a "Common" Lisp. A compromise, like I wrote.

xscott · 2026-05-20T17:00:43 1779296443

I believe the idea is that you don't care what language is being used if you aren't going to look at it anyway. Given that premise, the AI can write JavaScript instead of something you need to compile separately.

esperent · 2026-05-21T10:59:56 1779361196

WASM is hard enough to use that I would say the number of people using it just because they don't like JS/TS is fairly small.

At least that's always been my impression. If you're willing to go through the effort to use it, it's probably because you need the performance gains, and AI hasn't changed that.

xscott · 2026-05-19T05:38:42 1779169122

I predict it will get regulated in the US, and that it will lead to regulatory capture. Solving absolutely NONE of the problems people complain about while providing NONE of the benefits AI could bring to society.

xscott · 2026-05-19T03:15:55 1779160555

I agree with your logic, but you should replace 2 with "AI used by governments only". The haters would have more luck getting rid of nuclear weapons than putting the AI cat back in the bag. Governments will use it for surveillance. Think "sentiment analysis" to make sure you're not a terrorist.

xscott · 2026-05-19T03:10:05 1779160205

All that's going to happen is people will "voluntarily" take it away from themselves.

The fearmongers will tell stories about biological or chemical weapons. It'll be things you could learn from a textbook - something like mercury molecules or cultivating rabies. People will vote to ban AI.

The puritans will clutch their pearls because it can be used to make porn they don't like. They'll vote to ban AI.

People who are afraid of losing their jobs will make tangential arguments about copyright violations. They'll vote to ban AI.

So citizens won't be allowed to use AI directly.

Instead, there will be regulatory capture. Microsoft and Apple will pay fees for compliance testing (bribes). Then they'll serve you a dumbed down version you can't escape. "I see you're trying to analyze numbers. Click here for a free signup to Office 365!".

The social media sites will make sure you still have access to create rage bait slop. That improves engagement.

Big software companies will pay for bug finding services. Small open source projects won't have the money.

If you're upset by AI, you should ask yourself if that's part of the plan. Because there's a lot of money to be made and power to be stripped from citizens if everything above comes true.

linkregister · 2026-05-19T03:26:34 1779161194

There are many cheap, open models available on the vLLM engine: https://huggingface.co/models?other=vllm. This includes gpt-oss, LLaMa, and Gemma. This is in addition to Qwen, Deepseek, Mistral, Kimi, GLM, and Poolside.

xscott · 2026-05-19T04:40:00 1779165600

Yes, and I keep copies of the ones I like[0]. I can't run the huge ones, and the ones I can run aren't as good as the "frontier" models. Regardless, I expect they will be considered contraband someday.

[0] - I've been using llama.cpp and Ollama. I should check out vLLM.

peyton · 2026-05-19T03:54:15 1779162855

Is there any precedent you’re referencing? Many things that are expensive, slow, scarce, or bad are going to become cheap, fast, abundant, and awesome. Historically people like that a lot.

I just have trouble seeing how we get to there from here. Vote to ban AI? Has anything like that happened before?

xscott · 2026-05-17T23:45:11 1779061511

I believe almost nobody thinks original thoughts. I never have. At best I applied an idea from one area to another, which is something AI can do.

Moreover, most novel advancements seem like they come when society is ready for them - the nearly simultaneous discovery of calculus for example.

Pick any thought of yours you truly believe is novel and do a serious literature search on the topic and adjacent fields. Ask an AI to help you with the search if necessary :-)

mhitza · 2026-05-19T09:51:56 1779184316

Interesting thought experiment: if you train an LLM on all knowledge available before the day Einstein first published or discussed the theory of relativity (as the cut-off date), would the LLM be able to come up with it?

xscott · 2026-05-19T11:04:27 1779188667

I think about that kind of thing a lot. For Special Relativity, maybe? I don't know the historical interactions between Einstein, Minkowski, and Lorentz, but my gut tells me the idea was ripe. For General Relativity, I'm less optimistic the current flavor of LLMs could make that leap.

There's so much I don't know though, and I'm certain we aren't the only two who think about it :-)

dominotw · 2026-05-21T15:30:17 1779377417

What about the famous Go game move that AI came up with? The move that baffled experts.

xscott · 2026-05-17T23:25:15 1779060315

For me, I'm very enthusiastic about its use for programming, mathematics, and as a teaching assistant[0]. I'm very worried about it being used for automated surveillance, terrible customer service, and deceptive targeted advertisements. I'm unconcerned about slop and alignment issues. I'm very much in favor of local models (democratization), just like I'm a fan of Wikipedia for making so many topics available to everyone for free.

[0] I don't see a lot of people using LLMs to learn a new topic, but I had a really great experience by walking through some math I wanted to know, forcing it to go slowly, and writing code and test cases for each concept to make sure it wasn't hallucinating. There are no "choose your own adventure" textbooks like this, and there are no professors who would be that patient with me in office hours. I don't think it will work well for unmotivated learners.

xscott · 2026-05-17T23:05:14 1779059114

There was also a decent amount of enthusiasm for the "long tails" because with unlimited virtual shelf space, you could find products that would not have enough mass appeal to the average consumer to justify their space on physical shelves. For instance, Netflix would loan you a DVD of almost any movie but Blockbuster only stocked the middle of the bell curve.

adrian_b · 2026-05-18T06:08:31 1779084511

Yes, this has always been the major advantage of online shopping for me, since its very appearance.

When Amazon was launched, I immediately started to use it to get books that were impossible to find at bookstores near me. Similarly for various computer components that are less frequently used by typical users or even certain kinds of clothes or accessories needed for special activities, for which there were no nearby shops.

I have never been a typical consumer, so it had always been very difficult for me to find what I wanted at local shops, thus the appearance of online shopping was really great for me.

xscott · 2026-05-04T19:25:28 1777922728

I think you're right about the cost/benefit trade-off in general, but I do wonder how much of the "compaction" Codex and Claude do is to keep context fresh and how much is to save (them) runtime costs.

If you've got a 1M token context, but they constantly summarize it down to something much smaller, is it really 1M tokens of benefit? With a local model, you can use all 256k tokens on your own terms. However, I don't have any benchmarks to know.

0xbadcafebee · 2026-05-04T23:43:07 1777938187

I think you might be confused a bit about compaction? The LLM API endpoint does not do compaction, it's an external agent harness that does it. And the Codex/Claude agents aren't constantly summarizing it down, they generally wait until you get within 3/4 of the max of the context size.

Compaction doesn't save them money, it just makes it possible for you to continue a session. If you compact a session too many times, besides the fact that the model basically stops being useful, you eventually just cannot do anything else in the session because all the context is taken up by compaction notes. But if you don't compact it, pretty soon the session is completely unusable because it can't output any more tokens. You can disable compaction in those agents if you want to see the difference.

Also, using a lot of context can make the model perform poorly, so compaction can improve results. If you have a much larger context size, it means you have more headroom before the model starts to perform poorly (as it grows closer to max context size). A larger context also lets you do things like handle larger documents or reason over a larger amount of data without having to break it up into subtasks. Eventually we want models' context to get much bigger so we can do more things in a session. (Some research is being done to see if we can get rid of the limit entirely)
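To make the "wait until you're within 3/4 of the max" behavior concrete, here is a minimal sketch of what a client-side harness might do. Everything in it is an assumption for illustration: the `Session` class, the word-count "tokenizer", and the truncating "summarizer" are hypothetical stand-ins, not how Codex or Claude Code actually work.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    max_context: int          # model's context limit, in tokens
    threshold: float = 0.75   # compact once usage passes 3/4 of the limit
    messages: list = field(default_factory=list)

    def tokens(self) -> int:
        # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
        return sum(len(m.split()) for m in self.messages)

    def add(self, message: str) -> None:
        self.messages.append(message)
        if self.tokens() > self.threshold * self.max_context:
            self.compact()

    def compact(self) -> None:
        # Hypothetical summarizer: a real harness would ask the model for a summary.
        # Here each older message is cut down to its first five words.
        old, keep = self.messages[:-2], self.messages[-2:]
        notes = " / ".join(" ".join(m.split()[:5]) for m in old)
        self.messages = ["[compaction notes] " + notes] + keep
```

With `max_context=100` and 20-word messages, the fourth `add` pushes usage past 75 tokens and triggers a compaction; the two most recent messages survive verbatim and the older ones collapse into a single notes entry.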

cmrx64 · 2026-05-05T01:04:46 1777943086

The LLM API endpoint does do compaction. OpenAI definitely supports server-side compaction, both explicit and automatic, and this is different from what could be implemented purely client-side: https://developers.openai.com/api/docs/guides/compaction (and there were rumors a few months ago on HN about how activation-preserving/latent it is, vs. just summarization). Anthropic as well, in beta (new to me): https://platform.claude.com/docs/en/build-with-claude/compac...

xscott · 2026-05-05T00:27:09 1777940829

The names for the pieces are confusing, so it's easy to talk past each other. For instance, you're saying "Codex the agent", which isn't a thing now. It's currently GPT-5.5, and at one point it was GPT-5.3-Codex, so when I say "Codex", I mean the macOS "harness". Similarly for Claude Code vs Claude Opus/Sonnet.

Anyways, I don't know specifics well enough to argue with you on anything, but there is a cost for input tokens, and you see/pay it when you use the API directly or through OpenRouter. Maybe you looked at the leaked source for Claude Code and can tell me definitively otherwise, but Anthropic's and OpenAI's incentives for when to compact are not always aligned with the user's, depending on pricing plans.
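A back-of-the-envelope sketch of why input tokens matter here, under loudly stated assumptions: each request re-sends the full history, prompt caching discounts are ignored, the summarization request itself is not counted, and all the numbers (turn size, compaction trigger, summary size) are hypothetical.

```python
from typing import Optional


def total_input_tokens(turns: int, tokens_per_turn: int,
                       compact_at: Optional[int] = None,
                       compact_to: int = 0) -> int:
    """Sum the input tokens processed over a session in which every
    request re-sends the whole history."""
    history = 0
    billed = 0
    for _ in range(turns):
        history += tokens_per_turn
        if compact_at is not None and history > compact_at:
            history = compact_to  # history replaced by a summary of this size
        billed += history
    return billed


no_compaction = total_input_tokens(100, 2_000)
with_compaction = total_input_tokens(100, 2_000, compact_at=150_000, compact_to=10_000)
print(no_compaction, with_compaction)  # 10100000 6550000
```

In this toy model a single compaction cuts the input tokens processed over 100 turns by roughly a third. On per-token API pricing that saving goes to the user; on a flat-rate plan it goes to whoever is paying for the compute.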

xscott · 2026-05-04T19:19:38 1777922378

Your point about caliber/quality is fair, but I have been pretty astonished by some of the newer/better models (Gemma 4 variants, GPT-OSS before that).

However, there's not a lot of memory increase to have multiple sessions in parallel with one model. It's an HTTP server, and other than some caching, basically stateless.

iib · 2026-05-04T20:21:49 1777926109

Doesn't llama.cpp (or similar) have to evict the kv cache for this, so that performance is degraded when running multiple sessions? Or how do you load a model in memory and then use it in multiple sessions? I am still learning this stuff

tredre3 · 2026-05-05T00:03:43 1777939423

The model is loaded once and can be used for multiple sessions, and even parallel requests.

llama.cpp uses a unified KV cache that is shared between requests (be they happening in parallel or not). As new requests come in, they'll evict no longer referenced branches, then move to evict the least recently used entry, and so on.

If you come back to a session that's been evicted it will just be parsed again. This is a problem only on very long context sessions, but it can still be a problem to you.
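A toy model of the least-recently-used part of that behavior, to show why coming back to an evicted session means parsing it again. This is only a sketch: the `SessionCache` class is hypothetical, it tracks token counts rather than real KV tensors, and it leaves out the "evict unreferenced branches first" step, so it is not llama.cpp's actual implementation.

```python
from collections import OrderedDict


class SessionCache:
    """Toy LRU cache keyed by session id; values are cached token counts."""

    def __init__(self, capacity_tokens: int):
        self.capacity_tokens = capacity_tokens
        self.entries = OrderedDict()  # session id -> number of cached tokens
        self.processed = 0            # total prompt tokens that had to be computed

    def request(self, session: str, prompt_tokens: int) -> None:
        cached = self.entries.pop(session, 0)
        # Only the part of the prompt that is not already cached gets processed.
        self.processed += max(prompt_tokens - cached, 0)
        self.entries[session] = prompt_tokens  # most recently used goes last
        while (sum(self.entries.values()) > self.capacity_tokens
               and len(self.entries) > 1):
            self.entries.popitem(last=False)   # evict the least recently used session
```

With a 1000-token capacity, alternating between two 600-token sessions evicts one every time, so each switch reprocesses the full prompt; staying in one session only processes the newly appended tokens.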

So one way to reduce such evictions (and reduce KV cache size significantly as a bonus) is to reduce the number of kv cache checkpoints.

Checkpoints allow you to branch a session at any point and not have to recompute it from the start. If you find that you rarely branch a conversation, or if you rely entirely on a coding harness, then setting ctx-checkpoints to 0 or 1 will save tons of VRAM and allow more different sessions to stay in VRAM. This is especially true for models with very large checkpoints (such as Gemma 4).

xscott · 2026-05-05T01:13:48 1777943628

There are so many flags to llama.cpp that I won't try to say anything too strong, but I believe things related to `--kv-offload` mean you can have the KV cache in GPU VRAM, regular CPU RAM, paged to disk, etc...

I'm on a Mac with unified memory, so I can't easily benchmark it for you, but I think a PC with 64GB of regular RAM and a 24GB gaming card could swap between multiple sessions without too much pain. The weights could stay resident on the GPU.
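For a rough sense of the sizes involved, the usual estimate for a plain transformer KV cache is 2 (keys and values) × layers × KV heads × head dimension × context length × bytes per element. The sketch below assumes full attention in every layer and an f16 cache; the model dimensions are hypothetical, not taken from any specific model mentioned here.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_ctx: int, bytes_per_element: int = 2) -> int:
    """Estimate KV cache size: keys plus values for every layer and token.
    Assumes full attention in all layers (no sliding window) and no cache quantization."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_element


# Hypothetical model: 32 layers, 8 KV heads (grouped-query attention), head_dim 128.
per_token = kv_cache_bytes(32, 8, 128, n_ctx=1)
full_256k = kv_cache_bytes(32, 8, 128, n_ctx=256 * 1024)
print(per_token // 1024, "KiB per token")        # 128 KiB per token
print(full_256k // 2**30, "GiB at 256k tokens")  # 32 GiB at 256k tokens
```

Under those assumptions a single full 256k-token session would not fit on a 24GB card next to the weights, but it would fit in 64GB of system RAM, which is why where the cache lives matters.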

On the other hand, I did just dump some Project Gutenberg texts into a prompt, and building that cache in the first place was slower than I thought it would be.