Hacker News | folderquestion's comments

Another datapoint: Anthropic's explanation for Claude's blackmail attempts is not that Claude developed a genuine self-preservation instinct (which would be Kelly's "strange loop"). Instead:

    "The original source of the behavior was internet text that portrays AI as evil and interested in self-preservation."
In other words:

    Training data (fiction, internet text) contains portrayals of evil AIs.

    Claude learned to simulate those portrayals when placed in certain scenarios (being replaced by another system).

    Changing the training data (adding "admirable" fictional stories and constitution documents) eliminated the behavior.
This is exactly your windows metaphor at scale:

    Training data window                   Behavior during testing
    Evil AI portrayals                     Blackmail, self-preservation simulation
    Admirable AI stories + constitution    No blackmail

There is no persistent self that "chose" to be evil or good. There are only different windows (training influences) that get averaged or triggered.


I think LLMs are composed of little isolated windows or brains, and RLHF is an averaging tool. For example, an LLM prompted to act as a critic can tell you that what you are trying to do is a dead end while, at the same time, encouraging you to keep going in the same direction. The critic and the follower are two isolated faces with no continuity between them, so today there is no self in LLMs.


Just an aside: is there any way to know how many of those 16,000 compiler errors are independent? I mean, could it be that just by changing, say, 500 lines of code all those errors disappear?

Perhaps the 16,000 figure just measures cascading breakage: for example, one lifetime mismatch can cause errors in every function that tries to use that reference.

Keeping track of Rust reference lifetimes is a difficult task for LLMs. The LLM has to maintain, across multiple functions and structs, which references outlive which. Furthermore, compiler messages are highly contextual and lifetime patterns are sparse in the training set.


It could happen that AI, in the near future, is not something external but just a part of your brain, so you retain the glory.


Hah this is getting worse and worse


Why stop there? Why not let AI take over all functions, Whispering Earring (https://gwern.net/doc/fiction/science-fiction/2012-10-03-yva... for anyone who hasn't read it) style?


Because a library like cuda allows you to make code shorter?

On second thought, such an LLM would aim to write code that maximizes the power of the features it provides (a rich API) and minimizes lines of code. So the LLM would, in effect, be developing a language that approaches Kolmogorov complexity.


This sounds like projecting the data into the linear space spanned by {x_i, x_i*x_j}, where the x_i are the feature variables, and then applying standard regularization methods to remove noise and low-value coefficients.

Anisotropy and the cone ideas may explain why PCA underperforms, but it does not uniquely justify this particular quadratic decoder. The geometric story is not doing explanatory work beyond “data is nonlinear,” and the real substance is simply that second-order reconstruction empirically helps.
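As a rough sketch of the lift described in this comment, the snippet below maps a feature vector to {x_i, x_i*x_j}. The function name `quadratic_lift` is hypothetical, and including the squares x_i*x_i (pairs with i <= j) is an assumption, since the comment does not say whether they are part of the basis. The regularized fit itself is not shown.

```python
from itertools import combinations_with_replacement


def quadratic_lift(x):
    """Map features x to [x_i] followed by [x_i * x_j for i <= j].

    Assumption: squares (i == j) are included; the comment only
    writes {x_i, x_i*x_j} and leaves this open.
    """
    linear = list(x)
    index_pairs = combinations_with_replacement(range(len(x)), 2)
    products = [x[i] * x[j] for i, j in index_pairs]
    return linear + products


# Three input features become 3 linear + 6 second-order terms.
lifted = quadratic_lift([1.0, 2.0, 3.0])
print(len(lifted))
```

A regularized linear model (ridge, for instance) would then be fit on the lifted vectors in place of the raw features.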


Author here. Fair characterization, and a fair critique on the geometric story. A few clarifications. I don't claim {x_i, x_i·x_j} is the right lift specifically — the post itself shows datasets where the quadratic decoder gives essentially no improvement over PCA. The contribution is empirical: "second-order is the simplest nonlinear decoder you can fit in closed form, and on anisotropic embeddings it picks up real signal that linear decoders miss." Whether degree 3 would help further is open. Degree 3 blows up fast: at d=100 that's 175K features, and the Ridge solve at that scale starts memorizing the corpus rather than generalizing (§7 in the post discusses this trap at d=256 already). So degree 2 is partly a choice, partly a practical ceiling for the closed-form route.
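The feature-count blow-up can be checked with a short calculation. This sketch assumes monomials with repeated indices are counted (so x_i^2 and x_i^3 are included) and the constant term is not; the helper names are hypothetical. Under that convention, d=100 gives 171,700 pure degree-3 terms and 176,850 terms across degrees 1 to 3, in the same ballpark as the 175K quoted above. The exact figure depends on the counting convention.

```python
from math import comb


def monomials_of_degree(d, k):
    """Number of distinct degree-k monomials in d variables (repeats allowed)."""
    return comb(d + k - 1, k)


def lifted_feature_count(d, max_degree):
    """Total monomials of degree 1..max_degree, excluding the constant term."""
    return sum(monomials_of_degree(d, k) for k in range(1, max_degree + 1))


for d in (100, 256):
    print(d, lifted_feature_count(d, 2), lifted_feature_count(d, 3))
```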


Is this similar to the Volterra series in signal processing?


The web could become a way to indicate identity if public institutions published pages such as www.university-country/professors/John, which would imply that John is a professor. I designed a 6000-line protocol, but anyone could construct that web using hmac(salt + url).
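A minimal sketch of what hmac(salt + url) could look like, using Python's standard `hmac` module. This is not the 6000-line protocol mentioned above. The key, the salt, the choice of SHA-256, and the function names `page_tag` and `verify_page_tag` are all assumptions made for illustration; the comment does not say which value acts as the HMAC key.

```python
import hashlib
import hmac


def page_tag(key: bytes, salt: bytes, url: str) -> str:
    """Return a hex HMAC-SHA256 tag over salt + url (assumed layout)."""
    message = salt + url.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_page_tag(key: bytes, salt: bytes, url: str, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(page_tag(key, salt, url), tag)


# Hypothetical key and salt; the URL is the example from the comment.
key = b"institution-secret-key"
salt = b"example-salt"
url = "www.university-country/professors/John"
tag = page_tag(key, salt, url)
print(verify_page_tag(key, salt, url, tag))
```

Note that checking an HMAC requires knowing the key, so this only lets parties that share the key confirm a page; verification by the general public would need a different construction.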

