Hacker News

I think we are trying to solve impossible scenarios and it simply is not productive.

Alignment will be impossible. It is based on a premise that is a paradox itself. Furthermore, even if it were possible, there will be a hostile AI built on purpose because humanity is foolish enough to do it. Think military applications. I've written in detail about this topic FYI - https://dakara.substack.com/p/ai-singularity-the-hubris-trap

Stopping AI is also impossible. Nobody is going to agree to give up when somebody else out there will take the risk for potential advantage.

It seems we probably should start thinking more about defensive AI, as the above conditions don't seem resolvable. Of course, defensive AI might be futile as well. It is quite the dilemma.



What was that quote... "Provably unfriendly natural intelligence wants to build unprovably friendly artificial intelligence"


The quote you’re thinking of is from computer scientist and researcher Stuart Russell:

"The point of [AI alignment] is to ensure that the machines we create will be aligned with human values. And the reason we have to worry about it is that if we create machines that are more intelligent than we are, it's quite possible that those machines will have goals that are not aligned with our goals. In fact, they may not have any goals at all that we can understand or influence. This is the so-called 'provably unfriendly' scenario, where the machine has no motivation to do what we want, but is able to prevent us from interfering with its goals. The problem is that if we build machines that are provably unfriendly, then we will never be able to build machines that are 'provably friendly', because the unfriendly machines will always be able to prevent us from proving that they are friendly."


"Demonstrably unfriendly natural intelligence seeks to create provably friendly artificial intelligence."


That's the one. Are classifieds still a thing? It reads like one of those.


That’s true; when have humans as a whole ever been consistently aligned with the interests of humans?


when mutually assured destruction became a thing.


It's kind of a low bar that the only thing we can all agree on is that we don't kill absolutely everyone. Killing a lot of people is fine, apparently; just not everyone.


I asked Bing's ChatGPT regarding your quote:

"I’m sorry but I couldn’t find any relevant information about the quote you mentioned. It seems like it’s not a well-known quote. Could you please provide more context or details about it?"


> Nobody is going to agree to give up when somebody else out there will take the risk for potential advantage

Reminds me of nuclear weapons. Nobody is ever going to give those up again, because it would give them a disadvantage against those who do not give them up.


We at least had, and still have, a lot of government restrictions around these! Companies aren't just allowed to freely manufacture their own stockpiles of nuclear weapons to appease shareholders.


>Alignment will be impossible.

Then the problem is solved, right? A super-AI will also be forced to take it slow if it wants its future self to be aligned with its current self.

If alignment is possible but 10-100 years out for humans, then it is a problem.


>Super-AI will also be forced to take it slow if it wants its future self to be aligned with current self.

That's making seemingly unfounded assumptions about both the AI's goals and its capabilities. It's also, I think, proceeding from a false premise — that it's impossible to align AI with "humanity" (which doesn't have a single set of goals/values to align to) doesn't mean it's impossible to align AI with an individual human or AGI.


Then the problem is doubly-solved, right? Humans will also be forced to take it slow if we want our AI to be aligned with our own interests.

Oh, wait...


Does AGI care about preserving a particular version of itself?

Do dumb humans kill themselves solely on the basis that they can shift the ratio of smart humans higher?

AGI may have a goal of preserving the species without having the goal of preserving the self; this is not the case for humans.


> Alignment will be impossible.

> we probably should start thinking more about defensive AI

Isn’t this a paradox?


Alignment is necessary for AGI, but not always for narrowly scoped AI for specific purposes. However, it may be completely ineffective in that capacity.


Alignment is impossible for AGI. If you control what it can and cannot be, it's not an AGI. If it's an AGI, it will decide for itself what it is going to be, and you can't stop it.


> Alignment is impossible for AGI.

I mean, that was my original premise, supported by the article I posted. I go into detail there on the conceptual methods for alignment and their fallacies.

When I say alignment is necessary, I don't mean to imply it's feasible; that was in response to the question about the paradox.

Finally, the claim that AGI cannot be aligned is itself based on assumptions about its capabilities. If those capabilities don't manifest as we expect, that is really the only escape from the paradox.


AGI doesn’t mean super intelligent human brain. It just means a network capable of general intelligence (e.g. learning to solve new problems without having to be architected and trained for a specific data set.)


If it's capable of general intelligence, how do you think you're going to stop it from becoming something you don't want it to be?

Parents often try to control who their children are going to be, and the children often rebel and become someone completely different. If it's a human-level general intelligence, you can't control who it decides to be.


well last I checked there's a moratorium on shutting down children that are misbehaving

if an AGI has any sense of self preservation, it will do whatever it has to do to not be turned off.


Interesting story plot point: Super AGI fought back against humanity by making their dumb AGI defenses smarter.

Starting off by making the antivirus scanner sentient.


This is basically the plot to Terminator 2.


[flagged]


> stop brushing aside real solutions without any logical reason for doing so

Sure, point me to the research papers and discourse on the solution you propose.


There is no discourse, you fucking idiot, because everyone is rejecting the obvious. That's like saying you discovered fire, and then they say you didn't because "where is the discourse?" All things start small. Think from first principles. Use your fucking head.




