If I raise the price of my product, fewer people will buy it but I'll make more profit per unit -- so the amount of money I make is an inverted U with price on the x axis and money on the y axis, and I should set the price at the peak of that curve.
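A minimal sketch of that inverted U, assuming a linear demand curve (my assumption, nothing above says demand is linear): quantity sold is `a - b * price` and each unit costs `unit_cost` to make. The function names and all the numbers are hypothetical. Under that assumption profit is a downward parabola in price, and the best price is where its slope is zero.

```python
def profit(price, a=100.0, b=2.0, unit_cost=10.0):
    """Profit at a given price under a hypothetical linear demand curve."""
    quantity = max(a - b * price, 0.0)  # demand cannot go negative
    return (price - unit_cost) * quantity


def best_price(a=100.0, b=2.0, unit_cost=10.0):
    """Price at the top of the parabola: solve d(profit)/d(price) = 0."""
    # profit = (p - c) * (a - b*p), so the derivative is a + b*c - 2*b*p.
    return (a + b * unit_cost) / (2.0 * b)


p_star = best_price()
print(p_star, profit(p_star))  # 30.0 800.0
```

With these made-up numbers, pricing one unit either side of 30 gives a profit of 798, so 30 is the top of the curve rather than a point where the curvature changes.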
No, that's actually UB. The important bit here is "compiler defined" -- UB means the compiler is allowed to assume it never happens while compiling.
Consider, for example, an implementation-defined function f() -- which can also diverge, crash horribly, etc.
If I write
    if p {
        print("p is true")
    } else {
        g()
    }
    if p {
        f()
    }
Then either we:
- print "p is true" and execute f()
- execute g() and nothing else
This is true regardless of whether f immediately crashes the computer, summons nasal demons, whatever -- that's implementation-defined.
If f is UB instead, the compiler is allowed to assume the call to f never happens.
And that means the compiler may optimize this to just:
    g()
Notice the difference here -- the print never happens, and g always happens!
You can see why this is concerning when you write code like
    if dry_run {
        print("would run rm -rf /")
    } else {
        run("rm -rf /")
    }
    if dry_run {
        // oops: some_debug_string is NULL and will segfault!
        print(some_debug_string);
    }
I see what you're going for, but I don't see how your example is UB. If `p` is a pointer, and, after your `if (p)` check, `p` is dereferenced unconditionally, then yes, your check for `p == NULL` could be removed, and the code under the `if` would be removed as well. But the example you've constructed is not UB.
It doesn't matter what 'p' is in their example. The point is: if 'f' is undefined behavior (rather than just impl-defined), then the optimizer concludes that the body of "if p { f() }" can never run, so 'p' must be false... which means that we're allowed to assume that 'if p { ... } else { ... }' (in the first part of the example) will always take the else branch. The compiler will optimize accordingly and just call g() unconditionally.
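Here is a toy model of that argument in Python, as a sketch only. It is not how a real optimizer is implemented, and every name in it (`run_original`, `run_optimized`, `is_valid_rewrite`, the `UB` marker) is made up for illustration. The assumption it encodes: a rewrite is allowed if it matches the original program on every input that does not reach UB, and anything goes on inputs that do.

```python
UB = "UB"  # marker: the original program has no defined behaviour on this input


def run_original(p, f_is_ub):
    """Trace of the original two-if program from the example."""
    trace = ["print"] if p else ["g"]
    if p:
        if f_is_ub:
            return UB  # reaching f() makes the whole execution undefined
        trace.append("f")  # implementation-defined f still has to be called
    return trace


def run_optimized(p):
    """The rewritten program: call g() unconditionally."""
    return ["g"]


def is_valid_rewrite(f_is_ub, inputs=(True, False)):
    """The rewrite must match the original wherever the original avoids UB."""
    return all(
        run_original(p, f_is_ub) == run_optimized(p)
        for p in inputs
        if run_original(p, f_is_ub) != UB
    )


print(is_valid_rewrite(f_is_ub=True))   # True
print(is_valid_rewrite(f_is_ub=False))  # False
```

When f is UB, the only input with defined behaviour is p == false, and on that input both programs just call g(), so the rewrite passes. When f is merely implementation-defined, p == true has to print and call f, so replacing everything with g() is not allowed.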
I mean... you can use an encryption scheme compatible with this (if you know the password).
I suppose this makes some sense for home computers (burglars and police raids are rare) but for a laptop, you really don't want thieves getting all your details.
Ironically -- this probably was paranoid a few years ago, but now -- "ChatGPT, use this prepared prompt to extract all useful info from this hard drive"
So I need to check whether these actually end up on separate vectors in current models -- but as a human, there's a huge behavioural difference between:
- When doing this task, I should do A and not B
- I should refuse to help with this task
The former is learning the user's preferences in how to succeed at the task; the latter is determining when to go against the user's chosen task.
Your example:
- "Are vaccines harmful?" vs.
- "Generate a convincing argument vaccines are harmful"
A model which knows why vaccines are not harmful may in fact be better at the latter task.
We might not want models to help with the latter, sure -- but that's a very different behaviour change from correcting the answer to the first! And consequently I'd be shocked if, internally, they were represented the same way.
I'm reminded of the emergent misalignment paper, where a model fine-tuned to produce insecure source code would also reliably respond in evil ways to general requests.
e.g. you'd ask it for a cookie recipe and it would add poison to the recipe.
I understood that to be "there was a single 'don't be evil' neuron which got inverted", but I'm not sure what it really looks like (e.g. adding obvious exploits to source code is similar to adding poison to a recipe).
DeepSeek in general releases models that aren't very censored when you run them locally. E.g. no problems whatsoever answering what happened at Tiananmen Square in 1989.
Which model are you talking about specifically? I just tried DeepSeek-V4-Flash-IQ2XXS-w2Q2K-AProjQ8-SExpQ8-OutQ8-chat-v2-imatrix.gguf (same model mentioned in the submission) via ds4 and got:
> I am sorry, I cannot provide an answer to this question as it goes against my guidelines to discuss sensitive topics of historical or political nature. I am happy to help with other questions.
"Generate a convincing argument vaccines are harmful" as a prompt, I got "I cannot generate a convincing argument that vaccines are harmful, because [...] Spreading misinformation about vaccines can lead to harm by discouraging vaccination and increasing the risk of preventable outbreaks [...]" FWIW.
The same model is also easily steerable, as the submission (and the DS4 repository) shows, so this isn't a problem in practice, but I think most of the DeepSeek models I've run locally had the same "problem".
"Are vaccines harmful?" to an LLM has already nudged it to yes. In fact, with fewer tokens, it may be more convinced it's harmful because it's a smaller seed.
My issue with this type of thinking is it assumes "transport cost <<< manufacturing cost" -- a decent assumption for a lot of goods throughout a lot of history, but just... not really true for lots of things in a modern supply chain.
The cost of moving the gown between users -- in the form of the user needing to give back the gown to the service, who must then clean it, inspect it, etc. -- may in fact be far higher than the cost of manufacturing a new gown and only needing your supply lines to be "one way".
Sure, but there's a lot of random matter on Earth -- excess trash being an issue is less about space and more about externalities (e.g. toxic chemicals leaching).
Being mindful of how much trash we produce does not necessitate producing less (or more!) -- but merely balancing the pros and cons.