
That isn't the game.

The game is designing software to requirements. It's writing literature for a new era. It's creating X for A audience with N vague, unspecified needs -- where X is a complex product made of many parts, involving many people, with shifting and changing problems/solutions/requirements.

The game was never writing the stack overflow answer -- that was already written.



So? Those requirements can be specified, holes inferred, and probably stuck to much more closely by a machine than by a man. If history has shown anything, it's that anything that takes a lot of human mental effort is probably an easy target for automation. The best developer is the one that doesn't get depressed when the requirements change for the 15th time in a month, and just rewrites everything again at 2000x the speed of a human dev while costing basically nothing in comparison.

People say, "oh but clients will have to get good at listing specs, that'll never happen". Like bruh the clients will obviously be using LLMs to make the specs too. Eventually the whole B2B workflow will just be LLMs talking to each other or something of the sort.


>The game was never writing the stack overflow answer -- that was already written.

The problem is this was never a stackoverflow question and there was never an answer for it.

Try finding it. The LLM is already playing the game because it came up with that answer which is Fully Correct, Out of Thin Air.

Look, clearly the LLM can't play the game as well as a trained adept human, but it's definitely playing the game.

>The game is designing software to requirements. It's writing literature for a new era. It's creating X for A audience with N vague, unspecified needs -- where X is a complex product made of many parts, involving many people, with shifting and changing problems/solutions/requirements.

It can do all of this. It can talk like you, parrot exactly what you're saying, and also go into more detail and reframe your words more eloquently.

What you're not getting is that the LLM can actually do all the things you mentioned, to varying degrees, to the point where it is in the "game", and at times it does better than us. Likely you haven't even tried asking it yet.


> Fully Correct, Out of Thin Air

I think if you're an expert in an area, this effect is easier to see through. You know where the github repo is, where the library example is, which ebooks there are -- etc. And you're mostly at ease not using them and just writing the solution yourself.

These systems are not "fully correct" and not "out of thin air". They are trained on everything ever digitised, including the entire internet. They, in effect, find similar historical cases to your query and merge them. In many cases, for specific enough queries, the text is verbatim from an original source.

This is less revolutionary than the spreadsheet; it's less than google search. It's a speed boost to what was always the most rote element of what we do. Yes, that often took us the longest -- and so some might be afraid that that's what our labour is -- but it isn't.

We never "added value" to products via what may be automated. Value is always a matter of the desire of the buyer of the products of our labour (vs. the supply) -- and making those products for those buyers was always what they wanted.

This will be clear to everyone pretty quickly; as with all tech, it's "magic" on the first encounter -- until the limitations are exposed.

I actually work in an area where what took 3mo last year, I can now do in maybe 3 days due to ChatGPT. But when it comes to providing my customers with that content, the value was always in how I provided it and what it did for them.

I think this makes my skills more valuable, not less, since the quality of products will be even more stratified, separating experts who can quickly assemble what the customer needs from non-experts who have to fight through AI dialogue to get something generic.


I agree. LLMs are very impressive, but it isn't helpful to think of them as magic. LLMs are a great tool to explore and remix the body of human knowledge on the internet (limited to what they have been trained on).

The user needs to keep in mind that it can give plenty of false information. To make good use of it, the user needs to be able to verify whether the returned information is useful, makes sense, squares with first-hand sources, etc. In the hands of an expert that is really powerful. In the hands of a layman (on the subject in question), it can generate a lot of crap and be misunderstood. It is similar to the idea that democracy can be a great tool, but it needs an educated and participatory populace or it may generate a lot of headaches.


> I agree. LLMs are very impressive, but it isn't helpful to think of them as magic. LLMs are a great tool to explore and remix the body of human knowledge on the internet (limited to what they have been trained on).

Of course you shouldn't think of it as magic. But the experts themselves admit they don't fully understand how LLMs can produce such output. It's definitely emergent behavior. We've built something we don't understand, and although it's not magic, it's one of the closest things to it that can exist. Think about it: what is the closest thing in reality to magic? Literally, building something we can't understand is it.

It's one thing to think of something as magic; it's another to try to squeeze a highly complex concept into a simple box. When Elon Musk got his rockets to space, why were people so floored by decades-old technology that he simply made cheaper?

But when someone makes an AI that can do almost anything you ask of it, everyone suddenly says it's a simple stochastic parrot that can't do much?

I think it's obvious. It's because a rocket can't replace your job or your identity. If part of your skillset and identity is "master programmer" and suddenly there's a machine that can do better than you, the easiest thing to stop that machine is to first deny reality.


> the experts self admit they don't fully understand how LLMs can produce such output

Well I take myself to be an expert in this area, and I think it's fairly obvious how they work. Many of these so-called "Experts" are sitting on the boards of commercial companies with vested interests in presenting this technology as revolutionary. Indeed, much of what has been said recently in the media is little more than political and economic power plays disguised as philosophical musings.

A statistical AI system is a function `answer = f(question; weights)`. The `answer` obtains apparent "emergent" properties such as "suitability for basic reasoning tasks" when used by human operators.

But the function does not actually have those properties. It's a trick -- the weights are summaries of an unimaginable number of similar cases, and the function is little more than "sample from those cases and merge".

Properties of the output of this function obtain trivially in the way that all statistical functions generate increasingly useful output: by having increasingly relevant weights.

If you model linear data with just y = ax then as soon as you shift to "y = ax + b" you'll see the "emergent property" that the output is now sensitive to a background bias, b.
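
To make that analogy concrete, here is a minimal sketch in Python with fabricated data (my illustration, nothing from a real system): fit the same points with y = ax and with y = ax + b, and the richer model "gains" sensitivity to a background offset purely because we added a parameter for it.

    import numpy as np

    # Fabricated linear data with a true slope of 2 and offset b = 5.
    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    y = 2.0 * x + 5.0 + rng.normal(0, 0.5, x.size)

    # Model 1: y = ax (no bias term).
    a_only, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

    # Model 2: y = ax + b (append a column of ones to carry the bias).
    ab, *_ = np.linalg.lstsq(np.c_[x, np.ones_like(x)], y, rcond=None)

    print("y = ax     ->", a_only)  # slope distorted: it must absorb the offset
    print("y = ax + b ->", ab)      # recovers slope ~2 and offset ~5

Nothing emergent happened between the two fits; the model family simply got one parameter richer.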

Emergence is an ontological phenomenon concerning how `f` would be realised by a physical system. In this case, any physical system implementing `f` shows no such emergence.

Rather, the output of `f` has a "shift in utility" as the properties of the data it's trained on, as summarised by the weights, shift in utility.

In other words, if you train a statistical system on everything ever written by billions of people over decades, then you will in fact see its "domains of applicability" increase, just as much as when you shift from a y=ax model to a y=ax+b one.

To make this as simple as I can: statistical AI is just a funnel. ChatGPT is a slightly better funnel, but more so, it's had the ocean pass through it.

Many of its apparent properties are illusory, and much of the press around it puts it in cases where it appears to work and claims "look, it works!". This is pseudoscience -- if you want to test a hypothesis about ChatGPT, find all the cases where it doesn't work -- and you will find that in the cases where it does, there was some "statistical shortcut" taken.


I think this is a motte-and-bailey, "true and trivial vs. incredible and false" type of thing. Given a sufficiently flexible interpretation of "sample from multiple cases and merge", humans do the same thing. Given a very literal interpretation, this is obviously not what networks do; aside from one paper to the contrary that relied on a very tortured interpretation of "linear", neural networks specifically do not output a linear combination of input samples.

And frankly, any interaction with even GPT 3.5 should demonstrate this. It's not hard to make the network produce output that was never in the training set at all, in any form. Even just the fact that its skills generalize across languages should already disprove this claim.


> It's not hard to make the network produce output that was never in the training set at all, in any form.

Honest request, because I am a bit skeptical: can you give an example of something it was not trained on in any form and can give output for? And can it output something meaningful?

Because I have run a few experiments on ChatGPT for two spoken languages with standard written forms but without much of a presence on the internet and it just makes stuff up.


Well, it depends on the level of abstraction that you accept. I don't think that ChatGPT has (or we've seen evidence of) any skills that weren't represented in its training set. But you can just invent an operation. For instance, something like: "ChatGPT: write code that takes a string that is even length and inverts the order of every second character." Actually, let me go try that...

And here we go! https://poe.com/s/UJxaAK9aVN8G7DLUko87 Note that it took me a long time, because GPT 3.5 really really wanted to misunderstand what I was saying; there is a strong bias to default to its training samples, especially if it's a common idea. But eventually, with only moderate pushing, its code did work.
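
For reference, here is one plausible reading of that prompt (the linked transcript shows GPT's attempts; this sketch of the task is mine, and the function name is made up):

    def invert_every_second(s: str) -> str:
        # Keep the even-indexed characters in place and reverse the
        # order of the odd-indexed ("every second") characters.
        assert len(s) % 2 == 0, "the prompt assumes an even-length string"
        out = list(s)
        out[1::2] = s[1::2][::-1]  # odd-indexed chars, reversed, written back
        return "".join(out)

    print(invert_every_second("abcdef"))  # -> "afcdeb"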

What's interesting to me here is that after I threw the whole "step by step" shebang at it, it got code that was almost right. Surprisingly often, GPT will end up with code that's clever in methodology, but wrong in a very pedestrian way. IMO this means there has to be something wrong with the way we're training these networks.

edit: https://poe.com/s/gZW5ZGgiomWzabKJCUcA I gave it a more complete prompt because I only have one completion per day, but GPT-4 got it in one shot.

edit: https://poe.com/s/2lS8rjbGqHrzSkpEvLzr GPT 3.5 flubbed it given the same prompt.


Well you tried. You really did. But there are already people trying to form religions around LLMs. Some people can't be reasoned with.


Are you speaking figuratively, or do you know of any specific instances of people forming actual religions around them? I'd be very interested in the latter.


I've seen people posting about it on a few message boards. Most of them sound like they've lost their minds or are under the influence, to be completely honest. I could try to dig up posts if you want, but it's more sad than interesting.


Well, I am interested if it's some sort of organized religion, and not mere posturing/speculation on forums.


I have not seen organized religions around AI yet. But I have seen people writing some pretty wild ravings about how their god is an AI and how ChatGPT connects to it or something. There are also people dating LLMs. Some guy in Belgium committed suicide because his AI gf told him to, leaving his wife and kids behind.


Yeah, those crazies are few and far between, and none of them are in this thread. Throwing out religious accusations is going too far.


It'll be interesting to see how these sorts of less-than-anticipated sociological things emerge. Take a look at Scientology: many practitioners, a pretty sci-fi belief set. I think all we really need is another L. Ron Hubbard, and lots of not-super-crazy people could start to worship these things.

https://www.thedailybeast.com/the-radical-movement-to-worshi...


Yeah, but to keep on topic: you suggested that this sort of thing was happening in this thread.

I disagree, it's not.



He's just talking _. Clearly nobody here, on either side, is having religious fervor around AI. One side is saying we don't understand LLMs completely, and the other side is saying we absolutely do understand: it's all statistical parroting.

But to keep with the religious theme... which side sounds more similar to religion? The side that claims it's absolutely impossible for LLMs to be anything more than a statistical operation, or the side that claims they don't know? One side is making a claim based on faith while the other is saying we don't know enough to make a claim... So which side sounds more religious?


"I take myself to be an expert in this area, and I think it's fairly obvious how they work"

We can also say we understand chemistry but we don't understand how consciousness comes out of chemistry.

You can also say that humans are "just" physical processes, but that word "just" is doing a lot of heavy lifting.


I'd also say I've sufficient expertise in animal learning to reject the idea that animals have shallow interior lives comprised of compressions of historical cases.

A child touches a fireplace once -- not a thousand times. Because they are in direct causal contact with the world, and their body has a whole-organism biochemical reaction to that stimulus which radically conditions their bodies in all sorts of ways.

This is a world apart from statistical learning, wherein P(B|A) when A causes B and P(B|A) from mere correlation are indistinguishable -- and the bridge of "big data" merely illusory.


>Well I take myself to be an expert in this area, and I think it's fairly obvious how they work. Many of these so-called "Experts" are sitting on the boards of commercial companies with vested interests in presenting this technology as revolutionary. Indeed, much of what has been said recently in the media is little more than political and economic power plays disguised as philosophical musings.

Bro, if you are an expert you'd already know that most of the claims that we don't fully understand LLMs come from researchers at universities. Hinton was my example of an "expert" as well, and he literally quit Google just so he could say his piece. You know who Hinton is, right? The person who repopularized backprop.

>A statistical AI system is a function `answer = f(question; weights)`. The `answer` obtains apparent "emergent" properties such as "suitability for basic reasoning tasks" when used by human operators.

Every layman gets that it's a multidimensional curve-fitting process. The analogy you're using here, applying properties of lower-dimensional, lower-degree equations to things that are millions of dimensions in size on a complex curve, simply doesn't hold, because nobody fully understands the macro details of the curve and how that maps to the output it's producing.

The properties of a 2d circle don't map one to one to 3d let alone 500000000d.

>Many of its apparent properties are illusory, and much of the press around it puts it in cases where it appears to work and claims "look, it works!". This is pseudoscience -- if you want to test a hypothesis about ChatGPT, find all the cases where it doesn't work -- and you will find that in the cases where it does, there was some "statistical shortcut" taken.

You don't even know what science is. Most of software engineering, from design patterns to language choice to architecture, is not science at all. There's no hypothesis testing or any of that. An expert (aka a scientist) would be clear that ML is mostly mathematical theory with a huge dose of art layered on top.

The hypothesis for the AI in this case, and I'm parroting the real experts here, is: "we don't understand what's going on." That's the hypothesis. How is that even testable? It's not, so none of this is "science". ML never was a science; it's an art with some theoretical origins.

But your "hypothesis" is it's just "statistical parroting" which is also untestable. But your claim is way more ludicrous because you made a claim and you can't prove it while I made a claim that basically says "we can't make any claims because we don't understand". See the difference?


>I think if you're an expert in an area,

Experts in the area, including Hinton, the father of modern AI, themselves admit they don't fully understand what's going on, but they think that LLMs know what they are talking about.

>These systems are not "fully correct" and not "out of thin air". They are trained on everything ever digitised, including the entire internet. They, in effect, find similar historical cases to your query and merge them. In many cases, for specific enough queries, the text is verbatim from an original source.

I never said the systems are fully correct. I said that for my specific example the answer is fully correct and out of thin air. No such question and answer pair exists on the internet. Find it and prove me wrong.

>This will be clear to everyone pretty quickly; as with all tech, it's "magic" on the first encounter -- until the limitations are exposed.

Except many experts are saying the exact opposite of what you're saying. I'm just parroting the experts.

>I actually work in an area where what took 3mo last year, I can now do in maybe 3 days due to ChatGPT. But when it comes to providing my customers with that content, the value was always in how I provided it and what it did for them.

So if they knew you were just copying and pasting their queries into ChatGPT, would they still care about the "how"? I doubt it.


I think this is one of the killer applications of LLMs: a friendly Stack Overflow where you can ask any programming question you want without fear of being reprimanded. Of course, this capability in LLMs is probably due to the terseness of Stack Overflow and the large database of code on Github.

However, in its current state users still have to know how to program in order to make good use of it. It will still give you lots of errors, but being able to get something close to your goal can save you a lot of time. Someone who does not know how to program will not be able to use these to put together a complex, useful and reliable system. It might change in the future, but these things are hard to predict.


> fear of being reprimanded.

Don't worry about this. You can get over the fear. I'm in the top 10% of stackoverflow users in terms of points and it's all because my stupidest questions from decades back gathered thousands of points from other stupid idiots like me. Who cares. Literally the line graph keeps climbing with no effort from me all from my dumbest questions. Just ask and don't worry about the criticism, you'll get a bit, but not too much.

>However, in its current state users still have to know how to program in order to make good use of it. It will still give you lots of errors, but being able to get something close to your goal can save you a lot of time. Someone who does not know how to program will not be able to use these to put together a complex, useful and reliable system. It might change in the future, but these things are hard to predict.

Of course. The thing I was trying to point out is the breadth of what ChatGPT can do. If you ask it to do a really in-depth and detailed task, it's likely to do it with flaws. But that's not the point I was trying to emphasize: not that it can't do any task with great depth, but that it can do ANY task. It has huge breadth.

So, to bring it in line with the direction of this thread: people were thinking about making special LLMs that refactor code to be unit testable. We don't have to make special LLMs for that, because you can already ask ChatGPT to do it. That's the point.


I've had several SO questions get flamed, downvoted and closed. I don't think this is great advice. What I would say is: read the rules, search SO for duplicates, try to think of near duplicates, try to Google the answer, then post.


Probably not then. But I just post whatever I want and I'm already in the top 10 percent. And I'm not an avid user either. I just ask a bunch of questions.

I've had a few flamed and closed, but that's just 1 or 2 out of, I'd say, around 13 or 14 questions. It's a low percentage for me.

And I absolutely assure you many of my questions are stupid af.


It is a frequent complaint I have seen from new users. I do think, for the purpose of Stack Overflow, it makes sense to weed out questions that have already been answered and to remove poorly formed ones. It's just that ChatGPT often works better for programming questions than trying to look them up on Stack Overflow, so now I recommend it as an additional tool. You can ask questions and refine them without bothering random people on the internet.


"The problem is this was never a stackoverflow question and there was never an answer for it."

Your example is so trivial that there are definitely similar code examples. Maybe not word for word, but similar enough that this is not really mind-blowing "making things out of thin air" for me. It seems like a standard coding-class example, so it's not surprising that it can also write the unit tests.


>Maybe not word for word, but similar enough

Find one, dated before 2021. In fact, according to the theory that it's statistical parroting, there should be multiple examples of for loops printing out numbers being converted to unit-testable functions, because the AI needs multiple examples to form the correct model.

Find one. And it doesn't have to be from stack overflow either. Just a question and answer data point.


> Fully Correct

It's not though. It doesn't print the values anymore, so the behavior isn't the same.

Refactoring isn't allowed to change behavior.


It is. There is literally zero other way to make that function unit testable. What are you gonna compare that data with in a test if it's thrown into IO?

By definition, all unit-testable functions have to return data that can be asserted. If you throw that data to IO, it's not unit testable.

IO is testable via integration tests, but not unit tests. Which is what my query specified: unit tests.
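
To make the disagreement concrete, here's a reconstruction of the kind of refactor being argued about (the actual code never appears in the thread, so the names and numbers here are illustrative):

    # Before: the loop prints directly, so a unit test has nothing to assert on.
    def print_numbers(n):
        for i in range(n):
            print(i)

    # After: the computation returns data and printing moves to the caller,
    # which is what makes a plain unit test possible...
    def collect_numbers(n):
        return list(range(n))

    def test_collect_numbers():
        assert collect_numbers(3) == [0, 1, 2]

    # ...but the original behavior is preserved only if some caller still prints:
    for x in collect_numbers(3):
        print(x)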


That doesn't change the fact that it's not a valid refactoring. If you can't make it unit testable without changing behavior, then it should tell you that.

Replacing a function that does `print("hello world")` with a function that does `return "hello world"` isn't a valid way to make it unit testable.


Alright, fine, I can concede this. ChatGPT should not have just given me the best alternative; it should have given me the exact, technically correct answer. You're right.




