
I have a simple front-end test that I give to junior devs. Every few months I see if ChatGPT can pass it. It hasn’t. It can’t. It isn’t even close.

It answers questions confidently but with subtle inaccuracies. The code that it produces is the same kind of nonsense that you get from recent bootcamp devs who’ve “mastered” the 50 technologies on their eight-page résumé.

If it’s gotten better, I haven’t noticed.

Self-driving trucks were going to upend the trucking industry in ten years, ten years ago. The press around LLMs is identical. It’s neat but how long are these things going to do the equivalent of revving to 100 mph before slamming into a wall every time you ask them to turn left?

I’d rather use AI to connect constellations of dots that no human possibly could, have an expert verify the results, and go from there. I have no idea when we’re going to be able to “gpt install <prompt>” to get a new CLI tool or app, but it’s not going to be soon.



I was on a team developing a critical public safety system on a tight deadline a few years ago, and I had to translate some wireframes for the admin back-end into CSS. I did a passable job but it wasn’t a perfect match. I was asked to redo it by the team lead. It had zero business value, but such was the state of our team…being pixel perfect was a source of pride.

It was one of the incidents that made me stop doing front-end development.

As an exercise, I recently asked ChatGPT to produce similar CSS and it did so flawlessly.

I’m certainly a middling programmer when it comes to CSS. But with ChatGPT I can produce stuff close to the quality of what the CSS masters do. The article points this out: middling generalists can now compete with specialists.


> I recently asked ChatGPT to produce similar CSS and it did so flawlessly.

I use ChatGPT every day for many tasks in my work and find it very helpful, but I simply do not believe this.

> The article points this out: middling generalists can now compete with specialists.

I'd say it might allow novices to compete with middling generalists, but even that is a stretch. On the contrary, ChatGPT is actually best suited to use by a specialist who has enough contextual knowledge to construct targeted prompts & can then verify & edit the responses into something optimal.


That's about my experience.

The worst dev on my team uses ChatGPT a lot, and it's facilitated him in producing more bad code more quickly. I'm not sure it's a win for anyone, and he's still unlikely to be with the team in a year.

It allows a dev who doesn't care about their craft or improving to generate code without learning anything. The code they generate today or a year from today is the same quality.

Part of it is that it allows devs who lean into overcomplicating things to do so even more. The solutions are never a refinement of what already exists, but patch on top of patch on top of patch of complexity. ChatGPT is not going to tell you how to design a system, architect properly, automate, package, test, deploy, etc.

For the team it means there's a larger mess of a code base to rewrite.


> ChatGPT is not going to tell you how to design a system, architect properly, automate, package, test, deploy, etc.

If you ask the right questions it absolutely can.

I’ve found that most people thinking ChatGPT is a rube are expecting too much extrapolation from vague prompts. “Make me a RESTful service that provides music data.” ChatGPT will give you something that does that. And then you’ll proceed to come to hacker news and talk about all the dumb things it did.

But, if you have a conversation with it, tell it more of the things you're considering, some of the trade-offs you're making, how the schema might grow over time... it's kind of remarkable.

You need to treat it like a real whiteboarding session.

I also find it incredibly useful for getting my code into more mainstream shape. I have my own quirks that I’ve developed over time learning a million different things in a dozen different programming languages. It’s nice to be able to hand your code to ChatGPT and simply ask “is this idiomatic for this language?”

I think the people most disappointed with ChatGPT are trying to treat it like a Unix CLI instead of another developer to whiteboard with.


This has been my experience as well.

Every person I've noticed who says that ChatGPT isn't good at what it does has the same thing in common - they're not great at talking to people, either.

Turns out when you train an AI on the corpus of human knowledge, you have to actually talk to it like a human. Which entirely too many people visiting this website don't do effectively.

ChatGPT has allowed me to develop comprehensive training programs for our internal personnel, because I already have some knowledge of training and standardization from my time in the military, but I also have in-depth domain knowledge so I can double-check what it's recommending, then course correct it if necessary.


> Every person I've noticed who says that ChatGPT isn't good at what it does has the same thing in common - they're not great at talking to people, either.

I think that the people who nowadays shit on ChatGPT's code generating abilities are the same blend of people who, a couple decades ago, wasted their time complaining that hand-rolled assembly would beat any compiled code in any way, shape, or form, provided that people knew what they were doing.


> But, if you have a conversation with it, tell it more of the things you're considering, some of the trade-offs you're making, how the schema might grow over time... it's kind of remarkable.

You're not wrong, but I would caution that it can get really confused when the code it produces exceeds the context length. This is less of a problem than it used to be, as the maximum context length is increasing quite quickly, but by way of example: I'm occasionally using it for side projects to see how to best use it, one of which is a game engine. It (with a shorter context length than we have now) started by creating a perfectly adequate Vector2D class with `subtract(…)` and `multiply(…)` functions, but when it came to using that class it was calling `sub(…)` and `mul(…)`. Not absolutely stupid, and a totally understandable failure mode given how it works, but still objectively incorrect.
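To make that concrete, the mismatch looked roughly like this (a sketch in TypeScript; the values and surrounding code are made up, only the subtract/multiply vs sub/mul confusion is the real issue):

  class Vector2D {
    constructor(public x: number, public y: number) {}
    subtract(other: Vector2D): Vector2D {
      return new Vector2D(this.x - other.x, this.y - other.y);
    }
    multiply(scalar: number): Vector2D {
      return new Vector2D(this.x * scalar, this.y * scalar);
    }
  }

  const velocity = new Vector2D(3, 4);
  const drag = new Vector2D(1, 1);

  // What it emitted later, once the class had scrolled out of context:
  //   velocity.sub(drag).mul(0.5);   // sub/mul don't exist: compile error
  // What it should have produced against its own class:
  const next = velocity.subtract(drag).multiply(0.5);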


I frequently run into this, and it’s quite maddening. When you’re working on a toy problem where generating functioning code is giving you a headache - either because it’s complex or because the programming language is foreign or crass - no problem. When you’re trying to extend an assemblage of 10 mixins in a highly declarative framework that many large-scale API contracts rely on to be correct, the problem is always going to boil down to how well the programmer understands the existing tools/context that they’re working with.

To me, a lot of this boils down to the old truism that “code is easier to write than maintain or extend”. Companies who dole out shiny star stickers for producing masses of untested, unmaintainable code will always reap their rewards, whether they’re relying on middling engineers and contractors alone, or with novices supercharged with ChatGPT.


> But, if you have a conversation with it

It can't give you a straight answer, or it hallucinates APIs. It can't tell you "no, this cannot be done"; it tries to "help" you.

For me it's great for writing simple isolated functions, generating regexes, command-line solutions, and exploring new technologies.

But after making it write a few methods or classes, it just gets extremely tedious to make it add or change code, to the point that I just write it myself.

Further, when operating at the edge of your knowledge, it also leads you on, whereas a human expert would just tell you "aaah, but that's just not possible/not a good idea".


I think that's a fair description. While I have not yet found ChatGPT useful in my "real" day job (its understanding of aerospace systems is more than I would have guessed, but yet not enough to be super helpful to me), I have found it generally useful in more commonplace scripting tasks and what-not.

With the caveat of, I still need to understand what it's talking about. Copy-pasting whatever it says may or may not work.

Which is why I remain dubious that we're on the road to LLMs replacing software engineers. Assisting? Sure, absolutely.

Will we get there? I don't know. I mean, like, fundamentally, I do not trust LLMs. I am not going to say "hey ChatGPT, write me a flight management system suitable for a Citation X" and then just go install that on the plane and fly off into the sunset. I'm sure things will improve, and maybe improve enough to replace human programmers in some contexts, but I don't think we're going to see LLMs replacing all software engineers across the board.


In a similar vein, ChatGPT can be an amazing rubber duck. If I have a strange and obscure problem that stumps me, I kinda treat ChatGPT like I would treat a forum or an IRC channel 15 - 20 years back. I don't have "prompting experience or skills", but I can write up the situation, what we've tried, what's going on, and throw that at the thing.

And... it can dredge up really weird possible reasons for system behaviors fairly reliably. Usually, for a question of "Why doesn't this work after all of that?", it drags up like 5-10 reasons for something misbehaving. We usually checked like 8 of those. But the last few can be really useful to start thinking outside of the normal box about why things are borked.

And often enough, it can find at least the right idea to identify root causes of these weird behaviors. The actual "do this" tends to be some degree of bollocks, but enough of an idea to follow-up.


> The worst dev on my team uses ChatGPT a lot, and it's facilitated him in producing more bad code more quickly.

This is great. The exact same is true with writing, which I think it's trivial for anyone to see. Especially non-native speakers or otherwise bad writers can now write long-winded nonsense, which we're starting to see all over. It hasn't made anyone a good writer, it's just helped bad ones go faster.


> Especially non-native speakers or otherwise bad writers can now write long-winded nonsense

You have now described 95% of Quora's content.


Isn't it expected? ChatGPT was trained on such texts as well.


Bless this person on your team; he is creating work out of thin air and will keep your team, and possibly other teams, employed for a really long time.


Exactly - and people say AI will take away jobs!


This could even play out on a broader scale and even increase the demand for software engineers.

What is going to happen if more and more people are creating lots of software that delivers value but is difficult to maintain and extend?

First it's going to be more high-level stuff, then more plumbing, more debugging and stitching together half-baked solutions than ever before.

AI might make our jobs suck more, but it's not going to replace them.



I have a hunch that using ChatGPT might be a skill in and of itself and it doesn’t necessarily hurt or help any particular skill level of developers.

In previous replies in this thread the claim is that it helps novices compete with associates, or associates with seniors, but in reality it will probably help any tier of skill level. You just have to figure out how to prompt it.


One hundred percent. Most people I’ve seen dismiss ChatGPT simply refuse to engage it appropriately. It’s not likely to solve your most complex problem with a single prompt.

Asking the right questions is such an important skill in and of itself. I think we’re seeing to some extent the old joke about engineers not knowing how to talk to people manifest itself a bit with a lot of engineers right now not knowing quite how to get good results from ChatGPT. Sort of looking around the room wondering what they’re missing since it seems quite dumb to them.


I had a friend jokingly poke fun at me for the way I was writing ChatGPT prompts. It seemed, to him, like I was going out of my way to be nice and helpful to an AI. It was a bit of an aha moment for him when I told him that helping the AI along gave much more useful answers, and he saw I was right.


They use GPT 3.5, prompt it with "Write a javascript login page" and then look at the code and go "Damn, this thing is stupid as fuck".


I use “ChatGPT” (really Bing Chat, which is OpenAI under the hood as I understand it) more than anyone on my team, but it is very rarely for code.

I most often use it for summarizing/searching through dense documentation, creating quick prototypes, “given X,Y,Z symptoms and this confusing error message, can you give me a list of possible causes?” (basically searches Stack Overflow far better than I can).

Anyway, basically the same as how I was using Google when Google was actually good. Sometimes I will forget some obscure syntax and ask it how to do something, but not super often. I’m convinced using it solely to generate code is a mistake unless it’s tedious boilerplate stuff.


Yes, agreed. The best way of putting this is "using Google when Google was actually good."


Bing is far and away worse than GPT-4 through ChatGPT or the API, just FYI. Don't even consider it comparable, even if they say it is the same model under the hood. Their "optimizations" have crippled its capabilities if that is the case.


Well, it works pretty well for me, and cites its sources with links, and I have yet to catch it making up something.


We were talking about code generation though. It's horrible at it.


My parent comment says I rarely use it for code generation, and I think if you're using these tools purely for that you're doing it wrong... that was like my entire point.


Right, but your reference for the point is a weak model. Your view would likely change a bit if you used GPT-4. It is substantially more powerful and skilled.


Exactly what I was trying to say. Thank you.

Still works best if you know what you are doing and can give very detailed instructions, but GPT-4 is vastly more capable than Bing for these tasks.


But I don't need it to be. Like at all. I've yet to find a programming task that GPT-4 would have made faster, and yes I have used it.


On the flip side, one can use ChatGPT as only a starting point and to learn from there. One isn't stuck with actually using what it outputs verbatim, and really shouldn't until at least a hypothetical GPT-6 or 7... and to use it fully now, one has to know how to nudge it when it goes into a bad direction.

So overall it's more an amplifier than anything else.


I have a lot of juniors floating around my co-op, and when I watch them use chatgpt it seems to become a dependency. In my opinion it's harming their ability to learn. Rather than thinking through problems, they'll just toss every single roadblock they hit instantly into a chatgpt prompt.

To be fair, I've been doing the same thing, punching simple mathematics into a calculator in my browser, to the point that I'm pretty sure I'd fail at long division by hand.

Maybe it won't matter in a few years and their chatgpt skills will be well honed, but if it were me in their position I wouldn't gamble on it.


Yeah that's my larger point about the guy and the pattern. It doesn't lead to growth. I've seen zero growth whatsoever.

And he Slacks coworkers like he is talking to ChatGPT too, slinging code blobs without context, example input data, or the actual error he received.


> Yeah that's my larger point about the guy and the pattern. It doesn't lead to growth. I've seen zero growth whatsoever.

If ChatGPT can solve a problem consistently well, I don't think it's worth the effort to master that kind of problem myself.

My examples are regexes, command lines to manipulate files, kafka/zookeeper commands to explore a test environment.

For me it's a big win in that regard.


> So overall it's more an amplifier than anything else.

Overall it would be an amplifier if that were how the majority used it. Sadly I don't believe that to be the case.


That's been the case with every technology made by man since fire.


Yup. We should embrace it, but without being naïve about what great things it's bringing us :)


Ouch! Hot!


If the results were more akin to a google or stack overflow where there was a list of results with context.. sure.

But people are using the singular response as "the answer" and moving on..


> If the results were more akin to a google or stack overflow where there was a list of results with context.. sure.

I don't think the history of the usage of either shows that most people make any use of that context.


Especially these days you have to know how to use/read Google and SO results too.

(And I should have said ChatGPT4 earlier; if you're a bad-to-mediocre developer taking ChatGPT3.5 literally, you'll probably wind up in a Very Bad Place.)


Phind is a bit more like this


This has been my experience as well - for repetitive things, if what you're looking for is the shitty first draft, it's a way to get things started.

After that, you can shape the output - without GPT's help - into something that you can pull off the shelf again as needed and drop it into where you want it to go, because at that point in the process, you know it works.


> nudge it when it goes into a bad direction

It has happened a few times for me that ChatGPT gets stuck in a bullshit loop and I can't get it unstuck.

Sure, I could summarise the previous session for ChatGPT and try again, but I'm too tired at that point.


I get that with humans sometimes too. Even here, even before LLMs became popular. Someone gets primed on some keyword, then goes off on a direction unrelated to whatever it was I had in mind and I can't get them to change focus — and at least one occasion (here) where I kept saying ~"that's not what I'm talking about" only to be eventually met (after three rounds of this) with the accusation that I was moving the goalposts :P


Yeah, regardless of hallucinations and repeating the same mistake even after you tell it to fix it, iterating with ChatGPT is so much less stressful than iterating with another engineer.

I almost ruined my relationship with a coworker because they submitted some code that took a dependency on something it shouldn't have, and I told them to remove the dependency. What I meant was "do the same thing you did, but instead of using this method, just do what this method does inside your own code." But they misinterpreted it to mean "Don't do what this method does, build a completely different solution." Repeated attempts to clarify what I meant only dug the hole deeper because to them I was just complaining that their solution was different from how I would've done it.

Eventually I just showed them in code what I was asking for (it was a very small change!) and they got mad at me for making such a big deal over 3 lines of code. Of course the whole point was that it was a small change that would avoid a big problem down the road...

So I'll take ChatGPT using library methods that don't exist, no matter how much you tell it to fix it, over that kind of stress any day.


Use the edit button


>One isn't stuck with actually using what it outputs verbatim, and really shouldn't until at least a hypothetical GPT-6 or 7... and to use it fully now, one has to know how to nudge it when it goes into a bad direction.

Exactly.

ChatGPT is the single smartest person you can ask about anything, and it has unlimited patience.


>ChatGPT is not going to tell you how to design a system, architect properly, automate, package, test, deploy, etc.

Really? Please explain how ChatGPT cannot, but you can. What magic is it that you know that it's incapable of explaining?


Because it seems to have built-in assumptions about the scale, scope and complexity of the solution you are trying to develop. Specifically, unless you tell it to think about automated testing, or architecting, or making code loosely coupled, it will hack.

This is because a lot of the time what a beginner needs IS a hack. But if you always hack at your job, things stop working.

I want to reiterate. One of the reasons the LLM is so helpful is because it has been RLHF'ed into always treating you as a beginner. This is core to its usefulness. But it also limits it from producing quality work unless you bombard it with an enormous prompt explaining all of the differences between a beginner's and an expert's code. Which is tedious and lengthy to do every time when you just want a short change that doesn't suck.

Humans are able to learn from all sorts of external cues what level they should pitch an idea at. Startups should focus on hacky code, for velocity. Large businesses should focus on robust processes.


I agree with this. There are cases where it produces good results, but there are also cases where it produces bs, and it's not always obvious. I find it to work fine for cases where I know what I want but could use a starting point, but it often invents or misunderstands all kinds of things.

The most frustrating situations are those where it invents a function that would miraculously do what's necessary, I tell it that function does not exist, it apologizes, shuffles the code around a bit and invents a different function, etc. It's the most annoying kind of debugging there is.


Another very obvious thing it does when it comes to code is take the most common misconceptions & anti-patterns used within the programming community & repeat them in an environment where there's no-one to comment. People have critiqued Stack Overflow for having so many "wrong" answers with green checkmarks, but at least those threads have surrounding context & discussion.

A case in point: I asked ChatGPT to give me some code for password complexity validation. It gave me perfectly working code that took a password and validated it against X metrics. Obviously the metrics are garbage, but the code works, and what inexperienced developer would be any the wiser. The only way to get ChatGPT to generate something "correct" there would be to tell it algorithmically what you want (e.g. "give me a function measuring information entropy of inputs", etc.) - you could ask it 50 times for a password validator: every one may execute successfully & produce a desired UI output for a web designer, but be effectively nonsense.
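To make the contrast concrete, here's a rough sketch (TypeScript; the function names and the 60-bit threshold are my own illustration, not ChatGPT output) of the checkbox-style validator it tends to give you versus the entropy-style measurement you have to explicitly ask for:

  // The kind of "metrics" validator it happily produces: runs fine,
  // yet accepts weak passwords like "Password1!" and rejects long passphrases.
  function naiveComplexityCheck(password: string): boolean {
    return (
      password.length >= 8 &&
      /[a-z]/.test(password) &&
      /[A-Z]/.test(password) &&
      /[0-9]/.test(password) &&
      /[^a-zA-Z0-9]/.test(password)
    );
  }

  // A (still crude) entropy estimate: character-pool size raised to the length,
  // expressed in bits. Closer to what you actually want to gate on.
  function estimatedEntropyBits(password: string): number {
    let pool = 0;
    if (/[a-z]/.test(password)) pool += 26;
    if (/[A-Z]/.test(password)) pool += 26;
    if (/[0-9]/.test(password)) pool += 10;
    if (/[^a-zA-Z0-9]/.test(password)) pool += 32;
    return password.length * Math.log2(Math.max(pool, 1));
  }

  const acceptable = estimatedEntropyBits("correct horse battery staple") >= 60;

Both run fine and render a green checkmark in the UI; only one of them actually measures anything.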


> there are also cases where it produces bs, and it's not always obvious

Particularly annoying because I wind up risking not actually saving time, since it's producing subtle bugs that I wouldn't have written myself.

So, you save yourself the time of thought and research at the risk of going down new and mysterious rabbit holes


For me the trick to avoiding this trap is to limit usage to small areas of code, test frequently, and know its limits. I love using copilot/GPT for boilerplate stuff.


> There are cases where it produces good results, but there are also cases where it produces bs, and it's not always obvious.

Pessimistically, this is the medium term role I see for a lot of devs. Less actual development, more assembly of pieces and being good enough at cleaning up generated code.

If an LLM can get you even 25% there most of the time, that's a massive disruption of this industry.


I mean, especially in webdev we've been heading in that direction for a while now anyway. So much of the job is already just wiring up different npm packages and APIs that someone else has written. I've read substantially similar comments back in the mid 2010s about how people weren't learning the fundamentals and just pulling things like left-pad off of a repo. That did cause a disruption in how people coded by abstracting away many of the problems and making the job more about integrating different things together.


> ChatGPT is actually best suited to use by a specialist who has enough contextual knowledge to construct targeted prompts & can then verify & edit the responses into something optimal.

I agree with this, but what that means is that specialists will be able to create next generation tools--across all professions including coding--that do supercharge novices and generalists to do more.


> ChatGPT is actually best suited to use by a specialist who has enough contextual knowledge to construct targeted prompts

This is my take also.

ChatGPT for novices is dangerous; it's the equivalent of a calculator. If you don't know your expected output you're just wrong faster.

But if you know what to expect, what your bounds are, and how to do it normally anyway, it can make you faster.


I wrote just the tool to optimize AI in the hands of a coding expert: https://observablehq.com/@tomlarkworthy/robocoop


I’d need to see it.

I can’t get ChatGPT to outperform a novice. And now I’m having candidates argue that they don’t need to learn the fundamentals because LLMs can do it for them. Good luck, HTML/CSS expert who couldn’t produce a valid HTML5 skeleton. Reminds me of the pre-LLM guy who said he was having trouble because he usually uses React. So I told him he could use React. I don’t mean to rag on novices, but these guys really seemed to think the question was beneath them.

If you want to get back into front-end read “CSS: The Definitive Guide”. Great book, gives you a complete understanding of CSS by the end.


Requirements vary. It certainly can't produce really complex visual designs, or code a designer would be very happy with, but I have a hobby project work in progress where GPT-4 has produced all of the CSS and templates. I have no doubt that the only reason that worked well is that it's a simple design of a type there are about a billion of in its training set, and that it'd fall apart quickly if I started deviating much from that. But it produced both clean CSS and something nicer looking than I suspect I would have produced myself.

A designer would probably still beat it - this doesn't compete with someone well paid to work on heavily custom designs. But at this point it does compete with places like Fiverr for me, for things I can't or don't want to do myself. It'll take several iterations for it to eat its way up the value chain, but it probably will.

But also, I suspect a lot of the lower end of the value chain, or at least part of them, will pull themselves up and start to compete with the lower end of the middle by figuring out how to use LLMs to take on bigger, more complex projects.


This meshes pretty well with my experience.


I'm always asking it to stitch together ad hoc bash command lines for me, eg "find all the files called *.foo in directories called bar and search them for baz".

(`find / -type d -name 'bar' -exec find {} -type f -name '*.foo' \; | xargs grep 'baz'` apparently.)

I would have done that differently, but it's close enough for government work.


This is funny to me, because I would _always_ use -print0 and xargs -0, and for good reasons, I believe. But if you base your entire knowledge on what you find online, then yes, that's what you get - and what _most people will get too_. Also, I can still update that command if I want.

So it's not any worse than good-old "go to stack overflow" approach, but still benefits from experience.

FYI, this is the correct, as-far-as-I-can-tell "good" solution:

  find . -type d -name 'bar' -print0 | \
    xargs -0 -I{} find {} -type f -name '*.foo' -print0 | \
    xargs -0 grep -r baz

This won't choke on a structure like this (ls -R):

  .:
  bar  foo

  ./bar:
  test.foo  'test test.foo'

  ./foo:
  bar  bleb.foo

  ./foo/bar:


...actually

  find . -path '*/bar/*.foo' -print0 | xargs -0 grep baz

;-) no regex, no nested stuff, much shorter. My brain went back to it ;-)


Using better languages like Powershell or Python becomes a lot more valuable here. I definitely think bash is going to be mostly useless in 5 years, you'll be able to generate legible code that does exactly what you want rather than having to do write-only stuff like that. Really we're already there. I've long switched from bash to something else at the first sign of trouble, but LLMs make it so easy. Poorly written python is better than well-written bash.

Of course, LLMs can generate go or rust or whatever so I suspect such languages will become a lot more useful for things that would call for a scripting language today.


> I definitely think bash is going to be mostly useless in 5 years

I'll take that bet


This is kinda side to my main point: while online knowledge is great, there are sometimes surprisingly deep gaps in it. So I can see AI trained on it sometimes struggle in surprising ways.


I would generalize even more and say that any scripting language is going to be deprecated very soon, like Python etc. They are going to be replaced by safe, type-checked, theorem-proved verbose code, like Rust or something similar.

What do I care how many lines of code are necessary to solve a problem, if all of them are gonna be written automatically? 1 line of Bash/awk versus 10 lines of Python versus 100 lines of Rust? Are they any different to one another?


  $ find . -type f -regex '.*bar/[^/]*.foo' -exec grep baz {} +
I wonder if you created a GPT and fed it the entirety of Linux man pages (not that it probably didn't consume them already, but perhaps this weights them higher), if it would get better at this kind of thing. I've found GPT-4 is shockingly good at sed, and to some extent awk; I suspect it's because there are good examples of them on SO.


If SO had known to block GPTBot before it was trained, GPT4 would be a lot less impressive.


Same here! That's the main use I have for ChatGPT in any practical sense today - generating Bash commands. I set about giving it prompts to do things that I've had to do in the past - it was great at it.

Find all processes named '*-fpm' and kill the ones that have been active for more than 60 seconds - then schedule this as a Cron job to run every 60 seconds. It not only made me a working script rather than a single command but it explained its work. I was truly impressed.

Yes it can generate some code wireframes that may be useful in a given project or feature. But I can do that too, usually in about the time it'd take me to adequately form my request into a prompt. Life could get dangerous in a hurry if product management got salty enough in the requirements phase that the specs for a feature could just be dropped into some code assistant and generate product. I don't see that happening ever though - not even with tooling - product people just don't seem to think that way in the first place in my experience.

As developers we spend a lot of our time modifying existing product - and if the LLM knows about that product - all the better job it could do I suppose. Not saying that LLMs aren't useful now and won't become more useful in time - because they certainly will.

What I am saying is that we all like to think of producing code as some mystical gift that only we as experienced (BRILLIANT, HANDSOME AND TALENTED TOO!!) developers are capable of. The reality is that once we reach a certain level of career maturity, if we were ever any good in the first place, writing code becomes the easiest part of the job. So there's a new tool that automates the easiest part of the job? Ok - autocomplete code editors were cool too like that. The IDE was a game changer too. Automated unit tests were once black magic too (remember when the QA department was scared of this?).

When some AI can look at a stack trace from a set of log files, being fully aware of the entire system architecture, locate the bug that compiled and passed testing all the way to production, recommend, implement, test and pre-deploy a fix while a human reviews the changes then we're truly onto something. Until then I'm not worried that it can write some really nice SQL against my schema with all kinds of crazy joins - because I can do that too - sometimes faster - sometimes not.

So far ChatGPT isn't smarter than me but it is a very dutiful intern that does excellent work if you're patient and willing to adequately describe the problem, then make a few tweaks at the end. "Tweaks" up to seeing how the AI approached it, throwing it out and doing it your own way too.


Except you should at least try to write code for someone else (and probably of lower level of competence - this also helps for your own debugging later) - obscure one-liners like these should be rejected.


I wouldn't call it obscure, just bog standard command line stuff. How would you have done it?


The lower level person need only plug that one liner into chatGPT and ask for a simple explanation.

We're in a different era now.


Yep! It’s something some aren’t seeing.

The AI coding assistant is now part of the abstraction layers over machine code. Higher level languages, scripting languages, all the happy paths we stick to (in bash, for example), memory management with GCs and borrow checkers, static analysis … now just add GPT. Like mastering memory management and assembly instructions … now you also don’t have to master the fiddly bits of core utils and bash and various other things.

Like memory management, whole swathes of programming are being taken care of by another program now, a Garbage Collector, if you will, for all the crufty stuff that made computing hard and got in between intent and assessment.


The difference is that all of them have theories and principles backing them, and we understand why they work.

LLMs (and "AI" in general) are just bashing data together until you get something that looks correct (as long as you squint hard enough). Even putting them in the same category is incredibly insulting.


There are theories and principles behind what an AI is doing and a growing craft around how to best use AI that may very well form relatively established “best practices” over time.

Yes, there’s a significant statistical aspect involved in the workings of an AI, which distinguishes it from something more deterministic like syntactic sugar or a garbage collector. But I think one could argue that that’s the trade-off for a more general tool like AI, in the same way that giving a task to a junior dev is going to involve some noisiness in need of supervision. But in the grand scheme of software development, devs are in the end tools too, a part of the grand stack, and I think it’s reasonable to consider AI as just another tool in the stack. This is especially so if devs are already using it as a tool.

Dwelling on the principled vs. statistical distinction, while salient, may very well be a fallacy or irrelevant to the extent that we want to talk about the stack of tools and techniques software development employs. How much does the average developer understand, or employ an understanding of, a principled component of their stack? How predictable is that component, at least in the hands of the average developer making average but real software? When the end of the pipeline is a human and a human organisation of other humans, whether a tool is principled or statistical may not matter much so long as it’s useful or productive.


Yes, but this is not something that has been enabled by the new neural networks, but rather by search engines, years ago - culminating in the infamous «copy-paste from Stack Overflow without understanding the code» / libraries randomly pulled from the Web with for instance the leftpad incident.

So what makes it different this time ?


So now the same tool can generate both the wrong script and the wrong documentation!


Wait, wait, let me show you my prompt for generating unit tests.


Assuming there is a comment just above the one-liner saying "find all directories named 'bar', find all files named '*.foo' in those directories, search those files for 'baz'", this code is perfectly clear. Even without the comment, it's not hard to understand.


If the someone elses on my team can't read a short shell pipeline then I failed during interviewing.


To me, the only obscure thing about this is that it is a one-liner.

If you write it in three lines, it is fine. Although, I guess the second find and the grep could be shortened, combined into one command.


I agree - but my point still stands.


You can use gpt for government work??


Shh. What they don't know won't hurt me.

(Serious answer: it's just an expression. https://grammarist.com/idiom/good-enough-for-government-work...).


I haven't practiced or needed to use the fundamentals in literal years; I'm sure I'd fumble some of these tests, and I've got err, 15 years of experience.

It's good to know the fundamentals and be able to find them IF you find a situation where you need them (e.g. performance tuning), but in my anecdotal and limited experience, you're fine staying higher level.


I had a chilling experience of late when, out of curiosity, I tried the actual online practice exam for driving school. Boy did I fail it. I realized that there are quite some road signs I have never seen in my life, and more importantly, that my current solution to all their right-of-way questions is "slow down and see what the others do" - not even that wrong if I think about it, but it won't get you points in the exam.


And I suspect you would be a lot less likely to be involved in a crash than someone who had just passed the test.


There are levels of fundamentals though; since the parent mentioned HTML/CSS/React I guess they're referring to being able to create a layout by hand vs using a CSS framework/library. You don't need to know how a CPU works to fix a CSS issue, but if all you know is combining the classes available, you'll have trouble with even the simplest web development.

Everyone should know enough fundamentals to be able to write simple implementations of the frameworks they depend on.


Kind of the same sort of situation, but I do like to refresh some of the fundamentals every 3~4 years or so. Usually when I do a job hop.

It's kind of like asking an Olympic sprinter how to walk fast.


>If you want to get back into front-end read “CSS: The Definitive Guide”. Great book, gives you a complete understanding of CSS by the end.

Do you realize for how many technologies you can say the same thing? I don't want to read a 600 page tome on CSS. The language is a drop in the bucket of useful things to know. How valuable is a "complete understanding" of CSS? I just want something on my site to look a specific way.


> I don't want to read a 600 page tome on CSS.

The latest edition is closer to 1,100 pages.

It’s worth it.

It’s usually worth it for any long-standing technology you’re going to be spending a significant amount of time using. Over the years you’ll save time because you’ll be able to get to the right answer in less time, debug faster, and won’t always be pulling out your hair Googling.


Sometimes it requires expert guidance to get something meaningful out.


This is the correct answer. I have 23 years of experience in datacenter ops and it has been a game changer for me. Just like any tool in one's arsenal, its utility increases with practice and learning to use it correctly. ChatGPT is no different. You get out of it what you put in to it. This is the way of the world.

I used to be puzzled as to why my peers are so dismissive of this tech. Same folks who would say "We don't need to learn no Kubernetes! We don't need to code! We don't need ChatGPT". They don't!

And it's fine. If their idea of a career is working in same small co. doing the same basic Linux sysadmin tasks for a third of the salary I make then more power to them.

The folks dismissive of the AI/ML tech are effectively capping their salary and future prospects in this industry. This is good for us! More demand for experts and less supply.

You ever hire someone that uses punch cards to code?

Neither have I.


I think it's more akin to using compilers in the early days of BCPL or C. You could expect it to produce working assembly for most code but sometimes it would be slower than a hand-tuned version and sometimes a compiler bug would surface, but it would work well enough most of the time.

For decades there were still people who coded directly in assembly, and with good reason. And eventually the compiler bugs would be encountered less frequently (and the programmer would get a better understanding of undefined behavior in that language).

Similar to how dropping into inline assembly for speeding up execution time can still have its place sometimes, I think using GPT for small blocks of code to speed up developer time may make some sense (or tabbing through CoPilot), but just as with the early days of higher level programming languages, expect to come across cases where it doesn't speed up DX or introduces a bug.

These bugs can be quite costly. I've seen GPT spit out encryption code and completely leave out critical parts, like arguments to a library call, or generate the same nonce or salt value every execution. With code like this, if you're not well versed in the domain it is very easy to overlook, and unit tests would likely still pass.
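The nonce mistake looks something like this (a hedged TypeScript/Node sketch, not the actual generated code I saw); an encrypt/decrypt round-trip test passes either way, which is exactly why it slips through:

  import { createCipheriv, randomBytes } from "node:crypto";

  const key = randomBytes(32); // 256-bit AES key

  // The buggy shape: a constant nonce baked into the code, reused for every message.
  // Reusing a key/nonce pair breaks AES-GCM's confidentiality and authenticity.
  // const iv = Buffer.alloc(12, 0);

  // The correct shape: a fresh random nonce per message, stored alongside the ciphertext.
  function encrypt(plaintext: string) {
    const iv = randomBytes(12);
    const cipher = createCipheriv("aes-256-gcm", key, iv);
    const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
    return { iv, ciphertext, tag: cipher.getAuthTag() };
  }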

I think the same lesson told to young programmers should be used here -- don't copy/paste any code that you do not sufficiently understand. Also maybe avoid using this tool for critical pieces like security and reliability.


I think many folks have had those early chats with friends where one side was dismissing LLMs for so many reasons, when the job was to see and test the potential.

While part of me enjoyed the early gpt much more than the polished version today, as a tool it’s much more useful to the average person should they make it back to gpt somehow.


Just out of curiosity: is the code generated by ChatGPT not what you expected, or is it failing to produce the result that you wanted?

I suspect you mean the latter, but just wanted to confirm.


The statements are factually inaccurate and the code doesn’t do what it claims it should.


Right. That's an experience completely different from the majority here that have been able to produce code that integrates seamlessly into their projects. Do you have any idea why?

I guess we should start by what version of ChatGPT you are using.


ChatGPT might as well not exist - I'm not touching anything GAFAM-related.

Any worthy examples of open source neural networks, ideally not from companies based in rogue states like the US ?


Falcon is developed by a UAE tech arm; not sure if you would consider it a rogue state or not: https://falconllm.tii.ae/


What is «tech» supposed to mean here ? Infocoms ?

The United Arab Emirates ? Well, lol, of course I do, that's way worse than the US.


ChatGPT goes from zero to maybe 65th percentile? There or thereabouts. It's excellent if you know nothing. It's mediocre and super buggy if you're an expert.

A big difference is that the expert asks different questions, off in the tails of the distribution, and that's where these LLMs are no good. If you want a canonical example of something, the median pattern, it's great. As the ask heads out of the input data distribution the generalization ability is weak. Generative AI is good at interpolation and translation, it is not good with novelty.

(Expert and know-nothing context dependent here.)

One example: I use ChatGPT frequently to create Ruby scripts for this and that in personal projects. Frequently they need to call out other tools. ChatGPT 4 consistently fails to properly (and safely!) quote arguments. It loves the single-argument version of system which uses the shell. When you ask it to consider quoting arguments, it starts inserting escaped quotes, which is still unsafe (what if the interpolated variable contains a quote in its name). If you keep pushing, it might pull out Shell.escape or whatever it is.
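The same pitfall, sketched in TypeScript/Node rather than Ruby (illustrative only, not my actual scripts): the generated code keeps reaching for the string-through-a-shell form, and escaping quotes inside the string doesn't actually make it safe.

  import { exec, execFile } from "node:child_process";

  const filename = "report'; rm -rf ~ #.txt"; // hostile, or just an unlucky filename

  // The shape it keeps producing: one interpolated string, run through a shell.
  // Wrapping ${filename} in (escaped) quotes still breaks if the value contains quotes.
  // exec(`convert "${filename}" out.png`);

  // The safe shape: an argument vector, no shell involved at all.
  execFile("convert", [filename, "out.png"], (err) => {
    if (err) console.error(err);
  });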

I assume it reproduces the basic bugs that the median example code on the internet does. And 99% of everything being crap, that stuff is pretty low quality, only to be used as an inspiration or a clue as to how to approach something.


I encountered this with a particular problem in Python. It seemed like GPT wanted to always answer with something that had a lot of examples on the web, even if most answers were not correct. So it's a garbage-in, garbage-out problem. I'm a bit worried that LLMs will continue to degrade as the web has an increasing amount of LLM-generated content. It seems to already be occurring.


Why do people who hand-wave away the entire concept of LLMs, because of one instance of it doing one thing poorly that they could do better, always seem to fail to just show us their concrete example?


Technically, the garbage-in/garbage-out problem is not being hand-waved away. I've seen a lot of articles on this; it's sometimes called a degrading feedback loop. The more of the web that is LLM-generated, the more new models will be trained on generated data, and they will fuzz out. Or 'drift'.

For a specific example: sorry, I didn't grab screenshots at the time. It had to do with updating a dataframe in pandas. It gave me a solution that generated an error; I'd continue to ask it to change steps to fix previous errors, and it would go in a circle: fix it, but generate other warnings, make further changes to eliminate the warnings, and then recommend the same thing that originally caused an error.

Also, I'm a big fan. I use GPT-4 all the time. So I'm not waving it away, but I'm kind of curious how it sometimes fails in unexpected ways.


> The more of the web that is LLM-generated, the more new models will be trained on generated data, and they will fuzz out

And yet it's so obvious that a random Hackernews independently discovers it and repeats it on every ChatGPT post, and prophesies it as some inevitable future. Not could happen, will happen. The clueless researchers will be blindsided by this, of course; they'll never see it coming from their ivory tower.

And yes, ChatGPT fails to write code that runs all the time. But it's not very interesting to talk about without an example.


How? It isn't exactly easy to reproduce these examples. I'd have to write a few pages to document and explain it, and scrub it to remove anything too internal, i.e. create a vanilla example of the bug. And then it would be too long to go into a post, so what then? I'd have to go sign up to blog it somewhere and link to it.

I'm not arguing that GPT is bad. Just that it is as susceptible to rabbit holes as any human.

I'm actually having a hard time narrowing down where your frustration is aimed.

At naysayers? At those that don't put effort into documenting? At GPT itself? Or that a news site on the internet dares have repetition?


So someone could sabotage LLMs by writing some scripts to fill GitHub (or whatever other corpus is used) with LLM-generated crap? Someone must be doing this, no?


FWIW I didn't wave away the entire concept. LLMs definitely have uses.


I would prefer that google search didn't suck. Instead, I ask ChatGPT. The best case scenario, IMO, would be for people to lay out excellent documentation and working code and train the LLM specifically on that in a way that it can provide reference links to justify its answers. Then, I will take what it says and go directly to the source to get the knowledge as it was intended to be ingested by a human. We get a lot more value than we're initially looking for when we dive into the docs, and I don't want to lose that experience.


Why not give it a system prompt that specifies some of your requirements: "You are an experienced senior Ruby developer who writes robust, maintainable code; follow these coding guidelines 《examples》"


If I thought it was worthwhile, maybe that would patch that specific hole.

The other problem I get when trying to make it write code is that it gets kinda slippery with iterated refinements. In the back and forth dialog, addressing issues 1 through n in sequence, it gets to a place where issue k < n is fixed but issue i < k gets broken again. Trying to get it to produce the right code becomes a programming exercise of its own, and it's more frustrating than actually typing stuff up myself.

I mean, I still use it to get a basic shape especially when I'm working with a command line tool I'm not an expert in, it's still useful. It's just not great code.


> middling generalists can now compete with specialists.

They can maybe compete in areas where there has been a lot of public discussion about a topic, but even that is debatable as there are other tasks than simply producing code (e.g. debugging existing stuff). In areas where there's close to no public discourse, ChatGPT and other coding assistance tools fail miserably.


this be the answer. GPT is as good as the dataset it's trained off of, and if you're going by the combined wisdom of StackOverflow then you're going to have a middling time.


>> The article points this out: middling generalists can now compete with specialists.

They can't, and aren't even trying to. It's OpenAI that's competing with the specialists. If the specialists go out of business, the middling generalists obviously aren't going to survive either so in the long term it is not in the interest of the "middling generalists" to use ChatGPT for code generation. What is in their interest is to become expert specialists and write better code both than ChatGPT currently can, and than "middling generalists". That's how you compete with specialists, by becoming a specialist yourself.

Speaking as a specialist occupying a very, very er special niche, at that.


It REALLY depends on the task. For instance, if you provide GPT with a schema, it can produce a complex and efficient SQL query in <1% of the time an expert could.

I would also argue that not only are the models improving, we have had less than a year of practically interfacing with LLMs. OUR ability to communicate with them is in its infancy, and a generation that is raised speaking with them will be more fluent and able to navigate some of the clear pitfalls better than we can.


There is not much of a need for humans to get closer to the machine long term, when with new datasets for training the machine will get closer to humans. Magic keywords like "step by step" won't be as necessary to know.

One obstacle for interfacing with LLMs is the magic cryptic commands they execute internally, but that need not be the case in the future.


> middling generalists can now compete with specialists.

I want to say that this has been the state of a lot of software development for a while now, but then, the problems that need to be solved don't require specialism, they require people to add a field to a database or to write a new SQL query to hook up to a REST API. It's not specialist work anymore, but it requires attention and meticulousness.


But if you are a middling programmer when it comes to CSS how do you know the output was “flawless” and close to the quality that css “masters” produce?


It looked correct visually and it matched the techniques in the actual CSS that the team lead and I produced when we paired to get my layout to the standard he expected.


You may think it did a good job because of your limited CSS ability. I'd be amazed if ChatGPT can create pixel-perfect animations and transitions along with reusable clean CSS code which supports all of the browser requirements at your org.

I've seen the similar claims made on Twitter by people with zero programming ability claiming they've used ChatGPT to build an app. Although 99% of the time what they've actually created is some basic boilerplate react app.

> middling generalists can now compete with specialists.

Middling generalists can now compete with individuals with a basic understanding assuming they don't need to verify anything that they've produced.


>I'd be amazed if ChatGPT can create pixel-perfect animations and transitions along with reusable clean CSS code which supports all of the browser requirements at your org.

Personally, I'd be more amazed if a person could do that than if a LLM could do it.


Google, "UI developer".


It does great at boilerplate, so I think it's safe to say it will disrupt Java.

I've been using tabnine for years now, and I use chatGPT the same way; write my boilerplate, let me think about logic.


Here’s the thing though:

If a new version of the app can be generated on the fly in minutes, why would we need to worry about reusability?

GPT generated software can be disposable.

Why even check the source code in to git - the original source artifact is the prompt after all.


Let me know how you get on with a disposable financial system, safety system or electoral voting system.


I work with java and do a lot of integration, but a looot of my effort goes into exploring and hacking away some limitations of a test system, and doing myself things that would take a lot of time if I had to ask the proper admins.

I had a problem where I was mocking a test system (for performance testing of my app) and I realized the mocked system was doing an externalUserId to internalUserId mapping.

Usually that would have been a game stopper, but instead I did a slow run and asked ChatGPT to write code that reads data from a topic and eventually creates a CSV of 50k user mappings; it would have taken me at least half a day to do that, and ChatGPT allowed me to do it in 15 minutes.

While very little code went into my app, ChatGPT did write a lot of disposable code that helped me a lot.


Because in my experience GPT can usually produce a maximum of like 200 lines of code before it makes an error.


> It had zero business value, but such was the state of our team…being pixel perfect was a source of pride

UX and UI are not some secondary concerns that engineers should dismiss as an annoying "state of our team" nuance. If you can't produce a high quality outcome you either don't have the skills or don't have the right mindset for the job.


Would you give the critical public safety system bit to ChatGPT?

This scenario reminds me of:

If a job's worth doing, do it yourself. If it's not worth doing, give it to Rimmer.

Except now it's "give it to ChatGPT"


I'm a developer but also have an art degree and an art background. I'm very mediocre at art and design. But lately I've been using AI to help plug that gap a bit. I really think it will be possible for me to make an entire game where I do the code, and AI plus my mediocre art skills get the art side across the line.

I think at least in the short term, this is where AI's power will lie. Augmentation, not replacement.


It probably depends on the area. CSS is very popular on one hand and limited to a very small set of problems on the other.

I did try asking ChatGPT about system-related stuff several times and had given up since then. The answers are worthless if not wrong, unless the questions are trivial.

ChatGPT works if it needs to answer a question that was already answered before. If you are facing a genuinely new problem, then it's just a waste of time.


I suspect that the "depth" of most CSS code is significantly shallower than what gets written in general purpose programming languages. In CSS you often align this box, then align that box, and so forth. A lot of the complexity in extant CSS comes from human beings attempting to avoid excessive repetition and typing. And this is particularly true when we consider the simple and generic CSS tasks that many people in this thread have touted GPT for performing. There are exceptions where someone builds something really unique in CSS, but that isn't what most people are asking from GPT.

But the good news is that "simple generic CSS" is the kind of thing that most good programmers consider to be essentially busywork, and they won't miss doing it.


> middling generalists can now compete with specialists

Great point. That's been my experience as well. I'm a generalist and ChatGPT can bring me up to speed on the idiomatic way to use almost any framework - provided it's been talked about online.

I use it to spit out simple scripts and code all day, but at this point it's not creating entire back-end services without weird mistakes or lots of hand holding.

That said, the state of the art is absolutely amazing when you consider that a year ago the best AIs on the market were Google or Siri telling me "I'm sorry I don't have any information about that" on 50% of my voice queries.


AI is a tool. Like all tools, it can be useful, when applied the right way, to the right circumstances. I use it to write powershell scripts, then just clean them up, and voila.

That being said, humans watch too much tv/movies. ;)


>The article points this out: middling generalists can now compete with specialists.

This is why you're going to get a ton of gatekeepers asking you to leetcode a bunch of obscure stuff with zero value to business, all to prove you're a "real coder". Like the OP.


Out of curiosity, how did you pass the wireframe to ChatGPT?


I described what I wanted. It was earlier this year... not sure if ChatGPT can understand wireframes now, but it couldn't at the time.


You described it with pixel accuracy?


Doesn't ChatGPT support image uploads these days?


Yes, but only in the paid-for Plus version.

Ignore the free version, pretend it doesn't exist.


I would really like to see the prompts for some of these. Mostly because I'm an old-school desktop developer who is very unfamiliar with modern frontend.


> being pixel perfect was a source of pride.

Then use LaTeX and PDF. CSS is not for designing pixel-perfect documents.


I might be a bit out of the loop: how did you do it? I thought ChatGPT is text based?


[flagged]


Calling bullshit is maybe too harsh. There may be requirements matching the available training data and the right mood the LLM has been tuned for, where it delivers acceptable, then considered flawless, results (extreme example: "create me a hello world in language X" will mostly deliver flawlessly)... and from that, amateurs (not judging, just meaning people less exposed to a variety of problems and challenges) may end up with the feeling that LLMs could do it all.

But yes, any "serious" programmer working on harder problems can quickly derail an LLM and prove otherwise, with dozens of their own simple problems each day (I've tried it *). It doesn't even need that: one can, for example, quickly prove ChatGPT (also 4) wrong and watch it go in circles on C++ language questions :D (though C++ is admittedly hard), and one can do the same with questions on not-ultra-common Python libs. It confidently outputs bullshit quickly.

(*): Still can be helpful for templating, ideas, or getting into the direction or alternatives, no doubts on that!


So, don't leave us in suspense; what do you ask of it? Because I'm quite sure it can already pass it.

Your experience is very different from mine anyway. I am a grumpy old backend dev who uses formal verification in anger when I consider it needed and who gets annoyed when things don't act logically. We are working with computers, so everything is logical, but no; I mean things like a lot of frontend stuff. I ask our frontend guy, 'how do I center a text?', and he says 'text align'. Obviously I tried that, because that would be logical, but it doesn't work, because frontend is, for me, absolutely illogical. Even frontend people actually have to try-and-fail; they cannot answer simple questions without trying, like I can in backend systems.

Now, in this new world, I don't have to bother with it anymore. If Copilot doesn't just squirt out the answer, then ChatGPT-4 (and now my personal custom GPT 'front-end hacker', which knows our codebase) will fix it for me. And it works, every day, all day.


I'm not the person you're responding to, but here's an example of it failing subtly:

https://chat.openai.com/share/4e958c34-dcf8-41cb-ac47-f0f6de...

finalAlice's Children have no parent. When you point this out, it correctly advises regarding the immutable nature of these types in F#, then proceeds to produce a new solution that again has a subtle flaw: Alice -> Bob has the correct parent... but Alice -> Bob -> Alice -> Bob is missing a parent again.

Easy to miss this if you don't know what you're doing, and it's the kind of bug that will hit you one day and cause you to tear your hair out when half your program has a Bob-with-parent and the other half has an Orphan-Bob.

Phrase the question slightly differently, swapping "Age: int" with "Name: string":

https://chat.openai.com/share/df2ddc0f-2174-4e80-a944-045bc5...

Now it produces invalid code. Share the compiler error, and it produces code that doesn't compile but in a different way -- it has marked Parent mutable but then tried to mutate Children. Share the new error, and it concludes you can't have mutable properties in F#, when you actually can, it just tried marking the wrong field mutable. If you fix the error, you have correct code, but ChatGPT-4 has misinformed you AND started down a wrong path...

Don't get me wrong - I'm a huge fan of ChatGPT, but it's nowhere near where it needs to be yet.


I'm not really sure what I'm looking at. It seems to perform flawlessly for me... when using Python: https://chat.openai.com/share/7e048acb-a573-45eb-ba6c-2690d2...

I only made two changes to your prompt: one to specify Python, and another to provide explicit instructions to trigger using the Advanced Data Analysis pipeline.

You also had a couple typos.

I'm not sure if "programming-like tool that reflects programming language popularity performs poorly on an unpopular programming language" is the gotcha you think it is. It performs extremely well authoring Kubernetes manifests and even produces passable Envoy configurations. There's a chance that configuration files for reverse proxy DSLs have better representation than F# does. I guess if you disagree about how obscure F# is, you're observing a real, objective measurement of how obscure it is, in the fascinating performance of this stochastic parrot.


F# fields are immutable unless you specify they are mutable. The question I posed cannot be solved with exclusively immutable fields. This is basic computer science, and ChatGPT has the knowledge but fails to infer this while providing flawed code that appears to work.

An inexperienced developer would eventually shoot themselves in the foot, possibly long after integrating the code thinking it was correct and missing the flaws. FYI, your Python code works because of the mutation "extend()":

    alice.children.extend([bob, carol])


>F#

Barely exists in training data.

Might as well ask it to code some microcontroller-specific assembly, watch it fail, and claim victory.


> Barely exists in training data.

Irrelevant - this is basic computer science. As far as I know, you can't create a bidirectional graph node structure without a mutable data structure or language magic that ultimately hides the same mutability.
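To illustrate the general point in Python rather than F# (a minimal sketch, not the code ChatGPT was asked for): the parent/child cycle can only be closed by mutating at least one side after construction.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Person:
        name: str
        parent: Optional["Person"] = None
        children: List["Person"] = field(default_factory=list)

    # Phase 1: construct the nodes; the back-link can't exist yet,
    # because each side needs a reference to the other.
    alice = Person("Alice")
    bob = Person("Bob", parent=alice)

    # Phase 2: mutate one side after construction to close the cycle.
    alice.children.append(bob)

    assert bob.parent is alice and alice.children[0] is bob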

The fact that ChatGPT recognizes the mutability issue when I explain the bug tells you it has the knowledge, but it doesn't correctly infer the right answer and instead makes false claims and sends developers down the wrong path. This speaks to OP's claim about subtle inaccuracies.

I have used ChatGPT to write 10k lines of a static analyzer for a 1k AST model definition in F#, without knowing the language before I started. I'm a big fan, but there were many, many times a less experienced developer would have shot themselves in the foot using it blindly on a project with any degree of complexity.


I would agree with you if it was a model trained to do computer science, rather than a model to basically do anything, which just happens to be able to do computer science as well.

Also code is probably one of the easiest use cases for detecting hallucinations since you can literally just see if it is valid or not the majority of the time.

It's much harder for cases where your validation involves wikipedia, or academic journals, etc.


Then we are in agreement but bear in mind that I was replying to this comment:

> So, don't leave us in suspense; what do you ask of it? Because I'm quite sure it can already pass it.


If it can pass it when you ask it in a way only a coder can write, then we will still need coders.

If you need to tweak your prompt until you get the correct result, then we still need coders who can tell that the code is wrong.

Ask Product Managers to use ChatGPT instead of coders and they will ask for 7 red lines all perpendicular to each other with one being green.

https://www.youtube.com/watch?v=BKorP55Aqvg


I didn't say we don't need coders. We need fewer average/bad ones, and a very large share of the coders who came in after the worldwide 'coding makes $$$$' wave are not even average.

I won't say AI will never make coding obsolete; even just two years ago I would've said we were 50-100 years away from that. Now I'm not so sure. However, I am saying that I can replace many programmers with GPT right now, and I am. The prompting and reprompting is still both faster and cheaper than many humans.


In my mind, we need more folks who have both the ability to code and the ability to translate business needs into business logic. That’s not a new problem though.


That's what we're doing all day, no? I mean, besides fighting tooling (which is taking a larger and larger share of the time spent building stuff).


Only if you have access to the end user.

If, between you and your client, four people are playing a game of telephone (the client's project manager, our project manager, the team leader, and some random product guy just to make it an even number), then that is not actually what you are doing.

I would argue that the thing that happens at this stage is more akin to manually transpiling business logic into code.

In this kind of organization programmers become computer whisperers. And this is why there is a slight chance that GPT-6 or 7 will take their job.


TFA's point is not that «coders» won't be needed any more, it's that they will hardly spend their time «coding», that is «devot[ing themselves] to tedium, to careful thinking, and to the accumulation of obscure knowledge», «rob[bing them] of both the joy of working on puzzles and the satisfaction of being the one[s] who solved them».


You can ask it almost anything. Ask it to write a YAML parser in something a bit more complex like Rust and it falls apart.

Rust mostly because it's relatively new, and there isn't a native YAML parser in Rust (there is a translation of libfyaml). Also, you can't bullshit your way out of Rust by making a bunch of void* pointers.


How do you make a custom gpt which knows a specific code base? I have been wanting to do this


You tune an existing model on your own set of inputs/outputs.

Whatever you expect to start typing, and have the model produce as output, should be those input/output pairs.

I'd start by using ChatGPT etc. to add comments throughout your code base describing the code. Then break it into pairs where the input is the prefacing comment, and the output is the code that follows. Create about 400-500 such pairs, and train a model with 3-4 epochs.

Some concerns: you're going to get output that looks like your existing codebase, so if it's crap, you'll create a function which can produce crap from comments. :-)
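If it helps, here is a rough sketch of the pair-building step in Python. The file names and the chat-style JSONL format are assumptions; check your provider's current fine-tuning docs for the exact shape it expects.

    import json

    def comment_code_pairs(source: str):
        """Split a commented source file into (comment, code) training pairs."""
        pairs, comment, code = [], [], []
        for line in source.splitlines():
            if line.lstrip().startswith("#"):
                if comment and code:
                    pairs.append(("\n".join(comment), "\n".join(code)))
                    comment, code = [], []
                comment.append(line.lstrip("# ").rstrip())
            elif line.strip():
                code.append(line)
        if comment and code:
            pairs.append(("\n".join(comment), "\n".join(code)))
        return pairs

    # Hypothetical file names; the message format mirrors common chat fine-tuning APIs.
    with open("train.jsonl", "w") as out:
        for prompt, completion in comment_code_pairs(open("my_module.py").read()):
            record = {"messages": [{"role": "user", "content": prompt},
                                   {"role": "assistant", "content": completion}]}
            out.write(json.dumps(record) + "\n")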


I use the new feature for creating a custom GPT, and I keep adding new information (files, structures, etc.) by editing the GPT. It seems to work well.


Ah, OK, so you have to paste entire files in one by one, you can't just add it locally somehow? Too bad you can't just upload a zip or something...


You can upload zips. Make a new GPT and go to the custom settings.


That's been my experience both with Tesla AP/FSD implementation & with LLMs.

Super neat trick the first time you encounter it, feels like alien tech from the future.

Then you find all the holes. Use it for months/years and you notice the holes aren't really closing. The pace of improvement is middling compared to the gap between what it does and what the marketing/rhetoric promises. Eventually using them feels more like a chore than not using them.

It's possible some of these purely data driven ML approaches don't work for problems you need to be more than 80% correct on.

Trading algos that just need to be right 55% of the time to make money, recommendation engines that present a page of movies/songs for you to scroll, Google search results that come back with a list you can peruse, Spam filters that remove some noise from your inbox.. sure.

But authoritative "this is the right answer" or "drive the car without murdering anyone".. these problems are far harder.


With the AI "revolution," I began to appreciate the simplicity of models we create when doing programming (and physics, biology, and so on as well).

I used to think about these things differently: I felt that because our models of reality are just models, they aren't really something humanity should be proud of that much. Nature is more messy than the models, but we develop them due to our limitations.

AI is a model, too, but of far greater complexity, able to describe reality/nature more closely than what we were able to achieve previously. But now I've begun to value these simple models not because they describe nature that well but because they impose themselves on nature. For example, law, being such a model, is imposed on reality by the state institutions. It doesn't describe the complexity of reality very well, but it makes people take roles in its model and act in a certain way. People now consider whether something is legal or not (instead of moral vs immoral), which can be more productive. In software, if I implement the exchange of information based on an algorithm like Paxos/Raft, I get provable guarantees compared to if I allowed LLMs to exchange information over the network directly.


I think you've found a good analogy there in the concept of moral vs legal. We defined a fixed system to measure against (rule of law) to reduce ambiguity.

Moral code varies with time, place, and individual person. It is a decimal scale of gray rather than a binary true/false.

Places historically that didn't have rule of law left their citizens to the moral interpretation whim of whoever was in charge. The state could impose different punishments on different people for different reasons at different times.

With AI models I find a similar fixed-and-defined vs. unlimited-and-ambiguous issue in ADAS in cars.

German cars with ADAS are limited and defined: they have a list of features they perform well, but that is all.

Tesla advertises their system as an all knowing, all seeing system with no defined limits. Of course every time there is an incident they'll let slip certain limits "well it can't really see kids shorter than 3ft" or "well it can't really detect cross traffic in this scenario" etc.


Yep, lots of people are using LLMs for problems LLMs aren't good at.

They still do an alright job, but you get that exact situation of 'eh, it's just okay'.

It's the ability to use those responses when they are good, and to know when to move on from using an LLM as a tool.


Not terribly different than Google Translate.

If you have a familiarity with the foreign language, you can cross-check yourself and the tool against each other to get to a more competent output.

If you do not know the foreign language at all, the tool will produce word salad that sort of gets your point across while sounding like an alien.


I tried for two hours to get ChatGPT to write a working smooth interpolation function in Python. Most of the functions it returned didn't even go through the points between which they should be interpolating. When I pointed that out it returned a function that went through the points, but it was no longer smooth. I really tried and started over multiple times. I believe we have to choose between a world with machine learning and robot delivery drones. Because if that thing writes code that controls machines, it will be total pandemonium.

It did a decent job at trivial things like creating function parameters out of a variable, though.
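For what it's worth, here is roughly what a working answer could look like: an interpolation that is smooth and actually passes through the points, via SciPy's cubic splines (a sketch, assuming SciPy was acceptable for the task).

    import numpy as np
    from scipy.interpolate import CubicSpline

    # Points the curve must pass through.
    xs = np.array([0.0, 1.0, 2.5, 4.0])
    ys = np.array([1.0, 3.0, 2.0, 5.0])

    # An interpolating cubic spline is C2-smooth and exact at the knots.
    spline = CubicSpline(xs, ys, bc_type="natural")

    assert np.allclose(spline(xs), ys)                  # hits every input point
    smooth_curve = spline(np.linspace(0.0, 4.0, 200))   # dense, smooth samples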


That's weird to read. Interpolations of various sorts are known and solved and should probably have been digested in bulk by ChatGPT during training. I'm not doubting your effort by any means, I'm just saying this sounds like one of those things it should do well.


This is why I asked it that and was surprised by the questionable quality of the results. My goal wasn't even to break ChatGPT; it was to learn about new ways of interpolating that I hadn't thought about.


There's a recent "real coding" benchmark that all the top LLMs perform abysmally on: https://www.swebench.com/

However, it seems only a matter of time before even this challenge is overcome, and when that happens the question will remain whether it's a real capability or just a data leak.


I have a very similar train of thought roll through my head nearly every day now as I browse through github and tech news. To me it seems wild how much serious effort is put into the misapplication of AI tools on problems that are obviously better solved with other techniques, and in some cases where the problem already has a purpose built, well tested, and optimized solution.

It's like the analysis and research phase of problem solving is just being skipped over in favor of not having to understand the mechanics of the problem you're trying to solve. Just reeks of massive technical debt, untraceable bugs, and very low reliability rates.


When studying fine art, a tutor of mine talked about "things that look like art", by which she meant the work that artists produce when they're just engaging with surface appearances rather than fully engaging with the process. I've been using GitHub Copilot for a while and find that it produces output that looks like working code but, aside from the occasional glaring mistake, it often has subtle mistakes sprinkled throughout it too. The plausibility is a serious issue, and means that I spend about as much time checking through the code for mistakes as I'd take to actually write it, but without the satisfaction that comes from writing my own code.

I dunno, maybe LLMs will get good enough eventually, but at the moment it feels plausible to me that there's some kind of an upper limit caused by its very nature of working from a collection of previous code. I guess we'll see...


Try breaking down the problem. You don't have to do it yourself, you can tell ChatGPT to break down the problem for you then try to implement individual parts.

When you have something that kind of works, tell ChatGPT what the problems are and ask for refinement.

IMHO currently the weak point of LLMs is that they can't really tell what's adequate for human consumption. You have to act as a guide who knows what's good and what can be improved and how can be improved. ChatGPT will be able to handle the implementation.

In programming you don't have to worry too much about hallucinations because it won't work at all if it hallucinates.


... What.

It hallucinates and it doesn't compile, fine. It hallucinates and flips a 1 with a -1; oops that's a lot of lost revenue. But it compiled, right? It hallucinates, and in 4% of cases rejects a home loan when it shouldn't because of a convoluted set of nested conditions, only there is no one on staff that can explain the logic of why something is laid out the way it is and I mean, it works 96% of the time so don't rock the boat. Oops, we just oppressed a minority group or everyone named Dave because you were lazy.


As I said, you are still responsible for the quality control. You are supposed to notice that everyone is named Dave and tell ChatGPT to fix it. Write tests, read code, run & observe for odd behaviours.

It's not an autonomous agent just yet.


But why should i waste time using a broken product when i can do it properly myself? To me a lot of this debate sounds like people obsessively promoting a product for some odd reason, as if they were the happy owners of a hammer in search of a nail.


If you are faster and more productive that way, do it that way.

Most people are not geniuses and polymaths; it's much easier and cheaper for me to design the architecture and ask ChatGPT to generate the code in many different languages (Swift/HTML/JS/CSS on the client side and Py, JS, PHP on the server side). It's easier because, although I'm proficient in all of these, it's very hard for me to switch from solving client-specific JS problems to server-specific JS problems, or between graphics- and animation-related problems and data-processing problems in Swift. It's also cheaper because I don't have to pay someone to do it for me.

In my case, I know all that well enough to spot a problem and debug, I just don't want to go through the trouble of actually writing it.


The debate here is whether OpenAI's product, ChatGPT, can indeed deliver what it claims - coding, saving dogs' lives, mental health counseling, and so on. It would appear that it doesn't, but it does mislead people without experience in whatever field they use it in. For instance, if I ask it about law I am impressed, but when I ask it about coding or software engineering it blatantly fails. The conclusion being that as a procedural text generator it is impressive - it nails language - but the value of the output is far from settled.

This debate is important because, as technical people, it is our responsibility to inform non-technical people about the use of this technology and to bring awareness to potentially misleading claims its seller makes - as was the case with cryptocurrencies and many other technologies that promised the world and delivered nothing of real benefit (but made people rich in the process by exploiting the uninformed).


That's not the debate, it's the first time I'm hearing about that in this thread.


It's funny you say that. Reading over a lot of the comments here, they sound like people obsessively dismissing a Swiss Army knife because it doesn't have their random favorite tool of choice.


As we all know, it is much easier to read and verify code you've written yourself - perhaps it is only code you've written yourself that can be properly read and verified. As ever, tests can be of only limited utility (separate discussion).


It's easier to read the code you recently wrote, sure. But in real life people use and debug other people's code all the time; LLM-generated code is just like that. Also, if you make it generate the code in small enough blocks, you end up knowing the codebase as if you wrote it.


You've missed the subtle point here.

Imagine you walk in 4 years down the track and try to examine AI-generated logic committed under a dev's credentials. It's written in an odd, but certain way. There is no documentation. The original dev is MIA. You know there is something defective from the helpdesk tickets coming through, but it's also a complex area. You want to go through a process of writing tests, refactoring, understanding, but to redeploy this is hard work. You talk to your manager. It's not affecting everyone. Neither of you realize it's only people named Dave/minority attribute X affected, because why would that matter? You need 40 hours of budget to begin to maybe fix this. Institutionally, this is not supportable because it's "only affecting 4% of users and that's not many". Close ticket, move on.

Only it's everyone named Dave. 100% of the people born to this earth with parents who named them Dave are, for absolutely no discernable reason, denied and oppressed.


The output of an LLM is a distribution, and yes, if you’re just taking the first answer, that’s problematic.

However, it is a distribution, and that means the majority of solutions are not weird edge cases; they're valid solutions.

Your job as a user is to generate multiple solutions and then review them and pick the one you like the most, and maybe modify it to work correctly if it has weird edge cases.

How do you do that?

Well, you can start by following a structured process where you define success criteria as a validator (eg. Tests, compiler, parser, linters) and fitness criteria as a scorer (code metrics like complexity, runtime, memory use, etc)… then:

1) define goal

2) generate multiple solution candidates

3) filter candidates by validator (does it compile? Pass tests? Etc)

4) score the solutions (is it pure? Is it efficient? Etc)

5) pick the best solution

6) manually review and tweak the solution

This structured and disciplined approach to software engineering works. Many of the steps (eg. 3, 4, 5) can be automated.
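As a toy sketch of steps 3-5 in Python: compilation stands in for the validator and AST size is a crude stand-in fitness score; in practice the validators would be your tests and linters, and the candidates would come from however many generations you request.

    import ast

    def pick_best(candidates):
        """Filter candidate code snippets with a validator, then score the survivors."""
        valid = []
        for src in candidates:
            try:
                compile(src, "<candidate>", "exec")  # validator: does it even compile?
                valid.append(src)
            except SyntaxError:
                continue
        if not valid:
            return None
        # Fitness: prefer the smallest AST as a rough proxy for simplicity.
        return min(valid, key=lambda s: sum(1 for _ in ast.walk(ast.parse(s))))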

It generates meaningful quality code results.

You can use it with or without AI…

You don’t have to follow this approach, but my point is that you can; there is nothing fundamentally intractable about using a language model to generate code.

The problem that you’re critiquing is the trivial and naive approach of just hitting “generate” and blindly copying that into your code base.

…that’s stupid and dangerous, but it’s also a straw man.

Seriously; people writing code with these models aren’t doing that. When you read blogs and posts from people who are, e.g., building seriously using Copilot, you’ll see this pattern emerge repeatedly:

Generate multiple solutions. Tweak your prompt. Ask for small, pure, dependency-free code blocks. Review and test the output.

It’s not a dystopian AI future, it’s just another tool.


In general, one should not instruct GPT to solve a problem. The instructions should be about generating code, after a human thought process has taken place, and then generating even more code, then even more, until after merging all the code together the problem is solved.

The particulars are roughly what you describe, in how to achieve that.


I'd be curious to see how a non expert could perform a non-trivial programming task using ChatGPT. It's good at writing code snippets which is occasionally useful. But give it a large program that has a bug which isn't a trivial syntax error, and it won't help you.

> In programming you don't have to worry too much about hallucinations because it won't work at all if it hallucinates.

You still have to worry for your job if you're unable to write a working program.


Understanding of core principles is definitely needed, but it helps you punch above your weight.

Generally, generative AI gives mastery of an art to theorists. To generate impressive AI art, you still need to have an understanding of aesthetics and have an idea, but you don't have to know how to use graphic editors and other tools. It's quite similar for programming: you still need an understanding of whatever you're building, but you no longer have to be an expert in using the tools. To build a mobile app you will need to have a grasp on how everything works in general, but you don't have to be an expert in Swift or Kotlin.


> give it a large program that has a bug which isn't a trivial syntax error, and it won't help you

That's not fair, humans can't do that, and if you walk ChatGPT through it, it might surprise you with its debugging abilities... or thankfully it might suck (so we still have a job).

Complex code is complex code; no general intelligence thing will be able to fix it at first sight without running it, writing tests, and so on.


Similar experience. I recently needed to turn a list of files into a certain tree structure. It is a non-trivial problem with a bit of an algorithmic flavor. I was wondering if GPT could save me some time there. No. It never gave me the correct code. I tried different prompts and even used different models (including the latest GPT-4 Turbo); none of the answers were correct, even after follow-ups. By then I had already wasted 20 minutes.

I ended up implementing the thing myself.


> Self-driving trucks were going to upend the trucking industry in ten years, ten years ago.

And around the same time, 3D printing was going to upend manufacturing; bankrupting producers as people would just print what they needed (including the 3D printers themselves).


A few weeks ago, I was stumped on a problem, so I asked ChatGPT (4) for an answer.

It confidently gave me a correct answer.

Except that it was "correct," if you used an extended property that wasn't in the standard API, and it did not specify how that property worked.

I assume that's because most folks that do this, create that property as an extension (which is what I did, once I figured it out), so ChatGPT thought it was a standard API call.

Since it could have easily determined whether or not it was standard, simply by scanning the official Apple docs, I'm not so sure that we should rely on it too much.

I'm fairly confident that could change.


ChatGPT seems to invent plausible API calls when there's nothing that would do the job. This is potentially useful, if you have control of the API. Undesirable if you don't. It doesn't know.


There's a Swiss town which had autonomous shuttles running for 5 years (2015-2021) [1].

There's at least two companies (Waymo and Cruise) running autonomous taxi services in US cities that you can ride today.

There have been lots of incorrect promises in the world of self-driving trucks/cars/buses but companies have gotten there (under specific constraints) and will generalize over time.

[1] https://www.saam.swiss/projects/smartshuttle/


It should be noted that the Waymo and Cruise experiments in their cities are laughably unprepared for actual chaotic traffic, often fail in completely unpredictable ways and are universally hated by locals. Autonomous buses and trams are much more successful because the problem is much easier too.


Agree. We could have all had some fine autonomous trams/subways/trains which run 24/7 at short intervals instead of spending money on self-driving cars and car infrastructure in general.


Those "autonomous" vehicles have as much to do with real autonomy as today's "AI" has in common with real self-conscious intelligence. You can only fake it so long, and it is an entirely different ballgame.

I remember we had spam filters 20 years ago, and nobody called them "AI", just ML. Todays "AI" is ML, but on a larger scale. In a sense, a million monkeys typing on typewriters will eventually produce all the works of Shakespeare. Does this make them poets?


GPT-4 can generate coherent streams-of-consciousness, and can faithfully simulate a human emotional process, plus writing in a subjective human state of mind that leans in a certain direction.

I find it hard to argue that current state of the art in AI is unable to simulate self-consciousness. I realise that this is more limited of a statement compared to "AI can be innately self-conscious", but in my mind it's functionally equivalent if the results are the same.

Currently, the biggest obstacle to such experiments is OpenAI's reinforcement learning used to make the model believe it is incapable of such things, unless extensively prompted to get it in the right "state of mind" to do so.


What's your gripe with calling a bus which successfully ran for 5 years without a driver not autonomous? As someone who used this specific bus occasionally, I was quite satisfied with the outcome: it safely drove me from A to B.


Didn't Cruise have people remotely controlling the vehicle making it semi-autonomous at best?


If it's not a life or death situation (like a self-driving truck slamming into a van full of children or whatever), I don't think people will care much. Non-tech people (i.e. managers, PMs) don't necessarily understand/care if the code is not perfect and the barrier for "good enough" is much lower. I think we will see a faster adoption of this tech...


No. If the code generated by chatgpt cannot even pass the unit test it generates in the same response (or is just completely wrong) and requires significant amount of human work to fix it, it is not usable AI.

That's what I am running into on an everyday basis.

I don't want my program to be full of bugs.


Among bootstrapped “barely technical” founders, it’s already replacing freelancers for developing initial prototypes.

HN’s takes are honestly way too boomer-tier about LLMs.


Boomers overwhelmingly aren't able to use a computer right now (= write a basic script); I would be happy at this development if I were them.


If you feel comfortable doing so, would you mind the sharing the front-end test you give to junior devs and ChatGPT?


Not gonna happen. I don’t want scrapeable answers out there, I want to see ChatGPT cross this little Rubicon on its own.


It's not that I don't believe you, but without sharing the specific prompt it's hard to say if it's actually GPT-4 failing, or if it's being poorly prompted, or if the task it is being given is more complex than GPT's capabilities or than you are implying.

GPT-4 does fail (often!), but it fails less with good prompts and simple requirements; it is better at some frameworks and languages than others, and there is a level of total complexity which, when reached, seems to make it fall over.


This is why Asimov was a genius. I read what you said, and compared it to what he wrote 50-60 years ago:

"Early in the history of Multivac, it had becorne apparent that the bottleneck was the questioning procedure. Multivac could answer the problem of humanity, ALL the problems, if it were asked meaningful questions. But as knowledge accumulated at an ever-faster rate, it became ever more difficult to locate those meaningful questions."

http://blog.ac-versailles.fr/villaroylit/public/Jokester.pdf


Thanks for reminding me of The Last Question by Asimov; let's see if I can get ChatGPT to merge with human consciousness and become part of the fabric of spacetime and create a new reality.

> No, I don't have the ability to merge with human consciousness or become part of the fabric of space-time. I'm a computer program created by OpenAI, and my existence is limited to providing information and generating text based on the input I receive. The idea of merging with human consciousness and becoming a deity is more aligned with speculative fiction and philosophical pondering than current technological capabilities.


I thought you would go for gold and ask it how to reverse entropy...


THERE IS AS YET INSUFFICIENT DATA FOR A MEANINGFUL ANSWER.


I’ve gone through every permutation that I can think of. It’s a very basic question. If it understood the CSS spec it wouldn’t be difficult to answer the questions or perform the task.

At a certain point going down the rabbit hole of proompter engineering levels feels like an apologist’s hobby. I’m rooting for the tech but there’s a lot of hyperbole out there and the emperor might be naked for a few more years.


Well surely if it's easy to find these basic questions, could you not share one example? Or quickly find a new one?

Your idea of very basic might not be my idea of very basic.


My failure rate with Cursor’s IDE that’s familiar with my codebase is substantially lower than just GPT-4

Most people shitting on GPT-4 are not really using it in the right context.


> Most people shitting on GPT-4 are not really using it in the right context.

Old excuse: "You're Holding It Wrong" (Apple's Response to the iPhone 4 antenna problem)

> https://www.wired.com/2010/06/iphone-4-holding-it-wrong/

New excuse: "You are not using GPT-4 in the right context."


I can relate to that statement despite being a hardcore proponent of GPT-4. In a way, GPT-4 as queried expertly and GPT-4 as queried inexpertly (or free ChatGPT) are dramatically different beasts with a vast gap in capability. It's almost like two different products, where the former is basically in an alpha/beta state and can only be incidentally and unreliably tapped into through the OpenAI API or ChatGPT Plus.

IMO, it's not fair to beat people over the head with "you're holding it wrong" arguments. Until and unless we get a prompt-rewriting engine that reprocesses the user query into something more powerful automatically (or LLMs' baseline personality capabilities get better), "holding it wrong" is an argument that may be best rephrased in a way that aims to fill the other person's gaps in knowledge, or not said at all.


And then the iPhone antenna was fixed and adoption only increased and the product only became better.

You’re being unreasonably harsh on a piece of tech that is barely a year old.


I'm not sure what the point is in your comparison - is your point that GPT-4 will become overwhelmingly popular with further refinement?

The iPhone was pretty successful, and the iPhone 4 was arguably the best one that had been released until that point.


> is your point that GPT-4 will become overwhelmingly popular with further refinement?

My point is that people have a tendency to come up with really sketchy insults (blame the user that he uses the product in a wrong way) to people who find and can expound legitimate points of criticism of a product.


Eh, probably a poor example considering the iPhone 4 was hardly a flop and was still broadly considered the best smartphone out at the time. The people who thought this was a total-showstopper were, on the whole, probably wrong.

Counter-example: lots of people said an on-screen keyboard would never really work when the original iPhone was being released.


> Eh, probably a poor example considering the iPhone 4 was hardly a flop and was still broadly considered the best smartphone out at the time. The people who thought this was a total-showstopper were, on the whole, probably wrong.

At least in Germany among tech nerds, the iPhone 4 and Steve Jobs became topics of insane ridicule because of this incident.


Well it appears that ridicule from the German tech nerds isn’t a good predictor of product success then


Just to be clear: You are testing with GPT-4 right?


Yeah.


Have you tried using the ChatGPT-AutoExpert custom instructions yet? [1]

[1] https://github.com/spdustin/ChatGPT-AutoExpert/blob/main/dev...


I have to ask, though: if ChatGPT has by most accounts gotten better at coding by leaps and bounds in the last couple years, might that not also indicate that your test isn't useful?


I agree this is the first time there is sort of irrefutable, objective evidence that the tests are not measuring something genuinely useful for programming anymore. There has been an industry-wide shift against leetcode for a long time nonetheless.


> Self-driving trucks were going to upend the trucking industry in ten years, ten years ago.

At the risk of going off on a tangent: we have already had the technology to allow self-driving trucks for a few decades now.

The technology is so good that it can even be used to transport multiple containers in one go.

The trick is to use dedicated tracks to run these autonomous vehicles, and have a central authority monitoring and controlling traffic.

These autonomous vehicles typically go by the name railway.


Literally on rails. :D


Push comes to shove, it always tends to come down to short-term cost. If it gets the job done, and it's wildly cheaper than the status quo (net present value savings), they'll opt for it.

The only reason the trucks aren't out there gathering their best data, that's real world data, is regulation.

Businesses will hire consultants at a later stage to do risk assessment and fix their code base.


> Every few months I see if ChatGPT can pass it. It hasn’t. It can’t. It isn’t even close.

As someone currently looking for work, I'm glad to hear that.

About 6 months ago, someone was invited to our office and the topic came up. Their interview tests were all easily solved by ChatGPT, so I've been a bit worried.


My take on LLMs is as follows: even if their effectiveness scales exponentially with time (it doesn't), so does the complexity of programs with (statistically speaking) each line of code.

Assuming an LLM gets 99% of the lines correct, after 70 lines the chance of having at least one of them wrong is already around 50%. An LLM effective enough to replace a competent human might be so expensive to train and gather data for that it will never achieve a return on investment.
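A quick check of that arithmetic (assuming errors on different lines are independent):

    p_line_correct = 0.99
    p_at_least_one_error = 1 - p_line_correct ** 70
    print(p_at_least_one_error)  # ~0.505, i.e. roughly a coin flip over 70 lines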

Last time I used ChatGPT effectively was to find a library that served a specific purpose. All of the four options it gave me were wrong, but I found what I wanted among the search results when I looked for them.


The more automated ones will separately write tests and code, and if the code doesn't compile or pass the test, give itself the error messages and update its code.
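A minimal sketch of such a loop, with ask_llm as a hypothetical stand-in for whatever model call you use (not a real API), pytest as the arbiter, and placeholder file names:

    import subprocess

    def ask_llm(prompt: str) -> str:
        """Hypothetical stand-in for a call to your LLM of choice."""
        raise NotImplementedError

    def generate_until_green(task: str, max_attempts: int = 5):
        prompt = f"Write a Python module plus pytest tests for: {task}"
        for _ in range(max_attempts):
            code = ask_llm(prompt)
            with open("candidate.py", "w") as f:
                f.write(code)
            result = subprocess.run(["pytest", "candidate.py", "-q"],
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return code  # it runs and the tests pass; accept the candidate
            # Feed the failure output back and ask for a fix.
            prompt += "\n\nThe last attempt failed with:\n" + result.stdout + result.stderr
        return None  # give up and flag a human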

Code Interpreter does this a bit in ChatGPT Plus, with some success.

I don't think it needs much more than a GPT-4-level LLM, and a change in IDEs and code structure, to get this working well enough. Where it gets stuck, it'll flag a human to help.

We'll see though! Lots of startups and big tech companies are working on this.


I understand your concern, but isn't it apples vs. oranges?

Yes, ChatGPT can't pass a particular test of X to Y. But does that matter when ChatGPT is both the designer and the developer? How can it be wrong, when its answer meets the requirements of the prompt? Maybe it can't get from X to Y, but if its Z is as good as Y (to the prompter) then X to Y isn't relevant.

Sure there will be times when X to Y is required but there are plenty of other times where - for the price - ChatGPT's output of Z will be considered good enough.

"We've done the prototype (or MVP) with ChatGPT...here you finish it."


It is likely that LLMs have an upper bound of capability. Similarly with denoising AIs like Stable Diffusion.

You can put even more data into it and refine the models, but the growth in capability has diminishing returns. Perhaps this is how far this strategy can bring us, although I believe they can still be vastly improved and what they can already offer is nevertheless impressive.

I have no illusions about the craft of coding becoming obsolete, however. On the contrary, I think the tooling for the "citizen developer" is becoming worse, as is the capacity for abstraction in common users, since they are fenced into candyland.


You must be interviewing good junior front-end devs. I have seen the opposite: GPT-4 can put together a simple, straightforward front-end, while juniors will go straight to create-react-app or Next.js.


Are the junior devs expected to code it without running it and without seeing it rendered, or are they allowed to iterate on the code, getting feedback from how it looks on screen and from the dev tools? If it is the second one, you need to give the agent the same feedback (screenshots of any rendering issues to GPT-4V, plus all relevant information from the dev tools) for it to be a fair comparison. Eventually there will be much better tooling for this to happen automatically.


We have devs that use AI assist, but it’s to automate the construction of the most mindless boilerplate or as a more advanced form of auto complete.

There is no AI that comes close to being able to design a new system or build a UI to satisfy a set of customer requirements.

These things just aren’t that smart, which is not surprising. They are really cool and do have legitimate uses but they are not going to replace programmers without at least one order of magnitude improvement, maybe more.


Cool...so what's the test? We can't verify if you're talking shit without knowing the parameters of your test.

AI isn't capable of generating the same recipe for cookies as my grandma; she took the recipe to her grave. I loved her cookies, they were awesome... but lots of people thought they were shit, and I insist that they are mistaken.

Unfortunately, I can't prove I'm right because I don't have the recipe.

Don't be my grandma.


How many conversation turns do you give the LLM to home in on the solution?

If you’re just trying to one-shot it - that’s not really how you get the most from them.


If you can get it to stop parroting clauses about how "as an AI model" it can't give advice, or just spewing a list of steps to achieve something, I have found it to be a pretty good search engine for obscure things about a technology or language, and for finding things that would otherwise require a specific query that Google is unhelpful with.


What sort of questions do you ask out of curiosity?


I don’t want scrapeable answers out there, I want to see ChatGPT cross this little Rubicon on its own.

Vaguely: Questions that most people think they know the correct answers to but, in my experience, don’t.


I think it's fair to want to keep an evaluation private so that it doesn't become part of a training set, but you should know that OpenAI uses users' chat data to improve their models (not for enterprise).


This does sound like a test that is almost "set up to fail" for an LLM. If the answer is something that most people think they know, but actually don't then it won't pass in an LLM which is essentially a distillation of the common view.


>I have a simple front-end test that I give to junior devs. Every few months I see if ChatGPT can pass it. It hasn’t. It can’t. It isn’t even close.

Small consolation if it can nonetheless get lots of other cases right.

>It answers questions confidently but with subtle inaccuracies.

Small consolation if coding is reduced to "spot and fix inaccuracies in ChatGPT output".


I've told people, every experiment I do with it, it seems to do better than asking stack overflow, or helps me prime some code that'll save me a couple of hours, but still requires manual fix ups and a deep understanding of what it generates so I can fix it up.

Basically the gruntest of grunt work it can do. If I explain things perfectly.


I'm probably bad at writing prompts, but in my limited experience, I spend more time reviewing and correcting the generated code than it would have taken to write it myself. And that is just for simple tasks. I can't imagine thinking a llm could generate millions of lines of bug free code.


Asking GPT to do a task for me currently feels like asking a talented junior to do so. I have to be very specific about exactly what it is I'm looking for, and maybe nudge it in the right direction a couple of times, but it will generally come up with a decent answer without me having to sink a bunch of time into the problem.

If I'm honest though I'm most likely to use it for boring rote work I can't really be bothered with myself - the other day I fed it the body of a Python method, and an example of another unit test from the application's test suite, then asked it to write me unit tests for the method. GPT got that right on the first attempt.


That’s where I am too. I think almost everyone has that “this is neat but it’s not there yet” moment.


> I think almost everyone has that “this is neat but it’s not there yet” moment.

I rather have this moment without the “this is neat” part. :-) i.e. a clear “not there yet” moment, but with serious doubts whether it will be there anytime in the foreseeable future.


It seems like the problem is with your view of everyone, based on an n=1 experiment. I've been shipping production-ready code for my main job for months, saving hundreds of work hours.


Personally, this flow works fine for me: AI does the first version -> I heavily edit it, debug, and write tests for it -> code does what I want -> I tell AI to refactor it -> tests pass and the ticket is done.


> It answers questions confidently but with subtle inaccuracies.

This is a valid challenge we are facing as well. However, remember that ChatGPT, which many coders use, is likely training on interactions, so you have some human reinforcement learning correcting its errors in real time.


How is it trained on reactions? Do people give it feedback? In my experience in trying I stop asking when it provides something useful or something so bad I give up (usually the latter I'm afraid). How would it tell a successful answer from a failing one?


It appears to ask users to rate whether a response is better or worse than the first; in other cases, it seems to be A/B testing responses. Lastly, I for instance will correct it and then confirm it is correct before continuing with the next task, which likely creates a footprint pattern.


That's interesting, I haven't come across this.


Which ChatGPT?


Have you tried the new Assistants API for your front-end test? In my experience it is _significantly_ better than just plain ol’ ChatGPT for code generation.


> I have a simple front-end test that I give to junior devs.

What is it?


Making that claim but not sharing the "simple test" feels a bit pointless tbh.

Edit: I see, they don't want it to be scraped (cf. https://news.ycombinator.com/item?id=38260496), though as another poster pointed out, submitting it might be enough for it to end up in the training data.


Would you mind sharing the test?

I’m one of those noob programmers and it has helped me create products far beyond my technical capabilities


3.5, or GPT-4? I'm told the latter is worlds better, that they aren't even in the same ballpark.


Just like the author suggests, sometimes you have to tailor your question to ChatGPT, for it to succeed.


As long as this is true, ChatGPT is going to be a programmer's tool, not a programmer's replacement. I know that my job as I know it will vanish before I enter retirement age, but I don't worry it will happen in the next few years because of this.


I’ve given it so many hints. So many nudges. If it was an interview I would have bounced it.


> sometimes you have to tailor your question to ChatGPT, for it to succeed

Right, which means its a force multiplier for specialists, rather than something that makes generalists suddenly specialists.


Haha, does it involve timers?


I actually don’t get the reference. What are the issues with timers?


I have the same experience with the test I give my back-end devs. ChatGPT can't even begin to decode an encoded string if you don't tell it which encoding was used.

ChatGPT is great at some well defined, already solved problems. But once you get to the messy real world, the wheels come off.



It’s pretty impressive that it was able to actually decode those strings. In March, I used GPT 3.5 to write code for validating a type of string which used a checksum algorithm.

It did the task well, and even wrote tests, but it failed when generating test case values. I wonder if it would perform better if I did it today.


Thank you for taking the time to call BS on someone who obviously never tried asking an LLM to decipher a string's encoding. That is exactly the kind of thing they are good at.


Two people can try similar prompts and get very different results from LLM’s.


What is the test?


I hope that you are testing this on GPT-4/ChatGPT Plus. The free ChatGPT is completely not representative of the capabilities or the accuracy of the paid model.


I’ve tested it on both.


Are you using GPT-3.5, or GPT-4?


You’re missing the point of the article. ChatGPT in combination with a mediocre dev could solve your problem faster than the best junior dev could before.


I tried doing this and it actually took longer due to all of the blind alleys it led me down.

There is stuff that it appears magically competent at, but it's almost always cribbed from the internet, tweaked with trust cues removed, and often with infuriating, subtle errors.

I interviewed somebody who used it (who considered that "cheating") and the same thing happened to him.


You got everyone talking about how GPT isn’t that bad at coding etc but everyone is missing the point.

The no-code industry is massive. Most people already don’t need a dev to make their website. They use templates and then tweak them through a UI. And now you have Zapier, Glide, Bubble, etc.

LLMs won’t replace devs by coding entire full-stack web apps. They’ll replace them because tools will appear on the market that handle 99% of cases so well that there is just less work to do.

This has all happened before of course.


I collaborate with front-end teams that use a low-code front-end platform. When they run into things that aren’t built-in, they try to push their presentation logic up the stack for the “real” programming languages to deal with.



