Hacker News | runarberg's comments

It was never a good word anyway. Infinitely better than Artificial intelligence (at least machine learning has machine and learning) but still bad.

I favor a lexicon which is more specific, like Markov Chains, Supervised Learning, etc.

In my view LLMs can keep the AI label exclusively (a bad technology deserves a bad name) and machine learning can walk slowly into the sunshine never to be seen again.


I used to hear things like “if cigarettes/alcohol were invented now, they would never allow it”, indicating that consumer protection used to be a thing, as recently as 10-20 years ago. Now when AI hit the market it was obvious how bad and dangerous it was, yet governments (even the supposedly good ones in Europe which still [pretend to] do consumer protection) did nothing to protect their citizens from the harms AI was causing.

If we still did (or ever did) consumer protection like that cigarette/alcohol myth above indicates, then the makers of that tool would indeed be responsible when their product does dangerous things.


That was what I was thinking when the conversation went to Derek of Veritasium. This poem was centered around humanity and our shared experiences as humans. Derek is consistently obsessed with technology, and will center every conversation around how technology will enhance the human experience (by which he probably means capitalism to be honest).

Taking the conversation to Derek of Veritasium feels like after having watched Koyaanisqatsi your mind goes to James Burke and how the invention of the plow has improved how we experience human society.


If that is what you think of Derek, then you really don't understand Derek.

The video that I linked to is over an hour on why new technologies never transform education. He has a number of videos that critique what capitalism has led big industries to. For example he has one on Monsanto's war on farming, another on how forever chemicals are poisoning us, and a third on how short-sightedness on protecting the health of rubber trees could be an existential threat to civilization.

Your model of him says that he should have done none of those things. The fact that he did is strong evidence that you've got a cardboard cutout that you're using as a strawman. Because it's a convenient punching bag. And not because it matches a real human very well.


James Burke would also criticize technology. But compared to Koyaanisqatsi James Burke’s critiques feel very tame indeed. While James Burke would critique bad implementations of technology, or point at a place where technology was detrimental, Koyaanisqatsi would say: “humanity has lost its way in pursuit of technology”.

Reading this poem I saw a similar critique of AI as Koyaanisqatsi critiqued technology. And any advocate of this technology for whichever purpose, even the ones who occasionally critique some aspect of it, feels very tame in comparison, and off the mark.

I put my views of Derek in parentheses on purpose. I wanted to share my bias towards him, while also saying: “This is beside the point”.


It's clear that you're discussing this from an ideological bias that I do not share.

I see no reason to engage further in your assertions about what content is worthwhile.


We are debating a piece of art; having different interpretations is perfectly normal.

English is not a programming language though. I don’t understand how such an obviously false sentence can be so persistent.

I'm not even sure what you mean. Of course it is.

A programming language is a formal intermediate language for turning human comprehensible instructions into machine instructions by means of an interpreter or compiler. We've now allowed that intermediate language to be English, because that's preferable to most people, and the "compiler" has become very complicated indeed as a result of that.

You still have to be able to express what it is you want in a way the machine can understand, it's just both simpler and less deterministic now.


This. Just because an llm can translate any language into a programming language doesn’t suddenly make all languages programming languages. Until I can ‘brew install englishc’ and so on, it’s not a f**ing programming language.

Can you define programming language in a way that includes all the current programming languages and excludes English? I kind of doubt it unless you just define it as "anything that isn't a human language", which would be silly.

Natural language is full of ambiguities and redundancies which makes it a poor fit for a programming language, which is why it is never used as such.

You don’t need a precise definition of a term to know what a thing is and isn’t (Wittgenstein has taught us that much at least). We just need to know that programming languages are used to express executable computer programs (usually by translating to simple machine instructions) and that a natural language has never been used in this way in a significant manner.

A case in point. I bet you can’t find a definition for a fish which includes cods and sting-rays, but excludes dolphins and shrimp. And similarly the IAU were unable to come up with a definition of a planet which included Pluto and Mercury but excluded Ceres and Sedna.


> Natural language is full of ambiguities and redundancies which makes it a poor fit for a programming language, which is why it is never used as such.

I mean, nearly half a century ago Dijkstra argued your point compellingly, and he was right back then. If you read his "On the foolishness of “natural language programming”" (1978) you'll find that all of his most compelling arguments are gone now. Things have changed, and the machines can now largely cope with the ambiguity of language as well as the average human being can.

Since human language is the original source for the specifications we turn into formal code most of the time anyhow, we're really just asking if that original specification the programmers turn into formal symbolism is a form of code or not, and whether a good spec is equivalent to good code. I think it's difficult to argue that it's not, especially given that we now have these handy Natural Language to Formal Symbolism compilers.

> We just need to know that programming languages are used to express executable computer programs (usually by translating to simple machine instructions) and that a natural language has never been used in this way in a significant manner.

I did that like 30 times today. Maybe it wasn't in the past, today it is. The path is now Specifications->LLM->Formal Symbolism->Machine Code, it used to be Specifications->Human->Formal Symbolism->Machine Code. The inputs and outputs are the same, and I would argue that the process is still "programming" regardless of syntactic games with semantics.

Eventually we'll find a more efficient version of that formal symbolism and stop using code designed to be human readable at all. Still nothing will have really changed besides the input method.


> I did that like 30 times today.

You did no such thing. You fed some text into a statistical machinery which was able to infer another text from it. The first text just so happens to be a natural language and the inferred text was a formalized programming language which the statistical model had had its weights tuned to produce.

Statistical inference is a completely different process than compilation. Inferring is a completely different verb from compiling. Two different verbs which mean different things.

If we take your logic and explore its implication, we can just as easily claim that a project manager writing a JIRA ticket is programming, and that JIRA is a programming language. The project manager wrote a ticket in natural language which was picked up by a developer who translated it (by your definition of translation) to a formal language which got compiled to machine instructions and executed by a computer. This is obviously silly. And as silly as you find my description, I find yours equally silly.


My reaction to this sentiment is that they fill the same need in Europe as Uber did in the USA. They found a way to operate in a market while avoiding its regulations and are therefore able to offer much lower prices than their competitors who still follow the regulations.

Europe has historically had pretty strict consumer protection laws, and ever since the end of the Cold War these consumer protection laws have been slowly chipped away. When I was a kid for example companies were not allowed to target children in their marketing material. When American media became predominant in the continent, instead of enforcing our own consumer protection laws against American advertisers, regulators just ignored it and allowed it to proliferate, effectively making ads targeting children legal in the continent. Regulators have been showing the exact same inaction towards Chinese retailers breaking our own laws as they did towards American advertisers three decades ago. I foresee consumer safety laws meeting the same fate as the ban on ads targeting children.


I wouldn’t call them effective as much as motivating. I think for people who would not be motivated otherwise, this methodology is fine actually, as the alternative is probably nothing. However if you are motivated, almost any other method is more effective than DuoLingo (or alternatives), including more effective than the old DuoLingo with the forums and everything.

Simon Willison’s analogy does not apply unless that other team was immediately fired after they delivered the image resize service, or (more commonly) the work was done by a one-off contractor. The difference is the trust model. We trust that our company has hired a competent team which maintains knowledge of the image resizing service, that they respond to bug reports and feature requests and that they know how to fix and implement those.

Now I have been on HN long enough to know that we used to despise code written by contractors which we now depend on.


Why does the team need to be "fired"?

The single person who did the service might just quit and go to another job. They might be external consultants that rotate away when the contract ends. It might be a SaaS service where you don't control the code at all - nor the composition of their team.

We have trusted services, contractors and teams within our companies before. Now suddenly _everyone_ has ALWAYS read and meticulously analyzed every single line of code they have ever imported to a project?


As your parent comment says. It’s about trust. People don’t hire contractors with low reputations. Same with SaaS services. That’s why you see so much stuff about branding and customer testimonials. It can be gamed, but usually works well enough.

LLMs have no reputation to lose. Their work may or may not be aligned with your goals and they can’t care if they messed up.


Personally, if my company would have one person write a utility which mine would depend on, and that person would quit soon after delivery, I would be pissed. And I would demand that my team take ownership of the utility, and gain intimate knowledge of the utility, and voice my concerns with management who made the decision to hand out a task like that to a single person. I would then inform that management about the concept of bus factor, and how they just violated best practices. That next time they decide to hand out a task like that to a single person, that they should instead just hand it out to the team which is gonna rely on that utility.

I’ve noticed you are posting a lot of studies around, some of which have been peer reviewed and some not, some argue against your point, and some show mixed results.

Are you a researcher in the pedagogical sciences? Regardless, you have to admit that the original claim has very little evidence behind it despite being testable. And also the caveat you tag onto the end is a pretty massive caveat, and from the sources you provided it seems that the students who use it in the way you claim has been shown to be effective are in a minority anyway.


I'm not a researcher in the official sense, my interest is that of a parent whose kids are interested in programming and will be graduating into a world upended by AI, and how I can best prepare them for it. I always look to empirical evidence whenever there is a conflict of opinions, and there certainly are many opinions here!

I initially banned them from using LLMs for homework or coding assignments, because as above, my intuition is that you learn best by doing, and you won't learn anything if LLMs do everything for you.

On the other hand, I personally have learned insane amounts of a new subject matter simply by pair programming and conversing with an LLM. I could not even "cheat" and let the LLM do everything because the problem I tackled is not really addressed anywhere! This forced me to experiment a lot, which helped me learn very quickly.

This led me to wonder what "disciplined" use of LLMs can do for learning... which is how I came across a whole bunch of these studies.

I think your concern is really about disciplined use of LLMs, rather than the overall effect of LLMs on learning. And I would agree: students will just be too tempted to use them to cheat. However, I think those who have the discipline to use them judiciously can supercharge their learning like never before, but only as long as they do the hard work of "building the muscles" without AI.


> a Junior (in ANY subject) has the ability to LEARN so much faster with an AI research assistant

This is a testable hypothesis with a severe lack of citations. Intuition would argue the opposite. We learn by using our brains; if we offload the thinking to a machine and copy its output we don’t learn. A child does not learn multiplication by using a calculator, and a language learner will not learn a new language by machine translating every sentence. In both cases all they’ve learnt is using a tool to do what they skipped learning.


This seems to me like one of those things where people go into it with widely different initial assumptions.

1. AI is for cheating and doing the work for you. Obviously it won't help you learn faster because you won't have to do any thinking at all.

2. AI is an always-available question answering machine. It's like having a teaching assistant who you can ask about anything at any time. This means you can greatly accelerate the process of learning new things.

I'm in team 2, but given how many people are in team 1 (and may not even acknowledge team 2 as even being a possibility) I suspect there may be some core values or different-types-of-people factors at play here.


This is also a testable hypothesis. I would like to see usage statistics before making assumptions here but my gut feeling is that an overwhelming share of AI usage (like > 90%) would fall into your category 1.

But even with category 2. I think that still does not absolve AI as a cheating machine. Doing research is a skill and if you ask AI to do the research for you that is a skill a junior developer simply never learns.


This is interesting and relevant: https://www.sciencedirect.com/science/article/pii/S095947522...

"The expertise reversal effect is present when instructional assistance leads to increased learning gains in novices, but decreased learning gains in experts."

There's a whole lot of depth to the question of how AI tools support or atrophy learning for different levels of expertise.


Actually, you're both right. Using AI as a supplementary learning aid -- i.e. students use AI as a personalized tutor but still do the assignments themselves -- produces better outcomes. But using AI as a crutch -- i.e. using it to do the assignments -- produces worse outcomes.

There is even preliminary research evidence for this, e.g. https://www.mdpi.com/2076-3417/14/10/4115 and https://www.sciencedirect.com/science/article/pii/S2666920X2...


> students use AI as a personalized tutor but still do the assignments themselves.

So your first study actually concludes the opposite. It concluded that all AI users performed worse, but the effect was smaller for students who used AI as a tutor.

The second meta analysis I don’t quite understand. I understand they conclude that using an AI tutor shows significant improvement, but I don’t understand the methodology. I may be misunderstanding but it seems to simply count papers which show positive outcomes and reaches its conclusion that way. I think that methodology is deeply flawed as it will amplify whichever biases are present in the studies it uses. I also think the lack of control groups is a major issue. If we are comparing an AI tutor to nothing, of course the AI tutor is gonna perform better. We need to compare to traditional methods. And this is especially relevant in our discussion because junior developers usually have excellent access to senior developers (via peer review, pair programming, etc.), much better than students’ access to tutors for that matter.

So out of the meta-analysis I picked the paper with the strongest claim (trying to steel-man it) which is this one: https://online-journal.unja.ac.id/JIITUJ/article/view/34809/...

It claims the following in the abstract:

> The results indicated that students employing AI tutors shown significant improvements in problem-solving and personalized learning compared to the control group.

Now when I look at the control group it claims this (also in the abstract):

> Participants were allocated to a control group receiving conventional training and an experimental group utilizing AI technology,

But when I look into the methodology section I see this:

> The researchers classified the patients into two groups: MathGPT and Flexi 2.0

MathGPT and Flexi 2.0 are both AI tutors. Now I am confused, where is the control group and how was this “conventional training” conducted?

The methodology section actually tells a different story from the abstract:

> This research utilized a quantitative methodology via a quasi-experimental design.

By quasi-experimental design they mean that they tested the same students before and after AI intervention. And concluded that the AI tutor helped them improve. Now this is not what control group means, so the researchers are actually lying by omission in the abstract. This is a spectacularly bad experimental design and I wonder how it would pass peer review, so I look at the publisher Jurnal Ilmiah Ilmu Terapan Universitas Jambi. So not exactly a reputable journal.

I still stand by my claim: no evidence for a testable hypothesis. I suspect that your first link is actually correct in that AI is bad for students and just less bad if it is used as a tutor.


I hadn't looked at that study you selected, but yeah the methodology conflicts with the abstract. (Also it low-key seems to be an ad for "Flexi 2.0".) It does seem to be a shady paper, with a small N and in a journal of questionable repute.

That said, there are 80+ other studies listed in the meta-study, which is pretty frank about its limitations. (Note the snippets about positive biases in the conclusion.) It is going more for quantity over quality and is transparent about the statistical findings of each one (or lack thereof; see the count of "Not reported"s.) All these references have a myriad of results, but across the spectrum of well-designed studies at reputable venues to the other end, they follow the same themes, so I don't think this can be dismissed that easily.

But if you want, here's more research (some of which I linked in a sibling comment https://news.ycombinator.com/item?id=48241839) which has similar findings:

https://scale.stanford.edu/ai/repository/ai-meets-classroom-...

https://arxiv.org/html/2601.20245v2 (from Anthropic)

This article summarizes some of the above and more studies and has similar findings: https://maxmynter.substack.com/p/learn-to-code-with-llms-i-r...


This was the only study from the meta analysis that I read, and I picked it because it made the strongest claim out of all of them.

This is in the opening of the results section in the meta-analysis:

> In the final screening phase, a rigorous full-text analysis evaluated the methodological robustness and empirical validity of the remaining studies. [...] The final corpus comprised 88 studies that demonstrated robust empirical evidence for LLM applications in educational contexts.

The inclusion of the study I read does not give me confidence that this statement is true. And the fact that they reach their conclusion by simply tallying up the positive vs. negative studies makes me conclude that this meta-analysis is practically useless. They do admit this in the conclusion (which is probably why it passed peer review [assuming the peer reviewer didn’t read the same citation as me as I am 100% certain they would have asked for it to be excluded]). But that pretty much just leaves us with nothing. We are exactly where we started. No evidence that LLMs help students beyond traditional methods.

Now I am not gonna read that Anthropic study. It reminds me of cigarette companies finding the health benefits of cigarettes. That leaves that excellent 3-study review. In their first study they found LLMs have negative effects on students (in line with the first link you showed me). In the second study they found no effect. And in the third study they found a mixed (nuanced) effect where using LLMs as a tutor helped students in one aspect but had negative effects on others. This is by far the best study you have presented me but it still does not change my opinion. There is little evidence that LLMs (even when used as a tutor) help people learn better than traditional methods.

What makes me even more against this sentiment is this quote from the conclusion of the 3-study review paper:

> Our results suggest that students prefer to use LLMs to substitute rather than complement learning activities.

So on their own, students are more likely to use LLMs in a way which is harmful to their learning. I would expect similar behavior of junior developers.


As a precondition I think we have to assume that the person in question 1) wants to learn and 2) is smart enough to absorb new info and apply it and 3) reflects enough to adjust their approach when hitting bottlenecks or making mistakes 4) has a drive to create. Without these, self driven learning is not viable - and that has very little to do with AI.

For such a person, I believe AI can be very empowering for learning. Like Google, Wikipedia, Stack Overflow, and arXiv before it, AI tools give access to a lot of information. They allow one to quickly dig deep into any topic you can imagine. And yes, the quality is variable, so one needs to find ways to filter and synthesize from imperfect info. But that was also the case before. Furthermore AI tools can be used to find holes in arguments or a paper. And by coding one can use them to test out things in practice. These are also powerful (albeit imperfect) learning tools. But they will not apply themselves.


Who is talking about self driven learning? Every workplace teaches its juniors how to do their job, and how to become better at their jobs.

And as we are talking about junior developers it is safe to assume your conditions (1), (2), and (4) are all true. If any of them are false, then why did that person apply for and get a job as a junior developer? As for condition (3), every workplace eventually hires a person who does not fulfill this. Then they either fire that person, or they give them a talk and the developer grows out of it and changes their behavior to fulfill that condition.

Aside: you listed 4 conditions for learning. I am not sure these are actually conditions recognized as such by behavior science. In fact, I doubt they are and that these conditions are just your opinions (man).


And in doing so you spend what, 100 watt hours per bad idea? Compared to how many megawatt hours of AI’s failed attempts at proving math capabilities to investors only to prolong the AI bubble another month?

I bet your stupid ideas also taught you a valuable lesson and you learned at least something from the experience, maybe your next idea won’t be so dumb, and those 100 watt hours weren’t actually wasted (though it may feel like they were). Compare that to a failed LLM experiment, where all those billions of billions of computations are completely wasted. The model knows exactly as much after a failed experiment as it did going into it. Those megawatt hours were simply wasted, turned into heat energy, paid for by raising the power bills of the datacenter’s neighbors.

