Hacker News | mg's comments

Considerations about what goes on in agents internally will probably not be part of software development for long.

Personally, I already see LLMs and agents as black boxes. I give each feature request to multiple LLMs and then compare the results. I don't manually use "sessions" at all. I just look at the outcome. When I dislike it, I "git reset --hard", change my prompts and restart the feature request.

To have an ongoing sense of which agents perform best, I keep a log and calculate an Elo score of which agents meet my demands best. This score is important to me, not so much how the agent achieves it.
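For anyone who wants to try the same kind of bookkeeping, here is a minimal sketch of a pairwise Elo update over a log of comparisons. The agent names, the starting rating of 1500 and the K-factor of 32 are assumptions for illustration, not details from the setup described above.

```python
from collections import defaultdict

K = 32  # update step size; a common default, assumed here


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(ratings, winner: str, loser: str, k: float = K) -> None:
    """Apply one pairwise result in place; points move from loser to winner."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical log: (agent whose result was preferred, agent that lost).
log = [("agent_a", "agent_b"), ("agent_a", "agent_c"), ("agent_b", "agent_c")]

ratings = defaultdict(lambda: 1500.0)  # every agent starts at 1500 (assumed)
for winner, loser in log:
    update(ratings, winner, loser)

for name, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.1f}")
```

Because each update is zero-sum, the total number of rating points stays constant, so the scores only express how the agents rank relative to each other.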


This is an absolutely crazy wasteful thing to do considering the actual cost of all that inference and nothing to be proud of.

Unless we do our own benchmarks, we have to take all the marketing fluff from the frontier labs at face value, and all public benchmarks degrade eventually as labs optimize towards them. OP’s approach is wasteful because it is brute force, but the post says that an Elo score is kept, so this is also an experiment, and I don’t see what’s wrong with that. You learn which model performs well in which settings, which may save resources later. It’s also wasteful to keep working with the wrong model/harness/tools for too long.

It is the other way round.

In an interactive session, adding "Fine, but make the button red" after the model has generated a first solution more than doubles the tokens used, because the model now receives not only the original code and the feature request but also the updated code plus the change request as input tokens.

Sending a feature request to an LLM and then sending the feature request again with "The button shall be red" only doubles the tokens used.
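A rough way to see the difference is to count the raw input tokens in both cases. The sketch below ignores prompt caching and pricing entirely, and all token counts are made-up numbers for illustration.

```python
def interactive_session(code: int, request: int, output: int, followup: int) -> int:
    """Input tokens for two turns in one conversation (no caching assumed)."""
    turn1 = code + request
    # The second turn resends the whole history, including the model's output.
    turn2 = code + request + output + followup
    return turn1 + turn2


def fresh_restart(code: int, request: int, followup: int) -> int:
    """Input tokens when the amended request is sent as a new conversation."""
    turn1 = code + request
    turn2 = code + request + followup
    return turn1 + turn2


# Hypothetical sizes: 20k tokens of code, a 200-token request,
# 3k tokens of generated code and a 20-token follow-up.
session = interactive_session(code=20_000, request=200, output=3_000, followup=20)
restart = fresh_restart(code=20_000, request=200, followup=20)

print("interactive:", session)
print("restart:    ", restart)
print("difference: ", session - restart)
```

Under these assumptions the gap is exactly the size of the first answer that gets fed back in as input.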


The cost is far from linear though. Because of prompt caching and the fact that generally output tokens are a lot more expensive than input tokens.

Agreed that it is not linear.

I wrote my own agent, and it sends data to LLMs in this order: "General Prompts (How to write good code)" + "The Code" + "The Feature Request". This means the KV cache will be used even when the feature request changes.
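To illustrate why that ordering helps, here is a small sketch that compares two prompts differing only in the feature request. It measures the shared character prefix as a stand-in for what a prefix cache could reuse; real caches work on tokens and their behaviour depends on the provider, so treat this as a proxy. The function names and sample texts are hypothetical.

```python
import os


def build_prompt(general_rules: str, code: str, feature_request: str) -> str:
    """Stable parts first, the part that changes most often last."""
    return "\n\n".join([general_rules, code, feature_request])


def shared_prefix_fraction(a: str, b: str) -> float:
    """Fraction of the longer prompt covered by the common leading text."""
    common = os.path.commonprefix([a, b])
    return len(common) / max(len(a), len(b))


rules = "Write small, readable functions."
code = "def plot(data):\n    pass\n" * 50
request_1 = "Add a linewidth parameter."
request_2 = "Make the button red."

stable_first = shared_prefix_fraction(
    build_prompt(rules, code, request_1),
    build_prompt(rules, code, request_2),
)

# Same content, but with the feature request placed at the front.
request_first = shared_prefix_fraction(
    "\n\n".join([request_1, rules, code]),
    "\n\n".join([request_2, rules, code]),
)

print(f"stable parts first: {stable_first:.2%} shared prefix")
print(f"request first:      {request_first:.2%} shared prefix")
```

With the request at the end, almost the whole prompt is identical across attempts; with the request at the front, the prompts diverge from the first character.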

And output tokens are usually way less than the input tokens.

So I think that my approach is very lightweight on token usage compared to an interactive session.

It would be interesting to measure it for the other agents out there. Sending a feature request two times vs an interactive session.


"Make the button red" probably doesn't need an LLM at all.

One tends to use LLMs for everything in practice. It’s inconvenient to switch mode of operation.

That’s usually not true due to caching. It may be true if you leave a large gap in between, but if you send “make it red” right after, then it’s purely incremental

Probably like 1% of the energy an average person spends on driving.

Average American is what you mean

come on now, we can't just not escape the permanent underclass by using our brains, we've also got to use up all the resources while doing it.

What kind of projects/code do you have them work on?

Asking because I could guess that approach would be ok for the types of front end work that doesn't require much security or other validation.

But it sounds like it wouldn't be suitable for work in regulated industries or anything that needs to have extreme care taken.

?


Which model is leading the pack for you?

From the SOTA model providers, I only use OpenAI and Google. And between gpt-5.5 and gemini-3.1-pro-preview, gpt-5.5 is currently leading.

    you will be surrounded by an ecosystem of
    devices, none of which stand alone, but are
    more like portals to interact with your agents
I would be really happy with my phone + headphones as the device I use most. But only if I could use Gemini (or ChatGPT or Grok or any other chat agent) in voice mode and say "SSH into my GitHub Codespace soandso and implement feature soandso.". And it replies "Did it. I told copilot (or codex or whatever coding agent lives on that VM) to implement the feature".

And then a minute later I could ask it "Is copilot done yet?" and it replies "No, looks like it is still working on it". And then a minute later I ask again. It replies "Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?".

But it looks like none of the chat agents with voice interface have such a connector at the moment? An SSH connector would be the most useful. But a "GitHub Codespace connector" or something like that would also do.

I wonder if that will be a missing piece for long. If so, I would build an agent with voice mode and ssh connector myself. But I guess it should come out from the big guys any moment now?


> "Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?"

A verbal diff sounds practically useless. Does it first read out the entire left-hand base, and then read out the entire right-hand target? Does it say loudly "REMOVING ... ADDING ... "? How would it read out something like Struct->Field? This seems lower fidelity than a visual confirmation, and I just don't think that voice commands make sense with this kind of work.


It would tell me about the changes like a human would.

"It changed the plot function so it takes another parameter called linewidth. It also added an input field in the stylecontrols section where the user can ...".


How would you detect the presence of bugs in this scenario? How would you make sure the LLM isn't adding yet another useless, redundant function to the code base? Even if there isn't a bug in this PR, do you not want to be familiar with the actual shape of the code in case you need to dig through it while bug hunting later?

Every time I try to take a hands-off approach to the code like this, I come to regret it later. The code ends up bloated and labyrinthine. When I let it grow unabated, it becomes gradually more difficult for the LLM to understand the intended structure as the project becomes too big for the model to keep the whole thing in its context.


    How would you detect the
    presence of bugs in this
    scenario?
I would ask AI. "Did the last commit introduce any bugs or unintended consequences?". In fact I already use this prompt after every change I make manually.

    How would you make sure the LLM
    isn't adding yet another
    useless, redundant function to
    the code base?
By asking AI. In fact, I already run a long "Can you refactor anything in this codebase to reduce redundancy, improve readability, performance or maintainability" pretty regularly.

Are you ever reading the code? What do you do when the LLM can't fix a bug? Do you not wish you had a more intimate first-hand knowledge of the code when fixing things yourself?

Please don't tell me that never happens-- I've had one just in the last week and I use both OpenAI and Anthropic foundation models.


In my current workflow, yes, I read all code.

In fact, I usually let multiple LLMs implement the same feature, and then I compare them. I even run my own arena in which I calculate Elo scores for LLMs from my perspective of which one implemented features better.

Having the ability to control code agents via voice would not take away my ability to do that. But I think in the future, that will become less and less necessary. If we look back at this conversation in five years, it will look very archaic, and we will be used to having superhuman AI do everything for us. In 10 years, it will sound like a strange idea that humans were once fiddling with code to improve the quality.


Something something wasting machine cycles with a compiler.

Something something taking the crafts and the man out of craftsmanship to just get it out the door as quickly as possible.

All jest aside I mostly agree with you but I'd tack on another 20 years for a total of 30.

Though in this technological jump I don't think people are as excited (understandably) as when the teletype came on scene. I too like the potential but dislike the whole discourse around it, the ethics involved and the way it's deployed. Such is life I suppose.


Fair enough. Thank you for sating my curiosity. I'm not quite as optimistic as you, but I'm excited at the potential to be proven wrong. :)

I like how people think that if LLMs get to the point where they write code you can ship without reviewing it, that humans will still be in the loop "sshing into a code space" and "implementing features". Do you really think you'll even know what files are in that repo? Or that you'll be a necessary part of the process whatsoever?

I can't tell if this is sarcasm.

I wrote my own tooling around the raw LLMs:

I can tick files in Vim, those get concatenated into a prompt. Along with a feature request. Plus an instructions file that tells the LLM how to reply. Plus my general "rules for good code" file, plus one "rules for good code" file per language involved, plus a project specific overview file. The LLM then answers with a list of changes it wants to make to the code. My tooling then applies those changes and I look at them via "git diff". If I like it, I commit. If not, I change one of the prompts and start the process again.
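As a rough idea of what the prompt-assembly step of such tooling could look like, here is a minimal sketch. The section format, the ordering and all file names are assumptions for illustration; this is not the actual tool.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def section(title: str, body: str) -> str:
    """Wrap one piece of context in a labelled block."""
    return f"### {title}\n{body.strip()}\n"


def assemble_prompt(
    context_files: Iterable[Path],  # instructions, rules, project overview
    source_files: Iterable[Path],   # the files ticked in the editor
    feature_request: str,
) -> str:
    """Concatenate context, selected source files and the request, in that order."""
    parts = [section(p.name, p.read_text()) for p in context_files]
    parts += [section(p.name, p.read_text()) for p in source_files]
    parts.append(section("Feature request", feature_request))
    return "\n".join(parts)


# Demo with throwaway files; every name here is made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    rules = root / "rules_for_good_code.txt"
    rules.write_text("Prefer small functions.")
    chart = root / "chart.py"
    chart.write_text("def plot(data):\n    pass\n")
    prompt = assemble_prompt([rules], [chart], "Add a linewidth parameter to plot().")

print(prompt)
```

Applying the model's answer and reviewing it with "git diff" would sit on top of this and is not shown here.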

Instead of replying with code changes, the LLM can also decide to request more files. I wrote a little DSL for that.

I described the beginnings of this workflow last July:

https://www.gibney.org/prompt_coding

Feels like an eternity ago. I think I will write a new blog post this July and describe how the workflow has evolved over the past year.


Makes me wonder how many bytes the shortest possible Mandelbrot implementation would need.


Author of "wakeup" here. You would need between 32 and 64 bytes. I have something that almost looks like one in 32 but it's not published yet ;)


At the same event I released "Broccolori", a 32 Byte fractal for old-school PCs.

https://www.pouet.net/prod.php?which=106205

Related to the Dragon Fractal, with a twist:)


    You can’t import or require() a module
    that only exists in memory.
You can convert it into a data url and import that, can't you?


What happens to relative imports?


I wonder if we really need agents to have control of a full computer.

Maybe a browser plugin that lets the agent use websites is enough?

What would be a task that an agent cannot do on the web?


Not sure if this is a joke

But how would claude code work from a browser environment?

Or how would an agent that orchestrates claude code and does some customer service tasks via APIs work in a browser environment?

Would you prefer it do customer service tasks via brittle and slow browser automation instead?


    how would claude code work from a browser environment?
If you want an agent (like OpenClaw) to write software, why have it use another agent (Claude Code) in the first place? Why not let it develop the software directly? As for how that works in a browser - there are countless web based solutions to write and run software in the cloud. GitHub Codespaces is an example.


But OpenClaw is "Claude Code" with bells and whistles so it can be contacted via messaging services and be woken up to do things at specific times.


I personally won't allow full control for a long time.

On the other hand LLMs have been a very good tool to build bespoke tools (scripts, small CLI apps) that I can allow them to use. I prefer the constraints without having to think about sandboxing all of it, I design the tools for my workflow/needs, and make them available for the LLM when needed.

It's been a great middle ground, and actually very simple to do with AI-assisted code.

I don't "vibecode" the tools though, I still like to be in the loop acting more as a designer/reviewer of these tools, and let the LLM be the code writer.


But does the agent have access to a whole computer to write those tools?

Couldn't it write them in a web based dev environment?


No, it doesn't, I only run agents in a dedicated development environment (somewhat sandboxed in the file system) but that's how I've used them since the beginning, I don't want it to be accessing my file system as a whole, I only need it to look at code.

Don't think a web-based dev environment would be enough for my use case. I point agents at example code from other projects in that environment to use as bootstraps for other tools.


Why can't that "dedicated development environment" be a cloud VM with a web interface, a GitHub codespace for example?

You could put the example code on the filesystem of that VM too.


It could be…


Browser plugins have a security problem that's easy to miss: the agent runs inside your existing browser profile. That means it has access to your active sessions, stored credentials, autofill data — everything you're already logged into. A sandboxed machine is actually the safer primitive for untrusted agent tasks, not the more paranoid one. I work on Cyqle (https://cyqle.in), which uses ephemeral sessions with per-session AES keys destroyed on close, because you want agents in a cryptographically isolated context — not loose inside your personal browser where one confused-deputy mistake can reach your bank session.


Every week there is a news article about some script kiddie who shot themselves in the foot after vibe coding their production-ready app, without the help of any senior engineer, because, let's face it, who needs them, right? Only to end up deleting their production database, or leaking their credentials on a html page or worse, exposing their sensitive personal data online.

I'm actually pro-agents and AI in general - but with careful supervision. Giving an unpredictable (semi) intelligent machine the ability to nuke your life seems like the dumbest idea ever and I am ready to die on this hill. Maybe this comment will age badly and maybe letting your agents "rm -rf /" will be the norm in the next decade and maybe I'll just be that old man yelling at clouds.


Run anything multi threaded?


    German chancellor Friedrich Merz ... 
    lashed out at German workers to
    “simply do a little more,”
Germany literally pays people to do nothing.

A friend of mine, an engineer who works in the German car industry, recently told me that nowadays he has a lot of free time. Because the company he works for has so few orders that the company is granted "Kurzarbeitergeld" - the government pays 60% of the salary if the employees work less.

That blew my mind. If I had fewer orders, I would work more to increase the quality of my product and my efficiency. Working less as a reaction to losing market share seems completely counterproductive to me.


When my eldest daughter was in high school (~2010, Argentina) there was a provincial policy where, if every single student scored below a certain threshold on a test, the scores had to be reassessed against the maximum result.

The resulting situation here was that she was constantly bullied into underperforming. Both cases are actually similar in that each individual has a personal incentive to underperform - the difference is that in your friend's case the policy is granted at the company level so no single employee can defect and break it for the rest, while in my daughter's case one high scorer could invalidate the reassessment for everyone, which is exactly what made defection punishable and the bullying emerge naturally.


This is the natural result of "equity" which is the academic jargon term for "forced equality of outcome". High achievers are attacked. People who push us forward are demonized. The low achievers are never pushed to be better. And the average drops.


Can you link a source for it? That sounds too absurd to be true…


It’s not that absurd and happens all over the world in university systems. I had a Comp. Sci. Professor that taught assembly and graded on a curve. As you might imagine the one guy that was a wizard at assembly caught flak from the unwashed masses.

I had another professor that not only did a curve but dropped statistical outliers to prevent this problem, he literally explained his system on Day 1 of the course. This was 15+ years ago and by no means a new idea.


The future is not evenly distributed.

I tried to search for it, but even the 2 documents that superseded the one from around the time my daughter was at school are not available.

I mean, the site doesn't even have a valid secure certificate so...

In the site below (In Spanish) you can search for 10/2019 and a cursory translation of the document title will show that this is the proper document (For 2019 onwards, the replaced doc 04/2014 isn't available either)

https://koha.chubut.edu.ar/cgi-bin/koha/opac-search.pl?idx=k...


It makes sense for production line workers. Less so for R&D, but I've seen it affect R&D as well.

By the way, it's not 60% of the salary that the state pays - it's 60% of the difference in salary due to reduced hours.
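A small worked example of that correction, taking the description above at face value: the state pays 60% of the pay lost through reduced hours, not 60% of the full salary. The function name and the numbers are hypothetical, and the statutory calculation has more detail than this sketch models.

```python
def kurzarbeit_pay(full_pay: float, worked_fraction: float, rate: float = 0.60):
    """Return (employer_pay, state_allowance, total) for one month.

    full_pay        -- regular monthly pay at full hours
    worked_fraction -- share of normal hours still worked (0.0 to 1.0)
    rate            -- share of the lost pay covered by the state (assumed 60%)
    """
    employer_pay = full_pay * worked_fraction
    lost_pay = full_pay - employer_pay
    allowance = rate * lost_pay
    return employer_pay, allowance, employer_pay + allowance


# Hypothetical case: 3000 per month, hours cut to half.
employer, allowance, total = kurzarbeit_pay(3000.0, 0.5)
print(f"employer pays {employer:.0f}, state pays {allowance:.0f}, total {total:.0f}")
print(f"share of full pay received: {total / 3000.0:.0%}")
```

So with hours halved, the employee in this example ends up at roughly 80% of full pay rather than 60%.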


The problem is they don't get rid of their ICE cars any more - most of the world is transitioning to EVs and China - their previously biggest importer - has strict EV quotas and no one wants to buy German EVs that are too expensive and less capable than Chinese made EVs.

Germany was ahead in EVs and solar - both industries have been cut off by conservative/free-market ICE lobbyists and these 2 huge global markets are now dominated by China instead.


> has strict EV quotas and no one wants to buy German EVs that are too expensive and less capable than Chinese made EVs

Both China and the US have enacted trade barriers against EU originated auto drive goods.

A Chinese VW ID4 is manufactured in Shanghai and an American one in Tennessee. And that is the crux of the issue - consumers are still open to buying a German badged product, but it won't be "Made in Germany".

And where else can Germany export?

The individual EU states are protective about exports by leaning hard on nationalism and union support as seen in France (Renault+Stellantis) and Italy (Stellantis), India demands China- and US-style JVs and domestic manufacturing as well even despite the EU FTA, Japan and SK prefer buying domestic, Russia is blocked due to sanctions, ASEAN+Africa is flooded with Chinese, Japanese, and Indian manufactured cars already, and South America is flooded with Chinese, American, Japanese or domestically manufactured cars.

Germany Inc will remain in Germany as long as Germany makes itself attractive. Otherwise, they will leave, as they have already done so for the US, China, the CEE, and increasingly India.


The VW EVs aren't bad, apart from poor UX design, which they have realized.


Broadly speaking they're not _as far away_ from Chinese EVs as we all make it here to be. However, the problem is that the stockholders of all these companies expect 500% YoY growth which isn't sustainable. Not to mention the cost of a car has grown significantly while all German cars have degraded substantially in quality. For example, you can expect Porsche to sell FOREVER the number of cars they manage to sell in the 00s and 10s:

    2000: ~54,600 (happy)
    2007: ~98,600 (happy)
    2019: ~280,800
    2025: 280,000 (CRISIS!!!)
I mean... It's a freaking sports car! Why on earth would you expect to sell more units than these?


Broadly speaking internet commentary about cars is crap and the differences between any two competing products (i.e. ignore the dishonest comparisons between a Chevy Cobalt and a Toyota Landcruiser) are way, way smaller than the online fanboys, shills and people with a very expensive purchasing decision they want to feel validated in will lead you to believe.


They are on par or better than their Chinese counterparts but they are a hard sell being 30-50% more expensive.


The same new VW ID.3 in China costs less than $20K, while in EU more than $40K.

Not the BYD or noname EV, the same model.


Kurzarbeit is only available for a limited time and has the target to avoid layoffs resulting in much higher costs in unemployment payments.


It's 24 months at the moment, which is way too long. A buffer period for companies to allow for necessary adjustments should be one quarter, max two.


The alternative is, that if they don't manage to adjust, they let people go.


Depends on your position I guess. If you are a worker at a conveyor belt, it doesn't make sense to work more to produce more cars nobody needs. I think originally this policy was designed to save jobs during temporary downturns, not to save industries going downhill.


If anyone can predict whether a downturn is temporary or an industrial shift you can make enough money to never work again.


If you start with a couple hundred thousand $, I can see that. But if you start with like $10,000, how do you stop market fluctuations from eating your money before your correct prediction turns into runaway compounding gains?


Government support for reduced working hours is also limited to 12 months


That is true for R&D, not for factory workers. Germany has quite strong workers' rights, so mass layoffs are not a possible way to save money when orders are lacking.

Essentially companies get some of the money back they and their employees paid as taxes.


> If I had fewer orders, I would work more to increase the quality of my product and my efficiency. Working less as a reaction to losing market share seems completely counterproductive to me.

That may work if you are a sole proprietor or small business person, but that's not how shareholder owned corporations work.

A sole proprietor is willing to work more if business drops (effectively lowering their compensation rate) because they are the beneficiary of any future gains that may (or may not) result from their short term sacrifice. If they want their employees to do the same they have to give them the same deal.

A large corporation can't easily make its employees work much longer for the same pay (except in the very short term), nor can it easily get shareholders to be OK with increasing spending on labor. This usually ends with massive layoffs when it can't sustain itself anymore.

That's one reason that smaller companies can be more nimble.


Actually this is not as easy as it sounds. Quite a few companies opt not to go into 'Kurzarbeit' because it means extra scrutiny, as it is only there to stop major layoffs, which would cost our social insurance system even more. There are typically enough mechanisms to make it unattractive, so that even unions accept unpaid leave instead. IMHO there are bigger productivity problems. After COVID, sick leave has massively increased. Many women do not work as much as they want because child care is still sketchy. There is often simply no incentive to work more because engineering careers are quite limited: general pay compared to expenses is really good, but top performers earn considerably less than in other countries.


You cannot indefinitely ask employees to do more for less money. According to law and job contracts you can only do so much unpaid overtime. Usually 10h instead of 8h, and only when justifiable. Everything else is illegal. Paying 60% and working less is hoping for the environment or situation at the market to change, whether or not that is realistic.


However you look at it, sitting at home doing nothing is not the right approach for engineers to get their company back on track.

If there is no money to pay them, they should get shares in the company. So if their R&D is successful, they participate in the outcome.


Depends on where you work in the industry, there's a huge level of division of work. Upstream departments should work more on new products and marketing etc. But a little more downstream, there isn't much to do if not enough cars are ordered.

The intention of Kurzarbeitergeld is to prevent large layoffs. I honestly can't tell if that makes sense in the long run, but it seems reasonable for a political party trying to make it to the next term.


How does this blow your mind? Your suggested action is illogical since the standard way to deal with this problem is to fire workers, but if demand is seasonal, it means you constantly have to fire and rehire, so instead of firing full workers, they just fire part of their working hours.

The crazy part is that the government subsidizes this, which creates perverse incentives, but that's a different problem.


The worker who’s put on this reduced salary instead of being fired has neither orders nor a product. He works for a wage.


> If I had fewer orders, I would work more to increase the quality of my product

Really? Because most of the time what you see is huge layoffs and gutting the company's assets.


So let me get this straight: you have exactly one data point, and the second data point you're using is a German chancellor widely regarded as one of the worst by many measures. Right?

This is FUD. They said the same about the Greeks in 2008, it is complete BS. In any given org past a certain size you'll find people who slack a lot and people who work for three. Unfortunately that seems to be the nature of large orgs, nothing special about Germans...

PS. I'm having déjà vu. This is the exact same narrative Greek politicians used against the Greek population to justify them becoming poor overnight.


The one thing that interests me most when it comes to laptops these days is weight. So I jumped right into the tech specs section and looked it up. Since this is the "Air" laptop of the company that is popular for thin and lightweight devices, my hopes were high.

But ...

The 13 inch version is heavier than a ThinkPad X1 Carbon. Which has a 14 inch screen and can run Linux.


I bought a ThinkPad X1. Had to send it back for repairs three times in the first year, including a complete motherboard replacement, and it died again immediately after the warranty expired. Been a $2800 door stop since then. The case is flimsy plastic that gets beat to crap easily. The trackpad is over-sensitive in all the wrong ways which makes it hard to use as an actual laptop. Plus it's weaker and slower than an Air. Also unbearably loud and unbearably hot.

I don't like Apple as a company and I don't particularly like MacOS, but no one except Apple makes a laptop worth a damn.


Was it a Gen 1 device? I bought a Thinkpad X13 Gen 1 many years ago and it kept having blue screens from RAM errors and other problems. Eventually after many warranty attempts and motherboard replacements they sent me a new X13 Gen 4. This has been running Ubuntu with no problems for 4 years now, it might be more a "lemons" phenomenon than a general rule. Also, AFAIK, the case is metal with a "soft-touch" coating.

The Apple ARM processors are still in a league of their own but personally I'm not willing to give up my OS freedom of choice for that advantage.


Not my experience in the slightest, after two decades of personal thinkpads and around 20 issued to my team.

Also if you'd just spent that extra 120 bucks for the 3 year onsite warranty, you'd have a lenovo technician replacing your motherboard at a location of your choice the next working day.


Very different experience here. I have an X1 Extreme Gen 4 since 2022 (running Linux), and have had zero hardware issues so far. The only thing that's annoying is that it gets quite warm on the hand rest.


I have an X1 Carbon 2023. It's pretty solid, the only complaint I have is once the CPU usage is over 10% the fan starts running full blast.


Eh. Just simply on stability and life, beyond CMOS battery and laptop battery changes, my 5x 2015-2018 Lenovos are working like a charm. I love the plastic case, it flexes and catches falls better than the Mac. The MBPs have fallen down and dent like crazy, leak electricity through the metal body, weigh like crazy and still no OS freedom and no free app store and you've got to rely on "homebrew"? It is wild that we are relying on "home" brew for making a machine from one of the richest companies in the world palatable.


Since when do we use crowdsourced anecdotes to represent product quality?


> but no one except Apple makes a laptop worth a damn.

That's pure nonsense. I'm a fan of the Asus ExpertBooks myself which seem to be largely ignored in these discussions. They weigh about 2 pounds, 15mm thick, they don't overheat, about 15 hours of battery, and pretty damn durable.


I also bought a ThinkPad X1 back in 2015. Used it for 9 years with no issues at all. I installed Linux on it last year and still use it.


why did you create a throwaway account for this


Maybe they work for Apple


The Air is going to run laps around the X1, in literally every benchmark you can come up with besides "it's not open source". I have that same processor in a much bulkier ThinkPad and it thermal throttles instantly doing basic office multi-tasking, with the fan running constantly.

Also it's made out of metal.


The X1 Carbon is getting updated to Panther Lake, and Panther Lake is getting competitive with the M5.

> in literally every benchmark you can come up

Nope, Panther Lake will win most gaming benchmarks. The M5 will win most others but not by "running laps around" levels.


At what power envelope? Intel chips can compete with M series chips, but usually at way higher power, which means fans running like a jet engine.


Similar power envelope. It depends on the laptop of course, but many Panther Lake laptops score comparably on battery life tests in reviews.


> score comparably on battery life tests

Until Windows leaves it in S0 state while it's in your backpack :)

My Lenovo does this every week, such a joy.



Thinkpads have track points, macs don't.

That benchmark is really important to me due to RSI. Track points save me a buttload of hand pain.


interesting, I had to stop using my trackpoint because it was giving me rsi in my index finger. the track pad hasn't given me any issues.


Same here, had to stop using the trackpoint (after maybe 10-15 years of heavy usage). And macbook trackpads are awesome.


Same. I used to be ride-or-die for my Thinkpad Trackpoint™ keyboards, I even had an external USB one. Eventually I started to feel the RSI in the top of my hand. Haven't had any issues with a Macbook trackpad.


It's all about not doing the same thing, whatever that same thing may be. I switch between trackball and mouse as each one gets a bit uncomfortable.


I'm also convinced that trackpoints themselves are stiffer and less comfortable to use than they used to be in the older thinkpads


Ever since the T450 the trackpoint has been awful.

Can't replace the knob anymore either, as the convex knob was arguably the best


What do you mean when you say you can't replace the knob?

It comes off on my T14s Gen 1 and the T14s Gen 5 that replaced it.


I had the opposite issue. Trackpoints started hurting my hand because they require significantly more force than the Mac's touchpad.


What basic office tasks are that?

The last time I was excited about the performance of local computers was in the 90s I think.

Modern laptops are so insanely fast. Not sure if they are 2x, 10x or 100x faster than I need them to be. But I never hear fans. I never have to wait for the machine these days.


Have you used a MacBook as a daily driver since the M chips came out?


No. I'm looking to get one with 64GB memory for local AI models. The worry is the keyboard experience on the MBA isn't as good as the MacBook Pro.


The keyboard and touchpad experience are nearly identical between the two... not nearly as good as old IBM Thinkpads used to be, but that's a trade with IMO the much better touchpad experience on Mac.

That said, I just don't think I can keep buying Apple hardware, just not a fan of the company... I only begrudgingly use Android as there isn't a reasonable, more open option.

I'll probably stick with my M1 air for personal use a couple more years then pass it on. My daughter is still using my now 13yo rMBP with 16gb/512gb. I wish the ram and storage upgrades on mac weren't so overpriced.


At current rates they aren’t overpriced at all. Frankly I’m surprised we didn’t see a big increase in cost with this generation.


Apple has their supply lines locked in a few years ahead of time... they likely won't see downward pressure for a couple years still. Not that they might not still take advantage... though downward sales pressure is a trade off too.


I’ve used both extensively and there’s very little difference in the keyboards between an Air and a Pro.

The difference in displays (Pro much brighter) and size/weight (Air much lighter) are much more significant considerations, IMO.


It has always been like this. Apple's signature for their laptops is their aluminium body and people seem to like it.


I like the aluminum body a lot. I'm not particularly clumsy, but each of my macbooks ends up with some fall damage at some point over the 5+ years that I have it.

When I used to be assigned a plastic Dell work laptop, I dropped one onto the carpeted floor of my office because I thought it was going into the padded sleeve of my backpack, and that cracked the case and broke the screen. I've accidentally yoinked my MBA (the last Intel one they made) off my desk, and while it dented the body, nothing broke. That is now my drum computer, and it gets regularly pelted with drumsticks when my grip tires.


My father recently dropped my macbook air from the car essentially on concrete bricks.

It got just a single dent of less than 0.5 cm, and it's on the side (although the damage was done while the laptop was closed, so some of it is just above the display's aluminium shell).

To be honest, it's barely visible, everything is working, and there was no damage to the display or anything else, for what it's worth.

I usually don't like Apple, but damn, the MacBook Air is tiny and can take some damage.

Now that we've talked about it, I am still a little sad about the damage, because the laptop was in perfect condition beforehand, but it held up incredibly better than any other laptop would have, at least in that respect. Gonna use this laptop for a long time (M1 Air).


Unfortunately dropping your laptop once in 5 years actually does make you too clumsy for a plastic laptop.


As someone clumsy, I'm so grateful that my MacBook Air can take a beating. It has one slight dent of about 1mm in the 4 years I've had it and I definitely drop it or knock it off a desk or something a few times a year.

I'll take the extra weight of aluminum (0.3lb, 130g). Yes, someone might say the ThinkPad X1 Carbon is 14", but the 13" MacBook Air actually has a 13.6" screen.

If I were in the market for a PC laptop, I'd definitely take a look at the ThinkPad X1 Carbon, but I'm also not worried about the weight of my MacBook Air. The X1 Carbon Intel ones are on sale right now since Panther Lake will be a huge upgrade coming soon, but even on clearance they aren't cheap. An X1 Carbon with 32GB RAM and 1TB storage (Ultra 7 268V, the cheapest one due to the sale) will cost $1,679 while a similar MacBook Air will cost $1,699 - and the M5 has 48% better single-core performance and 56% better multi-core performance (Geekbench). A 16GB/512GB (Ultra 5 225U) X1 Carbon is $1,538 compared to $1,099 for a MacBook Air - and the M5 has a 74% single and multi core advantage there.

Panther Lake might narrow the performance gap, but early indicators suggest that's not the case. Even against the top-of-the-line Ultra X9 388H, the M5 has a 36% single-core advantage, while the Ultra X9 388H is 3% faster in multi-core. And I'm not sure the higher wattage "H" processors work for something like an X1 Carbon.

The highest non-H Panther Lake processor (Ultra 7 365) sees the M5 get 51% better single-core and 58% better multi-core. Maybe we'll see better, but it looks like Intel isn't closing the gap in 2026.


Does it? In my case, it was my father who dropped my Mac, but luckily everything was safe with 'tis but a scratch. So perhaps that should be factored in as well: there is more than one variable.

That being said, I am pretty clumsy but I have never dropped any hardware except a dumb phone which I threw out a lot and it was so small and tiny but it never had any problem.

And then one day I dropped it from top just a little bit and let it drop/slide inside my bag (like a cushion) and that day it died. I recently asked someone about it and turns out that its battery got inflated.


It's essential for thermals. Without the unibody, it would throttle sooner and you'd lose performance.


The aluminium chassis cannot be used for heat dissipation without risk of harming users. Which is why there is a "MacBook Air performance mod" to add thermal interface material (instead of thermal insulation) to turn the chassis into a heatsink.

It's not a heatsink by default.


Not really. I did the thermal mod to my previous (M1) MacBook Air and it still didn’t get all that warm.

The Intel MacBook Pro I had before that one got far, far hotter - almost scalding hot if you really pushed it - without any modifications.


The last generation of Intel Macbooks was so bad... the i9 I was assigned from my job at the time would constantly go in and out of thermal throttling, making the whole experience effectively useless... It was also so locked down, I couldn't apply any mods to be able to underclock/volt the thing to something reasonable.

I really do hope that Linux becomes an option in more workplaces without being too locked down for developers.


Air has no thermal connection to the chassis for the purpose of making it safe to have in contact with skin.

People have been modding theirs to make this contact, though. And been getting a significant performance boost out of it.


I believe we are talking about slightly different things. Yes if they thermally coupled the body to the processor, then a small patch of the body would get very hot, burning the user.

However, the fact that the aluminum gets hot during prolonged use means that it is acting as a heat sink and cooling the CPU compared to a body made of plastic. Thermodynamics, it's the law!


>However, the fact that the aluminum gets hot during prolonged use means that it is acting as a heat sink and cooling the CPU compared to a body made of plastic. Thermodynamics, it's the law!

Not really. It's picking up "stray heat" that is radiated from the copper heatsink inside and conduction from the air in the fan system. It does not improve cooling the processor in any kind of manner. If it were plastic, the plastic would get warm too. Maybe it'll be a 2 degree difference.

Direct contact or bust.


It should improve ambient temperatures inside the body, allowing for more heat transfer.

It might be marginal, though.


It does actually help. Any heat radiated into the aluminium is no longer in the copper, so it makes it to the environment. The copper remains cooler overall.


The original Air lineup was thinner in the front and seemed a little lighter. The thicker front on newer airs gives more battery life, but I'm not a fan of it.


The thinness at the front was a bit of a hack though wasn’t it? So Steve Jobs could make it look good in photographs. I’d take the extra battery life any day.


The final version of the “wedge” Air case was an amazing piece of physical design. The lid had a large-radius complex curve that perfectly controlled reflections. The bottom case had a curve that made it look like the machine was hovering above the desktop from almost any angle. Calling that a “hack” is sort of like calling it a “hack” that a Ferrari looks fast even when it’s parked.

The new designs are overtly boringly utilitarian. I would say they intentionally look ugly. I guess this must have been intended as a marketing signal.

And it seems like it’s working since you think the new design delivers better battery life. It doesn’t! The 13-inch M1, M2, M3, M4, and M5 MacBook Airs are all specced for 18 hours of battery life.


I think it's a matter of the chonkers feeling like you're getting what you're paying for. "This thing is so expensive! WHY is it so thin?"

Of course the zeitgeist keeps changing and what made sense yesterday might look like madness for those that aren't following things closely. As for myself, I very much prefer "slightly chonkier, but better heat dissipation" (coming from owning an intel mb pro and using it on my lap often).


I have the M1 MBA and M5 MBP. The wedge MBA feels noticeably thinner and the MBP feels kind of chonky in comparison. It's a bigger difference moving them one-handed than the specs would indicate.


Exactly this. And it makes sliding it in and out of bags and laptop sleeves so much easier.


I'm in the same boat. I have one of the original M1 MacBook airs, and the thicker front feels like overall a downgrade in hardware. Going up to higher ram amounts might be good for some of my datasets, but it's not needed for any software I run.

So I guess I'll wait for the next cycle and hope they return to the "Air" idea again.


I like the touchpad. Is there any competitor which is as good and exact? I noticed in Linux, it's not as exact.


Thinkpad touchpads are mediocre at best. Dell’s are a little worse than that IMHO.

I don’t understand why other laptop manufacturers don’t copy the Apple trackpad.


I liked the Dell one, loved the ThinkPad, and hate the MBP trackpad. I guess it's a matter of taste?


I have a Lenovo laptop with a quite mediocre touchpad. I got used to using gestures instead of clicking and it works great for me.


If anybody else wondered about figures:

13.6 inch 2560x1664 screen, 1.23kg (13" Mac)

14.0 inch 1920x1200 screen, 0.98kg (14" Thinkpad)


It comes with a 2880 x 1800 OLED


As long as your wallet “comes with” an extra $2000 over the MBA.

(EDIT: ninja’d, I see.)


The $3000 version does. The air is $1000


> The 13 inch version is heavier than a ThinkPad X1 Carbon

And costs ~800 more for 16GB/512GB with a slower CPU and worse battery life.

As someone who spends his life on the road with a laptop, I strongly feel that anything that works for you under 3lbs is the sweet spot. The difference between 2.2 and 2.7lbs is minuscule in the grand scheme of my backpack.


But then you'd have to have a plasticky thinkpad with half the screen resolution...


It comes with a 2880 x 1800 120Hz OLED


https://www.lenovo.com/gb/en/p/laptops/thinkpad/thinkpadx1/t...

For an RRP of £3,259.99?

Compare that to the base 512GB, 16GB memory macbook air @ £1099.

The next comparable X1 Carbon I can find is: https://www.lenovo.com/gb/en/p/laptops/thinkpad/thinkpadx1/t...

RRP: £1,900.00 with this crappy display: 14" WUXGA (1920 x 1200), IPS, Anti-Glare, Non-Touch, 100%sRGB, 400 nits, 60 Hz


[flagged]



I really like my X1 Carbon gen 7, aside from the bizarre Ethernet "port" (it has built-in Ethernet, but they didn't have room for RJ45, so instead of just telling you to buy a USB one it's on a dongle that blocks one of its two USB-C ports when plugged in, eliminating the advantage of "doesn't use a USB port"). But aside from fantastic Linux support, it's got little to recommend it over a similar-vintage MBA, which has a much better look and feel.


Same here. If the rumored A18 Pro MacBook stays under 1kg, it would be very compelling.

Regarding lightweight laptops, the Fujitsu FMV Note U series (14-inch) weighs only 634g-917g with Arrow Lake 255H and a replaceable battery.


I run Fedora and Arch on my M2 Air via the UTM app, which wraps the Apple Silicon hypervisor, and it's _fantastic_.


I'm in the same boat and finding it disappointing.

For people saying this machine is so much faster, I don't care. My situation isn't the norm, but we're on HN. I have a powerful desktop that's my main compute machine and my laptop is a terminal. I need a web browser, whatever corporate shovelware I need, and a ssh connection (and tailscale). If I wanted to do real work locally I wouldn't be getting an Air.

While realizing I'm not the typical user, it's not like the typical Air user needs much compute anyways. The general public just uses web browsers.

Though one thing I'd love is if they could add just a little distance between the keyboard and screen so my screen doesn't get so dirty constantly... doesn't anyone use lotion at Apple?


There are a ton of rumors a much cheaper MBA is about to be announced.


Is Linux normally heavier?


Carbon fiber vs aluminum.


What I actually like about Apple products is the heft. They feel premium and the heaviness contributes a lot to the premium feel.

I tried a ThinkPad X1 Carbon as well, it felt like a toy.


I'm not sure you want heaviness in a laptop?


I have been maintaining this chart of phones with replaceable batteries available in the USA for 10 years now:

https://www.productchart.com/smartphones/removable_battery

Man, is it empty these days. The chart used to be pretty full. Now it only has about 1% of all phones that are in the Product Chart database, as the other 99% have fixed batteries.

I'm looking forward to seeing if the EU decision will push some companies to do this for their US versions too and revive the chart.


The EU directive doesn't even compel them to have those kinds of removable batteries in the EU, because being removable with commercially available tools is considered compliant [0]. The topic has been too obfuscated with hype pieces. Still, it would be nice to not have to break glass and melt glue to open up phones.

[0] https://repair.eu/news/making-batteries-removable-and-replac...


> The EU directive doesn't even compel them to have those kinds of removable batteries in the EU, because being removable with commercially available tools is considered compliant

This is a follow-up directive that goes further. https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...:

“Any natural or legal person that places on the market products incorporating portable batteries shall ensure that those batteries are readily removable and replaceable by the end-user at any time during the lifetime of the product. That obligation shall only apply to entire batteries and not to individual cells or other parts included in such batteries.

A portable battery shall be considered readily removable by the end-user where it can be removed from a product with the use of commercially available tools, without requiring the use of specialised tools, unless provided free of charge with the product, proprietary tools, thermal energy, or solvents to disassemble the product.

Any natural or legal person that places on the market products incorporating portable batteries shall ensure that those products are accompanied with instructions and safety information on the use, removal and replacement of the batteries. Those instructions and that safety information shall be made available permanently online, on a publicly available website, in an easily understandable way for end-users.”


Why disallow requiring thermal energy? Electric hair dryers are common in Europe.


Yeah, thank god. As long as it’s easy to remove and replacement batteries can easily be purchased by individuals, I want my phone and battery glued, thank you very much.

I like Apple's approach to removable battery glue, though it needs an extra tool. These days it should be easy to make a cheap USB-C PD powered thing that supplies a good DC voltage.


The electricity-controlled glue in Apple's iPhone is made by Tesa, a German glue company.


I don't see any Fairphone on the page, they are not sold in the US?


I'm not sure. I have not seen them on any large retailer in the USA like Amazon, Walmart, Newegg, BestBuy etc.

Maybe if someone here is in the USA and has bought one, they can chime in and tell where they got it from?


Yes, they are. The Fairphone 6 is available in the US through their official partner Murena.


You should add this phone to your list: https://en.wikipedia.org/wiki/Librem_5


My napkin-math approach to get a bird's eye perspective on the situation:

A $1T investment needs to produce on the order of $100B in yearly earnings to be a good investment.

Global GDP is about $100T.

So one way for things to work out for the AI companies would be if AI raises GDP by 1% and the AI companies capture 10% of the created value.
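
The arithmetic above can be written out as a small sketch. The figures are the comment's own round numbers; the 10% yield is only the ratio implied by "$100B on $1T", and the function names are made up for illustration.

```python
# Napkin-math sketch using the comment's round numbers (USD).
INVESTMENT = 1e12       # $1T invested in AI
REQUIRED_YIELD = 0.10   # assumption: ~10% yearly earnings counts as "good"
GLOBAL_GDP = 100e12     # ~$100T global GDP
GDP_UPLIFT = 0.01       # AI raises GDP by 1%
CAPTURE_SHARE = 0.10    # AI companies keep 10% of the created value


def required_earnings(investment: float, required_yield: float) -> float:
    """Yearly earnings needed for the investment to pay off."""
    return investment * required_yield


def captured_earnings(gdp: float, uplift: float, capture: float) -> float:
    """Yearly earnings if AI lifts GDP by `uplift` and vendors keep `capture` of it."""
    return gdp * uplift * capture


needed = required_earnings(INVESTMENT, REQUIRED_YIELD)
captured = captured_earnings(GLOBAL_GDP, GDP_UPLIFT, CAPTURE_SHARE)
print(f"needed:   ${needed / 1e9:,.0f}B per year")
print(f"captured: ${captured / 1e9:,.0f}B per year")
```

Both sides come out at roughly $100B per year, which is why the 1% uplift and 10% capture pair is the break-even case in this framing.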


At some point AI may deliver the level of net economic benefit you reference, but it's not entirely clear that we're there yet.

Right now much of the direct monetization occurs via OpenAI and Anthropic, who together have around $30B in annualized revenue. They are burning cash like crazy, though admittedly have potentially sustainable unit economics (gross margins around 40-60% before revenue share).

However, they need to spend a huge chunk of revenue on training. OpenAI spent something like $9b on training against around $13-14b in rev in 2025 (different from annualized rev) according to The Information. Anthropic's mix is supposed to be similar. Also implies a lot (maybe majority) of their compute spend is training.

If scaling laws falter, what happens to training spending? What happens to competitive degree of differentiation given Chinese open source models are a few months behind frontier? Then what happens to margins? It is very fragile.


The earnings do not need to come via direct monetization.

Google search revenue for example was over $200B in 2025. This revenue will be tightly coupled to the quality of their AI models in the future.


That is fair but doesn't do much to push back against the risk to the independent model vendors, particularly in consumer. This has represented a huge portion of the AI capex so far. OAI alone represented 2GW of compute in 2025. The point is they are in a fragile position, and so are the economics of AI DC spending in aggregate.

With respect to Google, I'd also wonder about the economics of AI search vs traditional.


Google's search revenue comes from ads, which depend somewhat on the quality and speed of the search result. Yeah, a better LLM could do it, but a better PageRank with NLP that actually works again could do it too.


Are you located in a country where Google does not yet show AI answers?

In most countries, AI answers are the central aspect of Google now. Not the ranked pages.


I use DDG instead of Google, but it does show answers. I don't go to Google for chatbots, I go to find answers, and more often than not I find myself unsatisfied with the LLM answer, so I end up diving past SEO spam (also LLM-written these days) to find where I need to go. It's very frustrating and I'm feeling very pessimistic about the future of the web. It seems to be atrophying.


That reminds me of the "Chinese marketing" strategy of a lot of Western companies 30 years ago when China's economy first opened up. There are a billion people in China, so if we can capture just 1% market share there then we'll make a fortune, right? Spoiler alert: it (mostly) didn't work.


Sometimes it works. Steve Jobs aimed for 1% market share with the iPhone:

https://youtu.be/VQKMoT-6XSg?t=4605

Now it is at 20%.


If I'm not mistaken, the article states that the investment is $1T annualized when taking software development costs into account [1], assuming the labs don't all suddenly decide to stop development.

That would mean earnings of ~ $1.1T would be required on that investment annually, so maybe on $2T of revenue, capturing 2% of the global GDP - so I'd estimate that GDP would need to go up more like 5-10% to justify this.

[1] https://substackcdn.com/image/fetch/$s_!Gf2t!,f_auto,q_auto:...
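
A rough sketch of what those numbers imply, taking the $2T revenue and ~$100T GDP figures from this thread at face value. Treating the revenue as a share of the newly created value is an assumption made here for illustration, and `implied_capture` is a made-up helper name.

```python
# Sketch: what capture share does $2T of yearly revenue imply
# for a given AI-driven GDP uplift? Figures are the thread's round numbers.
GLOBAL_GDP = 100e12        # ~$100T global GDP
REQUIRED_REVENUE = 2e12    # ~$2T yearly revenue from the comment


def implied_capture(uplift: float) -> float:
    """Share of AI-created value vendors must capture to reach the revenue."""
    value_created = GLOBAL_GDP * uplift
    return REQUIRED_REVENUE / value_created


share_of_gdp = REQUIRED_REVENUE / GLOBAL_GDP
print(f"revenue as share of GDP: {share_of_gdp:.0%}")
for uplift in (0.05, 0.10):
    print(f"GDP uplift {uplift:.0%} -> capture share {implied_capture(uplift):.0%}")
```

Under these assumptions, a 5-10% GDP uplift corresponds to capturing 40% down to 20% of the created value.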


10% capture seems highly unlikely. That level of capture is only possible for b2b high touch sales, aka "call-me" pricing.

For call-me pricing to work, you have to ensure that any sort of public sticker price is not a suitable alternative. You can do that by not having a sticker price, by making the sticker price so high that essentially nobody will buy it, or by finding a feature like OAuth that makes the public version infeasible for businesses.

And then you also have to maintain enough of a monopoly / oligopoly to sustain that level of pricing.

I don't think either of those two conditions will apply in the future.

AI providers now have a sticker price that provides basically all functionality, almost completely eliminating the opportunity for extremely high-margin b2b. They've decided a small slice of a large pie is bigger than a large piece of a smaller pie. I suspect that's true and will continue to be true in the future.

An oligopoly is difficult to sustain with more than 3 global players. Right now we seem to have 3 frontier models for coding that can and will charge more than commodity prices. However, there are open source non-frontier models that you can use for inference costs only, and even if those don't keep up, it seems likely there will be enough non-frontier models available that their pricing will also be at the commodity level. Those cheaper models will put significant downward pressure on frontier pricing.


I don't think we have seen "all functionality" yet.

We have not seen iterative AI use for example.

The use case, where you tell the model "Solve this task. Then solve it again. Keep the better solution, then solve it again. On and on. Tomorrow, show me the best solution.".

And also not the "Run a company on your own" use case.

Those might make people and companies use models full-time. The price of that will be way different from current subscription prices. The TCO of a single instance of a SOTA model is on the order of $100k per year.


I believe that you're arguing "1% GDP increase due to AI is too conservative" rather than against "capturing 10% of the value increase is possible".


I think a more realistic napkin math is a 10% GDP bump and 1% capture. You'll still find a lot of people who think we're going to get more than a 10% GDP bump from AI, but it'll definitely be fewer.

Will AI increase the rate of GDP growth by 0.5% or so over 20 years?
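
For scale, here is a quick compounding sketch. It assumes "0.5%" means 0.5 percentage points of extra growth per year; the baseline growth rate is a free parameter that is not taken from the thread.

```python
# Sketch: cumulative GDP level gain from a small extra yearly growth rate.
def cumulative_bump(extra_growth: float, years: int, baseline: float = 0.0) -> float:
    """Relative GDP level gain after `years` versus the baseline growth path."""
    with_ai = (1 + baseline + extra_growth) ** years
    without_ai = (1 + baseline) ** years
    return with_ai / without_ai - 1


bump = cumulative_bump(0.005, 20)
bump_with_baseline = cumulative_bump(0.005, 20, baseline=0.03)
print(f"no baseline growth: {bump:.1%}")
print(f"3% baseline growth: {bump_with_baseline:.1%}")
```

Either way the result is roughly a 10% higher GDP level after 20 years, so the question is about the same size of effect as the 10% bump.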

