Considerations about what goes on in agents internally will probably not be part of software development for long.
Personally, I already see LLMs and agents as black boxes. I give each feature request to multiple LLMs and then compare the results. I don't manually use "sessions" at all. I just look at the outcome. When I dislike it, I "git reset --hard", change my prompts and restart the feature request.
To keep an ongoing sense of which agents perform best, I keep a log and calculate an Elo score for how well each agent meets my demands. This score is what is important to me, not so much how the agent achieves it.
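For anyone who wants to try the same thing, here is a minimal sketch of how such a score could be computed from a log of pairwise comparisons. It uses the standard Elo update. The starting rating of 1000, the K-factor of 32, the log format and the agent names are all assumptions for illustration, not the setup described above.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's result was preferred, 0.0 if B's was, 0.5 for a tie.
    k (the K-factor) is an assumed value; tune it to taste.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


def rate_log(log, start=1000.0, k=32.0):
    """Replay a log of (winner, loser) pairs and return a ratings dict."""
    ratings = {}
    for winner, loser in log:
        r_w = ratings.get(winner, start)
        r_l = ratings.get(loser, start)
        ratings[winner], ratings[loser] = update_elo(r_w, r_l, 1.0, k)
    return ratings


# Hypothetical log: each entry is one feature request where the first
# agent's implementation was preferred over the second's.
example_log = [("agent_a", "agent_b"), ("agent_a", "agent_c")]
example_ratings = rate_log(example_log)
```

Each feature request given to N agents yields several pairwise results, so the log grows quickly even with a handful of requests.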
Unless we do our own benchmarks, we have to take all the marketing fluff from the frontier labs at face value, and all public benchmarks degrade eventually as labs optimize towards them. OP’s approach is wasteful because it is brute force, but the post says that an Elo score is kept, so this is also an experiment, and I don't see what's wrong with that. You learn which model performs well in which settings, which may save resources later. It's also wasteful to keep working with the wrong model/harness/tools for too long.
In an interactive session, adding "Fine, but make the button red" after the model generated a first solution more than doubles the tokens used, because the model now gets not only the original code and the feature request but also the updated code plus the change request as input tokens.
Sending a feature request to an LLM and then sending the feature request again with "The button shall be red" only doubles the tokens used.
I wrote my own agent, and it sends data to LLMs in this order: "General Prompts (How to write good code)" + "The Code" + "The Feature Request". This means the KV cache will be used even when the feature request changes.
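A rough sketch of that ordering, to show why the changing part goes last. The function names, the file separator format and the example contents are hypothetical. Real prefix caches work on tokens rather than characters, and whether a given provider reuses the prefix at all is an assumption here.

```python
def build_prompt(general_rules, code_files, feature_request):
    """Assemble the prompt with the stable parts first and the request last.

    code_files maps a path to its contents. Sorting the paths keeps the
    prefix byte-identical between runs as long as the code is unchanged.
    """
    parts = [general_rules]
    for path in sorted(code_files):
        parts.append(f"--- {path} ---\n{code_files[path]}")
    parts.append(feature_request)
    return "\n\n".join(parts)


def shared_prefix_len(a, b):
    """Length of the common prefix of two prompts, in characters."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return n


# Hypothetical inputs for illustration only.
rules = "Write small functions. Prefer clarity over cleverness."
files = {"chart.py": "def plot(data):\n    pass\n", "styles.css": "button {}\n"}

first = build_prompt(rules, files, "Add a legend to the chart.")
second = build_prompt(rules, files, "Make the button red.")

# Everything before the feature request is identical in both prompts,
# so that part is what a prefix cache could reuse.
reusable = shared_prefix_len(first, second)
```

If the feature request came first instead, the two prompts would diverge almost immediately and nothing after that point could be reused.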
And output tokens are usually way less than the input tokens.
So I think that my approach is very lightweight on token usage compared to an interactive session.
It would be interesting to measure it for the other agents out there. Sending a feature request two times vs an interactive session.
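Until someone measures it, the two cases can at least be compared on paper. This is a back-of-the-envelope sketch that counts input tokens only and ignores any caching. The token counts in the example are made-up numbers, not measurements.

```python
def interactive_input_tokens(code, request, output, change):
    """Input tokens for two turns in one interactive session (no caching).

    Turn 1 sends code + request. Turn 2 sends the whole history again:
    code + request + the model's first output + the change request.
    """
    first_turn = code + request
    second_turn = code + request + output + change
    return first_turn + second_turn


def resend_input_tokens(code, request, change):
    """Input tokens for sending the request twice from scratch (no caching).

    The second run sends code + request + the extra sentence, but not the
    output of the first run.
    """
    first_run = code + request
    second_run = code + request + change
    return first_run + second_run


# Assumed sizes: 10,000 tokens of code, a 100-token feature request,
# a 2,000-token first answer and a 10-token change request.
interactive = interactive_input_tokens(10_000, 100, 2_000, 10)
resend = resend_input_tokens(10_000, 100, 10)
extra = interactive - resend  # equals the size of the first output
```

Under these assumptions the difference is exactly the first output that the interactive session has to feed back in. Prompt caching changes how those tokens are billed, so the comparison looks different once a provider discounts the repeated prefix.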
That’s usually not true due to caching. It may be true if you leave a large gap in between, but if you send “make it red” right after, then it’s purely incremental.
> you will be surrounded by an ecosystem of devices, none of which stand alone, but are more like portals to interact with your agents
I would be really happy with my phone + headphones as the device I use most. But only if I could use Gemini (or ChatGPT or Grok or any other chat agent) in voice mode and say "SSH into my GitHub Codespace soandso and implement feature soandso." And it replies "Did it. I told copilot (or codex or whatever coding agent lives on that VM) to implement the feature".
And then a minute later I could ask it "Is copilot done yet?" and it replies "No, looks like it is still working on it". And then a minute later I ask again. It replies "Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?".
But it looks like none of the chat agents with voice interface have such a connector at the moment? An SSH connector would be the most useful. But a "GitHub Codespace connector" or something like that would also do.
I wonder if that will be a missing piece for long. If so, I would build an agent with voice mode and ssh connector myself. But I guess it should come out from the big guys any moment now?
> Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?
A verbal diff sounds practically useless. Does it first read out the entire left-hand base, and then read out the entire right-hand target? Does it say loudly "REMOVING ... ADDING ... "? How would it read out something like Struct->Field? This seems lower fidelity than a visual confirmation, and I just don't think that voice commands make sense with this kind of work.
It would tell me about the changes like a human would.
"It changed the plot function so it takes another parameter called linewidth. It also added an input field in the stylecontrols section where the user can ...".
How would you detect the presence of bugs in this scenario? How would you make sure the LLM isn't adding yet another useless, redundant function to the code base? Even if there isn't a bug in this PR, do you not want to be familiar with the actual shape of the code in case you need to dig through it while bug hunting later?
Every time I try to take a hands-off approach to the code like this, I come to regret it later. The code ends up bloated and labyrinthine. When I let it grow unabated, it becomes gradually more difficult for the LLM to understand the intended structure as the project becomes too big for the model to keep the whole thing in its context.
> How would you detect the presence of bugs in this scenario?
I would ask AI. "Did the last commit introduce any bugs or unintended consequences?". In fact I already use this prompt after every change I make manually.
> How would you make sure the LLM isn't adding yet another useless, redundant function to the code base?
By asking AI. In fact, I already run a long "Can you refactor anything in this codebase to reduce redundancy, improve readability, performance or maintainability" pretty regularly.
Are you ever reading the code? What do you do when the LLM can't fix a bug? Do you not wish you had a more intimate first-hand knowledge of the code when fixing things yourself?
Please don't tell me that never happens-- I've had one just in the last week and I use both OpenAI and Anthropic foundation models.
In fact, I usually let multiple LLMs implement the same feature, and then I compare them. I even run my own arena in which I calculate Elo scores for LLMs from my perspective of which one implemented features better.
Having the ability to control code agents via voice would not take away my ability to do that. But I think in the future, that will become less and less necessary. If we look back at this conversation in five years, it will look very archaic, and we will be used to having superhuman AI do everything for us. In 10 years, it will sound like a strange idea that humans were once fiddling with code to improve the quality.
Something something wasting machine cycles with a compiler.
Something something taking the crafts and the man out of craftsmanship to just get it out the door as quickly as possible.
All jest aside I mostly agree with you but I'd tack on another 20 years for a total of 30.
Though in this technological jump I don't think people are as excited (understandably) as when the teletype came on scene. I too like the potential but dislike the whole discourse around it, the ethics involved and the way it's deployed. Such is life I suppose.
I like how people think that if LLMs get to the point where they write code you can ship without reviewing it, that humans will still be in the loop "sshing into a code space" and "implementing features". Do you really think you'll even know what files are in that repo? Or that you'll be a necessary part of the process whatsoever?
I can tick files in Vim, those get concatenated into a prompt. Along with a feature request. Plus an instructions file that tells the LLM how to reply. Plus my general "rules for good code" file, plus one "rules for good code" file per language involved, plus a project specific overview file. The LLM then answers with a list of changes it wants to make to the code. My tooling then applies those changes and I look at them via "git diff". If I like it, I commit. If not, I change one of the prompts and start the process again.
Instead of replying with code changes, the LLM can also decide to request more files. I wrote a little DSL for that.
I described the beginnings of this workflow last July:
how would claude code work from a browser environment?
If you want an agent (like OpenClaw) to write software, why have it use another agent (Claude Code) in the first place? Why not let it develop the software directly? As for how that works in a browser - there are countless web based solutions to write and run software in the cloud. GitHub Codespaces is an example.
I personally won't allow full control for a long time.
On the other hand LLMs have been a very good tool to build bespoke tools (scripts, small CLI apps) that I can allow them to use. I prefer the constraints without having to think about sandboxing all of it, I design the tools for my workflow/needs, and make them available for the LLM when needed.
It's been a great middle ground, and actually very simple to do with AI-assisted code.
I don't "vibecode" the tools though, I still like to be in the loop acting more as a designer/reviewer of these tools, and let the LLM be the code writer.
No, it doesn't. I only run agents in a dedicated development environment (somewhat sandboxed in the file system), but that's how I've used them since the beginning. I don't want them accessing my file system as a whole; I only need them to look at code.
Don't think a web-based dev environment would be enough for my use case. I point agents to example code from other projects in that environment to use as bootstraps for other tools.
Browser plugins have a security problem that's easy to miss: the agent runs inside your existing browser profile. That means it has access to your active sessions, stored credentials, autofill data — everything you're already logged into. A sandboxed machine is actually the safer primitive for untrusted agent tasks, not the more paranoid one. I work on Cyqle (https://cyqle.in), which uses ephemeral sessions with per-session AES keys destroyed on close, because you want agents in a cryptographically isolated context — not loose inside your personal browser where one confused-deputy mistake can reach your bank session.
Every week there is a news article about some script kiddie who shot themselves in the foot after vibe coding their production-ready app, without the help of any senior engineer, because, let's face it, who needs them, right? Only to end up deleting their production database, or leaking their credentials on an HTML page, or worse, exposing their sensitive personal data online.
I'm actually pro-agents and AI in general - but with careful supervision. Giving an unpredictable (semi) intelligent machine the ability to nuke your life seems like the dumbest idea ever and I am ready to die on this hill. Maybe this comment will age badly and maybe letting your agents "rm -rf /" will be the norm in the next decade and maybe I'll just be that old man yelling at clouds.
> German chancellor Friedrich Merz ... lashed out at German workers to “simply do a little more,”
Germany literally pays people to do nothing.
A friend of mine, an engineer who works in the German car industry, recently told me that nowadays he has a lot of free time. Because the company he works for has so few orders that the company is granted "Kurzarbeitergeld" - the government pays 60% of the salary if the employees work less.
That blew my mind. If I had fewer orders, I would work more to increase the quality of my product and my efficiency. Working less as a reaction to losing market share seems completely counterproductive to me.
When my eldest daughter was in high school (~2010, Argentina) there was a provincial policy where if every single student had a result below a certain score in a test, the scores had to be reassessed against the maximum result.
The resulting situation here was that she was constantly bullied into underperforming. Both cases are actually similar in that each individual has a personal incentive to underperform - the difference is that in your friend's case the policy is granted at the company level so no single employee can defect and break it for the rest, while in my daughter's case one high scorer could invalidate the reassessment for everyone, which is exactly what made defection punishable and the bullying emerge naturally.
This is the natural result of "equity" which is the academic jargon term for "forced equality of outcome". High achievers are attacked. People who push us forward are demonized. The low achievers are never pushed to be better. And the average drops.
It’s not that absurd and happens all over the world in university systems. I had a Comp. Sci. Professor that taught assembly and graded on a curve. As you might imagine the one guy that was a wizard at assembly caught flak from the unwashed masses.
I had another professor that not only did a curve but dropped statistical outliers to prevent this problem, he literally explained his system on Day 1 of the course. This was 15+ years ago and by no means a new idea.
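To make the difference concrete, here is one possible reading of "curve, but drop outliers" as a small sketch. The exact rule that professor used is not stated, so everything here is an assumption: scores are rescaled against the top score, and a score counts as an outlier when it lies more than two population standard deviations above the mean.

```python
from statistics import mean, pstdev


def curve_scores(scores, z_cutoff=2.0):
    """Rescale scores so the best non-outlier score becomes 100.

    A score is treated as an outlier if its z-score is above z_cutoff
    (an assumed rule). Outliers still get a grade, capped at 100, but
    they no longer set the reference point for everyone else.
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        kept = list(scores)
    else:
        kept = [s for s in scores if (s - mu) / sigma <= z_cutoff]
    top = max(kept)
    if top <= 0:
        return [float(s) for s in scores]
    return [min(100.0, 100.0 * s / top) for s in scores]


# Made-up class: eight students between 40 and 55, one wizard at 98.
raw = [40, 45, 50, 55, 52, 48, 47, 53, 98]
with_outlier_dropped = curve_scores(raw)
plain_curve = curve_scores(raw, z_cutoff=float("inf"))
```

With the plain curve, the single high scorer sets the reference and pulls everyone else's curved grade down, which is the incentive to pressure that student. With the outlier dropped, the rest of the class is scaled against the best of the main group instead.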
I tried to search for it, but even the 2 documents that superseded the one from around the time my daughter was at school are not available.
I mean, the site doesn't even have a valid secure certificate so...
In the site below (In Spanish) you can search for 10/2019 and a cursory translation of the document title will show that this is the proper document (For 2019 onwards, the replaced doc 04/2014 isn't available either)
The problem is they don't get rid of their ICE cars any more - most of the world is transitioning to EVs and China - their previously biggest importer - has strict EV quotas and no one wants to buy German EVs that are too expensive and less capable than Chinese made EVs.
Germany was ahead in EVs and solar - both industries have been cut off by conservative/free-market ICE lobbyists and these 2 huge global markets are now dominated by China instead.
> has strict EV quotas and no one wants to buy German EVs that are too expensive and less capable than Chinese made EVs
Both China and the US have enacted trade barriers against EU originated auto drive goods.
A Chinese VW ID4 is manufactured in Shanghai and an American one in Tennessee. And that is the crux of the issue - consumers are still open to buying a German badged product, but it won't be "Made in Germany".
And where else can Germany export?
The individual EU states are protective about exports by leaning hard on nationalism and union support, as seen in France (Renault+Stellantis) and Italy (Stellantis). India demands China- and US-style JVs and domestic manufacturing as well, even despite the EU FTA. Japan and SK prefer buying domestic, Russia is blocked due to sanctions, ASEAN+Africa is already flooded with Chinese, Japanese, and Indian manufactured cars, and South America is flooded with Chinese, American, Japanese or domestically manufactured cars.
Germany Inc will remain in Germany as long as Germany makes itself attractive. Otherwise, they will leave, as they have already done so for the US, China, the CEE, and increasingly India.
Broadly speaking they're not _as far away_ from Chinese EVs as we all make it here to be. However, the problem is that the stockholders of all these companies expect 500% YoY growth which isn't sustainable. Not to mention the cost of a car has grown significantly while all German cars have degraded substantially in quality. For example, you can expect Porsche to sell FOREVER the number of cars they manage to sell in the 00s and 10s:
Broadly speaking internet commentary about cars is crap and the differences between any two competing products (i.e. ignore the dishonest comparisons between a Chevy Cobalt and a Toyota Landcruiser) are way, way smaller than the online fanboys, shills and people with a very expensive purchasing decision they want to feel validated in will lead you to believe.
Depends on your position I guess. If you are a worker at a conveyor belt, it doesn't make sense to work more to produce more cars nobody needs. I think originally this policy was designed to save jobs during temporary downturns, not to save industries going downhill
If you start with a couple hundred thousand $, I can see that. But if you start with like $10,000, how do you stop market fluctuations from eating your money before your correct prediction turns into runaway compounding gains?
That is true for R&D, not for factory workers. Germany has quite strong workers' rights, so mass layoffs are not a possible way to save money when orders are lacking.
Essentially companies get some of the money back they and their employees paid as taxes.
> If I had fewer orders, I would work more to increase the quality of my product and my efficiency. Working less as a reaction to losing market share seems completely counterproductive to me.
That may work if you are a sole proprietor or small business person, but that's not how shareholder owned corporations work.
A sole proprietor is willing to work more if business drops (effectively lowering their compensation rate) because they are the beneficiary of any future gains that may (or may not) result from their short term sacrifice. If they want their employees to do the same they have to give them the same deal.
A large corporation can't easily make its employees work much longer for the same pay (except in the very short term), nor can it easily get shareholders to be OK with increasing spending on labor. This usually ends with massive layoffs when it can't sustain itself anymore.
That's one reason that smaller companies can be more nimble.
Actually this is not as easy as it sounds. Quite a few companies opt not to go into 'Kurzarbeit' because it means coming under extra scrutiny, as it is only there to stop major layoffs, which would cost our social insurance system even more. There are typically enough mechanisms to make it unattractive enough that even unions accept unpaid leave instead.
IMHO there are bigger productivity problems. After COVID, sick leave has massively increased. Many women do not work as much as they want because child care is still sketchy. There is often simply no incentive to work more because engineering careers are quite limited: general pay compared to expenses is really good, but top performers earn considerably less than in other countries.
You cannot indefinitely ask employees to do more for less money. According to law and job contracts you can only do so much unpaid overtime. Usually 10h instead of 8h, and only when justifiable. Everything else is illegal. Paying 60% and working less is hoping for the environment or situation at the market to change, whether or not that is realistic.
Depends on where you work in the industry, there's a huge level of division of work. Upstream departments should work more on new products and marketing etc. But a little more downstream, there isn't much to do if not enough cars are ordered.
The intention of Kurzarbeitergeld is to prevent large layoffs. I honestly can't tell if that makes sense in the long run, but it seems reasonable for a political party trying to make it to the next term.
How does this blow your mind? Your suggested action is illogical since the standard way to deal with this problem is to fire workers, but if demand is seasonal, it means you constantly have to fire and rehire, so instead of firing full workers, they just fire part of their working hours.
The crazy part is that the government subsidizes this, which creates perverse incentives, but that's a different problem.
So let me get this straight: you have exactly one data point, and the second data point you are using is a German chancellor widely regarded as one of the worst by many measures. Right?
This is FUD. They said the same about the Greeks in 2008, and it is complete BS. In any given org past a certain size, you'll find people who slack a lot and people who work for three. Unfortunately that seems to be the nature of large orgs, nothing special about Germans...
PS: I'm having déjà vu. This is the exact same narrative Greek politicians used against the Greek population to justify them becoming poor overnight.
The one thing that interests me most when it comes to laptops these days is weight. So I jumped right into the tech specs section and looked it up. Since this is the "Air" laptop of the company that is popular for thin and lightweight devices, my hopes were high.
But ...
The 13 inch version is heavier than a ThinkPad X1 Carbon. Which has a 14 inch screen and can run Linux.
I bought a ThinkPad X1. Had to send it back for repairs three times in the first year, including a complete motherboard replacement, and it died again immediately after the warranty expired. Been a $2800 door stop since then. The case is flimsy plastic that gets beat to crap easily. The trackpad is over-sensitive in all the wrong ways which makes it hard to use as an actual laptop. Plus it's weaker and slower than an Air. Also unbearably loud and unbearably hot.
I don't like Apple as a company and I don't particularly like MacOS, but no one except Apple makes a laptop worth a damn.
Was it a Gen 1 device? I bought a Thinkpad X13 Gen 1 many years ago and it kept having blue screens from RAM errors and other problems. Eventually after many warranty attempts and motherboard replacements they sent me a new X13 Gen 4. This has been running Ubuntu with no problems for 4 years now, it might be more a "lemons" phenomenon than a general rule. Also, AFAIK, the case is metal with a "soft-touch" coating.
The Apple ARM processors are still in a league of their own but personally I'm not willing to give up my OS freedom of choice for that advantage.
Not my experience in the slightest, after two decades of personal thinkpads and around 20 issued to my team.
Also if you'd just spent that extra 120 bucks for the 3 year onsite warranty, you'd have a lenovo technician replacing your motherboard at a location of your choice the next working day.
Very different experience here. I have an X1 Extreme Gen 4 since 2022 (running Linux), and have had zero hardware issues so far. The only thing that's annoying is that it gets quite warm on the hand rest.
Eh. Just simply on stability and life, beyond CMOS battery and laptop battery changes, my 5x 2015-2018 Lenovos are working like a charm. I love the plastic case, it flexes and catches falls better than the Mac. The MBPs have fallen down and dent like crazy, leak electricity through the metal body, weigh like crazy, and still there is no OS freedom and no free app store, and you have to rely on "homebrew"? It is wild that we are relying on "home" brew to make a machine from one of the richest companies in the world palatable.
> but no one except Apple makes a laptop worth a damn.
That's pure nonsense. I'm a fan of the Asus ExpertBooks myself which seem to be largely ignored in these discussions. They weigh about 2 pounds, 15mm thick, they don't overheat, about 15 hours of battery, and pretty damn durable.
The Air is going to run laps around the X1, in literally every benchmark you can come up with besides "it's not open source". I have that same processor in a much bulkier ThinkPad and it thermal throttles instantly doing basic office multi-tasking, with the fan running constantly.
Same. I used to be ride-or-die for my Thinkpad Trackpoint™ keyboards, I even had an external USB one. Eventually I started to feel the RSI in the top of my hand. Haven't had any issues with a Macbook trackpad.
The last time I was excited about the performance of local computers was in the 90s I think.
Modern laptops are so insanely fast. Not sure if they are 2x, 10x or 100x faster than I need them to be. But I never hear fans. I never have to wait for the machine these days.
The keyboard and touchpad experience are nearly identical between the two... not nearly as good as old IBM Thinkpads used to be, but that's a trade with IMO the much better touchpad experience on Mac.
That said, I just don't think I can keep buying Apple hardware, just not a fan of the company... I only begrudgingly use Android as there isn't a reasonable, more open option.
I'll probably stick with my M1 air for personal use a couple more years then pass it on. My daughter is still using my now 13yo rMBP with 16gb/512gb. I wish the ram and storage upgrades on mac weren't so overpriced.
Apple has their supply lines locked in a few years ahead of time... they likely won't see downward pressure for a couple years still. Not that they might not still take advantage... though downward sales pressure is a trade off too.
I like the aluminum body a lot. I'm not particularly clumsy, but each of my macbooks ends up with some fall damage at some point over the 5+ years that I have it.
When I used to be assigned a plastic Dell work laptop, I dropped one onto the carpeted floor of my office because I thought it was going into the padded sleeve of my backpack, and that cracked the case and broke the screen. I've accidentally yoinked my MBA (last Intel one they made) off my desk, and while it dented the body, nothing broke. That is now my drum computer, and it gets regularly pelted with drumsticks when my grip tires.
My father recently dropped my macbook air from the car essentially on concrete bricks.
It has just gotten a single dent of less than 0.5 cm, and it's on the side (although the damage happened while the laptop was closed, so some of it is just above the aluminium shell of the display).
To be honest, it's barely visible, everything is working, and there was no damage to the display or anything else, for what it's worth.
I usually don't like Apple, but damn, the MacBook Air is tiny and can take some damage.
I am still just a little sad about the damage, now that we talk about it, because the laptop was in perfect condition beforehand. But it's incredibly better than any other laptop, at least in that respect. Gonna use this laptop for a long time (M1 Air).
As someone clumsy, I'm so grateful that my MacBook Air can take a beating. It has one slight dent of about 1mm in the 4 years I've had it and I definitely drop it or knock it off a desk or something a few times a year.
I'll take the extra weight of aluminum (0.3lb, 130g). Yes, someone might say the ThinkPad X1 Carbon is 14", but the 13" MacBook Air actually has a 13.6" screen.
If I were in the market for a PC laptop, I'd definitely take a look at the ThinkPad X1 Carbon, but I'm also not worried about the weight of my MacBook Air. The X1 Carbon Intel ones are on sale right now since Panther Lake will be a huge upgrade coming soon, but even on clearance they aren't cheap. An X1 Carbon with 32GB RAM and 1TB storage (Ultra 7 268V, the cheapest one due to the sale) will cost $1,679 while a similar MacBook Air will cost $1,699 - and the M5 has 48% better single-core performance and 56% better multi-core performance (Geekbench). A 16GB/512GB (Ultra 5 225U) X1 Carbon is $1,538 compared to $1,099 for a MacBook Air - and the M5 has a 74% single and multi core advantage there.
Panther Lake might narrow the performance gap, but early indicators suggest that's not the case. Even the top of the line Ultra X9 388H sees the M5 with a 36% single-core advantage while the Ultra X9 388H gets 3% faster multi-core. And I'm not sure the higher wattage "H" processors work for something like an X1 Carbon.
The highest non-H Panther Lake processor (Ultra 7 365) sees the M5 get 51% better single-core and 58% better multi-core. Maybe we'll see better, but it looks like Intel isn't closing the gap in 2026.
Does it? In my case, it was my father who dropped my mac but luckily everything was all safe with tis but a scratch. So perhaps that can be taken into factor as well that its more than one variable.
That being said, I am pretty clumsy but I have never dropped any hardware except a dumb phone which I threw out a lot and it was so small and tiny but it never had any problem.
And then one day I dropped it from top just a little bit and let it drop/slide inside my bag (like a cushion) and that day it died. I recently asked someone about it and turns out that its battery got inflated.
The aluminium chassis cannot be used for heat dissipation without risk of harming users. Which is why there is a "macbook air performance mod" to add thermal-interface-material (instead of thermal insulation) to turn the chassis into a heatsink.
The last generation of Intel Macbooks was so bad... the i9 I was assigned from my job at the time would constantly go in and out of thermal throttling, making the whole experience effectively useless... It was also so locked down, I couldn't apply any mods to be able to underclock/volt the thing to something reasonable.
I really do hope that Linux becomes an option in more workplaces without being too locked down for developers.
I believe we are talking about slightly different things. Yes if they thermally coupled the body to the processor, then a small patch of the body would get very hot, burning the user.
However, the fact that the aluminum gets hot during prolonged use means that it is acting as a heat sink and cooling the CPU compared to a body made of plastic. Thermodynamics, it's the law!
> However, the fact that the aluminum gets hot during prolonged use means that it is acting as a heat sink and cooling the CPU compared to a body made of plastic. Thermodynamics, it's the law!
Not really. It's picking up "stray heat" that is radiated from the copper heatsink inside and conduction from the air in the fan system. It does not improve cooling the processor in any kind of manner. If it were plastic, the plastic would get warm too. Maybe it'll be a 2 degree difference.
The original Air lineup was thinner in the front and seemed a little lighter. The thicker front on newer airs gives more battery life, but I'm not a fan of it.
The thinness at the front was a bit of a hack though wasn’t it? So Steve Jobs could make it look good in photographs. I’d take the extra battery life any day.
The final version of the “wedge” Air case was an amazing piece of physical design. The lid had a large-radius complex curve that perfectly controlled reflections. The bottom case had a curve that made it look like the machine was hovering above the desktop from almost any angle. Calling that a “hack” is sort of like calling it a “hack” that a Ferrari looks fast even when it’s parked.
The new designs are overtly boringly utilitarian. I would say they intentionally look ugly. I guess this must have been intended as a marketing signal.
And it seems like it’s working since you think the new design delivers better battery life. It doesn’t! The 13-inch M1, M2, M3, M4, and M5 MacBook Airs are all specced for 18 hours of battery life.
I think it's a matter of the chonkers feeling like you're getting what you're paying for. "This thing is so expensive! WHY is it so thin?"
Of course the zeitgeist keeps changing and what made sense yesterday might look like madness for those that aren't following things closely. As for myself, I very much prefer "slightly chonkier, but better heat dissipation" (coming from owning an intel mb pro and using it on my lap often).
I have the M1 MBA and M5 MBP. The wedge MBA feels noticeably thinner and the MBP feels kind of chonky in comparison. It's a bigger difference moving them one-handed than the specs would indicate.
I'm in the same boat. I have one of the original M1 MacBook airs, and the thicker front feels like overall a downgrade in hardware. Going up to higher ram amounts might be good for some of my datasets, but it's not needed for any software I run.
So I guess I'll wait for the next cycle and hope they return to the "Air" idea again.
> The 13 inch version is heavier than a ThinkPad X1 Carbon
And costs ~800 more for 16Gb/512 with a slower CPU and worse battery life.
As someone who spends his life on the road with a laptop, I strongly feel that anything that works for you under 3lbs is the sweet spot. The difference between 2.2 and 2.7lbs is minuscule in the grand scheme of my backpack.
I really like my X1 Carbon gen 7, aside from the bizarre Ethernet "port" (it has built-in Ethernet, but they didn't have room for RJ45, so instead of just telling you to buy a USB one it's on a dongle that blocks one of its two USB-C ports when plugged in, eliminating the advantage of "doesn't use a USB port"). But aside from fantastic Linux support, it's got little to recommend it over a similar-vintage MBA, which has a much better look and feel.
I'm in the same boat and finding it disappointing.
For people saying this machine is so much faster, I don't care. My situation isn't the norm, but we're on HN. I have a powerful desktop that's my main compute machine and my laptop is a terminal. I need a web browser, whatever corporate shovelware I need, and an SSH connection (and Tailscale). If I wanted to do real work locally I wouldn't be getting an Air.
While realizing I'm not the typical user, it's not like the typical Air user needs much compute anyways. The general public just uses web browsers.
Though one thing I'd love is if they could add just a little distance between the keyboard and screen so my screen doesn't get so dirty constantly... doesn't anyone use lotion at Apple?
Man, is it empty these days. The chart used to be pretty full. Now it only has about 1% of all phones that are in the Product Chart database, as the other 99% have fixed batteries.
I'm looking forward to seeing if the EU decision will push some companies to do this for their US versions too and revive the chart.
The EU directive doesn't even compel them to have those kinds of removable batteries in the EU, because being removable with commercially available tools is considered compliant [0]. The topic has been too obfuscated with hype pieces. Still, it would be nice to not have to break glass and melt glue to open up phones.
> The EU directive doesn't even compel them to have those kinds of removable batteries in the EU, because being removable with commercially available tools is considered compliant
“Any natural or legal person that places on the market products incorporating portable batteries shall ensure that those batteries are readily removable and replaceable by the end-user at any time during the lifetime of the product. That obligation shall only apply to entire batteries and not to individual cells or other parts included in such batteries.
A portable battery shall be considered readily removable by the end-user where it can be removed from a product with the use of commercially available tools, without requiring the use of specialised tools, unless provided free of charge with the product, proprietary tools, thermal energy, or solvents to disassemble the product.
Any natural or legal person that places on the market products incorporating portable batteries shall ensure that those products are accompanied with instructions and safety information on the use, removal and replacement of the batteries. Those instructions and that safety information shall be made available permanently online, on a publicly available website, in an easily understandable way for end-users.”
Yeah, thank god. As long as it’s easy to remove and replacement batteries can easily be purchased by individuals, I want my phone and battery glued, thank you very much.
I like Apple's approach to removable battery glue, though it needs an extra tool. These days it should be easy to make a cheap USB-C PD powered thing that supplies a good DC voltage.
At some point AI may deliver the level of net economic benefit you reference, but it's not entirely clear that we're there yet.
Right now much of the direct monetization occurs via OpenAI and Anthropic, who together have around $30B in annualized revenue. They are burning cash like crazy, though admittedly have potentially sustainable unit economics (gross margins around 40-60% before revenue share).
However, they need to spend a huge chunk of revenue on training. OpenAI spent something like $9B on training against around $13-14B in revenue in 2025 (different from annualized revenue) according to The Information. Anthropic's mix is supposed to be similar. This also implies that a lot (maybe the majority) of their compute spend is training.
If scaling laws falter, what happens to training spending? What happens to the degree of competitive differentiation, given that Chinese open source models are only a few months behind the frontier? Then what happens to margins? It is very fragile.
That is fair but doesn't do much to push back against the risk to the independent model vendors, particularly in consumer. This has represented a huge portion of the AI capex so far. OAI alone represented 2GW of compute in 2025. The point is they are in a fragile position, and so are the economics of AI DC spending in aggregate.
With respect to Google, I'd also wonder about the economics of AI search vs traditional.
Google's search revenue comes from ads, which depend somewhat on the quality and speed of the search results. Yeah, a better LLM could do it, but so could a better PageRank with NLP that actually works again.
I use DDG instead of Google, but it does show answers. I don't go to Google for chatbots; I go to find answers, and more often than not I find myself unsatisfied with the LLM answer, so I end up diving past SEO spam (also LLM-written these days) to find where I need to go. It's very frustrating and I'm feeling very pessimistic about the future of the web. It seems to be atrophying.
That reminds me of the "Chinese marketing" strategy of a lot of Western companies 30 years ago when their economy first opened up. There are a billion people in China, so if we can capture just 1% market share there then we'll make a fortune, right? Spoiler alert: it (mostly) didn't work.
If I'm not mistaken, the article states that the investment is $1T annualized when taking software development costs into account [1] if the labs don't all suddenly decide to stop development.
That would mean earnings of ~ $1.1T would be required on that investment annually, so maybe on $2T of revenue, capturing 2% of the global GDP - so I'd estimate that GDP would need to go up more like 5-10% to justify this.
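The arithmetic in the comment above can be written out as a rough sketch. The round $100T figure for global GDP and the 20-40% share of added value kept by vendors are assumptions, chosen only because they reproduce the 2% and 5-10% numbers quoted; they are not taken from the article.

```python
# Napkin math only. WORLD_GDP_T is an assumed round number for global GDP,
# and the capture rates below are illustrative assumptions.
WORLD_GDP_T = 100.0  # trillions of USD (assumption)


def revenue_share_of_gdp(revenue_t, gdp_t=WORLD_GDP_T):
    """Revenue expressed as a fraction of global GDP."""
    return revenue_t / gdp_t


def required_gdp_bump(revenue_t, capture_rate, gdp_t=WORLD_GDP_T):
    """GDP increase (as a fraction) needed if vendors keep
    `capture_rate` of the value they add to the economy."""
    return revenue_t / (capture_rate * gdp_t)


revenue_t = 2.0  # the ~$2T of revenue mentioned in the comment
print(f"share of GDP: {revenue_share_of_gdp(revenue_t):.1%}")
for capture in (0.4, 0.2):
    bump = required_gdp_bump(revenue_t, capture)
    print(f"capture {capture:.0%} -> GDP bump needed {bump:.1%}")
```

Under these assumptions, keeping 40% of the added value requires about a 5% bump and keeping 20% requires about 10%.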
10% capture seems highly unlikely. That level of capture is only possible for b2b high touch sales, aka "call-me" pricing.
For call-me pricing to work, you have to ensure that any sort of public sticker price is not a suitable alternative. You can either not have a sticker price, make the sticker price so high that essentially nobody will buy it, or find a feature like OAuth whose absence makes the public version infeasible for businesses.
And then you also have to maintain enough of a monopoly / oligopoly to sustain that level of pricing.
I don't think either of those two conditions will apply in the future.
AI providers now have a sticker price that provides basically all functionality, almost completely eliminating the opportunity for extremely high-margin b2b. They've decided a small slice of a large pie is bigger than a large piece of a smaller pie. I suspect that's true and will continue to be true in the future.
An oligopoly is difficult to sustain with more than 3 global players. Right now we seem to have 3 frontier models for coding that can and will charge more than commodity prices. However, there are open source non-frontier models that you can use for inference costs only, and even if those don't keep up it seems likely there will be enough non-frontier models available that their pricing will also be at the commodity level. Those cheaper models will provide significant downward pressure on frontier pricing.
I don't think we have seen "all functionality" yet.
We have not seen iterative AI use for example.
The use case, where you tell the model "Solve this task. Then solve it again. Keep the better solution, then solve it again. On and on. Tomorrow, show me the best solution.".
And also not the "Run a company on your own" use case.
Those might make people and companies use models full-time. The price of that will be way different from current subscription prices. The TCO of a single instance of a SOTA model is on the order of $100k per year.
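The "solve it again, keep the better solution" loop described above can be sketched as follows. This is only an illustration: `solve` and `score` are hypothetical callables standing in for a model call and for whatever evaluation decides which solution is better; nothing here is a real agent API.

```python
from typing import Any, Callable, Tuple


def iterate_best(
    solve: Callable[[str], Any],    # hypothetical: asks a model for a solution
    score: Callable[[Any], float],  # hypothetical: higher means better
    task: str,
    rounds: int,
) -> Tuple[Any, float]:
    """Solve `task` `rounds` times and keep the best-scoring solution."""
    best = solve(task)
    best_score = score(best)
    for _ in range(rounds - 1):
        candidate = solve(task)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Stand-ins for demonstration: three canned "solutions", scored by value.
canned = iter([3, 9, 5])
best, best_score = iterate_best(
    solve=lambda task: next(canned),
    score=float,
    task="demo task",
    rounds=3,
)
print(best, best_score)
```

The cost of such a loop grows linearly with `rounds`, which is why full-time use would be priced very differently from a flat subscription.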
I think more realistic napkin math is a 10% GDP bump and 1% capture. You'll still find a lot of people who think we're going to get more than a 10% GDP bump from AI, but it'll definitely be fewer.
Will AI increase the rate of GDP growth by 0.5% or so over 20 years?
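For scale, an extra 0.5 percentage points of growth compounds to roughly a 10% higher GDP level after 20 years. A minimal sketch, assuming the extra growth simply compounds annually on its own (a simplification that ignores the baseline growth rate):

```python
def cumulative_bump(extra_growth: float, years: int) -> float:
    """Relative increase in GDP level from `extra_growth` per year,
    compounded annually over `years` years."""
    return (1.0 + extra_growth) ** years - 1.0


bump = cumulative_bump(0.005, 20)
print(f"{bump:.1%}")  # about 10.5%
```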
Personally, I already see LLMs and agents as blackboxes. I give each feature request to multiple LLMs and then compare the results. I don't manually use "sessions" at all. I just look at the outcome. When I dislike it, I "git reset --hard", change my prompts and restart the feature request.
To have an ongoing sense of which agents perform best, I keep a log and calculate an ELO score of which agents meet my demands best. This score is important to me, not so much how the agent achieves it.
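A score like this could be computed from a log of pairwise comparisons using the standard Elo update. The sketch below is a guess at the general shape, not the commenter's actual setup: the agent names, the starting rating of 1500, the K-factor of 32, and the log format are all assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift rating points from loser to winner, in place."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical agents, all starting at the same assumed rating.
ratings = {"agent_a": 1500.0, "agent_b": 1500.0, "agent_c": 1500.0}

# Hypothetical log: each entry is (preferred agent, other agent)
# for one feature request.
log = [
    ("agent_a", "agent_b"),
    ("agent_a", "agent_c"),
    ("agent_b", "agent_c"),
]

for winner, loser in log:
    update_elo(ratings, winner, loser)

for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.0f}")
```

Because each update moves the same number of points from loser to winner, the total rating stays constant, and only the outcome of each comparison matters, which fits treating the agents as black boxes.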