Hacker News | zbentley's comments

Kind of. Those exist, but because Linux’s formal ABI is syscalls and not libraries that combine them in known-safe ways, the clone speedups that make fork faster are a confusing and fragile API for low-level programmers to use.

That, and even those clone-without-pagetable-copy improvements leave a lot of slowness on the table. Being able to skip even disable-able functionality intended for fork would simplify code. Also, for programs that launch the same subprocess many times, a better API might allow caching away some of the pre-entrypoint initialization of exec.
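For a concrete picture of the "library that combines the syscalls in a known-safe way" option, here is a minimal sketch using `os.posix_spawn`, Python's wrapper over the C library's `posix_spawn`. As far as I know, glibc implements that call with a vfork-style `clone` rather than a full fork, but treat that as an assumption about your libc. The helper name `spawn_and_wait` is made up for illustration.

```python
import os
import sys


def spawn_and_wait(argv):
    """Launch argv[0] via posix_spawn and return the child's exit code."""
    # posix_spawn takes an explicit path (no PATH search), the argument
    # vector, and the environment mapping for the child process.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    # Convert the raw wait status into a shell-style exit code (Python 3.9+).
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exited with", code)
```

Note that this still pays the full exec-time initialization on every launch; nothing here caches pre-entrypoint work.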


> I will acknowledge industrialization improved people's access to wealth and materialism.

And reduced illness, increased education, increased access to better nutrition, increased lifespan, increased able lifespan (knees/back/teeth don’t give out as early), and lots more.

Like, even if I grant that this replaced human connection (and I’m not sure that’s true, nor am I sure if it is meaningfully true—access to water replaces thirst, too), some very substantial benefits were acquired in return.


The theory that improved health and safety and lifespan will shrink the urge to procreate is so far-fetched I find it hard to imagine. The longer you live, the more likely you are to seek connection. It would be easier to imagine that long lifespan and better health make people less attached to their spouse.

> improved health and safety and lifespan will shrink the urge to procreate

Not what I said at all. Note the “even if I grant … (which I don’t…)”.


Oh, I misunderstood. You were only talking about the greater positives of industrialization, to counter ideas against it. That is true without debate. Material wealth going up, plus improved health conditions, are all positives. Material wealth replacing human relationships is not.

Thanks for clarifying and amending.

We went too far, though: the main causes of death in the West are now all due to overconsumption, more than 50% of Westerners are overweight, etc.

> there's a reason authoritarians across the world are banning abortion and targeting birth control

I don’t think that’s because of birth rate decline. While authoritarians give lip service to that occasionally, it’s never their primary cited reason (which is usually some combo of religion, purported return to “traditional” prosperity via reduced promiscuity, aggression against feminist political opponents, etc). Also, most authoritarians aren’t that long-term in their goals.


Eh, that argument works on any claim and is nonfalsifiable-ish, so I think it can be ignored.

People buying more chocolate ice cream than vanilla? Could be changing preferences or Hersheys marketing, or it could be undetected brain worms. People voting for one political party over others? Could be that party is campaigning/governing in a more popular way, could be brain worms.

If there’s evidence of contaminants or whatever influencing behavior strongly enough to change large scale demographic trends, then present it. Otherwise, your best chance at good data is to take people at their word when they say why they do things.


We know some of the pharmaceutical residues in our sewage turn frogs gay (that really happened, that wasn't AJ making something up). We know pharmaceuticals can greatly affect people's sex drive, general mood, and other psychological factors. It's definitely not a stretch to guess we might be doing it to ourselves.

A few nice things about doing this in no particular order:

Embedding would make local dev/CI integration testing convenient.

Embedding replicated Redis with each application instance would give you HA benefits while avoiding infra-management complexity.

Embedded Redis (even via local RPC) is still going to be faster than a lot of languages' or frameworks' built-in data structures. Large array operations in, say, Python are gonna be slower than RPCing to Redis (assuming that the data structures are built gradually and not built all at once); to beat Redis you'd have to use numpy or something, which is definitely preferable, but is extra work if your app already uses Redis for other things.

Just like choosing SQLite over e.g. LMDB or RocksDB, embedded Redis would be a nice future proofing option for small apps during the prototype phase; less would have to be changed to move Redis out of the app than if a different cache or persistence service were chosen.


Not the case; good abstractions are valuable, but the performance differences between runtimes are very real.

Take the example of some simple HTTP<->blob store service that gets slammed with millions of requests when someone using the API does a backfill via some framework on their end that aggressively scales request volume up and out.

Something like, say, async Python/starlette with a coroutine per request is gonna perform slightly worse than Erlang, which in turn is gonna perform much worse than Go.

You're right that those differences are sometimes marginal when the latency of whatever IO the backend's doing dominates the equation. However, in my experience huge volume surges show issues with the runtime (the thing managing/launching multiplexed request handler routines) or the ecosystem (the backend IO libraries' ability to work with the runtime's IO multiplexing and make things like request coalescing easy or automatic) more often than you'd think.

It really takes surprisingly little volume to cripple a return-hello-world Phoenix app that indirects the "hello world" behind way too much middleware and message passing; it takes even less to kick over, say, a Gunicorn instance returning "hello world" at the bottom of the Django middleware stack. Golang with Gin, on the other hand, is surprisingly hard to cripple in the same way. And I say that as someone who likes Elixir and Python a lot more than I like Go!
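To make "request coalescing" concrete, here is a rough asyncio sketch of the single-flight idea: concurrent requests for the same key share one backend fetch instead of each issuing their own. `Coalescer` and `slow_fetch` are hypothetical names, and the sleep stands in for blob-store latency; this is an illustration, not how any particular framework does it.

```python
import asyncio


class Coalescer:
    """Share one in-flight fetch among all concurrent callers for a key."""

    def __init__(self, fetch):
        self._fetch = fetch    # async callable: key -> value (hypothetical backend call)
        self._inflight = {}    # key -> asyncio.Task currently fetching that key

    async def get(self, key):
        task = self._inflight.get(key)
        if task is None:
            # The first caller for this key starts the real backend request.
            task = asyncio.create_task(self._fetch(key))
            self._inflight[key] = task
            # Forget the task once it finishes so later calls fetch fresh data.
            task.add_done_callback(lambda _t, k=key: self._inflight.pop(k, None))
        # shield() keeps one cancelled caller from cancelling the shared fetch.
        return await asyncio.shield(task)


async def demo():
    calls = 0

    async def slow_fetch(key):
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.05)  # stand-in for blob-store latency
        return f"blob:{key}"

    coalescer = Coalescer(slow_fetch)
    results = await asyncio.gather(*(coalescer.get("a") for _ in range(100)))
    return calls, results


if __name__ == "__main__":
    calls, results = asyncio.run(demo())
    print(f"{len(results)} requests served by {calls} backend fetch(es)")
```

How easy this is to bolt on depends heavily on whether the backend IO library cooperates with the runtime's scheduler, which is the ecosystem point above.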


Thank you. As a guy who made a career out of Elixir (and has recently begun to regret it, but oh well), I agree that Elixir's throughput is not amazing. However, it can get very far and we should always optimize for the most common usages.

I've personally rewritten one hobby and one professional project from Elixir to Golang and loved the result; as you said, it's extremely difficult to bring a Golang service to its knees.

One clarification: a Phoenix server behind Caddy/nginx fares better, btw. But, details. Your point stands.

I have yet to see a Rust web/API service I wrote _ever_ buckle under pressure and just crash. It was either an application bug (like the famous Cloudflare `.unwrap()` error from the last weeks/months) or the Linux OOM killer. Literally never crashed. But I did witness it brutally murder a MySQL cluster because it couldn't serve it fast enough. That was both fun and terrifying to watch on the dashboards.


> I did witness it brutally murder a MySQL cluster because it couldn't serve it fast enough. That was both fun and terrifying to watch on the dashboards.

Haha yep. In my experience, everyone running CGI/process-per-request application servers is bullish on switching to a concurrent or cooperative runtime...until they realize they just removed the primary ratelimiter on downstream DB/service accesses.

The converse war stories are also amusing: people rewrite their whole app in a concurrent/asynchronous framework and nothing changes, because the DB driver is still farming out all queries to a tiny fixed-size threadpool of connections that was the bottleneck all along.
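If you do move to a concurrent runtime, one way to put the accidental rate limiter back on purpose is an explicit cap on in-flight downstream calls. A minimal asyncio sketch, assuming a made-up `fake_query` as the DB round trip and a made-up `run_bounded` helper:

```python
import asyncio


async def run_bounded(jobs, limit):
    """Run async callables with at most `limit` in flight; return peak concurrency."""
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        # Callers beyond the limit wait here instead of piling onto the DB.
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await job()
            finally:
                in_flight -= 1

    await asyncio.gather(*(guarded(job) for job in jobs))
    return peak


async def fake_query():
    await asyncio.sleep(0.01)  # stand-in for a DB round trip


if __name__ == "__main__":
    peak = asyncio.run(run_bounded([fake_query] * 500, limit=20))
    print("peak concurrent queries:", peak)
```

The same cap is what a fixed-size connection pool gives you implicitly, which is why it ends up being the real bottleneck in the second war story.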


Oh yeah, definitely. If your DB server (or any storage backend) cannot have like 200+ connections alive at all times then it's absolutely pointless rewriting your app in Elixir or Golang. You'll just serve DB timeouts in your responses.

> You're right that those differences are sometimes marginal when the latency of whatever IO the backend's doing dominates the equation. However, in my experience huge volume surges show issues with the runtime (the thing managing/launching multiplexed request handler routines) or the ecosystem (the backend IO libraries' ability to work with the runtime's IO multiplexing and make things like request coalescing easy or automatic) more often than you'd think.

fair enough, although at this point we start talking about LB in front of the thing, consumption mechanics, autoscaling signals

i will still maintain that my simple advice for a dev worrying about scale, is that they should focus their efforts on ensuring downstream IO doesn't get overwhelmed (db read replicas, caching, etc) before optimizing runtime performance or autoscaling out unnecessarily.


> focus their efforts on ensuring downstream IO doesn't get overwhelmed (db read replicas, caching, etc) before optimizing runtime performance or autoscaling out unnecessarily.

All good advice, but the choice of runtime can affect the point at which autoscaling and load balancing even need to enter the conversation at all. Optimizing, say, a mostly in-memory cache service and writing it in Golang may yield results like "we can run a single instance of this and serve three orders of magnitude of business growth; slap it behind a DNSRR or a k8s NodePort for update/replacement/fast failover if it crashes, no complex load balancer needed", where writing the same thing in, say, PHP might require discussing orchestration/load balancing/memory/worker process recycling/autoscaling early on in the service's lifetime. Being able to skip those conversations (entirely or for a long time) is a very significant business benefit.


Eh, reduction counting isn't magic. Golang manages similar preemption semantics without counting that many operations (some tight loops do have barriers inserted every so often, but that's the exception and not the rule). And reduction counting has some serious costs! It slows the runtime down a shitload (and the BEAM is already in the bottom half of interpreted language runtimes by speed) and makes lots of JIT-flavored runtime optimizations slower or harder to implement.

I like immutability too; I wish Java and Golang did more of it. It costs a lot in terms of unexpected copies in the BEAM though, there's less copy-elision optimization than you'd think. That especially bites if you're doing a ton of message passing, because of how process heaps are implemented and how garbage collection (traditional or ETS/ThreadProgress-based) works.

I think what I want is something like Golang but with goroutine-based ownership semantics (or Rust with the Go runtime and goroutines): an excellent scheduler for extremely light-weight green threads, no refcounting or reduction counting, and all the clever optimizations around channel sending and copy elision--but no ability to use a value after it's sent to a channel, and only channel-based access to shared global state. That'd get most of the benefits of process-local heaps but without the (copying, cache/memory fragmentation) drawbacks.


These are all true and I have recognized those as innate limitations of the BEAM VM. For now I am OK with those but I am already skirting at the limit and I am starting to want to jump to Golang and Rust again.

This is a very good writeup.

Zooming way out (perhaps to the point of useless observation), it's a pity that the web embedded VSCode editor is signed into GitHub at all. Defense-in-depth or not, a huge vulnerability surface arises from that original sin. It'd be like if you had a god-permissioned GitHub API token stored in world-readable plaintext on your workstation for the malicious-NPM-package-of-the-week to find.

In a perfect world, it'd be awesome if the in-browser IDE launched with a temporary per-repo permission scope or token that allowed only pull and push to the repo in question; no github.com web session whatsoever. If you want the full GitHub web UI experience, well .... go back to github.com; make github.dev a single-repo service.

I'm assuming that's a) inconvenient for users, b) hard to implement, and c) a historical assumption baked into a lot of the github.dev tooling, though. Ah well.


> it'd be awesome if the in-browser IDE launched with a temporary per-repo permission scope

That's actually exactly what they do for codespaces. The token only has read/write on the repo you activated for the codespace [1]. They should definitely consider doing that for github.dev as well.

[1] https://orca.security/resources/blog/hacking-github-codespac...


Or they could’ve kept their bounty program running smoothly. But instead they pissed off another security researcher and received zero days' heads-up before public disclosure.

There is no excuse. GitHub runs a great program on HackerOne and it should just have been submitted there.

Also note that the person who found this was pissed because they had a difficult experience with submitting a bug for VSCode THREE YEARS AGO through MSRC which is _completely different_ than the GitHub H1 program and no doubt much more challenging with a different experience.

There is really no excuse for this irresponsible disclosure. They could have at least tried instead of holding a grudge for three years.


> GitHub runs a great program on HackerOne

I agree, for the record here's my HackerOne profile https://hackerone.com/ammar2/hacktivity?type=user

Just for context, that 2023 bug was initially reported to GitHub's HackerOne program and they explicitly told me it was out of scope for them and to take it to MSRC:

> We have reviewed the report and determined that the vulnerabilities is in VS code and the fix will be implemented by Microsoft. As a result, it is not eligible for reward under the Bug Bounty program. Please follow-up with Microsoft via the report you submitted.

There was also an additional bug that allowed an attacker to exfiltrate private repo contents with a github.dev link that MSRC also marked as not having security impact.

I absolutely loved working with GitHub folks on the GitHub bug bounty program, they're responsive, go into technical details with you and are awesome to deal with. MSRC is like the polar opposite of that.


> malicious-NPM-package-of-the-week

This is going to get worse and worse. I recently noticed an AI harness (e.g. OpenCode) downloading random npm packages in the background and littering them in a few places in ~ and in your project dir, all without telling/asking you.

What's worse is that people don't seem to care, even the devs.


You typically don't want to run opencode outside a sandbox anyway.

True, but a security breach inside a sandbox/container can cause serious damage too (stealing your code/data/keys, spreading via your code/releases, etc.). And containers aren't for security anyway (e.g. Copy Fail breaching to the host: https://xint.io/blog/copy-fail-pod-to-host)

It's rare that both of those align and it is very unlikely that both are used at once. Most of the exploits (if not all) just install rce, rat and/or steal env.

I think the problem lies in the fundamental design of VS Code extensions in general. They are essentially Node.js apps with full access to built-in modules, including fs. If the corresponding VS Code instance is launched with your user privileges, extensions can technically read files in ~/.ssh.

It is not safe in the sense that for every extension you install, you are essentially installing a new Node.js app with all its bundled dependencies. Even if you trust the publisher, I am sure there are many holes to exploit.


My comment had more to do with the in-browser VS Code instance. Regardless of the extension security model, having the github.dev webapp run under your full github.com account's permissions significantly expands the attack surface: if you launch github.dev in one repo and install a malicious extension, that extension can reach and compromise all repos your GitHub user can reach, private or public. Scoping it to one repo would only allow a malicious extension to write code in that repo and not mess with the GitHub API or other repos.

Separately, I think the debate around extensions/plugins in general boils down to the same conversation about trust and isolation we have for every third-party software supplier (package managers etc.).

Options include:

1. Vetting/blessing certain extensions.

2. Serving built extensions from a central registry/artifact store with security protections

3. Having VSCode organically grow a shitty version of different operating systems' "X wants to access Y; confirm?" permissions access system (a pain in the ass to do in a cross-platform way).

4. Having VSCode somehow run extensions as separate applications according to the OS and leveraging the OS's permission system (still hard, and because it's an IDE, rather a lot of extensions will need--or request because of sloppy extension code--very broad permissions, at which point an extension is one transitive dependency update away from compromising your system).

5. Running the entire VSCode instance in some sort of container/VM/sandbox (the amount of access holes folks poke in the Snap/Flatpak VSCode instances, and the number of common issues for which "stop using the container and install VSCode directly on the system" is the recommended fix does not give me hope that this will be adopted by anyone but the most expert, patient, and paranoid users).


I think it's OK to be signed in when opening your own repositories, but definitely not when opening repositories from other accounts. Also, the webview keyboard shortcut thing needs to be fixed to only allow harmless keybinds and NOT propagate to any keydown handler. On desktop it should be removed in favor of Electron intercepting directly, and on web it should probably be disabled by default.

> temporary per-repo permission scope or token that allowed only pull and push to the repo in question

How about pull from the repo but only push to a staging area from which the user, but not the token, can push for real?

Frankly, LLM agents should do this too. Letting your LLM push seems foolhardy to me.


You can just fork the repository, give it access to the fork and then merge what you want

This is a piece of cake using GitHub’s excellent permission system.

(I’m joking, of course. Service accounts are nowhere to be seen. OAuth can’t even scope to an organization, let alone a repository. And this whole github.dev thing illustrates that you don’t even need to explicitly grant permission to issue broadly scoped tokens.)

Also, forking is pretty heavyweight just to launch something that, for all anyone knows before starting actual work, is being used as a read only viewer.


Exe.dev has an integrations feature which is similar, allowing you to grant access to specific repos without having to give the VMs credentials. I think it’s a similar pattern to iron.sh.

I have been thinking more and more about how I might use this pattern.


the scope problem is actually worse with agents than with github.dev. the architectural answer is the same though: the credential the agent operates with should be scoped to the task and expire when the task ends.

Jules is heavily restricted in what it can do to your repos.

That makes so much more sense.

You can use SSH keys and GitHub deploy keys to approximate this. Can't speak for the security of it, but I have never set up GitHub with access to every repo. Not sure if there exists approximate functionality in other git forges though.

How does this work with the in-browser editor at github.dev?

> It'd be like if you had a god-permissioned GitHub API token stored in world-readable plaintext on your workstation for the malicious-NPM-package-of-the-week to find.

That's...exactly what the AWS CLI does.


If the malicious-npm-package-of-the-week is reading arbitrary files on your workstation, isn't it usually able to run git clone/push/whatever with your current credentials anyway?

Yes, but also no. For example in GitLab a user who’s infected could push code to a branch. Then it could even make a merge request to pull that branch into main (if main is protected).

But then someone else on the team should have to manually approve that MR to allow it to be merged to main.

This kind of defeats the ability of malware to push stuff out automatically.


Not if they require touch and live in a secure enclave like a YubiKey

Malware running on your computer can engineer a situation where you would naturally press that without suspecting anything.

1. Malware logs you out of github.com

2. It waits for you to navigate to the login page

3. It initiates an SSH/signing operation requiring physical touch

4. You hit login on github.com, a 2nd FIDO operation is queued up

5. You press the yubikey button, confirming the SSH operation

6. "Nothing happens", so you press it again to log in

7. You're now logged in, and your SSH credentials have just been hijacked.

Or it could just inject itself into your shell profile, and do this the next time you ssh anywhere. You never really know what you're confirming so Yubikey's threat model implicitly depends on the host device being trustworthy.

This is why hardware wallets for crypto have a physical display to confirm the address and the amount before signing the transaction.


Bad memories. I particularly enjoyed fighting with third-party programs that installed system cronjobs in the various tabs, and having to remember to go and find them after package upgrades and try to figure out how to robustly identify when their processes were running so my other cronjobs wouldn't overload or clobber state, since the third-party-installed jobs didn't play along with any lockfile-based coordination we used. Wants/WantedBy/Requires are godsends by comparison.

> If you have hundreds of jobs like this you need a proper queue and neither cronie nor systemd is the right tool

Eh sometimes, but you can get pretty far with one of two approaches:

1. Careful use of Requires= and Wants= to group your scripts into chains of jobs, which achieves fixed parallelism (though at 100s of jobs, I hope you're generating those unit files with a tool like Puppet or https://github.com/karlicoss/dron or something and not doing this by hand).

2. Even better, just use a lockfile. `ExecStart="flock -F $TMPDIR/mylock <command>"` is pretty hard to beat. Use -F so as not to confuse KillMode and resource accounting and you're golden. Just don't use flock(1) timeouts; let systemd handle that. Heck, if you have that many cron jobs, you should be doing this even if you don't use systemd; otherwise job latency changes can cause reboot-style thundering herds out of the blue.
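For jobs that would rather take the lock themselves than be wrapped by flock(1), the same lockfile coordination can be sketched in Python with `fcntl.flock`. One difference to note: flock(1) waits for the lock by default, while this sketch uses the non-blocking variant (`LOCK_NB`) so the second attempt fails fast. `try_lock`, `unlock`, and the `mylock-demo` filename are made up for illustration, and this assumes Linux-style flock semantics.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return an fd holding an exclusive flock on path, or None if it is taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes this fail fast instead of queueing behind the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.gettempdir(), "mylock-demo")
    first = try_lock(lock_path)
    # A second open() creates a separate open file description, so it conflicts.
    second = try_lock(lock_path)
    print("first acquired:", first is not None)
    print("second acquired:", second is not None)
    for fd in (first, second):
        if fd is not None:
            unlock(fd)
```

The kernel drops the lock when the holding process exits, so a crashed job does not leave a stale lock behind the way a PID file can.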

If you need semaphore behavior and still don't want a real job queue, waitlock (https://github.com/bigattichouse/waitlock) and many other CLIs have you covered.


1. This is spread across 500 files, maintainability goes out the window

2. If this for some reason fails, misconfiguration or unexpected shutdown, you could have a failure that's hard to track or debug

These are fine with a few services chained together, but this requires a shallow depth of dependencies. Having these theoretical hundreds of jobs chained together like this isn't practical or safe.

