Hacker News | lebovic's comments

I used to work at Anthropic. I fully believe that the folks mentioned in the article, like Jared Kaplan, are well-intentioned and concerned about the relationship between safety research and frontier capabilities – not purely profit.

That said, I'm not thrilled about this. I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario: they wouldn't set aside building adequate safeguards for training and deployment, regardless of the pressures.

This pledge was one of many signals that Anthropic was the "least likely to do something horrible" of the big labs, and that's why I joined. Over time, the signal of those values has weakened; they've sacrificed a lot to get and keep a seat at the table.

Decisions that pit their principles against their position at the frontier seem likely to become even more common. I hope they're willing to risk losing their seat at the table to be guided by values.


> I hope they're willing to risk losing their seat at the table to be guided by values.

That's about as naive as it gets.

If they have any values left at all (which I hope they do), then not being at the table alongside labs that have none left is much worse than being there and having a chance to exert influence, even with just the leftovers.

That said, of course money > all else.


I don't hold the belief that it's always better to have influence in a group where you don't trust leadership – in this case, those who decide at the metaphorical table – than to try to effect change through a different avenue.

It's probably naive, but it's also the reasoning that drove many early employees to Anthropic. Maybe the reasoning holds at smaller scales but breaks down when operating as a larger actor (e.g. as a single person or startup vs. a large company).


This is a common logical fallacy. It's not true that a party A with a few values can influence a party B with no values; in practice, party B drags party A all the way to the no-values side. See also: employees who rationalize staying at companies running unethical or illegal projects.

Employees and employers are not sitting at the same table; that comparison is a category error. We're talking lab to lab. Obviously, in a fiercely competitive market like this, with serious players not sharing the same set of rules, it's close to pointless, but it's still better than letting those other players do their thing uncontested.

> I joined Anthropic with the impression that the responsible scaling policy was a binding pre-commitment for exactly this scenario

Pledges are generally non-binding (you can pledge to do no evil and still do it), but they fulfill an important function as a signal: actively removing your public pledge to do "no evil", when you could have quietly acted as you wished anyway, signals a change in which market you're courting. That's the most worrying part, IMO.


If you're not willing to give up your RSUs you shouldn't be surprised that the executives aren't either.

The moral failing is all of ours to share.


I was willing to (and did) give up my equity.

I interviewed at Anthropic last year and their entire "ethics" charade was laughable.

Write essays about AI safety in the application.

An entire interview round dedicated to pretending that you truly only care about AI safety and not the money.

Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world.

In reality it was a mid-level manager interviewing a mid-level engineer (me), both putting on a performance while knowing full well that we'd do what the bosses told us to do.

And that is exactly what is happening now. The mission has been scrubbed, and the thousands of "ethical" engineers you hired are all silent now that real money is on the line.


> Every employee you talk to forced to pretend that the company is all about philanthropy, effective altruism and saving the world

I was an interviewer, and I wasn't encouraged to talk about philanthropy, effective altruism, or ethics. Maybe even slightly discouraged? My last two managers didn't even know what effective altruism was. (Which I thought was quite a feat, months into working there.)

When did you interview, and for what part of the company?

> knowing fully well that we'd do what the bosses told us to do [...] now that real money is on the line

This is a cynical take.

I didn't just do what I was told, and I dissented with $XXM in EV on the line. But I also don't work there anymore, at least one of the cofounders wasn't happy about it and complained to my manager, and many coworkers thought I had no sense of self preservation – so I might be naive.

The more realistic scenario is that a) most people have good intentions, b) there's a decision that will cause real harm, and c) it's made anyway to keep power / stay on the frontier, with the justification that the overall outcome is better. I think that's what happened here.


The EU should invite them over.

The kind of principles you talk about can only be upheld one level up the food chain. By govts.

Which is why legislatures, supreme courts, central banks, and power-grid regulators deciding the operating voltage and frequency naturally emerge in history: corporations structurally can't do what those bodies do without violating their prime directive of profit maximization.


I fully believe that Dario is 100% full of shit and possibly a worse person than Altman. He loves to pontificate like he's the moral avatar of AI but he's still just selling his product as hard as he can.

They are all the same given their motivations - Demis Hassabis is the only one who, to me at least, sounds genuine on stage.

Demis is a researcher first. Others are not.

The post is light on details, and I agree with the sentiment that it reads like marketing. That said, Opus 4.6 is actually a legitimate step up in capability for security research, and the red team at Anthropic – who wrote this post – are sincere in their efforts to demonstrate frontier risks.

Opus 4.6 is a very eager model that doesn't give up easily. Yesterday, Opus 4.6 took the initiative to aggressively fuzz a public API of a frontier lab I was investigating, and it found a real vulnerability after 100+ uninterrupted tool calls. That would have required lots of prodding with previous models.

If you want to experience this directly, I'd recommend recording network traffic while using a web app, and then pointing Claude Code at the results (in Chrome, this is Dev Tools > Network > Export HAR). It makes for hours of fun, but it's also a bit scary.
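To make that workflow concrete: a HAR export is just JSON, so it's easy to preview what the model will be working from before handing it over. A minimal sketch, with a tiny hand-written capture standing in for a real Dev Tools export (all URLs are made up for illustration):

```python
import json


def list_endpoints(har: dict) -> list[str]:
    """Extract unique 'METHOD URL' pairs from a HAR capture, dropping query strings."""
    seen = []
    for entry in har.get("log", {}).get("entries", []):
        req = entry["request"]
        pair = f"{req['method']} {req['url'].split('?')[0]}"
        if pair not in seen:
            seen.append(pair)
    return seen


# Tiny fabricated capture; a real one comes from Dev Tools > Network > Export HAR.
sample_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://app.example.com/api/v1/me?cache=0"}},
            {"request": {"method": "POST", "url": "https://app.example.com/api/v1/orders"}},
            {"request": {"method": "GET", "url": "https://app.example.com/api/v1/me"}},
        ]
    }
}

if __name__ == "__main__":
    for endpoint in list_endpoints(sample_har):
        print(endpoint)
```

In practice you'd run `json.load()` on the exported `.har` file; skimming the endpoint list first is a cheap way to scope what you're about to let the model poke at.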


This is actually a good concrete example of how to use AI for pen testing (which I've never had time to look at, so I realise it may be common). The issue I'm struggling with is cost: to point O4.6 at network logs and have it explore... how many tokens, and how much money, do you burn?


How much would you pay a pen tester and/or appsec engineer to review your web app? I think it probably evens out.

(I’m not suggesting replacing either with opus, but just trying to put the cost into perspective)


See also this thread on storing data in space: https://news.ycombinator.com/item?id=46327158


Yep, the Anthropic API supported tool use well before an MCP-related construct was added to the API (MCP connector in May of this year).

While it's not an API, Anthropic's Agent SDK does require MCP to use custom tools.


If you can reproduce the issue with the other API key, I'd also love to debug this! Feel free to share the curl -vv output (excluding the key) with the Anthropic email address in my profile


FlowDeploy | Product Engineer, Bioinformatics Engineer | Full-time | REMOTE or ONSITE | https://flowdeploy.com

FlowDeploy builds dev tools for bioinformatics, and we're looking for a product-minded software engineer and a bioinformatics engineer. I think a former or future founder would do well in this role.

Curiosity matters more than domain-specific experience in bioinformatics, although some bioinformatics context is helpful: understanding if what you've built solves a problem requires talking to users and understanding them.

You would be working in a few key areas of our product:

- Improving our integration with bioinformatics pipelining languages like Nextflow and Snakemake.

- Building our core API. This is currently written with Express/Node.js with a Postgres database.

- Building the UI for launching, monitoring, and sharing bioinformatics pipelines and data. This is currently written in React with TypeScript.

- Improving our pipeline execution. This is mostly in AWS Batch.

- Improving our data handling. Most raw data is stored in S3, with metadata in a Postgres database.

We're a very small team, and we plan to stay small until we have strong product-market fit. We're funded by Y Combinator, have revenue from the FlowDeploy product, and can keep going for years without raising additional funding.

Interested? Apply through YC's Work at a Startup:

- Product Engineer: https://www.ycombinator.com/companies/flowdeploy/jobs/KrwNpl...

- Bioinformatics Engineer: https://www.ycombinator.com/companies/flowdeploy/jobs/I9F9sI...

You can reach me directly at "noah" at this domain.


FYI

https://www.ycombinator.com/companies/flowdeploy/jobs/I9F9sI...

Peer’s Certificate has expired.

HTTP Strict Transport Security: true
HTTP Public Key Pinning: false

Certificate chain:

-----BEGIN CERTIFICATE----- MIIFH<snip>


This is super cool! It's nice to see commercialization in the bioinfo space, after dealing with bedraggled servers running in your PI's lab for many years and dealing with insane packages (ever try to install QIIME?). I would have loved this job coming out of college.


Thanks! Ironically, I was hired for my first job in bioinformatics by one of the QIIME authors. Unfortunately, that didn't make it any easier.

I don't think anyone has really figured out commercialization in the space yet – us included. The community is still rooted strongly in academia, so commercializing requires a delicate balance between profitability and openness.

I imagine it's what building dev tools was like a couple decades ago. It's fun to see the field grow and evolve.


Certainly, and as I was having my swan song of a semester, it really did seem like things were turning a great corner on reproducibility, distribution of data, sharing code, building code that wasn't MATLAB scripts cobbled together, and so forth.

QIIME is awesome; they have taken on the unenviable task of "dealing with" all those random sub-libraries that are from hell.

If you're familiar with Galaxy[0]: we used that back in the day and wrote plugins at my lab so we could have researchers use the tools we were building. It feels like that type of 'platform of data + programs' would be easier to monetize. I mean, the workloads are there, people like plug and play, and it could use a lot of sprucing up and some paid people to solve the nasty parts.

And yes, I think it's a rite of passage - I mean your own FASTA counter, of course! [1] ;)

[0] https://usegalaxy.org/ [1] https://git.ceux.org/dna-utils.git/


I'm not seeing anything in the bioinformatics space at the moment. What job did you end up doing after college can I ask?


Got recruited as a run-of-the-mill PHP dev at a local medium-sized business. Turns out my PI did not get the grant, so I ended up going private.


This pattern is common. Anecdotally, I think the majority of people trained in bioinformatics end up working full-time in standard software engineering.

I think this is starting to change. Next-generation sequencers and other imaging devices are causing more wet labs to produce massive amounts of data – which is increasing the number of companies hiring for bioinformatics roles.


Yup, ain't that just how it goes. I'll probably make the leap myself sometime this year, and I'm not looking forward to playing "skill-tetris" with recruiters.


I wouldn't worry. Research really sets apart new grads. Everyone else got a degree too, but doing cool research is usually a good conversation starter during interviews!


Ah. This would be perfect for me (see my bio), but I'm only UK and EU based. I'd do remote if I could.


Argh, this could be a perfect fit for you. I'm disappointed that we won't be able to make it work for these roles.

Candidly, we haven't figured out how to do international remote work well. Hopefully we will in the future!


Hah, no worries - if ever you start anything in the UK/EU space, feel free to ping me!


Will do!


I work with Snakemake for computational biology. I see a lot of confusion as to why Snakemake exists when workflow management tools like Airflow exist, which mirrors my sentiment when moving from normal software to bio software.

Snakemake is used mostly by researchers who write code, not software engineers. Their alternative is writing scripts in bash, Python, or R; Snakemake is an easy-to-learn way to convert their scripts into a reproducible pipeline that others can use. It's popular in bioinformatics.
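For a taste of what that conversion looks like: a Snakefile is mostly rule declarations whose inputs and outputs are files. This is a toy two-step sketch, with hypothetical filenames and commands, not a real pipeline:

```python
# Snakefile (Snakemake DSL, Python-based): toy two-step pipeline
rule all:
    input:
        "results/summary.txt"

rule trim:
    input:
        "data/reads.fastq"
    output:
        "results/trimmed.fastq"
    shell:
        "seqtk trimfq {input} > {output}"

rule summarize:
    input:
        "results/trimmed.fastq"
    output:
        "results/summary.txt"
    shell:
        "wc -l {input} > {output}"
```

Snakemake works out that `trim` must run before `summarize` because `summarize`'s input filename matches `trim`'s output; a researcher's existing shell one-liners slot into the `shell:` blocks largely unchanged.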

Snakemake can also execute remotely on a shared cluster or in the cloud. It has built-in support for common executors like SLURM, AWS, and TES[1].

Snakemake isn't perfect, but it helps researchers jump from "scripts that only work on their laptop" to "reproducible pipelines using containers" that easily run on clusters and cloud computing. Running these pipelines is still pretty quirky[2], but is better than the alternative of unmaintained and untested scripts.

There are other workflow managers further down the path of a domain-specific language, like Nextflow, WDL, or CWL. Nextflow is a dialect of Java/Groovy that is notoriously difficult to learn for researchers. Snakemake, in comparison, is built on Python and has a less steep learning curve and fewer quirks.

There are other Python-based workflow managers like Prefect, Metaflow, Dagster, and Redun. They're great for software engineers, but they don't bridge the gap as well with researchers-who-write-code.

[1] TES is an open standard for workflow task execution that's usable with most bioinformatics workflow managers, like HTML for browsers.

[2] I'm trying to fix this (flowdeploy.com), as are others (e.g. nf-tower). I think the quirkiness will fade over time as tooling gets better.


I don't get why you claim that something like Airflow doesn't bridge the gap well with researchers who write code. I've worked with WDL extensively, and I still think that Airflow is a superior tool. The second I need any sort of branching logic in my pipeline, the ways of solving it feel like you are working against the tool, not with it.


The bioinformatics workflow managers are designed around the quirkiness of bioinformatics, and they remove a lot of boilerplate. That makes them easier to grok for someone who doesn't have a strong programming background, at the cost of some flexibility.

Some features that bridge the gap:

1. Command-line tools are often used in steps of a bioinformatics pipeline. The workflow managers expect this and make them easier to use (e.g. https://github.com/snakemake/snakemake-wrappers).

2. Using file I/O to explicitly construct a DAG is built-in, which seems easier to understand for researchers than constructing DAGs from functions.

3. Built-in support for executing on a cluster through something like SLURM.

4. Running "hacky" shell or R scripts in steps of the pipeline is well-supported. As an aside, it's surprising how often a mis-implemented subprocess.run() or os.system() call causes issues.

5. There's a strong community building open-source bioinformatics pipelines for each workflow manager (e.g. nf-core, warp, snakemake workflows).

Airflow – and the other less domain-specific workflow managers – are arguably better for people who have a stronger software engineering basis. For someone who moved from wet lab to dry lab and is learning to code on the side, I think the bioinformatics workflow managers lower the barrier to entry.
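Point 2 in the list above is the key mental-model difference: instead of wiring functions together, you declare which files each step reads and writes, and the dependency graph falls out. A minimal, framework-free Python sketch of that idea (rule names and filenames are invented for illustration):

```python
# Each "rule" declares the files it consumes and produces, as in Snakemake.
rules = {
    "trim":      {"inputs": ["reads.fastq"],   "outputs": ["trimmed.fastq"]},
    "align":     {"inputs": ["trimmed.fastq"], "outputs": ["aligned.bam"]},
    "summarize": {"inputs": ["aligned.bam"],   "outputs": ["summary.txt"]},
}


def execution_order(rules: dict) -> list[str]:
    """Topologically sort rules by matching each rule's inputs to other rules' outputs."""
    produced_by = {f: name for name, r in rules.items() for f in r["outputs"]}
    order, done = [], set()

    def visit(name: str) -> None:
        if name in done:
            return
        for f in rules[name]["inputs"]:
            if f in produced_by:        # this file is made by another rule,
                visit(produced_by[f])   # so run its producer first
        done.add(name)
        order.append(name)

    for name in rules:
        visit(name)
    return order


print(execution_order(rules))  # trim runs before align, align before summarize
```

The real engines add a lot on top (wildcards, containers, cluster submission), but this file-matching step is the core of how a DAG emerges without anyone drawing it.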


> are arguably better for people who have a stronger software engineering basis

As someone who is a software developer in the bioinformatics space (as opposed to the other way around) and who has spent over 10 years deep in the weeds of both the bioinformatics workflow engines and more standard ones like Airflow, I still would reach for a bioinfx engine for that domain.

But what I find most exciting is a newer class of workflow tools coming out that appear to bridge the gap, e.g. Dagster. From observation, it seems like a case of parallel evolution out of the ML world, where the research side of the house has similar needs. Either way, I could see this space pulling eyeballs away from the traditional bioinformatics workflow world.


The problem with Airflow is that each step of the DAG for a bioinformatics workflow is generally going to be running a command line tool. And it'll expect files to have been staged in and living in the exact right spot. And it'll expect files to have been staged out from the exact right spot.

This can all be done with Airflow, but the bioinformatics workflow engines understand that this is a first class use case for these users, and make it simpler.


If community moderation worked perfectly, there would be no reason to moderate. dang was clear and consistent in his moderation, even though he received the most backlash I've seen him face in the thread [1].

Seymour Hersh's stories have faced similar backlash every time they're released, including counter-statements by the US government. He has put out more dubious pieces as of late – he could be right or wrong about this – but I'd rather be exposed to his ideas than have them censored.

Anecdotally, I found the Seymour Hersh story intellectually gratifying, and I was forewarned of the murkiness of its contents by the HN comments. I think it all functioned pretty well on the HN side.

[1] https://news.ycombinator.com/item?id=34712496


Hey HN! I wanted a way to quickly switch between local and cloud execution for prototyping, so we built Lug. Notably, Lug doesn't require you to define your dependencies; it tries to extract and rebuild the same environment automatically. That means the decorator can literally just start as "@lug.hybrid(cloud=True)".

In the background, a new instance is spun up for every request. That means the cold-start time is high, so it's best suited to longer-running functions. I use it for computational biology batch processing (which includes some ML).

You can also switch between cloud environments with the "provider" argument, but it's limited to just a couple options right now. We use it to cycle between AWS and our on-prem servers depending on capacity.
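For readers curious what the decorator approach feels like, here's a toy re-implementation of the idea. This is not Lug's actual internals; only the `@lug.hybrid(cloud=True)` spelling and the `provider` argument mentioned above come from the announcement, and `submit_to_cloud` is a stub invented for illustration:

```python
import functools


def hybrid(cloud: bool = False, provider: str = "aws"):
    """Toy decorator: run the function locally, or hand it to a (stubbed) cloud backend."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not cloud:
                return fn(*args, **kwargs)
            # Stand-in for spinning up an instance, rebuilding the environment,
            # and executing the function remotely.
            return submit_to_cloud(provider, fn, args, kwargs)
        return wrapper
    return decorator


def submit_to_cloud(provider, fn, args, kwargs):
    # A real implementation would serialize fn plus its environment and
    # dispatch to the named provider; here we just run it in-process.
    print(f"[stub] dispatching {fn.__name__} to {provider}")
    return fn(*args, **kwargs)


@hybrid(cloud=True, provider="aws")
def double(n: int) -> int:
    return n * 2


print(double(21))
```

The appeal of this shape is that flipping `cloud=True` to `cloud=False` (or changing `provider`) requires no edits to the function body, which matches the prototyping workflow described above.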


Seems plausible. It would be awesome if AWS had a publicly accessible metric to plan around this, do you know of anything like that?

