
Please explain to me why all the major players are buying Nvidia then? Is HIP a drop-in replacement? No.

You have to port every piece of software you want to use. It's ridiculous to call this a solution.




Major players in China don't play like that. MooreThreads, Lisuan, and many other smaller companies all have their own porting kits, which are basically copied from HIP. They just port every piece of software and it just works.

If you want to fight against Nvidia monopoly, then don't just rant, but buy a GPU other than Nvidia and build on it. Check my GitHub and you'll see what I'm doing.


> Is HIP a drop-in replacement? No.

You don't understand what HIP is. HIP is AMD's runtime API: it resembles the CUDA runtime API, but it's not the same thing, and it doesn't need to be, because the hard part of porting CUDA isn't the runtime API. hipify is the tool that translates both the runtime calls and the kernels. Now, is hipify a drop-in replacement? Of course not, but that's because the two vendors have fundamentally different architectures. So it's absolutely laughable to imagine that some random could come anywhere near a "drop-in replacement" when AMD itself can't (again: because of fundamental architecture differences).
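
To make this concrete, here's roughly what a hipify pass looks like on a trivial kernel. This is a sketch with made-up file names, not output from a real port:

    // saxpy.cu -- original CUDA source
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }
    // host side: cudaMalloc / cudaMemcpy / saxpy<<<blocks, 256>>>(...)

    // After `hipify-perl saxpy.cu > saxpy.hip.cpp` the kernel body is
    // untouched (the grid intrinsics map one-to-one) and the runtime
    // calls are renamed:
    #include <hip/hip_runtime.h>
    // hipMalloc(&d_x, n * sizeof(float));
    // hipMemcpy(d_x, x, n * sizeof(float), hipMemcpyHostToDevice);
    // saxpy<<<blocks, 256>>>(n, 2.0f, d_x, d_y);

The renames are the easy part; everything that doesn't map one-to-one is where the architecture differences bite.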


Who said "some random"? Read the whole thread. I was suggesting AMD invest BILLIONS to make this happen. You're arguing with a straw man.

I think you misunderstand what's fundamentally possible with AMD's architecture. They can't wave a magic wand and get a CUDA compatibility layer any better than Apple or Qualcomm can; it's not low-hanging fruit like DirectX or Win32 translation. Investing billions into translating CUDA on raster GPUs is a dead end.

AMD's best option is a greenfield GPU architecture that puts CUDA in the crosshairs, which is what they already did for datacenter customers with AMD Instinct.


I do not misunderstand.

Let's say you put 50-100 seasoned devs on the problem. Within 2-3 years you'd probably get ZLUDA to the point where most mainstream CUDA applications (ML training/inference, scientific computing, rendering) run correctly on AMD hardware at 70-80% of the performance you'd get from a native ROCm port. Even if it's not optimal due to hardware differences, it would be genuinely transformative and commercially valuable.
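
To be clear about the shape of that work: a ZLUDA-style layer is an API shim plus a PTX-to-AMD-ISA compiler. Here's a toy sketch of the shim half only (my own illustration, not ZLUDA's actual code; assume it's built as a stand-in libcudart on the library path):

    // Export CUDA runtime symbols and forward them to HIP. Both APIs
    // use 0 for success, so returning HIP's status through the
    // CUDA-shaped signature works for the simple calls.
    #include <hip/hip_runtime.h>

    extern "C" int cudaMalloc(void **devPtr, size_t size) {
        return hipMalloc(devPtr, size);
    }
    extern "C" int cudaFree(void *devPtr) {
        return hipFree(devPtr);
    }
    // The multi-year part is elsewhere: consuming the PTX/fatbinary
    // blobs embedded in CUDA apps and compiling them for AMD hardware.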

This would give them runway for the parallel effort to build greenfield native libraries and toolkits and win adoption, and perhaps to tweak future hardware iterations in ways that make compatibility easier.


Long before the ZLUDA project reached completion, they would be facing an IP-infringement lawsuit, since CUDA is owned by NVIDIA.

They would win; compatibility layers are not illegal.

Win against who? AMD is the one that asked them to take it down: https://www.tomshardware.com/pc-components/gpus/amd-asks-dev...

And while compatibility layers aren't illegal, they ordinarily have to be a cleanroom design. If AMD knew that the ZLUDA dev was decompiling CUDA drivers to reverse-engineer a translation layer, then legally they would be on very thin ice.


ROCm is supported by only a minority of AMD GPUs, and acceleration is inconsistent across GPU models. "70-80% of a native ROCm port" is an unclear enough target that the native port itself would be the more transparent choice for most projects. And even then, you'll still be outperformed by CUDA the moment tensor or convolution ops are called.
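
That inconsistency shows up directly in code: HIP projects end up gating features on the reported gfx architecture string, model by model. An illustrative sketch (the API calls are real, the gating policy is made up):

    #include <hip/hip_runtime.h>
    #include <cstdio>
    #include <cstring>

    int main() {
        hipDeviceProp_t prop;
        if (hipGetDeviceProperties(&prop, 0) != hipSuccess) return 1;
        // e.g. "gfx90a" (MI200) or "gfx1100" (RX 7900 XTX)
        printf("arch: %s\n", prop.gcnArchName);
        // No single capability number, so projects whitelist by hand:
        if (strncmp(prop.gcnArchName, "gfx90a", 6) == 0)
            puts("CDNA2 matrix-core path");
        else
            puts("generic fallback path");
        return 0;
    }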

Those billions are much better spent on new hardware designs, and on ROCm integrations with preexisting projects where they make sense. Translating CUDA to AMD hardware would only advertise why Nvidia is worth so much.

> it would be genuinely transformative and commercially valuable.

Bullshit. If I had a dime for every time someone told me "my favorite raster GPU will annihilate CUDA eventually!" then I could fund the next Nvidia competitor out of pocket. Apple didn't do it, Intel didn't do it, and AMD has tried three separate times and failed. This time isn't any different; there's no genuine transformation or commercial value to unlock with outdated raster-focused designs.


This is a big part of why AMD still doesn't have a proper foothold in the space: AMD Instinct is quite different from what regular folks can easily put in their workstation. In Nvidia-land I can put anything from a mid-range gaming card through a 5090 to an RTX 6000 Pro in my machine and be confident that my CUDA code will scale somewhat acceptably to a datacenter GPU.

This is where I feel Khronos could contribute: defining a Compute Capability equivalent for vendors to implement. CUDA's versioning of hardware capabilities plays a huge role in keeping the support matrix legible.
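
For contrast, this is all it takes to characterize any Nvidia GPU, from a laptop part to an H100, and it's what a Khronos equivalent would have to match:

    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
        // One versioned pair covers the whole product line.
        printf("sm_%d%d\n", prop.major, prop.minor);  // e.g. sm_89 (Ada)
        if (prop.major >= 7) puts("tensor cores available");
        return 0;
    }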

...but that requires buy-in from the rest of the industry, and it's doubtful FAANG is willing to thread that needle together. Nvidia's hedged bet against industry-wide cooperation is making Jensen the 21st-century Mansa Musa.


No, I'm arguing with someone who clearly doesn't understand GPUs.

> invest BILLIONS to make this happen

As I have already said twice, they already have: it's called hipify, and it works about as well as you'd imagine it could (i.e. poorly, because this is a dumb idea).
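
A concrete example of why (an illustrative kernel, not from any real project): hipify can rename API calls, but it can't fix code whose correctness assumes Nvidia hardware.

    __global__ void reduce_warp(float *out, const float *in) {
        float v = in[threadIdx.x];
        // Hard-coded 32-lane warp: may port mechanically, but a CDNA
        // wavefront is 64 lanes, so the reduction can be silently
        // wrong on MI-series GPUs.
        for (int offset = 16; offset > 0; offset >>= 1)
            v += __shfl_down_sync(0xffffffffu, v, offset);
        // Inline PTX (e.g. asm volatile("membar.gl;")) has no HIP
        // translation at all; a tool can only flag it.
        if (threadIdx.x == 0) *out = v;
    }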




