Yep, exactly this. And I have so much less anxiety that I have to use my 5-hour/weekly usage or I lose it... with deepseek api the credits never expire, I can use them when I want, how much I want and the prices are ridiculously low for the quality/intelligence/performance.
I think this is very true. They probably got scared of the almost 1b weekly active users of ChatGPT, and how people would rather ask ChatGPT than use Google. It will be a balance but this is a great opportunity for smaller search engines to make a real comeback.
With deepseek and xiaomi mimo models slashing their prices 99%, I don't see a great future for openai / antrhopic with regards to their 1T valuations. Maybe 1T valuation will be the whole market, West + East.
Most of the corporate world in the EU or North America will be hesitant to rely on Chinese AI providers. There are some very real blockers for that for things like data security, compliance, etc. And recent geopolitics don't help.
Legalities aside, you need to look not at the model quality but at the infrastructure needed to scale these models from tens (now) to hundreds (soon) of millions of users. Only a handful of companies actually have the resources and funding to do that. That's what these huge valuations are based on. These companies are gearing up to scale to these levels. That's why they are spending on data centers. Whoever has access to those data centers gets to tap into the revenue stream of people using models running on those.
The market for frontier models is roughly split between OpenAI, Anthropic, and Google. And then you have companies like X/SpaceX, Amazon, and Microsoft being more successful with their infrastructure than their AI products and companies like Apple, Meta that have the money and the aspiration but are so far not really managing to be very successful with their AI strategies.
Deepseek is just very poorly positioned to capture a lot of the enterprise revenue in the EU or North America. But they might become very dominant outside the US/EU. And of course China itself is going to be a huge market and equally unlikely to want to be depending on US owner AI companies.
There is still a risk of supply-chain attack. People give LLMs direct access to their entire infrastructure via tools, and never check the code produced. It's not difficult to steer an LLM during training so that they'd output malware only when prompted a certain way, and that wouldn't come up during the initial evaluation.
Personally I see no difference between China and America in terms of risks of them embedding "backdoors" so to speak, but I disagree when people claim that open-weight models are obviously safe just because they can be ran locally.
> It's not difficult to steer an LLM during training so that they'd output malware only when prompted a certain way
Perhaps, but that's also a good way to lose users+reputation as there's no way to control when said malware is generated. Once the first instance is discovered cybersec researchers will have a field day reproducing it and showing the world.
It is not a trivial challenge setting up model serving infra for ~1T or larger models, especially in a high reliability environment (e.g. your team is using it for work, or you're using it to power production apps). Sure, there are third party providers, although the quality and reliability of their inference varies.
Run Deepseek on Deepinfra then? Or Fireworks if US-based is important. None of these are real issues outside maybe convincing your legal team to do a bit of homework.
I don't think you are appreciating the physical constraints here. Deepseek doesn't really have the hardware in the US or EU to do anything at scale.
Sure, you can self host a non-frontier OSS model yourself; including Deepseek. And no doubt some people will pay one of the companies I mentioned to rent the infrastructure to do exactly that. Much of the rest of the world will be paying directly for direct access to the frontier models.
As for the legal/compliance stuff, I recommend you don't take any big decisions on that front without consulting lawyers. My understanding of that is that most serious companies in the EU have to take these topics pretty seriously. I'm sure in the US, hosting all your data and secrets in Chinese data centers isn't a whole lot less controversial.
The Chinese could of course choose try to match the current levels of investment Google, OpenAI, Anthropic, etc. are putting into local infrastructure. But as far as I know they aren't and there are probably a few political blockers for that.
Without infrastructure, their role is being a niche player in these markets. It doesn't really matter how good they are if they can't scale to most of the market.
I mean, Western providers such as Fireworks AI/Microsoft Foundry (US) or Tensorix (EU) already are offering many of these models on their own hardware with all the typical compliance boxes ticked through a standard API. Either as open weight models or through partnerships with Chinese firms, or both. DeepSeek etc. do not have to do anything here other than making their models available to Western partners (either as open weights or through a licensing agreement).
They'll still have their dedicated enterprise customers. I think the Chinese providers will pull more of the single users who're paying their own way, than those backed by company budget. And it's a pretty good split as the demand becomes better distributed, resulting in better service (I'll never forgot must how bad access to Claude became until they got access to Colossus) and less potential for lock-in (we really don't want there to be a duopoly, etc on good AI).
The point is that, with a sufficiently complex setup (with skills, MCPs, prompts, etc.) the difference in AI models will impact the quality of work. You might not care now, but you might care when you have 2 million lines of code and zero idea whats going on.
The point is vendor lock-in. The vibe coding community has reinvented vendor lock-in and is bound to repeat every mistake associated with it.
Pretty much every single detailed prompt made after trial, error, and refinement is tailored to a specific LLM. They will all perform worse used with other LLMs than a similar prompt tailored for the second LLM would perform, and at times quite poorly.
How well would it work to ask the working LLM to rewrite the prompt to get the best results? Do the models understand enough about themselves to do that?
Claude has a /product-self-knowledge skill, and I am sure the others have something similar. So yes, it is possible if you work with care, as necessary with all things LLM related. There are hundreds if not thousands of skills on github that were created just this way.
It's not like you aim to do it, you are just in a feedback loop improving results for the tool you are using. It is inherent in any prompt developed through iteration.
You can use Codex as an orchestrator and claude code via mimo/deepseek api as executor. I've read this a lot before but when you really try it, it is really something in the way you can stretch your credits.
It runs like shit though in terms of tokens/second and still has a reduced context window. Vs a single claude prompt can easily get into 300k tokens without breaking a sweat.
I want local AI to be a thing but the hardware isn’t here yet, because the only options are a Mac Studio or DGX machines strapped together. RAM prices needs to crash before local AI has a chance at actually competing.
The more recent Chinese models are no longer heavily limited by context size. It can easily fit in RAM on a prosumer laptop. (You can also use swap space to extemd that, since context is only written to once per inference, thus a relatively mild wear-and-tear concern.)
You’re right, and it feels like these people saying otherwise either don’t use these tools professionally (and therefore can’t tell a difference between local/cloud models) or literally just haven’t tried running local models
As soon as I can buy hardware for less than 5k that runs an opus 4.6+/5.5 model locally I will do it instantly
reply