DeepSeek Permanently Reduces The Price Of Its Flagship V4 Model By 75 Percent

jaykrown@lemmy.world · 1 day ago

Aceticon@lemmy.dbzer0.com · 3 hours ago

It would be hilarious if Chinese companies were the ones that punctured the investment bubble around AI in America.

Razen@lemmy.world · 3 hours ago

Are they eating the cost? How are they able to do it while others are unable to?

Wispy2891@lemmy.world · 2 hours ago

easy: users pay the difference with their data

magic_smoke@lemmy.blahaj.zone · 29 minutes ago

Yeah, because American tech companies never spy on you or steal your data…

Kynsey@lemmy.ml · 9 minutes ago

LOL right? Like, does the other person not realize that unless you’re running a local LLM or using something like duck.ai, ALL of the AIs are training on your convos? It reminds me of all the “Chinese Surveillance” fearmongering around TikTok, as if Meta and Instagram don’t do the exact same thing.

BeMoreCareful@lemmy.world · 2 hours ago

I just stumbled upon game Jesus asking some of the same questions

https://youtu.be/1H3xQaf7BFI

plz1@sh.itjust.works · 7 hours ago

DeepSeek never said it was permanent in their pricing materials, the article writer did. They are just taking the current expiration date off an existing discount. It’s absolutely a shot across the bow at Claude, OpenAI, et al., but the author was click-baiting, as is tradition.

ammonium@lemmy.world · 2 hours ago

No?

The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC.

https://api-docs.deepseek.com/quick_start/pricing

ayyy@sh.itjust.works · 19 hours ago

“Permanently” lol it’s a subscription and the terms say they can change the price at any time. How is it legal for them to advertise with the word “permanent”?

BlackLaZoR@lemmy.world · 13 hours ago

lol it’s a subscription

It’s actually API access price, and it’s charged per input + output tokens. $0.87 per million tokens is damn cheap.

They probably have super cheap electricity, and it’s possible they use cheap Chinese AI chips for inference.
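To put per-token billing in concrete terms, here is a rough sketch of the arithmetic. It assumes a single blended rate of $0.87 per million tokens applied to input and output alike, which is a simplification: providers commonly bill input and output at different rates. The function names are made up for illustration.

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 usd_per_million: float = 0.87) -> float:
    """Cost of one request, assuming one blended per-token rate (a simplification)."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1_000_000 * usd_per_million


def price_after_discount(list_price: float, discount: float = 0.75) -> float:
    """A 75% discount leaves 1/4 of the list price."""
    return list_price * (1 - discount)


if __name__ == "__main__":
    # A request with 3,000 prompt tokens and 1,000 response tokens.
    print(f"${api_cost_usd(3_000, 1_000):.5f}")
    # A hypothetical list price of $4.00 after a 75% cut.
    print(f"${price_after_discount(4.0):.2f}")
```

At that rate, even a million-token day of heavy use stays under a dollar, which is the point being made about it being cheap.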

ragebutt@lemmy.dbzer0.com · 6 hours ago

China is expanding its energy supply so fast that the USA simply cannot compete. Even if the data centers all get built tomorrow, they will soon bottleneck because energy demands can’t be met in a timely manner. The median time to get a new power plant online is 5 years. Meanwhile China is investing heavily not only in expanding its grid, but in expanding into renewable energy. They added 8x the power to their grid that the US did in 2023 alone, and if anything their pace has risen since then. Their renewable grid is 3x the size of the entire US grid.

In terms of raw performance, US firms were months ahead, and that gap is shrinking. Dola-seed is ranked second behind Opus by US firms, with a gap of under 3% in benchmark performance.

This closing performance gap plus energy superiority is ultimately why DeepSeek v4 pro outperforms Opus 4.6 on value. Opus is the clear winner on quality, but not by a very appreciable amount, and it ranges from 11-26x more expensive. China’s hardware isn’t more efficient, but their energy superiority puts them way ahead: their cloudmatrix uses well over 100% more energy than nvidia g200, but their energy costs are sometimes as little as 1/8th of American costs per kWh.

The race to superiority here ultimately comes down to this: does America substantially update and expand its grid before China’s domestic chip manufacturing bridges the hardware gap that has been created by things like export controls? My money is on China here; Huawei, SMIC, etc. have an engineering problem that is rapidly being addressed with gigantic state sponsorship (and frankly the major bottleneck is EUV lithography, which they are actively pursuing, though this is an issue that even with tens of billions will take many years to catch up to the West). While those barriers are real, the American barriers are an extremely complex regulatory system (which is ultimately why Trump is being directed to gut everything in terms of environmental and worker protections), funding (the oligarchs want this but not enough to part with their money, they want us to fund it), and the fact that, unlike China, the US drastically changes trajectory every 4-8 years.
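The energy comparison above can be checked with back-of-the-envelope arithmetic. This is only a sketch using the figures quoted in the comment (at least 2x the energy, electricity at as little as 1/8th the price per kWh); it ignores hardware, cooling, and every other cost, and the function names are made up for illustration.

```python
def relative_energy_cost(energy_ratio: float, price_ratio: float) -> float:
    """Energy bill relative to the baseline: (energy used) x (price per kWh)."""
    return energy_ratio * price_ratio


def break_even_energy_ratio(price_ratio: float) -> float:
    """How much more energy can be burned before the bill matches the baseline."""
    return 1 / price_ratio


if __name__ == "__main__":
    # 2x the energy at 1/8th the price per kWh.
    print(relative_energy_cost(2.0, 1 / 8))
    # Energy multiple at which the cheaper electricity stops being an advantage.
    print(break_even_energy_ratio(1 / 8))
```

Under those assumptions the less efficient hardware still pays about a quarter of the baseline energy bill, and it could use up to 8x the energy before the advantage disappears.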

Eager Eagle@lemmy.world · 18 hours ago

60% of the time it works every time

Greyghoster@aussie.zone · 14 hours ago

Don’t use any of them much and from my limited experience they all seem to be pretty much the same. In fact DeepSeek probably has been a little better than ChatGPT.

blargh513@sh.itjust.works · 12 hours ago

As long as you don’t mind them harvesting every tiny bit of data you feed it.

I don’t like the big US players, but at least they’re doing a tiny bit to keep out of your shit. Deepseek is not pretending at all. I suppose it’s at least honest and the price point is REALLY tempting. Openclaw gets expensive fast with the number of tokens it consumes. I burned through $30 in two days with it using Claude Haiku/Sonnet. Plugging it into a cheap LLM is a nice idea, but no thanks.

elucubra@sopuli.xyz · 8 hours ago

I don’t like the big US players, but at least they’re doing a tiny bit to keep out of your shit.

Oh, bless your heart, you sweet summer child.

Greyghoster@aussie.zone · 7 hours ago

Really depends on your point of view. Personally I see the US AI push as a maximum harvest and it is hard to see the Chinese as being worse. The US has really gone flat out destroying whatever credibility and moral authority they may have had. As I said I don’t use the technology that much and the queries are pretty innocuous, so it may be different for others.

bestbry@lemmy.world · 2 hours ago

I personally prefer to hand my data to the Chinese. The US is an evil place with evil people nowadays. The Chinese never did anything against me personally.

InFerNo@lemmy.ml · 16 hours ago

I think it’s meant to convey that it’s not a temporary deal on the old price, but a permanent new price point.

ayyy@sh.itjust.works · 14 hours ago

What is the effective difference? It’s not like they’re offering long term contracts.

aceshigh@lemmy.world · 20 hours ago

Prices are funny. At my last job we were charging clients extra for doing a thing that didn’t cost us anything and was fast to do. How much we charged was completely arbitrary and depended on the partner’s mood. It’s all made up, folks.

VAK@lemmy.world · 8 hours ago

So it depends on willingness to pay, not cost

Buddahriffic@lemmy.world · 20 hours ago

Yeah, which is why the “if minimum wage increases, so will prices” argument is BS. They were going to charge the highest price they thought they could either way; the difference is that they are forced to increase the amount that goes to the people they are trying to pay the least.

Rioting Pacifist@lemmy.world · 16 hours ago

There is an element of a minimum wage increase raising prices, because now there are more people who can afford to pay for things.

But yes it isn’t because costs go up, and it really only applies to things people on minimum wage can afford and it’s always less than the increase in wages.

aceshigh@lemmy.world · 19 hours ago

This would impact the company’s P&L though, so shareholders and the C-suite will get less money. That’s why they’re scaring people into not wanting to increase wages.

badgermurphy@lemmy.world · 13 hours ago

The hilarious irony is that even that is not conclusive. There are plenty of studies, both real-world and contrived, indicating that employers paying more, broadly speaking, yields returns in excess of the added payroll costs.

Not only are there more customers, but increasing pay increases the quality and quantity of labor output.

MalReynolds@slrpnk.net · 1 day ago

The lower prices could be aimed at undercutting the competition.

Mobster voice: Sure would be a pity if the monetization potential of those 2 huge IPOs (3 if you count SpaceX with xAI deadweight rolled in) went boom when that’s all that’s holding your economy out of recession (depression depending on how they cook the books).

chilldrivenspade@lemmy.world · 1 day ago

“permanently” means nothing when it comes to technology

Valmond@lemmy.dbzer0.com · 2 hours ago

I see you have never made a temporary fix in software.

Rioting Pacifist@lemmy.world · 21 hours ago

All numbers in AI are made up. It’s wild to see tankies glaze DeepSeek’s fake numbers while being skeptical of Western corporations’ numbers.

BlackLaZoR@lemmy.world · 13 hours ago

All I see is a good and cheap model. It doesn’t even have to be perfect, just in the ballpark of mainstream models.

Rioting Pacifist@lemmy.world · 12 hours ago

Them cutting consumer prices doesn’t show that though.

It’s wild that people normally critical of AI boosting will drink Koolaid if it’s China flavored

BlackLaZoR@lemmy.world · 6 hours ago

Most people don’t care about Tiananmen Square and politics. And cheap =/= bad. It’s a fallacy.

Rioting Pacifist@lemmy.world · 6 hours ago

WTH are you on about?

Does being a tankie cause brain damage?

BlackLaZoR@lemmy.world · 3 hours ago

What Tankie brain damage? I use tools that do the job at low cost.

Sektor@lemmy.world · 7 hours ago

If someone beats up your bully you have sympathy for them, regardless of the reasons why they beat them.

Calfpupa [she/her]@lemmy.ml · 21 hours ago

It’s not glazing when it’s simply enjoying watching China beat the US at its own game

Rioting Pacifist@lemmy.world · 20 hours ago

But the numbers are fake, so it really doesn’t mean much to reduce a fake number by 75%, it isn’t an indicator that DeepSeek is beating anyone at anything.

Calfpupa [she/her]@lemmy.ml · 17 hours ago

What do you mean by the numbers are fake? Are you saying the worth is overinflated? If that’s the case, of course it is, not too different from virtually any other commodity.

Rioting Pacifist@lemmy.world · 16 hours ago

What does that number meaningfully tell us about DeepSeek doing well?

They can afford to lose more money on this? They have lower operating costs? They have a better way to make money off their users?

It could indicate any/all/none of these.

BlackLaZoR@lemmy.world · 13 hours ago

No wonder. Since DeepSeek has an open license, they have to compete with 3rd party providers, and in the case of the smallest models, with local generation.

yesman@lemmy.world · 23 hours ago

I’m unfamiliar with AI chatbots that you pay for. What is a token?

mic_check_one_two@lemmy.dbzer0.com · 21 hours ago

A token is basically just a word. Know how your phone’s auto suggest tries to anticipate the words you want to use as you type? In this case, your phone is using an extremely small token amount (typically only the previous two or three words you have typed) to try and predict your next word, which would also be a token. Your phone only uses a few tokens at a time, because as token count rises, processing requirements also quickly balloon.

And AI chat is basically the same concept, but with a massively inflated token limit. Instead of looking at your previous two or three words, it looks at entire conversations. And it also uses tokens to generate responses, the same way your phone is using one token at a time to predict your next word.

So when you pay for tokens, you’re essentially paying for a word count. As you continue a conversation, the token requirement for each subsequent request will increase, because it is attempting to look at the entire context of the conversation you have had.

Models have built-in token limits, to put a cap on how much memory is required to run the model. As conversations stretch on and you reach the model’s token limits, it will begin losing context for things that happened earlier. It will try to summarize earlier parts of the conversation to shorten them but keep relevant pieces in memory, or it will just outright drop old parts of the conversation and “forget” that context, the same way my phone has already forgotten the start of this sentence.

It’s a little more complicated than “each word is a token”, because the chatbot will combine your prompts with its own internal systems. Especially as conversations stretch on, and it begins to summarize old parts to keep them in memory. But that’s the most straightforward way to explain it.
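A toy sketch of the two ideas above: the input token count grows every turn because the whole conversation gets resent, and the oldest messages get dropped once a limit is hit. It assumes one token per whitespace-separated word, which real tokenizers do not do (they split text into subword pieces), and the function names are made up for illustration.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def input_tokens_per_turn(messages: list[str]) -> list[int]:
    """Input tokens sent on each turn when the full history is resent every time."""
    totals = []
    running = 0
    for message in messages:
        running += count_tokens(message)
        totals.append(running)
    return totals


def fit_to_context(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit the limit; older ones are 'forgotten'."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    return kept


if __name__ == "__main__":
    chat = ["hello there", "how are you today", "fine thanks"]
    print(input_tokens_per_turn(chat))
    print(fit_to_context(chat, max_tokens=6))
```

With a limit of 6 toy tokens, the first message falls out of the window, which is the "forgetting" described above. Summarizing old messages instead of dropping them is the other strategy mentioned, and it is not shown here.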

Peruvian_Skies@sh.itjust.works · 23 hours ago

In very simple terms, a token is more or less a word. You pay per input and output tokens (your prompts and the answers) as they correlate the most closely to the energy expended by the LLM to process your request.

Mwa@thelemmy.club · 23 hours ago

Still gonna self host it instead (maybe)

qaz@lemmy.world · 17 hours ago

FYI the flash model is ~158 GB

Mwa@thelemmy.club · 16 hours ago

The distilled models?

byte_0verflow@lemmy.ml · 1 day ago

Thank you daddy Xi

(des)mosthenes@lemmy.world · 23 hours ago

https://youtu.be/LB4QmlVYHAo westside gunn adlibs