Yves here. In the excitement over DeepSeek, this post puts a needed reminder of an important AI issue front and center: that AI does not have enough original human content to make for adequate training sets and is therefore often training on AI-generated material. In other words, this is massive, institutionalized garbage in, garbage out.
By Kurt Cobb, a freelance writer and communications consultant who writes frequently about energy and environment. His work has also appeared in The Christian Science Monitor, Resilience, Le Monde Diplomatique, TalkMarkets, Investing.com, Business Insider and many other places. Originally published at OilPrice
- DeepSeek’s efficient and affordable AI model disrupts the market, threatening the profitability of established AI developers.
- The widespread adoption of AI, fueled by DeepSeek’s model, could lead to an information crisis as AI systems increasingly rely on AI-generated content.
- Despite increased efficiency, the demand for AI and electricity will likely continue to grow, driven by new applications and broader accessibility.
In 1865 British economist William Stanley Jevons explained to the public that increased efficiencies in the use of resources per unit of production do not generally lead to lower consumption of those resources. Rather, these efficiencies lead to higher consumption, as many more people can now afford the more efficiently produced goods, which carry a lower price tag. Jevons was referring to coal, the cost of which was falling and demand for which was rising due to increased efficiencies in production. His idea became known as The Jevons Paradox.
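To make the mechanism concrete, here is a minimal numeric sketch (all figures invented, and assuming a constant-elasticity demand curve, which is not something Jevons specified) of how greater efficiency can raise total resource use:

```python
# Toy illustration of the Jevons Paradox with invented numbers.
# Demand is modeled as constant-elasticity: quantity = k * price**(-elasticity).

def total_resource_use(resource_per_unit, elasticity=1.5, k=100.0):
    price = resource_per_unit             # assume price tracks resource input
    quantity = k * price ** (-elasticity) # cheaper goods, more buyers
    return quantity * resource_per_unit   # total resource consumed

before = total_resource_use(resource_per_unit=1.0)  # 100.0
after = total_resource_use(resource_per_unit=0.5)   # efficiency doubles
print(before, after)  # 100.0 vs ~141.4: total resource use RISES
```

With elasticity above 1, the price drop stimulates more than enough new demand to outweigh the per-unit savings; with elasticity below 1, total use would fall instead, which is why the paradox only bites when demand is sufficiently responsive to price.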
When the China-based artificial intelligence (AI) upstart DeepSeek demonstrated last week that complex and powerful AI can be delivered for a tiny fraction of the cost and resources of current AI tools, DeepSeek’s competitors cited The Jevons Paradox and told investors not to worry. Demand for AI would now grow even more rapidly in response to greater efficiencies and thus lower costs.
What those competitors failed to mention is that DeepSeek’s breakthrough is great news for buyers of AI tools, but very bad news for the current developers who sell those tools. DeepSeek is giving away for free, or selling at only about 3 percent of competitors’ prices (for those needing application programming interface services), something comparable to its competitors’ very expensive products. This suggests that the hundreds of billions of dollars spent developing those expensive tools may have just gone up in smoke. That investment may never be recouped.
Moreover, DeepSeek has shown that its powerful AI tool can run on a laptop, so vast cloud computing resources are unnecessary in many cases. In addition, DeepSeek’s AI tool is open source and can be freely distributed. This means anyone can see the code, customize it, perhaps improve upon it AND make money off the improved or customized version. And, because anyone can see the code, anyone can see how DeepSeek achieved such efficiencies and design their own AI tool to match or exceed them.
The one thing the big AI developers are right about is that at these new prices (free or nearly free) the demand for AI is likely to grow much more rapidly as it is applied to situations where AI was previously too expensive to justify—just as The Jevons Paradox suggests. And that means it is probably wrong to think that these vast new efficiencies will eliminate the need for large expansions of electric generating capacity. The demand for additional generating capacity will still be there. It may just rise at a slower rate than previously forecast.
This is NOT an endorsement of what is about to happen. In fact, the more rapid spread and even wider use of AI is likely to create problems at a faster rate. More efficient and broader use of AI means that the human sources of information will be driven from the marketplace even sooner—the very ones that are essential if AI is to have real information from informed experts and writers. What comes next is AI feeding on AI-generated information, a kind of digital cannibalism that will not end well.
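The statistical version of this digital cannibalism is what AI researchers call “model collapse.” A toy simulation (my illustration, not the author’s, with invented numbers) shows the direction of travel: fit a simple model to data, generate synthetic data from the fit, refit on the synthetic data, and repeat.

```python
import random
import statistics

# Toy "model collapse": each generation fits a normal distribution to
# samples drawn from the previous generation's fit, standing in for an
# AI trained on AI-generated content.
random.seed(1)
mu, sigma = 0.0, 1.0                    # generation 0: "human" data
for gen in range(1, 51):
    samples = [random.gauss(mu, sigma) for _ in range(20)]
    mu = statistics.fmean(samples)      # refit on the synthetic data
    sigma = statistics.pstdev(samples)  # MLE std, biased slightly low
    if gen % 10 == 0:
        print(f"generation {gen:2d}: sigma = {sigma:.3f}")
# sigma tends to drift toward zero: each generation preserves less of
# the original variation, until the model mostly repeats itself.
```

A large language model is vastly more complicated than a fitted bell curve, but the mechanism points the same way: every pass through machine-generated data throws away some of the tails, and the tails are where the informed experts and writers live.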
It’s worth noting that expertise does not actually reside on the page. It resides in the minds of a community of interacting experts who are constantly debating and renewing their expertise by evaluating new information, insights and data from experiments and real-world situations.
When the information generated by this kind of expertise is gone from the web or at least crippled, what kind of nonsense will AI tools spew out then? One thing is almost certain: The nonsense will now come more quickly and from more and more of the systems we rely on. That’s hardly a comforting thought.
Well, FWIW, a number of AI models, including DeepSeek, passed the Japanese National Medical Examination, with scores above 95%. I would say that is not too shabby.
AI-inspired digital cannibalism is baked into how it works: learning from existing data. Apologies for resorting to economics jargon, but the Production Possibilities Curve (concave to the origin) showing, for instance, the trade-off between cell-phone size and battery life is well understood and can conceivably provide good “input” data for an AI.
However, some of us cut our teeth on points NOT on the PPC: conceptualising emerging technologies that are not yet commercially available. This kind of thinking is what led to game-changers like the iPhone and partly explains why the Nokia brick went from hero to zero.
AI cannot, by definition, give us input data on goods that don’t exist yet. We must construct (often expensive) specialised choice models and other surveys to help the consumer understand how a “hypothetical but close to production” product could change their lives. Human sources of info/satisfaction will indeed be pushed out if we go all in on AI rather than do the kind of research that is all about “what if?” We are currently in danger of getting “locked in” to the PPC, when human ingenuity has so often in the past caused us to move “north-east” of the PPC and consider what could be done.
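To put a picture behind the jargon, here is a minimal sketch (with an invented frontier, purely illustrative) of a PPC concave to the origin: points on or inside it are the observed trade-offs an AI can learn from; points “north-east” of it are the not-yet-existing products for which no training data can exist.

```python
# Illustrative production possibilities frontier (invented numbers):
# the best battery life achievable at a given phone size under current
# technology. Modeled as a quarter circle, so it is concave to the
# origin: shrinking the phone costs battery life at an increasing rate.

def max_battery_hours(size_cm3, limit=10.0):
    """Frontier: size**2 + battery**2 = limit**2."""
    return (limit ** 2 - size_cm3 ** 2) ** 0.5

def feasible(size_cm3, battery_hours):
    return battery_hours <= max_battery_hours(size_cm3)

print(feasible(6.0, 8.0))   # True: on the frontier, observable today
print(feasible(6.0, 9.5))   # False: "north-east" of the PPC, a product
                            # that does not exist yet, hence no data
```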
I am totally ignorant here and naively ask: to reduce this problem, might there be any way to ban AI-generated content from web sites or any other publicly available media? I know the answer is no, unless AI-generated content were forced to wear a tag.
Several YouTube channels I watch now carry an on-screen disclaimer to the effect that “This video contains no AI-generated content”. There have been appreciative comments. However, as you’d expect, not all such channels are telling the truth: subtle inflections or full-blown pronunciation mistakes in the narration have alerted me to AI. That’s before you even get to odd sentence structure that only a native English speaker might spot.
There is a big issue at the moment concerning the proliferation of science channels that are simply AI slop scraped from legit channels. I’ve adopted a zero-tolerance strategy: once a channel uploads a video I suspect is AI and not based on proper first-hand research, I unsubscribe and tell YouTube not to recommend that channel to me again.
While the author is incautious in shifting his focus, a laptop cannot store an elaborate data set: two terabytes or thereabouts is the most you can expect. That storage is trivial for a cloud data server, which will require large energy resources. The cloud will still be required for the most robust, if suspect, access.
At least that’s the way I see it, but I am continually outpaced by the possible.
> that AI does not have enough original human content to make for adequate training sets and is therefore often training on AI-generated material.
This means AI will never have enough original human content. All content subsequent to last year must be considered to have been corrupted by AI, whether directly or by infecting someone’s brain.
It was amazing in the ’70s when the entire Library of Congress was able to fit in a few file cabinets of microfiche. Then came laserdiscs! Diderot’s dream was to put all human knowledge in the Encyclopédie. Which is silly, since it didn’t include the names of everyone’s pets; “Knowledge”, then, is whatever knowledgeable folks define it to be.
AI is useful for things like submarine detection, where the data is generated automatically, and of such quantity as to overwhelm humans. It’s the human content thingy that’s the problem, both technically and at a metaphysical level. I do not want a solar-powered Elon2 mimic floating in space implementing glitchy Bond-villain decrees through the payment system. The real thing is unstable enough.