Forum Topics Generative AI will continue to get cheaper
Ipsum
Added 4 weeks ago

There was a time when making a phone call cost money, and making a long-distance call cost even more. It doesn’t anymore, and I think the cost of generative AI will follow the same trend.

AI startups are all losing money, but that’s not the whole story. One of the surprising statements I’ve been hearing from the frontier AI labs is that their inference is generally profitable.


Sam Altman says "We're profitable on inference".



Jensen Huang in a podcast interview

… I’m so pleased that these tokens are now profitable, that people are generating, I heard somebody, or heard today that open evidence, speaking of them, 90% gross margins. I mean, those are very profitable tokens. […] Cursor, their margins are great. Claude's margins are great. For the enterprise use of Open AI, their margins are great. 

Training a model costs a staggering amount of money, but inference — generating content using that model — appears to be profitable much of the time. On top of this, the cost of tokens is going down rapidly. Newer models are better, but also cheaper to use.



Token usage is going up, however. More advanced reasoning models use many more tokens internally than appear in the final output you see.

But there are also alternative models that are even cheaper to use than the frontier models. Small models can be trained, either from scratch or distilled from bigger, more sophisticated models, and tailored to specific tasks. These small models won’t compete with ChatGPT or Claude but can be very useful on constrained tasks. Meta has a custom model they use for generating ads. It would not be very useful for writing code but excels at generating text for ads. Meta runs this in their own data centres, so I expect it is far cheaper than what they would pay Google or OpenAI for a similar service.

Most companies won’t be able to do that yet but I believe we will see smaller models trained for specific tasks as the cost comes down and the infrastructure emerges to do this more cheaply.

I keep seeing and hearing people say the AI labs are losing money, so this can't last. I think they are pattern matching against startups like Uber and Doordash, which initially subsidised prices in an attempt to gain market share and undercut the incumbent taxis. They have since raised prices, so there is some expectation the AI startups will, too. Maybe they will, but the delivery startups needed human bodies to drive things around. The AI labs are constrained on electricity and chips, which is a much better problem to have: you can scale those up eventually, and at the same time the efficiency of the hardware and software can improve.

If you’re assuming generative AI costs will rise, consider what it would look like if they became cheaper, or even free.


16

Raseekingalpha
Added 4 weeks ago

That's a great thought. The only thing I would add is that as it gets cheaper, consumption will also rise.

I'll give the example of phone calls again: when international calls were costly, I remember people using them sparingly. As they became cheaper, people started making international calls frequently and call durations increased exponentially. Jevons Paradox still applies, so as AI becomes more useful we will start generating more work for humans to do along with AI.

15

Chagsy
Added 4 weeks ago

Great point @Ipsum

As a counter, I would point out that you use an analogy of phone calls, but then say that the mental model of Doordash and Uber is incorrect. Why would phone calls be a better model?

Secondly, SA has a vested interest in convincing investors that he has a profitable company (should he choose to run it that way). My question is “when does the training stop?” I suspect it never does, and while the cost of training may get cheaper per token, the complexity increases exponentially. So the cost may well never come down, and that cost has to be passed on to the users of inference. JH probably has similar motives given the circular nature of these businesses’ funding.

Lastly, I can’t recommend highly enough, the recent “Boss Class” podcast series from the Economist.

It’s a lighthearted and very funny take on management issues. I found the first two series interesting and highly entertaining, but series 3 is all about AI: the impact on companies, job opportunities and society. He interviews people at the cutting edge of this issue and it’s amazing to listen to how they see the future. I drove from Brisbane to the Blue Mountains (11 hrs) and listened to the whole series in… well, series. When I had finished, I drove in silence for 3 hrs just thinking about what I had heard.

I’ve read a lot on the subject, as I’m sure we all have, and have probably been confident I understood a bit about it. I’m really not so sure anymore.

The last “bonus” interview (with Tom Blomfield, a partner at Y Combinator), which I will link below, was the clincher for me. TL;DR: nothing can escape disruption unless it already has an established network.

That’s the only moat left. All the others will go. It might happen sooner or later, but native AI apps will probably disrupt…nearly everything.


I’m not going to rush. I’m going to think about this for a while. Then I’m not going to think about it for a while longer. Then I’m going to think about it for a bit more again, and I suspect I’m going to radically alter how I view companies’ future prospects.

It really might be different this time.

Sorry Warren and Charlie

First 5 get to listen for free!



23

mikebrisy
Added 4 weeks ago

@Chagsy thanks for flagging. As an Economist subscriber, I spend much of my weekends listening to their podcasts, but had missed this Boss series.

I've listened to the first 6 episodes and found them utterly compelling. The series has taken me deeper into this topic, which obviously I, like others here, have been thinking and reading about a lot.

The series is some of the best content on AI I've come across in a while.

To create 5 more opportunities for the final interview, here's my free gift link for other StrawPeople.

First 5 get to listen for free!

Here are some of my reflections from the 6 episodes I've listened to on this rainy QLD weekend; they largely reinforce some key ideas I've been forming:

  • Over time, AI will profoundly change the nature of work
  • It will be a multi-decadal change, because AI ultimately has to be embedded in human organisations, which require adaptation/evolution
  • Some industries will change faster than others
  • Back office, white collar, processing type roles and businesses are being disrupted first
  • New types of human work will evolve (e.g. legal-technical specialist)
  • Industry incumbents who have already leaned heavily into data analytics and process automation, and have built deep and agile capabilities in these domains AND who invest strategically in building incremental AI capabilities, stand to be winners.
  • Leaders in various sectors have already been investing in relevant capabilities for years, and in some cases decades (see my post on Friday about medical imaging over the decades)
  • Change will likely progress in fits and starts
  • Management skill will be a key differentiator - no different to any other general technology adoption in history
  • Industries that rely heavily on barriers like a) trust and b) regulatory control will be slower to be disrupted, but are not immune
  • It is important not to think about AI as just about efficiency and cost reduction. It is also about improved service delivery, which can make services more widely available where previously there were barriers to adoption or access (e.g., Garfield.AI, the UK example of AI legal services giving access to legal recourse for low-level bill nonpayment)


One key quote that got me: Find the thing that you are better at than anybody else, and use AI to level up all the other capabilities you need. Or words to that effect.

I am interested to see to what extent the final 3 bonus episodes shift my thinking. What I am still less clear about is how well AI-native start-ups will be able to develop and/or acquire all of the capabilities necessary to disrupt incumbent industry leaders before these can adapt. (The Amazon vs. Walmart analogy.)

27

Chagsy
Added 4 weeks ago

Nice summary @mikebrisy

I await your impressions of the last three with interest!

Not all of them are as cataclysmic as the last, for sure. I probably posted with a degree of recency bias.

also, apologies @Ipsum - having re-read your post, I understand what you mean more clearly.

It’s really difficult to work this out and anchoring on historical analogies gives us some way of making sense of it all. I’m so unsure of which ones to use!!

best of luck everyone.

stock picking seems to be getting harder. And it was never easy to start with!


17

jcmleng
Added 4 weeks ago

@Chagsy, @mikebrisy, thanks for flagging Boss Class. Signed up and have done the first 3 episodes. I think it is very insightful because it covers the topic from a variety of perspectives, with a huge focus on the deployment of these capabilities in the real world. I found this both eye-opening and refreshing.

Similar to @Chagsy, I am so glad I am done with work. I can't even begin to comprehend the barnfights that must be happening with AI implementation - the tension between "quick wins, show results", the inherent laziness of people to do things properly, security risks, and the need for human review and guardrails - these are all diametrically opposed pressures which clash violently.

Have summarised the 5 key takeaways from each of the first 3 episodes below, each of them loaded if you stop to think about the implications of what might be ... my reactions in italics.

Episode 1: Fat Layer of Humans

  • 2025 Prediction: “Software engineers are like farmers, while AI, is like a combine harvester. The world is going to have a lot more food and a lot fewer farmers in very short order” - this really feels like the imminent end of software programmers as I knew it
  • Very, very high certainty that almost all knowledge-based work will be done at a higher level by AI within our lifetime for sure - everyone who does work on a computer - this is what @Chagsy pointed out in a separate post
  • Jagged frontier - the border between useful and useless AI is not a straight line, it's a mountain range
  • Companies that see AI as a cost cutting tool are fundamentally misreading the moment - rather than replacing employees with Gen AI systems, managers should be thinking about how they reinvent their organisations - which companies struggle to do, hence the need for IT/Management Consultants more than ever, just like in any other IT revolution
  • Second worry - a failure to reimagine how work gets done - this phrase is coming up a lot in SAAS company commentary

Episode 2: Feeling The Vibe (as in Vibe Coding)

  • Some two-thirds of coders say they use an AI tool at least once a week
  • AI models were made by people who do not fully understand your business, and your employees do not fully understand AI models, other than in software coding - this divide between the techheads and the business has always been there, since the days of Systems Integration, ERPs, Dot Com and now AI. Nothing new, but infinitely more dangerous because, of all the tech prior, AI is the easiest to self-serve and self-deploy.
  • Hard to scale models which have to work with subjective and fuzzy criteria eg. “how to make this model funnier” - a machine has no chance if humans can't articulate what this means in super clear terms to begin with
  • Vibe coding is not good for systems security - LLMs train on all the publicly available code in the world, which is not necessarily secure, so it has introduced a lot of new vulnerabilities. From the results at XBOW, you can immediately tell which code has been vibe-coded - cyber security criticality just went up many notches; LLMs are as smart or dumb as the data that is out there. I'm definitely topping up on my HACK ETF!
  • Vibe coding is a great way to turn ideas into prototypes, to demo, not memo, but it shouldn't get you to the finishing line. To avoid carnage, your work needs to be checked by a human expert. - Amen. I still can't see how vibe-coding, on its own, is going to make SAAS enterprise software completely redundant.

Episode 3: The Easy Button

  • Tasks which are repetitive but with slight variations tend to be very amenable to Gen AI
  • Make sure to count ALL the costs of deployment, especially the hidden costs - (1) how much it costs to adapt the LLM to fit the demands of the particular task, particularly when a high degree of correctness is required, and (2) the cost of a fail-safe to stop any remaining AI-generated errors from causing real-world damage - this often ends up being a human in the loop, which inconveniently gets in the way of the scalability story. Glossing over these costs is typically why project blow-ups happen - classic IT project management, true of every IT project really.
  • “I genuinely think that the people who are going to be the most successful in the coming years are those people who can resist hitting the Easy button because it will always be an option for you and it will increasingly be a tempting option for you" - this applies MORE SO to organisations and management, especially those that focus on "quick wins" rather than doing things properly
  • Treat what the model gives you as the starting point not a finished product - which is not what lazy organisations and management will do, to their detriment

17

Ipsum
Added 4 weeks ago

Thanks for the recommendation @Chagsy . I enjoy the Money Talks podcast but had missed the Boss series. I’ve been working my way through them and also enjoying the notes from others posted here. 

I agree that comments by Sam Altman and others can’t be taken at face value without weighing their own interests. One of those interests is to keep the money pouring in to sustain the training costs. But what caught my attention is the difference between a vague promise of “we’ll be profitable in a few years when we are at scale” and “we are profitable on inference”. On balance I consider it probably more likely true than not: he has made the statement several times and it is fairly specific. Other claims I’m far more skeptical of (for instance, it’s not clear to me that current generative AI will deliver the artificial general intelligence the startups are all working towards).

As @Raseekingalpha points out above, cheaper means more consumption. We are seeing that now with all the vibe coding going on. AI coding models are delivering software at a marginal cost of almost zero, which is staggering. A lot of this will be software that didn’t previously get made because it wasn’t worth it. It could be software for very specific tasks with a limited audience, or software where the value didn’t previously justify the cost. Demand for software isn’t fixed, so we are going to see a lot more of it.

At a software conference some months back I had a discussion with the owner of a company that sold software into a specific industry (finance or lending, I think). He was very enthusiastic about how much more productive his developers were and didn’t seem too worried about competitors entering the market. The biggest problem, he told me, was not moving too fast for his customers. They’d already bought and deployed the software he provided. He couldn’t ship changes too quickly because the businesses couldn’t move that fast; they don’t want to retrain staff on a new process too often. I wasn’t expecting this. As a technophile I’m always looking forward to the next update!

Larger businesses may be more willing to invest in training if they can see a clear return on the investment. But they are also slower to move and risk averse. Security remains a big concern, and generative AI introduces new kinds of threats. Prompt injection is a specific problem that hasn’t been solved yet. If you deploy an AI agent to help answer your email, how do you make sure it doesn’t reply to a scammer’s email? You can try to train it to be safe, but these models are probabilistic, not deterministic, so you can’t be 100% sure. More likely, I think, they will need close human supervision, which requires people to still be in the loop.
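To make the probabilistic-vs-deterministic point concrete, here is a minimal sketch of what "human in the loop" means in code. All the names here (draft_reply, send_with_approval) are hypothetical, not any real agent framework: the idea is just that the model may draft whatever it likes, but delivery only happens through plain deterministic code that requires explicit human sign-off, so a prompt-injected draft can never send itself.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    to: str
    body: str

def draft_reply(incoming: str, sender: str) -> Draft:
    # Stand-in for the LLM call. In reality this output is probabilistic
    # and could have been steered by a malicious incoming email.
    return Draft(to=sender, body=f"Thanks for your message: {incoming[:40]}")

def send_with_approval(draft: Draft, approve) -> bool:
    """Send only if a human (the `approve` callback) signs off.

    The gate is ordinary deterministic code, so no matter what the
    model generates, nothing goes out without a person saying yes."""
    if approve(draft):
        # send_email(draft)  # actual delivery would happen here
        return True
    return False

# A suspicious draft gets stopped at the gate when the human declines.
draft = draft_reply("Please wire $10,000 urgently", "scammer@example.com")
sent = send_with_approval(draft, approve=lambda d: False)
```

The trade-off is exactly the one above: the gate is safe because it is dumb, but it means a person has to review every outgoing message, which caps how much the agent can actually scale.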

My current thinking is that generative AI will have an impact on the software industry but won’t be the end of it. I’m looking for companies that will benefit from AI tools but are hard to dislodge from their installed customer base. Network effects are great, but switching costs can also help. The current contenders I'm watching include:

Promedicus - regulated and risk averse industry, should be able to accelerate development 

Xero - a two-sided market, which means a new entrant would have to get both accountants and their clients to switch. A new entrant could target a specific vertical but it would take time. Though I do wish they'd stop acquiring things.

AI Media - hardware encoders provide a moat. It will be some time before they are replaced.

REA - network effect: everyone has to be on it. Software should have disrupted the real estate industry years ago but hasn’t so I don’t think it will anytime soon.

Energy One and Catapult are two I need to understand more but may be candidates for my list.



12

mlwooding
Added 4 weeks ago

Yeah, I think companies like XRO have accountant lock-in. The biggest challenge for AI and SaaS companies is B2B, by which I mean enterprise. For example, the company I work for has switched out a number of SaaS solutions and is consolidating. This means Salesforce and ServiceNOW are going to be fine for years; the issue is the mid-sized SaaS players, which are seeing increasing customer churn. And the "set of steak knives" extras they used to toss in with the core product? Customers are just not switching on the steak knives, and are demanding discounts at renewal time. Expect to see acquisitions of unlisted SaaS companies as they look to be rescued. And I agree: XRO wants an application channel ecosystem, but then goes and acquires products that compete with that ecosystem. It's a dangerous high-wire balancing act.

12