@Shapeshifter , I thought more deeply about your comments as they bothered me somewhat, and I wanted to refresh my understanding of AIM's approach in this space. I went back to my notes on the SM Meeting and the AIM AGM - extracts of the notes plus a bit of context around the key statements (bold, italics) are below.
It does appear to me that:
- the approach is to extend the current moat in live captioning into Audio Description and Voice - that makes complete sense in terms of leveraging an existing moat into adjacent markets.
- AIM does a lot more in Audio Description than just the translation piece - the end-to-end process is an involved one that, per AIM, takes ~25 hours of human effort, which LEXI Audio Description will fully replace and improve on - the capability to contextualise is key
- the market for Audio Description is still the big-end media/government customers
- AIM does not compete from a technology perspective but integrates what's out there - if VLC further expands its capability, it could well be used in future AIM toolkits/ecosystems
Will keep the VLC risk on my radar. Never say never, for sure, but at this stage I don't quite see the risk impact.
--------------
SM Meeting
The big prize is the $69b grey space, which is to use AI to deliver products that involve voice and dubbing - 2 products to be delivered in Oct
- Audio Description - the equivalent of captioning for deaf people, but for blind people - describes the visual elements of the storytelling that a blind person would not pick up but that are really important to the storyline and are not actually said aloud, e.g. Clark Kent goes into the phone booth, spins around and then comes out with ....
- Traditional process for Audio Description is 25 human hours for 1 hour of program content
- Legislated minimums for audio description exist in the UK and Europe; not yet in Australia, but legislation is about to be introduced - currently about 14 hours a week on Govt-funded channels, e.g. ABC, SBS
- The media industry has no appetite to add extra costs to struggling media organisations - there is frustration in the blind community about how to get access to these services
- AIM’s offering, LEXI Audio Description, is fully automated - it uses Gen AI not just to read the audio, but to read the vision, understand the text, and make the decisions that were previously made by humans, i.e. identify the salient elements of the visual storytelling that are not described in the dialogue, find a gap in the audio to insert the description into, write the script for it, have a voiceover artist read it in, mix the audio, and make it available as an audio track - these steps are what make up the 25 hours today (see the sketch after this list)
- Expect LEXI Audio Description to massively increase the TAM because it will be a case of “why wouldn't I just do this?” as the economics become very compelling
- While the blind community is not a large one (the deaf/hard-of-hearing population is about 10x the blind population), the market is actually the media organisations that are or will be compelled to provide audio description content
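Not from the meeting notes - just to make that end-to-end workflow concrete for myself, below is a toy Python sketch of one slice of it: given dialogue timings and salient visual events, find gaps in the audio long enough to carry a spoken description and place each description into a gap. All the names and the gap/placement logic are my own assumptions for illustration, not AIM's LEXI implementation; the real product also does the vision analysis, script writing, voicing and mixing around this step.

```python
# Toy sketch (mine, not AIM's code) of the "find a gap and place the description"
# step of an automated audio-description workflow. Dialogue timings and visual
# events are given here as simple inputs; in reality they would come from
# speech-to-text and vision models.

from dataclasses import dataclass

@dataclass
class Span:
    start: float  # seconds into the programme
    end: float

@dataclass
class VisualEvent:
    time: float       # when the on-screen event happens
    description: str  # the script that would be written for it

def dialogue_gaps(dialogue: list[Span], programme_end: float, min_len: float = 2.0) -> list[Span]:
    """Silent spans between dialogue segments long enough to voice a description."""
    gaps, cursor = [], 0.0
    for seg in sorted(dialogue, key=lambda s: s.start):
        if seg.start - cursor >= min_len:
            gaps.append(Span(cursor, seg.start))
        cursor = max(cursor, seg.end)
    if programme_end - cursor >= min_len:
        gaps.append(Span(cursor, programme_end))
    return gaps

def place_descriptions(events: list[VisualEvent], gaps: list[Span]) -> list[tuple[Span, str]]:
    """Assign each salient visual event to the nearest gap at or after it."""
    placements = []
    for ev in events:
        candidates = [g for g in gaps if g.end > ev.time]
        if candidates:
            gap = min(candidates, key=lambda g: max(g.start - ev.time, 0.0))
            placements.append((gap, ev.description))
    return placements

if __name__ == "__main__":
    dialogue = [Span(0, 10), Span(14, 30), Span(38, 55)]
    events = [
        VisualEvent(12.0, "Clark Kent steps into the phone booth and spins around."),
        VisualEvent(33.0, "He emerges in the Superman suit."),
    ]
    for gap, script in place_descriptions(events, dialogue_gaps(dialogue, programme_end=60.0)):
        print(f"{gap.start:.1f}s-{gap.end:.1f}s: {script}")
```

Even this toy version shows why contextualisation matters: the hard part is deciding which visual events are salient and writing scripts that fit the available gaps, which is exactly where the 25 human hours used to go.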
Competition
- AIM does not compete from a technology perspective - it integrates and USES the evolving technology to improve its offering within its customers' workflows
- AIM DOES NOT DO THE AI ENGINE - it does not invest in speech-to-text engines
- It invests instead in the API calls that allow AIM to connect its customers' signals with the latest and greatest AI engines - it is not betting on any one AI text engine to win, and it has and uses dozens of AI text engines in the cloud (as sketched below)
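Again my own illustration rather than anything AIM showed: architecturally, "not betting on any one engine" usually looks like a thin adapter layer in front of interchangeable cloud engines, so the customer-facing workflow stays stable while engines are swapped as better ones appear. The vendor names and classes below are placeholders, not AIM's actual integrations.

```python
# My own illustrative sketch of an engine-agnostic integration layer.
# The vendor classes are placeholders; in practice each would wrap a real
# cloud speech/text API behind the same interface.

from abc import ABC, abstractmethod

class SpeechToTextEngine(ABC):
    """Common interface so the captioning workflow never depends on one vendor."""
    @abstractmethod
    def transcribe(self, audio_chunk: bytes, language: str = "en") -> str: ...

class VendorAEngine(SpeechToTextEngine):
    def transcribe(self, audio_chunk: bytes, language: str = "en") -> str:
        return "transcript from vendor A"   # would call vendor A's cloud API

class VendorBEngine(SpeechToTextEngine):
    def transcribe(self, audio_chunk: bytes, language: str = "en") -> str:
        return "transcript from vendor B"   # would call vendor B's cloud API

class CaptionWorkflow:
    """Customer-facing workflow; the engine can be swapped without touching it."""
    def __init__(self, engine: SpeechToTextEngine):
        self.engine = engine

    def caption(self, audio_chunk: bytes) -> str:
        return self.engine.transcribe(audio_chunk)

if __name__ == "__main__":
    workflow = CaptionWorkflow(VendorAEngine())
    print(workflow.caption(b"...audio..."))
    workflow.engine = VendorBEngine()  # a better engine appears: swap it in
    print(workflow.caption(b"...audio..."))
```

On that model the durable asset is the customer connection and the workflow, not any single engine - which is consistent with the moat argument above.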
AGM
The addressable market for the LEXI VOICE is significantly larger than for LEXI captions. By continuously incorporating the latest advancements in AI, we continue to offer market leading solutions, powered by the same architecture that has delivered success in the US Broadcast market, further strengthening and extending our defensive moat anchored in LEXI over iCap with secure access to customer data.
A major milestone was reached in November 2023 when LEXI Live overtook the quality of legacy human captioning achieving an average score of 98.7% on the NER scale, which is an accepted industry method for determining accuracy of captions. In the last 12 months, the quality of LEXI Live has continued to increase, with results now in excess of 99.1%. Because of this sustained quality improvement momentum in LEXI, less than 20% of our business will involve human curation by December 2025.
Over the period to FY29 we anticipate LEXI superseding humans (as it first did with LEXI Live Captioning in 1H24) in all other AI-driven language products serviced by our LEXI Toolkit – including the category that dominates the language services market by value - VOICE.
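For my own reference, not from the AGM: the NER scale referred to above is generally calculated as (N − E − R) / N × 100, where N is the number of words in the captions, E the edition errors and R the recognition errors (weighted by severity). A rough sense of what 98.7% vs 99.1% implies for a 1,000-word segment, with error counts that are illustrative only:

```python
# Rough illustration of the NER caption-accuracy metric (not AIM's scoring tool).
# NER = (N - E - R) / N * 100, where N = words, E = edition errors, R = recognition errors.

def ner_score(words: int, edition_errors: float, recognition_errors: float) -> float:
    return (words - edition_errors - recognition_errors) * 100 / words

# Illustrative error weights for a 1,000-word segment:
print(ner_score(1000, edition_errors=7, recognition_errors=6))  # -> 98.7
print(ner_score(1000, edition_errors=5, recognition_errors=4))  # -> 99.1
```

The point being that the move from 98.7% to 99.1% is a meaningful reduction in residual errors, which is what underpins the claim about needing human curation on less than 20% of the business.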