In September, Google unveiled a new feature in its artificial intelligence (AI) writing platform, NotebookLM, that caused a stir in the tech press and the wider culture: the ability to take a document you upload and generate a podcast about it that sounds strikingly real and engaging.
At the moment, the feature appears to be little more than a source of fun and amusement, with podcasts whose AI hosts strive to conjure “meaning from the meaningless” going viral on social media. But beneath the surface of this delightful new curiosity, we can see glimpses of the next stage in the evolution of digital media — one that will extend the social and political effects, both good and bad, of the demise of broadcast media.
The feature’s larger significance is hard to see right now, because the feature itself is so limited. Podcasts come in a standard form: a conversation between a male voice and a female one, roughly seven to 10 minutes long, and you can choose what they focus on. But Google promises to soon offer more control over voices, format and length. What, then, is so special about this?
The breakthrough is taking place on a formal level. A number of signs point to a quantum leap in what is now possible.
In recent years, various companies have tried to solve a simple problem: how to use AI to give us a quick and affordable tool to read aloud books, articles and other documents of our own choosing. None have really succeeded. Dozens of services, such as Speechify and NaturalReader, offer relatively good text-to-speech conversion, but they’re all hobbled in the same ways. They read well but not that well. They only read; they cannot summarize or condense. And, more crucially, they’re too expensive for wide adoption: ten to fifteen dollars a month gets you about a dozen hours of narration, not enough for an entire book.
But what Google has unveiled with its podcast feature in NotebookLM overcomes all these limitations at once. The quality of the voices, the banter, the conversational ease — all are a huge cut above the calibre of realism we find in the earlier generation of AI speech tools. It’s similar to the leap in AI image generation from crude, pixelated graphics of earlier models to the sharp, photo-realistic images we now expect from the various platforms. As one commentator notes, the voices in Google’s AI podcasts are not perfect or indistinguishable from real humans, but they have reached a point of being good enough to produce audio we would choose to listen to if it brought other benefits.
Google’s provision of the feature for free, at scale, suggests it has solved the efficiency problem, or is prepared to cover the cost of it for the time being. The podcasts are, of course, relatively short, but there’s no limit to how many you can make. Competitors such as OpenAI, Meta and Anthropic will soon follow with analogous features and will be compelled to offer them for free.
But the most crucial new facet of what Google has wrought here is the linking of its podcast feature to what may be AI’s killer app itself: the ability to have a language model work on documents of your choosing. NotebookLM produces strikingly good summaries of articles, notes, slides, YouTube videos, audio files, even entire books. Putting the two together — AI summary and podcasts — gives us a glimpse of a new media paradigm: cannibalized media. A content diet consisting of things we chose to feed AI, to cannibalize, for consumption in the form of condensed audio summaries, and perhaps eventually video.
Very soon, as we gain more control over the format of AI audio output, more of us will begin to use these tools in ways similar to how we used blogs, podcasts and YouTube at an earlier stage of the internet: as a means of finding more personally relevant content. But just as the shift from broadcast to niche digital media caused a siloing effect, the shift from niche to cannibalized media will take us further toward a kind of media solipsism, in which much of what we listen to is completely unique. More of the time we’ll be opting not to join a tribe of like-minded people listening to a podcast on some niche topic but to get lost in our own mix of book and article summaries, the odd YouTube video, a daily Substack.
The key in each case is that AI will mediate, standing between us and the original sources. Much of the time, we likely won’t stop reading or listening to our favourite human-made content. But with tools as good as Google’s AI podcasts, one can easily foresee a point in the near future when it will make more sense to opt not to listen to a full hour-long podcast or read a lengthy article or book, but instead to get the thrust of it in a 10-minute summary, read to us by an AI voice so realistic and effective that we use it all the time.
Just as broadcast media lost its dominance to digital media, it seems likely that digital media will eventually give way to AI media. What this will mean for society and politics has yet to be seen. But if history is any guide, it will make a difference.