For years now, the idea of voice technology catching up to our chatty ways felt like a faraway future as opposed to an imminent reality. But as smart speakers and voice assistants continue to see brisk sales, organizations are finding that their digital transformation journeys simply can’t end with websites and mobile. Instead, the humble website is just the starting point in a milieu of digital experiences that need to be orchestrated, not micromanaged. Organizations need to treat voice as a first-class citizen lest they be left behind as the technological “quickening” continues apace.
Until recently, customer experiences were couched in the written word and other textual trappings of the web: links, calls to action, lead generation forms. But today, as pageless customer experiences become increasingly ubiquitous in a homebound economy thanks to the accelerating adoption of smart home systems and immersive headsets, extending our content and commerce applications into the realm of voice has never been more urgent.
A new phrase is emerging in marketing, content strategy and content design circles: voice content. For the first time, our content isn’t just words on a page; it speaks back to us, capable of rich, human conversations. Instead of us navigating written content, voice content negotiates with us. In many ways, our content has always been ripe for extrication from the hub-and-spoke trees that characterize our breadcrumb-laden, site-mapped websites.
But is your content ready for a voice-first future? A future where a primary way customers will move through your content is by negotiating, not navigating — and listening through, not leafing through? The conundrum of spoken content is why I wrote a book on the topic, "Voice Content and Usability."
Users increasingly expect their customer experiences to have a voice component, whether on a smartphone, on websites, or on a full-fledged voice assistant. The relative ease of use of voice assistants like Google Home and Amazon Alexa is engendering a significant paradigm shift away from page- and screen-bound brand experiences toward those increasingly mediated by audio, sound and voice. As the pandemic has galvanized sales of smart speakers and smart home systems, it’s never been more important to provide a conduit for content through voice.
Unfortunately, customer experiences haven’t kept pace with this new baseline of user expectations. Even today, very few brands make the information they deliver through websites and mobile applications easily accessible through a voice interface.
The way our content is structured and formatted doesn’t help either. After decades marinating in the motifs of the web, our copy is strewn with links and calls to action that not only induce head-scratching for voice users but are also impossible to use in a voice-only way.
Customer experiences that deliver content need to pursue an omnichannel content strategy that avoids privileging one conduit over another, just as digital experience orchestration levels the playing field for all user experiences. Though the web has long been a bulwark for digital content, voice content and immersive content are shaking up the realms of content strategy and content design. An omnichannel content strategy entails unmooring our customer experiences from the ways of the web and preparing them for a future that embraces other pageless channels we haven’t even imagined yet.
Simply put, voice content is content delivered through the medium of sound, typically through synthesized speech on voice interfaces. It usually arrives without the convenience of a supplemental screen, meaning all information travels through a spoken conversational interface. Much of the content on our websites, like FAQs, is already written in a conversational cadence and lends itself easily to a voice interface. But just because content is voice-friendly doesn’t mean it’s voice-ready.
One of the biggest reasons our content isn’t ready for voice users is the very reason that same content is so well suited to, and optimized for, the web. Voice interfaces, especially pure voice interfaces that lack screens, don’t play in the same visual sandbox as websites and other screen-bound interfaces. Instead of the hub-and-spoke, networked information architectures that characterize websites, voice interfaces demand a more linear and unidirectional approach to delivering content. Voice content prizes progressive disclosure over the comprehensiveness of web content.
Links, calls to action and lead generation forms are all mainstays of the web, but they have little to no applicability in a voice interface. This means brands need to find analogous ways to help voice users find what they need while balancing the fact that parallel versions of content for web and voice often lead to maintenance nightmares down the road. To maintain a single source of truth for content and optimize content for channels we haven’t even foreseen yet, we need an omnichannel content audit.
Content audits are at times maligned for their roots in less glitzy elements of content strategy like regulatory compliance and content planning, but auditing content is in fact one of the most valuable and critical steps in ensuring your content is ready not just for voice but also for immersive environments like augmented and virtual reality. I discuss some of the most important steps in my new book: setting evaluation criteria, gauging each item of content in non-web contexts, and recommending solutions for any “web-only” or “web-biased” results.
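To make those audit steps a little more concrete, here is a minimal sketch in Python of what a first automated pass might look like. It is not the process used in the book or on any real project: the criteria names, the regular expressions, and the helper names (`WEB_BIASED_PATTERNS`, `audit_item`, `audit`) are all hypothetical, chosen only to illustrate setting evaluation criteria, gauging each item against them, and surfacing “web-biased” results for a human editor to review.

```python
import re
from dataclasses import dataclass, field

# Hypothetical evaluation criteria: each label maps to a pattern suggesting
# that the copy leans on a visual, web-only affordance.
WEB_BIASED_PATTERNS = {
    "link": re.compile(r"<a\s[^>]*href=|https?://", re.IGNORECASE),
    "call_to_action": re.compile(
        r"\b(click|tap)\s+(here|below|the\s+\w+)", re.IGNORECASE
    ),
    "form": re.compile(r"<form\b|\bfill\s+out\s+the\s+form\b", re.IGNORECASE),
    "visual_reference": re.compile(
        r"\b(see|shown)\s+(above|below)\b|\bon\s+this\s+page\b", re.IGNORECASE
    ),
}


@dataclass
class AuditResult:
    """Outcome of gauging one content item against the criteria."""
    title: str
    flags: list = field(default_factory=list)

    @property
    def voice_ready(self) -> bool:
        # An item with no web-biased flags passes this (very rough) check.
        return not self.flags


def audit_item(title: str, body: str) -> AuditResult:
    """Flag every criterion whose pattern appears in the item's body."""
    flags = [
        name for name, pattern in WEB_BIASED_PATTERNS.items()
        if pattern.search(body)
    ]
    return AuditResult(title=title, flags=flags)


def audit(items: dict) -> list:
    """Audit a mapping of title -> body and return one result per item."""
    return [audit_item(title, body) for title, body in items.items()]


if __name__ == "__main__":
    sample = {
        "Renew a license": 'Click here to <a href="/renew">renew online</a>.',
        "Office hours": "Offices are open from 8 a.m. to 5 p.m. on weekdays.",
    }
    for result in audit(sample):
        status = "voice-ready" if result.voice_ready else "web-biased"
        print(f"{result.title}: {status} {result.flags}")
```

A pattern match like this can only point to candidates. Deciding how a flagged link or call to action should be reworded for a listener remains an editorial judgment.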
Auditing web content for voice may seem a bit of a stretch. After all, spoken content and written content differ substantially, leading to questions about whether it truly is appropriate for us to refactor written web content to be intelligible and discoverable as spoken voice content, given the nuanced gap between how we speak and how we write. Nonetheless, most organizations lack the ability to manage multiple editorial versions at once and have little to no desire to juggle the maintenance burden of web content alongside spoken content. What’s more, much of the web content we already have in place is conversational in nature.
Voice content is a highly interdisciplinary affair, with many teams involved in cross-functional decisions about the still-nascent technology space, the appropriate dialogues and flows in which voice content should be situated, and of course, the ways in which voice content is delivered to the end user through synthesized speech. But given the growing adoption of voice content, especially in the public sector, it’s an important trend to watch.
Undergirding my new book is a case study that demonstrates the power of voice content for state and local governments. Ask GeorgiaGov is the first-ever voice interface for residents of the state of Georgia and among the very first content-driven voice interfaces and Alexa skills in existence. Launched in October 2017, it was one of the first authentic examples of how an organization could effectively and efficiently deliver critical content through voice interfaces. It was part of Digital Services Georgia’s ongoing efforts to reach every Georgian by improving their work on accessibility and outreach to elderly Georgians.
Ask GeorgiaGov began from a single key premise when it came to the state’s overarching content strategy. At a time when local government budgets are being slashed left and right across the United States, Digital Services Georgia was very clear about the need to single-source their content from a unified CMS for both web and voice from the get-go, in order to prevent any maintenance headaches later on. This constraint presented both challenges and opportunities as we worked on the first-ever conversational content audit, which identified the problematic elements that surfaced as we transformed previously web-only content into omnichannel, voice-ready content.
What might be appropriate for a website isn’t appropriate for a voice interface. Links don’t appear in blue with an underline in a voice interface; in fact, they are indistinguishable from the surrounding recited text. Similarly, how do users manage to navigate an interface completely untethered from any visual context, when the mere act of navigation becomes a process of negotiation? How do users respond to listening through content as opposed to leafing through it? And how can we make calls to action more feasible or less disruptive in a voice interface?
Ask GeorgiaGov was a perfect example of how a previously web-only content strategy can be optimized for the omnichannel. Eight months after launch, at a time when voice interfaces still performed quite poorly, we conducted a retrospective and found that 79.2% of all interactions led to a successful delivery of content. The proof is in the pudding: today, residents of Georgia can also benefit from an online chatbot that answers state government queries.
Many organizations still aren’t ready to migrate off the web and into the experiences that more digital natives and homebound customers call home: the voice interfaces and immersive realities that go beyond the boxy browsers in which almost all content remains today. After all, many of us are still early in our digital transformation journeys and taking our content online first. But as pageless experiences continue to take hold, we’ll need to be prepared for life off the web — and that begins with voice content.