lit

Voice Content and Usability

We’ve been having conversations for thousands of years. Whether to convey information, conduct transactions, or simply to check in on one another, people have yammered away, chattering and gesticulating, through spoken conversation for countless generations. Only in the last few millennia have we begun to commit our conversations to writing, and only in the last few decades have we begun to outsource them to the computer, a machine that shows much more affinity for written correspondence than for the slangy vagaries of spoken language.

Computers have trouble because, between spoken and written language, speech is the more primordial. To have successful conversations with us, machines must grapple with the messiness of human speech: the disfluencies and pauses, the gestures and body language, and the variations in word choice and spoken dialect that can stymie even the most carefully crafted human-computer interaction. In the human-to-human scenario, spoken language also has the privilege of face-to-face contact, where we can readily interpret nonverbal social cues.

In contrast, written language immediately concretizes as we commit it to record and retains usages long after they become obsolete in spoken communication (the salutation “To whom it may concern,” for example), generating its own fossil record of outdated terms and phrases. Because it tends to be more consistent, polished, and formal, written text is fundamentally much easier for machines to parse and understand.

Spoken language has no such luxury. Besides the nonverbal cues that decorate conversations with emphasis and emotional context, there are also verbal cues and vocal behaviors that modulate conversation in nuanced ways: how something is said, not what. Whether rapid-fire, low-pitched, or high-decibel, whether sarcastic, stilted, or sighing, our spoken language conveys much more than the written word could ever muster. So when it comes to voice interfaces—the machines we conduct spoken conversations with—we face exciting challenges as designers and content strategists.

Voice Interactions

We interact with voice interfaces for a variety of reasons, but according to Michael McTear, Zoraida Callejas, and David Griol in The Conversational Interface, those motivations by and large mirror the reasons we initiate conversations with other people, too (http://bkaprt.com/vcu36/01-01). Generally, we start up a conversation because:

  • we need something done (such as a transaction),
  • we want to know something (information of some sort), or
  • we are social beings and want someone to talk to (conversation for conversation’s sake).

These three categories—which I call transactional, informational, and prosocial—also characterize essentially every voice interaction: a single conversation from beginning to end that realizes some outcome for the user, starting with the voice interface’s first greeting and ending with the user exiting the interface. Note here that a conversation in our human sense—a chat between people that leads to some result and lasts an arbitrary length of time—could encompass multiple transactional, informational, and prosocial voice interactions in succession. In other words, a voice interaction is a conversation, but a conversation is not necessarily a single voice interaction.

Purely prosocial conversations are more gimmicky than captivating in most voice interfaces, because machines don’t yet have the capacity to really want to know how we’re doing and to do the sort of glad-handing humans crave. There’s also ongoing debate as to whether users actually prefer the sort of organic human conversation that begins with a prosocial voice interaction and shifts seamlessly into other types. In fact, in Voice User Interface Design, Michael Cohen, James Giangola, and Jennifer Balogh recommend sticking to users’ expectations by mimicking how they interact with other voice interfaces rather than trying too hard to be human—potentially alienating them in the process (http://bkaprt.com/vcu36/01-01).

That leaves two genres of conversations we can have with one another that a voice interface can easily have with us, too: a transactional voice interaction realizing some outcome (“buy iced tea”) and an informational voice interaction teaching us something new (“discuss a musical”).

Transactional voice interactions

Unless you’re tapping buttons on a food delivery app, you’re generally having a conversation—and therefore a voice interaction—when you order a Hawaiian pizza with extra pineapple. Even when we walk up to the counter and place an order, the conversation quickly pivots from an initial smattering of neighborly small talk to the real mission at hand: ordering a pizza (generously topped with pineapple, as it should be).

Alison: Hey, how’s it going?

Burhan: Hi, welcome to Crust Deluxe! It’s cold out there. How can I help you?

Alison: Can I get a Hawaiian pizza with extra pineapple?

Burhan: Sure, what size?

Alison: Large.

Burhan: Anything else?

Alison: No thanks, that’s it.

Burhan: Something to drink?

Alison: I’ll have a bottle of Coke.

Burhan: You got it. That’ll be $13.55 and about fifteen minutes.

Each progressive disclosure in this transactional conversation reveals more and more of the desired outcome of the transaction: a service rendered or a product delivered. Transactional conversations have certain key traits: they’re direct, to the point, and economical. They quickly dispense with pleasantries.

Informational voice interactions

Meanwhile, some conversations are primarily about obtaining information. Though Alison might visit Crust Deluxe with the sole purpose of placing an order, she might not actually want to walk out with a pizza at all. She might be just as interested in whether they serve halal or kosher dishes, gluten-free options, or something else. Here, though we again have a prosocial mini-conversation at the beginning to establish politeness, we’re after much more.

Alison: Hey, how’s it going?

Burhan: Hi, welcome to Crust Deluxe! It’s cold out there. How can I help you?

Alison: Can I ask a few questions?

Burhan: Of course! Go right ahead.

Alison: Do you have any halal options on the menu?

Burhan: Absolutely! We can make any pie halal by request. We also have lots of vegetarian, ovo-lacto, and vegan options. Are you thinking about any other dietary restrictions?

Alison: What about gluten-free pizzas?

Burhan: We can definitely do a gluten-free crust for you, no problem, for both our deep-dish and thin-crust pizzas. Anything else I can answer for you?

Alison: That’s it for now. Good to know. Thanks!

Burhan: Anytime, come back soon!

This is a very different dialogue. Here, the goal is to get a certain set of facts. Informational conversations are investigative quests for the truth—research expeditions to gather data, news, or facts. Voice interactions that are informational might be more long-winded than transactional conversations by necessity. Responses tend to be lengthier, more informative, and carefully communicated so the customer understands the key takeaways.

Voice Interfaces

At their core, voice interfaces employ speech to support users in reaching their goals. But simply because an interface has a voice component doesn’t mean that every user interaction with it is mediated through voice. Because multimodal voice interfaces can lean on visual components like screens as crutches, we’re most concerned in this book with pure voice interfaces, which depend entirely on spoken conversation, lack any visual component whatsoever, and are therefore much more nuanced and challenging to tackle.

Though voice interfaces have long been integral to the imagined future of humanity in science fiction, only recently have those lofty visions become fully realized in genuine voice interfaces.

Interactive voice response (IVR) systems

Though written conversational interfaces have been fixtures of computing for many decades, voice interfaces first emerged in the early 1990s with text-to-speech (TTS) dictation programs that recited written text aloud, as well as speech-enabled in-car systems that gave directions to a user-provided address. With the advent of interactive voice response (IVR) systems, intended as an alternative to overburdened customer service representatives, we became acquainted with the first true voice interfaces that engaged in authentic conversation.

IVR systems allowed organizations to reduce their reliance on call centers but soon became notorious for their clunkiness. Commonplace in the corporate world, these systems were primarily designed as metaphorical switchboards to guide customers to a real phone agent (“Say Reservations to book a flight or check an itinerary”); chances are you will enter a conversation with one when you call an airline or hotel conglomerate. Despite their functional issues and users’ frustration with their inability to speak to an actual human right away, IVR systems proliferated in the early 1990s across a variety of industries (http://bkaprt.com/vcu36/01-02, PDF).

While IVR systems are great for highly repetitive, monotonous conversations that generally don’t veer from a single format, they have a reputation for less scintillating conversation than we’re used to in real life (or even in science fiction).

Screen readers

Parallel to the evolution of IVR systems was the invention of the screen reader, a tool that transcribes visual content into synthesized speech. For Blind or visually impaired website users, it’s the predominant method of interacting with text, multimedia, or form elements. Screen readers represent perhaps the closest equivalent we have today to an out-of-the-box implementation of content delivered through voice.

Among the first screen readers known by that moniker was the Screen Reader for the BBC Micro and NEC Portable developed by the Research Centre for the Education of the Visually Handicapped (RCEVH) at the University of Birmingham in 1986 (http://bkaprt.com/vcu36/01-03). That same year, Jim Thatcher created the first IBM Screen Reader for text-based computers, later recreated for computers with graphical user interfaces (GUIs) (http://bkaprt.com/vcu36/01-04).

With the rapid growth of the web in the 1990s, the demand for accessible tools for websites exploded. Thanks to the introduction of semantic HTML and especially ARIA roles beginning in 2008, screen readers started facilitating speedy interactions with web pages that ostensibly allow disabled users to traverse the page as an aural and temporal space rather than a visual and physical one. In other words, screen readers for the web “provide mechanisms that translate visual design constructs—proximity, proportion, etc.—into useful information,” writes Aaron Gustafson in A List Apart. “At least they do when documents are authored thoughtfully” (http://bkaprt.com/vcu36/01-05).

Though deeply instructive for voice interface designers, there’s one significant problem with screen readers: they’re difficult to use and unremittingly verbose. The visual structures of websites and web navigation don’t translate well to screen readers, sometimes resulting in unwieldy pronouncements that name every manipulable HTML element and announce every formatting change. For many screen reader users, working with web-based interfaces exacts a cognitive toll.

In Wired, accessibility advocate and voice engineer Chris Maury considers why the screen reader experience is ill-suited to users relying on voice:

From the beginning, I hated the way that Screen Readers work. Why are they designed the way they are? It makes no sense to present information visually and then, and only then, translate that into audio. All of the time and energy that goes into creating the perfect user experience for an app is wasted, or even worse, adversely impacting the experience for blind users. (http://bkaprt.com/vcu36/01-06)

In many cases, well-designed voice interfaces can speed users to their destination better than long-winded screen reader monologues. After all, visual interface users have the benefit of darting around the viewport freely to find information, ignoring areas irrelevant to them. Blind users, meanwhile, are obligated to listen to every utterance synthesized into speech and therefore prize brevity and efficiency. Disabled users who have long had no choice but to employ clunky screen readers may find that voice interfaces, particularly more modern voice assistants, offer a more streamlined experience.

Voice assistants

When we think of voice assistants (the subset of voice interfaces now commonplace in living rooms, smart homes, and offices), many of us immediately picture HAL from 2001: A Space Odyssey or hear Majel Barrett’s voice as the omniscient computer in Star Trek. Voice assistants are akin to personal concierges that can answer questions, schedule appointments, conduct searches, and perform other common day-to-day tasks. And they’re rapidly gaining more attention from accessibility advocates for their assistive potential.

Before the earliest IVR systems found success in the enterprise, Apple published a demonstration video in 1987 depicting the Knowledge Navigator, a voice assistant that could transcribe spoken words and recognize human speech to a great degree of accuracy. Then, in 2001, Tim Berners-Lee and others formulated their vision for a Semantic Web “agent” that would perform typical errands like “checking calendars, making appointments, and finding locations” (http://bkaprt.com/vcu36/01-07, behind paywall). It wasn’t until 2011 that Apple’s Siri finally entered the picture, making voice assistants a tangible reality for consumers.

Thanks to the plethora of voice assistants available today, there is considerable variation in how programmable and customizable certain voice assistants are over others (Fig 1.1). At one extreme, everything except vendor-provided features is locked down; for example, at the time of their release, the core functionality of Apple’s Siri and Microsoft’s Cortana couldn’t be extended beyond their existing capabilities. Even today, it isn’t possible to program Siri to perform arbitrary functions, because there’s no means by which developers can interact with Siri at a low level, apart from predefined categories of tasks like sending messages, hailing rideshares, making restaurant reservations, and certain others.

At the opposite end of the spectrum, voice assistants like Amazon Alexa and Google Home offer a core foundation on which developers can build custom voice interfaces. For this reason, programmable voice assistants that lend themselves to customization and extensibility are becoming increasingly popular for developers who feel stifled by the limitations of Siri and Cortana. Amazon offers the Alexa Skills Kit, a developer framework for building custom voice interfaces for Amazon Alexa, while Google Home offers the ability to program arbitrary Google Assistant skills. Today, users can choose from among thousands of custom-built skills within both the Amazon Alexa and Google Assistant ecosystems.
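
To make that concrete, here's a minimal sketch of a custom skill handler using the Alexa Skills Kit SDK for Node.js (ask-sdk-core), written in TypeScript. The OrderPizzaIntent name is a hypothetical example defined in a skill's own interaction model, not something the SDK provides.

// A minimal custom-skill sketch using the Alexa Skills Kit SDK v2 for Node.js.
// "OrderPizzaIntent" is a hypothetical intent from the skill's interaction model.
import * as Alexa from 'ask-sdk-core';

const OrderPizzaIntentHandler: Alexa.RequestHandler = {
  canHandle(handlerInput) {
    return (
      Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest' &&
      Alexa.getIntentName(handlerInput.requestEnvelope) === 'OrderPizzaIntent'
    );
  },
  handle(handlerInput) {
    // Keep the spoken response transactional: direct, economical, no pleasantries.
    return handlerInput.responseBuilder
      .speak('Sure, what size would you like?')
      .reprompt('What size pizza should I order?')
      .getResponse();
  },
};

// The entry point wires handlers into an AWS Lambda-compatible function.
export const handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(OrderPizzaIntentHandler)
  .lambda();

The interaction model (intents, sample utterances, and slots such as pizza size) lives alongside this code and is what maps a user's spoken request to the handler above.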

Fig 1.1: Voice assistants like Amazon Alexa and Google Home tend to be more programmable, and thus more flexible, than their counterpart Apple Siri.

As corporations like Amazon, Apple, Microsoft, and Google continue to stake their territory, they’re also selling and open-sourcing an unprecedented array of tools and frameworks for designers and developers that aim to make building voice interfaces as easy as possible, even without code.

Often by necessity, voice assistants like Amazon Alexa tend to be monochannel—they’re tightly coupled to a device and can’t be accessed on a computer or smartphone instead. By contrast, many development platforms like Google’s Dialogflow have introduced omnichannel capabilities so users can build a single conversational interface that then manifests as a voice interface, textual chatbot, and IVR system upon deployment. I don’t prescribe any specific implementation approaches in this design-focused book, but in Chapter 4 we’ll get into some of the implications these variables might have on the way you build out your design artifacts.

Voice Content

Simply put, voice content is content delivered through voice. To preserve what makes human conversation so compelling in the first place, voice content needs to be free-flowing and organic, contextless and concise—everything written content isn’t.

Our world is replete with voice content in various forms: screen readers reciting website content, voice assistants rattling off a weather forecast, and automated phone hotline responses governed by IVR systems. In this book, we’re most concerned with content delivered auditorily—not as an option, but as a necessity.

For many of us, our first foray into informational voice interfaces will be to deliver content to users. There’s only one problem: any content we already have isn’t in any way ready for this new habitat. So how do we make the content trapped on our websites more conversational? And how do we write new copy that lends itself to voice interactions?

Lately, we’ve begun slicing and dicing our content in unprecedented ways. Websites are, in many respects, colossal vaults of what I call macrocontent: lengthy prose that can extend for infinitely scrollable miles in a browser window, like microfilm viewers of newspaper archives. Back in 2002, well before the present-day ubiquity of voice assistants, technologist Anil Dash defined microcontent as permalinked pieces of content that stay legible regardless of environment, such as email or text messages:

A day’s weather forcast [sic], the arrival and departure times for an airplane flight, an abstract from a long publication, or a single instant message can all be examples of microcontent. (http://bkaprt.com/vcu36/01-08)

I’d update Dash’s definition of microcontent to include all examples of bite-sized content that go well beyond written communiqués. After all, today we encounter microcontent in interfaces where a small snippet of copy is displayed alone, unmoored from the browser, like a textbot confirmation of a restaurant reservation. Microcontent offers the best opportunity to gauge how your content can be stretched to the very edges of its capabilities, informing delivery channels both established and novel.

As microcontent, voice content is unique because it’s an example of how content is experienced in time rather than in space. We can glance at a digital sign underground for an instant and know when the next train is arriving, but voice interfaces hold our attention captive for periods of time that we can’t easily escape or skip, something screen reader users are all too familiar with.

Because microcontent is fundamentally made up of isolated blobs with no relation to the channels where they’ll eventually end up, we need to ensure that our microcontent truly performs well as voice content—and that means focusing on the two most important traits of robust voice content: voice content legibility and voice content discoverability.

Fundamentally, the legibility and discoverability of our voice content both have to do with how voice content manifests in perceived time and space.




lit

Humility: An Essential Value

Humility, a designer’s essential value—that has a nice ring to it. What about humility, an office manager’s essential value? Or a dentist’s? Or a librarian’s? They all sound great. When humility is our guiding light, the path is always open for fulfillment, evolution, connection, and engagement. In this chapter, we’re going to talk about why.

That said, this is a book for designers, and to that end, I’d like to start with a story—well, a journey, really. It’s a personal one, and I’m going to make myself a bit vulnerable along the way. I call it:

The Tale of Justin’s Preposterous Pate

When I was coming out of art school, a long-haired, goateed neophyte, print was a known quantity to me; design on the web, however, was rife with complexities to navigate and discover, a problem to be solved. Though I had been formally trained in graphic design, typography, and layout, what fascinated me was how these traditional skills might be applied to a fledgling digital landscape. This theme would ultimately shape the rest of my career.

So rather than graduate and go into print like many of my friends, I devoured HTML and JavaScript books into the wee hours of the morning and taught myself how to code during my senior year. I wanted—nay, needed—to better understand the underlying implications of what my design decisions would mean once rendered in a browser.

The late ’90s and early 2000s were the so-called “Wild West” of web design. Designers at the time were all figuring out how to apply design and visual communication to the digital landscape. What were the rules? How could we break them and still engage, entertain, and convey information? At a more macro level, how could my values, inclusive of humility, respect, and connection, align in tandem with that? I was hungry to find out.

Though I’m talking about a different era, those are timeless considerations between non-career interactions and the world of design. What are your core passions, or values, that transcend medium? It’s essentially the same concept we discussed earlier on the direct parallels between what fulfills you, agnostic of the tangible or digital realms; the core themes are all the same.

First within tables, animated GIFs, Flash, then with Web Standards, divs, and CSS, there was personality, raw unbridled creativity, and unique means of presentment that often defied any semblance of a visible grid. Splash screens and “browser requirement” pages aplenty. Usability and accessibility were typically victims of such a creation, but such paramount facets of any digital design were largely (and, in hindsight, unfairly) disregarded at the expense of experimentation.

For example, this iteration of my personal portfolio site (“the pseudoroom”) from that era was experimental, if not a bit heavy-handed, in the visual communication of the concept of a living sketchbook. Very skeuomorphic. I collaborated with fellow designer and dear friend Marc Clancy (now a co-founder of the creative project organizing app Milanote) on this one, where we’d first sketch and then pass a Photoshop file back and forth to trick things out and play with varied user interactions. Then, I’d break it down and code it into a digital layout.

Figure 1: “the pseudoroom” website, hitting the sketchbook metaphor hard.

Along with design folio pieces, the site also offered free downloads for Mac OS customizations: desktop wallpapers that were effectively design experimentation, custom-designed typefaces, and desktop icons.

From around the same time, GUI Galaxy was a design, pixel art, and Mac-centric news portal some graphic designer friends and I conceived, designed, developed, and deployed.

Figure 2: GUI Galaxy, web standards-compliant design news portal

Design news portals were incredibly popular during this period, featuring (what would now be considered) Tweet-size, small-format snippets of pertinent news from the categories I previously mentioned. If you took Twitter, curated it to a few categories, and wrapped it in a custom-branded experience, you’d have a design news portal from the late 90s / early 2000s.

We as designers had evolved and created a bandwidth-sensitive, web standards award-winning, much more accessibility-conscious website. Still ripe with experimentation, yet more mindful of equitable engagement. You can see a couple of content panes here, noting general news (tech, design) and Mac-centric news below. We also offered many of the custom downloads I cited before as present on my folio site but branded and themed to GUI Galaxy.

The site’s backbone was a homegrown CMS, with the presentation layer consisting of global design + illustration + news author collaboration. And the collaboration effort here, in addition to experimentation on a ‘brand’ and content delivery, was hitting my core. We were designing something bigger than any single one of us and connecting with a global audience.

Collaboration and connection transcend medium in their impact, immensely fulfilling me as a designer.

Now, why am I taking you down this trip of design memory lane? Two reasons.

First, there’s a reason for the nostalgia for that design era (the “Wild West” era, as I called it earlier): the inherent exploration, personality, and creativity that saturated many design portals and personal portfolio sites. Ultra-finely detailed pixel art UI, custom illustration, bespoke vector graphics, all underpinned by a strong design community.

Today’s web design has been in a period of stagnation. I suspect there’s a strong chance you’ve seen a site whose structure looks something like this: a hero image / banner with text overlaid, perhaps with a lovely rotating carousel of images (laying the snark on heavy there), a call to action, and three columns of sub-content directly beneath. Maybe an icon library is employed with selections that vaguely relate to their respective content.

Design, as it’s applied to the digital landscape, is in dire need of thoughtful layout, typography, and visual engagement that goes hand-in-hand with all the modern considerations we now know are paramount: usability. Accessibility. Load times and bandwidth-sensitive content delivery. A responsive presentation that meets human beings wherever they’re engaging from. We must be mindful of, and respectful toward, those concerns—but not at the expense of creativity of visual communication or via replicating cookie-cutter layouts.

Pixel Problems

Websites during this period were often designed and built on Macs whose OS and desktops looked something like this. This is Mac OS 7.5, but 8 and 9 weren’t that different.

Figure 3: A Mac OS 7.5-centric desktop.

Desktop icons fascinated me: how could any single one, at any given point, stand out to get my attention? In this example, the user’s desktop is tidy, but think of a more realistic example with icon pandemonium. Or, say an icon was part of a larger system grouping (fonts, extensions, control panels)—how did it also maintain cohesion amongst a group?

These were 32 x 32 pixel creations, utilizing a 256-color palette, designed pixel-by-pixel as mini mosaics. To me, this was the embodiment of digital visual communication under such ridiculous constraints. And often, ridiculous restrictions can yield the purification of concept and theme.

So I began to research and do my homework. I was a student of this new medium, hungry to dissect, process, discover, and make it my own.

Expanding upon the notion of exploration, I wanted to see how I could push the limits of a 32x32 pixel grid with that 256-color palette. Those ridiculous constraints forced a clarity of concept and presentation that I found incredibly appealing. The digital gauntlet had been tossed, and that challenge fueled me. And so, in my dorm room into the wee hours of the morning, I toiled away, bringing conceptual sketches into mini mosaic fruition.

These are some of my creations, utilizing ResEdit, the only tool available at the time for creating icons. ResEdit was a clunky, built-in Mac OS utility not really made for exactly what we were using it for. At the core of all of this work: Research. Challenge. Problem-solving. Again, these core connection-based values are agnostic of medium.

Figure 4: A selection of my pixel art design, 32x32 pixel canvas, 8-bit palette

There’s one more design portal I want to talk about, which also serves as the second reason for my story to bring this all together.

This is K10k, short for Kaliber10000. K10k was founded in 1998 by Michael Schmidt and Toke Nygaard, and was the design news portal on the web during this period. With its pixel art-fueled presentation, ultra-focused care given to every facet and detail, and many of the more influential designers of the time invited to be news authors on the site, well... it was the place to be, my friend. With respect where respect is due, GUI Galaxy’s concept was inspired by what these folks were doing.

Figure 5: The K10k website

For my part, the combination of my web design work and pixel art exploration began to get me some notoriety in the design scene. Eventually, K10k noticed and added me as one of their very select group of news authors to contribute content to the site.

Amongst my personal work and side projects—and now with this inclusion—this put me on the map in the design community. My design work also began to be published in various printed collections, in magazines domestically and overseas, and featured on other design news portals. With that degree of success while in my early twenties, something else happened:

I evolved—devolved, really—into a colossal asshole (and in just about a year out of art school, no less). The press and the praise became what fulfilled me, and they went straight to my head. They inflated my ego. I actually felt somewhat superior to my fellow designers.

The casualties? My design stagnated. Its evolution—my evolution—stagnated.

I felt so supremely confident in my abilities that I effectively stopped researching and discovering. When previously sketching concepts or iterating ideas in lead was my automatic step one, I instead leaped right into Photoshop. I drew my inspiration from the smallest of sources (and with blinders on). Any critique of my work from my peers was often vehemently dismissed. The most tragic loss: I had lost touch with my values.

My ego almost cost me some of my friendships and burgeoning professional relationships. I was toxic in talking about design and in collaboration. But thankfully, those same friends gave me a priceless gift: candor. They called me out on my unhealthy behavior.

Admittedly, it was a gift I initially did not accept but ultimately was able to deeply reflect upon. I was soon able to accept, and process, and course correct. The realization laid me low, but the re-awakening was essential. I let go of the “reward” of adulation and re-centered upon what stoked the fire for me in art school. Most importantly: I got back to my core values.

Always Students

Following that short-term regression, I was able to push forward in my personal design and career. And I could self-reflect as I got older to facilitate further growth and course correction as needed.

As an example, let’s talk about the Large Hadron Collider. The LHC was designed “to help answer some of the fundamental open questions in physics, which concern the basic laws governing the interactions and forces among the elementary objects, the deep structure of space and time, and in particular the interrelation between quantum mechanics and general relativity.” Thanks, Wikipedia.

Around fifteen years ago, in one of my earlier professional roles, I designed the interface for the application that generated the LHC’s particle collision diagrams. These diagrams are the rendering of what’s actually happening inside the Collider during any given particle collision event and are often considered works of art unto themselves.

Designing the interface for this application was a fascinating process for me, in that I worked with Fermilab physicists to understand what the application was trying to achieve, but also how the physicists themselves would be using it. To that end, in this role, I cut my teeth on usability testing, working with the Fermilab team to iterate and improve the interface. How they spoke and what they spoke about was like an alien language to me. And by making myself humble and working under the mindset that I was but a student, I made myself available to be a part of their world to generate that vital connection.

I also had my first ethnographic observation experience: going to the Fermilab location and observing how the physicists used the tool in their actual environment, on their actual terminals. For example, one takeaway was that due to the level of ambient light-driven contrast within the facility, the data columns ended up using white text on a dark gray background instead of black text-on-white. This enabled them to pore over reams of data during the day and ease their eye strain. And Fermilab and CERN are government entities with rigorous accessibility standards, so my knowledge in that realm also grew. The barrier-free design was another essential form of connection.

So to those core drivers of my visual problem-solving soul and ultimate fulfillment: discovery, exposure to new media, observation, human connection, and evolution. What opened the door for those values was me checking my ego before I walked through it.

An evergreen willingness to listen, learn, understand, grow, evolve, and connect yields our best work. In particular, I want to focus on the words ‘grow’ and ‘evolve’ in that statement. If we are always students of our craft, we are also continually making ourselves available to evolve. Yes, we have years of applicable design study under our belt. Or the focused lab sessions from a UX bootcamp. Or the monogrammed portfolio of our work. Or, ultimately, decades of a career behind us.

But all that said: experience does not equal “expert.”

As soon as we close our minds via an inner monologue of ‘knowing it all’ or branding ourselves a “#thoughtleader” on social media, the designer we are is our final form. The designer we can be will never exist.




lit

Opportunities for AI in Accessibility

In reading Joe Dolson’s recent piece on the intersection of AI and accessibility, I absolutely appreciated the skepticism that he has for AI in general as well as for the ways that many have been using it. In fact, I’m very skeptical of AI myself, despite my role at Microsoft as an accessibility innovation strategist who helps run the AI for Accessibility grant program. As with any tool, AI can be used in very constructive, inclusive, and accessible ways; and it can also be used in destructive, exclusive, and harmful ones. And there are a ton of uses somewhere in the mediocre middle as well.

I’d like you to consider this a “yes… and” piece to complement Joe’s post. I’m not trying to refute any of what he’s saying but rather provide some visibility to projects and opportunities where AI can make meaningful differences for people with disabilities. To be clear, I’m not saying that there aren’t real risks or pressing issues with AI that need to be addressed—there are, and we’ve needed to address them, like, yesterday—but I want to take a little time to talk about what’s possible in hopes that we’ll get there one day.

Alternative text

Joe’s piece spends a lot of time talking about computer-vision models generating alternative text. He highlights a ton of valid issues with the current state of things. And while computer-vision models continue to improve in the quality and richness of detail in their descriptions, their results aren’t great. As he rightly points out, the current state of image analysis is pretty poor—especially for certain image types—in large part because current AI systems examine images in isolation rather than within the contexts that they’re in (which is a consequence of having separate “foundation” models for text analysis and image analysis). Today’s models aren’t trained to distinguish between images that are contextually relevant (that should probably have descriptions) and those that are purely decorative (which might not need a description) either. Still, I think there’s potential in this space.

As Joe mentions, human-in-the-loop authoring of alt text should absolutely be a thing. And if AI can pop in to offer a starting point for alt text—even if that starting point might be a prompt saying What is this BS? That’s not right at all… Let me try to offer a starting point—I think that’s a win.

Taking things a step further, if we can specifically train a model to analyze image usage in context, it could help us more quickly identify which images are likely to be decorative and which ones likely require a description. That will help reinforce which contexts call for image descriptions and it’ll improve authors’ efficiency toward making their pages more accessible.

While complex images—like graphs and charts—are challenging to describe in any sort of succinct way (even for humans), the image example shared in the GPT-4 announcement points to an interesting opportunity as well. Let’s suppose that you came across a chart whose description was simply the title of the chart and the kind of visualization it was, such as: Pie chart comparing smartphone usage to feature phone usage among US households making under $30,000 a year. (That would be a pretty awful alt text for a chart since that would tend to leave many questions about the data unanswered, but then again, let’s suppose that that was the description that was in place.) If your browser knew that that image was a pie chart (because an onboard model concluded this), imagine a world where users could ask questions like these about the graphic:

  • Do more people use smartphones or feature phones?
  • How many more?
  • Is there a group of people that don’t fall into either of these buckets?
  • How many is that?
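
The arithmetic behind those answers is trivial once the chart's underlying values have been recovered; the hard part is the extraction. Here's a rough TypeScript sketch, where the slice values are hypothetical stand-ins for whatever a purpose-built model would pull out of the image:

// A sketch of answering the questions above from structured chart data.
// The values below are hypothetical; in practice they would come from a
// model that reads the pie chart image.
interface ChartSlice {
  label: string;
  share: number; // fraction of households, 0 to 1
}

const pieChart: ChartSlice[] = [
  { label: 'Smartphone', share: 0.71 },
  { label: 'Feature phone', share: 0.23 },
  { label: 'Neither', share: 0.06 },
];

// "Do more people use smartphones or feature phones?" and "How many more?"
function compare(a: string, b: string): string {
  const first = pieChart.find((s) => s.label === a)!;
  const second = pieChart.find((s) => s.label === b)!;
  const leader = first.share >= second.share ? first : second;
  const gap = Math.abs(first.share - second.share) * 100;
  return `${leader.label} usage is higher, by about ${gap.toFixed(0)} percentage points.`;
}

// "Is there a group that doesn't fall into either bucket?" and "How many is that?"
const neither = pieChart
  .filter((s) => s.label !== 'Smartphone' && s.label !== 'Feature phone')
  .reduce((total, s) => total + s.share, 0);

console.log(compare('Smartphone', 'Feature phone'));
console.log(`Households outside both groups: about ${(neither * 100).toFixed(0)}%.`);

A conversational layer on top of data like this could answer follow-up questions from the extracted structure itself rather than from a model's guesswork.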

Setting aside the realities of large language model (LLM) hallucinations—where a model just makes up plausible-sounding “facts”—for a moment, the opportunity to learn more about images and data in this way could be revolutionary for blind and low-vision folks as well as for people with various forms of color blindness, cognitive disabilities, and so on. It could also be useful in educational contexts to help people who can see these charts, as is, to understand the data in the charts.

Taking things a step further: What if you could ask your browser to simplify a complex chart? What if you could ask it to isolate a single line on a line graph? What if you could ask your browser to transpose the colors of the different lines to work better for the form of color blindness you have? What if you could ask it to swap colors for patterns? Given these tools’ chat-based interfaces and our existing ability to manipulate images in today’s AI tools, that seems like a possibility.

Now imagine a purpose-built model that could extract the information from that chart and convert it to another format. For example, perhaps it could turn that pie chart (or better yet, a series of pie charts) into more accessible (and useful) formats, like spreadsheets. That would be amazing!

Matching algorithms

Safiya Umoja Noble absolutely hit the nail on the head when she titled her book Algorithms of Oppression. While her book was focused on the ways that search engines reinforce racism, I think that it’s equally true that all computer models have the potential to amplify conflict, bias, and intolerance. Whether it’s Twitter always showing you the latest tweet from a bored billionaire, YouTube sending us into a Q-hole, or Instagram warping our ideas of what natural bodies look like, we know that poorly authored and maintained algorithms are incredibly harmful. A lot of this stems from a lack of diversity among the people who shape and build them. When these platforms are built with inclusivity baked in, however, there’s real potential for algorithm development to help people with disabilities.

Take Mentra, for example. They are an employment network for neurodivergent people. They use an algorithm to match job seekers with potential employers based on over 75 data points. On the job-seeker side of things, it considers each candidate’s strengths, their necessary and preferred workplace accommodations, environmental sensitivities, and so on. On the employer side, it considers each work environment, communication factors related to each job, and the like. As a company run by neurodivergent folks, Mentra made the decision to flip the script when it came to typical employment sites. They use their algorithm to propose available candidates to companies, who can then connect with job seekers that they are interested in, reducing the emotional and physical labor on the job-seeker side of things.
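
Mentra hasn't published its algorithm, so the TypeScript sketch below is only a toy illustration of the general shape of such matching: score candidates against a role across a few of the data points described above, treat necessary accommodations as hard requirements, and propose the best-fitting candidates to the employer rather than the other way around. All field names and weights here are hypothetical.

// A toy matching sketch (not Mentra's actual algorithm). Every field and
// weight is a hypothetical placeholder.
interface Candidate {
  name: string;
  strengths: string[];
  requiredAccommodations: string[];
  environmentalSensitivities: string[];
}

interface Role {
  title: string;
  neededStrengths: string[];
  accommodationsOffered: string[];
  environment: string[];
}

const overlap = (a: string[], b: string[]): number =>
  a.filter((item) => b.includes(item)).length;

function fitScore(candidate: Candidate, role: Role): number {
  // Necessary accommodations are a hard requirement, not a preference.
  const unmet = candidate.requiredAccommodations.filter(
    (need) => !role.accommodationsOffered.includes(need),
  );
  if (unmet.length > 0) return 0;

  const strengthFit = overlap(candidate.strengths, role.neededStrengths);
  const conflicts = overlap(candidate.environmentalSensitivities, role.environment);
  return strengthFit * 2 - conflicts; // weights are arbitrary placeholders
}

// Flip the script: surface candidates to the employer, best fit first.
function proposeCandidates(pool: Candidate[], role: Role): Candidate[] {
  return pool
    .map((candidate) => ({ candidate, score: fitScore(candidate, role) }))
    .filter(({ score }) => score > 0)
    .sort((a, b) => b.score - a.score)
    .map(({ candidate }) => candidate);
}

The direction of that final step—surfacing candidates to companies—is where the emotional and physical labor shifts away from job seekers.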

When more people with disabilities are involved in the creation of algorithms, that can reduce the chances that these algorithms will inflict harm on their communities. That’s why diverse teams are so important.

Imagine that a social media company’s recommendation engine was tuned to analyze who you’re following and to prioritize follow recommendations for people who talked about similar things but who were different in some key ways from your existing sphere of influence. For example, if you were to follow a bunch of nondisabled white male academics who talk about AI, it could suggest that you follow academics who are disabled or aren’t white or aren’t male who also talk about AI. If you took its recommendations, perhaps you’d get a more holistic and nuanced understanding of what’s happening in the AI field. These same systems should also use their understanding of biases about particular communities—including, for instance, the disability community—to make sure that they aren’t recommending any of their users follow accounts that perpetuate biases against (or, worse, spew hate toward) those groups.

Other ways that AI can help people with disabilities

If I weren’t trying to put this together between other tasks, I’m sure that I could go on and on, providing all kinds of examples of how AI could be used to help people with disabilities, but I’m going to make this last section into a bit of a lightning round. In no particular order:

  • Voice preservation. You may have seen the VALL-E paper or Apple’s Global Accessibility Awareness Day announcement or you may be familiar with the voice-preservation offerings from Microsoft, Acapela, or others. It’s possible to train an AI model to replicate your voice, which can be a tremendous boon for people who have ALS (Lou Gehrig’s disease) or motor-neuron disease or other medical conditions that can lead to an inability to talk. This is, of course, the same tech that can also be used to create audio deepfakes, so it’s something that we need to approach responsibly, but the tech has truly transformative potential.
  • Voice recognition. Researchers like those in the Speech Accessibility Project are paying people with disabilities for their help in collecting recordings of people with atypical speech. As I type, they are actively recruiting people with Parkinson’s and related conditions, and they have plans to expand this to other conditions as the project progresses. This research will result in more inclusive data sets that will let more people with disabilities use voice assistants, dictation software, and voice-response services as well as control their computers and other devices more easily, using only their voice.
  • Text transformation. The current generation of LLMs is quite capable of adjusting existing text content without injecting hallucinations. This is hugely empowering for people with cognitive disabilities who may benefit from text summaries or simplified versions of text or even text that’s prepped for Bionic Reading.
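
As a rough illustration of that last item, here's what a text-transformation helper might look like in TypeScript. The callModel function is a hypothetical placeholder for whichever hosted or local model API you use, and the prompt wording is illustrative rather than a tested recipe.

// A rough sketch of LLM-backed text simplification. callModel is a hypothetical
// stand-in for a real model API; wire it to your provider of choice.
type ReadingAid = 'plain-language summary' | 'simplified full text';

async function callModel(prompt: string): Promise<string> {
  // Hypothetical: send the prompt to a language model and return its reply.
  throw new Error('Connect this to a model provider.');
}

async function transformForReader(original: string, aid: ReadingAid): Promise<string> {
  const prompt = [
    `Rewrite the following text as a ${aid}.`,
    'Keep every factual claim from the original and do not add new information.',
    'Use short sentences and common words.',
    '---',
    original,
  ].join('\n');
  return callModel(prompt);
}

Constraining the model to the claims already present in the source text is what keeps this kind of transformation on the right side of the hallucination problem.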

The importance of diverse teams and data

We need to recognize that our differences matter. Our lived experiences are influenced by the intersections of the identities that we exist in. These lived experiences—with all their complexities (and joys and pain)—are valuable inputs to the software, services, and societies that we shape. Our differences need to be represented in the data that we use to train new models, and the folks who contribute that valuable information need to be compensated for sharing it with us. Inclusive data sets yield more robust models that foster more equitable outcomes.

Want a model that doesn’t demean or patronize or objectify people with disabilities? Make sure that you have content about disabilities that’s authored by people with a range of disabilities, and make sure that that’s well represented in the training data.

Want a model that doesn’t use ableist language? You may be able to use existing data sets to build a filter that can intercept and remediate ableist language before it reaches readers. That being said, when it comes to sensitivity reading, AI models won’t be replacing human copy editors anytime soon. 

Want a coding copilot that gives you accessible recommendations from the jump? Train it on code that you know to be accessible.


I have no doubt that AI can and will harm people… today, tomorrow, and well into the future. But I also believe that we can acknowledge that and, with an eye towards accessibility (and, more broadly, inclusion), make thoughtful, considerate, and intentional changes in our approaches to AI that will reduce harm over time as well. Today, tomorrow, and well into the future.


Many thanks to Kartik Sawhney for helping me with the development of this piece, Ashley Bischoff for her invaluable editorial assistance, and, of course, Joe Dolson for the prompt.




lit

Rocket science (2007) / written and directed by Jeffrey Blitz [DVD].

[U.K.] : Optimum Releasing, [2008]




lit

Khodorkovsky : how the richest man in Russia became its most famous political prisoner (2011) / written, produced and directed by Cyril Tuschi [DVD].

[U.K.] : Trinity film, [2012]




lit

Taiwan Rejects China's Claims Over South China Sea Amid Military Escalation

The Chinese coast guard used water cannons against Philippine vessels, further escalating tensions.




lit

Probing the occurrence, sources and cancer risk assessment of polycyclic aromatic hydrocarbons in PM2.5 in a humid metropolitan city in China

Environ. Sci.: Processes Impacts, 2024, Advance Article
DOI: 10.1039/D3EM00566F, Paper
Decai Liu, Xingquan Li, Jiaxin Liu, Fengwen Wang, Yan Leng, Zhenliang Li, Peili Lu, Neil L. Rose
Fifty-two consecutive PM2.5 samples from December 2021 to February 2022 (the whole winter) were collected in the center of Chongqing, a humid metropolitan city in China.




lit

Experimental factors influencing the bioaccessibility and the oxidative potential of transition metals from welding fumes

Environ. Sci.: Processes Impacts, 2024, Advance Article
DOI: 10.1039/D3EM00546A, Paper
Manuella Ghanem, Laurent Y. Alleman, Davy Rousset, Esperanza Perdrix, Patrice Coddeville
Experimental conditions such as extraction methods and storage conditions induce biases on the measurement of the oxidative potential and the bioaccessibility of transition metals from welding fumes.




lit

Co-culture of benzalkonium chloride promotes the biofilm formation and decreases the antibiotic susceptibility of a Pseudomonas aeruginosa strain

Environ. Sci.: Processes Impacts, 2024, Accepted Manuscript
DOI: 10.1039/D4EM00035H, Paper
Caihong Wang, Qiao Ma, Jiaxin Zhang, Nan Meng, Dan Xu
Benzalkonium chloride (BAC) is a disinfectant with broad-spectrum antibacterial properties, yet despite its widespread use and detection in the environments, the effects of BAC exposure on microorganisms remain poorly documented....




lit

Leaving problems of people to winds, Ministers going on political tours: Harish Rao




lit

Metropolitan commissioner appointed as coordinating officer for the survey

The order also named three officials as monitoring officers, each in charge of two zones




lit

Congress govt doing injustice by not releasing sanctioned money for Dalit Bandhu: BRS MLA Kaushik Reddy

BRS MLA Kaushik Reddy says he is being threatened with cases for supporting Dalits




lit

Plurality and identity: on the educational relations between chemistry and physics

Chem. Educ. Res. Pract., 2025, Advance Article
DOI: 10.1039/D4RP00288A, Perspective
Pedro J. Sánchez Gómez, Mauricio Suárez




lit

“It is not just the shape, there is more”: students’ learning of enzyme–substrate interactions with immersive Virtual Reality

Chem. Educ. Res. Pract., 2025, Advance Article
DOI: 10.1039/D4RP00210E, Paper
Henry Matovu, Mihye Won, Roy Tasker, Mauro Mocerino, David Franklin Treagust, Dewi Ayu Kencana Ungu, Chin-Chung Tsai




lit

In situ self-reconstructed hierarchical bimetallic oxyhydroxide nanosheets of metallic sulfides for high-efficiency electrochemical water splitting

Mater. Horiz., 2024, 11,1797-1807
DOI: 10.1039/D3MH02090H, Communication
Yaning Fan, Junjun Zhang, Jie Han, Mengyuan Zhang, Weiwei Bao, Hui Su, Nailiang Wang, Pengfei Zhang, Zhenghong Luo
The obtained bimetallic sulfide catalyst can be reconstituted as FeCoOOH, which has high efficacy for water splitting. The activation energy barrier of key reaction steps can be effectively reduced by dual-metal cooperation.




lit

Advances and challenges in the modification of photoelectrode materials for photoelectrocatalytic water splitting

Mater. Horiz., 2024, 11,1638-1657
DOI: 10.1039/D4MH00020J, Review Article
Longyue Yang, Fang Li, Quanjun Xiang
With the increasing consumption of fossil fuels, the development of clean and renewable alternative fuels has become a top priority.




lit

Improved photovoltaic performance and stability of perovskite solar cells by adoption of an n-type zwitterionic cathode interlayer

Mater. Horiz., 2024, Advance Article
DOI: 10.1039/D4MH00253A, Communication
Young Wook Noh, Jung Min Ha, Jung Geon Son, Jongmin Han, Heunjeong Lee, Dae Woo Kim, Min Hun Jee, Woo Gyeong Shin, Shinuk Cho, Jin Young Kim, Myoung Hoon Song, Han Young Woo
Integration of NDI-ZI as a cathode interlayer in perovskite solar cells improves both device efficiency and stability, mitigating halide and Ag ion migration by chemically capturing ions via electrostatic Coulombic interactions.




lit

Construction of hierarchical porous and polydopamine/salicylaldoxime functional zeolite imidazolate framework-8 via controlled etching for uranium adsorption

Mater. Horiz., 2024, Accepted Manuscript
DOI: 10.1039/D3MH02108D, Communication
Kai Tuo, Jin Li, Yi Li, Chuyao Liang, Cuicui Shao, Weifeng Hou, Zhijian Li, Shouzhi Pu, Chunhui Deng
Efficient uranium extraction from seawater is critical for developing the nuclear industry. Herein, a polydopamine/salicylaldoxime decorated hierarchical zeolite imidazolate framework-8 (H-PDA/SA-ZIF-8) is constructed by using a controlled etching process. Benefiting...




lit

Advancing Lithium-Sulfur Battery Efficiency: Utilizing a 2D/2D g-C3N4@MXene Heterostructure to Enhance Sulfur Evolution Reactions and Regulate Polysulfides in Lean Electrolyte Conditions

Mater. Horiz., 2024, Accepted Manuscript
DOI: 10.1039/D4MH00200H, Communication
Vijay Kumar, Otavio Augusto Titton Dias, Abdelaziz Gouda, Ritu Malik, Mohini M. Sain
Lithium–sulfur batteries (LSBs) show promise for achieving a high energy density of 500 Wh/kg, despite challenges such as poor cycle life and low energy efficiency due to sluggish redox kinetics...




lit

Composited silk fibroins ensured adhesion stability and magnetic controllability of Fe3O4-nanoparticle coating on implant for biofilm treatment

Mater. Horiz., 2024, Advance Article
DOI: 10.1039/D4MH00097H, Communication
Kecheng Quan, Zhinan Mao, Yupu Lu, Yu Qin, Shuren Wang, Chunhao Yu, Xuewei Bi, Hao Tang, Xiaoxiang Ren, Dafu Chen, Yan Cheng, Yong Wang, Yufeng Zheng, Dandan Xia
Magnetic propulsion of nano-/micro-robots is an effective way to treat implant-associated infections by physically destroying biofilm structures to enhance antibiotic killing.




lit

High-Entropy Materials for Thermoelectric Applications: Towards Performance and Reliability

Mater. Horiz., 2024, Accepted Manuscript
DOI: 10.1039/D3MH02181E, Review Article
Nouredine Oueldna, Noha Sabi, Hasna Aziam, Vera Trabadelo, Hicham Ben Youcef
High-entropy materials (HEMs), including alloys, ceramics and other entropy-stabilized compounds, have attracted considerable attention in different application fields. This is due to their intrinsically unique concept and properties, such as...




lit

Strength-ductility materials by engineering coherent interface at incoherent precipitates

Mater. Horiz., 2024, Accepted Manuscript
DOI: 10.1039/D4MH00139G, Communication
Dongxin Mao, Yuming Xie, Xiangchen Meng, Xiaotian Ma, Zeyu Zhang, Xiuwen Sun, Long Wan, Korzhyk Volodymyr, Yongxian Huang
In the quest for excellent light-structural materials that can withstand mechanical extremes for advanced applications, design and control of microstructures beyond current material design strategy become paramount. Here, we design...




lit

Microcage flame retardants with complete recyclability and durability via reversible interfacial locking engineering

Mater. Horiz., 2024, 11,1867-1876
DOI: 10.1039/D4MH00116H, Communication
Furong Zeng, Lei He, Jianwen Ma, Danxuan Fang, Zhiwei Zeng, Tongyu Bai, Rong Ding, Bowen Liu, Haibo Zhao, Yuzhong Wang
A new facile and scalable interfacial locking engineering strategy is exploited to endow reversible microcages with infinite chemical recyclability to starting monomers, exceptional durability, high flame-retardant efficiency, and extensive applicability across diverse polymers.




lit

Not Just Art, an initiative by city-based Youth4Jobs brings hope to the artists with disabilities

A celebration of arts and abilities




lit

Watch | An online design studio inspired by Tamil literature




lit

French artist Olivia de Bona showcases the possibilities of wall art at her exhibition in Thiruvananthapuram

Olivia’s works were showcased at Alliance Francaise de Trivandrum as part of the second edition of Wall Art Festival




lit

A world of new possibilities in the Tarab Khan exhibition at Gurugram’s camera museum

Titled ‘At the Gates of Talbosh’, the paintings at Museo Camera offer an escape from reality and create a new world




lit

Guru Dutt's sister and famous painter Lalitha Lajmi passes away at 90

Over the decades, Lajmi has held several exhibitions at international art galleries in Paris, London and Holland




lit

Abstract paintings by Chennai artist Bhagwan Chavan comment on freedom and responsibility

A collection of abstract paintings by Bhagwan Chavan deliberates on freedom and the responsibility that comes with it




lit

A moment for just transition litigation to take wing

The core issue in the M.K. Ranjitsinh case — the protection of the Great Indian Bustard in energy projects — can be used to facilitate equitable and inclusive climate action




lit

New migrant realities in Karnataka’s gig sector

In the context of the broader ramifications of Karnataka’s local employment law, this is one of five State-level reports from the southern States — on the Editorial and Opinion pages — on labour conditions on the ground. In Karnataka, social security schemes which demand domicile status could be ignorant of the migrant realities of today




lit

Reality of reel life, exploitation as a structural problem

The findings in the K. Hema Committee report must pave the way for reforms in the film industry; the government needs to take an effective role in this




lit

Instability and uncertainty stalk Bangladesh

The troubles in Bangladesh are by no means over and India may need new strategies to deal with the situation to its east




lit

Indian military export to Israel — aiding genocide

The top court’s dismissal of a petition on the subject highlights the limits of judicial review over executive decisions in matters of foreign policy, especially in violations of humanitarian law




lit

Bridging the chasm of global inequality

The big lesson from the Summit is that developing countries have yet to exploit the opportunities presented by the U.N. system




lit

The gruelling course of litigation in India

Court scheduling and case management continue to be a hurdle that litigants face




lit

Israel’s brutality in Gaza, India’s pin-drop silence

New Delhi’s response to the Gaza war is a reflection of the ‘new India’s’ collective anti-colonial amnesia




lit

‘IS militants kill 30 civilians in central Afghanistan’

There has so far been no official claim of responsibility from the group.




lit

19 LeT militants killed in airstrikes in Afghanistan

The statement, however, did not provide the details of the air strikes and was silent on whether the international coalition was involved in the operation.




lit

Pakistan’s polity doesn’t have capacity to sustain normal ties with India: Menon

"I would characterize [India—Pakistan relations] today as managed hostility, which I hope it stays managed," Mr. Menon said in response to a question.




lit

Pakistan court moved against ‘secret demolition’ of Hindu temple

The petitioners added that different newspapers had reported, along with pictures, that the ancient Hindu temple located in Mohallah Wangrhi Garah was being secretly demolished so a commercial plaza could be constructed there.




lit

Wage hike of little help for Sri Lankan estate workers

From the time the British brought them down from Tamil Nadu a century ago, estate workers have been toiling in Sri Lanka’s famed tea estates.




lit

32 ‘elite’ Muslims have joined IS, says Sri Lanka




lit

Bodies of Karachi airport attack militants to be exhumed

Three men are facing trial before an anti-terrorism court for allegedly providing logistic support, funds and weapons to the attackers.




lit

Pakistan bans two militant outfits

Jamaat-ul-Ahrar, a splinter group of the Tehreek-i-Taliban Pakistan, and Lashkar-i-Jhangvi Al-Alami (LeJ) were banned.




lit

Credible minimum deterrence needed for regional stability: Aziz

Aziz claimed that Pakistan is maintaining "minimum nuclear deterrence" for peace and stability in the region and called upon the international community to desist from policies and actions that undermine strategic stability in South Asia.




lit

Supreme Court directs Centre to establish mandatory accessibility standards for disabled persons

The bench found that one of the rules of the Rights of Persons with Disabilities (RPWD) Act does not establish enforceable, compulsory standards, but rather, it relies on self-regulation through guidelines




lit

Water quality of Ganga in UP deteriorating due to discharge of sewage: NGT

Earlier, while considering the prevention and control of pollution in the Ganga, the green body sought compliance reports from various States, including Uttar Pradesh




lit

Layer of smog covers Delhi as air quality remains 'very poor' for 10th day post-Diwali

As the air pollution levels in the national capital reach "very poor," doctors say that even people with no history of respiratory diseases are suffering from breathing issues




lit

Kozhikode Corporation declared fully digitally literate, Opposition unhappy

UDF-led Oppn. boycotts event; senior citizens in 75 wards trained to use mobile phones, internet, says Mayor