After working on the Vision Pro, this AR veteran is going back to phones

www.theverge.com · April 30, 2026 · 15:30

Pixi CEO Mark Drummond thinks headsets are not worth the hassle for AR developers.

When Mark Drummond was working on the Vision Pro at Apple, he had a bit of an epiphany that didn’t really fit Cupertino’s preferred narrative.

Drummond was managing the Character Intelligence Team, which among other things built the Encounter Dinosaurs demo. Preinstalled on the Vision Pro, the demo puts viewers eye to eye with interactive prehistoric creatures.

“We built that with Jon Favreau,” Drummond says, referring to Apple’s long-standing partnership with the Mandalorian director. “Before the headset was available to us to take down to Burbank, we used iPhones and iPads,” he recalls. Relying on mobile devices for demos made sense. VisionOS is essentially an iPadOS fork. “It did actually work out. We had a really good experience with iPhone and iPad,” he says.

Through that experience, Drummond realized that these mobile devices weren’t just decent stand-ins for the headset. “What we learned looking for sources of surprise and delight with interactive characters in mixed reality is that the headset is actually not the best [device] for this kind of thing,” he says.

“I still think it’s a pretty fabulous piece of hardware,” Drummond says. However, headsets can also be alienating, and separate viewers from the world and from the people around them. “It’s kind of lonely,” Drummond says. Having an AR app on a phone, on the other hand, makes it much easier to show it to others. “People can lean in over your shoulder,” Drummond says.

That’s why, shortly after leaving Apple in 2023, Drummond embraced mobile devices for augmented reality storytelling. For the past two and a half years, he has been working on a new iPhone AR app called Pixi that’s all about mobile-first interactive storytelling.

In short, Pixi is working on something that could best be described as the AR version of the email greeting card. When it launches in the coming weeks, Pixi will let anyone pick an interactive character and a scenario, add a personalized message, and then send it to their contacts via iMessage or WhatsApp. Once the recipient opens such a Pixi message, the character appears overlaid in the camera view of their real-world environment, and interacts with them.

During a recent demo, Pixi’s AR experiences included an animated cat and a robot. They could tell jokes, play tic-tac-toe with you, or challenge you to a game of Whac-A-Mole, right on your desk.

At first glance, none of it seemed especially groundbreaking if you’ve ever played with an AR Snapchat filter or any other mobile AR app. “If you’re not slightly embarrassed by the first product, you launched way too [late],” Drummond says, suggesting that the company will incorporate additional characters and more advanced storylines in the future.

However, Pixi is incorporating some interesting interactive elements: The cat comedian, for instance, delivers one dad joke after another until the app detects the viewer cracking a smile via the phone’s front-facing camera. After that, the character thanks the viewer for their attention like a comedian ending their show and delivers a personalized greeting message.

“In order to make a character feel like it’s present, it has to pay attention,” Drummond says. That includes not just paying attention to the viewer, but also the environment they are in. “This kind of attention is only possible through on-device AI,” he says. Pixi uses AI and machine learning to recognize facial expressions, and the app also downloads custom ML models on the fly to recognize objects and then incorporate them into a story.

When mobile AR first became popular with Snapchat filters and ARKit-based apps, developers had very little to work with when it came to interacting with a viewer’s real-world environment. Basically, apps would detect surfaces and just put any virtual object on top of those surfaces. Over time, AR apps became smarter about occlusion, allowing characters to hide behind real-world objects. Eventually, some even offered rudimentary object detection.

With recent advancements in AI, there’s a potential to supercharge object recognition, to the point where apps will be able to recognize classes of objects, understand how they work, and incorporate them into interactive experiences. Here’s how Drummond envisions Pixi’s AR greeting cards will work in the future:

“Let’s say that I have a friend [who is] a lawyer. She got promoted, and I want to say congratulations. I send her a [Pixi with a] golden retriever, which appears on the office floor. It looks at her, then starts sniffing. It walks over to a filing cabinet, and is very focused on a drawer in that filing cabinet. She thinks: Dog, that’s just old tax returns. The dog will not relent. It looks, sniffs, looks winsome. She opens the drawer. Inside, we’ve hidden a huge virtual dog biscuit [with the message]: ‘Congratulations, see you Friday.’”

Would all this be technically possible with a Vision Pro app as well? Perhaps. But there are plenty of other reasons holding the headset back — including that $3,500 price tag. “It’s expensive,” Drummond acknowledges, adding, “I still think it’s a steal given how much amazing technology you’ve got, just those displays alone.”

However, the high price of the device has also held back adoption, with IDC estimating that Apple shipped just 45,000 units during last year’s holiday quarter. “It doesn’t have great market penetration,” Drummond says.

Even people who own one may be reluctant to put it on every time someone sends them a two-minute clip or experience. “You don’t always have it to hand, and [you’ve got] the setup, teardown costs,” Drummond says. “We have phones with us all the time.”

It’s no secret that the Vision Pro was just a first step for Apple, which is reportedly working on its own smart and AR glasses. However, Drummond believes that such glasses won’t be able to replace smartphones for years to come due to physical constraints. Phones are now capable of running complex machine learning models locally, while glasses will have much more limited power and compute capacity. Because of those constraints, they’ll likely function as another display for notifications and directions, similar to the Apple Watch. “I think the Watch and the spectacles from Apple will do much the same job,” he says.

This means that AR developers looking to build interactive entertainment experiences may find that the phone remains the best platform for the foreseeable future — a platform that is getting more capable by the day, thanks to AI.
