Designing the first set of apps for Apple Vision Pro while helping visionOS become a powerful OS for app developers.

Project Overview

The last project I worked on while I was at Apple was redesigning the iWork apps for Apple Vision Pro and visionOS. Much of the OS hadn’t been designed yet when the other app design leads and I were brought onto the project. The OS team’s goal was that having the various app design leads flesh out their respective apps against the foundational design decisions of pre-beta visionOS would help inform their roadmap: which functionality needed to be prioritized to sufficiently support third-party developers.

My Contribution

As this was an ultra-black project, I was the only member of my design team who could work on it, so I did the early concepting and design myself, working with Apple’s OS design team for design reviews and guidance on platform design decisions. I left the company about three months into the project, so I didn’t see it through to completion.

Problem

When I joined this project, Apple Vision Pro was early in its development, especially visionOS. A lot of hardware and user-input problems had already been solved, like looking at an object to select it and pinching your fingers together to click or drag. The team had already decided that every app would be displayed as a plane in the real-world environment, and that you could drag apps around and drop them in specific positions in your room. But there were still a lot of open questions about how the OS would support different scenarios, especially when the expectation was that iOS developers could easily port their iPad apps to visionOS. Some of the questions I focused on were:

  1. How do document-based apps behave if you want to open multiple documents side by side? Could there be multiple app windows? Is there a primary app window for app-level dialogs? (See the sketch after this list.)
  2. How do you resize a window to see more of a document, as opposed to zooming in to see the document larger?
  3. How should parent/child windows behave? Should child windows be anchored to their parent window, or can you drag them independently?
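
For context, these concepts did eventually land at the OS level. Below is a minimal sketch of how a document-based, multi-window app is expressed in the SwiftUI that shipped with visionOS; it illustrates the ideas above rather than the internal designs we worked from, and the `SlideDeck` document type and `inspector` window are hypothetical.

```swift
import SwiftUI
import UniformTypeIdentifiers

// Hypothetical document type, for illustration only.
struct SlideDeck: FileDocument {
    static let readableContentTypes: [UTType] = [.data]
    var data = Data()

    init() {}
    init(configuration: ReadConfiguration) throws {
        data = configuration.file.regularFileContents ?? Data()
    }
    func fileWrapper(configuration: WriteConfiguration) throws -> FileWrapper {
        FileWrapper(regularFileWithContents: data)
    }
}

@main
struct PresentationApp: App {
    var body: some Scene {
        // Each open document gets its own window, so two decks
        // can sit side by side in the shared space.
        DocumentGroup(newDocument: SlideDeck()) { file in
            DeckEditorView(document: file.$document)
        }

        // A secondary window the user can drag independently:
        // one possible answer to the parent/child question above.
        WindowGroup(id: "inspector") {
            InspectorView()
        }
    }
}

struct DeckEditorView: View {
    @Binding var document: SlideDeck
    @Environment(\.openWindow) private var openWindow

    var body: some View {
        Button("Show Inspector") {
            openWindow(id: "inspector") // spawns the child window
        }
    }
}

struct InspectorView: View {
    var body: some View { Text("Inspector") }
}
```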

User & Business Goals

There are two versions of the iWork apps: the macOS version and the iOS (including iPadOS) version. macOS iWork is more advanced: it has more functionality and was designed to support more complex workflows. The iOS version supports the most common workflows, but it was designed to be simple and approachable more than powerful. You can still access most of the features, but it might take more steps or require entering discrete modes, whereas on a Mac you can achieve the same thing more quickly. The goal for the visionOS version was to find a happy middle ground: to be based on the iOS app (a technical decision), but with an infinite amount of screen space, which inherently encourages more complex workflows. If you have infinite space to work in, you want all your tools open and readily accessible rather than hidden behind tabs.

There was also a strong focus on creating special features that showed off the advantages of doing work with AR/VR headsets, both to inspire app developers to create a wide range of apps and to attract consumers who worried that they couldn’t justify paying $3,500 for a glorified gaming and media consumption device.

iPadOS Keynote alongside macOS Keynote. macOS apps have space for more buttons in the sidebar as well as buttons in the toolbar. On iOS/iPadOS, many controls are hidden in menus that display from the Format, Insert, and More buttons.

Solution

The reference designs provided by the OS design team gave guidance for the design framework: a single window proportioned similarly to a landscape iPad, with a blurry translucent surface for UI panels. After establishing the baseline designs for the hero screens (slide editor, light table, and presentation mode), I iterated heavily on the sizes of controls. Some variations had large buttons reminiscent of the iOS buttons you would tap with your fingertips, but those proved too large and chunky to fit enough controls without scrolling. Mac-style controls, meanwhile, looked dated and traditional floating in mid-air on a translucent surface.
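
For reference, the shipping version of visionOS expresses that translucent panel material directly in SwiftUI. A minimal sketch, assuming the standard glass material; the button labels are hypothetical stand-ins, not the actual Keynote toolbar:

```swift
import SwiftUI

// A floating tool panel on visionOS. The glass background is the
// blurry translucent surface described above.
struct ToolPanel: View {
    var body: some View {
        HStack(spacing: 16) {
            Button("Format") {}
            Button("Insert") {}
            Button("Animate") {}
        }
        .padding()
        .glassBackgroundEffect() // the translucent panel material
    }
}
```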

Show mode: an immersive view of your slideshow, intended for audiences to view the animations, videos, and sounds in your presentation. The presenter controls are attached to the user’s vantage point and follow the user as they move around the room.

Theater Mode

The new headset hardware and sensors allowed for a new kind of user experience, and we wanted to use it to find new solutions to existing user problems. We landed on the user’s need to rehearse their presentation. Often when inexperienced presenters stand on stage for the first time, it’s a disorienting and overwhelming experience. Spatially, talking to a group of people 10 ft away feels very different from talking to thousands of people 100 ft away. You have to lift your head higher, you have to project your voice louder, and all of your body movements have to be exaggerated because you’re so far from your audience. Doing this well requires a lot of professional training and practice. We thought Apple Vision Pro’s immersive experience was a perfect way to help people with this problem.

In Keynote’s Theater mode, the presenter is placed at center stage in a large theater. Aside from rows of seats, the only thing the presenter sees is the presenter display (current slide, next slide, notes, and a timer), emulating what they would normally have as down-stage monitors. The actual theater screen is behind them. This helps train presenters not to turn their back on the audience to look at the slide, a habit that’s more common in a conference room. The presenter can also jump to different seats in the theater to get a viewer’s perspective of the stage, slide legibility, and presenter size.
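
A rough sketch of the core mechanic, assuming RealityKit’s head-target anchor as it shipped on visionOS; the panel geometry and placement values are hypothetical placeholders for the real presenter display:

```swift
import SwiftUI
import RealityKit

// Sketch of a presenter display that follows the user's vantage point.
struct TheaterRehearsalView: View {
    var body: some View {
        RealityView { content in
            // Anchoring to the head makes the panel travel with the
            // presenter as they move around the virtual theater.
            let anchor = AnchorEntity(.head)

            // A simple plane standing in for the presenter display,
            // placed low and in front like a down-stage monitor.
            let panel = ModelEntity(
                mesh: .generatePlane(width: 0.6, height: 0.35),
                materials: [SimpleMaterial(color: .black, isMetallic: false)]
            )
            panel.position = [0, -0.25, -1.0]
            anchor.addChild(panel)
            content.add(anchor)
        }
    }
}
```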

3D Objects

As we were designing the iWork apps for visionOS, one of the shortcomings we identified early was how underwhelming it felt to be in an immersive, three-dimensional environment but work with apps that were all two-dimensional. The closest analogy is being in a museum: walking around a space and looking at flat surfaces. We knew that to create a really magical moment, we needed to let the user create in 3D. But documents, spreadsheets, and slideshows are all inherently 2D.

I worked to align my product team on adding a new feature to the roadmap for the launch of iWork on visionOS: the ability to add 3D objects to a document. This way, a user could “pull out” a 3D object and interact with it spatially: rotating it to see its various sides, resizing it to be larger, and walking around it. But that alone didn’t seem especially useful for people trying to create richer, more meaningful documents. So we designed the feature around the question: if users could have 3D objects in documents, what scenarios would be most common or most useful? If people were studying an object from all angles, it would make sense to add callouts and descriptions that point to areas of the object. In Keynote, it would make sense to let people animate the rotation and scale of objects, to tell a spatial narrative.
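
As a point of reference for how this kind of interaction is expressed today, here is a minimal sketch using the Model3D view that shipped with visionOS; the “EngineBlock” asset and the drag-to-spin mapping are hypothetical, for illustration only.

```swift
import SwiftUI
import RealityKit

// Sketch of an embedded 3D object the user can spin to inspect
// from other angles. "EngineBlock" is a placeholder USDZ asset.
struct EmbeddedObjectView: View {
    @State private var angle: Angle = .zero

    var body: some View {
        Model3D(named: "EngineBlock") { model in
            model
                .resizable()
                .aspectRatio(contentMode: .fit)
        } placeholder: {
            ProgressView()
        }
        // Spin around the vertical axis as the user drags sideways.
        .rotation3DEffect(angle, axis: (x: 0, y: 1, z: 0))
        .gesture(
            DragGesture().onChanged { value in
                angle = .degrees(value.translation.width)
            }
        )
    }
}
```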