4

Vision Two ‒ Virtual Worlds Appear

As we did in previous chapters in Part II, we will be looking at an industry vertical in this chapter. This chapter's vertical is Technology, Media, and Telecommunications (TMT). Since this vertical goes to the core of what Spatial Computing is about, this chapter will necessarily be longer than the other chapters that focus on industry verticals.

New realities are appearing thanks to Virtual Reality and Augmented Reality. In a few years, though, the capabilities of VR and AR will change and morph into more sophisticated Spatial Computing glasses, which will arrive alongside massive increases in bandwidth thanks to 5G, and new AI capabilities that are due in part to the R&D being done on autonomous vehicles. Here we dive into some of the fundamentals of the TMT vertical, as well as the profound changes we expect.

From 2D to 3D

As we look back at the technology industry, especially when it comes to personal computers and mobile phones, which customarily present two-dimensional images on a flat screen, we can see that there's an underlying set of goals―beliefs, even. These include things such as connecting everyone together, giving you "superpowers" to analyze business data with spreadsheets, and the ability to communicate better with photos and videos. Previous generations were just about giving us better tools than the mechanical ones that existed before, whether the printing press or the old rotary phones that so many of us now-old people had in our homes.

Increasing the Bandwidth

Elon Musk puts it best: the goal is to increase the bandwidth to and from each human. Now he's investing in a new kind of computing with his company Neuralink that includes "jacking in" and hooking a computer directly up to your brain. That will take a lot longer to show up for most people than Spatial Computing, simply because of the cost of opening up a human skull and having a robot surgically implant tiny wires directly in a brain.

One way to look at it is via our current computing interfaces. If you have a thought that you want to communicate with other people, you probably use a mouse or a finger on a screen to open an application or a window, and you use a keyboard to type a message. That's pretty much the same whether you want to communicate in a Microsoft Word document, an SMS text message, an email, or a Tweet on Twitter. On the other side, your reader sees what you wrote on a screen, and that person's eyes and visual perception system translate that into something their brain can understand, process, store, integrate, and then maybe reply to.

That process of reading is pretty fast in comparison to the process of writing. There's friction in communication. We're feeling it right now as we type these words into Microsoft Word and are trying to imagine you reading this book.

The process isn't a whole lot faster even if we use video. After all, you can even have someone, or, better yet, an AI, read this text to you while you walk down the street. That would be hard to do if we made a video. (Note: We previously discussed the relevance of AI, including Machine Learning, Deep Learning, and Computer Vision in Chapter 2, Four Paradigms and Six Technologies, so we won't replicate that information here.)

Plus, you can skim and do other things with text that video makes very hard. Your mind, on the other hand, would far rather watch a movie than read a movie script, or watch a football game on TV than read a report about it later.

Where we are going with this is that the tech industry is spending billions of dollars in an attempt to make it easier to communicate and work with each other, or even to better enjoy the natural world. We see the ultimate effect of this in the "jacking-in" fantasy that Elon Musk and others are trying to make a reality. Some take a more negative view, like John von Neumann with his term "the singularity," which was later popularized by Vernor Vinge in his 1993 essay in which he wrote that it would signal the end of the human era. Scary stuff, but the need to communicate our thoughts with not just other people, but machines, like we are doing now with Siri, Alexa, Google Assistant, and Facebook's Portal, is opening up new ways of communicating with machines. Elon Musk's Neuralink, which uses a robot to hook up tiny wires directly to a human brain, is the most forward-looking approach we've seen actually being put to use. That said, we don't think we'll sign up for that in the early exploratory years of the company!

In its current form, such technology brings with it deep side effects and, as of today, costs hundreds of thousands of dollars due to the necessity of a surgical procedure to install something like Neuralink. If you have, for example, Parkinson's, you may get such a technology implanted by the end of the 2020s, but the rest of us will be stuck wearing some form of on-the-face computing, whether eyeglasses or contact lenses, for the foreseeable future.

Change is Coming

Spatial Computing is evolving quickly, with several different device types (which we'll go into more depth about in a bit), but they all require putting screens on your face. We dream of a lightweight pair of glasses that will bring the whole spectrum of Spatial Computing experiences covered by the reality-virtuality continuum introduced by researcher Paul Milgram: a spectrum, now often called extended reality, or XR, that runs from "completely real" on one side to "completely virtual" on the other.

Today we have devices like Microsoft's HoloLens 2 that give us tastes of this spectrum, but they aren't really great devices for experiencing the "completely virtual" side of it, opting to let the user see the real world instead of presenting big, glorious, high-resolution virtual screens.

Within the next three years, or certainly by 2025, we'll see devices that let you switch between real and virtual so quickly and seamlessly that some might start to lose track of the reality of what they are experiencing. We'll cover the pros and cons of that in the later chapters here but it's also good for us to slow down and revisit the technology that's here today so that we can have a decent conversation about where things have been and where they might be going.

We argue that it's easier for you to understand what, for example, a Macintosh could do if you knew what previous computers could do and the design constraints that were placed on earlier devices and ecosystems. Our kids have no idea why the Dodge tool in Adobe Photoshop looks like a lollipop, but those of us who used to work in darkrooms with chemicals and enlargers remember using a tool like that to make parts of photos lighter. The same will happen here.

First, we constantly get pushback every time we explain that you will soon be forced to wear something on your face to fully appreciate the higher communication speed that Spatial Computing brings to your brain.

Yes, why glasses? Why can't we get to this "high bandwidth to and from the brain" world with standard monitors?

Thomas Furness, the American inventor and VR pioneer, asked himself just that question back in the 1960s when he was designing cockpits for pilots in the military. He, and many others, have tried over the years to come up with better monitors. The US government funded HDTV research for just that reason. The answer comes in how we evolved over millions of years. Evolution, or some might say God, gave us two eyes that perceive the analog world in three dimensions, and a large part of our brains is dedicated to processing the information our eyes take in.

If we want to take in more and get the bandwidth to our brains higher, we need to move our monitors onto our eyes. If we want to improve the outbound bandwidth from our brains to the computing world, we need to use voice, hands, eyes, and movement and add more sensors to let us communicate with computers with just our thoughts. All of this requires wearing sensors or microphones either on our faces or heads. There is no way around this except in special situations. Andy Wilson, one of Microsoft's top human/machine researchers up in Redmond, Washington, has a lab with dozens of projectors and computers where he can do a lot of this kind of stuff in one room―sort of like the Holodeck in Star Trek. But even if his room is successful, most of us don't want to compute only while sitting in a singular place, and most humans simply don't have the space in their homes and offices for such a contraption anyway.

We see this in high-end uses already. The pilots who fly the F-35 fighter jets tell us they are forced to wear an AR helmet simply to fly the plane. "I'll never lose to an F-16," one pilot at Nellis Air Force Base told us.

"Why not?" we asked.

"Because I can see you, thanks to my radar systems and the fact that your plane isn't stealthy, but you can't see me. Oh, and I can stop and you can't."

Soon many of us will be forced to follow the F-35 pilot in order to maximize our experiences in life. Those of us who already wear corrective lenses, around 60 percent of us, will have an easier time, since we are already forced to wear glasses simply to see. We see a world coming, though, where Spatial Computing devices bring so much utility that everyone will wear them, at least occasionally.

This will bring profound changes to humanity. We are already seeing that in VR. Users of VR are already experiencing a new form of entertainment and media; one that promises to be a bigger wave than, say, the invention of cinema, TV, or radio were in previous generations. This new wave builds on the shoulders of those previous innovations and even subsumes them all: wearing the glasses of 2025, you will still be able to read a magazine, watch a TV show, or listen to your favorite radio station (or the podcasts that replace them). As all of these improve, we are seeing new kinds of experiences, new kinds of services, and even a new culture evolving.

To those whom we have told this story, much of this seems improbable, weird, dystopian, and scary. We have heard that with each of the previous paradigm shifts too, but because this puts computing on your face, the resistance is a bit more emotional. We see a world coming over the next few years where we mostly will get over it.

Emergence as a Human Hook

So why do you need to put on a headset? Well, when you're inside the headset, you'll be able to experience things you've never been able to experience on a computer monitor before. But what's the purpose?

The Brain and VR

Why do human beings like playing in VR? Emergence is one reason, and we'll cover a couple of others in a bit. What is emergence? When applied to video games, emergence is the feeling you get when complex situations and unexpected interactions arise from relatively simple game dynamics. The same feelings are why we go to movies. It's why we like taking regular drives on a curvy road. It's why we like holding a baby for the first time. It's why we like falling in love. All of these create positive chemical reactions in our brains that are similar to the emergence that happens in video games. VR enables this more frequently. Since the 1970s and 1980s, board games and table-top role-playing games such as Cosmic Encounter or Dungeons and Dragons have featured intentional emergence as a primary game function by supplying players with relatively simple rules or frameworks for play that encouraged them to explore creative strategies or interactions and exploit them toward the achievement of victory or a given goal. VR makes all that much more powerful.

If we have sharper and more capable screens, we can experience things that make our brains happier or, if you will, put us in a flow state, the euphoria that both workers and surfers report when doing something that brings high amounts of emergence.

The Right Tool for the Job

The thing is, we're going to need different kinds of headsets for different things. VR headsets separate us from others. They aren't appropriate to wear in a shopping mall, on a date night, or in a business meeting. But, in each of these situations, we still want digital displays to make our lives better and to give us that feeling of emergence.

It's the same reason we would rather watch a football game on a high-resolution HDTV, instead of the low-resolution visuals of our grandparents' TVs. It's why we tend to upgrade our phones every few years to get better screens and faster processors. When we watch sports or movies on higher-resolution screens or bigger screens, or both, we can experience emergence at a higher rate.

Emergence is also why notifications on our phones are so darn addictive. Every time a new notification arrives, a new hit of dopamine arrives with it. Marketers and social networks have used these game dynamics to get us addicted and build massive businesses by keeping us staring at our feeds of new items. VR promises to take our dopamine levels from these mechanisms to new levels. In later chapters, we'll cover some of the potential downsides of doing that, but here we'll discover why Spatial Computing devices are seen as a more powerful way to cause these dopamine hits, and thus, a promise to build massive businesses.

These dopamine hits, along with powerful new ways to be productive, are coming from a range of new devices―everything from tiny little screens that bring new forms of your phone's notification streams, to devices that so completely fool your brain that they are being used to train airplane pilots, police, and retail workers.

The "Glassholes" Show the Way

It started with a jump out of a dirigible at Google's I/O developer conference back in June 2012. The jumpers were wearing a new kind of computer: one that put a little screen and a camera right near their right eye. They broadcast a live video feed from the jump to everyone watching. The demo was so compelling that people jumped up to be the first to order one (we were amongst the first in line).

Defeated Expectations

That said, the demo oversold what would actually materialize. Before we got ours, we thought it would be an experience so futuristic that we had to put down $1,500 to be among the first to experience it. Now, don't get us wrong, the first year of wearing Google Glass was pretty fun, mostly. People were highly interested in seeing it. At the NextWeb conference, attendees stood in line for an hour to try ours out. Most people walked away thinking they had seen the future, even though all they saw was a tiny screen that barely worked when we talked to it with "OK Glass" commands.


Photo credit: Maryam Scoble. This photo of coauthor Robert Scoble wearing Google Glass in the shower went viral and has been on BBC TV many times, along with many other media outlets.

As we got to wear ours, it became clear that Google Glass wasn't that futuristic and certainly didn't meet expectations. Robert Scoble's wife, Maryam, upon seeing it for the first time, asked, "Will it show me anything about my friends?" Truth is: no, it didn't, but that was the expectation. There were other expectations too, set up by that first demo. The camera wasn't that good. Certainly, it wasn't as good as everyone thought it was, which later became a problem since people thought it was constantly recording and uploading, or, worse, doing facial recognition on everyone.

The expectations were, though, that it was a magical device that you could look at anything with and learn more about that person or thing. Those expectations were never really corrected by either Google or the people who owned them, who soon gained a derisive name: "Glassholes." This derisive tone came about after a journalist named Sarah Slocum wore a pair into a bar and was, in her words, accosted due to wearing them. After that, we saw more businesses putting up signs forbidding users from wearing them, and Google soon gave up trying to sell them to consumers.

Where Did Things Go Wrong?

We are frequently asked, how did Google Glass go so wrong? And what does it mean for the next wave of products about to arrive?


Photo credit: Google. Here is a closer look at the original Google Glass, which had a small transparent screen, along with a camera, microphone, speaker, and a battery.

We think it mostly came down to that dramatic launch and the expectations it caused, but there was a basket of new problems that humans had never considered until Google forced this into public view. Doing such a spectacular launch with people jumping out of dirigibles over a San Francisco convention center with thousands of people in the audience, all watching on huge screens, with millions of other viewers around the world, just set a bunch of expectations that it couldn't meet.

We list the main complaints many had here:

  • The screen was too small.
  • The three-dimensional sensors weren't yet ready.
  • The Graphics Processing Units (GPUs) were very underpowered so the graphics that it could show were very rudimentary.
  • The battery life wasn't great.
  • It didn't have screens or cameras for both eyes, so it wasn't appropriate for doing more serious AR like what Microsoft's HoloLens 2 does today.
  • The cameras weren't as good as in your standard GoPro.
  • The screen was of far lower resolution and size than even your cheapest phone's at the time (compared to today's devices, it is like a postage stamp next to a letter-sized piece of paper).

Worse yet, Google didn't expect the social contract problems that showed up months after they shipped. Humans had evolved to look into each other's eyes to verify and communicate trust, and here was a new device messing with that. Many people focused negative attention on the camera. This touched on privacy expectations that hadn't been explored by previous technologies, and the lack of utility or affordability just added up to a perfect storm of negative PR that it couldn't escape from.

That said, Google Glass was, and is, an important product that set the table for a whole range of new products. It was the first computer that you wore on your face that most people heard about, at a minimum, and got a whole industry thinking about putting computers on people's faces. But, this is a hard space to get right. Why? Because it's so damn personal. The computer is on your face in front of your eyes, or as we learned earlier this year from a new start-up, Mojo Vision, actually on your eyes in the form of a contact lens. This isn't like a desktop computer that you can hide in an office and only use when you aren't trying to be social. When fully expressed, these new computing forms change what it means to compute and what it means to be human.

What It Means to Be Human

Whoa. Change what it means to be human? Yes. Soon you'll be able to do things with these glasses that futurists have been predicting for decades. Everything from ubiquitous computing to next-generation AR that changes how we perceive the real world in deep ways. Qualcomm's new prototypes have seven cameras on a device about the same size as, or smaller than, Google Glass, which had only one camera. One of these cameras watches your mouth with Artificial Intelligence that was barely being conceived when Google Glass was announced. The others watch your eyes and the real world in high detail that bears little resemblance to what Glass' low-resolution camera was able to do.

It's important for us to pause a moment and take a bird's-eye view of the Spatial Computing landscape, though, and get a good look at all the devices and where they came from. Spatial Computing is a big tent that includes products that are small and lightweight, like the Google Glass, all the way to devices like the Varjo, which, as of 2020, cost $10,000 and needed to be tethered to a $3,000 PC. The Varjo is big and heavy, but the experience of wearing one is totally futuristic compared to the early Google Glass prototypes.

No longer are we stuck with a tiny postage stamp of a screen. Inside the Varjo, you have huge high-resolution screens. Where Google Glass is like a mobile TV on a kitchen counter, Varjo's device is like being in an IMAX theater. With it, you can simulate flying in an Airbus Jet's cockpit. Where the Google Glass could barely show a few words of text, the Varjo lets you feel like you are in the cockpit as a pilot, with tons of displays blinking away and everything so photorealistic that you feel like you are looking at hundreds of real knobs and levers in a real plane. Out the window, while wearing the Varjo, you can see cities and landscapes emerge, along with runways from a distant airport you were supposed to land at. All simulated, yes, but your brain sees so much resolution it thinks it is real.


Photo credit: Varjo. The highest-resolution VR headset, as of early 2020, is the Varjo XR-1. This $10,000 device is used by Volvo to design and test future car designs.

As 2020 evolves, you'll see a spectrum of devices with far more features and capabilities escape from R&D labs―including a small pair of glasses. Focals by North is a good example, and one that's similar to what Google Glass tried to do. On the other side of the spectrum are VR headsets from Valve, Varjo, Oculus, and others. We'll detail them here to give you context on the market and the kinds of devices that fit under the Spatial Computing tent.

When we say these will change what it means to be human, we mean that soon, the way that you experience the real world will be completely changed by AR. Already Snapchat is changing faces, floors, and buildings with a simplified version of AR. Soon every part of your human experience will be augmented as new Spatial Computing glasses, headsets, and other devices appear. The Varjo is giving us an early taste of just how complete these changes could be―inside the headset, you experience a virtual world in such detail that you no longer want to turn it off. It is being used to train pilots, and when we got a demo, the virtual cockpit that was presented was so sharp and colorful that we imagine that someday we'll just experience the whole world that way. These changes will deeply change human life. We predict that the changes will be deeper than those brought to us by the automobile or telephone.

Embodiment, Immersion, Presence, Empathy, and Emergence

In February 2014, Mark Zuckerberg visited Stanford's Virtual Human Interaction Lab. Prof. Jeremy Bailenson and his students walked Facebook's founder and CEO through a variety of demos of how VR could change people's experiences. He walked across a virtual plank that made him feel the fear of falling, experienced how VR could help with a variety of brain ailments, and played a few games. Bailenson ran us through the same demos years later, and we had the same experiences, including being freaked out by finding ourselves standing on a narrow piece of wood after the floor virtually dropped away to reveal a deep gap. Bailenson adeptly demonstrates how VR enables powerful emergence, along with a few new ways to cause our brains to be entertained: embodiment and immersion, not to mention a few other emotions, like the vertigo felt on that wooden plank.

Embodiment and Immersion

Embodiment means you can take the form of either another person or an animal. Bailenson showed us how embodiment could give us empathy for others. In one demonstration, we experienced life as a black child. In another demo, we had the experience of being a homeless man who was harassed on a bus. Chris Milk, a VR developer, turned us into a variety of forms in his experience "Life of Us," which he premiered at the Sundance Film Festival in 2017. We watched as people instantly flew when they discovered that they had wings, and took on other forms too, from a tadpole to a stock trader running down Wall Street.

Immersion is the feeling that you are experiencing something real even though you are in a totally virtual world; in other words, it is being present in that world. The perception is created by surrounding the user of the VR system with images, sound, and other stimuli that provide an engrossing total environment.


Photo credit: Facebook. The $400 Oculus Quest changed the market by making VR far easier thanks to having no cords and no need for a powerful PC to play high-end VR games and other experiences.

Within a few months of getting the demos from Bailenson and his team, Zuckerberg would acquire Oculus for $2 billion. Oculus was started a couple of years prior in Irvine, California, by Palmer Luckey, Brendan Iribe, Michael Antonov, and Nate Mitchell. The early VR device was a headset that had specialized displays, positional audio, and an infrared tracking system. On Kickstarter, Oculus raised $2.4 million, roughly 10 times what was expected.

Evolution

Since 2012, Oculus has evolved. The headset was paired with controllers that let a user touch, grab, poke, move, shoot, slap, and do a few other things inside virtual worlds. A new term, 6DOF, got popular around the same time. It stands for Six Degrees of Freedom and means that the headset and controllers are tracked as they move freely through a virtual world. Previous headsets, including the current Oculus Go, were 3DOF, which meant that while you could turn your head, you couldn't physically move left, right, forward, or backward. To experience the full range of things in VR, you'll want a true 6DOF headset and controllers.

Systems that aren't 6DOF won't deliver the full benefits of immersion, presence, embodiment, emergence, or the empathy that they can bring. That said, there are quite a few cheaper headsets on the market and they are plenty capable of a range of tasks. 3DOF headsets are great for viewing 360-degree videos, or lightweight VR for educational uses that don't require navigating around a virtual world, or the other benefits that 6DOF brings. It is this addition of 6DOF to both the controllers and the headset that really set the stage for the paradigm shift that Spatial Computing will deliver.
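
To make the 3DOF/6DOF distinction concrete, here is a minimal, hypothetical sketch in Python (not tied to any particular headset's SDK); the class and function names are ours, purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class Pose3DOF:
    # A 3DOF headset only knows which way you are facing (rotation).
    yaw: float    # turning left/right, in degrees
    pitch: float  # tilting up/down
    roll: float   # tilting side to side

@dataclass
class Pose6DOF(Pose3DOF):
    # A 6DOF headset adds where you are in the room (translation), so
    # leaning, crouching, and walking show up in the virtual world.
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0

def step_forward(pose, meters=0.5):
    """Take half a meter's step; only a 6DOF pose can record the move."""
    if isinstance(pose, Pose6DOF):
        pose.z -= meters  # forward is -z in a common graphics convention
    # A 3DOF pose has no position fields, so the step is simply lost.
    return pose

if __name__ == "__main__":
    print(step_forward(Pose6DOF(yaw=0, pitch=0, roll=0)))  # z records the step
    print(step_forward(Pose3DOF(yaw=0, pitch=0, roll=0)))  # unchanged
```

The point is simply that a 3DOF pose has nowhere to store a physical step, which is why leaning or walking does nothing in a 3DOF headset.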


Photo credit: Robert Scoble. Facebook had huge signs when Oculus Quest first showed up, showing the market advantages Facebook has in getting its products featured.

As we open the 2020s, the product that got the most accolades last year was the $400 Oculus Quest from Facebook. This opened up VR to lots of new audiences. Before it came out most of the other VR headsets, and certainly the 6DOF ones, were "tethered" to a powerful PC. Our first VR system was an HTC Vive connected to a PC with an Nvidia GPU and cost more than $2,500. Today, for most uses, a $400 Quest is just as good, and for the few more serious video games or design apps that need more graphic power, you can still tether to a more powerful PC with an optical cable that costs about $80.

Facebook has the Edge

Facebook has other advantages that are hard to match, namely its more than two billion users, along with the social graph they bring (Facebook has a contact list of all of those users and their friends). That gives Facebook the market power to subsidize VR devices for years and keep their prices down. The closest market competitor, Sony's PlayStation VR, is a more expensive proposition and requires a tether to the console.

The Quest comes with a different business model, though, that some find concerning: advertising. In late 2019, Facebook announced it would use data collected on its VR headsets to improve advertising that its various services show its users.


Photo credit: Robert Scoble. Here Scoble's son, Ryan, plays on Oculus Quest while his dad charges his Tesla at a supercharger.

This gives Facebook a near monopoly on sub-$500 devices that bring full-blown VR, and it is the only company delivering that experience without a tether.

Yes, Sony's PlayStation VR has sold more units, at five million, which seems impressive until you remember that the device has been on the market since 2016 and that the experience of using PlayStation VR is far inferior to using the Quest, with its better controllers, better selection of things to do, and integration into Facebook's social graph, which enables new kinds of social VR that others will find difficult to match. The Quest sold about half a million units between when it launched in May 2019 and January 2020. It was one of the hottest products during Christmas 2019 and was sold out for months. Even well into April 2020, the Quest remained sold out and could be found on eBay for a hefty premium over its $400 retail price.

What is notable is the shifting of price points. When PlayStation VR first came on the market, it was a lot cheaper for a kid to get when compared to buying a $2,000 gaming PC and a $900 headset, like the HTC Vive. The Quest obliterated that price advantage, and it's the business subsidy that Facebook is giving to its Oculus division that's turning the industry on its head. Anyone who wants to go mainstream has to figure out how to deal with that price differential due to the advertising-subsidized pricing that Facebook has. Google could also do the same, but its forays into Spatial Computing, including Glass and the Daydream headsets that used mobile phones, have fallen flat with consumers because they didn't provide the full VR magic of embodiment, immersion, presence, and so on.

Market Evolution

Our prediction is that Facebook will consolidate much of the VR market due to its pricing, social element, and the advantages that come with having a war chest of billions of dollars to put into content. Its purchase of Beat Saber's developer, Beat Games, in late 2019 showed that it was willing to buy companies that make the most popular games on its platform, at least in part to keep those games off of other platforms that will soon show up. Beat Saber was the top-selling game in 2019, one where you slice cubes flying at you with light sabers. This consolidation won't hit all pieces of the market, though. Some will resist getting into Facebook's world due to its continued need for your data to be able to target ads to you. Others, including enterprise companies, will want to stay away from advertising-supported platforms. To see that side of the market, we visited VMware's Spatial Computing lab at its headquarters in Palo Alto.


Photo credit: Robert Scoble. A wide range of headsets hang on the wall inside VMware's Spatial Computing lab in its Palo Alto headquarters.

At VMware, they buy and test all the headsets, so they can see all the different approaches various VR and AR headset manufacturers are taking. VMware's main business is selling virtual machines to run the world's internet infrastructure inside data centers and manage all the personal computers that connect to it.

Why is it investing in Spatial Computing? Because it is building an array of management infrastructure for enterprises to use, and it sees a rapidly growing demand for Spatial Computing. Walmart, for instance, bought more than 10,000 VR headsets for its training initiatives. Simply loading an OS on that many headsets is a daunting challenge―even more so if corporate IT managers want to keep them up to date, keep them focused on one experience, and make sure that they haven't been hacked and have access only to the appropriate things on corporate networks and to other workers.

Inside VMware Labs, Matt Coppinger, director, and Alan Renouf, senior product line manager, walked us through the headsets and the work that they are doing with both VR and AR. They told us about enterprises who decided against going with Facebook's solution because Facebook wanted to control the software loaded on the headset too tightly. Lots of enterprises want to control everything for a variety of reasons, from the ease of use that comes from getting rid of everything but one app that's already running when you put on the headset, to being able to control which corporations the collected personal data is shared with.

Together, they split the market into different use cases:

  • 6DOF tethered/high end: This is where Varjo reigns, with its ultra-high resolution and low latency, which is why Volvo uses it for car design. The headsets at the high end have sharper screens and a wider field of view, which makes them the best choice for architecture, design, and use cases where you have highly detailed and complex things to work on, like an Airbus cockpit simulation for teaching pilots to fly.
  • 6DOF tethered/mid-market and lower market: Valve Index, HTC Vive, HTC Vive Cosmos, and Oculus Rift headset products fit here. These tethered headsets allow the use of Nvidia's latest GPUs, so they are great for working on three-dimensional design, architecture, and factory work, but are much more affordable than the $10,000 Varjo. Most of these will run $1,500 to $3,000 for a decent PC and about $1,000 for the headset systems.
  • 6DOF self-contained: The Oculus Quest. This is the most exciting part of the VR market, they say, because it brings true VR at an affordable price with the least amount of hassle due to no cord and no external PC being required.
  • 3DOF self-contained: The Oculus Go, which is a headset that only does three degrees of freedom, has lots of corporate lovers for low-end, 360-degree media viewing and training with a minimal amount of interactivity or navigation. This is the headset that Walmart did all its training on, although it's now switching to the Quest to get more interactive-style training and simulations.

Keep in mind we have been through many waves of VR devices. VR has existed in the military since the mid-1960s. In the 1990s, people had VR working on Silicon Graphics machines (if you know what those are, that dates you to at least 40 years old or so). We flew plane simulators on such systems. But the VR form factor back then was huge and a Silicon Graphics machine cost more than $100,000.

We tend to think of that phase as an R&D phase, where the only uses possible were either military or experiences that cost a lot of money per minute. One of our friends had opened a retail shop where you could pay $50 for a few minutes inside one to simulate flying a fighter jet. Since most of us couldn't afford the fuel for a real jet, this was as close as we were going to get and it was pretty fun. Today's version of the same is an LBE (Location-based Entertainment) experience, where you can pay $20 to $50 for an average of 15 minutes per person to be "inside a movie" at Sandbox VR, The Void, Dreamscape Immersive, and others.

2014's New Wave

Instead of looking so far back, though, we see the new wave of Spatial Computing really started in 2014 when Facebook bought Oculus. That purchase by Mark Zuckerberg really started up a whole set of new projects and caused the world to pay attention to VR with fresh eyes.

We still remember getting a "Crescent Bay" demo, one of the first demos available on the Oculus Rift headset, in the back room at the Web Summit Conference in Ireland in 2016. That headset was tethered to a PC. Two little sensors were on a table in front of us. Once we were in the headset, they handed us controllers. It started out in darkness. When the demo started, we found ourselves on the top of a skyscraper.

It was the first time we had felt vertigo in VR; the experience was so real-feeling, thanks to the immersive power of VR, that our brains freaked out and we thought we might fall to our deaths.


Photo credit: Facebook. The original Oculus Rift VR system, with its two sensors and controllers. Not seen here is the PC with a big GPU inside, usually made by Nvidia, that it's tethered to.

Now how did a computer make us scared of falling to our death? Well, the headset, as it was moved around, showed a 360-degree world. We felt like we were really on top of a skyscraper and could fall over the edge. This was far different from any computer or medium we had ever seen or experienced before. The power of putting highly tracked screens on our faces that could reveal a virtual world was like magic. We barely noticed the cord that was hanging from a device over our head leading to a PC on the side of our play area.

Later, after we bought our own, we would come to understand the limitations of the sensor system. It built a virtual room that we couldn't really move out of. In our case, in our homes, it was a virtual box that was a few feet wide by a few feet long. The cord and the sensors kept us from leaving that box, and those limitations started to chafe. We couldn't play VR while on the road traveling. We couldn't take it to shopping malls. Trying to show people VR in their backyards proved very difficult. Getting even two of the systems to work meant getting black curtains to separate them, since the sensors didn't like being next to another set that was trying to operate.


Photo credit: HTC. The HTC Vive was popular because of its superior tracking system. These black boxes, called "Lighthouses," were aimed at a VR player and sprayed a pattern of invisible light that let the system track the headset and controllers accurately. This type of tracking system is still used in many high-end headsets.

Let's talk about those sensors because how tracking is done is a key differentiator between headsets. HTC Vive is still a favorite amongst heavy users of VR, in part, because its tracking system is so good. It consists of two black boxes (you could use more, if needed, in, say, a large space) that spray invisible light on a VR user walking around. This invisible light is used to track headsets, controllers, and newer trackers that users could put on objects, their feet, and other things. These black boxes, called "Lighthouses," are better at tracking than the ones Facebook sold with its Oculus Rift system. The original Oculus ones were simpler cameras that weren't quite as exact and couldn't track as many things as the HTC system did.
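
As a rough illustration of the outside-in idea, here is a deliberately simplified 2D toy in Python, not Valve's actual algorithm: each base station knows its own fixed position and, in effect, the angle at which its light sweep hit the headset's sensors, and the system intersects those two bearings to locate the headset. All names and numbers here are made up for the example.

```python
import math

def locate_headset(station_a, angle_a, station_b, angle_b):
    """Intersect two bearing rays (a 2D toy version of outside-in tracking).

    Each Lighthouse-style base station knows its own fixed position and the
    angle at which its sweep of light struck the headset's sensors.
    """
    ax, ay = station_a
    bx, by = station_b
    # Direction vectors of the two rays.
    dax, day = math.cos(angle_a), math.sin(angle_a)
    dbx, dby = math.cos(angle_b), math.sin(angle_b)
    # Solve station_a + t * da == station_b + s * db for t.
    denom = dax * dby - day * dbx
    if abs(denom) < 1e-9:
        raise ValueError("rays are parallel; add another base station")
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return (ax + t * dax, ay + t * day)

# Two base stations in opposite corners of a 4 m wall, both measuring the
# bearing to a headset standing at roughly (2.0, 1.0) meters.
print(locate_headset((0.0, 0.0), math.atan2(1.0, 2.0),
                     (4.0, 0.0), math.atan2(1.0, -2.0)))
```

Real systems do this in three dimensions, at hundreds of updates per second, and for every sensor on the headset and controllers, but the core geometric idea is the same.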


Photo credit: Robert Scoble. A girl plays with an HTC Vive. Here you see the tether to a gaming PC.

A lot of these advantages, though, have been whittled away by the Oculus team, which shipped quite a few updates to make its system work better. Finally, with the Oculus Quest, this older "outside-in" tracking system was replaced by a series of four cameras on the front of the headset, implementing a newer tracking approach that doesn't require putting boxes or cameras around a VR user.

Inside-out

We believe this newer system, which industry people call "inside-out" tracking because the sensors are looking from the headset "out" into the real world, is better for Spatial Computing because it frees users from having to calibrate sensors and place boxes or cameras around the room, and it also means you have the freedom to walk around much bigger spaces. In a few years, that will lead to glasses you can wear and play with virtually anywhere without worrying about the tracking.

The Oculus "inside-out" tracking also shows up in its Rift S model, which replaced the earlier Rift and its two external trackers. Why does Facebook have three models – the Go, the Quest, and the Rift S? The Quest is optimized for low-cost 6DOF. The Rift S has the best performance. Compared to the Quest, graphics inside look sharper and nicer and the field of view is a little larger. The Go is a low-cost media viewer, and we don't even like including it in a discussion about VR's capabilities because it is only 3DOF and can't let you do interactive games or experiences the way the Quest or the Rift S can.

Competing with the Rift S is the Valve Index, which got many gamers to salivate with its higher specs, including sharp displays with 1440x1600 pixels per eye, a refresh rate of up to 144 Hz, and a roughly 130-degree field of view. The Oculus Quest, for comparison, has a slower 72 Hz refresh rate and a smaller field of view of around 100 degrees. The negative for all that performance is that the Index requires a user to be tethered to an expensive gaming PC.


Photo credit: Valve. The Valve Index got a lot of great reviews from gamers, who valued its high-resolution screens with a wide field of view.

Finally, the road to bringing us this magic has been quite rocky. Even the best headsets haven't sold more than a few million units; the Quest struggled to get to half a million by the end of 2019, and many attempts at bringing the masses into VR have failed. Two notable failures are Google Daydream and Microsoft's Mixed Reality licensed headsets.

Why did these do so poorly in the market? Well, for different reasons.

Mistakes Were Made

The Google Daydream headset, and a similar idea from Samsung, let users plop their phones into a headset. This idea promised low-cost VR, but it didn't deliver very well. First of all, the mobile phones that were on the market three years ago didn't have hugely powerful GPUs, so the graphics capabilities were limited. Worse, the phones often got too hot because of this underpowered nature, since VR pushed a phone's capabilities harder than anything else could, and it drained the batteries quickly on devices people needed for a lot more than just playing in VR.

Ours crashed a lot, too, due to the processing demands that heated everything up. The weight of the phone also meant that these headsets weren't very comfortable, with a lot of mass hanging out in front of your nose.

What really killed them, though, was that they were only 3DOF and there just wasn't enough content to view. The people who might have hyped them up had just gotten their full 6DOF systems from HTC or Facebook and weren't likely to spend hours playing on underpowered mobile-based headsets.


Photo credit: Google. One of the attempts that didn't go well with consumers was the Google Daydream headset that let you drop your mobile phone into this device to do VR. But it wasn't very full-featured; it couldn't do 6DOF, for instance, and only ever had very limited content developed for it.

The Microsoft Mixed Reality headsets tried to fix many of these problems but also failed due to a lack of customers, mostly because the people who really cared about VR saw them as underpowered, with poor sensors and not much marketing behind them. Translation: the cool games and experiences were showing up on the HTC Vive or Oculus Rift. Things like Job Simulator, which every child under the age of 13 knew about due to the brilliant influencer marketing of its developer, Owlchemy Labs, which is now owned by Google, were selling like hotcakes, and games like that needed the more powerful headsets.

We tend to think it was a mistake, too, for Microsoft to call these VR headsets "mixed reality." That confused the market, since they came after Microsoft had released the critically acclaimed HoloLens (which we'll talk about a lot more in a bit), a true mixed reality device where monsters could come out of your actual walls. These VR devices weren't that, and buyers were left wondering why Microsoft didn't just call them VR headsets. Either way, these are gone now, except for the Samsung HMD Odyssey, and the Oculus Quest has set the scene for the next wave of Spatial Computing products.

We are seeing products in R&D that bear very little resemblance to the big, black headsets that are on the market as the 2020s open up. In fact, at CES, Panasonic showed off a new VR prototype that looked like a steampunk pair of lenses and was much smaller than the Oculus Rift. Oculus co-founder Palmer Luckey generated some excitement on Twitter when he posted a photo and praise of the Panasonic device (which we doubt will do well in the market due to its poor viewing angle, lack of content, and lack of 6DOF).


Photo credit: Microsoft. The original Microsoft-licensed VR headsets from Lenovo, Dell, Acer, and HP [from left to right and top to bottom]. The Samsung HMD Odyssey is missing from the photo, as it came out many months later than these, was relatively more expensive, and had the reputation of being the best quality of the bunch. Only the Odyssey is still available for direct purchase at a $499 retail price.

Either way, VR is here now, and the magic it brings of immersion, embodiment, and presence sets the stage for a whole range of new capabilities that will show up over the next few years.

The Spectrum ‒ Augmented Reality from Mobile to Mixed Reality

The popularity of Amazon's Alexa and its competitors shows us that humans are hungry for a new kind of computing. This form of computing lets us see new layers of information on top of the existing world. It is computing that surrounds us, walks with us through shopping malls, and assists us while we work out. This kind of computing could potentially even save our lives as it collects our visual, biometric, and textual data.

Closer Computers

To make it possible to get this level of human-machine interface, we will need to get computers closer to our brains and certainly our eyes.

We can do some of this kind of computing by holding a phone in our hands, which is already used for AR by kids playing Minecraft Earth or Pokémon Go. Holding a mobile phone, though, won't get us to the promised land of true Spatial Computing. It must be said, however, that developers are using mobile to build a variety of things that are important to pay attention to.

As the 2020s open, we are seeing some of these wearables evolving in the marketplace with a spectrum of devices.

On one side, we have very lightweight devices that look similar to the Google Glass device. A good example of these lightweight devices is Focals by North, which got some hype at the Consumer Electronics Show (CES) 2020, but there are a number of them from Vuzix, Epson, Nreal, and others. These have screens that show you a variety of information overlaid on top of the real world and do it with a form factor that looks like a pair of sunglasses.

At the other end of the spectrum are devices that completely change our visual experience as we move around in the real world. These devices, which include Microsoft's HoloLens 2, could be seen as a new kind of "mixed reality" device that brings advanced graphical features, a ton of sensors, a new form factor, and Artificial Intelligence to see, process, and display all sorts of new experiences, both visual and auditory, that profoundly change how we interact with the real world.

It is these higher-end devices that have us dreaming of a radically new way of computing, which is what we are covering in this book: Spatial Computing. With these devices, you're wearing seven cameras along with a bunch of other sensors, and because of the amount of processing that needs to be done and the kinds of visual displays that need to be included, these devices tend to be both heavier and more expensive than devices like Focals by North.

It is the near future, though, that has us most excited. Qualcomm announced a new XR2 chipset and reference design that will make possible powerful AR devices that will soon make the HoloLens 2 seem underpowered, heavy, and overpriced. We're expecting a wave of new devices to be shipped in 2021 based on the XR2, with new optics that will be far better than anything we've seen to date.

Let's go back, though, to where AR came from.

The Origins and Limitations of Augmented Reality

For us, the potential for AR came up in 2011 as Peter Meier, then CTO of a small AR company in Munich, Germany, took us into the snow outside of its offices and showed us dragons on the building across the street. These virtual dragons were amazing, but the demo was done with a webcam on a laptop. Hardly easy to do for consumers. Earlier attempts, like Layar, which was an AR platform for mobile phones, didn't hit with consumers because the mobile phones of the day were underpowered, and most didn't appreciate AR on small screens. These early efforts, though, did wake up a lot of people, including us, to the possibilities of AR. It also woke Apple up, which started working earnestly on Spatial Computing.

Within a few years, Apple had purchased that company, Metaio, and Meier works at Apple today. Since then, Apple has purchased dozens of companies that do various pieces of Spatial Computing, and that acquisition trend continues into 2020. While we were typing this, Apple bought another company that does AI that will prove important to the future we describe here. Metaio's work was the foundation of a new set of capabilities built into iPhones. Developers know this AR framework as "ARKit." Google has a competitive framework for Android users called "ARCore."

There are now thousands of applications that use ARKit, ranging from apps to let you measure things to apps that radically change the real world in terms of everything from shopping to new forms of video games. If you visit Apple's store at its new headquarters in Cupertino you will be handed an iPad where you can see an augmented display of the building across the street from the store.

Touch the screen and the roof pops off so you can see what it's like inside. Touch it again and you'll learn about the solar panels that provide clean energy to the building. Touch it again and you can see a wind analysis layer of the building.


Photo credit: IKEA. IKEA Place lets you move furniture around in your house and try it out to see if it fits your lifestyle before you buy it and unpack it, only to discover it doesn't work.

It is apps like these that show the power of AR, but they also show you the limitations, of which there are many. Let's dig into a few of the limitations of mobile AR:

  • You have to hold the phone: This keeps you from being able to use more capable controllers, or even your hands, to touch, manipulate, or shoot virtual items.
  • Tracking often doesn't work well: This is especially the case in dark rooms, in places that don't have a lot of detail, or in moving vehicles like planes or buses, where tracking is almost impossible. Apple, even in its headquarters, put a grid pattern on top of the building model so that the camera on the iPad could figure out where it was and align virtual items accurately and quickly.
  • Occlusion doesn't always work: Occlusion means that virtual items properly cover up things behind them, and that if a human or, say, a cat walks in front of a virtual item, the virtual item properly disappears behind that moving person or animal (see the sketch after this list).
  • Virtual items don't look natural for a variety of other reasons: It's hard for your iPhone to be able to properly place shadows under virtual items that match real-world lighting conditions. It's also hard to make virtual items detailed enough to seem natural. For things like virtual cartoons, that might not be a big deal, but if you want to "fool" your user into thinking a virtual item is part of the real world it's hard to do that because of the technical limitations inherent in today's phones and the software that puts these virtual items on top of the real world.
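
Occlusion ultimately comes down to a per-pixel depth comparison, and it is hard on a phone mostly because the device rarely has a reliable depth estimate of the real scene. Here is a hedged, toy illustration in Python of the test itself, using made-up depth values rather than the output of any real ARKit or ARCore pipeline:

```python
import numpy as np

def composite(camera_rgb, real_depth, virtual_rgb, virtual_depth):
    """Draw a virtual object only where it is closer than the real world.

    camera_rgb    : (H, W, 3) image from the phone's camera
    real_depth    : (H, W) estimated distance to the real scene, in meters
    virtual_rgb   : (H, W, 3) rendered virtual object
    virtual_depth : (H, W) depth of the virtual object at each pixel
    """
    # The virtual pixel "wins" only if it sits in front of the real surface;
    # otherwise the real world correctly occludes it.
    show_virtual = virtual_depth < real_depth
    return np.where(show_virtual[..., None], virtual_rgb, camera_rgb)

# Toy 2x2 frame: a virtual cube 1.5 m away, with a real cat at 1.0 m
# covering the left column and a far wall at 3.0 m on the right.
cam = np.zeros((2, 2, 3))
cube = np.ones((2, 2, 3))
real = np.array([[1.0, 3.0], [1.0, 3.0]])
virt = np.full((2, 2), 1.5)
print(composite(cam, real, cube, virt)[..., 0])  # cat hides the cube on the left
```

Real systems have to produce that real-world depth map live from cameras and machine learning, which is where most of the difficulty lies.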

Photo credit: Niantic. Pokémon Go lets you capture characters on top of the real world in its popular mobile game, which generated about a billion dollars in revenue in 2019.

The other problem that most people don't think about when they are holding their mobile phone is that their screen is actually fairly small. Even the biggest iPhones or Samsung devices are only a few inches across. That is hardly enough to give your brain anything close to the immersion, embodiment, and other advantages of VR. The problem of having to hold the phone or tablet is the most daunting, actually, particularly for workers. Imagine working on a factory line where you need both of your hands to insert a part into a product. Now, imagine having to put down that part to hold up a phone to augment your work and see details on what you need to do next to properly hook that part up.

Or, if you are playing video games, imagine trying to play paintball, or go bowling with your friends. The act of holding a phone, especially with a tiny screen, limits your ability to recreate the real-world version of those games.

This is why companies like Apple, Qualcomm, Magic Leap, Facebook, Google, and others have already spent billions of dollars to develop some form of wearable AR glasses, blades, or head-mounted displays, if you wish to call them that.

Wearable AR

The industry has been attempting to have a go at this for quite some time and already there have been some notable failures.

ODG, Meta, DAQRI, Intel Vaunt, and others populate the AR "burial zone" and insiders expect more to come. This is really a risky game for big companies. Why?

Well, to really take on the mass market, a company will need the following:

  • Distribution, stores: The ability to get people to try on these devices.
  • A brand: Because these computers will be worn on your face, consumers will be far more picky about brands. Brands that are perceived as nerdy will face headwinds. Already, Facebook announced it is working with Luxottica, which owns brands like Ray-Ban, Oakley, and many others.
  • A supply chain: If their product does prove successful, tens of millions will buy it. There are only a few companies that can make those quantities of products and use the latest miniaturization techniques, along with the latest materials science, to make devices lightweight, strong, and flexible to absorb the blows that will come with regular use.
  • Marketing: People don't yet know why they need head-mounted displays, so they will need to be shown the advantages. This will take many ads and other techniques to get consumers to understand why these are better than a smartphone alone.
  • Content: If you get a pair of $2,000 glasses home and you can't watch TV or Netflix, or play a lot of different games, or use them at work with a variety of different systems, you will think you got ripped off. Only a few companies can get developers to build these, along with making the content deals happen.
  • Design and customer experience: Consumers have, so far, rejected all the attempts that have come before. Why? Because the glasses that are on the market as of early 2020 generally have a ton of design and user experience problems, from being too heavy to not having sharp enough visuals, to looking like an alien force made them. Only a few companies have the ability to get where consumers want them to be. One company we have heard about has built thousands of different versions of its glasses and it hasn't even shipped yet. The cost of doing that kind of design work runs into hundreds of millions of dollars.
  • Ecosystem: When we got the new Apple Watch and the new Apple AirPods Pros in late 2019, we noted that turning the knob on the Watch (Apple calls it the Digital Crown) causes the audio on the headphones to be adjusted. This kind of ecosystem integration will prove very difficult for companies that don't have phones, watches, headphones, computers, and TV devices on the market already.
  • Data privacy: These devices will have access to more data about you than your smartphones do. Way more. We believe consumers are getting astute about the collection of this kind of data, which may include analysis of your face, your vascular system (blood vessels), your eyes and what you are looking at, your voice, your gait and movement, and more. Only a few companies can deal satisfactorily with regulators concerning this data and can afford the security systems to protect this data from getting into the wrong hands.

This is why when Meron Gribetz, the founder of Meta, told us that he thought he had a chance to be disruptive to personal computers, we argued with him. He thought the $200 million or so that had been invested in his firm would be enough. We saw that a competitive firm, Magic Leap, had already raised more than a billion dollars and thought that meant doom for his firm. We feel Magic Leap, even with its current war chest of $2.7 billion, doesn't have enough capital to build all eight of these required aspects from the preceding list. In fact, there have been news reports that indicate that Magic Leap is currently for sale. In April 2020, Magic Leap announced massive layoffs of about 50 percent of its workforce, or more than 1,000 people, and refocused away from attempting to sell to consumers.

We turned out to be right about Meta, for his firm was later decimated and sold for parts. Others came out of the woodwork with the same dream of leaving a stamp on the world. ODG's founder Ralph Osterhout had quite a bit of early success in the field. He invented night-vision scope devices for the military, and that success gave him a production line and a team in downtown San Francisco to come up with new things.

He sold some of his original patents to Microsoft where they were used on the HoloLens product. But, like Meron's company, his firm couldn't stay in the market.

Photo credit: Robert Scoble. Meron Gribetz, founder of Meta, shows off how his headset could augment the world and let you use your hands to work.

These early attempts, though, were important. They woke developers up to the opportunities that are here if all the pieces can be put together and they saw the power of a new way of working, and a new way of living, thanks to AR.

Photo credit: Intel. Intel Vaunt promised great-looking smart glasses with innovative digital displays, but didn't provide the features that potential customers would expect.

They weren't alone in their dream of changing computing for all humans, either. Some other notable failures tell us a bit about where we are going in the future. Intel's Vaunt aimed to fix a ton of problems seen in the other early attempts. It had no camera, due to the PR problems that Google Glass ran into, and the last version could only show two-dimensional overlay visuals in black and white. It also had no speaker or microphone. It looked like an everyday pair of glasses.

Too Much and Too Little

It was Intel's fear of the social problems of smart devices on your face, though, that doomed it. In April 2018, it announced it was canceling the Vaunt project. To do AR and the virtualized screens that would give these devices real utility beyond just being a bigger display on your face, the device needs to know where it is in the real world.

Not including cameras meant no AR. No microphones meant you couldn't use your voice to control the devices. No speakers meant you still needed to add on a pair of headphones, which increased complexity, and consumers just decided to stay away. Not to mention Intel didn't own a brand anyone wants on their face and didn't have the marketing muscle or the stores to get people to try them on and get the proper fit and prescription lenses.

DAQRI, on the other hand, went to the other side of the spectrum. They included pretty great AR and had a camera and much better displays. Workers wearing them could use them for hours to, say, inspect oil refinery pipes and equipment.

Photo credit: Robert Scoble. DAQRI's headset gathered hundreds of millions in investment and showed promise for enterprise workers, but never took off because of its tether and high cost.

Where Intel threw out any features that made the glasses seem weird or antisocial, DAQRI kept them in. Its headset was big, bulky, and had a tether to a processing pack you would keep on your belt or in a backpack. It appeared very capable, and because it was aimed at enterprise users who didn't need to worry about looking cool or having a luxury brand associated with the product, it seemed like it would see more success. And for a while it did, except for one thing: the price tag.

Many corporate employees heard about this system, gave DAQRI a call, then found out the costs involved, which could run into hundreds of thousands of dollars, maybe even millions, and that killed many projects. In September 2019, it joined the failure pile and shut down most operations.

Getting Past the Setbacks

Now, if we thought the failures would rule the day, we wouldn't have written this book. Here, perspective rules. There were lots of personal computing companies that went out of business before Apple figured out the path to profitability (Apple itself almost went bankrupt after it announced the Macintosh). Luckily, the opportunity for improving how we all compute is still keeping many innovators energized. For instance, at CES 2020, Panasonic grabbed attention with a new pair of glasses. Inside were lenses from Kopin, started by John Fam.

Photo credit: Robert Scoble. John Fam of Kopin shows off how his lenses work in a demo. His lenses and displays are now in Panasonic's latest head-mounted displays and soon, he promises, others.

One thing that's held back the industry has been the quality of lenses and displays, and Fam is one of the pioneers trying to fix the problems. What problems? Well, if you wear a Magic Leap or a HoloLens, you'll see virtual images that are pretty dark, don't have good contrast, and are fairly blurry, through optics that are fairly thick. They also have disadvantages including a small viewing area, usually around 50 degrees, compared to the 100 degrees on offer in most VR headsets. Because these new displays and lenses need to display virtual images while letting a user see through to the real world, they also are very difficult to manufacture, which is why a Magic Leap is priced at $2,500 and a HoloLens is $3,500 (as of January 2020, but we expect these prices to come down pretty rapidly over the next few years).

Today, there are only a few approaches to getting virtual images into your eyes. One approach is waveguides. Here, tiny structures in something that looks like glass bounce light from projectors dozens of times until the right pixel gets to the right place and is reflected into your eyes. You can't really see these extremely small structures with the naked eye, but this process makes for a flat optic that can be fitted into a head-mounted display. It has lots of disadvantages, the most significant of which is that waveguides are typically not very efficient.

The ones in the first HoloLens presented virtual images that could barely be seen outdoors, for instance. They also present eye-strain problems because it is hard to make these kinds of optics present images at various focal depths. Getting virtual images close to your eyes? Forget it. Even images that are a few feet away don't usually work at various depths, nor do they take into account things in the real world, which puts strain on your eyes. Two terms optics experts talk a lot about are vergence and accommodation.

Vergence and Accommodation

Vergence is how your eyes move inward as you try to focus on things that are closer and closer to you. Cross-eyed? That's vergence, and if your glasses don't align images well as they get closer, they can make your eyes much more tired than they would be after focusing on real-world items for the same amount of time.

Accommodation refers to how the lens in your eye changes shape to focus on things close up. The eye often uses cues from the real world to figure out how to do that, and in smart glasses, these cues aren't available because everything is presented on a single focal plane. Eye strain appears again as an issue.
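
To make the geometry concrete, here is a small Python sketch that computes the vergence angle for objects at different distances, assuming a typical interpupillary distance of 63 millimeters (our assumption, purely for illustration). It shows why a headset with a single fixed focal plane forces a mismatch: your eyes verge for the apparent distance of the virtual object while accommodating for the fixed plane.

import math

# Illustrative only: vergence angle for an object at a given distance,
# assuming a typical interpupillary distance (IPD) of 63 mm.
IPD_M = 0.063

def vergence_angle_deg(distance_m: float) -> float:
    """Angle (in degrees) the two eyes rotate inward to converge on a point."""
    return math.degrees(2 * math.atan((IPD_M / 2) / distance_m))

for d in [0.3, 0.5, 1.0, 2.0, 10.0]:
    print(f"object at {d:>4} m -> vergence ~{vergence_angle_deg(d):.1f} degrees")

# A headset with a single fixed focal plane (say, 2 m) asks the eyes to
# accommodate at 2 m while verging at whatever distance the virtual object
# appears to be -- the mismatch is the vergence-accommodation conflict.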

Microsoft with HoloLens 2, its latest AR device, went in a different direction―it is using a series of lasers, pointed into tiny mirrors that reflect that light into the back of your eye, which allows you to see virtual images. So far, this approach brings other problems. Some users say they see a flicker, while others say that colors don't look correct.

Either way, we haven't seen one monitor/lens system yet that fixes all these problems. The closest we've seen is one from Lumus, but because it stayed with waveguides, it needed huge projectors pushing light into the optics. This might be good for something as big as a HoloLens (which weighs about a pound) but the industry knows it needs to ship products that are closer to what Intel's Vaunt was trying to do if it wants to be popular with users.

We hear new optics approaches are coming within the next few years, which is what will drive a new wave of devices, but even those spending billions are no longer hyping things up. Both Facebook's founder Mark Zuckerberg and Magic Leap's founder Rony Abovitz have been much more muted when discussing the future with the press lately. Zuckerberg now says he believes we will get glasses that replace mobile phones sometime in the next decade.

Fast, Cheap, or Good

The saying goes that when designing a product, you can have it cheap, fast, or good, except you can only have two of the three. By "good," we mean well designed and appealing for the user: a good and intuitive user interface, appealing aesthetics, and so on. If you've got a "good," fast product, it's not going to be cheap. If it's both fast and cheap, it's unlikely that it's had so much time and expertise put into the design process to bring you a "good" user experience. Spatial Computing product designers will have to make other choices on top of this canonical choice as well. Lightweight or the best optics? A cheap price tag or the best performance? Private or advertising-supported? Split the devices up into their component parts and you see the trade-offs that will have to be made to go after various customer contexts. Aiming a product at someone riding a mountain bike will lead to different trade-offs from aiming one at a construction worker inspecting pipes, for instance.

Since we assume you, our reader, will be putting Spatial Computing to work in your business, we recognize you will need to evaluate the different approaches and products that come to market. These are the things you will need to evaluate for your projects. For instance, having the best visuals might be most important to you, and you are willing to go with a bigger, and heavier, device to get them.

Photo credit: Robert Scoble. DigiLens Display at Photonics West in 2019. Photonics West is where the industry's optics experts go to show off their latest work and here a potential customer tries on DigiLens' latest offering.

Here are the 10 technology areas where devices will compete:

  • The monitors and optics: These control what you see. The ones that present more data in a bigger field of view with more pixels, or higher resolution, are usually found in bigger and more expensive devices than those on the lightweight side of the scale.
  • The GPU: GPU speed means how many polygons per second can be processed. The more powerful the GPU in your device, the more detailed the virtual items presented can be. Again, the more powerful ones will be in devices that can supply the battery power these will need and the heat handling that comes along with more powerful chips.
  • Battery: If you want to use your devices for more hours without needing to charge, you'll need a bigger and heavier one. There will, too, be various approaches to charging said batteries that you will need to consider.
  • Wireless capabilities: Will it have a 5G chip so it can be used on the new high-speed wireless networks? Or will it only have Wi-Fi capabilities, like current Microsoft HoloLens devices? That's great if you'll only use it at home or work, where Wi-Fi abounds, but makes it harder to use elsewhere.
  • The sensors: Newer and bigger devices will have several cameras and a variety of other sensors―one to look at your mouth for data to build avatars and study your sentiment, one for your eyes, another to gather light data, so it can adjust the monitor to be brighter or darker depending on conditions, and four for the world, two of which will be black and white for tracking and two of which will be color for photography and scanning items. In addition, there's usually an Inertial Measurement Unit (IMU), which watches how the unit moves. An IMU actually has a group of sensors―an accelerometer, a gyroscope, and a magnetometer that acts as a compass. These give the system quite a bit of data about how the headset is moving (a small sketch of how that data can be fused appears after this list). The top-of-the-line units also have some three-dimensional sensors. All these sensors add many new capabilities, including the ability to have virtualized monitors that stick to surfaces such as real-world tables, or AR characters, masks, items, and such.
  • AI chip: Newer devices will add on chips to decipher what the unit is seeing and to start predicting the next moves of both the user and others in their field of view.
  • Corrective lenses: Many users will need custom lenses inserted, or systems that can change the prescription of the system. Apple and others have patented various methods to bend lenses to supply different corrective prescriptions.
  • Audio: We are seeing array microphones in some high-end units (an array has a number of microphones that can be joined together in software so that they work in higher-noise situations, such as a factory floor). Also, a variety of speakers are available, with more on the way. Some even vibrate the bones in your face, which passes audio waves to your inner ear.
  • Processing: Some systems are all self-contained, like the Microsoft HoloLens, where everything is in the unit on your head. Others, like Nreal and Magic Leap, need to be tethered to processor packs you wear, or to your phone.
  • The frame and case: Some design their products to be resistant to a large amount of abuse, others design them to be hyper-light and beautiful. There will be a variety of different approaches to water resistance and shock resistance, and trade-offs between expensive new materials that can provide flexibility and reduce device weight while looking amazing, versus cheaper materials that might be easier to replace if broken.
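
To make the IMU point above a little more concrete, here is a toy Python sketch of how a device might blend a gyroscope's fast-but-drifting readings with an accelerometer's noisy-but-stable gravity reading to track pitch. This is an illustration of the general idea only, not how any shipping headset actually does its sensor fusion, and the function and parameter names are ours.

import math

def complementary_pitch(prev_pitch_deg, gyro_rate_dps, accel_x, accel_z,
                        dt=0.01, alpha=0.98):
    """Toy complementary filter: fuse a gyroscope's rate (smooth but drifts)
    with an accelerometer's gravity reading (noisy but drift-free) to track
    the headset's pitch. Real headsets fuse far more sensors, but the idea
    of blending fast and slow signals is the same."""
    gyro_estimate = prev_pitch_deg + gyro_rate_dps * dt
    accel_estimate = math.degrees(math.atan2(accel_x, accel_z))
    return alpha * gyro_estimate + (1 - alpha) * accel_estimate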

Each product that we'll see on the market will have various capabilities in each of these areas and trade-offs to be made.

Differences To Be Aware Of

We won't go into depth on each, since we're not writing this book to help the product designers who work inside Apple, Facebook, or others, but want to point out some differences that business strategists, planners, and product managers will need to be aware of.

For instance, let's look at the lenses. Generally, you'll have to know where you will attempt to use these devices. Will they be outside in bright sunlight? That's tough for most of the current products to do, so might limit your choices. Lumus, an optics company from Israel, showed us some of its lenses that would work outdoors. They said they are capable of about 10,000 nits (a unit of measure for how bright a monitor is), versus the 300 to 500 nits in many current AR headsets which only work in fairly low light.

The problem is that Lumus' solution, at least as it was presented to us in 2018, was expensive and fairly big. This is appropriate for a device the size of a HoloLens, but you won't see those in glasses anytime soon. The Lumus lenses were a lot sharper than others we had experienced before and since, and had a bigger field of view, but product designers haven't yet included those in products, in part because of their big size and higher cost.

At the other end of the scale were lenses from Kopin and DigiLens. DigiLens showed us glasses in early 2019 that were very affordable (devices would be about $500 retail) but had a tiny field of view―about 15 degrees instead of the 50 that Lumus showed us, which is about what the HoloLens 2 has, and not as much resolution either. So, reading a newspaper or doing advanced AR won't happen with the DigiLens products, at least with what we've seen up to early 2020.

So, let's talk about how these products get made. When we toured Microsoft Research back in 2004, we met Gary Starkweather. He was the inventor of the laser printer. That interview is still up on Microsoft's Channel 9 site at https://channel9.msdn.com/Blogs/TheChannel9Team/Kevin-Schofield-Tour-of-Microsoft-Research-Social-Software-Hardware.

In the preceding video, Starkweather talks about his research into the future of computing. Starkweather was looking for ways to build new kinds of monitors that would give us ubiquitous computing. In hindsight, this was the beginning of where the HoloLens came from. In a separate lab at Microsoft Research nearby, Andy Wilson was working on software to "see" hands and gestures we could make to show the computer what to do, among other research he and his team were doing.

He still is working on advancing the state-of-the-art human-machine interfaces at Microsoft Research today. Starkweather, on the other hand, died while we were writing this book, but his work was so important to the field that we wanted to recognize it here.

From Kinect to the HoloLens

This early research from the mid-2000s soon became useful as three-dimensional sensors started coming on the scene. In 2010, Microsoft announced the Kinect for Xbox 360. This was a three-dimensional sensor that attempted to bring new capabilities to its video game system. The system could now "see" users standing in front of it. Microsoft had licensed that technology from a small, then unknown Israeli start-up, PrimeSense.

Why was Kinect important? It soon became clear that Kinect wouldn't be a success in the video game world, but Microsoft's researchers saw another way to use that three-dimensional sensor technology. Shrink it down, put it on your face and aim it at the room, instead of players. That would let a wearable computer map the room out, adding new computing capabilities to the room itself.

That's exactly what Alex Kipman and his team did on the first HoloLens. That device had four cameras and one of those three-dimensional depth sensors that mapped out the room, then converted that map into a sheet of polygons (little triangles). You can still see that sheet of triangles once in a while when using the HoloLens, particularly when it needs to do calibration.
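
For readers who want a feel for how a depth sensor's output becomes that sheet of triangles, here is a deliberately simplified Python sketch. It back-projects a depth image into 3D points and connects neighboring pixels into triangles. This is a stand-in for the general technique, not Microsoft's actual pipeline, and the camera parameters are placeholders.

import numpy as np

def depth_grid_to_triangles(depth: np.ndarray, fx: float, fy: float,
                            cx: float, cy: float):
    """Back-project a depth image into 3D points and connect neighboring
    pixels into triangles -- a simplified stand-in for how a headset turns
    a depth sensor's output into a polygon 'sheet' laid over the room."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Pinhole back-projection: pixel + depth -> 3D point in camera space.
    X = (xs - cx) * depth / fx
    Y = (ys - cy) * depth / fy
    points = np.stack([X, Y, depth], axis=-1).reshape(-1, 3)

    triangles = []
    for y in range(h - 1):
        for x in range(w - 1):
            i = y * w + x          # top-left of a 2x2 pixel quad
            # Two triangles per quad of neighboring depth samples.
            triangles.append((i, i + 1, i + w))
            triangles.append((i + 1, i + w + 1, i + w))
    return points, triangles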

Photo credit: Microsoft. HoloLens 2 brings a much more comfortable headset with way better gesture recognition, along with a flip-up screen so workers don't need to remove it as often.

The sensors on the front of a Spatial Computing device, like a Microsoft HoloLens, build a virtual layer on top of the real world where virtual items can be placed. Software developers soon became adept at manipulating this virtual layer that many call the "AR Cloud." This database of trillions of virtual triangles laid on top of the real world provides the foundation for everything to come, from simple text to fully augmented or virtual characters running around.

When the HoloLens was introduced, it brought a new computing paradigm. It's useful to look at just some of the new things it did that weren't possible before:

  • Screens can now be virtualized: Before HoloLens, if you wanted a new computer screen you had to buy one from a retailer and hook it up. Physical screens are big, awkward, heavy, and expensive. With HoloLens, you just gesture with your hands or tell the system to give you a new screen. Then you can grab that virtual screen and place it anywhere in the room around you. There aren't any limits on how many screens you can have, either.
  • Surfaces can be changed: When using RoboRaid, one of the games we bought for our HoloLens, aliens could blow holes in our real walls and then crawl through them to attack. The effect was quite stunning and showed developers that you could "mix" reality in a whole new way, keeping the real world but replacing parts of it with virtual pieces.
  • Everything is three-dimensional (and not like those glasses at the movies): The first thing you figure out when you get a HoloLens is that you can drop holograms, which are three-dimensional items that move, nearly everywhere. We have a clown riding a bicycle around our feet as we write this paragraph, for instance. Now that might not sound that interesting, but you can view your business data in three dimensions too. Over at BadVR, a start-up in Santa Monica, California, Suzie Borders, the founder, showed us quite a few things, from shopping malls where you could see traffic data on the actual floor to factories where you could see data streaming from every machine. This new way of visualizing the world maps to the human brain and how it sees the real world, so it has deep implications for people who do complex jobs, such as surgeons, for instance.
  • Everything is remembered: Our little bicycle-riding hologram will still be here even a year later, still riding in the same spot; the monitors we placed around our office stick in the same place. Data we leave around stays where we left it. Microsoft calls these "Azure Spatial Anchors" and has given developers new capabilities regarding this new idea.
  • Voice first: The HoloLens has four microphones, arranged in a way to make it easy for it to hear your commands. You can talk to the HoloLens and ask it to do a variety of things and it'll do them and talk back. While you can do the same with Siri on an iPhone, this was built into the system at a deep level and feels more natural because it's on your head and always there.
  • Gesture- and hand-based systems: Ever wanted to punch your computer? Well, HoloLens almost understands that. Instead, you open your hand and a variety of new capabilities pop up out of thin air. You can also grab things with your hands, and poke at them now too, all thanks to the sensors on the front of the device that are watching your hands, and the AI chip that makes sense of it all.

Given this combination of technologies and capabilities, which are superior to those that the Magic Leap or any other current AR headset has, we believe that the HoloLens will historically be viewed as a hugely important product that showed the world Spatial Computing at a depth we hadn't seen before.

Technology Takes a Leap

All others would use the HoloLens as a measuring stick, which brings us to Magic Leap. Magic Leap gathered investors with a promise that it would make a big magic leap into this new kind of computing. In hindsight we don't think the investors knew how advanced Microsoft's efforts were. Its first round of $540 million came back in 2014, two years before HoloLens hit the market.

Photo credit: Magic Leap. The Magic Leap One is an attractive consumer-targeted product that has more graphical capabilities than HoloLens, but requires a tether to a processor pack, which HoloLens doesn't need.

The hype before Magic Leap shipped its first product, the ML1, was that it constituted a real graphics breakthrough, with optics that were way ahead of others in the market. That didn't come true, at least not in its first product, and as mentioned, Magic Leap's future as a continuing company as this book goes to press is an unknown.

As 2020 opens up, HoloLens is hitting some production problems of its own. Microsoft can't make enough devices to supply all the demand, and some customers are reporting problems with the images seen in its devices, which has led to speculation that it is having a tough time manufacturing the laser/micro-mirror system that it uses to get light into your eyes. Microsoft now claims that those problems are behind it, but the delays show how difficult getting these products to market is.

Photo credit: Nreal. Nreal won praise at CES 2020, with some press stating that it was the best product there. This headset is expected to be released this year for $499.

Both the HoloLens 2 and the Magic Leap ML1 feel like big elephants, or perhaps whales, since a whale jumping out of a gymnasium floor was how Magic Leap first became known, in a demo video that oversold its graphical capabilities.

This brings us to Nreal, and a lawsuit that Magic Leap filed against that company.

Patents and Problems

Nreal started when Chi Xu left Magic Leap in 2016, disappointed with the speed at which the well-funded company was moving. He joined forces in China with Bing Xiao, who became co-founder and chief optical engineer. The subtext is that China has long been seen as a place that plays fast and loose with intellectual property designed in the United States. Magic Leap promptly sued Nreal and accused Chi Xu of stealing AR secrets from his former employer. Magic Leap is alleged to have used patents it acquired from Meta in its complaint against Nreal, and Rony Abovitz tells us that intellectual property rights are important to Magic Leap, which has invested billions in its product development. This case will play itself out, but for now, Nreal is getting products on the market that are fairly capable and a lot lower cost than Magic Leap's―it was selling a $499 model like hotcakes at CES 2020. Having the Chinese supply chain at its back door will prove to be a big competitive advantage for Nreal, we think, and make it a company to watch.

This leads us to the elephant in the room: Apple. Apple has all the ingredients to make the meal already, from distribution to the brand to the supply chain. While there are a ton of rumors out there, it's good to note that no one really knows what will ship until Tim Cook calls the press in to see the new product. So far that hasn't happened, and even planned products can get delayed or radically changed. It's worth noting that the iPod, from conception to being on the shelf, took less than a year, which shows just how fast companies can move to make new products if motivated.

That said, Apple let details leak from an internal meeting where it detailed two new products. The first is a headset that would do both VR and AR and that will arrive in 2022. The second is a pair of lighter AR glasses, which will come in 2023. We expect when Apple does come to the market it will be with products that focus on what Apple does best: ecosystem integration and user experience.

We see that Apple plays a huge role in the industry and we believe that consumer adoption won't happen until it gets into the market.

Compare Apple with Nreal and others, and you see the problem. While glasses from Nreal have pretty good optics with bright images and wide field of view, the company doesn't yet do very much that will get consumers excited, and it isn't clear how it will excite developers enough until it sells millions of units. In Silicon Valley, this is seen as a "chicken and egg" problem―how will it sell enough without apps, and how will it get apps without selling enough?

Can any company make the consumer market happen in a huge way without Apple? It's hard to see how, but after Apple arrives, the market will be open to alternative approaches, the same way that after the iPhone grabbed the market's attention, Google was then able to grab most of the market share with its Android OS and phones that ran it.

That all said, we have seen inside R&D labs and know that there are many products under development and that by 2025 we should see massive numbers of people wearing Spatial Computing devices. That will open up many new business opportunities and that's got us excited. For instance, there will be a hunger for new kinds of content, so we'll see new kinds of cameras and other technologies to help satiate that hunger.

Volumetric and Light Field: Capturing Our World in a New Way

DARPA had a challenge: can you take a photo through a bush?

That led to Computer Vision breakthroughs at Stanford University and other places that are still changing how we look at capturing images today. That challenge, back in 2007, caused Stanford University researcher Marc Levoy to move away from using a single lens to take a photo and, instead, build a grid of cameras, all connected to a computer where software would gather rays of light coming through a bush and collect them piece by piece from all the different cameras, creating a light field. Then, a computer would re-assemble those pieces, sort of like putting together a puzzle.
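
A rough way to picture what that camera grid made possible is synthetic-aperture refocusing: shift each camera's image so a chosen depth plane lines up, then average. Things on that plane stay sharp while a nearby occluder, like the bush, smears away. The Python sketch below is our own toy illustration of the idea, not Levoy's actual code, and it assumes pre-rectified images and simple integer shifts.

import numpy as np

def synthetic_aperture_refocus(images, offsets, depth_scale):
    """Shift-and-average refocusing from a camera array (a rough sketch of
    the 'photograph through a bush' idea). `images` is a list of HxW arrays,
    `offsets` the (dx, dy) baseline of each camera relative to the center,
    and `depth_scale` picks which depth plane ends up in focus. Objects off
    that plane -- like branches near the array -- smear out and fade."""
    acc = np.zeros_like(images[0], dtype=np.float64)
    for img, (dx, dy) in zip(images, offsets):
        shift_x = int(round(dx * depth_scale))
        shift_y = int(round(dy * depth_scale))
        # Align this view so the chosen depth plane overlaps across cameras.
        acc += np.roll(np.roll(img, shift_y, axis=0), shift_x, axis=1)
    return acc / len(images)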

Shrinking Concepts, Expanding Horizons

Today, the light field concept has shrunk from using individual cameras arranged on a set of shelves to using pieces of image sensors, sometimes even putting microscopic lenses on top, and the field of Computer Vision has greatly expanded. Computer Vision is used in a variety of products, from a June Oven, which can tell whether you have put a piece of toast or a steak into your oven, to autonomous cars, where Computer Vision is used to do a variety of tasks.

The technique gathers data from a variety of sensors and fuses that into a "frame," which in Artificial Intelligence lingo is all the data from one specific point in time. A Tesla's frame, for instance, includes the data from its seven cameras, its half-dozen ultrasonic sensors, its radar sensor, and a variety of other IMUs that have motion, compass, and other data. Once a frame has been captured, then a variety of Computer Vision techniques, which are a subset of Artificial Intelligence, can be performed on that frame.
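
As a loose illustration of what a "frame" might look like in code, here is a small Python container that bundles sensor snapshots taken at one instant. The field names are hypothetical, not any carmaker's actual schema.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SensorFrame:
    """Illustrative container for one 'frame': everything the sensors saw
    at a single point in time, ready for Computer Vision to run over it.
    Field names here are hypothetical, not any vendor's actual schema."""
    timestamp_us: int
    camera_images: Dict[str, bytes] = field(default_factory=dict)   # e.g. "front_main"
    ultrasonic_ranges_m: List[float] = field(default_factory=list)
    radar_returns: List[dict] = field(default_factory=list)
    imu: Dict[str, float] = field(default_factory=dict)             # accel, gyro, heading

frame = SensorFrame(timestamp_us=1_592_000_000)
frame.imu = {"accel_x": 0.02, "gyro_z": -0.1, "heading_deg": 87.0}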

These new techniques have deep implications for entertainment, and, well, pretty much everything―since we'll soon see camera arrays in tons of products.

Photo credit: Robert Scoble. Part of Marc Levoy's photographic research lab at Stanford University in 2007. This research led to the development of new kinds of cameras that are still being improved today.

Already camera arrays are being used to capture dancers on "So You Think You Can Dance" and many other things, a technique that is particularly important to Spatial Computing's VR and AR. When Buzz Aldrin talks to you in a VR headset, he was captured on a camera array with a new technique called "volumetric" video, which means you can walk around Buzz's virtual image as he talks to you.

Shooting Some Volumetric

Before we go on, we should define some terms. Volumetric video does what it sounds like: it measures volumes. When creative people say, "I'm shooting some volumetric," that means they are capturing video inside a special studio that has one of these arrays of cameras. We visited one multimillion-dollar studio, owned by Intel, near Hollywood to get a look at how it's done. There, a sizable room has a dome surrounding a space that can fit half a basketball court. In the dome, as we looked up, we could see more than 100 cameras.

The walls of the dome were green so that computers could get rid of the dome and easily insert virtual imagery later in editing. Under the dome, Intel had more than 100 cameras connected to a data center where the videos from each camera were stitched together to make something that editors could use and then distribute to people who would see that scene in a VR headset. This three-dimensional dataset is actually pretty useful for regular movies and TV shows, too, because there are infinite camera positions surrounding the action. So, a director could choose a different camera angle to include in the movie she is directing. If that one doesn't work, she could choose another of the same scene. There's a whole industry of volumetric studios, including Metastage, which was backed by Microsoft and utilizes Microsoft technology for Volumetric Capture for VR and AR experiences.

Photo credit: Intel Studios. This Volumetric Capture dome has 100 4K cameras around the dome, all pointed inside. When joined together the cameras make a three-dimensional image set that you can walk around, or through.

This is different from 360-degree video. Where volumetric studios include hundreds of cameras aimed inward, toward a single point, 360-degree cameras start at a single point and aim outward. This is a useful technique to capture, say, video of yourself skiing down a hill, since when you get home you want to be able to see the mountain in all directions, including your family members skiing in front of and behind you as you glide down the mountain.

This brings us to light fields. Strict volumetric cameras don't capture the data about all the photons of light streaming off of the subjects that are being captured. Light fields go further, to actually record the angles that light is reflecting off of subjects and entering a camera array.

This, in theory, will let you capture scenes in a much more natural way, which looks amazing to your eye if you view it in a Spatial Computing device on your face (light field lenses are just starting to be shown on visors, too―CREAL got critical acclaim at CES 2020 for its device that shows light field data).

The real magic of light fields, though, will be that they will enable far more interactive scenes and much greater editing control over objects and scenes. Today, a 360-degree video, and even most volumetric data, is not interactive at all. You sit there and you watch, or with volumetric video, you walk around while watching. But soon, in scenes with light field capture and lenses, you might be able to walk around things and even touch them and pick them up (given enough programming to enable that). Light fields are the holy grail of capturing light, as they do so in such a detailed way that you can do a ton more with that data later.

Google's The Relightables system already uses some of these techniques to let content developers edit three-dimensional images after shooting. That system uses 331 custom color LED lights, along with an array of high-resolution cameras, and a set of custom depth sensors to create volumetric datasets with a taste of light field-style features. Thanks to capturing light in a unique way the computer can separate out the color and luminosity data from the physical form of the subject being videoed. That way a producer could mess with how an object looks later on a computer screen.

Photo credit: Google. Google's The Relightables system uses 331 custom color LED lights, an array of high-resolution cameras, and a set of custom depth sensors to create volumetric datasets where the lighting can be changed later.

Video professionals are pushing even further into this field of using arrays of cameras to do new things. Michael Mansouri, founder of Radiant Images, a firm that used to rent cameras to movie studios and now does R&D on cameras for those studios, showed us his latest camera arrays that are doing things Ansel Adams never dreamed of doing with a camera.

Camera Magic

Mansouri's cameras can capture thousands of frames per second, for instance, and are synced with extremely fast strobe lights―this super-fast capture lets him build massive datasets that contain all the data to build light fields or volumetric constructions. He arranges about 100 of these cameras around an actor to capture him or her in ways that just weren't possible a few years ago. His cameras capture every ray of light and the angle at which it entered the camera array, so computers later can "remix" the data.

He said that he is doing tons of tricks to capture more data than the human eye can see, too.

For instance, TV images run at 30 frames per second. He can push his cameras to capture many more frames than are needed to satisfy the human eye, and he oversamples those frames―the cameras he is using in one of his arrays can capture thousands of frames per second, which makes for great slow-motion captures, but he is using fast-frame captures to gather extra data that helps Artificial Intelligence do new things with video. One frame might be overexposed, another underexposed. One might be black and white, another with color saturation pushed high. Then, the Computer Vision systems that he and others are building can find new detail in the data to make images not only sharper and clearer but also add capabilities that are hard to explain in a book.
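
One of the tricks he described, bracketing exposures and merging them, can be sketched in a few lines of Python. This is a generic, simplified exposure merge, our own illustration of why oversampling frames recovers detail, not Radiant Images' actual processing.

import numpy as np

def merge_bracketed(frames, exposures):
    """A rough sketch of why over-capturing helps: several frames of the same
    instant at different exposure times are merged into one higher-dynamic-
    range image. Each pixel is averaged in linear light, weighted so that
    nearly black or blown-out samples contribute little."""
    acc = np.zeros_like(frames[0], dtype=np.float64)
    weights = np.zeros_like(frames[0], dtype=np.float64)
    for frame, t in zip(frames, exposures):
        f = frame.astype(np.float64) / 255.0
        w = 1.0 - np.abs(f - 0.5) * 2.0      # trust mid-tones the most
        acc += w * (f / t)                    # normalize by exposure time
        weights += w
    return acc / np.maximum(weights, 1e-6)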

The effect of all this light field work will be video that you could walk around in a very natural way. Imagine walking through a meadow in Yosemite at sunset. It won't be like a stage, where the facade is exposed as soon as you walk behind something and see that it's just a board painted to look nice from the front; here, you can walk around every plant and tree, and the deer 20 feet away looks real. You'll never find yourself "backstage."

Photo credit: Radiant Images. Radiant Images' Nebula Light Field and Volumetric system has dozens of high-frame-rate video cameras that it uses to both capture volumetric and light field data.

That kind of experience will arrive soon to our headsets. Michael says, "Why stop there?" He sees a world where you will walk onto a virtual football field and watch the Super Bowl from a new vantage point: that of the quarterback actually playing the game. His team actually shot a commercial for Oakley that gave a taste of how that would feel.

What comes after that might blow even him away: new datasets from the real world might let you walk off of that virtual football field and back into your seat, or even to places outside of the stadium, with lots of virtual holograms along the way. A decade from now, Hollywood could use these new datasets to bring us an entirely new form of entertainment. One that isn't delivered on film anymore, but, rather, a dataset we call an AR Cloud that streams to your headset as you walk around.

AR Clouds (3D Mapping) and Internet Infrastructure Including 5G

As we move into a world of glasses, we will want to move away from a world that requires so many apps. After all, do you want to load an app just to see where a product is when you get to a shopping mall? Or do you want to load another if you get to a ski resort just to see which runs are open? Or yet another to see the menu at a restaurant? No.

What strategists are planning is a new contextual system that brings you functionality where and when you need it, and that stretches onto the field of entertainment as we just discussed.

A New Kind of Map

When you walk into a shopping mall, you'll see all sorts of virtual helpers. When you go skiing, you'll see the names of the trails as you ski past. When you go to a grocery store, assistants will pop up, showing you more details on top of products, along with navigation to the products that are on your shopping list. When you drive, you'll see other things, like maps and controls for your car. When you exercise, you'll see your health apps contextually pop up. This contextual system will also let you experience new forms of entertainment from football to movies where you can follow an actor into, say, a burning building.

Now some of this can happen with simple location data, but in a true Spatial Computing world there will be virtual beings moving around, helping you out, plus you will want to play new kinds of games against other people, along with tons of augmented, or automated, things that either add data onto the world or keep track of the world to add new utility.

To get to this deeper augmentation, literally every centimeter in the world will need to be mapped in a new three-dimensional dataset that most people haven't yet considered. These maps barely exist today, mostly being built for autonomous vehicles or robots to use to navigate around streets. The Mapbox CEO, Eric Gundersen, laid out how this will all work. His firm already provides maps to millions of developers and if you use apps like Yelp, Snapchat, or Foursquare, you are already seeing maps provided by his firm.

He and others see a need for a new kind of map: one that doesn't just have a single line for a street, but that looks like the real world.

Mapping cars are building the skeleton for this new kind of map, which some of the industry has started calling an "AR Cloud." As vehicles roll by on the street, they are making high-resolution maps and ingesting, via AI, every sign and lots of other features. If you look at Apple's or Google's latest maps, you'll notice they are getting more three-dimensional features every day, and, in the case of Google, are able to use AR and Computer Vision to sense where you are, thanks to you turning on your camera, and show you navigation aids. This uses an early form of this kind of AR Cloud and gives us a taste of the world that will rapidly evolve over the next decade.

Some places, though, will need to be mapped out in a lot more detail. For instance, if you want to play games where monsters jump around your home, like you can with a HoloLens or Magic Leap, then those systems will need to build a three-dimensional map of your walls, ceiling, floor, and any objects in your home.

Building a Map

When the recently acquired 6D.ai used the term AR Cloud (it was the firm that popularized the term), that's what it meant. Until its acquisition, the firm shipped a Software Development Kit (SDK) that let developers build AR Cloud capabilities into their apps.

What did 6D.ai's code do? It asked users to turn on their camera and then, as users moved their phone around where they were, it instantly built a three-dimensional structure of what it saw. Moving around enough, it soon captured a digital twin of your living room, if that's where it was being used. We've used this software in parks, stores, our yards, and at shopping malls. In each space, it built a new three-dimensional map.

Now this map, which usually is kept hidden from users, is quite special. It is actually a sheet of polygons, those little triangles that are built by computers to simplify a three-dimensional world to a dataset that a small computer can keep track of.

As Matt Miesnieks, co-founder and CEO of 6D.ai, showed us the app, we noticed that it was building a virtual copy of the real world―all made from those polygons―and then all sorts of things could be done with that virtual copy. He showed us virtual things moving around that virtual room and new kinds of games where balls bounce off of these virtual walls. Keep in mind, this "copy" of the real world is laid on top of the real world, so the user thinks the balls are bouncing off of their real walls. They aren't; the code for the game is built to work with the AR Cloud, or digital copy, of the house that was just scanned.
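
To see why the scanned copy matters, here is a toy Python sketch of a ball "bouncing off" a real wall: the wall is treated as a plane recovered from the scanned mesh, and the ball's velocity is mirrored when they touch. This is our own simplified illustration, not 6D.ai's SDK.

import numpy as np

def bounce_off_wall(position, velocity, wall_point, wall_normal, radius=0.1):
    """Minimal sketch of how a virtual ball 'bounces off' a real wall: the
    wall is just a plane recovered from the scanned mesh, and the ball's
    velocity is reflected about that plane's normal when they touch."""
    n = wall_normal / np.linalg.norm(wall_normal)
    distance = np.dot(position - wall_point, n)
    if distance <= radius and np.dot(velocity, n) < 0:
        velocity = velocity - 2 * np.dot(velocity, n) * n   # mirror reflection
    return velocity

v = bounce_off_wall(np.array([0.0, 0.05, 0.0]),   # ball almost touching the wall
                    np.array([0.0, -1.0, 0.0]),   # moving toward it
                    wall_point=np.array([0.0, 0.0, 0.0]),
                    wall_normal=np.array([0.0, 1.0, 0.0]))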

Soon technology like that built by 6D.ai will map out almost everything as we walk around, or, at least, use maps that either a mapping car or someone previously walking around built for us. Often these new three-dimensional infrastructures won't be seen by users. After all, that's the case already. Do you really know how Google Maps works under the little blue dot that shows your phone on a street? Very few do.

We will share a bit about how Spatial Computing will work (autonomous cars, robots, and virtual beings will all need these new maps) because there aren't any standards today. Mapbox's data doesn't interoperate with the maps that Apple or Google are building, and we can list dozens of companies that are building three-dimensional maps for various reasons, from Tesla to Amazon, along with a raft of others that aren't well known by consumers, like Here Technologies or OpenStreetMap.

If you think about having a world where hundreds of millions, or even billions, of people are walking around with glasses and other devices that are using these maps, you start to realize something else: current LTE wireless technology, which delivers two to ten Mbps of bandwidth, just won't be able to keep up with the data demands of having to download and upload trillions of these polygons every time someone is moving around the world, not to mention all the other data that users will embed into this AR Cloud.
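
A quick back-of-envelope calculation shows the scale of the problem. Every number below is an assumption we picked purely for illustration, but the conclusion holds across a wide range of guesses.

# Back-of-envelope only; every number below is an assumption for illustration.
triangles_per_room   = 2_000_000          # a densely scanned living room
bytes_per_triangle   = 3 * 3 * 4          # three vertices x (x, y, z) floats
mesh_bytes           = triangles_per_room * bytes_per_triangle   # ~72 MB

lte_mbps, fast_5g_mbps = 10, 1_000
for label, mbps in [("LTE", lte_mbps), ("high-end 5G", fast_5g_mbps)]:
    seconds = (mesh_bytes * 8) / (mbps * 1_000_000)
    print(f"{label:>12}: ~{seconds:.0f} s to pull one room's raw mesh")
# Even before compression or streaming tricks, LTE is roughly a minute per
# room; multiply that by every block you walk down and the case for 5G is clear.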

Ready for 5G

Now you understand why so many of us, like telecom analyst Anshel Sag, are excited by 5G. He explains that 5G is a spectrum of new wireless technologies that, at the low end, will increase our bandwidth more than three times and at the high end, provided by millimeter-wave radio, will give us hundreds of times, maybe even thousands of times, more bandwidth.

Today's 4K videos don't need that kind of bandwidth, but when you have a team of people playing a new kind of virtual football on a field made up of high-resolution AR Cloud data, you can see where 5G will be needed.

5G promises not just a lot more bandwidth, which can support these new use cases, but much lower latency too: it should take around two milliseconds to get a packet from your device to a nearby cell tower. That's way less than current technology. That means your virtual football will be much more in "real time" and the ball won't hang in the air while it is waiting for a computer somewhere else to render it on the way into your hands from your friend, who might be playing from a few miles away, or even somewhere else in the world, thousands of miles away.

There's a third benefit to 5G, too, where many times more devices can use a single cell tower. When we go to baseball games with 40,000 other people, it's typical that we can't even make a call because the wireless systems used just weren't designed for thousands of devices to be used at the same time on one cell tower. 5G fixes that problem, and also means that cities and factory owners, not to mention companies like Fitbit or Tesla, with their millions of users and products in the real world, can have millions of sensors, all communicating with new clouds.

The problem is that there are some downsides to 5G. To get the highest speeds (some journalists have shown that they can get around two gigabits per second, which is faster than a fiber line to your home), you need to be fairly close to a cell tower that has the newer millimeter-wave antennas, and very few towers have these radios right now. This new high-spectrum radio wave also doesn't go through walls or other obstacles very well, so even if you live or work near one of these new towers, you will probably need to buy a repeater or mesh network for your home or business to bring that bandwidth indoors. At CES 2020, several manufacturers were showing off mesh networks that supported both 5G as well as the new Wi-Fi 6 standard, which brings a sizeable boost to Wi-Fi networks that get the new gear.

These downsides only apply to the high end of 5G, though. Most smartphone users won't be able to tell the difference anyway. Already on mobile phones, you can stream video just fine with only LTE, and any form of 5G will be just fine on a smartphone. It is when we move to new Spatial Computing glasses that you might notice the difference when you are connected to one of the high-end 5G networks, which provides more than a gigabit in bandwidth. Now, when you see the new iPhone, you'll know that it is really laying the groundwork for true Spatial Computing that will provide us with capabilities that are inaccessible on our mobile phones, such as interactive virtual football.

New Audio Capabilities Arrive

We are seeing an audio revolution unfold on two fronts: new "voice-first" technology that powers systems such as Siri or Amazon's Alexa, and new spatial audio capabilities with greatly improved microphones, processing, speakers, and the integration of these things into contextual systems. We are getting tastes of this revolution in Apple's very popular AirPods Pro and in Bose's new Frames, which have tiny speakers on the frame of a pair of eyeglasses that let you listen to new directional audio while also hearing the real world, because they aren't earbuds, like AirPods, that sit in your ears.

Audio Augmentation

Apple uses the processing capabilities in AirPods (there's more compute power inside these little $250 devices than was shipped on the iPhone 4) to do a new audio trick: passing through real-world sound in a "transparent" mode. That gives you a hint at the potential augmentation that could be done with new audio devices. Apple and others have patents that go further: using bone conduction from devices touching your face, they promise private communication that only the wearer can hear, along with even better audio quality.

Photo credit: Apple. Apple AirPods Pro earbuds not only do noise canceling but can mix audio from the real world and from digital sources together.

These revolutions in voice and audio are hitting at almost the same time and once we get them built into our Spatial Computing glasses, we'll see a new set of capabilities that business people will want to exploit.

These new capabilities have been brewing for some time―it's just now that computing has shrunk and microphones have gotten inexpensive, thanks to the smartphone and its billions in sales, that we are seeing these new capabilities coming to consumers.

It was back in 2005 that we first saw array microphones on desks inside Microsoft. Back then, putting four high-quality microphones, and the computer to join them together to do new kinds of sensing, into a box cost about $10,000. Today a $150 pair of Pioneer Rayz headphones has six microphones embedded into its earbuds and along the cord that drapes around your neck. Why use these array microphones? The Pioneers have the best noise cancelling we have seen, that's why, and having better, low-noise microphone arrays helps the computer audio response systems that are evolving, like Apple's Siri, work a lot better. After all, if Siri can't hear you, it can't answer you.
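
The basic trick behind joining microphones in software is delay-and-sum beamforming: delay each microphone's signal so sound from a chosen direction lines up, then average. Here is a minimal Python sketch of the idea; it is generic, not Pioneer's or Microsoft's implementation, and it assumes the per-microphone delays have already been computed for the look direction.

import numpy as np

def delay_and_sum(mic_signals, delays_samples):
    """Minimal delay-and-sum beamformer: shift each microphone's signal by
    the delay corresponding to a chosen look direction, then average. Sound
    arriving from that direction adds coherently; noise from elsewhere
    tends to cancel -- the basic trick behind array microphones."""
    out = np.zeros_like(mic_signals[0], dtype=np.float64)
    for sig, d in zip(mic_signals, delays_samples):
        out += np.roll(sig, -d)        # advance/retard to align the wavefront
    return out / len(mic_signals)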

Spatial Audio

The processing included inside the headphones can do much more, though. Early headphones had no idea where users were looking. Are they walking north, west, east, or south? The headphones had no idea. With Bose's new Frames, however, that no longer is true. Inside are sensors that tell the system which way the user is walking and the speaker system is designed to present audio from different places. So, if it wants you to go to the pizza place across the street it sounds like the voice is actually coming from that pizza place, which helps you get your bearings and makes new kinds of experiences, like virtual tour guides, possible.
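
Here's a toy Python illustration of the underlying idea: given the wearer's heading and the compass bearing of a sound source, compute simple left and right gains so the voice appears to come from that direction. Real spatial audio uses head-related transfer functions and more sensors; this constant-power pan, with names we made up, is only meant to show the principle.

import math

def stereo_gains_for_target(head_yaw_deg, target_bearing_deg):
    """Toy illustration of spatial audio: given which way the wearer is
    facing and the compass bearing of a sound source (say, the pizza place
    across the street), compute simple left/right gains so the voice seems
    to come from that direction. Real systems use HRTFs; this is only a
    constant-power pan."""
    relative = math.radians((target_bearing_deg - head_yaw_deg) % 360)
    # Map the angle to a pan value: straight ahead = center, right = +1, left = -1.
    pan = math.sin(relative)
    left = math.cos((pan + 1) * math.pi / 4)
    right = math.sin((pan + 1) * math.pi / 4)
    return left, right

print(stereo_gains_for_target(head_yaw_deg=0, target_bearing_deg=90))   # mostly right ear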

Photo credit: Bose. Bose Frames present spatial audio. These can put audio on specific things in the real world. Imagine walking by a building and it is talking to you.

Think about if a system needed to get your attention―hearing "free coffee over here" means a lot more if that sound comes from your left or right, no?

This leads us to predicting a new audio-based operating system. You might have different audio assistants located in different places around you. Your email assistant's voice might come from two o'clock. Your social network one? Ten o'clock.

These new audio capabilities are also arriving as new Artificial Intelligence and much smarter voice assistants are coming. Apple is rebuilding Siri from scratch, we hear. It isn't alone. Amazon has 10,000 people working on Alexa, trying to make it better than similar systems from Google, Facebook, Samsung, and Apple.

In just a few years, it seems like everything has got Amazon Alexa built in. There are microwave ovens, locks, lights, phones, and much more that now have microphones and can do stuff when you say, "Hey, Alexa."

To those who have experienced Alexa in their homes, this is probably the first taste of ubiquitous computing―computing that's always on, and always waiting for commands. How did this happen?

We see that these "voice-first" capabilities align deeply with Spatial Computing. Already our Tesla car is getting voice upgrades that let us ask it to take us places. Soon consumers will expect the same from all of our computing experiences.

One reason we are excited about Spatial Computing is that it will provide a visual layer to these voice-first systems. A major problem with, say, Amazon Alexa, is that you aren't sure what it actually can do. For instance, did you know that the Internet Archive has the entire Grateful Dead recorded history for you to listen to just by talking to Alexa or its competitor, Google Assistant? It's there, the Internet Archive even wrote an Alexa Skill for it, but figuring that out is nearly impossible unless you already know it's there.

Add in a visual component, though, and voice systems get a lot more powerful. Yes, you could look down at your phone for that visual component, but what if you are skiing and want to talk to Siri? "Hey Siri, what run should I take to get to the lodge?" isn't nearly as nice as if Siri could actually show you which way to turn on your glasses' screen, and asking it "Hey Siri, what can you do with Foursquare?" would lead to a much more satisfying experience if you had some visual displays to see instead of trying to listen to all the commands it could respond to.

Put these all together and soon we'll get a quite different consumer: one that can find out anything nearly instantly just by asking for help and one that will be fed information via their ears in ways that seem impossible today. Where does that matter? Well, gaming is already using spatial audio in VR to make new experiences possible.

Gaming and Social VR Experiences

Ryan Scoble has a girlfriend he has met in VR. They play in Rec Room, the "virtual social club," every afternoon. Bowling, Paintball, and a ton of custom games. He has never met her, and doesn't know where she really lives. He doesn't know her real name (her screen name is LOLBIT) but they play together as if she were a 10-year-old living next door.

He isn't alone in this.

A World Without Rules

VR has brought its magic to gaming and even if you aren't a gamer, it's useful to try a variety of games in VR because of what they can teach you about where the user experience is going. These games are increasingly social and collaborative because VR lets you feel the presence of another person, the same way you would if that person was in front of you in real life.

Photo credit: Robert Scoble. Mark Zuckerberg, Facebook founder and CEO, talks about video game success stories at the Oculus F8 Connect conference in 2019. Here he highlighted the success of games like Beat Saber and then announced a new social gaming platform, Horizon.

This is the lesson that Lucas Rizzotto learned. He's a game developer and has developed several VR experiences, including the popular "Where Thoughts Go," which lets you explore a virtual world populated by other people's thoughts. He says there's a generational shift underway between people who play VR games and those who don't. "Kids will talk with their parents," he says, pointing out that lots of weird conversations are soon going to happen due to this shift. "You won't understand, but I was an eggplant," he says a kid might soon try explaining to a real-world family member.

One of the magic things about VR is that you can be embodied in other forms, either animal, human, or vegetable. Anyone who is playing with Snapchat's AR face features, called filters, knows this, but in VR it goes to the next level. Chris Milk, founder of Within, a company that makes augmented and VR experiences and games, showed us how he could make people into a variety of things: birds, apes, or even fish. We watched him demonstrate this in his VR experience, "Life of Us," at Sundance in 2017 where people started flying within half a second of realizing they had wings instead of arms.

The next generation, he says, will grow up in a world where there are no rules. At least not physical ones. Want to jump off that building in VR? Go ahead, there aren't physical consequences like in the real world. Or, if you want to be an eggplant, you can be that too.

The Ripples of the VR Phenomenon

Some other shifts are underway, in large part because of gaming in VR:

  • Players expect more than just entertainment: They want to pay for something they can play through and come out of with new insights about themselves.
  • Content creation will go mainstream: Digital skills of the future will be artisanal skills. Already we are seeing a raft of new tools develop, everything from Tilt Brush, which lets people draw whatever they want in three-dimensional space, to modeling tools.
  • People will build their own realities and experiences: We are seeing this trend well underway in Minecraft and other games.
  • Crafting will be done with digital materials that don't necessarily exist in the real world: The new tools are letting people create anything their minds can come up with, everything from complex art, to entirely new cities. Already Rec Room has more than a million virtual places to play in, most created by the users themselves.
  • Gaming is now 360 degrees and much more interactive: This is much closer to experiential theater, like the influential "Sleep No More," an off-Broadway play in New York where audience members walk around a huge warehouse and actors perform their lines all around them. While enjoying the play, you can take in the complex theatre set, even picking up some props and discovering what they do.
  • We will be introduced to new, synthetic characters that are far more lifelike than before: Magic Leap has already shown a virtual being, Mica. Mica looks like a human, and she plays games with you, among other things, yet she's completely synthetic. Rizzotto says he's already seeing people making digital pets in VR.
  • Collaboration is much better and deeper in VR: You can already play basketball, paintball, and do a variety of other things together in VR, with more on the way, thanks to Facebook's announcement of Horizon, a new social VR experience.

It is the collaboration part that's most interesting, though, and has certainly caught Mark Zuckerberg's attention. Rec Room has attracted an audience, and Zuckerberg knows that. Horizon seemed squarely aimed at Rec Room, with very similar kinds of games and social experiences, albeit with a promise of better integration into your existing friend pool.

This leads us to a key point: social software is only really fun if you know the people you are playing with. If you are like Ryan Scoble, you can invest the time in a game and make some friends there, but most of us just want to get in and do something fun quickly. Up until now, there just weren't enough other people on VR to ensure that you had someone else to play with, and almost certainly none of them would be your real-life friend. That's starting to change now as more people get VR.

Building a VR Community

It has been tough-going building a social experience for VR, though. Many studio heads expected VR to sell a lot better than it did. It turns out that this is another chicken-and-egg waiting game: the social aspect really works best if your friends can play with you, but most of your friends don't yet have VR. Some early social VR systems sold for very little money, like AltspaceVR, which announced it was going out of business and was then purchased by Microsoft at what were rumored to be fire-sale prices. Others have struggled to keep their doors open and have had to shut down or sell off, like Sansar, which comes from Linden Lab, the developer of Second Life. Philip Rosedale was the founder of Linden Lab before leaving that company to start the VR-based company High Fidelity. Rosedale told us it is hard to get the critical mass to make social experiences work when only a few million headsets have sold.
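To make that critical-mass point concrete, here is a back-of-the-envelope sketch in Python (ours, with purely hypothetical adoption rates and friend counts, not figures from Rosedale): the chance that even one of your real-life friends can join you in a social VR app falls off sharply as overall adoption drops.

# Hypothetical illustration of the social VR critical-mass problem.
# P(at least one friend owns a headset) = 1 - (1 - adoption_rate) ** num_friends

def chance_a_friend_has_vr(adoption_rate: float, num_friends: int) -> float:
    """Probability that at least one of your friends owns a VR headset."""
    return 1.0 - (1.0 - adoption_rate) ** num_friends

if __name__ == "__main__":
    for adoption_rate in (0.01, 0.05, 0.25):     # from "a few million headsets" to mainstream
        for num_friends in (5, 20):
            p = chance_a_friend_has_vr(adoption_rate, num_friends)
            print(f"adoption {adoption_rate:4.0%}, {num_friends:2} close friends "
                  f"-> {p:.0%} chance one of them can join you")

At a one-percent adoption rate, even someone with 20 close friends has only around an 18-percent chance that any of them owns a headset, which is why early social VR apps mostly pair you with strangers.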

Even Rec Room acknowledged that the lack of VR users is a problem, and it now has apps on iOS and Android so that mobile players can join some of the fun. This is a smart strategy: it lets mobile players in on the fun, while also letting them see that they are second-class citizens who aren't playing in an immersive, hands-on-the-ball kind of way. This brings us to a major benefit of playing in VR―you are forced to move around, which leads to greater fitness. In some games, like The Climb, you have to move your hands like you are climbing in the real world. We've found ourselves sweating after a few rounds of games like that, and in Rec Room, with its basketball, paintball, bowling, and other movement-activating games, you see this too. As we went to press, Chris Milk's company, Within, released "Supernatural," an exercise app complete with volumetrically captured coaches and a subscription plan that works like a gym membership.

Fitness uses of VR have even spread to gyms. A Boise, Idaho-based start-up, Black Box VR, has brought a new kind of weight training machine to its gyms. Founder Ryan DeLuca showed us how it works: you put on a VR headset and grab two handles at the ends of cables coming out of the machine. Then you select which exercise you want, and it walks you through that inside the headset. At the start of each round of exercises, it starts up a game. Keep up your exercise and the game gets more fun and intense, and you gather more points. It was so much fun that we didn't feel like we were working out at the time; only afterward did we realize we were sore. Black Box VR has since opened other locations, including one in San Francisco.

Denny Unger, CEO and creative director of Cloudhead Games, whose VR game The Gallery won game of the year at VR Awards 2018, says this fitness effect surprised even him. Its latest game, Pistol Whip, has been used by people to get into shape and lose weight, and when Unger and his team started designing it they never thought it would lead to better health as a benefit, he told us. Pistol Whip is a rhythm game, which means you shoot and interact with things to the rhythm of music. This form has proven very popular with VR gamers. Unger even admits to being influenced here by Beat Saber, along with a few other games that you play to music. In Beat Saber, little boxes, and other things like bombs, fly at you to the beat of various music tracks that you choose and you have to slice them with a light saber to keep the game going and the music pumping. Both games are lots of fun and get you a good cardio workout as you get to higher and higher levels.

Great Expectations

Unger says even better games are in the offing. He is very excited by Valve's Half-Life: Alyx, which came out in March 2020. He says this game is a signal that the top game studios are finally building amazing experiences solely for VR. While the announcement of Half-Life: Alyx caused a big controversy amongst gamers, many of whom didn't like being forced to buy a VR headset to enjoy the game, Unger sees it as a forcing function that will bring many more people into the VR world, which will benefit all the studios.

Unger has a dream, however, of a decade from now, when you are walking around with glasses that do both VR and AR, or a newer form of the two mixed together. He calls his dream "replacive environments."

What he means is that in the future you'll be walking around the streets and some bars will be decked out with virtual worlds that replace every surface. "It would be like walking into a bar that is set up like a medieval pub," he said. In such a world he sees a lot of controversy, maybe even regulation, because it would, he says, "radically change the economy of our planet." What he means is that in such a world everything will be virtual, creating less of a need for physical things to make the places we go to more interesting and more of a need for virtual worlds, which will drag with them all sorts of new transaction types. Even ordering a beer in one of these places will be quite different. Your server might be a synthetic being that comes over to take your order, and a robot delivery system will bring you your beer. Virtual beings, like Mica, the one Magic Leap showed off at the Game Developers Conference in 2019, have him thinking of new interaction types. "Get down to the primal instinctual experience," he says, while explaining that what he wants to build next is a VR game that feels like you are inside a movie.

The thing is, to get the full experience of what he's thinking about, you might need to head to your local shopping mall, where entrepreneurs are opening location-based entertainment (LBE) centers. Here you get access to the best headsets, along with haptic guns that shake your hands when you shoot them, and other devices that bring your VR experience to a whole new level.

Location-based VR

There are places in shopping malls and other big venues that offer VR experiences that are impossible to have at home. The idea of LBE VR is not new―there were several exploratory versions created in the 1990s, but they were not successful due to the limitations of the technology available at that time. The latest iteration of LBEs started as the brainchild of Ken Bretschneider, one of the original co-founders of a company called The Void, which was started in 2014.

The Void

Ken's idea behind The Void was to enable several people, initially up to six, to play and be immersed in a VR experience together. The technology needed to do this was formidable, and in many cases proprietary technology had to be developed, including new batteries for the backpack computers connected to the VR headsets and a novel wireless system, among other items.

We have talked with Ken several times over the years since 2017, and he was one of the first people in the Spatial Computing industry to talk about using AI characters in VR LBE experiences, something that is currently still in the formative stages.

Photo credit: The Void. Players getting suited up to play the Star Wars: Secrets of the Empire experience.

The first public experience that was marketed at The Void was a Ghostbusters-themed VR experience called Ghostbusters: Dimension, which opened at Madame Tussauds in Times Square in New York City in 2016. The VR experience was created as a tie-in to the new Ghostbusters film. The cost for the experience was $55 a person. The Void eventually came under greater control by Disney, which had been investing in the company, and several other VR experiences have been developed using Disney IP. There are currently 17 The Void locations in the world, with pricing for experiences ranging from $34.95 to $39.95 per person.

A total of six experiences are playing in different locations. Ken moved on to focus on building his immersive park, Evermore, as well as The Grid, an entertainment destination that includes proprietary VR technology, an immersive indoor karting race track, and a dining hall. The Grid's first location is in Pleasant Grove, Utah, very close to Evermore, with further locations planned in 2020.

Dreamscape Immersive

Another LBE company featuring VR experiences is Dreamscape Immersive, which was started in 2016 by Kevin Wall, an Emmy-award-winning producer; Walter Parkes, who helped create DreamWorks; and Bruce Vaughn, who was Chief Creative Executive at Walt Disney Imagineering.

Their first location is at the Westfield Century City mall in Los Angeles, along with another location in Dallas, with new locations opening soon in Dubai and Ohio. Their business model differs from The Void's in that they are actively seeking franchise opportunities. The cost for their experiences is $20 per person, and they currently offer four different experiences, one of which uses intellectual property from DreamWorks.

Photo credit: Dreamscape Immersive. Dreamscape's Westfield Century City mall location in Los Angeles.

One Dreamscape Immersive experience that we were particularly impressed with was Curse of the Lost Pearl: A Magic Projector Adventure due to its high-quality visuals and smart use of interactivity. For example, we were actively pulling a lever and reaching out to hold a "torch" all while in our VR headsets.

We talked with Bruce Vaughn, the CEO of Dreamscape Immersive, recently. He told us that the vision for Dreamscape is for it to produce originals that would spawn their own valuable IP. In his eyes, Dreamscape is an opportunity to build the next kind of entertainment studio―one that produces interactive experiences that most notably celebrate narrative storytelling.

Sandbox VR

Sandbox VR is another LBE company that shows promise. Sandbox raised $68 million in its Series A funding round, led by Andreessen Horowitz, plus an additional amount later in 2019, bringing its reported total for that year to $83 million. The cost per experience per person ranges from $35 to $40. A point of differentiation from other LBEs is the added interactivity made possible by the motion-capture technology worn by players.

Photo credit: SandBox VR

Sandbox currently has five experiences playing in two locations, Los Angeles and Cerritos, California. A total of 16 locations is planned by the end of 2020, in new cities including New York, Austin, San Diego, and Chicago.

There are several other smaller LBE companies all over the world, mainly catering to the countries and regions where they are located. A major difficulty of the LBE business model is the high cost of creating a VR experience, which can range from $10 million to highs in the region of $500 million, depending on the sophistication and level of interactivity built into the experience. Another major hurdle is that revenues are very much tied to the throughput of people who purchase the experiences―that is, the shorter the experience, the more people can potentially move through it. Because of this, LBE VR experiences are generally short, ranging from 10 to 15 minutes in length, when longer experiences would be better at attracting people. However, longer experiences would have to be priced quite exorbitantly for businesses to recoup their costs.
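To see why run times cluster around 10 to 15 minutes, consider a back-of-the-envelope sketch (our own illustration, with hypothetical ticket prices, station counts, and reset times, not figures from any operator): at a fixed ticket price, shortening the experience multiplies how many paying guests can cycle through per hour.

# Hypothetical LBE throughput sketch; all numbers are illustrative assumptions.
# Shorter experiences let more paying guests cycle through the same stations per hour.

def revenue_per_hour(ticket_price: float, stations: int, minutes_per_run: float,
                     turnaround_minutes: float = 5.0) -> float:
    """Hourly revenue = stations * cycles per hour * ticket price."""
    cycle_minutes = minutes_per_run + turnaround_minutes   # play time plus reset time
    cycles_per_hour = 60.0 / cycle_minutes
    return stations * cycles_per_hour * ticket_price

if __name__ == "__main__":
    for minutes in (12, 30, 60):                           # short versus long experiences
        hourly = revenue_per_hour(ticket_price=35.0, stations=6, minutes_per_run=minutes)
        print(f"{minutes:2}-minute experience -> ~${hourly:,.0f} per hour at $35 a head")

Under these assumed numbers, a 12-minute experience grosses nearly four times as much per hour as a 60-minute one, which is the arithmetic pushing operators toward short runs or much higher ticket prices.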

Similar business issues befall the cinematic and interactive VR ecosystem, along with others, including limited opportunities to view these experiences and their general distribution.

Cinematic and Interactive Experiences

Outside of gaming, social VR, and VR LBEs, there are Spatial Computing experiences currently made mostly for the festival circuit.

Festivals That Include VR

The reason these experiences are shown this way is that digital distribution of Spatial Computing entertainment experiences is still at a very early, unsophisticated stage. We won't focus on distribution issues here, as that would take us too far from the focus of this section.

Cinematic VR experiences are actually 360-degree experiences―that is, they are not truly VR. However, on the festival circuit, the term Cinematic VR has taken off, so we will briefly address it, along with those non-gaming experiences that are interactive and hence truly VR. Additionally, there has recently been a very small but slowly growing body of AR experiences that are narrative in nature, shown at festivals.

Photo credit: SXSW. People at SXSW 2019 enjoying Cinematic VR in Positron Voyager chairs.

Major festivals where Cinematic VR, Interactive VR, and AR experiences are shown include the Sundance Film Festival in Park City, Utah; SXSW in Austin, Texas; the Tribeca Film Festival in New York City; the Cannes Film Festival; the Venice Film Festival; and the Vancouver International Film Festival. The Toronto International Film Festival did show Cinematic VR for a few years but discontinued it after a change in leadership.

There has been much talk about how Spatial Computing festival experiences have been overly experimental and avant-garde and not made with the public in mind. We tend to agree with this characterization of the state of affairs. One of the major issues that impacts what kind of content gets made is how much money is made available to produce it. From 2015 until 2017, many VR headset companies, including Facebook, Samsung, and HTC, provided content funds to jump-start the industry. However, ever since 2018, much of this money has dried up.

Outside of self-financing narrative content for Spatial Computing environments (VCs and other kinds of investors do not generally have content funds), it is very difficult to have the resources necessary to make these kinds of experiences. As a result, only those that are extremely driven and also have access to funds are able to create them. The kinds of narratives that get developed, as a result, are not necessarily those that could actually be marketed for revenue. This brings us back to the original problem: that there is no real place to market Spatial Computing narrative experiences for profit.

Marketing the Unmarketable

In 2018, Eliza McNitt and her team received about $1.5 million for Spheres, a three-part episodic Interactive VR production that won the Grand Prix award in the Venice Film Festival VR section. The deal was struck at the 2018 Sundance Film Festival with a company called CityLights buying it.

Photo credit: Cannes Film Festival. Marketing for Cannes XR 2019. Cinematic VR experiences get the lion's share (over 90 percent) of the showing spots at Cannes XR, over Interactive VR and AR, even though the photo shows a person in a VR headset using controllers to interact with an experience.

This is one of very few deals that have happened so far for Spatial Computing narrative experiences. However, even after this, the company that bought Spheres did not know how to distribute it, effectively burying it.

Photo credit: Tribeca Film Festival. Marketing for the 2019 Tribeca Virtual Arcade.

Even when it comes to showing Spatial Computing narrative experiences at festivals, there is very little guarantee that ticket-buyers will have the opportunity to actually experience them. Although there were several Interactive VR experiences featured at the 2019 Tribeca Film Festival Virtual Arcade, most people who bought tickets were not able to view them due to the paucity of available viewing slots. This is the norm rather than a special case when it comes to how festivals feature Cinematic VR, Interactive VR, and AR narrative experiences.

Every year, there are a few standout experiences. Two that were featured on the 2019 festival circuit are Gloomy Eyes, an animated VR experience created by Jorge Tereso and Fernando Maldonado, and Everest – The VR Film Experience, a 360-degree documentary directed by Jonathan Griffith. Both were featured in the 2019 Vancouver International Film Festival's VIFF Immersed program (Gloomy Eyes was also shown at Sundance), and both won an award at VIFF in their respective areas.

Photo credit: Sundance Film Festival. Gloomy Eyes was one of the Cinematic VR experiences shown at Sundance 2019 and won an award at VIFF Immersed 2019.

We had the opportunity to interview Brian Seth Hurst of StoryTech Immersive, the organizer of VIFF Immersed from 2017 to 2019. The major takeaway from that interview was that most Cinematic VR, Interactive VR, and AR narrative experiences are made with the expectation that only a handful of people will ever get to view them, making the idea of broad appeal a very alien expectation.

Photo credit: Vancouver International Film Festival (VIFF). At VIFF Immersed 2019, the Cinematic VR experience Everest - The VR Film Experience won "Best in Documentary."

One Cinematic VR experience that actually approaches broad appeal is Hurst's 2017 piece, My Brother's Keeper, which tells the story of two brothers fighting on opposite sides of the American Civil War and was backed by PBS. But there have been very few in this category as of yet.

Photo credit: PBS. 2017's My Brother's Keeper incorporated several novel camera techniques using 360-degree cameras. The directors and directors of photography were Alex Meader and Connor Hair, with Brian Seth Hurst producing.

We have been impressed in general by what the immersive artist and producer Chris Milk, of the company Within, has created for festivals; as of late, his focus has moved from Interactive VR to AR narrative experiences, with one of them being Wonderscope, an AR iOS app for kids.

Engineering Experiences

Three studios that have produced some really good Interactive VR experiences are Fable Studio, Baobab Studios, and Penrose Studios. Former Oculus Story Studio people, including Edward Saatchi, founded Fable Studio in 2018. Fable went on to create Wolves in the Walls, a three-episode Interactive VR adaptation of the Neil Gaiman and Dave McKean story. In 2019, it was awarded a Primetime Emmy for outstanding innovation in interactive media. It is one of relatively few VR narrative experiences available for download from the Oculus store, and it is free, as are almost all the VR narrative experiences available online.

Photo credit: Fable Studio

In 2019, Fable announced that its focus would be on virtual beings using natural language processing and other AI technology. Whispers in the Night, which debuted at 2019's Sundance Film Festival, featured Lucy, the same character as in Wolves in the Walls. It is an Interactive VR animated experience in which people can talk directly with Lucy and, together with her, discover what's truly hiding inside the walls of her house.

Another studio that is pushing boundaries is Baobab Studios, founded in 2015 by Eric Darnell, an acclaimed former DreamWorks director and writer, and Maureen Fan, a former VP of games at Zynga. In 2019, Baobab won two Daytime Emmy awards for its latest Interactive VR animated experience, Crow: The Legend. It is also available on the Oculus store, along with an earlier VR animated experience, INVASION!; both are free.

In Crow: The Legend, which is based on a Native American folk tale, animal characters are made human-like―they can talk to each other and have human emotions. The interaction available to the person experiencing Crow is limited, but very well done. Using VR controllers, snow can be "sprinkled" into a scene, turning it into a winter environment. Flowers, similarly, can be placed, creating a spring scene. As we travel through the virtual space, we can "conduct" a symphony while gliding through an asteroid shower, using VR controllers to elicit different musical sounds and tempos.

We talked with Jonathan Flesher when he was still Baobab's Head of Business Development and Partnerships in 2017. The company's stance at that time on how much interactivity a VR animated experience should have was a conservative one. The idea was to pepper the experience with a few choice opportunities for interaction, but leave most of it non-interactive. The reasons for this include the high expense of building interactivity into an experience, along with the notion that interactivity is very potent. Crow: The Legend embodies this stance in a very high-quality way.

Photo credit: Baobab Studios

Another studio that takes a conservative, quality-focused approach to interactivity is Penrose Studios. Penrose was started in 2015 by Eugene Chung, former Head of Film and Media at Facebook. Chung and his team are highly experimental and have made several VR animated experiences that did not go further than being tests. Penrose's latest VR animated experience, Arden's Wake: Tide's Fall, released in 2018, is a continuation of Arden's Wake, which won the first Lion for Best VR awarded at the 2017 Venice Film Festival. Arden's Wake and Arden's Wake: Tide's Fall are not available to download online, but two earlier VR animated experiences, both from 2016, Allumette and The Rose and I, are available on the Oculus store for free.

Photo credit: Penrose Studios

Arden's Wake: Tide's Fall continues the story of Meena, a young woman who lives in a lighthouse in a post-apocalyptic world. Much of what used to be above ground is now underwater. Meena tries, unsuccessfully, to save her father after he falls into the water.

Even though Arden's Wake: Tide's Fall is not interactive, its visuals are stunning. Penrose is currently working on some experimental VR experiences that are interactive in unexpected ways.

The future for both VR and AR narrative experiences is one where more interactivity is used in a much more natural way, with more effects, such as haptics and smell, being used artfully. Haptics is any form of communication involving touch, and it is the most difficult of the relevant Spatial Computing technologies to incorporate.

Photo credit: Venice Film Festival. An interactive VR experience that featured smell effects showcased at Venice in 2019.

The current use of smell is minuscule, and when it is used, it tends to come off as gimmicky. Another entertainment form where haptics and smell would be useful is VR LBEs, since they would add more realism there.

Content Production and Previsualization

Over a decade ago, while working on Avatar, the filmmaker James Cameron created a technique where actors wear motion-capture suits while being filmed inside digital backgrounds in real time. More recently, in films such as Ready Player One and Solo: A Star Wars Story, among others, filmmakers have started using VR assets and headsets to plan shots on a virtual world set. Cameron is also using VR for the previsualization of his Avatar sequels.

Another filmmaker, Jon Favreau, has been especially enthusiastic about the use of VR for content production and the previsualization of scenes. His most recent film where he utilized VR in this way is 2019's The Lion King, and he is actively using his technique, called "V-cam," for episodes of The Mandalorian, a Star Wars series for the Disney+ streaming service.

Favreau also used a more nascent version of his technique on 2016's The Jungle Book. Additionally, he partnered up with start-up Wevr to make Gnomes and Goblins, an interactive VR fantasy world.

The V-cam technology helps with fast turnarounds on schedule and on budget, allowing VFX supervisors to create the look of scenes in real time and providing major benefits to the actors and the crew, effectively, according to Favreau, turning the film's production process into a "multiplayer filmmaking game."

Except for a single photographed shot, The Lion King was filmed entirely in VR. In order to work out scenes, Favreau and others would put on their VR headsets to plan exactly where their cameras and lights should go, using handheld controllers to move the virtual equipment and compose the scenes with the decided-upon setups.
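As a rough sketch of the core idea behind a virtual-camera tool like this (our simplification, not Favreau's actual production pipeline; the class and function names below are hypothetical), the tracked pose of a handheld controller simply drives the transform of a virtual camera in the scene each frame, so moving your hand reframes the shot in real time.

# Simplified "virtual camera" idea; an illustration only, not real production tooling.
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple      # (x, y, z) in meters, in the virtual set's coordinate space
    rotation: tuple      # (yaw, pitch, roll) in degrees

@dataclass
class VirtualCamera:
    pose: Pose
    focal_length_mm: float = 35.0

def update_virtual_camera(camera: VirtualCamera, controller_pose: Pose,
                          scale: float = 1.0) -> None:
    """Copy the tracked controller pose onto the virtual camera each frame.
    A scale above 1.0 turns a small hand move into a large camera move (a crane shot)."""
    x, y, z = controller_pose.position
    camera.pose = Pose(position=(x * scale, y * scale, z * scale),
                       rotation=controller_pose.rotation)

if __name__ == "__main__":
    cam = VirtualCamera(pose=Pose(position=(0.0, 0.0, 0.0), rotation=(0.0, 0.0, 0.0)))
    tracked = Pose(position=(0.2, 1.5, -0.4), rotation=(15.0, -5.0, 0.0))  # pose from the tracker
    update_virtual_camera(cam, tracked, scale=4.0)
    print(cam)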

Regarding the use of VR for previsualization, Favreau has also said, "We'll probably have to come up with some sort of new language... It's nice to be able to turn to these new technologies that could otherwise be a threat and use them to reinvent and innovate."

The future of the creation of movies and episodes points to actors in VR headsets doing their scenes inside a virtual setting, captured by virtual cameras controlled by a VR headset-wearing crew.

The Merging of the Virtual and the Real

What we've been showing you is how a new metaverse will appear. An entire set of virtual worlds, or layers, that you can experience with Spatial Computing glasses will soon be here. As you look around with these Spatial Computing glasses, you will see the real world and these virtual layers "mixed" with each other in very imaginative ways, and we just presented the pieces that are currently under development that will make this happen. Often the changes will be small, maybe a virtual sign on top of food, but other times they might be huge, like presenting a complete virtual world you can move around in and interact with. Over the next few years, we'll see an acceleration toward these new virtualized worlds as the devices that we wear get far more capable and lighter and new experiences are created for them.

For our next chapter, we will delve into the benefits that Spatial Computing brings to manufacturing. The use cases there have a much more practical nature than what we discussed in this chapter, though Spatial Computing's abilities will similarly change how we work.
