2

A Tilted Playing Field

The best minds of my generation are thinking about how to make people click ads. That sucks.

—Jeff Hammerbacher

Outside of The Dalles, Oregon, in a nondescript industrial park along a barren stretch of the Columbia River, sits a $1.2 billion complex of hangar-like buildings. It was here, in 2006, that Google built its first mega data center. Google’s mammoth computer warehouses squat alongside other industrial facilities: grain elevators, a derelict aluminum smelter, a plant that turns potatoes into French fries. Megawatts of power pulse into the compound through high-voltage lines, and on winter days steam rises from four-story-high silver cooling towers.

Google now has fifteen mega server farms like this worldwide, along with numerous smaller facilities. Microsoft, Facebook, Apple, and Amazon have similar server farms too, including several just up the Columbia River. Between 2003 and 2013, Google spent $59.6 billion just on research, development, facilities, and equipment—three times (in constant dollars) what the U.S. spent to build the atom bomb.1 When Google’s site first went live, it was hosted on just two computers on the Stanford University campus. But by 2009, with its new warehouse-scale computers online, a typical Google search touched thousands of computers, with the results returned in a fifth of a second.

Google’s data centers represent a jaw-droppingly massive investment, larger than the GDPs of more than one hundred individual countries. Yet if folk theories of the internet are correct, Google’s data factory in The Dalles should not exist.

We have been told—again and again—that the internet is a “post-industrial” technology. Online there is supposedly no need to spend millions on broadcast towers or printing presses or similar capital equipment.2 With the industrial economics that homogenized print and broadcast media gone—so the story goes—barriers to entry fall and audiences radically spread out.

Google’s data center at The Dalles stands in contradiction to this fable. Google’s facility is exactly what it looks like: an industrial mill, a digital smelter refining not ore but information. Google’s data factories are just as critical for it today as broadcast equipment was for NBC in an earlier era. The persistence of smokestack economies in the digital age should give us pause. If we have gotten something this basic wrong, what else have we missed?

This chapter, and the two that follow, aim to show how big sites got so big. A central concern is stickiness—the factors that allow sites and apps to attract and keep an audience (see chapter 1). Critically, many tactics that promote stickiness get cheaper per user as sites get bigger. The internet thus provides economies of scale in stickiness. Bigger, more popular sites and platforms find it easier to attract still more visitors, and to build up habits of readership. The economies of scale that shape countless traditional industries, from airlines to automakers, remain powerful in the digital economy. Understanding digital audiences starts with digital economies of scale.

This chapter does not attempt to provide an exhaustive list of every online economy of scale—that would require a much longer chapter. Rather, the goal is to highlight some of the most powerful and best-documented forces that skew the game toward the largest players. Size advantages alone are not the full story, as we shall see. But with so many strong economies of scale, of so many different types, in so many different areas of digital media, it is time to stop pretending that the internet is a level playing field.

NETWORK EFFECTS

In the early 1900s America was a hodgepodge of competing, incompatible telephone networks. The expiration of the Bell telephone patents had led to an explosion of telephone companies and cooperatives. Telephones had become cheaper, and telephone service was increasingly available even for those outside a city. In many places, though, multiple subscriptions were required to reach everyone with a phone. Those on non-AT&T networks also could not make long-distance calls.

Under the leadership of Theodore Vail, AT&T embarked on an effort in 1907 to consolidate the telephone system under its own banner. AT&T argued that users would be better off with a single integrated phone network. As Vail put it in the AT&T 1908 annual report,

a telephone—without a connection at the other end of the line—is not even a toy or a scientific instrument. It is one of the most useless things in the world. Its value depends on the connection with the other telephone—and increases with the number of connections.3

In a wave of advertising AT&T declared its commitment to “one system, one policy, universal service.” The campaign was a success, and it turned AT&T from one of America’s most reviled companies into a well-liked, government-regulated monopolist.

The telephone system has thus become the canonical example of network effects or (more formally) positive network externalities. Such effects arise when the value of a good or service depends strongly on whether others use it too. Network effects are also referred to as “demand-side economies of scale.” Even if per-customer costs stay steady, the product becomes more valuable as more people join the network.

There is increasing acknowledgment that internet services can follow the same pattern as the telephone system, especially for sites that depend on communication between users.4 Facebook and Twitter, for instance, are useful only if other people use them. Network effects make it difficult to compete with established players. Many other microblogging sites have tried to compete with Twitter, but none has been able to reach critical mass.5

Acknowledgment of network effects is a welcome change from the rigid (though still common) belief that the internet is a leveling force. Unfortunately, talk about network effects has come with two common misunderstandings.

First, “network effects” is often, and inaccurately, used as a synonym for all economies of scale.6 Not every size advantage is a network effect. A social network with no users is useless, while a search engine or an online app like Google Docs might still be valuable even before becoming popular. Confusing different sorts of scale economies leads to misunderstandings and ultimately bad policy.

Second, there is the persistent misuse of “Metcalfe’s Law,” a rule of thumb—and definitely not a real law—named after Ethernet inventor Robert Metcalfe. As popularly understood, Metcalfe’s Law claims that the value of a network increases with the square of the number of connected users. A network that connected a hundred users, for example, would be a hundred times more valuable than a network that connected ten.
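Written out as a formula (the rule of thumb itself, spelled out for clarity rather than endorsed), the quadratic claim comes from counting the possible pairwise connections among n users:

```latex
V(n) \;\propto\; \binom{n}{2} \;=\; \frac{n(n-1)}{2} \;\approx\; \frac{n^{2}}{2},
\qquad\text{so}\qquad
\frac{V(100)}{V(10)} \;\approx\; \left(\frac{100}{10}\right)^{2} \;=\; 100 .
```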

The endorsement of Metcalfe’s Law has been repeated at the highest levels of public policy. Reed Hundt, former chairman of the Federal Communications Commission, declared that Metcalfe’s Law, along with Moore’s Law, “give us the best foundation for understanding the Internet.”7 Despite this, there has never been evidence for Metcalfe’s Law in large real-world networks, and it has been regularly debunked for two decades.8 Metcalfe’s assumption that all ties are equally valuable is not true in a social context. People care more about talking to their family and friends and business associates than they do about talking to distant strangers.

But even if network effects have been overstated, sites and apps with a strong social component show remarkable stickiness. As of this writing, Facebook is the most popular site on the web, and it owns the two most popular apps (Facebook and Facebook Messenger) on both iOS and Android. Facebook’s popularity is consistent with the large body of research on the power of social influence to change behavior.9 Facebook’s early expansion was partly driven by a “surround strategy”:

If another social network had begun to take root at a certain school, Thefacebook [sic] would open not only there but at as many other campuses as possible in the immediate vicinity. The idea was that students at nearby schools would create a cross-network pressure, leading students at the original school to prefer Thefacebook.10

The explicit goal was to maximize social pressure on prospective users.

On a smaller scale, network effects impact sites that rely on reader-created content. Some research on news sites shows that increasing user “engagement”—and especially comments—helps to make a site stickier.11 Those who post online want to have their words read by others. Sites without a critical mass of viewership have trouble attracting comments.

Some sites have considered commenting systems to be central to their business. For example, for years the Huffington Post had one of the largest and most sophisticated comment systems in the news business. The Huffington Post kept thirty human coders on duty around the clock, coupled with computerized filtering technology, and handled more than eighty million comments during 2012.12 The core filtering tech came from HuffPo’s 2010 acquisition of the startup Adaptive Semantics, and the system learned over time which comments to filter, which to post, and which to send for human feedback.

These advantages, and the big investment behind them, make the ultimate fate of the Huffington Post’s comment system sobering. In 2014 the Huffington Post gave up, and moved entirely to Facebook comments. In announcing the change, the Huffington Post noted the powerful gravity of network effects, explaining that the shift would “[bring] the discussions and debates to the places where you engage with them the most.”13

Supply-side and demand-side economies of scale can thus be mutually reinforcing: sites with lots of commenters can afford better technology, and better technology platforms attract more comments. Yet few sites have been able to reach that virtuous circle. Even hugely popular sites like the Huffington Post have abandoned the Sisyphean task of policing comments, instead relying on big platforms (especially Facebook) to validate and track users. Relinquishing user comments to Facebook has reportedly boosted traffic and increased civility for many news sites, though at the cost of further empowering Facebook over content producers.14

ARCHITECTURAL ADVANTAGES

Network effects are certainly a concentrating force online, one factor that makes many sites stickier as they get larger. But they are far from the whole story behind why the internet favors bigger companies. Debates about the internet still often start with talk about the medium’s “openness,” about how the “peer to peer” architecture of the internet treats all websites equally.

But such talk is increasingly obsolete. Changes on the internet mean that the architecture of large and small sites is no longer comparable. There is overwhelming evidence that large firms’ architectural edge translates directly into a larger audience and more revenue.

To a rough approximation, internet firms produce goods from two primary inputs: lots of high-tech industrial equipment, and lots of software code. But as we know from long experience in traditional markets, both software production and equipment-heavy industries favor the very largest firms.

Industrial economics has long studied the “minimum efficient scale”: the minimum size that, for example, a factory needs to achieve in order to have the lowest costs. With some important exceptions, the answer for industrial plants has long been very, very large.15 This is nothing new. Alfred Chandler’s classic history of American capitalism, The Visible Hand,16 is filled with examples of nineteenth-century entrepreneurs investing in single factories large enough to saturate the world market. The web combines the economic pressures that produced AT&T with the forces that produced Microsoft.

To understand how the internet looks different today than it did in the 1990s, it is useful to walk through the architecture of today’s internet giants. We will start with Google. Because of its public statements, we know more about Google’s efforts than those of other firms. Still, companies like Microsoft, Facebook, and Amazon have all made similar investments in server farms and high-scalability software platforms.

For all the talk about how the information economy is leaving “industrial economics” behind, Google’s server farms have demonstrated the same economies of scale we have long seen in smokestack industries, where the largest plants are most efficient. Google does not say exactly how many computer servers it runs, but by early 2017 the company had fifteen mega data centers like the one in The Dalles, not counting numerous smaller facilities; past estimates have pegged the number of servers at 2.4 million as of early 2013.17

Running on these data centers is a dizzyingly complex software stack written by Google’s engineers. One key early piece of this infrastructure was the Google File System, which allowed Google to store large “chunks” of data seamlessly across multiple different servers.18 Google created its own storage and communication formats, and developed new tools for machines to share common resources without increasing latency.19 Google’s BigTable and MapReduce allowed it to store and process, respectively, datasets spread across thousands of separate machines.20
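To give a flavor of the programming model (a minimal, single-machine sketch of the map/reduce idea in Python, not Google’s implementation, whose whole point is to run these steps in parallel across thousands of servers while handling partitioning and machine failures), a word count can be expressed as a “map” step that emits key–value pairs and a “reduce” step that combines all values sharing a key:

```python
from collections import defaultdict

# Minimal single-process sketch of the map/reduce idea (illustrative only).
# The real framework's value is in distributing these steps across thousands
# of machines and recovering transparently when individual machines fail.

def map_phase(documents):
    """Emit (word, 1) pairs from each document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Combine the values for each key -- here, by summing word counts."""
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["the web is big", "the web scales", "big sites get bigger"]
print(reduce_phase(shuffle(map_phase(docs))))
# {'the': 2, 'web': 2, 'is': 1, 'big': 2, 'scales': 1, 'sites': 1, 'get': 1, 'bigger': 1}
```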

As the Google infrastructure has matured, its software architecture has grown even bigger and faster. Google’s Caffeine and Percolator indexing tools have allowed for incremental processing and updating. New web pages now appear in the index as soon as they are crawled, dropping the average age of documents in the Google database by half.21 The revised Google file system, codenamed “Colossus,” has been reworked to provide increased responsiveness for “real time” applications like Gmail and YouTube. Google has even built new globally distributed database systems called Spanner and F1, in which operations across different data centers are synced using atomic clocks.22 The latest iteration of Borg, Google’s cluster management system, coordinates “hundreds of thousands of jobs, from many thousands of different applications, across a number of clusters each with up to tens of thousands of machines.”23

In recent years Google’s data centers have expanded their capabilities in other ways, too. As Google has increasingly focused on problems like computer vision, speech recognition, and natural language processing, it has worked to deploy deep learning, a variant of neural network methods. Google’s investments in deep learning have been massive and multifaceted, including (among other things) major corporate acquisitions and the development of the TensorFlow high-level programming toolkit.24 But one critical component has been the development of a custom computer chip built specially for machine learning. Google’s Tensor Processing Units (TPUs) offer up to eighty times more processing power per watt for tasks like image processing or machine translation, another advantage Google has over competitors.25
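As a rough illustration of what a “high-level toolkit” buys developers (a minimal sketch using the public TensorFlow/Keras API and randomly generated stand-in data, not a description of Google’s internal systems), a small classifier can be declared in a few lines and then run, largely unchanged, on CPUs, GPUs, or TPUs:

```python
import numpy as np
import tensorflow as tf  # public toolkit; Google's internal setup differs

# Toy stand-in data: 1,000 random 28x28 "images" with 10 possible labels.
x = np.random.rand(1000, 28, 28).astype("float32")
y = np.random.randint(0, 10, size=1000)

# Declare the model at a high level; the framework handles execution details
# such as gradient computation and hardware placement.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=32)
```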

These investments in the data center would mean little, of course, without similarly massive investments tying them to the outside world. Google’s data centers are connected with custom-built high-radix switches that provide terabits per second of bandwidth. Google has made large purchases of dark fiber to speed traffic between its servers and the rest of the web. Frustrated with off-the-shelf equipment, Google has made its own routers, and in some cases even laid its own undersea fiber-optic cable.26 Both moves have been mimicked by key competitors such as Facebook.

Google has also greatly expanded its peering capabilities. Peering is the sharing of internet traffic between different computer networks, and it requires a physical fiber optic interconnection. According to information Google posted on PeeringDB.com, the company had 141 public and seventy-nine private peering locations as of June 2013. Google’s publicly acknowledged bandwidth, just from public peering locations, is 3.178 terabits—or 3,178,000 megabits—per second. For perspective, that number is equal to the entire bandwidth of all fiber optic cables running between the United States and Europe.27

The parallel investments of the biggest web firms have fundamentally transformed the internet’s architecture. According to one report, the tipping point occurred between 2007 and 2009.28 In 2007 a typical request for a web page would still go from a consumer’s local ISP network (say a local cable broadband provider), up through regional networks, across the national internet backbones run by firms like MCI and AT&T, and then down through regional and local layers to the server hosting the content. Once the target server received the request, the process would reverse, with data packets streaming upward to regional networks, over the backbone, and then back down to the local user. This model was largely unchanged from the internet’s original Cold War–era design.

By 2009, as big investments by large websites came online, traffic patterns had been transformed. Because the largest content producers had hooked fiber directly into local ISP networks, or even colocated servers in ISPs’ data centers, the portion of data traveling over the national backbones dropped. Packets took fewer hops, and users saw their web pages and videos load faster, at least when they visited the largest sites. This shift of traffic to the edges is crucial for high-bandwidth, low-latency uses, like online video or interactive web applications. But it also challenges the notion that the internet is still a peer-to-peer network. Google might have its fiber hooked directly into Comcast’s network, but small sites do not.

Google’s hardware, networking infrastructure, and software stack all show how big internet firms become more efficient as they scale up. Google or Facebook or Microsoft or Amazon can deploy more storage, more computing power, and more bandwidth per dollar than smaller firms. Per computer, these big data centers are also much cheaper to run.

Large data centers have long been more efficient than smaller data centers. According to a 2012 industry survey, though, even large data centers have a typical power usage effectiveness (PUE) between 1.8 and 1.89—meaning that for every watt spent on running the server itself, an additional four-fifths of a watt was spent on cooling and running the data center.29 Google reported a PUE of 1.1 as of mid-2013. Facebook, whose servers are slightly newer and built according to similar principles, claims a PUE of 1.08.

The largest sites thus have roughly one-eighth or one-tenth the overhead electricity costs of traditional large data centers. In a business where electricity is the largest operating cost, that represents a powerful economy of scale. Even Google’s investments in machine learning—discussed more in chapter 3—have helped here. Applying Google’s DeepMind methods reduced cooling costs for data centers by 40 percent.30
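To make the overhead comparison concrete, here is the back-of-the-envelope arithmetic behind those figures (overhead power per watt of computing load is simply PUE minus one):

```python
# Back-of-the-envelope comparison of facility overhead from the PUE figures
# cited above: overhead watts per watt of IT load = PUE - 1.
typical_pue = 1.85    # midpoint of the 1.8-1.89 range from the 2012 survey
google_pue = 1.10     # Google's reported figure, mid-2013
facebook_pue = 1.08   # Facebook's claimed figure

typical_overhead = typical_pue - 1     # ~0.85 W of overhead per IT watt
google_overhead = google_pue - 1       # ~0.10 W
facebook_overhead = facebook_pue - 1   # ~0.08 W

print(typical_overhead / google_overhead)    # ~8.5: roughly one-eighth the overhead
print(typical_overhead / facebook_overhead)  # ~10.6: roughly one-tenth the overhead
```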

Still, the key question is whether these infrastructural economies translate into advantages in stickiness—in attracting and maintaining audiences. Here we have substantial evidence that the answer is yes.

Many of Google’s other advantages are closely tied to its edge in computing and networking scale. Exceptionally low computing and storage costs are essential for both personalizing content and for ad targeting (as we will see in the next chapter). The additional capacity offered by Google or other large companies has sometimes been part of the direct pitch to consumers. When Google launched Gmail in 2004, it provided users with a gigabyte of storage at a time when most other webmail sites offered just four megabytes. Google was able to offer 250 times more storage only because of its hardware and software investments. Many of those who switched to Gmail in 2004 still visit Google dozens of times a day to check their inbox.

Google’s advantages in computing scale have also proven quite flexible. While web-scale data centers are an enormous upfront cost, they can be adapted to do many different tasks. Moreover, Google has benefited enormously from integration between its core web-scale technologies and the many applications it provides to users. As Google engineer Sean Quinlan explains,

One thing that helped tremendously was that Google built not only the file system but also all of the applications running on top of it. While adjustments were continually made in GFS to make it more accommodating to all the new use cases, the applications themselves were also developed with the various strengths and weaknesses of GFS in mind.31

Integration economies are classic economies of scale.

And, of course, Google’s architecture is blazingly fast. This fact alone makes the site stickier.

As the introduction showed, even small differences in site responsiveness quickly grow into big differentials in traffic. In the words of Marissa Mayer, “Speed is the most important feature.”32 Every part of Google’s infrastructure is designed around this “gospel of speed.” As Google Senior Vice President Urs Hölzle explains, “‘Fast is better than slow’ has been a Google mantra since our early days, and it’s more important than ever now.”33 For example, Hölzle reports that four out of five users click away from a page if a video stalls while loading.

Data from other websites shows much the same thing. Experiments with the Bing search engine showed that adding two seconds of delay immediately decreased page views by 1.9 percent and revenue by 4.3 percent. Microsoft quickly ended this experiment for fear of permanently losing customers.34 AOL similarly reported that users with faster load times viewed more web pages: those with the speediest response times viewed 7.5 pages on average, while those with the slowest load times viewed only five.35

In response to this reality, every Google service must meet strict latency budgets. Many Google engineering offices feature “performance dashboards” on large screens showing constantly updated latency numbers across various Google services. As Hölzle puts it, “We have one simple rule to support this Gospel of Speed: Don’t launch features that slow us down. You might invent a great new feature, but if it slows down search, you either have to forget it, fix it, or come up with another change that more than offsets the slowdown.”36

Google has even created its own web browser, Chrome, which has surpassed Firefox and Microsoft Edge (previously Internet Explorer) in market share. Part of the motivation was to collect more user information for targeting ads and content. But according to Google’s public statements, the single biggest reason for building Chrome was, again, speed—particularly faster speed with complex interactive websites. Most sites cannot build a new web browser, and then make it the most popular in the world, in order to speed up their site.

Google focuses on speed so intently that it now ranks other sites in its search results based on how quickly they load.37 From Google’s point of view, this makes perfect sense: speed does predict how much its users will like a website. Google wants people to use the web as much as possible, and sending users to a slow site increases the odds they will stop and do something else. But for smaller firms, the direct disadvantages of having a slow site are compounded by the speed penalty that Google assesses. Ranking sites by speed further advantages big sites over smaller ones.

DESIGN ADVANTAGES

In March 2009, Google’s chief designer, Douglas Bowman, left the company to take a job at Twitter. In a blog post, Bowman said that he was leaving because of the clash between his classical design training and Google’s obsessive culture of data:

Yes, it’s true that a team at Google couldn’t decide between two blues, so they’re testing 41 shades between each blue to see which one performs better. I had a recent debate over whether a border should be 3, 4 or 5 pixels wide, and was asked to prove my case. I can’t operate in an environment like that.38

Google’s focus on data and metrics, according to Bowman, meant that designers were relegated to trivial problems. But Bowman also acknowledged that Google’s approach had been extremely successful. In fact, Google’s approach is part of a much wider industry shift. The profusion of online controlled experiments has produced a new model of design. And while small sites can (and increasingly do) use online experiments too, the model provides significant economies of scale for the largest firms.

Web design, like other research and development expenses, commonly produces large economies of scale. We have two decades of research showing that the design of a website can strongly influence site traffic and site revenue.39 Once the initial design is finished, it is no more expensive to show users a beautiful and usable site than it is to show them an ugly or confusing one. In economic terms, design behaves like software because it is encoded in software: good design is expensive to produce, but essentially free to reproduce.

But Google’s example also shows key differences between the online design process, particularly at the largest firms, and the design process elsewhere. For physical products, design happens at the beginning of the production process. Once initial design work and small-scale testing is completed, the assembly line is fired up, and consumers are presented with a finished, final product. Similarly, with media such as newspapers and magazines, the stories change, but the overall layout is largely static.

Many of the largest web firms now use large-scale online experiments as an essential design and testing tool. Companies that rely on the technique include Amazon, eBay, Etsy, Facebook, Google, Groupon, Intuit, LinkedIn, Microsoft, Netflix, Shop Direct, StumbleUpon, Yahoo, and Zynga.40 The design process at these firms can now be constant, dynamic, incremental. Larger firms like Google or Microsoft now run hundreds of online experiments concurrently.

This testing infrastructure reframes the entire process of design. For large digital firms, design is not just about hiring and (as we saw above) retaining competent designers with a strong aesthetic sense and an understanding of usability principles. Increasingly, website and app design is about building a comprehensive testing infrastructure to optimize every element. As Google researchers report, “We evaluate almost every change that potentially affects what our users experience.”41 Design becomes a tracking and storage and data-crunching problem. Large digital firms can leverage their massive infrastructure and engineering expertise, along with their large user base, to build better sites than those of their competitors. As Microsoft’s online experiment team emphasizes, “Building the infrastructure to do this cheaply and efficiently is the real challenge.”42
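For a sense of what a single one of these experiments involves at its simplest (a minimal sketch with made-up numbers, not any company’s actual pipeline), the core analysis is just a comparison of outcome rates between two randomly assigned groups of users:

```python
import math

# Minimal sketch of analyzing one A/B test with hypothetical numbers. Real
# systems at large firms run hundreds of such experiments concurrently, with
# automated assignment, logging, and dashboards built on top of this idea.

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test for a difference in conversion rates."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided
    return p_a, p_b, z, p_value

# Variant A = current design, variant B = candidate change (hypothetical data).
p_a, p_b, z, p = two_proportion_ztest(successes_a=5_000, n_a=100_000,
                                      successes_b=5_300, n_b=100_000)
print(f"A: {p_a:.2%}  B: {p_b:.2%}  z = {z:.2f}  p = {p:.4f}")
```

Detecting an effect this small reliably requires hundreds of thousands of randomly assigned users, which is one reason the approach favors sites that already have large audiences.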

It is thus unsurprising that testing infrastructure at large sites has grown in concert with the rest of large sites’ hardware and software platforms. For example, the Google file system was intended to be employed just for indexing and crawling, but the company’s research and quality teams soon clamored to use the system to store large datasets.43 And as Google’s platform expanded, it also focused on developing new tools for in-house researchers and testers. Today Google’s Sawzall programming language, as well as its Tenzing and Dremel database tools, provides ways of analyzing the vast volumes of log data.44

Starting from a given base design, this testing infrastructure means that individual design elements are heavily tweaked through testing, leaving few untried small changes that might increase traffic. But there is an important limit to this design approach: the local maximum problem. Even if a site tests countless small changes to its current design—every possible shade of blue, or exactly how many pixels should be placed between the site logo and the text that follows—users might actually prefer a completely different design altogether.

Still, research suggests that website usability is mostly “a first order problem.”45 The elements that go into a design are often easily separable. Making a good website usually does not depend on a complex interaction between design elements of the kind that is hard to discover with A/B testing.

Google itself seems to have belatedly taken a hybrid approach. One of Larry Page’s first efforts when he became CEO in 2011 was to launch “Project Kennedy,” a cross-company effort to unify and beautify Google’s projects. In 2013, Google rolled out a new “design language” for its websites. As Google designer Matias Duarte explained, the new redesign tried to offer a more holistic vision, rather than a bunch of “little incremental changes.”46 This effort culminated in the unveiling of Google’s “material design” language, which made user experiences on Android, ChromeOS, and the web simpler, more consistent, and more polished.47

Still, this process is more a refinement than a repudiation of Google’s focus on testing. Designs are still extensively tested before widespread deployment. Google’s oft-cited maxim “focus on the user, and all else will follow” is primarily about measuring and testing every part of the user experience. The predominantly data-driven approach may be newer than traditional design methods, but both approaches advantage large sites over their competitors. Smaller sites often do not have the hardware and staff to build this testing architecture, and they lack the statistical power to detect small effects.

Small changes to websites can lead to substantial bottom-line effects. Bing’s Experimentation System is credited with adding hundreds of millions of dollars to Microsoft’s bottom line. Just as important, though, were the large sums saved by catching damaging changes before they went live: “The system has also identified many negative features that we avoided deploying, despite key stakeholders’ early excitement.”48 Ron Kohavi, now the head of Microsoft’s online experiment team, relates an example of this from his time at Amazon.com:

[I remember] how the Amazon Order Pipeline team wanted to introduce the new version based on the new app server, Gurupa, and [I] insisted that an A/B test be run: it failed with a 2 percent revenue loss. The team dug deep for two weeks, found “the” bug, and wanted to ship. No, you need to pass an A/B test was the message. The team ran it and it failed again. The new pipeline shipped after five iterations. It is not just new ideas that fail, but re-implementations of existing ones are not as good as we initially think.49

For established web firms, deploying a new product version that fails is one of the single largest business risks. A robust testing infrastructure can help mitigate those liabilities. At the same time, it can limit opportunities for upstart competitors.

Testing infrastructure has also shaped firms’ strategic investment. As we have seen, Google’s exponential spending increase on server farms and networking hardware was initially a large departure for the company. This multibillion-dollar risk would not have been undertaken without reams of evidence that it would pay off in more traffic and revenue.

ADVERTISING AND BRANDING

In 1896 a group of investors led by Adolph Simon Ochs bought the New York Times for $250,000. Prior to the purchase the Times had been on the verge of bankruptcy. Founded in the early 1850s, the Times had been more restrained and intellectual in its coverage than sensationalist competitors like the World or the Journal. The financial panic of 1893 had hurt all newspapers, but it had especially damaged the Times, which had dominated the market for financial advertising. At its nadir the paper had a daily circulation of just 9,000 readers.

What, then, did Ochs and his fellow investors actually buy? Surprisingly little. The newspaper’s printing presses were old, run down, and not worth much. The linotype machines were rented. The newspaper had even been forced to rent offices in a building that it had originally owned. By the time Ochs purchased it, there was “virtually nothing [left] but the name and goodwill of the paper.”50

The physical infrastructure of media distribution does matter, whether it relies on printing presses or server farms. Still, we should not forget that key assets in media organizations have always been, in a sense, virtual. A century later the editorial page of the New York Times stated that the internet had repealed the old aphorism that “freedom of the press belongs to those who own one.” Yet, as the Times’ history itself shows, physical printing presses were only part of the Times’ ability to reach and grow an audience.

The story of the Times encapsulates two key features of media that remain relevant in the digital age. First, the Times had value, even in the late nineteenth century, as a paper more “serious” than its competitors. The New York Times brand conveyed a specific set of characteristics to its readers. Second, the Times had value as a marketplace for advertising. Media companies serve as a vehicle by which other brands establish and maintain themselves. Chapters 3 and 4 will look at how larger sites generate more ad revenue per user than comparable smaller sites. This section will focus on the first question: how larger sites are better able to build and maintain their online brands.

In the short term, a brand name is something that can be used with almost no additional cost. Adding a logo to a cheap handbag, or adding the Wall Street Journal masthead to a hastily rewritten press release, might immediately increase perceived value. But if Louis Vuitton started selling ugly vinyl handbags for five dollars apiece, the value of the brand would erode quickly. Companies spend vast sums of money building and defending their brands. For many firms their brands are their single most valuable asset.

Media have an especially intimate relationship with branding, because most media products are experience goods.51 It is difficult for consumers to know how much they will like a given media product, such as a news article or a rock album, without consuming it first. Consumers go to the online New York Times or the Huffington Post or Reddit today because they found the site content interesting or entertaining the last time they visited. Experience goods tend to produce strong inertia effects, in which consumers develop strong brand loyalties.52 When it is tough to judge quality in advance—that is, when search costs are high—consumers tend to stick to previous patterns of consumption.

The primary costs of maintaining brands are twofold: the costs of maintaining consistent quality (however defined), and the costs of advertising. But in economic terms it does not matter whether the quality difference is real or just perceived. As economists Carl Shapiro and Hal Varian write, “Customer perceptions are paramount: a brand premium based on superior reputation or advertising is just as valuable as an equal premium based on truly superior quality.”53

Many business scholars have examined what drives these perceptions of site quality. Their work shows—consistent with the previous sections—that technical site traits such as download speed and navigability, as well as a site’s visual appeal, are strongly associated with perceived quality.54 All of these site performance and appearance traits, as we just saw, produce steep scale economies.

Moreover, many nontechnical traits associated with site quality similarly favor larger sites. Scale economies in online advertising can also come from threshold effects, with larger firms finding it easier to reach the tipping point at which ads are most effective. For example, standard advertising doctrine states that ads are most effective when they are seen multiple times, and that repetition is especially important for unfamiliar brands.55 As we will see in the next chapter, this dynamic remains powerful online. Retargeting—in which prospective consumers are chased around the web by numerous ads for the same product—is more successful than showing an ad just once.56 The need for a larger ad campaign disadvantages smaller firms.

In the abstract, then, there is plenty of reason to expect that branding is a powerful force for online traffic concentration. More direct evidence of just how powerful advertising effects are can be found in two key areas: the search engine market and the market for online news.

The search engine marketplace is an effective duopoly. Google and Microsoft’s Bing now divide almost the entire market, with Google having twice Bing’s share (including licensed Bing searches on Yahoo!). Yahoo! spent billions of dollars on research, hardware, and acquisitions to build up a search business to compete with Google.57 But in 2009 Yahoo! gave up, and signed a ten-year agreement with Microsoft to use Bing in exchange for 88 percent of Yahoo!’s search revenue. Microsoft’s current search efforts began in earnest in 2005. The Bing search engine (previously MSN Search) had been dubbed a “black hole” by analysts; between 2006 and 2013 Microsoft lost $12.4 billion in its online division, of which Bing is the core.58

Despite enormous early financial losses, Microsoft built a search engine far better than Google’s own engine of a decade ago, though, of course, Google’s search engine is a moving target. Bing’s search results have for years closely overlapped with Google’s results, especially for popular searches. And as Bing has improved, the overlap has grown greater still.

Why did Yahoo! and Bing fail to dent Google’s market share even as they dramatically improved in quality, and even produced quite similar results? One key reason is the power of Google’s brand. Users’ deep affection for Google is a striking feature of web use research. In a study on online trust, Hargittai, Fullerton, Menchen-Trevino, and Thomas found that many participants used highly emotional language, telling researchers that “I love Google” and that Google is “the best search engine.”59 Jansen, Zhang, and Mattila similarly report that “the depth of the positive sentiment [toward Google] is amazing,” with several participants using the word “love,” to describe their affection for the company.60

Similarly powerful results emerge from experimental studies. Jansen, Zhang, and Schultz showed that users strongly preferred Google-branded results over the same results attributed to other search engines.61 Pan and collaborators used eye-tracking technology to show that users focused on just the top few search results.62 This remained true even when the researchers reversed the real Google rankings, placing the last results first, something they attributed to users’ built-up trust in the Google brand.

Microsoft has actually based a major advertising effort on this sort of A/B testing. Microsoft’s “Bing It On” ad campaign, like the long-running Pepsi Challenge, asks users to compare unbranded results from both Bing and Google side by side. Microsoft claims that most users prefer Bing to Google, based on a self-funded study.63 Outside research has not supported Microsoft’s claims. Using blinded experiments, Ataullah and Lank64 and Ayres et al.65 found that users deprived of brand labels still had a slight preference for the Google results. Tests by news organizations also found that Google slightly bettered Bing, especially for the less common searches that Microsoft conveniently excluded.66 Still, both search engines overlap so strongly that Google’s edge in any handful of searches is modest. Ataullah and Lank ultimately conclude that “While Google may outperform Bing in blind searching, trust in the Google brand is a much more significant factor in Google users’ search preferences.”67

Google’s example shows how a brand built on an early technical lead can persist even after the quality difference between it and its competitors drastically narrows. In the online world, just as with cars or soft drinks or luxury clothing, branding tends to be persistent.

The sorts of strong brand effects seen with search engines also powerfully shape the online news market. Much recent work has focused on growing partisan self-selection in news. In the process, this work has shown that many online news consumers have robust—even fervent—brand preferences.

In a survey experiment using a nationally representative online sample, Shanto Iyengar and Kyu Hahn68 took real-time news headlines and randomly attributed the stories to either Fox News, CNN, NPR, or the BBC. Republicans and conservatives showed a strong preference for Fox over all other news outlets, and an aversion to CNN and NPR. Liberals showed the opposite: a strong aversion to Fox, and a roughly equal enthusiasm for CNN and NPR. These brand preferences held not just for hard news, but also for soft news topics such as sports and travel.

Just as significantly, subjects were much more interested in reading stories attributed to major news outlets than was the control group, which saw identical headlines without source labels. Established brand names drove traffic to news, while subjects largely ignored unbranded, anonymous news stories.

Natalie Stroud finds similar results in her book Niche News.69 Using a modified version of Google news content, Stroud randomly attributed headlines to either Fox News or CNN. While the topic of the story mattered, there were nonetheless strong partisan brand preferences. Strong, durable branding effects are one more reason to expect online traffic to be persistently concentrated.

USER LEARNING

The preceding section shows how advertising builds familiarity with a product, and teaches consumers to associate a brand with (hopefully positive) attributes. All of this requires consumer learning, of a sort.

Yet websites also benefit from user learning in much deeper ways. Users prefer websites that they not only know, but know how to use. While brand-specific skills are a powerful force in many markets, the web provides an especially large role for brand-specific consumer learning.

Evidence for the importance of user learning comes from many areas of scholarship, including longstanding work on the so-called “digital divide.” While early digital divide research focused on disparities in access, recent work has focused on large, and strikingly persistent, gaps in web users’ skills. In this area the work of sociologist and communication scholar Eszter Hargittai and her collaborators has been especially important. Even as the web has diffused widely, Hargittai and her coauthors have shown that many common tasks remain difficult for most users.70 Some have suggested that younger users will show fewer differences in digital skills, but the data has challenged these claims. Hargittai has shown that even many of these so-called “digital natives” struggle with basic online tasks.71 A key coping mechanism is “reliance on the known,”72 in which users stick to familiar routines and trusted name brands.

These findings dovetail with economics research showing that the buildup of user skills over time can produce strong brand loyalty among consumers.73 Importantly, this can happen even if the competing products are identical in their quality and initial ease of use. As experienced users become more proficient, they tend to keep using products that they are already invested in. Companies with an established customer base thus find it easier to maintain market share.

Brand-specific skills are believed to be especially powerful in software markets.74 Even comparatively simple software, like a word processor, can have a steep learning curve. For example, the marketplace transition from WordPerfect to Microsoft Word was extremely slow, because users’ skills did not transfer from one product to the other.75 Upstart software firms facing an established incumbent cannot succeed just by producing a slightly better product at a slightly better price. The new product has to be so dramatically better that it overcomes the costs of switching—most of which do not have anything to do with the new software’s retail price. The costs in training and lost productivity and simple frustration usually far exceed the cost of buying the software itself.

But if the effects of brand-specific skills are well-known in traditional software markets, we should now expect these effects online. The web increasingly reproduces desktop software, from email programs to photo editing to games to spreadsheets to word processors. The growth of Ajax76 and related technologies has shifted some processing from a distant server to the user’s browser, allowing a web-based word processor to be as responsive as traditional software.77 Other technologies have pushed in the opposite direction, shifting computation and storage onto remote web servers. As the difference between applications that run in the cloud and those that run on a local device shrinks, users are anchored to familiar sites just as they have long been tied to familiar software programs.

But even when interaction with websites is less complicated than learning a new software program, scholarship has found that the web can produce “cognitive lock-in.” Johnson, Bellman, and Lohse provide the example of visiting a new grocery store for the first time.78 It takes time to learn the physical layout of the store, the shelf location of the milk or mangoes or mayonnaise. After the first few visits, familiarity with the grocery store makes it increasingly attractive relative to competitors. Johnson and collaborators present evidence that the same dynamic exists in online shopping. Users spend less time at easier-to-learn sites, but these sites also produce more return visits and more purchases.

Work on cognitive lock-in helps explain a key puzzle. Some early economics work suggested that, by lowering search costs and switching costs for consumers, the web would lessen users’ loyalty to particular outlets, and even produce “frictionless commerce.”79 Yet these expectations have been confounded by studies showing that consumers are at least as loyal online as they are in offline environments.80 Subsequent work, consistent with that of Hargittai and collaborators, mentioned earlier, has emphasized the role of habit and routine. Kyle Murray and Gerald Häubl argue that the web produces what they term skill-based habits of use.81 As users become more practiced and proficient, their browsing behavior becomes increasingly automatic. It becomes harder and harder to change their patterns of usage.

PATH DEPENDENCE AND THE DYNAMICS OF LOCK-IN

In early 2004 a student-created social networking site took over an Ivy League campus. Within a month most students on campus were posting photos and blogs and polls, sharing music and upcoming events, and commenting on their friends’ activities. After conversations with Silicon Valley luminaries, such as Napster creator Sean Parker, the site expanded to many other campuses. The founder dropped out of school to work on the site full time. Soon the nascent social network had hundreds of thousands of users.

This is not the story of Facebook. Rather, it is the story of Campus Network, an early Facebook competitor that began at Columbia University. Started by Adam Goldberg, an engineering and computer science student, Campus Network (initially called CU Community) launched before Facebook. It began with a head start in features and functionality. Blogging and cross-profile discussion were built into Campus Network from the start, while these features came later to Facebook. In Goldberg’s telling, Sean Parker urged Mark Zuckerberg to purchase Campus Network and to hire Goldberg.

So why was Facebook ultimately successful, while Campus Network ended up shutting down in 2006? Journalist Christopher Beam suggests several possible factors that may have outweighed Campus Network’s early lead.82 One is money. Facebook quickly pursued financial backing, while Campus Network turned down advertisers and declined to seek venture capital funding. This early capital allowed Facebook to hire more developers, to quickly add more features, and to rapidly expand into new markets.

Facebook was also simpler at first, letting users sign up with just a name, email address, and password. And Facebook was prettier. Wayne Ting, who was in charge of business and legal work for Campus Network, said the site looked like it was designed by “somebody who loves dungeons and dragons.”83 Perhaps, too, Facebook’s Harvard origins provided additional cachet compared to Campus Network’s Columbia heritage.

Ultimately, Facebook was able to expand faster. By the time Facebook had a million users it was four times the size of Campus Network, and growing more quickly. Campus Network had lost the race.

It is easy to draw the wrong conclusion from the early competition between Facebook and Campus Network. One might debate whether things would have been different if Campus Network had pursued venture capital, or if it had pushed to expand faster, or if it had just offered a less clunky interface. Perhaps so.

But the more profound lesson is that these advantages were small and largely arbitrary. Because Facebook was slightly better slightly earlier, and because it was lucky, it won the entire market. The web is not like economic sectors such as agriculture or logging or mining, where firms run up against hard constraints on their size. Digital firms online, much like software firms or telegraph companies, face few natural limits to their scale. Once a winner begins to emerge the market becomes highly inflexible. Early small events are magnified instead of being averaged away.

Economists commonly call this pattern lock-in. Lock-in emerges when the costs of switching grow large enough to outweigh any potential benefits. While the early stages of an emerging online niche are strongly dynamic, again and again once-open digital niches have locked in. Small differences in stickiness compound quickly.

Lock-in occurs not despite the fact that the web is constantly changing, but precisely because of its dynamic character. The evolutionary, constantly compounding character of web traffic is why digital niches lock in so quickly.

This chapter has documented many powerful but different forms of increasing returns on the web, each of which contributes to lock-in. Increasing returns come from more than just network effects—though those matter, especially for sites that allow direct user-to-user communication. Larger sites can be faster, and have more computing power and storage, while still costing less per user to run. Established sites benefit from branding, and (as we will see shortly) they can charge far more per user in advertising. Bigger sites are prettier and easier to use, and they provide larger, higher-quality bundles of content. And as users develop habits and site-specific skills they are increasingly anchored to sites that they already know.

But despite these increasingly obvious facts, the internet is often still portrayed as a magical fairyland of frictionless commerce and perfect competition. We still see news articles like a piece in Forbes, in which the headline blares “Anyone Anywhere Can Build The Next Google—There Are No Barriers.” Executives at today’s digital Goliaths still repeat the refrain that “competition is only a click away.” Google chairman Eric Schmidt declared that the company should be worried because “somewhere in a garage [someone is] gunning for us. I know, because not long ago we were in that garage.”84 Former FCC chairman Tom Wheeler claimed in his book that the internet serves as an inevitably decentralizing force—a sentiment that Trump administration FCC chair Ajit Pai has echoed (as we will discuss later on).

The notion of an intrinsically open, ever-competitive internet is still the central assumption behind U.S. communication policy, and the foundation of a substantial body of scholarship. But these claims are more and more at odds with reality. An internet on which most traffic never touches the public backbone is no longer a peer-to-peer network, nor one that (as FCC chairman Wheeler suggested) “pushes activity to the edges.”85 Unfettered competition never guarantees consumer-friendly outcomes in markets with strong economies of scale. Economics has long known that increasing returns can cause an industry to adopt a technology that later proves inefficient.86 There is no guarantee that Facebook is a better site for consumers than, say, Campus Network would be today if it had survived.

Amid this discussion of lock-in online, one thing has been left out. In many markets, the largest barrier to switching is the hassle of finding and evaluating potential alternatives. How often do most people get an alternative quote for their car insurance? How often do they try out new brands of deodorant or toothpaste? How many consumers are really getting the lowest possible rate on their mortgage, or the highest interest rate on their savings account? Most of the time firms and individuals stick with products and services that are known to be good enough.

Search costs, then, can also produce lock-in. And these search costs will be the subject of the next chapter.
