© Gordon Haff 2018
Gordon Haff, How Open Source Ate Software, https://doi.org/10.1007/978-1-4842-3894-3_2

2. From “Free” to “Open Source”

Gordon Haff
Lancaster, Massachusetts, USA

We’ve seen how free and open source software not only came onto the scene but became an essential part of the computing landscape. It’s not just cheaper than proprietary software; it’s a model that encourages the creation of software ecosystems and new innovation.

However, in the process of telling this story, I’ve moved quickly past some details. Are the oft-conflated “free” and “open source” one and the same, or do they capture an important distinction? What do you need to know about the legal basis for this type of software without getting too deeply into law, licenses, and contracts? Is it safe? How does this open source thing work if you want to participate?

This chapter takes you from that historical context to today’s on-the-ground reality.

Words Can Matter

Richard Stallman and his Free Software Foundation have long emphasized the word “free.” But Stallman didn’t intend that to mean someone had to give you a copy of software without charging for it. Rather, as he would later add in a footnote to the GNU Manifesto: “I have learned to distinguish carefully between ‘free’ in the sense of freedom and ‘free’ in the sense of price. Free software is software that users have the freedom to distribute and change. Some users may obtain copies at no charge, while others pay to obtain copies—and if the funds help support improving the software, so much the better. The important thing is that everyone who has a copy has the freedom to cooperate with others in using it.”

The oft-repeated shorthand is that free software is about free as in freedom rather than free as in beer.

Why Free

The GNU Manifesto was a reaction to what was happening in the computer industry more broadly at the time. Convivial academic collaborations were giving way to fragmented and proprietary commercial products.

As Steven Weber writes in The Success of Open Source (Harvard University Press, 2005), “The Free Software Foundation was fighting a battle with this narrative on philosophical grounds. To reverse the decline, Stallman sought to recapture what he thought was essential about software—the ideology of cooperative social relations embedded within the knowledge product. The philosophy implanted in the software was as important as the code.”

There were a variety of terms floating around for this new type of software. Brian Behlendorf of the Apache web server project, an early open source software success that became an important element in the Internet buildout, favored “source-code available software.” But free software became the common term. And, for newcomers, distinguishing free as in freedom from free as in beer was often confusing.

The Coining of “Open Source”

In 2018, writing on the 20th anniversary of the coining of the term “open source,” Christine Peterson—who had been the executive director of the Foresight Institute—recounts how she was focused “on the need for a better name and came up with the term ‘open source software.’ While not ideal, it struck me as good enough. I ran it by at least four others: Eric Drexler, Mark Miller, and Todd Anderson liked it, while a friend in marketing and public relations felt the term ‘open’ had been overused and abused and believed we could do better. He was right in theory; however, I didn't have a better idea, so I thought I would try to go ahead and introduce it.”1

At a meeting later that week, the terminology again came up when discussing promotion strategy related to Netscape’s plan to release its browser code under a free software type of license. A loose, informal consensus developed around the open source term.

Influential publisher and event organizer Tim O’Reilly would soon popularize it. In addition to using the term himself, “on April 7, 1998, O'Reilly held a meeting of key leaders in the field. Announced in advance as the first Freeware Summit, by April 14 it was referred to as the first ‘Open Source Summit.’”

Pragmatism and Commercialism

The initial motivation for this shift in terminology was primarily practical. People were tired of explaining that it was perfectly well and good to charge for “free software.”

However, it served another purpose that one suspects many proponents such as O’Reilly realized from the beginning. It helped to distance open source as a broader movement for organization, collaboration, innovation, and development from the narrower and more philosophical bent of free software.

O’Reilly would note the following year that “the Free Software Foundation represents only one of the traditions that make up the open source movement” with university software development traditions—most notably Berkeley’s BSD—also being significant. Furthermore, much focus around free software ended up being around licenses, which may have been important (especially at first) but were only one component of broader discussions around openness and user freedoms.

The shift in terminology also reflected how open source software, including but hardly limited to Linux, was becoming commercially interesting to existing vendors, new vendors, and end users. Some traditional IT suppliers like IBM were making huge investments in open source. New companies that were inextricably linked to open source were aborning.

The first Internet wave in the latter half of the 1990s was built on open source. Large-scale hosting providers, those building specialized software appliances for serving email and web pages, and the early iterations of search engines and other now familiar services all required open source software to function. Many wanted to customize the software they depended on. Furthermore, the scale at which they operated increasingly made alternatives to proprietary software an economic necessity.

Commercialism led to something of a schism between hobbyists and those focused more on the ideological roots of free software on the one hand and the more pragmatic and profit minded on the other. It pitted a focus on free-as-in-freedom software against software developed collaboratively and in the open simply because doing so was more effective. The two aspects were by no means mutually exclusive, but there was clearly tension for both philosophical and pragmatic reasons.

Today, one interpretation is that the two perspectives have mostly reached a rapprochement; the practical benefits of open source software include user flexibility and freedoms that at least overlap their ideological counterparts. A more cynical view is that pragmatism and profit have come to dominate open source—at least with respect to where much of the attention and effort flows. In reality, it’s a bit of both and whether it’s closer to one or the other is going to depend on your priorities and point of view.

How Open Source Licensing Works

The choice of licenses is usually a less contentious topic than it once was. But licenses as a legal concept remain an important part of open source software’s foundations. As a result, it’s useful to broadly understand their role and some of the key distinctions among them that can affect how open source development works.

While it doesn’t control the use of the “open source” term, the open source definition and the set of approved licenses maintained by the Open Source Initiative (OSI) serve as a generally accepted standard for what constitutes an open source software license. Core principles include free redistribution, providing a means to obtain source code, allowing modifications, and a lack of restrictions on who can use the software and for what purpose.

The last point means that you can’t say, “This software is free to use for educational purposes only.” Open source has succeeded in part because approved open source licenses don’t place restrictions on how you can use software you’ve obtained from some source. Depending upon the license, you may have obligations if you, in turn, redistribute the software or incorporate it into something else that you then distribute. But use is fair game.

Do You Have to Give Back or Not?

There are two broad categories of licenses. One includes “copyleft” or reciprocal licenses, of which the General Public License (GPL) is the best known and is the one used by Linux. The other includes “permissive” licenses, most notably the Apache, BSD, and MIT licenses.

There are a variety of other licenses as well. Some are essentially legacy licenses that have just never been retired. Others are designed to be more suitable for other copyrighted material such as books or photographs. For example, the CC BY-SA Creative Commons license variant permits use and remixing but requires attribution and that adaptations be shared under the same terms.2

It’s worth emphasizing that material released under an open source license is still copyrighted. Indeed, copyright law is integral to how open source licensing works: it’s the copyright holder’s license that grants users various rights they wouldn’t have under a default “All Rights Reserved” copyright. In most countries, creative works—including computer software—are automatically copyrighted as soon as they’re created.

There are some exceptions. For example, US copyright law places works of the US federal government in the public domain upon creation. An individual can also choose to place their own work in the public domain, although this is controversial for rather arcane legal reasons. For this reason, Creative Commons released a copyright waiver in 2009 called CC0 as an alternative to a public domain dedication.

Different licenses impose more, fewer, or different types of restrictions within that general framework. But copyleft and permissive capture the core distinction.

Protecting the Commons

A copyleft license requires that if changes are made to a program's code, and the changed program is distributed outside an organization, the source code containing the changes must likewise be distributed (or otherwise made reasonably available). Permissive licenses don't include that requirement.

Copyleft says that you can’t take someone’s code, change it or mix it up with other code, and then ship the resulting program in machine-readable form without also making it available in human-readable source code form. There’s a philosophical point here. If you take from the commons—that is, use open source software someone else has created—you also have to give back to the commons if you distribute it.

This reciprocity requirement had roots in the practical. In a software world that was seemingly becoming increasingly proprietary and profit seeking, the thinking went, why wouldn’t corporations naturally vacuum up code from the commons, give little in exchange, and effectively become free-riders?

After all, the “tragedy of the commons” was a social science phenomenon articulated as far back as 1833 by the British economist William Forster Lloyd, using the hypothetical example of unregulated grazing on common land in the British Isles. Better to require reciprocal contributions, especially given that the advantages of open source as a development model mostly hadn’t yet been articulated and proven.

But it was also just a reflection of a software movement that was as much about philosophical principles as it was practical results.

A couple of things that cause confusion are worth highlighting. Let’s look at them now.

Seeing Through the Copyleft Mire

The first is that precisely defining “mix it up” gets into technical and legal details that, absent substantial case law, are a matter of some debate. For example, some argue that the manner in which two programs are combined together—that is, whether they’re dynamically or statically linked in Unix parlance—makes a difference. Others say it doesn’t.

However, to the degree that there’s a risk, it’s usually not so much that it’s hard to determine whether code is actually being mixed because it runs into an ambiguous edge case. Rather, it’s that code under a copyleft license was deliberately but carelessly used in projects where it wasn’t appropriate or was against company policies.

The other point is that copyleft licenses are specifically about distribution. Again, there are some nuances but, essentially, if you distribute software, for profit or otherwise, by itself or as part of a hardware product, you must make available the source code for the work that incorporates the copyleft code. In other words, so long as software is used only internally, there’s no requirement to distribute the source code.

Permissive Licenses Gain

We’re seeing an ongoing shift to more permissive licenses. Matthew Aslett of market researcher 451 Group wrote in 2011 that “2010 was the first year in which there were more companies formed around projects with non-copyleft licenses than with strong copyleft licenses.”

More recent data shows a continuing trend, as do the anecdotal observations of industry observers.

Black Duck by Synopsys, which automates security and open source license compliance, maintains a Knowledge Base of over two million open source projects. As of 2018, among projects using the top five licenses accounting for 77 percent of the total projects, about two-thirds used a permissive license (MIT, Apache 2.0, or BSD).

There are some differences among these permissive licenses, primarily with respect to the types of copyright and other notices that must be retained and displayed. Patent language, both explicit and implied, also differs; newer licenses are more likely to call out patents specifically in the text of the license. These differences will be important to many organizations shipping products that make use of open source code. However, we can mostly think of these permissive licenses as giving permission to use code covered by such licenses without meaningful restriction.

In general, this shift reflects less concern about preventing free-riders and an increasing focus on growing communities.

There are indeed many free-riders. That’s a given. But open source has been widely embraced by all manner of companies because they've found that open source is a great way to engage with developer and user communities—and even with competitors. It's emerged as a great model for developing software and capturing innovation wherever it's happening.

Furthermore, in a cloud services world, the GPL doesn’t even protect against free-riding especially well. If you sell me a service delivered through a web page rather than software I download, that’s not distribution from the perspective of the GPL. Yet, this is the increasingly dominant way through which you use many types of software. Salesforce.com. Amazon Web Services. Microsoft Azure. Google Cloud Platform. All you need is a web browser to use any of them. You don’t need software that is distributed in the traditional sense of shipping bits on disk or making them available for download.

To some this is indeed a loophole, and the Affero General Public License, which extends source-availability obligations to software delivered as a service over a network, was introduced to close it. But it’s not widely used.

Driving Participation Is the Key

What’s usually more important is decreasing the barriers to participating in projects and collaborating across companies. And while individuals and organizations participate and collaborate on projects licensed under the GPL, most famously Linux, permissive licenses are increasingly viewed as an often-better choice.

In part, this is because companies developing with a combination of open source and closed source code simply want to maintain the flexibility to decide what code to contribute and when based on their own desires and not the demands of a license. It also reduces license incompatibility issues. Software licensed under major permissive licenses can be added to GPL-licensed code, which can then be distributed under the GPL, but the reverse doesn’t apply because the GPL is more restrictive than a license like MIT. (Combining code can’t result in removing obligations such as reciprocity.)
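The one-way nature of this compatibility can be sketched as a simple check. The sketch below is an illustrative simplification, not legal advice: the license identifiers and the rule table are my own shorthand, and real compatibility analysis has many more wrinkles (for example, Apache 2.0 is generally considered incompatible with GPLv2 specifically, even though both are widely used).

```python
# Illustrative sketch of one-way license compatibility: permissive code can
# flow into a copyleft work, but copyleft code can't be relicensed permissively.
# Simplified shorthand only; real analysis has many more edge cases.

PERMISSIVE = {"MIT", "BSD-3-Clause", "Apache-2.0"}
COPYLEFT = {"GPL-3.0"}

def can_combine_into(component_license: str, work_license: str) -> bool:
    """Can a component under component_license be incorporated into a work
    distributed under work_license?"""
    if component_license in PERMISSIVE:
        # Permissive terms impose no reciprocity on the combined work.
        return True
    if component_license in COPYLEFT:
        # A copyleft component requires the combined work to remain copyleft;
        # combining code can't remove the reciprocity obligation.
        return work_license in COPYLEFT
    # Unknown license: flag for human review rather than guessing.
    return False

print(can_combine_into("MIT", "GPL-3.0"))  # True: MIT code may be added to a GPL work
print(can_combine_into("GPL-3.0", "MIT"))  # False: the reverse doesn't apply
```

The asymmetry in the function mirrors the asymmetry in the licenses themselves: obligations can only accumulate, never be shed, when code is combined.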

Whatever the reason in an individual case, it’s about maximizing participation in projects. The Eclipse Foundation's Ian Skerrett argues that "projects use a permissive license to get as many users and adopters, to encourage potential contributions. They aren't worried about trying to force anyone. You can't force anyone to contribute to your project; you can only limit your community through a restrictive license."

Appropriate licensing remains relevant in a table stakes sort of way for open source projects. However, it’s no longer a major focus for creating successful communities, projects, and businesses. As Chris Aniszczyk of the Cloud Native Computing Foundation puts it: “Licensing and all that is table stakes. That's a requirement to get the gears going for collaboration, but there are [other] aspects around values, coordination, governance.”

Maintaining Open Source Compliance

Maintaining compliance with these different types of open source licenses can sometimes seem intimidating, but it’s mostly about having established processes and following them with reasonable care.

Putting Controls in Place

The Linux Foundation recommends having a designated open source compliance team that’s tasked with ensuring open source compliance.3 Such a team would be responsible for open source compliance strategy and processes to determine how a company will implement these rules. The strategy establishes what must be done to ensure compliance and offers a governing set of principles for how employees interact with open source software. It includes a formal process for the approval, acquisition, and use of open source; and a method for releasing software that contains open source or that’s licensed under an open source license.

As open source becomes more widely used within many companies, it’s increasingly important to have these kinds of controls in place anyway for reasons that aren’t directly related to license compliance. Unvetted code from public repositories may not be current with security patches or may otherwise not meet a company’s standards for production code.

What Are Your Policies?

A first step is to establish a process and appropriate policies. Often this includes a list of acceptable licenses for software components used for different purposes and in different roles. For example, companies widely use commercial enterprise software that includes programs licensed under the GPL like Linux. However, they may choose to only incorporate open source software components that use permissive licenses into their own software. Or they may be fine with GPL components for their internal software but not in products that they ship or otherwise expose to partners and customers.
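A policy of that sort can be encoded directly in tooling. The sketch below is hypothetical: the policy table, context names, and component names are invented for illustration, and a real policy would be set by management and legal counsel, not by a code sample.

```python
# Hypothetical license policy: which licenses are acceptable in which context.
# The table below is invented for illustration, not a recommendation.
POLICY = {
    "internal": {"MIT", "BSD-3-Clause", "Apache-2.0", "GPL-3.0"},
    "shipped": {"MIT", "BSD-3-Clause", "Apache-2.0"},  # no copyleft in shipped products
}

def violations(components, context):
    """Return names of components whose license isn't approved for this context."""
    allowed = POLICY[context]
    return [name for name, license_id in components if license_id not in allowed]

# A toy bill of materials: (component name, license) pairs for one build.
bom = [("helper-tool", "GPL-3.0"), ("http-lib", "Apache-2.0")]
print(violations(bom, "internal"))  # []
print(violations(bom, "shipped"))   # ['helper-tool']
```

The same component passes in one context and fails in another, which is exactly the internal-versus-shipped distinction described above.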

In any case, these are matters of policy for management, including the legal staff. One important general recommendation, however, is to not make things too complicated. As Ibrahim Haddad, vice president of R&D and head of the Open Source Group at Samsung Research America, puts it: “If your code review process is overly burdensome, you’ll slow innovation or provide a good excuse for developers to circumvent the process completely.”

An Ongoing Process

The ongoing process then uses scanning tools—whether proprietary or open source—to determine the software components in use, the licenses of those components, potential license conflicts, and any dependencies. Related tools can be used to identify known security vulnerabilities. Problems can then be identified and resolved. For example, a proprietary software component linking to a GPL-licensed component might be in compliance with policy for an internal tool but should raise a flag if it’s going to be shipped externally.
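As a taste of what the inventory step of such a scan looks like, the sketch below uses Python’s standard `importlib.metadata` to list installed packages and the license each declares. This is only the first step: declared metadata can be missing or inaccurate, and real compliance scanners also inspect source files, transitive dependencies, and potential license conflicts.

```python
# Minimal inventory step of a compliance scan: list installed Python
# distributions and the License field they declare. Declared metadata can be
# wrong or absent, which is why real scanners also examine the source itself.
from importlib import metadata

def license_inventory():
    """Map each installed distribution name to its declared license string."""
    inventory = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "unknown")
        inventory[name] = dist.metadata.get("License") or "UNKNOWN"
    return inventory

for name, declared in sorted(license_inventory().items()):
    print(f"{name}: {declared}")
```

An inventory like this feeds directly into the policy check: each declared license is compared against the approved list for its context, and mismatches are raised for review.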

Projects versus Products

To this point, I’ve been mostly talking about open source software as a singular collection of bits. For a hobbyist project, that may be a reasonable simplification. For Linus Torvalds, it was bits on a desktop computer at the University of Helsinki. These days, it’s more commonly files stored on an online repository like GitHub. No one is selling the software. No one is promising support in any sort of formal way.

However, for most products that companies sell, it’s important to distinguish between projects and products.

Upstream and Downstream

In its simplest form, there is an “upstream” community project and a “downstream” product based on the upstream. For this discussion, assume that the product is also fully open source software although some companies practice partial open source development by combining open source components with proprietary ones; “open core” is a common form of this practice. Downstream products may also bring together and integrate independent and quasi-independent upstreams but the same general principle applies.

Upstream is the catch-all term for the ultimate source of the project. It's the core group of contributors, their mailing lists, website, and so on. Ideally, it’s where most of the community development happens. By contrast, the product is something that a customer buys to solve a business problem.

Innovation, rate of improvement, wide acceptance, and other aspects of the product may derive in large part from the fact that a vibrant upstream community exists. But don’t confuse the two. As Paul Cormier, Red Hat’s president of Products and Technologies, puts it: “Too often, we see open source companies who don’t understand the difference between projects and products. In fact, many go out of their way to conflate the two.”4

Usually, the open source project comes first. Sometimes the process gets reversed when a company acquires a proprietary product and releases it as open source. But most of the same principles apply to the relationship of the project to the product as seen in Figure 2-1.
Figure 2-1

The open source product development process. Source: Red Hat.

Projects and Products Depend on Each Other

To be clear, there are strong linkages between successful projects and successful products.

For example, as I’ll cover in more detail later, metrics for community success often include measures of community breadth; how many contributors work for someone other than the project’s creator? The reason this is important is that a community with no outside contributors is effectively an open source project in name only. Others can look at the code, but there’s none of the broader participation that makes the open source development model work so well.

It’s also good to have a strong relationship between project and product from a development perspective. “Upstream first” is a best practice, meaning that code changes preferably go into the upstream project first and subsequently flow downstream into the product.

It’s a best practice, in part, because it means less work. There’s less code to maintain that’s not part of the upstream. This isn’t always possible. The most vibrant and independent projects won’t always accept code that, for example, is specific to a particular customer’s product requirement. But companies that work effectively with and participate in upstream projects stand to have the most influence when it comes to decisions that are important to them.

What Support Means

However, many attributes are part of a product that aren’t necessarily in a project. Many assume that the main difference is that people pay for support in the case of a product. That’s not wrong; customers still call and email for support in the case of software products based on open source just as they do for software products that aren’t based on open source. But, to use Red Hat enterprise software subscriptions as an example, the product includes much more, including some things that are largely specific to open source software.

Support itself has broadened since the days when it meant picking up the phone to get a question answered.

When things go wrong in a production software environment, the ability to access the right information quickly can be the difference between a fast return to normal operations and a costly outage. And sometimes, the best support call is one you don’t need to make. For example, customers today like to search to locate articles, technical briefs, and product documentation that are most relevant to the problem at hand. System architects like to browse detailed technical case studies that engineers have designed, tested, and benchmarked.

Reducing Risk

A big part of what enterprise software companies are looking for is reduced risk. In addition to the support itself, this falls into several different areas.

The first is life cycle support. In a project it’s typical to have an unstable branch and a stable branch. At some point of development, the unstable branch is deemed stable, it replaces the current stable branch, and the cycle begins again. At that point, the prior stable branch is usually retired. It’s still available, but work on it is generally frozen. (This isn’t a hard and fast rule; sometimes developers will retrofit a fix for a particularly serious bug but, generally speaking, community projects tend to focus on the latest and greatest.)

However, enterprise customers often want long life cycles. This can mean five years, seven years, or even longer. Longer life cycles mean more choice and flexibility, reduced cost and risk, and more ease of planning. To meet this requirement, enterprise software companies “backport” fixes and feature enhancements into older versions of their software.

Certifications are also an important part of an enterprise software product. This includes certifying hardware, certifying software, and certifying providers like public clouds. These types of certifications are based on joint testing with partners, established business relationships, and other agreements to both reduce the number of potential issues and to have processes in place to resolve problems when they happen.

While well-run projects incorporate automated tests and other processes intended to reduce the number of bugs introduced into the code base, downstream products tend to have more robust quality assurance testing. For example, Red Hat’s program includes acceptance, functionality, regression, integration, and performance testing.

Other product features can include legal protections such as defending customers against intellectual property lawsuits or dealing with certain other legal issues.

The Intersection of Security and Risk

A final area of product assurance that is often top of mind today is security. Some aspects of open source product security are similar to those associated with enterprise software products generally. It’s increasingly important to have a dedicated team of engineers who proactively monitor, identify, and address potential risks. This lets them give accurate advice, quickly assess risk, and minimize business impact.

What’s different with open source is that developing software in collaboration with users from a range of industries, including government and financial services, provides valuable feedback that guides security-related discussions and product feature implementations. And, in the case of security vulnerabilities with broad impact, it allows engineers from multiple companies to develop fixes cooperatively.

Before leaving the topic of security though, it’s worth considering it more broadly. The intersection of open source and security still causes a lot of angst and misunderstanding.

Securing Open Source

This discussion requires teasing apart a couple of different concepts.

The first gets back to projects versus products. This is pretty straightforward even if, as noted earlier, it is still often a source of confusion.

Business as Usual: Patches and Advice

An open source software vendor needs to provide security patches and advice in much the same manner as any other software vendor—though the open source company has the added advantage of being able to collaborate closely with customers, partners, and other vendors.

Many projects also do an effective job of developing and making available security fixes and best practices—especially when fixes are pushed upstream quickly—but the process may take longer and be less formalized.

The second is open source security in the abstract. In other words, does access to source code by both the “good guys” and the “bad guys” help or hurt security? When people ask whether open source is more or less secure, this is what they are often talking about.

One can get a sense of the debate from two different statements attributed to the US Department of Homeland Security’s Luke McCormack in 2016. A week after his statement that opening up federal source code would be like giving the Mafia a “copy of all FBI system code” set off a minor firestorm, he walked it back to say, “Security through obscurity is not true security: we cannot depend on vulnerabilities not being exploited just because they have not been discovered yet.”

Does Code Help the Bad Guys?

The hurts-security side of the argument is rooted in physical analogies.

When many people think about security, they probably think about something like a home security system. Physical security systems typically do depend to at least some degree on “security through obscurity.” They may actively protect against a wide range of threats, but they probably also depend on the would-be burglar not knowing precisely what types of sensors there are, where they are placed, and how the property is being monitored.

The degree to which obscurity makes software more secure is controversial, and the answer probably boils down to “it depends” to some degree. As Daniel Miessler has noted, “It’s utterly sacrilegious to base the security of a cryptographic system on the secrecy of the algorithm.”5 At the same time, he argues that “The key determination for whether obscurity is good or bad reduces to whether it’s being used as a layer on top of good security, or as a replacement for it. The former is good. The latter is bad.”

Even the US National Institute of Standards and Technology (NIST) equivocates. On the one hand, it has stated that “System security should not depend on the secrecy of the implementation or its components.” However, it also recommends (in the same document) that, “For external-facing servers, reconfigure service banners not to report the server and OS type and version, if possible. (This deters novice attackers and some forms of malware, but it will not deter more skilled attackers from identifying the server and OS type.)”6

The general consensus seems to be that obscurity doesn’t help much (if any) and you shouldn’t depend on it in any case. Attacks mostly come through probing for weaknesses en masse. They exploit configuration errors, default passwords, and known vulnerabilities that haven’t been patched yet. Of course, any vulnerability that a white hat security researcher finds by combing through source code could be found by a black hat one first. But that’s not anything like a common pattern.

Or Is “Many Eyes” the Secret Sauce ?

At the same time, it’s not clear to what degree the common argument favoring open source for security reasons—“with many eyes, all bugs are shallow”—applies either. One problem is that many eyes can still miss things. The other is that not all projects have many eyes on them.

In early 2017, I sat down with then-CTO of the Linux Foundation Nicko Van Someren to talk about the Core Infrastructure Initiative (CII), a group set up in the wake of the Heartbleed bug, a security vulnerability that potentially affected about 70 percent of the world’s web servers.

In the case of Heartbleed specifically, the bug (discovered by a Google engineer) was quickly fixed, but it exposed the fact that a number of key infrastructure projects were underfunded. As Van Someren put it: “Probably trillions of dollars of business were being done in 2014 on OpenSSL [the software with the Heartbleed bug], and yet in 2013, they received 3,000 bucks worth of donations from industry to support the development of the project. This is quite common for the projects that are under the hood, not the glossy projects that everybody sees.”

He went on to say that “We try to support those projects with things like doing security audits where appropriate, by occasionally putting engineers directly on coding, often putting resources in architecture and security process to try to help them help themselves by giving them the tools they need to improve security outcomes. We're funding the development of new security testing tools. We're providing tools to help projects assess themselves against well-understood security practices that'll help give better outcomes. Then, when they don't meet all the criteria, help them achieve those criteria so that they can get better security outcomes.”

Thinking Differently About Risk

The best that we can probably say is that being open source makes software neither a particular hazard nor a panacea. And it’s probably not a very useful debate.

Patrick Heim, CISO of ClearSky Security, argued at the 2018 Open Source Leadership Summit that “Maybe [we] need to move beyond the argument of which is better. How do we live in this new world where there is more open source? We have to think slightly differently about how we manage risk.” Many organizations provide frameworks for thinking about cybersecurity. Figure 2-2 provides an example from NIST.
Figure 2-2

The National Institute of Standards and Technology (NIST) cybersecurity framework focuses on using business drivers to guide cybersecurity activities and considering cybersecurity risks as part of the organization’s risk management processes.

This includes a lot more automation. A lot more monitoring. A lot more understanding of a software supply chain that increasingly will include open source components. The license considerations I mentioned earlier are only a small part of this. Equally important, arguably more so, are questions such as whether a software component is being maintained, who is maintaining it and at what level of activity, and who is vetting it for trustworthiness.

And, ultimately, it’s about making risk-based decisions. That means having an honest conversation with the business about priorities and risk management.

Participating in Open Source Projects

Before moving on to how the open source software development process works in more detail—how communities work, how they’re governed, what the process looks like—let’s consider what getting involved in open source looks like.

We can approach the question of participation from a number of different angles. Here are three.
  • Starting an open source project associated with a current or planned software product;

  • Doubling down on an existing open source project;

  • Creating an open source program office to manage internal open source use and external contributions—including creating new projects.

There are common themes to these. One of those is that participating in open source is not a philanthropic endeavor, or at least doesn’t need to be. Rather, as the Linux Foundation’s Jim Zemlin puts it:
  • The epiphany that many companies have had over the last three to four years, in particular, has been, “Wow. If I have processes where I can bring code in, modify it for my purposes, and then, most importantly, share those changes back, those changes will be maintained over time. When I build my next project or a product, I should say, that project will be in line with, in a much more effective way, the products that I'm building. To get the value, it's not just consumed, it is to share back and that there's not some moral obligation (although I would argue that that's also important), there's an actual incredibly large business benefit to that as well.” The industry has gotten that, and that's a big change.

Another theme is that there’s no template. Principles and practices recur over and over in most successful open source projects, and many of them should be regarded as more than guidelines but less than an absolute set of rules. Still, while many open source projects fall into certain recurring patterns, each is different. Chris Aniszczyk calls it the Tolstoy principle for open source: “Each project is unhappy in its own unique way.” (By “unhappy,” read has singular concerns, needs, and challenges.)

Starting an Open Source Project

The lines between the developers of software and the consumers of software are increasingly blurred. So is the demarcation between those who make money from software and those who make money from doing other things like building widgets; companies increasingly deliver software services and experiences as well as physical things.

But it’s still useful to treat open source projects that are directly tied, now or in the future, to commercial software products as distinct from projects created by organizations with other purposes in mind.

One of the first questions to ask, though, is one of the most important. Do you need to start a new project? In some cases, yes. Many commercial software products start life as a new open source project. In other cases, there’s an existing commercial software product and the company or an acquirer of the company wants to start a project based on the product’s code. At Red Hat, for example, when we acquire companies and products, we routinely go through a process to make all components open source. In still other cases, a software company has a new engineering initiative that it wants to develop in open source, but there’s no existing project that’s a natural fit.

Some of the most successful open source projects have come about because companies and individuals decided to rally around an existing project rather than going off and doing their own thing. This sort of behavior is, after all, central to the open source development model. But let’s assume that you have good reasons to create a new project. What are some of the things you’ll want to think about next?

What success looks like for the community project is a good place to start. This should correlate with the reasons you have for creating a new project. What are you trying to accomplish? What are your business objectives?

Typically, the project should support the success of a product offering based on the project or otherwise support commercial objectives such as creating brand association for the sponsoring company. If, on the other hand, it’s a pure community project that’s not associated with a commercial product, it should be able to stand on its own as a viable open source community project.

From there, you can move on to considering project-specific needs for a community launch. These can include existing closed source code that you plan to open source; source code licensing; and how the project will be positioned relative to downstream products from marketing, trademark, naming, and other perspectives.

Will the project be a stand-alone project supported by a company or a group of companies or does it make more sense to become part of an existing foundation? Dan Kohn of the Cloud Native Computing Foundation (under the Linux Foundation) took a look at the 30 “highest velocity” open source projects—as measured by commits, pull requests, and issues.7

Nine were backed by foundations including Linux (Linux Foundation), three OpenStack projects (OpenStack Foundation), and Kubernetes (CNCF). Fourteen were backed by companies including Google, Facebook, Red Hat, Elastic, and Basecamp. Six projects were not mainly backed by one company or software foundation; their nature varied but a number were characterized by their value mostly coming from “recipes” individually contributed and updated by hundreds of independent contributors.

In other words, it’s quite a mix.

As with many aspects of open source communities and projects, the “right” approach is very situational. Foundations—whether created for a specific project or with broader scope—are often seen as a more neutral collaboration point than projects solely under the control of a single company. However, as seen above, they’re not a prerequisite for successful projects. Going without one just often takes more work to attract and retain outside contributors, and it raises the bar for demonstrating that processes for communicating, making decisions, and accepting contributions aren’t biased in favor of the main project sponsor.

In addition to where the project will live, there’s the question of how it will be governed. Some foundations, such as the Apache Software Foundation, have a fairly standard structure for projects. (At least ostensibly. The on-the-ground realities aren’t always so neat and clean.) Others, like the Cloud Native Computing Foundation, take a more flexible approach based on the considerations of individual projects. Still others, like the OpenStack Foundation, were explicitly started to meet the needs of a specific project and its related subprojects.

At a detailed level, there are many additional decisions. Where will the code be hosted? How will communications take place? How will a selected governance model be implemented at a detailed level? Who makes the ultimate decisions and who gets to commit code? How will you enable people to move from user to contributor? Is everything lumped together or will you create Special Interest Groups (SIGs)? Will you have a dedicated community manager? What’s your time line?

And, of course, what’s the project’s name and what does the logo look like? (Topics that always seem to consume an outsized portion of time.)

Finally, iterate, iterate, iterate.

Doubling Down on an Existing Open Source Project

Often, though, organizations can most effectively participate in open source development by joining an already existing community. Doing so can come with its own set of challenges if the community is dominated by companies or individuals with interests that run counter to your own. Nonetheless, connecting to an existing project is often easier, quicker, and more certain than trying to kickstart a new initiative.

Participation in open source communities doesn’t need to be about coding. For example, companies can help with testing in their specific environments, which will often be at higher scale points or otherwise different from configurations that can easily be tested in a lab. This sort of participation in software development by end users was commonplace even before open source software became so widespread. We typically referred to it as alpha or beta testing depending upon how early in the process you were involved.

Writing documentation and providing funding are also ways to get involved.

However, as longtime IBM Software Group head Steve Mills put it: “Code talks.” The Linux Foundation also argues that the greatest influence in open source projects is through the quality, quantity, and consistency of code contributions.8

Stormy Peters, senior manager for Community Leads at Red Hat observes that “For many larger projects, we know that most of our contributors are going to be people who work at companies that need to use projects like Ceph and Gluster. We have customers, and customers often contribute to software because they’re using it. We consider both the individual participation and the company participation as success stories.”

Some companies may participate in open source software development as a way to give back to the community. However, there are plenty of business justifications as well.

A big reason is to take advantage of the open source development process. While it’s often possible to use and modify open source code without making contributions, “forking” a project in this way “defeats the whole purpose in terms of collective value” in Jim Zemlin’s words. He adds that “You're now, basically, supporting your own proprietary fork. Whether or not it's open source, it doesn't matter at that point. No one understands it but you.” Microsoft’s Stephen Walli describes it as being “very expensive to live on a fork.”

Stormy Peters and Nithya Ruff, senior director of Open Source Practice at Comcast, add that “Companies that start fixing bugs or adding new features and functionality to an open source project without contributing them back into the upstream project quickly learn that upgrading later to add new features or apply security fixes can be a nightmare that drives maintenance costs through the roof. Contributing your changes back into the upstream project means that they will automatically be included in future updates without incurring additional maintenance costs.”9

Furthermore, coming back to Mills’s point that code talks, new features and functionality come from code contributions, and those contributions can influence the direction of the project. If you want the project to have specific functionality that you need, it may be up to you to implement those changes yourself.

That said, working with open source projects requires awareness of and sensitivity to the norms, expectations, and processes associated with the specific community. While your contributions, of whatever form, will often be welcomed, that doesn’t mean just showing up and throwing weight around. You need to be aligned with both the overall direction of the project and the way that the project and its associated community operate. A good approach is to have someone join the community and spend some time observing. Alternatively, you can hire someone who already has a proven track record of participation in the community.

Communities have different characters, ways of participating, and different channels for communication—which include mailing lists, forums, IRC, Slack, bug trackers, source code repositories, and more. These are useful for both ongoing communications and to understand how a community works before jumping in. “Lurk first” is what Peters and Ruff advise because “the more time you spend reading and listening, the more likely it is that your first contribution will be well received.”

They also counsel reading up on the project governance and leadership before contributing. Who makes the decisions for various types of contributions? Is there a contributor licensing agreement? What’s the process for contributing? What classes of contributors are there? I’ll get into the specifics of different governance and community models in the next chapter. But, at a high level, it’s important to appreciate that the rules of the road—both formal and informal—in one community may not apply to another.

Peters and Ruff also suggest starting small. For example, tackle a simple bug or documentation fix to start. It will be easier to learn the process and correct mistakes on a small contribution that isn’t critical to your organization’s needs. Companies can and do become major contributors to projects that they didn’t themselves start all the time, but ramping up participation is a process that owes as much to building up relationships at both the individual and overall community levels as it does to cranking out code.

Creating an Open Source Program Office

Increasingly, the logical next step for many organizations is to formalize their participation in open source by establishing an open source program office (OSPO).

Nithya Ruff and Duane O’Brien identify a number of common problems that tend to indicate establishing an OSPO could be useful.10 Several have to do with a general lack of institutional knowledge about participating in open source. Or maybe there’s some knowledge but it’s in isolated pockets; no one is quite sure where to go to get their questions answered. The legal team is getting overwhelmed with ad hoc requests, and it’s hard to get definitive answers about open source “compliance”—or indeed to fully understand what the pertinent issues even are.

More broadly, there may just be a lack of overall strategy. What projects should be of interest? What events or organizations would it benefit the company to sponsor? How do you establish a reputation in open source communities so you can better attract developers? (Participating in open source as a way to attract talent is a motivation that I hear more and more.)

As with open source communities themselves, there’s a lot of variation in OSPOs. Ruff and O’Brien identify 6 Cs: “Communicate, consume, contribute, collaborate, create competency, and comply.” However, they go on to note that some offices will be of narrower scope than others. For example, one office might be primarily focused on reducing risk to the company by ensuring that the use of open source code doesn’t create any compliance issues from a legal perspective.

A different office might be seeking to improve collaboration within the company by using techniques and workflows inspired by open source projects. Tim O'Reilly is credited with coining the term “inner sourcing” to describe this use of open source development techniques within the corporation.

In a pattern that’s probably becoming familiar in the context of open source communities and projects, the authors of the Linux Foundation’s “Creating an Open Source Program” write that “For every company, the role of the open source program office will likely be custom-configured, based on its business, products, and goals. There is no broad template for building an open source program that applies across all industries—or even across all companies in a single industry. That can make its creation a challenge, but you can learn lessons from other companies and bring them together to fit your own organization’s requirements.”11

The purpose will also influence where the office is hosted within a company. If there’s a strong focus on mitigating legal risk, the legal department might be a logical fit. If, on the other hand, the emphasis is more on participating in open source communities, that seems a better match with the CTO’s office or elsewhere in the engineering organization.

The leadership of the OSPO can be influenced by these considerations as well. If the motivations are largely driven by internal aspects such as improving collaboration, then it may be important to pick someone who knows how to navigate the twisty byways of a typical enterprise organization. If you’re more outwardly facing—perhaps you want to get involved in some existing open source projects that relate to your company’s strategic interest—you may be better off with someone who has experience and skills in open source communities and practices.

Ruff and O’Brien identify good traits for an OSPO leader as including the following: consensus building, some technical skills, project management, community roots, and presentation chops. It’s probably obvious from that list that OSPO leadership is heavily weighted to connecting, influencing, and persuading.

That’s because, as Jeff McAffer, director of the Open Source Programs Office at Microsoft, writes: “This is a culture change endeavor. The code is obviously a big part of it, but the community and the engagement is a people-to-people thing. If you’re going to start an open source program office and you’re going to try to make it a real thing, you’re going to need to understand the culture and get somebody in place who can help drive that culture to a new level. Your head of open source is really a change agent.”

The Water Is Fine

Open source licensing, compliance, and participation can be complicated topics. Understanding all the nuances of any of them can require years of study and experience (and you still won’t always get it right). But that’s true of most things.

In fact, developing an appreciation for the broad brushstrokes that make up open source software’s foundations, legal framework, and community model goes a long way toward being in a position to participate productively. You won’t know—and aren’t expected to know—everything at first. But you can start asking the right questions, finding the appropriate on-ramps, and realizing where you need to hire or develop expertise.

Those are good initial steps to taking advantage of the open source development model and starting to appropriately apply open source approaches to different aspects of your business.
