Tip 14: Use the Source, Luke
[Brown Belt] This could be white belt for you in the right company; in others, you need to build credibility before bringing in outside software.

Open source software is an essential building block of modern systems. You probably learned to program using open source development tools. Your cell phone is likely built using an open source kernel. Start-ups are building their businesses around open source, and even the old-guard tech companies like IBM are investing heavily in open source projects.

Other companies won’t touch it. Open source presents a minefield of legal issues, most of which have never been tested in court.

Your company is probably somewhere in the middle, wanting to use a mix of open source and proprietary software, each to their best advantage. This gives you, as an individual programmer, several ways to build credibility and value within the company:

  • With an awareness of the legal issues surrounding open source, you can give management all the license information they need to make educated decisions and reduce their legal risk.

  • At the same time, you build their confidence that you won’t get the company into legal trouble or give away the company’s proprietary code by accident.

  • By contributing improvements to open source projects, you reduce the company’s ongoing code maintenance burden and build cred in the community.

  • Many open source projects have quality standards that rival the best proprietary code. You’ll learn a lot by playing at that level.

The focus of this tip is twofold. First you need grounding in the legal side so you don’t get into trouble. Then we’ll discuss workflow for a project that integrates open source software with a proprietary product.

Proprietary vs. Open

When a company chooses to keep its source code to itself—or said another way, they restrict others from using it—that’s proprietary code. The company keeps the source code a secret, and users of the software get only compiled code.

Assuming you have a traditional employment contract, all code you write for the company is owned wholly by the company. Treat it as proprietary unless you’re specifically told otherwise. Some companies have employment contracts that cover all code you write for the duration of your employment, even stuff done on your own time and with your own computer.

Open source code, on the other hand, is obviously posted in the open—but there are some less-obvious qualifications. Only public domain code is treated as having no owner; in other words, the person who wrote it formally gave up any rights of ownership.

Most open source code has a copyright, which is held by an individual or a company. Code should have a comment block at the top of each file that states the copyright holder. It’ll look something like this (from FreeBSD):

 
/*
 * Copyright (c) 1989, 1993, 1994
 * The Regents of the University of California.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms,
 * with or without modification, are permitted provided
 * that the following conditions are met:
 * [...more here...]
 */

That means the copyright of the file is owned by UC, which has the exclusive right to determine the rules for how the file can be copied (or otherwise used). Immediately following are the rules they’ve chosen, known as the file’s license.
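When you audit a component, it helps to pull the copyright holder out of each file header rather than reading every file by hand. The following is a minimal sketch, assuming headers shaped like the FreeBSD one above; the function name `copyright_notice` and the regular expressions are illustrative choices, and real projects use many other header layouts that this will miss.

```python
import re

# Header text modeled on the FreeBSD example above.
HEADER = """\
/*
 * Copyright (c) 1989, 1993, 1994
 * The Regents of the University of California.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms,
 * with or without modification, are permitted provided
 * that the following conditions are met:
 */
"""


def copyright_notice(source):
    """Return (years, holder) from the first copyright line, or None.

    Hypothetical helper: it only understands 'Copyright (c) <years>'
    followed by the holder on the same line or on the next line.
    """
    # Strip C-style comment decoration ("/*", " * ", " */") from each line.
    lines = [re.sub(r"^\s*(/\*+|\*+/|\*)?\s*", "", ln)
             for ln in source.splitlines()]
    for i, line in enumerate(lines):
        m = re.match(r"Copyright \(c\) ([\d,\s\-]+)(.*)", line, re.IGNORECASE)
        if m:
            years = [y.strip() for y in m.group(1).split(",") if y.strip()]
            holder = m.group(2).strip()
            # Some headers put the holder on the line after the years.
            if not holder and i + 1 < len(lines):
                holder = lines[i + 1].strip()
            return years, holder
    return None


if __name__ == "__main__":
    print(copyright_notice(HEADER))
```

A file with no recognizable notice returns `None`, which is itself worth flagging: code with no stated copyright or license is not automatically free to use.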

(Note: you’ll also hear the term copyleft,[29] but that’s not actually a form of copyright—it’s a philosophy of licensing.)

Licenses

Specific licenses change over time, and interpretations of licenses change as well. Many have not been tested in court. Therefore, I can’t give specific advice—you’ll need to consult your management, and possibly legal department, to determine which licenses are acceptable to your company.

For any license, you’ll want to answer questions such as these:

  • If you change any files covered by the license, does the license require that you openly publish your changes?

  • If you add your own features in new files, are there any requirements that you make those changes public?

  • If the licensed code contains any patented technologies, do you get a license to those patents?

  • Does the license require you to put a copyright notice in your product or its documentation?

Fortunately, there are common licenses that are used by many open source projects. If you’re looking to use a dozen open source components in your project, you may need to research only three or four licenses.
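To see how a dozen components can collapse into a handful of licenses to research, you can group an inventory by license. This is only a sketch: the component names and their license assignments below are made up for illustration, and `group_by_license` is a hypothetical helper, not part of any tool.

```python
from collections import defaultdict

# Hypothetical inventory: component name -> license identifier.
# Neither the names nor the assignments describe real projects.
COMPONENTS = {
    "xml-parser": "MIT",
    "http-client": "Apache-2.0",
    "json-codec": "MIT",
    "logging-lib": "Apache-2.0",
    "compression": "BSD-3-Clause",
    "crypto-wrappers": "Apache-2.0",
    "template-engine": "MIT",
    "cli-options": "BSD-3-Clause",
    "date-utils": "MIT",
    "db-driver": "LGPL-2.1-or-later",
    "test-framework": "MIT",
    "image-codec": "BSD-3-Clause",
}


def group_by_license(components):
    """Map each license to the alphabetized list of components using it."""
    groups = defaultdict(list)
    for name, license_id in sorted(components.items()):
        groups[license_id].append(name)
    return dict(groups)


if __name__ == "__main__":
    for license_id, names in sorted(group_by_license(COMPONENTS).items()):
        print(f"{license_id}: {', '.join(names)}")
```

In this made-up inventory, twelve components reduce to four licenses, so the conversation with management or the legal department covers four documents rather than twelve.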

The GNU General Public License (GPL) is especially problematic in commercial projects: it is generally understood to require that, when you distribute a product, all code linked to GPL code also be licensed under the GPL. A company may not be willing to open source its own proprietary code under the GPL. You need to be very careful about how you use GPL code; many companies avoid the issue with a “no GPL code anywhere” policy.

Note, however, that the GNU Lesser General Public License (LGPL) is similar but lessens the restrictions on other code that links to LGPL code. For example, the GNU C Library (glibc) is LGPL, so a program that links to glibc in the usual way, as a shared library, does not have to be released under the LGPL itself.

Licenses such as Apache, MIT, and BSD are more permissive. You can usually integrate code using these licenses into your own products without much trouble. The lawyers will still need to approve it, of course, but it’s a much easier discussion than GPL.

Now that we have some flags on the legal minefield, let’s discuss workflow.

Tracking Upstream Projects

Say you need an XML parser for your company’s Ruby-based product, and the built-in one doesn’t do the job. You find an open source XML parser, and it looks perfect—even the license.

You happily download the current version (let’s say it’s 1.0), write your code, and check the whole ball of wax into version control. Great, problem solved…for today. A month later, you run into a bug and discover it’s already fixed in the latest version (1.2). So, you download the latest and then discover, oh no, you’ve customized some things in the old version; just shoving in the new version will wipe out your changes. Now you need to merge.

The problem here is that you can only do a two-way merge: you have your changed version based on 1.0, plus the new version 1.2. Your merge tool only knows where lines differ; it can’t tell which side introduced each difference. The burden is entirely on you to figure it out.

Figure 6. Tracking external code with a vendor branch

Your version control system can help if you use it correctly. For the basics, see Tip 13, Control Time (and Timelines). The key for tracking external code is to create a vendor branch that always tracks the upstream code exactly as it comes from the open source project.

Figure 6, Tracking external code with a vendor branch, shows how things should look. Now when you merge your changes (1.0a) with the upstream changes (1.2), the version control system can do a three-way merge between these two and their common ancestor (1.0). In many cases, the tool can do a totally hands-off merge, saving you a bunch of time. It’s much less error-prone than manual merging, too.
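To make the value of the common ancestor concrete, here is a deliberately simplified three-way merge. It assumes all three versions have the same number of lines, which real merge tools do not require (they align the files first and handle insertions and deletions). The function `merge3` and the sample file contents are hypothetical.

```python
def merge3(base, ours, theirs):
    """Line-by-line three-way merge for line-aligned files.

    Returns (merged_lines, conflict_indexes). A simplified sketch,
    not how a real version control system implements merging.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles line-aligned files")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            merged.append(o)   # both sides agree, or neither changed
        elif o == b:
            merged.append(t)   # only upstream changed this line
        elif t == b:
            merged.append(o)   # only we changed this line
        else:
            merged.append(o)   # both changed differently: keep ours, flag it
            conflicts.append(i)
    return merged, conflicts


# Made-up contents for the three versions in the vendor-branch picture.
BASE_1_0 = ["parse(doc)", "timeout = 30", "retries = 1"]
OURS_1_0A = ["parse(doc)", "timeout = 60", "retries = 1"]
UPSTREAM_1_2 = ["parse(doc, strict=True)", "timeout = 30", "retries = 1"]

if __name__ == "__main__":
    merged, conflicts = merge3(BASE_1_0, OURS_1_0A, UPSTREAM_1_2)
    print(merged)
    print("conflicts at lines:", conflicts)
```

With only `OURS_1_0A` and `UPSTREAM_1_2` in hand, the first two lines simply differ and nothing says which side to keep. Comparing each against `BASE_1_0` shows that upstream changed the first line and you changed the second, so both changes can be kept without a human deciding.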

Contributing to Open Source Projects

So far, we’ve been concerned with pulling in open source components. What about pushing changes back out? Say you find a bug, fix it for your own use, and want to push it back to the community. Sounds like a no-brainer, but your company may treat all of your work as proprietary. You’ll need to get management’s permission first.

Then it’s time to prepare your change. The checklist will depend on the project, but assume that you’ll need to write a detailed change description, demonstrate its quality based on the project’s standards, and ensure the change compiles and runs on other target machines (where applicable).

Then it’s time to submit. The mechanics depend on the project, but they usually look like this:

  • Generate a patch set and email it to a project mailing list. Use your version control tool to generate the patch. The project maintainers will consider the patch, and if they like it, they’ll commit it to the project’s repository.

  • For projects using a hosted version control system (like GitHub[30]), you’ll want to fork the project repository, apply your changes, and then open a pull request for the project maintainers. This is a more automated version of emailing patches but accomplishes the same thing.

  • You may be granted commit privileges to the project’s source code repository, allowing you to submit changes directly. You’ll need to establish a solid track record first.
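If you have never looked inside a patch, it is just a textual description of which lines to remove and which to add. In practice your version control tool generates it, as noted above; the sketch below uses Python's standard `difflib` module only to show the unified diff format such tools emit. The file path and contents are made up, and `make_patch` is a hypothetical helper.

```python
import difflib


def make_patch(old_text, new_text, path):
    """Return a unified diff that turns old_text into new_text."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=f"a/{path}",   # "a/" and "b/" prefixes mimic common tool output
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Made-up before and after contents for a hypothetical source file.
OLD = "def parse(doc)\n  read(doc)\nend\n"
NEW = "def parse(doc)\n  read(doc.strip)\nend\n"

if __name__ == "__main__":
    print(make_patch(OLD, NEW, "lib/parser.rb"), end="")
```

Lines starting with `-` are removed and lines starting with `+` are added; the unchanged lines around them give maintainers context when they review the change.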

Project maintainers want contributions, and they’ll encourage and help you get changes in, but they may reject a change you submit. It could be a quality issue—treat this just like a code review in your day job. Or your change may not fit with their long-term plans.

You can choose whether you want to adapt your code to the project’s desires or just keep your change in your own repository. Where possible, use this as an opportunity to learn from the project’s maintainers. (Also, a record of open source contributions looks great on your resume.)

When you get a change into a project, you may get bug reports. You’ll need to investigate them and submit fixes, again just like your day job. On the plus side, that’s a bug your company could have hit, too.

Actions

Pick an open source project of your liking, and then do the following:

  • Find its license and answer the questions from Licenses.

  • Make your own copy of the project in a way that you can track updates and also maintain your own changes. With GitHub, this is as simple as cloning and creating your own branch—so give that a shot. Other projects may require a bit more work to create the vendor branch and sync upstream changes.

  • Investigate the process for submitting a change to the project. (Bonus points: look at the project’s bug list, fix one, and submit it.)
