CHAPTER 16

Teamwork

Successful projects start small and are developed by a small, coherent group of people. They then grow into large projects where nobody controls everything. People no longer know each other, nor do they understand the areas of code their colleagues are working on. Architects no longer touch the code. They rise a few levels above the keyboard to discuss the necessary changes in the next version of their product while playing golf with existing or future customers. In this setup, it’s easy to completely lose track of what is going on. In fact, that is almost inevitable: nobody can understand the whole system, and cluelessness gradually increases. Despite that, it’s necessary to produce a new release even in such a loosely organized system. There is only one way to do so effectively: selectively choose the aspects of the project that are important and concentrate on those. To that end, this chapter outlines a few recipes that work reasonably well in the NetBeans project.

Organizing Reviews Before Committing Code

As soon as you start to develop as a group consisting not only of people sitting next to each other in the same room but also external contributors, you need procedures to follow when making API changes. Any set of steps is better than nothing at all. However, these steps should include a peer review. Moreover, our experience tells us that the most important thing is to ensure proper review of changes before they are integrated. Only before integration is the integrator’s willingness to listen to requested changes high enough to actually apply them.

Of course, this is not the only reason why there should be a sequence of steps to follow when performing an API change. Especially if you are working on an open source project, it’s in your interest to “live in a fishbowl” and to have answers ready for external contributors who want to make some kind of change. Properly documenting the steps needed to integrate an API change gives you a perfect and simple answer for everyone requesting one.

The other reason you probably don’t want API changes to “just happen” is that you need to distinguish intended API changes from unwanted ones. You don’t want your team to mistakenly integrate incompatible changes, as those hurt users of your library immediately. However, you also don’t want your team to make unintended, albeit compatible, API changes, as those might hurt you in the future. After a few releases, you might find a few new classes or methods, added some years ago, that compromise the evolution of your API. Now you are not able to change the code in the way you thought would be possible.

DESIGNED BY A TEAM

As I’ve mentioned many times, in the early days I designed all the NetBeans classes that were meant to be part of the public API. In that situation, you need no formal processes and no rules at all. All these things can simply be in your head. You know the purpose of each element and whether a particular change is aligned with the strategy given to each API. It’s easy to maintain consistency. However, this only works when you are the only designer on the project.

Then one day a NetBeans engineer was supposed to change an API for manipulating the toolbars in the main window of the NetBeans IDE. I didn’t consider it an important enough problem to work on, so I let him do it. That immediately broke the working style I was used to when designing an API. I could have sat with him and watched every change he performed on his computer, but as I had other things to do, he worked alone and simply delivered the result.

The first problem was that I didn’t even notice the API change. There was no “big brother” to watch over and notify others of such changes. When I noticed, I didn’t like the change. However, I couldn’t formulate the reasons for my dislike—the change had happened a long time before I started to write this book. Moreover, I noticed the change after a release. Given our strong adherence to backward compatibility, we couldn’t remove it. We simply had to live with it.

I can’t remember the exact problem in this case. Maybe a class in the API exposed implementation details by implementing MouseListener. Maybe it exposed too many internal fields that later had to be hidden with a few hacks. The point is that as soon as you allow a team, instead of a single person, to change your APIs, you need to organize and observe the work that is being done on them.

Of course, not every API change deserves the same attention. Sometimes it’s necessary to simply supervise the API and ensure that changes are “harmless.” The most important aspects to watch for include the ability to allow future evolution, the proper versioning of the change, some kind of consistency with the rest of the library, and a reasonable amount of documentation and test coverage. This is usually enough for small and compatible changes such as adding new methods or classes into an already existing framework. Also, it’s usually enough just to prepare the changes, and just before the commit, ask for a review by providing the patch that you want to integrate. Usually the change is so small that the patch is self-descriptive. As open source developers like to say, “Show me the code.” That is the only thing that counts anyway.

STANDARD VS. FAST TRACK

NetBeans reviews are open to submitters as well as reviewers. Everyone can participate in them. The review process is described at http://openide.netbeans.org/tutorial/reviews/ and happens mostly inside our bug-tracking system. Each review request is tracked as an issue, and the progress and results of the review are recorded as changes to that issue.

For small changes we have a fast track review that is based on an optimistic locking strategy. Submitters prepare the diff of the API change. They put the diff into the issue and wait a week. If there are no objections, the change is applied. We’ve found this to be a nice compromise between the right to review and the need to integrate the changes. A week is long enough for anyone interested in the problem to find a free moment for review. However, a week is not so long that the submitter’s work is blocked in a significant way. A nice additional aspect is that often the proposed change is slightly different from the one integrated: during the week the submitter does a self-review, rethinks the proposal some more, and fine-tunes it himself. This means that the final integration is better than it would have been without the review, and that is often the case even if no reviewer adds comments at all.

The other review type is called standard, and it consists of two rounds. The first takes place before any coding is done, the second before integration. In each of these rounds, there is a discussion in our bug-tracking system or on a dedicated wiki page. The discussion is then followed by a conference call with at least four selected reviewers who need to “vote,” meaning that they create a list of their objections. Then either the project is killed or it is allowed to continue under the condition that the objections are resolved.

However, there can be far larger changes. For these, this kind of review might not be sufficient at all. If you want to create a whole new API and still be consistent with the rest of the framework, it doesn’t make sense to do the review method by method. Reviewing at a time when the methods are already written is much too late. You need to start the review when the concept of the library is being created, as only then do you have a real chance to influence the direction of the whole effort. Once code is written, nobody is happy with suggestions such as “base your API on JavaCC,” “use ANTLR,” and so on. It’s important to convince developers to ask for reviews as early as possible. This is hard, especially if the developers consider you to be someone who wants to block or slow their progress. This attitude is common among those who have never participated in a review before, as they have an inaccurate expectation of what an architecture review can do for them.

MY FIRST ARCHITECTURE REVIEW

A team is expected to ask for the opinion of Sun’s global architecture review committee before a project is released to customers. I don’t know the exact history of this process; I expect it was started a long time ago to supervise changes in the Solaris operating system and SPARC processors. At least, that is the impression I got when learning its methodologies, and especially when reading its “20 questions” document. It’s a living document that, at the time I first came across it, contained a few hundred Unix-oriented questions. It didn’t even mention Java at that time. As you might have guessed, it’s continually evolving and now probably contains a set of Java-related questions.

After Sun acquired NetBeans, we produced a few releases before someone realized that we hadn’t passed through this review process yet. I was asked to prepare and pass the architecture review. You can imagine how upset I was! Someone who probably doesn’t even know how to spell Java is going to tell me, the one who designed NetBeans, something about architecture! How dare they! What kind of advice could I expect from people who knew nothing about NetBeans?

The guidelines were not oriented toward Java. People leading the reviews didn’t know much about Java. However, as I slowly crawled through the “20 questions” document and answered their questions, I began to see the light. Before that, I belonged to the camp that believed architecture and APIs end at the level of classes and their methods. I realized the error of my thinking. Based on that experience, I wrote Chapter 3. I realized that, to do architecture and API reviews, you don’t need to be an expert in the API’s domain at all. Many general aspects can be reviewed without any need to understand the actual tasks that the API is supposed to perform. Does that sound familiar? Yes, indeed; I’ve argued this in Part 1 of this book as well. In fact, I can say that without being asked to pass through the architecture review, I would not have been able to write this book. I hadn’t expected that there would be many new technical skills I would learn while working for Sun. However, the architecture review process is more valuable than anything else I’ve yet come across at Sun. Therefore, I am thankful to Sun for teaching me how to perform effective architecture reviews.

One thing that scared me when writing documentation for architecture reviews was the fact that I would write them and then lose them after the review. I hate to perform useless work, which is exactly what this task seemed to be. I decided to change the way we do API reviews in NetBeans to make our documentation directly reusable for architecture reviews. We are able not only to pass the review, but also to provide much more detailed documentation to developers using the NetBeans Platform. I open sourced the architecture review process. I changed its name, changed various technical terms, and I rewrote the “20 questions” document to be more NetBeans and Java specific (see the questions at http://openide.netbeans.org/tutorial/questions.html). However, I kept the structure untouched. I kept this an open secret for a few years. However, now that all of Sun seems to be quickly moving toward an open source style of development, I am no longer afraid to disclose the whole history behind the creation of the NetBeans Open Source Review Process.

Feel free to learn more about this kind of review by reading the NetBeans API Review instructions. I can only recommend them, as in short, I’ve found architecture reviews the best thing under the Sun!

Whatever style of work you decide on for your own project, it should include some form of review before committing, at least for architectural changes. My experience suggests that this is the only effective tool you have when coordinating API contributions. Moreover, having the process open to external contributors can decrease your cost of ownership, as discussed in the section “Minimizing Maintenance Cost” in Chapter 14.

Convincing Developers to Document Their API

Anyone who has ever been a developer knows how boring it is to write documentation. Everyone who has ever tried to convince a developer to write documentation knows how hard that is. As a result, the general feeling is that developers neither like to write documentation for their own code nor are good at it. However, good documentation is the facade that can shield API users from the internal details, and thus let them program with increased cluelessness. Also, documentation is the main communication channel between the author of the API and its users. The better the documentation, the less effort is needed to understand and use a library. That is why having good documentation is a must for any successful framework.

However, a developer’s attitude usually goes along the following lines: “The code is there, it works, so use it!” They, or at least I, feel that writing documentation duplicates the work that has already been done. As anyone can guess, doing the same or similar work a second time is boring. That is probably the reason why developers are negative when it comes to writing documentation for their code. Also, when they are forced to write documentation, it’s rarely of good quality. Even if they really try, they cannot write a good introduction, as they see their own API with completely unique eyes and can no longer pretend to be newcomers. Even if they try to write a good overview, the low-level details often stick out. They are able to say, “The purpose of this library is to draw an image,” which is a good overview sentence. However, engineers are immediately tempted to add a description of how to do it; that is, to show the code right away, down to the deepest level of detail, which is unlikely to be useful for newcomers reading the overview in the hope of getting a feel for the library’s functionality.

This is all true, yet I believe it’s possible to help developers write good documentation. It just needs a well-thought-out methodology. One of the most promising methodologies is called working backwards. I saw a note about that on a friend’s blog. As soon as I read it I thought of it as something that could actually work. However, as there is often a long road between thought and action, it took more than a year before I was able to try it out myself. When I did, I was surprised at the result.

The basic idea is simple. Instead of starting with code or a technical proposal, let’s start with the last thing that is needed, such as a press release. Write the goals of the project, pretend you are proud of your achievements, and describe what they are. Then follow with a frequently asked questions (FAQ) document. Here you can go into deeper details and explain some details of the technology that are too low-level for the press release. After that, follow with the advice given in the section “The Importance of Being Use Case Oriented” in Chapter 4: use the FAQ to describe the use cases, then include the actual code and its Javadoc.

There are several reasons why this “inversion of workflow” can prove successful. At the time of writing the initial press release, you don’t yet have enough technical knowledge to stuff the release with technical details. This requires a bit of self-discipline because every developer wants to code as soon as possible. So you need to defer that joy and write down your own thoughts before jumping to the keyboard to code. Though this needs a bit of self-control, it’s not at all unnatural because you must have the overall picture of the project in mind before starting to type the code.

THE FEAR IS MUTUAL

Developers are not the only ones afraid of writing documentation: the technical writers, in turn, seem to be afraid to change the Javadoc! Here I am going to pick on Geertjan Wielenga, one of this book’s beloved editors. Without him this book would never have seen the light of day, as my lab journal was not written in colloquial English and desperately needed many additional edits. (I am looking forward to seeing how much, if any, of this sidebar will be left after his language fixes, which in the case of this sidebar is just a euphemism for censorship!)

Geertjan loves to write documentation about the NetBeans Platform. He is a famous blogger, maintains the NetBeans Platform web site, and feeds it with content consisting of tutorials, interviews, stories, and so on. Clearly he is an essential part of the whole documentation effort. With just a little exaggeration, I can say that without him few people would be using the NetBeans Platform today.

On the other hand, his work is just a part of the documentation our project offers. It’s the initial part, used by newcomers. However, in addition to that, we have the documentation written by our API writers; that is, the developers. This includes the Javadoc, use cases, and so on. Although these kinds of documentation are of different complexities, they address the same issue: documenting the NetBeans Platform usage. In spite of that, there is little to no connection between these types of documentation. That is a shame, because they are all targeted at our users, and our users want to have as much information as possible. They want to have this information constantly at the ready. When coding in the IDE and using code completion, wouldn’t it be useful to jump from there straight to a tutorial related to the usage of the current method? Indeed it would be, and the only required task is to enhance the Javadoc method with a pointer to a related tutorial.

However, we are facing a typical us-versus-them problem. Developers don’t know about tutorials, so they cannot add a link to them when writing the Javadoc. Moreover, the tutorials are usually written later than the Javadoc. From another perspective, Geertjan seems to be afraid of changing the Javadoc himself. In various discussions, I’ve tried to convince him that this is needed, but so far I’ve failed. So, while developers are afraid of writing documentation, documenters are afraid of changing anything written by developers. As a result, we have two sets of documentation, written more or less independently.

I am sure this situation is not good, but I am unsure what to do about it. Maybe we should apply a “working from roots” methodology and not let anyone write a tutorial unless there is a link to it from the Javadoc. That might fix the situation and remove Geertjan’s fear of editing the Javadoc. However, I still need to find out whether that would work.

Moreover, one of the reasons why developers don’t like to write documentation is that it’s forgotten as soon as they finish the code. That is not always true, but it’s true often enough to instill a reluctance to write documentation at all. The “working backwards” style of work prevents this. The press release will be needed when the project is finished, if not directly, then at least as a source of information for the marketing department. The FAQs are something to be loved by end users, or at least by technical writers working on tutorials. The need for the use cases and the Javadoc is clear. That is why the likelihood of convincing engineers to write good docs increases: they are not doing anything useless, nor duplicating something that has already been done before.

However, the order of things is important. Start with the press release and move step by step to the code. Without this order, the methodology doesn’t work. Recently I wanted to write a “press release” for a finished project and it was painful. The result was not good at all. I started with the overview. However, in the third paragraph I suddenly included a shell command line as an example of how to use the project. This was not a press release at all! However, this is what happens to engineers: they want to share as much useful information as possible, which prevents them from writing press releases after the product is finished. The only point at which they will succeed in this is before they start coding.

I have to admit that I’ve enjoyed this style of work every time I’ve tried it. In the end, writing press releases and FAQs was a lot of fun. However, I still have to see whether this can be applied to everybody. I hope it can be seen as an entertaining opportunity that is, at the same time, generally applicable in many situations.

Big Brother Never Sleeps

Mistakes happen. However good you are, however good your team is, it’s certain that from time to time someone will make a mistake, accidentally modifying an API. If your team is not perfect, or if it incorporates new members occasionally, you need to watch over them. You need to verify that they are not making any overwhelmingly strong promises to your API users. If they do make promises, they must mean them and be able to deliver on them. Of course, the last thing you want to do is to sit beside them and watch every step they take. Instead, you want an automatic system that will notify you of changes in the important characteristics of your API.

The goal for a “big brother” system is clear. Now the question is: what are the important characteristics to check and how often should you check them? Obviously, everything that you check on any regular project needs to be checked when working with APIs and libraries. It’s valuable to have an automated build that runs after each commit, or at least a daily build. It’s important to verify that your code can still be compiled. It’s desirable to run automated tests with it as well. It might be interesting to know the test coverage of your library, as that might identify places that are not tested at all. It might also be interesting to be notified of commits that integrate changes to the library that decrease the test coverage and thus might destabilize the library in the future.

WHEN THERE ARE ENOUGH TESTS

While writing tests, developers often ask how many of them should be written. The simple answer is to write tests whenever they are useful. The more precise and less definitive answer follows:

Various tools out there help to measure test coverage. We’ve selected EMMA (http://emma.sourceforge.net) for measuring the coverage of our application code by our tests. For example, when invoked from the pop-up menu of a project from NetBeans.org, it instruments the application code and invokes automated tests on it. While running, it collects information about all the called methods, visited classes, and lines, and then it shows a summary in a web browser.

Counting coverage by visited methods is not a very demanding criterion. Even so, it can be surprisingly difficult to get close to 100 percent coverage. And even if you succeed, there is no guarantee that the resulting application code will work correctly. Every method has a few input parameters. Knowing that testing succeeded with one subset of them says nothing about the other cases.

It’s much better to count the coverage by branches or lines. When there is an if (…) { x(); } else { y(); } statement in a method’s code, you want to be sure that both methods, x and y, will be called. The EMMA tool supports this need. By helping us to be sure that every line is visited, it gives us confidence that our application code doesn’t contain useless lines.
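
As a toy illustration (my own sketch, not code from the NetBeans code base), both branches of the following method are visited only when it is called with at least two suitably different inputs:

public class Classify {
    static String classify(int n) {
        if (n >= 0) {
            return "non-negative";
        } else {
            return "negative";
        }
    }

    public static void main(String[] args) {
        // One call visits the if branch, the other the else branch;
        // either call alone would leave a line unvisited.
        System.out.println(classify(5));
        System.out.println(classify(-5));
    }
}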

Still, the fact that a line is visited once doesn’t mean that our application code is not buggy.

public class Sum {
    private int sum = 10;

    /** Adds x to the accumulated state and returns the new total. */
    public int add(int x) {
        sum += x;
        return sum;
    }

    /** Computes a percentage of the accumulated state; note the division. */
    public int percentageFrom(int howMuch) {
        return 100 * howMuch / sum;
    }
}

It’s good if both methods get executed and fine if we test them with various parameters. Still, we can get an error if we call add(-10); percentageFrom(5), because the sum will then be zero and division by zero is forbidden. To be sure that our application is not vulnerable to problems like this, we would have to test each method in each possible state of the memory it depends on, meaning each value of the sum variable. That would give us the ultimate proof that our application code works correctly in a single-threaded environment.
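
To make this concrete, here is a minimal JUnit test, in the JUnit 3 style, that drives the Sum class above into the broken state (the test class name is made up for this sketch):

import junit.framework.TestCase;

public class SumTest extends TestCase {
    public void testPercentageAfterSumDropsToZero() {
        Sum s = new Sum();
        s.add(-10);                  // internal sum goes from 10 to 0
        try {
            s.percentageFrom(5);     // evaluates 100 * 5 / 0
            fail("ArithmeticException expected");
        } catch (ArithmeticException ex) {
            // expected: the state-dependent division by zero
        }
    }
}

Note that a plain line-coverage report would have been perfectly happy with a single passing call to each method; only this combination of calls reveals the bug.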

But there is another problem: Java is not single-threaded. Many applications start new threads by themselves. Even if they don’t do so, there is also the AWT event dispatch thread, the finalizer thread, and so on. You need to count on some amount of nondeterminism. Sometimes the garbage collector simply kicks in and removes “unneeded” objects from memory, which can change the application’s behavior. We used to have a never-ending loop, which could be simulated only if two Mozilla browsers and an Evolution client were running, because then the memory was small enough to invoke the garbage collector. This kind of coverage is immeasurable.

That is why we suggest that people use code coverage tools as a sanity check that something is not under-tested. However, you need to remind yourself that there might still be bugs in an application, no matter how high the code coverage is. To help fight the strange shifts of an application’s amoeba shape, we suggest writing a test whenever something breaks. When there is a bug report, write a test to verify it and prevent regressions. That way, coverage will be focused on the code that matters: the parts that were actually broken.

In addition to these regular tests, your library needs a lot of special care. It’s likely that you want to publish a Javadoc with each build and make sure it doesn’t contain broken links. Moreover, you might be influenced by this book’s opinion that the Javadoc is not enough. In such cases, you probably want to find a way to describe other APIs and annotate them in a way that will be easily recognizable to your API’s users. Potentially, you also want to assign your API stability categories, as discussed in the section “Life Cycle of an API” in Chapter 4. Last but not least, you want to notify users of your API about the latest changes that have been made. For all these reasons, you might want to employ the NetBeans Javadoc extensions.

NETBEANS JAVADOC EXTENSIONS

To follow the principles of this book, we needed to enhance the standard Javadoc generation tool. We needed it to provide support for proper versioning and an easier, standard way of drilling down from top-level use cases to individual APIs. I believe that everyone building a library or framework following the advice described in this book needs a tool like this. So, feel free to use this set of Ant build scripts and XSL transformations, as they are not NetBeans-specific and can be used independently.

The Javadoc is built around the modified “20 questions” document, which contains various intrinsic questions that should guide every module writer to realize the various APIs exposed by the library. For example, there are questions about reading system properties, using reflection to access nonpublic classes, reading files, and opening sockets. The goal is not to force the developer to write “not applicable”; rather, whenever such an API really is used, it should be documented. Best of all, it should be documented in a categorizable way, one that can be processed later.

In the NetBeans project, the name of the file containing the answers is arch.xml. The text of arch.xml is mostly in HTML format. However, there are two important extensions. Either you can use the <api> element, which marks an API, or you can use the <usecase> element to describe an introductory use case. If you use these tags, then their content is taken and inserted into the overview page of the Javadoc. As a result, the main entry point page of an API, such as the Task List API,1 contains the overview use cases that can navigate the user to the appropriate classes in the standard Javadoc, as well as a table listing all properties, reflection, and files that form an important API. This is enough to get a first feel of what an API is supposed to do. However, if there is a need to know more, the API user can always read the whole “20 questions” document.

This is by no means all that is interesting to users of an API. They are also interested in knowing what new features are available in the API. This is especially important in the case of libraries with a rapidly evolving design where a few new additions appear several times a month. To deliver this overview, we have another XML file called apidesign.xml. It contains API change sets. Whenever we add a new method or class into an existing API, we increase the specification version of its module and also add a change set. The change set contains a human-readable description of what is going on, links to the changed elements in the original Javadoc, the specification version the change happened in, and a reference to our bug tracking system for those who want to search the absolute history of the API’s introduction. This is usually more detail than is needed. The link usually refers to the details of the API review with all comments, replies, and considerations, but sometimes all this is needed, as it forms the ultimate document, potentially explaining the motivations behind the final look and feel of the API.

Our build scripts take the five latest changes and put them into the overview page. The rest is then available on a separate page, where the whole history of the evolution of the API is kept.

The API change sets can be annotated as being source compatible, binary compatible, semantically compatible, or none of these. This helps us point out changes that deserve API users’ attention when migrating to a new version. However, it’s sometimes not easy to decide whether a change is compatible or not. We still haven’t decided whether to aim for 100 percent compatibility or simply 99 percent. The fields allow us to specify only “yes” or “no,” which is probably correct, but what should we say when a change is 99 percent compatible instead of 100 percent? This varies. As far as I know, we usually annotate such a change as compatible but add a note describing the cases in which compatibility might not be fully preserved.

If you want to learn more about our Javadoc enhancements, feel free to point your browser to http://javadoc.apidesign.org.

Documenting changes is a nice thing. However, on its own it’s not enough. You need to track the API itself. First and foremost, this means watching for changes in the signature of public and protected members of public classes. You want to watch for two types of changes. The most important, from the standpoint of clueless assembly of libraries, is to watch for backward-incompatible changes. As soon as you release a new version of your library, you should take a snapshot of its API and then, with every new build, compare the actual state of the API with that snapshot. Nothing should be missing and there should be no incompatible changes.

The other aspect to check is that there are no accidental changes. As pointed out in the section “Organizing Reviews Before Committing Code,” the last thing you want is to find that random accidental API changes have sneaked into your API. If they do sneak in at the wrong moment, you’ll have to sustain them in future releases in a compatible way. That is why you need to generate the API snapshot on a daily basis, compare it against the one from the previous day, and make sure that the differences are sensible.
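
The sidebar below describes the actual tool NetBeans uses. Conceptually, though, even a tiny reflection-based sketch like the following (a simplification added here for illustration, not the real signature test tool) captures the idea: dump the accessible API surface as sorted plain text, store one run as the snapshot, and diff each new build’s output against it:

import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.TreeSet;

public class ApiSnapshot {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName(args[0]);
        // TreeSet keeps the output sorted, so two runs can be compared
        // with a plain textual diff.
        TreeSet<String> lines = new TreeSet<String>();
        for (Method m : clazz.getDeclaredMethods()) {
            int mod = m.getModifiers();
            if (Modifier.isPublic(mod) || Modifier.isProtected(mod)) {
                lines.add(Modifier.toString(mod) + " "
                    + m.getReturnType().getName() + " " + m.getName()
                    + Arrays.toString(m.getParameterTypes()));
            }
        }
        for (String line : lines) {
            System.out.println(line); // diff this against the stored snapshot
        }
    }
}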

NETBEANS API SIGNATURE TEST TOOL

The NetBeans tool is based on the signature test tool provided as part of the Java Community Process. With a few additions to make it fit into our Ant build scripts and understand module versioning, we are able to take snapshots of an API and report both incompatible changes against a snapshot and, if desired, any change against it at all. If you want to know more about the tool, please visit http://sigtest.apidesign.org.

Signatures are not the only exposed API. You need to check other important aspects. Generally, the layout of various files is of great interest. In the case of modular systems such as NetBeans, it’s reasonable to watch over dependencies between individual components because newly added dependencies can restrict the ability of that component to be picked up and separately assembled into some kind of application.

VERIFICATION FRAMEWORK

NetBeans has a special framework for testing these additional APIs as part of the build. The general idea is to have a special Ant task that is configurable using its setters and that can generate plain text output. This output can then be compared using regular text tools to determine whether there are differences. The input of the task is a set of module JAR files, from where the task reads its dependency information. Remember that all the dependencies are stored inside JAR manifests in NetBeans.

The task is configurable to produce various output formats and to write them into a given output file. The possible outputs include a list of all public packages, which can then be passed to the signature processing tool; a list of all modules and their versions, to notify us that a new module is added; and a list of all modules, along with their dependencies.
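
A minimal sketch of that last mode might look like the following. This is an illustration of the idea, not the actual NetBeans Ant task; NetBeans modules declare their dependencies in the OpenIDE-Module-Module-Dependencies manifest attribute:

import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class PrintModuleDeps {
    public static void main(String[] args) throws Exception {
        // Print each module's declared dependencies in a stable,
        // plain-text form that can be diffed against a golden file.
        for (String path : args) {
            JarFile jar = new JarFile(path);
            try {
                Manifest mf = jar.getManifest();
                String deps = mf == null ? null : mf.getMainAttributes()
                    .getValue("OpenIDE-Module-Module-Dependencies");
                System.out.println(path + ": "
                    + (deps == null ? "<no dependencies>" : deps));
            } finally {
                jar.close();
            }
        }
    }
}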

We used to have a mode where, to allow each developer to test their own changes, the “golden” results were stored in version control along with the source files. The build failed when they didn’t match. This completely shielded us from unintentional changes. However, we discovered that in many cases the changes to the golden files weren’t of great interest. That is why, right now, we run these checks only on the build machine. The golden files generated by previous builds are compared with the new ones. If there is a difference, an e-mail is sent to all the developers. This way, everyone interested is notified about the changes made during the day and can check that no unwanted dependency has been added.

I am not sure which of the modes is better. The one that caused the build to fail required too much attention from the developers. The current mode just lets them integrate changes without any kind of self-control. The optimal solution probably lies somewhere in the middle. If we could specify unwanted dependencies, such as “the text editor should not depend on the compiler” or “HTML support should not require Java support,” these would likely be good candidates for breaking the build. We’ll see if we find that useful in the future.

It’s not important whether you find inspiration in the tools that the NetBeans project uses or whether you choose to develop your own. However, it’s important to create some kind of “big brother” system that watches over your developers. Only then can you successfully design an API in a group and not be surprised by the final result.

An alternative is to have a human “gatekeeper” who receives proposed patches and either accepts or rejects them. That might indeed work too. However, it’s unlikely to scale well enough. Also, it’s not in the spirit of this book to suggest a solution that requires a person who “knows everything.” The nirvana we seek is to build our solution around maximum cluelessness, where human knowledge is minimized. This is more likely to be achieved when you have good tools and tests to control the general aspects of an API, rather than a single gatekeeper.

Accepting API Patches

The section “Minimizing Maintenance Cost” in Chapter 14 claimed that maintaining a well-written API is no more difficult than maintaining regular code—sometimes even simpler. However, this depends on the quality of the contributions received from your API’s users. If they simply complain, or request changes in your library API that don’t make much sense or are not well-justified, you won’t simplify your life much. If you accept bad suggestions or contributions, future maintenance can become a nightmare. The way to prevent that outcome is only to accept good suggestions and good patches.

GO MULTITHREADED!

Separating good from bad suggestions can sometimes be difficult. For example, the MPlayer project, which provides an open source movie player and encoder written in C, was advised to take advantage of multicore processors and rewrite its code to be multithreaded. This looks like a reasonable piece of advice. With two cores, the compression speed could be doubled. Also, these days almost everyone has a multicore processor; if not today, then almost everyone will have one in the near future. Still, the project refused to follow that advice. No wonder, because even the single-threaded version of MPlayer had a hard time avoiding crashes. With multithreaded code, this would have been even more difficult. In my opinion, the team was right to refuse to start exploiting multiple cores, even though they risked a potential fork of their project. That’s exactly what happened: fans of multithreading created their own project, forking the MPlayer code base. As far as I can tell, after a few years, the original MPlayer is still alive, while I cannot find any information about the fork.

As usual, a good decision is one that reduces time to market and lowers cost of ownership. Nevertheless, if you have an API and someone asks you to make a change, you are usually not motivated by time to market. Your API is already on the market. Unless the requested feature is something that everybody wants, you don’t need to hurry. The only concern is the total cost of ownership. If you accept the patch, you’ll become its owner. Potentially that will mean that you’ll spend too much time or energy maintaining it.

What could be the source of this risk? You might accept a buggy patch, one that doesn’t live up to its promise. This risk can never be fully eliminated. However, if you ask for reasonable test coverage of the code lines changed by the patch, you can be confident that at least the basic functionality is acceptable. The next problem is that the fix can cause regressions. Assuming the rest of your library is properly tested as well, this risk can be reasonably contained. However, if the patch submitter is the first person to write a test for your library, the risk that the integration breaks some already existing assumption might be significant. This is yet another reason for having reasonable code coverage.

On the other hand, regressions are not the only danger you need to eliminate. The patch might prevent future evolution. Therefore, you need to review it before integration and measure it against the suggestions found in this book, as well as any other suggestions that help guarantee a simple evolution path into the future. Another bad thing might be an underdocumented patch. Again, it’s simple to request further documentation or to ask the submitter to provide more high-level use cases. This requirement can be partially replaced by good test coverage, because automated tests are often the best form of documentation. The last aspect to pay attention to is proper versioning. In modular systems with version numbers, you should increase the specification version with every API change and document the new behavior available since that version.

The best approach to accepting patches is to review them before committing them. This is easy when you get a patch from an external contributor. However, “review before commit” is the best way to ensure a proper response from the provider of the patch even if it’s your closest teammate. Until the change is integrated, the motivation to address your concerns is magnified by the need to have the patch accepted. After integration, the contributor’s will to fix additional problems declines steadily as time passes since the patch’s acceptance. This means that the only suitable time for minimizing the maintenance cost is before a patch is integrated. All the problems found later are likely to end up on the plate of the maintainer of the library, rather than the submitter of the patch.

IT’S SO NICE TO ACCEPT PATCHES!

As a maintainer of parts of an open source framework, I can say that I am always pleased to receive good patches from others. Sometimes I feel that the requirements I impose on them are too high and that it’s not easy to fulfill all of them. Sometimes I am even in doubt as to whether to accept something that is not properly tested or that is underdocumented. However, I am aware of the potential problems of maintaining buggy patches into the future. Usually I request an improved patch. Either I then receive the improvement or I don’t. If not, that probably means I was right, as the submitter probably wouldn’t do a good job providing support for the API change later.

From the point of view of the user of a library, you might ask whether it makes sense to contribute patches at all, especially if the barrier for acceptance is high. It might seem better to fork the library or to work around the problem on your own side. Of course, this depends on the overall situation. However, the problems with forking or working around a bug are clear: you increase your own cost of ownership. You’ve already had to find the problem, identify it, and understand its fix in the library you are using, all of which are costly processes. If, instead of donating the fix to the project providing the library, you keep the code change to yourself, you’ll need to remember it in the future: with every new version of the library, you’ll have to apply the patch again and verify that your workaround still works. That is not easy, as the provider of the library doesn’t even know that you’ve forked it. This is increasingly costly in the long run—certainly more costly than spending a bit of time fulfilling the project’s requirements for a proper patch. Fulfilling those requirements is something you need to do only once, after which you can let the project itself handle all future maintenance problems related to your donated API patch.

Each open source project has different requirements for accepting API patches. Generally, however, it’s all about balance. Patches of poor quality need to be rejected because they carry hidden maintenance costs. On the other hand, you want to make it easy for users of an API to donate patches. It should be easier to donate a patch than to invest in inventing custom workarounds and forks. The final decision on the approach to take lies with the user of the API. However, the more widespread an open source project with a good API becomes, the more likely it is that users will understand the strategic long-term advantage of contributing patches. For everyone concerned, taking this approach is the best way to minimize the overall costs of ownership.

________________

1.    http://bits.netbeans.org/6.0/javadoc/org-netbeans-spi-tasklist/overview-summary.html
