Chapter 7. The GNU Compiler for Java (gcj)

The GNU Compiler for Java provides a native binary compiler for Java code. In this chapter we’ll show you how to compile a simple binary application from Java sources.

What You Will Learn

You will learn how to compile a binary executable from Java source code using the gcj compiler.

A Brand GNU Way

Quite some time ago Richard Stallman started an effort to create a free version of UNIX called GNU[1] (which stands for GNU’s Not UNIX, a recursive acronym). More than that, he tried to convince the world that code should be Free with a capital “F”. By this, he meant that it was unreasonable to provide software without also providing both the source code and the right to use and modify that code as desired. To ensure this, he and his team created the GPL[2] (the GNU General Public License) and founded the Free Software Foundation[3] to foster development and promote the idea.

The story of the founding of GNU and the FSF, and the motivations behind it,[4] makes for fascinating reading. Even if you are not interested in Free Software, the story prompts you to think in new ways about software, property, and freedom. As interesting as this story is, it is not our topic. What matters here is how the quest to create a Free operating system led to a native Java compiler, and the twists and turns along the way.

The GNU Compiler Collection

If you are going to write a UNIX-like operating system, and one that is “Free” (certainly free of anyone else’s intellectual property which might be restricted from the Free Software point of view), the first thing you need is a C compiler. Thus, a great deal of early effort by the FSF went into developing what was originally called the GNU C Compiler, or gcc.

Once they had a C compiler, some people began to write hundreds of utilities from ls to grep, while others began work on the Hurd, a microkernel-based kernel for GNU. That work continues to this day. The bulk of the system commands you use on Linux were in fact developed by the FSF as part of the GNU project. This is why Stallman et al. want us all to refer to “GNU/Linux” rather than “Linux”.[5] An understandable, if unenforceable, position.

It wasn’t long before an effort began to integrate C++ into gcc. As time progressed, support was added for more and more languages and for more and more architectures.[6] At some point, it was decided to rename (reacronym?) gcc to mean “GNU Compiler Collection.”

Not too surprisingly, as Java emerged and gained popularity, it became one of the languages supported by the GCC using a front end called gcj.[7] That is what we’ll be talking about here.

Compiling Our Simple Application with gcj

The basic form of gcj is

gcj [options...] [codefile...] [@listfile...] [libraryfile...]

We’ll go over the options in a moment. For now, let’s talk about the various kinds of input files the compiler can process.

In the above command-line synopsis, codefile refers to a Java source file, a compiled .class file (yes, gcj can convert already compiled Java bytecodes into native binaries), or even a ZIP or JAR file. A filename prefixed with the at-sign, @, indicates that the file contains a list of filenames to be compiled. That’s the @listfile entry in the command synopsis. Finally, zero or more library files to link with may be specified on the command line. When you specify them directly (as opposed to using the -l command-line option) you must provide the full name of the library.

Like all the other Java compilers we have talked about so far, gcj supports the notion of a classpath. It will look in the classpath for unknown classes referenced by the classes you name to the compiler. Since gcj can read and compile from .class and .jar files, you might think you could just make sure that the JAR files from the Sun or IBM Java SDK are on the gcj classpath and you would be able to compile any Java program using any Java APIs. Alas, you would be wrong. Why? Because the Java APIs are full of native methods, and which methods are implemented in Java and which are native is not documented anywhere.

Even if this were not so, it is not permissible under the GPL to distribute binaries without also offering to distribute source code. So, to distribute the Sun or IBM API JAR files would be incompatible with the GPL, and to not distribute them but to depend on them would mean shipping a product that doesn’t work out of the box and requires users to obtain some non-Free software in order to work. That is just not acceptable. So the developers of gcj have opted to reimplement as much of the Java APIs as possible.

As you can probably guess if you have browsed the Java API Javadoc files, this is a monumental undertaking. The Java APIs are a moving target; they started huge and grow larger with every new release. There is a parallel project to gcj called GNU Classpath[8] which is attempting to implement the entire Java API. Its target for the 1.0 release is to be fully compatible with Java 1.1 and “largely compatible” with Java 1.2. You might want to look at that project for better API support than that provided by gcj’s libgcj.[9] If you are curious about the present status of libgcj’s implementation of the Java APIs, there is a Web page (frequently updated) that compares its status against the Java 1.4 packages.[10]

Compiling FetchURL with gcj

We’ll discuss gcj’s command-line switches in detail in Section 7.5, but we will have to use a couple of them here. First off, be aware that since gcj is actually part of gcc, all of the non-language-specific switches of that system also work in gcj; thus, -o specifies the name of the binary output file, and so on. There are many references on gcc to which you should refer for details (the manpage on gcc is a good place to start). Example 7.1 shows compiling and running FetchURL with gcj.

Tip

The source code for FetchURL can be found in Example 3.30.

Example 7.1. Compiling and running FetchURL with gcj

$ gcj -o furl --main=FetchURL FetchURL.java
$ ./furl http://www.multitool.net/pubkey.html
http://www.multitool.net/pubkey.html:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>Michael Schwarz's Public GPG key</TITLE>
</HEAD>
<BODY>
<CENTER>
<H1>Michael Schwarz's Public GPG Key</H1>
</CENTER>
<PRE>
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.0.7 (GNU/Linux)

mQGiBDuv6IQRBACn1TIWUXiEuZtfR+0Lqx6tYBAzIRpljL42O6r5nKHmndsWV71e
FUnhQpQIf+bNGGPMEt0g0vFpD6YWKP4uIEh2o+u1iyIIMs5QH3iqp8kFjbtVZa21
...
...
...
etc.

We already explained the -o switch which names the resulting binary. The other switch we use here is --main which specifies the class containing the main() that should be run when the binary is invoked. Remember that every Java class may contain a main(). In a multiclass program, the binary needs to know which main() to run when the binary is executed.
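To make the need for --main concrete, here is a minimal sketch of a program in which two classes each define main(). The class names Greeter and Farewell are hypothetical and are not part of FetchURL; they exist only to illustrate the point.

```java
// Hypothetical example: two classes in one program, each with its own main().
// A JVM is told which class to start on the command line. A native binary
// has a single entry point, so gcj's --main switch must choose one of them.
class Greeter {
    static String message() {
        return "Hello from Greeter";
    }

    public static void main(String[] args) {
        System.out.println(message());
    }
}

class Farewell {
    static String message() {
        return "Goodbye from Farewell";
    }

    public static void main(String[] args) {
        System.out.println(message());
    }
}
```

Assuming both classes are saved in a file named Greeter.java, compiling with --main=Greeter should yield a binary that prints the first message, and compiling with --main=Farewell one that prints the second.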

Remember that FetchURL is in the default package,[11] so you simply type the class name as the argument to --main. However, if the class is in a nondefault package, the fully qualified name must be used.

Compiling a Multiclass Program

For contrast, Example 7.2 shows compiling a multiclass program that is contained in a package (it is the Payback debt/savings/purchase calculator).[12]

Example 7.2. Compiling and running a multiclass program

$ cd payback/src
$ gcj -o payback -I. --main=net.multitool.Payback.Payback \
    net/multitool/Payback/Payback.java
$ ./payback
Payback -- A savings/credit comparison tool
Copyright (C) 2003 by Carl Albing and Michael Schwarz
Released under the GNU/GPL. Free Software.
...
...
...
etc.

The -I switch names a directory that is to be prepended to the classpath. In this case, we added “.” which is the source directory for the Payback program.[13] Notice the package elements expressed with dots for the --main argument, and with slashes for the filename argument.
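The relationship between the two spellings is mechanical, as the following sketch shows. The SourcePath class and its forClass() method are hypothetical helpers written for this illustration; they are not part of gcj or of Payback.

```java
// Sketch: derive the source file path given to gcj from the fully qualified
// class name given to --main. SourcePath is a hypothetical helper class.
class SourcePath {
    // Turn "net.multitool.Payback.Payback" into
    // "net/multitool/Payback/Payback.java" by swapping dots for slashes
    // and appending the source file extension.
    static String forClass(String qualifiedName) {
        return qualifiedName.replace('.', '/') + ".java";
    }

    public static void main(String[] args) {
        System.out.println(forClass("net.multitool.Payback.Payback"));
    }
}
```

The resulting path is relative to a directory on the classpath, which is why the example adds “.” with -I after changing into the source directory.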

Note

The gcj compiler does pick up and use the CLASSPATH environment variable if it is specified. Also, gcj has a number of switches besides -I for classpath manipulation. We won’t cover those here; -I is the preferred method (according to the gcj manpage at any rate).

Options and Switches

As we have said, gcj is part of the gcc suite of compilers and therefore supports all of the non-language-specific options and switches of that suite.

As with most reference material in this book, we will only cover the highlights. See the gcj manpage or the project’s Web site for full details.[14]

-Idirname

Add dirname to the classpath ahead of its existing contents.

-Dname[=value]

Add name and optional value to the system properties list. This is only valid with the --main switch.
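A program reads such a property the same way it would read one set with the JVM’s own -D option. The sketch below assumes a property named app.mode, which is a hypothetical name chosen only for this example.

```java
// Sketch: read a system property that may have been set when the binary was
// built with gcj's -D switch (or at launch time with -D on a JVM).
class ShowProperty {
    // Return the property's value, or the fallback when it is not set.
    static String lookup(String name, String fallback) {
        return System.getProperty(name, fallback);
    }

    public static void main(String[] args) {
        // "app.mode" is a hypothetical property name used for illustration.
        System.out.println("app.mode = " + lookup("app.mode", "(not set)"));
    }
}
```

Built with something like gcj -Dapp.mode=test --main=ShowProperty -o showprop ShowProperty.java, the binary should report the value test; built without the -D switch, it should report the fallback.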

--main

Specifies which class contains the application’s main(). This gives the starting point for an application.

-fno-bounds-check

Disable array bounds checking. Like “real” Java, gcj checks all array operations to ensure that array bounds are not exceeded. Using this switch disables that check. It speeds up array operations but can introduce subtle and hard-to-find bugs. Use at your own risk.
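The check in question is the one that raises ArrayIndexOutOfBoundsException. A minimal sketch of what it catches follows; the BoundsDemo class is hypothetical.

```java
// Sketch: the array bounds check that -fno-bounds-check removes.
class BoundsDemo {
    // Return true if reading data[index] is rejected by the bounds check.
    static boolean isRejected(int[] data, int index) {
        try {
            int unused = data[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] data = new int[3];
        System.out.println(isRejected(data, 1)); // index within the array
        System.out.println(isRejected(data, 3)); // index one past the end
    }
}
```

With the check in place, the second read is rejected with an exception. In a binary built with -fno-bounds-check that read is not checked, so what the program then does with memory outside the array is not something you can rely on.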

-fno-store-check

Like -fno-bounds-check, this disables a safety feature on arrays. Normally, when you store an object into an array, a check is made to make sure that the object is assignment-compatible with the array’s component type (in other words, that the object is an instance of the type the array was created to hold). If the check fails, an ArrayStoreException is thrown. Using this switch disables the test. It speeds up array operations but can introduce subtle and hard-to-find bugs. Use at your own risk.
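The store check matters because Java arrays are covariant: a String[] can be referred to through an Object[] variable, so the compiler alone cannot rule out a bad store. The following sketch shows the situation the runtime check guards against; the StoreCheckDemo class is hypothetical.

```java
// Sketch: the array store check that -fno-store-check removes.
class StoreCheckDemo {
    // Return true if storing value into array[0] is rejected at run time.
    static boolean isRejected(Object[] array, Object value) {
        try {
            array[0] = value;
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The static type is Object[], but the array was created to hold Strings.
        Object[] strings = new String[1];
        System.out.println(isRejected(strings, "text"));              // a String fits
        System.out.println(isRejected(strings, Integer.valueOf(42))); // an Integer does not
    }
}
```

With the check disabled, the second store would go through unnoticed, leaving a non-String object in an array that the rest of the program treats as a String[].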

There are other switches for native methods, bytecode (as opposed to native) compilation, and some switches related to resources. We leave it as an exercise for the reader to learn and use these where needed.

Reasons to Use gcj

You might think that speed would be the primary reason to use gcj, but this is not necessarily the case. Yes, gcj is usually used as a native code compiler (it can compile to Java bytecode as well, and thus can be used as a replacement for javac), but there is a lot more to Java performance than that. First off, both Sun’s and IBM’s JVMs have JIT (“Just-In-Time”) compilers in them, which convert some or all of a class’s bytecode to native code on the fly. In some cases, these compilers may do a better job than the gcj compiler. As a result, initial runs under a JVM are slower than gcj, but later loops or iterations are comparable or faster. Also, the performance of both gcj and JVM code is highly affected by memory, stack, and garbage-collection parameters, which may be modified with command-line options or properties files. So speed is not the determining factor. We have not done sufficient testing or measurement to tell you which environment produces “the fastest code” from a given source file. (We’re not even sure exactly what such “sufficient testing” might consist of. All we can suggest is that you try your code in all three environments and then make your own choice.)

It is, perhaps, ironic that one of the main reasons why you might wish to use gcj is portability. You see, you can only run Sun’s and IBM’s JVMs on platforms for which they provide a compiled version. Linux runs on several hardware platforms (such as StrongARM) for which Sun and/or IBM do not provide JVMs. Also, if you are running Linux on some architectures, there may be VMs for the “official” OS, but none for Linux on that architecture. This is the case, for example, for SPARC and Alpha. The cross-compilation that gcj inherits from the GNU Compiler Collection allows you to compile Java to native code for Linux on those platforms.

Another reason to use gcj might be a desire for better integration with code compiled from other languages. gcj has JNI support, but also provides its own inter-language integration system called CNI, for Compiled Native Interface. We don’t have space to cover CNI (and, frankly, we haven’t used it enough to be good judges), but its proponents claim that it is both easier to use and more efficient than JNI. You can read up, use it, and judge that for yourself.

Still another reason might be one that we don’t like very much. Again, it is ironic that the only Free Software Java compiler is the one best able to produce proprietary binary code. Code compiled with gcj is as difficult to reverse engineer as compiled C or C++ code. It is subject to the same sort of binary obfuscation as other native compiled code. If you need to make your code closed and proprietary, gcj may be the right tool for you. Naturally, we aren’t very fond of this idea, but it is still a reason one might choose the tool.

Finally, we mentioned that speed wasn’t a certain factor for choosing gcj, but there is an exception. So far,[15] Java is particularly slow at starting and shutting down virtual machines. If you have a Java program that is invoked on demand or in a loop and the VM is started and stopped on each invocation, then gcj will give you a huge speed improvement, even if the code executes at the same speed or slightly slower than the JIT JVM code.

Reasons Not to Use gcj

We can think of three reasons not to use gcj. First, the compiled binary will run only on the target platform, whereas a Java bytecode binary is portable to any Java runtime without modification or recompilation. Second, gcj is not definitive. Sun still “owns” Java and only Sun’s implementation can be presumed to be “correct.” Third, the gcj API classes are not complete. If you visit the API status page we mentioned earlier, you can see what is provided and what is not. If gcj lacks an API your application requires, then you can be sure gcj is not the tool for you.

Review

The GNU Compiler for Java is part of the GNU Compiler Collection. It is generally used to compile Java source code into native binaries. It provides many of Sun’s API classes, but not all.

What You Still Don’t Know

You do not know how to interface with C/C++ code using gcj. You do not know how to use SWT from Eclipse to write GUI apps with gcj.

Resources

There are a number of resources for gcj, including

  • The gcj home page.[16]

  • The gcj FAQ.[17]

  • The gcj documentation page.[18]

  • The JDK1.4 to libgcj comparison page.[19] This resource is particularly useful in deciding whether gcj is an appropriate tool for compiling your program.

  • Many features of gcj are, in fact, “inherited” from the parent project, the GNU Compiler Collection. You can find your way to a lot of good information from the GCC home page.[20]



[5] We understand and appreciate the view that we must always say “GNU/Linux,” but we do not bow to it. We say it sometimes, but it gets tedious and annoying if used all the time. So we compromise. We tell you about GNU, but we’ll usually say just “Linux” in the text.

[6] A lot of people do not realize this, but gcc is a cross-compiler. Precompiled binaries do not always support this, but if you build your compiler from source, you can use gcc to compile code for any supported platform. For example, you can compile a program for a PowerPC-based Macintosh on your Intel-based PC.

[9] The gcj and GNU Classpath projects are in the middle of an effort to merge their libraries into a common library. The GNU Classpath project aims to be a Free Software replacement for the JRE API JAR file. As such, it is meant to be a library of Java bytecodes that may be used as a drop-in replacement in any Java runtime environment. For our discussion, we will assume you are using libgcj as shipped with gcj itself.

[11] Any class without a package declaration is in the default package.

[12] Since this chapter was written, XML features were added to Payback that make it no longer work with gcj.

[13] The Payback code can be found at the book’s Web site: http://www.javalinuxbook.com/.

[15] Sun claims that Java 5.0 will show considerable improvement in VM initialization speed.
