Chapter 4. Collecting Runtime Information

Most applications offer ways for users to direct or alter runtime behavior. Two of the most common approaches are:

Accepting command line arguments and options. This approach is often used for information that can reasonably change on each application invocation. For example, the host name to connect to during an FTP (file transfer protocol) or TELNET session is usually different each time the command is run.

Reading configuration files. Configuration files usually hold site- or user-specific information that doesn't often change or that should be remembered between application invocations. For example, an installation script may store file system locations to read from or record log files to. The configuration information may indicate which TCP or UDP (user datagram protocol) ports a server should listen on or whether to enable various logging severity levels.

Any information that can reasonably change at runtime should be made available to an application at runtime, not built into the application itself. This allows the information to change without having to rebuild and redistribute the application. In this chapter, we look at the following ACE classes that help in this effort:

ACE_Get_Opt: to access command line arguments and options

ACE_Configuration: to manipulate configuration information on all platforms using the ACE_Configuration_Heap class and, for the Windows registry, the ACE_Configuration_Win32Registry class

4.1 Command Line Arguments and ACE_Get_Opt

ACE_Get_Opt is ACE's primary class for command line argument processing. This class is an iterator for parsing a counted vector of arguments, such as those passed on a program's command line via argc/argv. POSIX developers will recognize ACE_Get_Opt's functionality because it is a C++ wrapper facade for the standard POSIX getopt() function. Unlike getopt(), however, each instance of ACE_Get_Opt maintains its own state, so it can be used reentrantly. In addition, ACE_Get_Opt is easier to use than getopt(), as the option definition string and argument vector are passed only once to the constructor rather than to each iterator call.

ACE_Get_Opt can parse two kinds of options:

  1. Short, single-character options, which begin with a single dash ('-')
  2. Long options, which begin with a double dash ('--')

For example, the following code implements command line handling for a program that offers command line option -f, which takes an argument—the name of a configuration file—and an equivalent long option --config:


static const ACE_TCHAR options[] = ACE_TEXT (":f:");
ACE_Get_Opt cmd_opts (argc, argv, options);
if (cmd_opts.long_option
    (ACE_TEXT ("config"), 'f', ACE_Get_Opt::ARG_REQUIRED) == -1)
  return -1;
int option;
ACE_TCHAR config_file[MAXPATHLEN];
ACE_OS_String::strcpy (config_file, ACE_TEXT ("HAStatus.conf"));
while ((option = cmd_opts ()) != EOF)
  switch (option) {
  case 'f':
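    ACE_OS_String::strncpy (config_file,
                            cmd_opts.opt_arg (),
                            MAXPATHLEN - 1);
    // strncpy does not terminate the string if the source fills the buffer.
    config_file[MAXPATHLEN - 1] = '\0';
    break;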
  case ':':
    ACE_ERROR_RETURN
      ((LM_ERROR, ACE_TEXT ("-%c requires an argument\n"),
        cmd_opts.opt_opt ()), -1);
  default:
    ACE_ERROR_RETURN
      ((LM_ERROR, ACE_TEXT ("Parse error.\n")), -1);
  }

This example uses the cmd_opts object to extract the command line arguments. It illustrates the two steps needed to process a command line:

  1. Define the valid options. To define short options, build a character string containing all valid option letters. A colon following an option letter means that the option requires an argument. In the preceding example, -f requires an argument. Use a double colon if the argument is optional. To add equivalent long options, use the long_option() method to equate a long option string with one of the short options. Our example equates the --config option with -f.
  2. Use operator() to iterate through the command line options. It returns the short option character when a short option is found and the equivalent short option character when a long option is processed. The option's argument is accessed via the opt_arg() method. The operator() method returns EOF when all the options have been processed.

ACE_Get_Opt keeps track of where in the argument vector it is when processing the argv elements. When it finds an option that takes an argument, ACE_Get_Opt takes the option's argument from the remaining characters of the current argv element or the next argv element as needed. Optional arguments, however, must be in the same element as the option character. The behavior when a required argument is missing depends on the first character of the short options definition string. If it is a ':', as in our example, operator() returns a ':' when a required argument is missing. Otherwise, it returns '?'.

Short options that don't take arguments can be grouped together on the command line after the leading -, but in that case, only the last short option in the group can take an argument. A '?' is returned if the short option is not recognized.
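The short option rules described above can be sketched in a few lines of standard C++. MiniGetOpt below is a hypothetical toy class written only for this illustration; it is not ACE_Get_Opt and does not use ACE. It assumes that parsing stops at the first nonoption element, and it leaves out optional ("::") arguments and long options. Like ACE_Get_Opt, each instance keeps its own state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy short-option iterator (hypothetical; NOT ACE_Get_Opt).
class MiniGetOpt
{
public:
  enum { END = -1 };  // stands in for EOF

  MiniGetOpt (std::vector<std::string> args, std::string optstring)
    : args_ (std::move (args)),
      opts_ (std::move (optstring)),
      colon_mode_ (!opts_.empty () && opts_[0] == ':')
  {}

  // Returns the option character, ':' or '?' on error, or END when done.
  int next ()
  {
    if (pos_ == 0)  // begin a new argument vector element
      {
        if (index_ >= args_.size ())
          return END;
        const std::string &a = args_[index_];
        if (a.size () < 2 || a[0] != '-')
          return END;  // nonoption element: stop
        if (a == "--")
          {
            ++index_;  // "--" ends the option section
            return END;
          }
        pos_ = 1;
      }
    const std::string &elem = args_[index_];
    assert (pos_ < elem.size ());
    const char c = elem[pos_++];
    const bool at_end = pos_ >= elem.size ();
    last_ = c;
    arg_.clear ();

    const std::size_t where =
      (c == ':') ? std::string::npos : opts_.find (c);
    if (where == std::string::npos)
      {
        if (at_end) advance ();
        return '?';  // unrecognized option
      }
    const bool needs_arg =
      where + 1 < opts_.size () && opts_[where + 1] == ':';
    if (!needs_arg)
      {
        if (at_end) advance ();
        return c;  // grouped options continue in the same element
      }
    if (!at_end)  // argument is the rest of this element: -fname
      {
        arg_ = elem.substr (pos_);
        advance ();
        return c;
      }
    advance ();
    if (index_ < args_.size ())  // argument is the next element: -f name
      {
        arg_ = args_[index_++];
        return c;
      }
    return colon_mode_ ? ':' : '?';  // required argument is missing
  }

  const std::string &opt_arg () const { return arg_; }
  char opt_opt () const { return last_; }
  std::size_t opt_ind () const { return index_; }

private:
  void advance () { ++index_; pos_ = 0; }

  std::vector<std::string> args_;
  std::string opts_;
  bool colon_mode_;
  std::size_t index_ = 0;  // current element
  std::size_t pos_ = 0;    // position inside a grouped element
  char last_ = 0;
  std::string arg_;
};
```

With the options string ":abf:", the single element -abfx.conf yields 'a', 'b', and then 'f' with the argument x.conf, which matches the grouping rule: only the last option in a group can take an argument.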

Because short options are defined as integers, long options that wouldn't normally have a meaningful short option equivalent can designate nonalphanumeric values for the corresponding short option. These nonalphanumeric values cannot appear in the argument list or in the short options definition string but can be returned and processed efficiently in a switch statement. The following two lines of code could be added to the previous example. They illustrate two ways to register a long option without a corresponding short option:


cmd_opts.long_option (ACE_TEXT ("cool_option"));
cmd_opts.long_option (ACE_TEXT ("the_answer"), 42);

The first call to long_option() adds a long option --cool_option that will cause a 0 to be returned from operator() if --cool_option is specified on the command line. The second is similar but specifies that the integer value 42 will be returned from operator() when --the_answer is found on the command line. The following shows the additions that would be made to the switch block in the earlier example:


case 0:
  ACE_DEBUG ((LM_DEBUG, ACE_TEXT ("Yes, very cool.\n")));
  break;

case 42:
  ACE_DEBUG ((LM_DEBUG, ACE_TEXT ("the_answer is 42\n")));
  break;

When the user supplies long options on the command line, each one can be abbreviated as long as the abbreviation is unambiguous. Therefore, --cool_option could be shortened to as little as --coo. (Anything shorter would also match --config.)
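The abbreviation rule amounts to an unambiguous-prefix lookup. The following stand-alone sketch shows one way to express it; match_long_option and its NO_MATCH and AMBIGUOUS results are hypothetical names for this illustration, not part of ACE. It also assumes that an exact match wins over longer names that share the same prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum { NO_MATCH = -1, AMBIGUOUS = -2 };

// Returns the index of the registered long option selected by `typed`
// (given without the leading "--"), or NO_MATCH, or AMBIGUOUS.
// Assumption: an exact match always wins over a prefix match.
int match_long_option (const std::vector<std::string> &registered,
                       const std::string &typed)
{
  if (typed.empty ())
    return NO_MATCH;

  int found = NO_MATCH;
  for (std::size_t i = 0; i < registered.size (); ++i)
    {
      const std::string &name = registered[i];
      if (name == typed)
        return static_cast<int> (i);
      // Is `typed` a prefix of this registered name?
      if (name.compare (0, typed.size (), typed) == 0)
        found = (found == NO_MATCH) ? static_cast<int> (i) : AMBIGUOUS;
    }
  return found;
}
```

With the three long options registered in this section (config, cool_option, and the_answer), "coo" selects cool_option, whereas "co" is reported as ambiguous.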

An argv element of -- signifies the end of the option section, and operator() returns EOF. If the opt_ind() method returns a value that's less than the number of command line elements (argc), some elements haven't been parsed.

That's the basic use case; however, ACE_Get_Opt can do a lot more.

4.1.1 Altering ACE_Get_Opt's Behavior

The ACE_Get_Opt class's extended capabilities are accessed by specifying values for the defaulted arguments in the constructor. The complete signature for the constructor is:


ACE_Get_Opt (int argc,
             ACE_TCHAR **argv,
             const ACE_TCHAR *optstring,
             int skip_args = 1,
             int report_errors = 0,
             int ordering = PERMUTE_ARGS,
             int long_only = 0);

Start Parsing at an Arbitrary Index

ACE_Get_Opt can be directed to start processing the argument vector at an arbitrary point specified by the skip_args parameter. The default value is 1, which causes ACE_Get_Opt to skip argv[0]—traditionally, the program name—when parsing a command line passed to main(). When ACE_Get_Opt is used to parse options received when initializing a dynamic service (see Chapter 19), skip_args is often specified as 0, because arguments passed to services initialized via the ACE Service Configurator framework start in argv[0]. The skip_args parameter can also be set to any other value that's less than the value of argc to skip previously processed arguments or arguments that are already known.

Report Errors while Parsing

By default, ACE_Get_Opt is silent about parsing errors; it simply returns the appropriate value from operator(), allowing your application to handle and report errors in the most sensible way. If, however, you'd rather have ACE_Get_Opt display an error message when it detects an error in the specified argument vector, the constructor's report_errors argument should be nonzero. In this case, ACE_Get_Opt will use ACE_ERROR with the LM_ERROR severity to report the error. See Chapter 3 for a discussion of ACE's logging facility, including the ACE_ERROR macro.

Alternative Long Option Specification

If "W;" is included in the options definitions string, ACE_Get_Opt treats -W as if the next command line element is preceded by --. For example, -W foo will be parsed the same as --foo. This can be useful when manipulating argument vectors to change parameters into long options by inserting an element with -W instead of inserting -- on an existing element.
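The equivalence between -W foo and --foo can be pictured as a rewrite of the argument vector. The sketch below is only an illustration of that equivalence: expand_W is a hypothetical helper, and nothing here implies that ACE_Get_Opt modifies the vector this way internally. For simplicity, it treats every -W element as the marker, even one that a real parser would consume as another option's argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Rewrites each "-W name" pair as the single element "--name".
// Hypothetical helper for illustration only.
std::vector<std::string> expand_W (const std::vector<std::string> &args)
{
  std::vector<std::string> out;
  for (std::size_t i = 0; i < args.size (); ++i)
    {
      if (args[i] == "-W" && i + 1 < args.size ())
        out.push_back ("--" + args[++i]);  // consume the following element
      else
        out.push_back (args[i]);
    }
  return out;
}
```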

Long Options Only

If the long_only parameter to the ACE_Get_Opt constructor is nonzero, command line tokens that begin with a single - are checked as long options. For example, in the first example in Section 4.1, if the long_only argument were set to 1, the user could type either --config or -config.

4.1.2 Understanding Argument Ordering

Some applications require you to specify all options at the beginning of the command line, whereas others allow you to mix options and other nonoption tokens, such as file names. ACE_Get_Opt supports both styles through ordering modes, which are defined as enumerators in the ACE_Get_Opt class. One of the following values can be passed as the constructor's ordering parameter:

ACE_Get_Opt::PERMUTE_ARGS. As the argument vector is parsed, the elements are dynamically rearranged so that those with valid options—and their arguments—appear at the front of the argument vector, in their original relative ordering. Nonoption elements, placed after the option elements, can be processed by another part of your system or as known nonoptions, such as file names. When operator() returns EOF to indicate the end of options, opt_ind() returns the index to the first nonoption element in the argument vector. This is the default ordering mode.

ACE_Get_Opt::REQUIRE_ORDER. The argument vector is not reordered, and all options and their arguments must be at the front of the argument vector. If a nonoption element is encountered, operator() returns EOF; opt_ind() returns the index of the nonoption element.

ACE_Get_Opt::RETURN_IN_ORDER. The argument vector is not reordered. Any nonoption element causes operator() to return 1, and the actual element is accessible via the opt_arg() method. This mode is useful for situations in which options and other arguments can be specified in any order and in which the relative ordering makes a difference. In this situation, it may be useful to parse options, examine nonoptions, and continue parsing after the nonoptions, using the skip_args argument to specify the new starting point.
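The differences among the three modes are easiest to see on a small argument vector. The following stand-alone sketch models them under a strong simplification: every element that begins with - is treated as an option, and no option takes a separate argument. The function names are hypothetical, the vector is assumed not to include the program name, and the code does not use ACE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplification: "-x" style elements are options; everything else is not.
bool is_option (const std::string &s)
{
  return s.size () > 1 && s[0] == '-';
}

// PERMUTE_ARGS-like: options move to the front, keeping their relative
// order. Returns the index of the first nonoption element.
std::size_t permute_args (std::vector<std::string> &args)
{
  auto first_nonoption =
    std::stable_partition (args.begin (), args.end (), is_option);
  return static_cast<std::size_t> (first_nonoption - args.begin ());
}

// REQUIRE_ORDER-like: nothing moves; parsing stops at the first nonoption.
std::size_t require_order (const std::vector<std::string> &args)
{
  auto it = std::find_if_not (args.begin (), args.end (), is_option);
  return static_cast<std::size_t> (it - args.begin ());
}

// RETURN_IN_ORDER-like: nothing moves; each nonoption is reported as 1,
// each option as its option character.
std::vector<int> return_in_order (const std::vector<std::string> &args)
{
  std::vector<int> events;
  for (const std::string &a : args)
    events.push_back (is_option (a) ? a[1] : 1);
  return events;
}
```

For the vector -a file1 -b file2, the permuting mode rearranges the elements to -a -b file1 file2 and reports index 2, the strict mode stops at index 1, and the in-order mode reports 'a', 1, 'b', 1.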

As mentioned, the argument ordering can be changed by specifying an enumerator for the ACE_Get_Opt constructor's ordering parameter. However, the argument ordering can also be changed by using two other mechanisms. Specifying a value for the constructor takes least precedence. The other two methods both override the constructor value and are listed here in increasing order of precedence.

  1. If the POSIXLY_CORRECT environment variable is set, the ordering mode is set to REQUIRE_ORDER.
  2. If a + or - character is at the beginning of the options string, it sets the ordering mode: + selects REQUIRE_ORDER, and - selects RETURN_IN_ORDER. If both are at the start of the options string, the last one is used.
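The precedence rules can be condensed into a single function. This is a minimal sketch of the rules as stated above, not ACE's code: the Ordering enumeration and effective_ordering are hypothetical names, and whether POSIXLY_CORRECT is set is passed in as a flag so that the function does not depend on the process environment. The sketch examines only a leading run of + and - characters and does not model how a leading ':' interacts with them.

```cpp
#include <cassert>
#include <string>

// Local stand-ins for the ACE_Get_Opt enumerators (values are arbitrary).
enum Ordering { REQUIRE_ORDER, PERMUTE_ARGS, RETURN_IN_ORDER };

Ordering effective_ordering (Ordering ctor_value,
                             bool posixly_correct_set,
                             const std::string &optstring)
{
  Ordering result = ctor_value;  // lowest precedence
  if (posixly_correct_set)
    result = REQUIRE_ORDER;      // overrides the constructor value

  // Highest precedence: leading '+' and '-' characters; the last one wins.
  for (char c : optstring)
    {
      if (c == '+')
        result = REQUIRE_ORDER;
      else if (c == '-')
        result = RETURN_IN_ORDER;
      else
        break;
    }
  return result;
}
```

A caller could compute the flag with std::getenv ("POSIXLY_CORRECT") != nullptr.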

4.2 Accessing Configuration Information

Many applications are installed via installation scripts that store collected information in a file that the application reads at runtime. On modern versions of Microsoft Windows, this information is often stored in the Windows registry; in earlier versions, a file was used. Most other platforms use files as well. The ACE_Configuration class defines the configuration interface for the following two classes available for accessing and manipulating configuration information.

  1. ACE_Configuration_Heap, available on all platforms, keeps all information in memory. The memory allocation can be customized to use a persistent backing store, but the most common use is with dynamically allocated heap memory; hence its name.
  2. ACE_Configuration_Win32Registry, available only on Windows, implements the ACE_Configuration interface to access and manipulate information in the Windows registry.

In both cases, configuration values are stored in hierarchically related sections. Each section has a name and zero or more settings. Each setting has a name and a typed data value. Even though the configuration information can be both read and modified, resist the temptation to use it as a database, with frequent updates. It's not designed for that.

The following example shows how the Home Automation system uses ACE's configuration facility to configure each subsystem's TCP port number. The configuration uses one section per subsystem, with settings in each section used to configure an aspect of that subsystem. Thus, the configuration for the entire system is managed in a central location. The example uses the config_file command line argument read in the first example in Section 4.1. After importing the configuration data, the program looks up the ListenPort value in the HAStatus section to find out where it should listen for status requests:


ACE_Configuration_Heap config;
if (config.open () == -1)
  ACE_ERROR_RETURN
    ((LM_ERROR, ACE_TEXT ("%p\n"), ACE_TEXT ("config")), -1);
ACE_Registry_ImpExp config_importer (config);
if (config_importer.import_config (config_file) == -1)
  ACE_ERROR_RETURN
    ((LM_ERROR, ACE_TEXT ("%p\n"), config_file), -1);

ACE_Configuration_Section_Key status_section;

if (config.open_section (config.root_section (),
                         ACE_TEXT ("HAStatus"),
                         0,
                         status_section) == -1)
  ACE_ERROR_RETURN ((LM_ERROR, ACE_TEXT ("%p\n"),
                     ACE_TEXT ("Can't open HAStatus section")),
                    -1);

u_int status_port;
if (config.get_integer_value (status_section,
                              ACE_TEXT ("ListenPort"),
                              status_port) == -1)
  ACE_ERROR_RETURN
    ((LM_ERROR,
      ACE_TEXT ("HAStatus ListenPort does not exist\n")),
     -1);
this->listen_addr_.set (ACE_static_cast (u_short, status_port));

To remain portable across all ACE platforms, this example uses the ACE_Configuration_Heap class to access the configuration data. Whereas the ACE_Configuration_Win32Registry class operates directly on the Windows registry, the contents of each ACE_Configuration_Heap object persist only as long as the object itself. Therefore, the data needs to be imported from the configuration file. We'll look at configuration storage in Section 4.2.2.

Because our example application keeps the settings for each subsystem in a separate section, it opens the HAStatus section. The ListenPort value is read and used to set the TCP port number in the listen_addr_ member variable.

4.2.1 Configuration Sections

Configuration data is organized hierarchically in sections, analogous to a file system directory tree. Each configuration object contains a root section that has no name, similar to the file system root in UNIX. All other sections are created hierarchically beneath the root section and are named by the application. Sections can be nested to an arbitrary depth.
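The directory-tree analogy can be made concrete with a small in-memory model. The sketch below is a hypothetical illustration in standard C++, not ACE's data structure or API: the Section type, the open_section function shown here, the '/' path separator, and the integer-only settings are all assumptions chosen for brevity. The create flag is meant to play a role like that of the 0 passed to open_section() in the earlier example, under which a missing section is an error rather than being created.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical model of a configuration section.
struct Section
{
  std::map<std::string, std::unique_ptr<Section>> subsections;
  std::map<std::string, unsigned> integers;  // integer settings only
};

// Walks a '/'-separated path below `root`. Missing sections are created
// when `create` is true; otherwise nullptr is returned.
Section *open_section (Section &root, const std::string &path, bool create)
{
  Section *current = &root;
  std::size_t start = 0;
  while (start < path.size ())
    {
      std::size_t end = path.find ('/', start);
      if (end == std::string::npos)
        end = path.size ();
      const std::string name = path.substr (start, end - start);
      start = end + 1;
      if (name.empty ())
        continue;  // tolerate doubled separators

      auto it = current->subsections.find (name);
      if (it == current->subsections.end ())
        {
          if (!create)
            return nullptr;
          it = current->subsections.emplace
            (name, std::unique_ptr<Section> (new Section)).first;
        }
      current = it->second.get ();
    }
  return current;  // an empty path names the root section
}
```

In this model the unnamed root is simply the Section object that the caller owns, and a nested section such as HAStatus/Limits (a made-up name) is reached by walking one level per path component.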

4.2.2 Configuration Backing Stores

The ACE_Configuration_Win32Registry class accesses the Windows registry directly, and therefore acts as a wrapper around the Windows API. Thus, Windows manages the data and all access to it. Although it is possible to use a memory-mapped allocation strategy with ACE_Configuration_Heap, the resultant file contents are the in-memory format of the configuration and not a human-readable form. Therefore, configuration information is usually saved in a text file. ACE offers two classes for importing data from and exporting data to a file.

  1. ACE_Registry_ImpExp uses a text format that includes type information with each value. This allows type information to be preserved across export/import, even on machines with different byte orders. This is the class used in the previous example to import configuration data from the configuration file specified on the program's command line.
  2. ACE_Ini_ImpExp uses the older Windows “INI” file format, which does not have type information associated with the values. Therefore, configuration data exported using ACE_Ini_ImpExp is always imported as string data, regardless of the original type.

Both classes use text files; however, they are not interchangeable. Therefore, you should choose a format and use it consistently. It is usually best to use ACE_Registry_ImpExp when possible because it retains type information. ACE_Ini_ImpExp is most useful when your application must read existing .INI files over which you have no control.

4.3 Building Argument Vectors

Section 4.1 showed how to process an argument vector, such as the argc/argv passed to a main program. Sometimes, however, it is necessary to parse options from a single long string containing tokens similar to a command line. For example, a set of options may be read as a string from a configuration file. In this case, it is helpful to convert the string to an argument vector in order to use ACE_Get_Opt. ACE_ARGV is a good class for this use.

Let's say that a program that obtains its options from a string wants to parse the string by using ACE_Get_Opt. The following code converts the cmdline string into an argument vector and instantiates the cmd_opts object to parse it:


#include "ace/ARGV.h"
#include "ace/Get_Opt.h"

int ACE_TMAIN (int, ACE_TCHAR *[])
{
  static const ACE_TCHAR options[] = ACE_TEXT (":f:h:");
  static const ACE_TCHAR cmdline[] =
    ACE_TEXT ("-f /home/managed.cfg -h $HOSTNAME");
  ACE_ARGV cmdline_args (cmdline);
  ACE_Get_Opt cmd_opts (cmdline_args.argc (),
                        cmdline_args.argv (),
                        options,
                        0);          // Don't skip any args

Note that the ace/ARGV.h header needs to be included to use the ACE_ARGV class. Another useful feature of ACE_ARGV is its ability to replace environment variable references with their values while building the argument vector. In the example, the value of the HOSTNAME environment variable is substituted where $HOSTNAME appears in the input string. This feature can be disabled by supplying a 0 value for the second argument to the ACE_ARGV constructor; by default, it is 1, resulting in environment variable substitution.

Note that the environment variable reference uses the POSIX-like leading $, even on platforms such as Windows, where environment variable references do not normally use a $ delimiter. This keeps the feature usable on all platforms that support the use of environment variables. One shortcoming in this feature, however, is that it substitutes only when an environment variable name is present by itself in a token. For example, if the cmdline literal in the previous example contained "-f $HOME/managed.cfg", the value of the HOME environment variable would not be substituted, because it is not in a token by itself.
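The whole-token substitution rule can be sketched without ACE. In the following illustration, build_argv and EnvLookup are hypothetical names, tokens are split on whitespace only (no quoting), and the variable lookup is passed in as a function so that the behavior can be shown without touching the real environment. The sketch also assumes that a reference to an unset variable is left unchanged.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Maps a variable name to its value, or nullptr if it is not set.
typedef std::function<const char * (const std::string &)> EnvLookup;

// Splits `cmdline` on whitespace. When substitution is enabled, a token
// is replaced only if the entire token is "$NAME" and NAME is set.
std::vector<std::string> build_argv (const std::string &cmdline,
                                     bool substitute_env,
                                     const EnvLookup &lookup)
{
  std::vector<std::string> out;
  std::istringstream in (cmdline);
  std::string token;
  while (in >> token)
    {
      if (substitute_env && token.size () > 1 && token[0] == '$')
        {
          // "$HOME/managed.cfg" looks up "HOME/managed.cfg", which is not
          // a variable name, so such a token is left untouched.
          const char *value = lookup (token.substr (1));
          if (value != nullptr)
            token = value;
        }
      out.push_back (token);
    }
  return out;
}
```

To use the real environment, pass a lookup such as a lambda that returns std::getenv (name.c_str ()), which requires including <cstdlib>.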

The preceding example also uses the skip_args parameter on the ACE_Get_Opt constructor. Whereas the argument vector passed to the main() program entry point includes the command name in argv[0], our built vector starts in the first element. Supplying a 0 forces ACE_Get_Opt to start parsing at the first token in the argument vector.

4.4 Summary

Collecting runtime information is a basic part of many applications. Developing code to parse command lines and collect configuration information can be very time consuming and platform dependent. This chapter showed ACE's facilities for collecting and processing runtime information in a portable, customizable, and easy-to-use way.
