Appendix B. Tools

THIS APPENDIX DESCRIBES TWO TOOLS, THE SOURCE CODE FOR WHICH IS AVAILABLE FOR DOWNLOAD, and which I hope you’ll find useful. The first one is actually a pair of programs, mklipsum and lipsum, which I briefly described in Chapter 4. The second one is a Java tool called Roughbench, which I have used often in my tests, in particular in Chapter 7.

These tools are released under the GPL license and come with the usual disclaimers.

mklipsum and lipsum

mklipsum and lipsum are tools for generating arbitrary random text. The source code is written in C, and I prepared it using GNU Autotools (as best as I could). Windows executables are also provided (they have been ported with Mingwin). Both programs require SQLite3, available from http://www.sqlite.org. For the record, mklipsum contains nontrivial SQL code.

How to Build mklipsum and lipsum

On a *nix machine, after having uncompressed and unarchived the files, you should be able to build the programs by typing ./configure, then make and make install (read the INSTALL file for more information and to learn how to customize default settings).

How to Use mklipsum and lipsum

Using mklipsum and lipsum is fairly simple. mklipsum prepares an SQLite file for use by lipsum from text that it tokenizes and analyzes. The text is read from the standard input. The SQLite file stores the words using UTF8 encoding, and it should also be the character set used in the input. If, however, the library libiconv, which converts character sets, is available on your system, mklipsum will be able to perform character set conversion. The availability or nonavailability of the character set conversion feature is displayed when you run the program.

mklipsum can optionally take one parameter (two if character set conversion is available).

The first parameter is a file identifier that can be used to identify the language of the source file. By default, the SQLite file created is named lipsum.db.

If you run:

mklipsum xx < sample_text_file

the SQLite file will be named lipsumxx.db. This feature is intended for the generation of random text with words from different languages. You can use the ISO code of the language to identify the various files.

The second parameter, if character set conversion is available, is the character set used in the source file. Note that you cannot specify the character set alone, and that it must always follow a file identifier.

I have tested mklipsum and lipsum with different languages that use the Latin alphabet (with extensions); it is possible that languages that use the Cyrillic alphabet may require some adjustments to the programs, and very likely that a language such as Arabic or Hebrew, to say nothing of Chinese or Japanese, will require some rewrites (the programs try hard to retain the original capitalization, but follow Western rules and capitalize the first letter of words that follow a period, for instance).

Here is an example, using a few chapters from Thomas Hobbes’ Leviathan:

$ ./mklipsum < leviathan_sample.txt
mklipsum $Revision$ with charset conversion
-- Reading input ...
-- Loading vocabulary ...
--- 770 words of length 1 analyzed
--- 761 words of length 2 analyzed
--- 800 words of length 3 analyzed
--- 614 words of length 4 analyzed
--- 474 words of length 5 analyzed
--- 312 words of length 6 analyzed
--- 237 words of length 7 analyzed
--- 152 words of length 8 analyzed
--- 119 words of length 9 analyzed
--- 72 words of length 10 analyzed
--- 43 words of length 11 analyzed
--- 22 words of length 12 analyzed
--- 21 words of length 13 analyzed
--- 5 words of length 14 analyzed
--- 1 words of length 15 analyzed
--- 1 words of length 16 analyzed
--- 1 words of length 19 analyzed
-- Vocabulary: 1029 distinct words loaded
-- Loading lengths relationships ...
--- ...................................................................................
..............................................................
-- Indexing
$

To generate random text, you must run the lipsum program that can take as a parameter a file identifier telling which SQLite file should be used.

$ ./lipsum
 Speech Thy Eye up with serves which no Businesse; can of is therefore
up can false a Joynts happen, he late Marcus, up any motion But Eyes
of hath Species; up my Cold which my How First Voyces my are up them
also exercise which a Death are any of parts if Circumstances; Image
whereas els waking is any must, Sees For office is am cease. Ever
say. Prognostiques many course, one cause I Speech mind of For removed
of though. Give not onely Anger horse can them, well sight.

Here is an example using another file seeded by a German text written by Immanuel Kant:

$ ./lipsum de
 Sein daβ Sitz Begriff Synthesis als, von priori Satz kann vorher
vielleicht. Lassen, in komme eine bewuβt, des Kreis die ist. Um
allen selbst natürlicher a dieser die Prädikat hergibt Zufälligkeit
bewuβt, priori Verstand a fassen leicht, er Begriffe. Hat wir wird
denn desjenigen der empfangen sollten. Seine in der einander ist
mithin man korrespondiert, als.

The lipsum program can take a number of flags:

$ ./lipsum -?
Usage : ./lipsum [flags] [suffix]
        This program looks in the current directory for an
        sqlite file named lipsum<suffix>.db that must have
        previously been generated by the mklipsum utility.
 Flags:
  -h, -?           : Display this and exit.
  -n <count>       : Number of paragraphs to generate (default 1)
  -w <len>[,<dev>] : Approximate number of words per paragraph.
                     The program generates paragraphs averaging <len>
                     words, with a standard deviation <dev>. By
                     default, <dev> is 20% of <len>.
                     Default for <len>: 500.
                     This flag is exclusive of -c.
  -c <len>[,<dev>] : Same as -c, but based on character count instead of
                     word count.
  -l <len>         : Insert a carriage return as soon as possible
                     after having output <len> characters. Default: 65.

Roughbench

Roughbench takes as a parameter the name of a simple SQL script, reads it, and executes it, for a number of times or for a certain duration of time, and can start several concurrent threads running the same script.

On completion, statistical information is displayed.

How to Build Roughbench

You need the J2SE Development Kit (JDK) version 1.5 or later to compile the program. The JDK is available from http://java.sun.com/j2se.

How to Use Roughbench

Using Roughbench is straightforward:

  1. Specify the JDBC parameters in a file named roughbench.properties.

  2. Check that your CLASSPATH environment variable is set properly.

  3. Prepare a file containing the (single) statement you want to run. This statement can contain ? characters as placeholders for values generated by Roughbench.

  4. Specify a number of options and how to generate values on the command line.

    The command line looks like this:

    java [options] roughbench <sqlfile> [generators ... ]

The roughbench.properties file

The roughbench.properties file must contain at least the following two values:

CONNECT=<JDBC URL>
DRIVER=<Driver package name>

Additionally, if there is no implicit authentication and if no authentication method is provided inside the JDBC URL, you must specify this:

USERNAME=<...>

and this:

PASSWORD=<...>

For example, assuming that we want to connect as user bench, password possward, to a database named TEST running on the same server, we can use the following:

roughbench.properties file with Oracle:
CONNECT=jdbc:oracle:thin:@localhost:1521:TEST
USERNAME=bench
PASSWORD=possward
DRIVER=oracle.jdbc.OracleDriver

Or, with MySQL, we can specify the username and password in the URL:

CONNECT=jdbc:mysql://localhost:3306/TEST?user=bench&password=possward
DRIVER=com.mysql.jdbc.Driver

Specifying options

Options, which are listed in Table B-1, are specified on the command line with the following syntax:

-D<option name>=<value>
Table B-1. Roughbench options

Option name

Comment

Default value

COMM

Number of executions between two successive commits (ignored for select statements).

1 (commit after every DML statement)

LOOPS

Number of times the SQL statement must be run. Exclusive of -DTIME=

1

RATE

Number of executions per minute. The rate is an average; the interval between two successive executions is random.

 

TAG

Identifier for the run.

Timestamp, format MonDD-hh:mn

THR

Number of concurrent threads to run. All threads execute the same SQL statement.

1

TIME

Number of minutes during which the SQL statement must be run. Exclusive of -DLOOPS=

 

Generating variables

You can generate random variables to bind to the statement by specifying as many generators on the command line as there are placeholders in the statement. Generators are specified as shown in Table B-2.

Table B-2. Generators

Generator

Meaning

C<constant>

Constant value; <constant> can be numeric or a string.

Ra,b,…,c

Uniform random value from the list (the same item can occur several times in the list to give it more weight).

Values can be of any type.

Ra-b

Random value in the range a to b inclusive (numbers or chars).

N<avg>,<stddev>

Normal (Gaussian) distribution, with an average of <avg> and a standard deviation of <stddev>.

I[<start>[,<step>]]

Incremental value. <start> defaults to 1, <step> to 1.

S<min>[-<max>]

String of random letters of length between <min> and <max> inclusive; <max> defaults to the same value as <min>.

Generating integer or float values

The generation of integer or float values depends on the type of the first number given in the generator specification. For instance:

  • R1-250000 will generate integer values between 1 and 250,000.

  • R1.0-250000 will generate float values in the same range.

Generating dates

To generate dates, you must use the functions available with your DBMS and generate integer values to be used with date functions and date arithmetic.

For instance, you can use the following expressions in different SQL dialects associated to an incremental generator to generate dates in chronological order from the past 300 days. Because 300 days equal almost 26 million seconds, you can generate almost as many days as you want in the interval to the current date.

In MySQL:

date_add(date_sub(curdate(), INTERVAL 300 DAY), INTERVAL ? SECOND)

In Oracle:

sysdate - 300 + ?/86400

In T-SQL (SQL Server/Sybase):

dateadd(ss, ?, dateadd(dd, -300, getdate()))

and so on.

Output

Roughbench displays informational messages on the standard error and results on the standard output, which makes it easy to redirect clean results to a file.

The provided information consists of the following:

  • Program name, version, and copyright/licensing information

  • Reminder of the command-line parameters

  • Text of the query being run

Results are displayed as follows:

tag<tab>filename<space>thread#<tab>tenth of second<tab>count
tag is the value of the tag specified by -DTAG=… on the command line. It defaults to a timestamp; for example, Jun11-15:42.
filename is the name of the file that contains the SQL statement.
thread# is the number (starting with 0) identifying a particular thread.
tenth of second and count tell how many statements were executed in what time. A value of 0 for tenth of second means less than 0.1 second, a value of 1 means between 0.1 and 0.2 seconds, and so forth.

For example, the following output means that for the run of October 23 at 4 p.m., thread 3 executed the script insert.sql 15,874 times, of which five times took between 0.1 and 0.2 seconds and the remainder took less than 0.1 second:

Oct23-16:00            insert.sql 3            0     15869
Oct23-16:00            insert.sql 3            1     5

Additional lines may also be displayed. For instance, when operating in loop mode, the total time to execute the required number of loops is displayed as follows:

tag<tab>filename<space>thread#<tab>elapsed (ms)<tab>time

If some of the executions end in failure, the number of successful executions and the number of failures will be output as shown here:

tag<tab>filename<space>thread#<tab>OK<tab>count
tag<tab>filename<space>thread#<tab>KO<tab>count
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.189.170.134