By now you should have gathered that UDFs rely on there being a C or C++ program to define them. Although it is straightforward to add the sample UDFs to your MySQL server, you will need some knowledge of at least one of these languages to write your own functions.
This section describes the calling procedures for UDFs and explains how to make them operate in the MySQL environment. This section, together with the information contained in Day 13, “Using the C API,” is enough to get you coding, provided that you already have a reasonable knowledge of C or C++.
As explained previously, a user-defined MySQL function MYFUNC() will have a corresponding C function called myfunc(). In addition, it may have an optional function called myfunc_init() for initialization and argument checking, and optionally myfunc_deinit() for cleanup.
An aggregating function such as avgcost() also has avgcost_reset() and avgcost_add().
The calling sequence for a simple function is like this:
For an aggregating function, there is a more complex calling sequence, like this:
The resultset of the SELECT query will be sorted according to the GROUP BY clause.
myfunc() will be called when the group has been processed entirely and will return the UDF's result.
The MySQL datatypes passed to and from the C program must correspond according to the following:
The MySQL STRING type corresponds with the C or C++ char *.
The MySQL INTEGER type corresponds with C/C++ long long.
The MySQL REAL type corresponds with C/C++ double.
This section describes how you should declare the various C/C++ functions that make up your UDF. For ease of reference, I'm assuming that your function is called MYFUNC().
If your UDF returns a STRING datatype to MySQL, you should define the main myfunc() function as follows:
char *myfunc( UDF_INIT *initid, UDF_ARGS *args, char *result, unsigned long *length, char *is_null, char *error);
Notice that the arguments initid and args are pointers to UDF_INIT and UDF_ARGS structures, respectively. These are important. You will look in more detail at the UDF_INIT and UDF_ARGS structures in a moment.
For returning results, result points to the result string, and length points to a number that is the length of that string. If your string result is more than 255 bytes, you will need to allocate memory for it using malloc() in myfunc_init(), and free it up again in myfunc_deinit().
is_null() is a flag that indicates a NULL result. You will see this again in myfunc_init() and myfunc_deinit(), and it is cleared or set within each routine.
error is also a flag (not a message) that should be set if an error occurs.
For functions returning an INTEGER datatype, fewer arguments are used. The C type should be long long:
long long myfunc( UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error);
Functions returning a REAL also take fewer arguments. The C type should be double:
double myfunc( UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error);
Again notice the UDF_INIT and UDF_ARGS structures, but this time with just the is_null and error flags. The value returned from a numeric function is the actual result, unlike a string function, which returns pointers to the result.
The initialization function of the UDF should be declared like this
my_bool myfunc_init( UDF_INIT *initid, UDF_ARGS *args, char *message);
myfunc_init() should return a 1 if an error occurs (such as because of unsatisfactory arguments being passed to the UDF), or 0 if everything looks okay. In an error situation, your routine should set message to the error message text (a null-terminated string).
The deinitialization function is declared like this
void myfunc_deinit(UDF_INIT *initid);
Notice that only the UDF_INIT structure is passed, and the routine frees up memory if it has previously been allocated for the function.
As mentioned previously, to create an aggregating function, you need to define the functions myfunc_reset() and myfunc_add(). You need these as well as the ones for simple UDFs, which should be declared in the same way as described just now.
These functions should be declared like this
char *myfunc_reset( UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error); char *myfunc_add( UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error);
In aggregating UDFs, the myfunc() function, which in simple UDFs does the meaningful work, will be called only when all rows in the group have been processed. The args pointer passed to it will be meaningless in a group context, so your code in the myfunc() routine should do its processing on the internal summary variables instead.
is_null and error are handled in the same way as in myfunc(). However, whereas is_null is reset for each group (before calling myfunc_reset()), if error is set in any of the routines, it is never reset.
If is_null or error become set any time after running myfunc(), MySQL returns NULL as the UDF result.
You will have noticed that all the C functions described have the initid parameter passed to them. This is in the form of a UDF_INIT structure that communicates initialization information about the UDF between the C functions.
Table 21.1 shows the members of the UDF_INIT structure. Your initialization function myfunc_init() should define values in this structure by setting any number of its members. Alternatively, you may leave them undeclared to take up their default values.
The args parameter is a pointer to the main data that you will pass among your functions. This data has a UDF_ARGS structure. Table 21.2 lists the members of the UDF_ARGS structure.
Type | Argument | Usage |
---|---|---|
unsigned int | arg_count | The number of arguments passed to the UDF. If your UDF should be called with a particular number of arguments, verify this number in myfunc_init(). |
enum Item_result | *arg_type | arg_type points to an array holding the datatype for each argument, with possible values STRING_RESULT, INT_RESULT, and REAL_RESULT. If your UDF should be called with particular datatypes, you should verify these values in myfunc_init(). |
char | **args | In the context of myfunc_init(), args->args contains information about the arguments; its values depend on whether a constant or nonconstant argument was passed (in other words, a defined number or string, as opposed to a column name). With a constant value, args->args[i] points to the value of the argument, whereas with a nonconstant, the value of args->args[i] will be 0. In the context of myfunc(), args->args contains the actual arguments passed for the row currently being processed. See the following section on args->args for information on how to access this data. |
unsigned long | *lengths | lengths points to an array containing the string length of each argument.
In myfunc_init(), this means the maximum string length (which should not be changed). In myfunc(), it contains the actual lengths of any string arguments in the row currently being processed. |
Within UDF_ARGS is the args->args array, which contains the actual arguments to be processed. Depending on the type of each argument, your function myfunc()refers to them as follows:
For string arguments (type STRING_RESULT), args->args represents a string pointer plus its length. (This lets you handle binary data, which may contain nulls, so don't assume that strings will be terminated with a null.) The content of the string is given by args->args[i], and the length is given by args->lengths[i].
For integer arguments (type INT_RESULT), args->args[i] represents the integer argument itself. However, you must cast it to a long long value before processing.
For arguments of type REAL_RESULT, args->args[i] represents the real argument. However, you must cast args->args[i] to a double value before processing.
The preceding descriptions explain how to set up functions in C or C++. However, this information is targeted at C programmers, who should now have enough information to begin writing their own UDFs.
No examples of writing UDFs are given here; space does not permit the inclusion of several worked examples, which would be helpful. However, refer to the sample UDFs in your MySQL distribution for example code and also to the exercises in this chapter, which also describe the creation of a simple UDF.
3.137.152.87