Handling records as arguments or returned values

As our next exercise, let's write a function which takes a record of three integers a, b, and c as an argument and returns a set of different records—all permutations of a, b, and c with an extra field x computed as a * b + c.

First, this function is written in PL/python to make it easier to understand what we are trying to do:

hannu=# CREATE LANGUAGE plpythonu;
CREATE LANGUAGE 
hannu=# CREATE TYPE abc AS (a int, b int, c int); 
CREATE TYPE 
hannu=# CREATE OR REPLACE FUNCTION 
hannu-#     reverse_permutations(r abc) 
hannu-#   RETURNS TABLE(c int, b int, a int, x int) 
hannu-# AS $$ 
hannu$#     a,b,c = r['a'], r['b'], r['c'] 
hannu$#     yield a,b,c,a*b+c 
hannu$#     yield a,c,b,a*c+b 
hannu$#     yield b,a,c,b*b+c 
hannu$#     yield b,c,a,b*c+a 
hannu$#     yield c,a,b,c*a+b 
hannu$#     yield c,b,a,c*b+a 
hannu$# $$ LANGUAGE plpythonu; 
CREATE FUNCTION 
hannu=# SELECT * FROM reverse_permutations(row(2,7,13)); 
-[ RECORD 1 ] 
c | 2 
b | 7 
a | 13 
x | 27 
-[ RECORD 2 ] 
c | 2 
b | 13 
a | 7 
x | 33 
-[ RECORD 3 ] 
c | 7 
b | 2 
a | 13 
x | 62 
-[ RECORD 4 ] 
c | 7 
b | 13 
a | 2 
x | 93 
-[ RECORD 5 ] 
c | 13 
b | 2 
a | 7 
x | 33 
-[ RECORD 6 ] 
c | 13 
b | 7 
a | 2 
x | 93 

There are three new things that we are going to touch in the following C implementation of a similar function:

  • How to fetch an element of RECORD passed as an argument
  • How to construct a tuple to return a RECORD type
  • How to return SETOF (also known as TABLE) of this RECORD

So let's dive into the C code for this right away (a sample can be found in the chap9/c_records/ directory).

For better clarity, we will explain this function in two parts: first, doing a simple reverse(a,b,c) function, which returns just a single record of (c,b,a,x=c*b+a), and then expanding it to return a set of permutations such as the sample PL/pythonu function.

Returning a single tuple of a complex type

The first step in constructing a version of the reverse permutations function in C is to start with simply being able to return a single record of type abc:

Datum 
c_reverse_tuple(PG_FUNCTION_ARGS) 
{ 
    HeapTupleHeader th; 
    int32   a,b,c; 
    bool    aisnull, bisnull, cisnull; 
    
    TupleDesc resultTupleDesc; 
    Oid resultTypeId; 
    Datum retvals[4]; 
    bool  retnulls[4]; 
    HeapTuple rettuple; 
    
    // get the tuple header of 1st argument 
    th = PG_GETARG_HEAPTUPLEHEADER(0); 
    // get argument Datum's and convert them to int32 
    a = DatumGetInt32(GetAttributeByName(th, "a", &aisnull)); 
    b = DatumGetInt32(GetAttributeByName(th, "b", &bisnull)); 
    c = DatumGetInt32(GetAttributeByName(th, "c", &cisnull)); 

   // debug: report the extracted field values
    ereport(INFO, 
               (errmsg("arg: (a: %d,b: %d, c: %d)", a, b, c)) ); 
    
    // set up tuple descriptor for result info 
    get_call_result_type(fcinfo, &resultTypeId, &resultTupleDesc); 
    // check that SQL function definition is set up to return arecord 
    Assert(resultTypeId == TYPEFUNC_COMPOSITE); 
    // make the tuple descriptor known to postgres as valid return type 
    BlessTupleDesc(resultTupleDesc); 
    
    retvals[0] = Int32GetDatum(c); 
    retvals[1] = Int32GetDatum(b); 
    retvals[2] = Int32GetDatum(a); 
    retvals[3] = Int32GetDatum(retvals[0]*retvals[1]+retvals[2]); 
    
    retnulls[0] = aisnull; 
    retnulls[1] = bisnull; 
    retnulls[2] = cisnull; 
    retnulls[3] = aisnull || bisnull || cisnull; 
    
    rettuple = heap_form_tuple( resultTupleDesc, retvals, retnulls ); 

    PG_RETURN_DATUM( HeapTupleGetDatum( rettuple ) ); 
}

Extracting fields from an argument tuple

Getting the fields of an argument tuple is easy. First, you fetch the HeapTupleHeader file of the argument into the th variable using the PG_GETARG_HEAPTUPLEHEADER(0) macro, and then for each field you get the Datum (a generic type which can hold any field value in PostgreSQL) by the field name using the GetAttributeByName() function and then assign its value to a local variable after converting it to int32 via DatumGetInt32():

a = DatumGetInt32(GetAttributeByName(th, "a", &aisnull));

The third argument to GetAttributeByName(...) is an address of a bool which is set to true if the field was NULL.

There is also a companion function GetAttributeByNum() if you prefer to get the attributes by their numbers instead of names.

Constructing a return tuple

Constructing the return tuple(s) is almost as easy.

First, you get the called functions return type descriptor using the get_call_result_type() function:

get_call_result_type(fcinfo, &resultTypeId, &resultTupleDesc);

The first argument to this function is the FunctionCallInfo structure fcinfo which is used when calling the function you are currently writing (hidden behind the PG_FUNCTION_ARGS macro in the C function declaration). The other two arguments are addresses of the return type Oid and TupleDesc to receive the return tuple descriptor in case the function returns a record type.

Next, there is a safety assert for checking that the return type is really a record (or composite) type:

Assert(resultTypeId == TYPEFUNC_COMPOSITE);

This is to guard against errors in the CREATE FUNCTION declaration in SQL which tells PostgreSQL about this new function.

And there still remains one thing before we construct the tuple:

BlessTupleDesc(resultTupleDesc);

The purpose of BlessTupleDesc() is to fill in the missing parts of the structure, which are not needed for internal operations on the tuple, but are essential when the tuple is returned from the function.

So we are done with the tuple descriptor and finally, we can construct the tuple or record it to be returned.

The tuple is constructed using the heap_form_tuple( resultTupleDesc, retvals, retnulls ); function which uses TupleDesc we just prepared. It also needs an array of Datum to be used as values in the return tuple, and an array of bool, which is used to determine if any field should be set to NULL instead of their corresponding Datum value. As all our fields are of type int32, their values in retvals are set using Int32GetDatum(<localvar>). The array retnull is a simple array of bool and needs no special tricks to set its values.

Finally we return the constructed tuple:

PG_RETURN_DATUM( HeapTupleGetDatum( rettuple ) );

Here, we first construct Datum from the tuple we just constructed using HeapTupleGetDatum() and then use the PG_RETURN_DATUM macro.

Interlude – what is Datum?

In this chapter, we use something called Datum in several places. This calls for a bit of explanation about what a Datum is.

In short, a Datum is any data item the PostgreSQL processes and passes around. Datum itself does not contain any type information or info on whether the field is actually NULL. It is just a pointer to a memory. You always have to find out (or know beforehand) the type of any Datum you use and also how to find out if your data may be NULL instead of any real value.

In the preceding example, GetAttributeByName(th, "b", &bisnull) returns a Datum, and it can return something even when the field in the tuple is NULL, so always check for null-ness first. Also, the returned Datum itself cannot be used for much unless we convert it to some real type, as done in the next step using DatumGetInt32(), which simply converts the vague Datum to a real int32 value basically doing a cast from a memory location of an undefined type to int32.

The definition of Datum in postgresql.h is typedef Datum *DatumPtr; that is anything pointed to by a DatumPtr. Even though DatumPtr is defined as typedef uintptr_t Datum; it may be easier to think of it as a (slightly restricted) void *.

Once more, any real substance is added to a Datum by converting it to a real type.

You can also go the other way, turning almost anything into a Datum as seen at the end of the function:

HeapTupleGetDatum( rettuple )

Again, for anything else in PostgreSQL to make use of such Datum, the type information must be available somewhere else, in our case the return type definitions of the function.

Returning a set of records

Next, we will modify our function to not just return a single record of reordered fields from an argument record, but to return all possible orderings. We will still add one extra field x as an example of how you can use the values you extracted from the argument.

For set-returning functions, PostgreSQL has a special calling mechanism where PostgreSQL's executor machinery will keep calling the function over and over again until it reports back that it does not have any more values to return. This return-and-continue behavior is very similar to how the yield keyword works in Python or JavaScript.

All calls to the set returning function get an argument a persistent structure maintained outside the function and made available to the function via macros: SRF_FIRSTCALL_INIT() for the first call and SRF_PERCALL_SETUP() for subsequent calls.

To further clarify the example, we provide a constant array of possible orderings to be used when permuting the values.

Also, we read argument fields a, b, and c only once at the beginning of the function and save the extracted values in a structure c_reverse_tuple_args, which we allocate and initialize at the first call. For the structure to survive through all calls, we allocate this structure in a special memory context which is maintained in the funcctx -> multi_call_memory_ctx and store the pointer to this structure in funcctx -> user_fctx. We also make use of funcctx fields: call_cntr and max_calls.

In the same code section run once at the first call, we also prepare the descriptor structure needed for returning the tuples. To do so, we fetch the return tuple descriptor by passing the address we get in funcctx->tuple_desc to the function get_call_result_type(...), and we complete the preparation by calling BlessTuple(...) on it to fill in the missing bits needed for using it for returning values.

At the end of this section, we restore the memory context. While you usually do not need to pfree() the things you have palloc() allocated, you should always remember to restore the memory context when you are done using any context you have switched to, or else you risk messing up PostgreSQL in a way that can be hard to debug later!

The rest is something that gets done at each call, including the first one.

We start by checking that there is still something to do by comparing the current call to the max calls parameter. This is by no means the only way to determine if we have returned all values, but it is indeed the simplest way if you know ahead how many rows you are going to return. If there are no more rows to return, we signal this back using SRF_RETURN_DONE().

The rest is very similar to what the previous single-tuple function did. We compute the retvals and retnulls arrays using the index permutations array ips and then construct a tuple to return using heap_form_tuple(funcctx->tuple_desc, retvals, retnulls);.

Finally, we return the tuple using macro SRF_RETURN_NEXT(...), converting the tuple to Datum, as this is what the macro expects.

One more thing to note, all current versions of PostgreSQL will always keep calling your function until it returns SRF_RETURN_DONE(). There is currently no way to do an "early exit" from the callers side. This means that if your function returns 1 million rows and you do:

SELECT * FROM mymillionrowfunction() LIMIT 3;

The function will be called 1 million times internally, and all the results will be cached, and only after this, the first three rows will be returned and the remaining 9,99,997 rows are discarded. This is not a fundamental limitation, but just an implementation detail which is likely to change in some future version of PostgreSQL. Don't hold your breath though, this will only happen if somebody finds this valuable enough to implement.

The source with modifications described previously is as follows:

struct c_reverse_tuple_args { 
    int32   argvals[3]; 
    bool    argnulls[3]; 
    bool    anyargnull; 
}; 

Datum 
c_permutations_x(PG_FUNCTION_ARGS) 
{ 
    FuncCallContext     *funcctx; 
    
    const char  *argnames[3] = {"a","b","c"}; 
    // 6 possible index permutations for 0,1,2 
    const int   ips[6][3] = {{0,1,2},{0,2,1}, 
                             {1,0,2},{1,2,0}, 
                             {2,0,1},{2,1,0}}; 
    int i, call_nr; 
    
    struct c_reverse_tuple_args* args; 
    
    if(SRF_IS_FIRSTCALL()) 
    { 
        HeapTupleHeader th = PG_GETARG_HEAPTUPLEHEADER(0); 
        MemoryContext   oldcontext; 
        /* create a function context for cross-call persistence */ 
        funcctx = SRF_FIRSTCALL_INIT(); 
        /* switch to memory context appropriate for multiple function calls */ 
        oldcontext = MemoryContextSwitchTo(
                     funcctx->multi_call_memory_ctx
                     ); 
        /* allocate and zero-fill struct for persisting extracted arguments*/ 
        args = palloc0(sizeof(struct c_reverse_tuple_args)); 
        args->anyargnull = false; 
        funcctx->user_fctx = args; 
        /* total number of tuples to be returned */ 
        funcctx->max_calls = 6; 
        // there are 6 permutations of 3 elements 
        // extract argument values and NULL-ness 
        for(i=0;i<3;i++){ 
            args->argvals[i] = DatumGetInt32(GetAttributeByName(th, argnames[i], &(args->argnulls[i]))); 
            if (args->argnulls[i]) 
                args->anyargnull = true; 
        } 
        // set up tuple for result info 
        if (get_call_result_type(fcinfo, NULL, &funcctx->tuple_desc)
            != TYPEFUNC_COMPOSITE) 
            ereport(ERROR, 
                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), 
                 errmsg("function returning record called in context " 
                        "that cannot accept type record"))); 
        BlessTupleDesc(funcctx->tuple_desc); 
        // restore memory context 
        MemoryContextSwitchTo(oldcontext); 
    } 
    
    funcctx = SRF_PERCALL_SETUP(); 
    args = funcctx->user_fctx; 
    call_nr = funcctx->call_cntr; 
    
    if (call_nr < funcctx->max_calls) { 
        HeapTuple   rettuple; 
        Datum       retvals[4]; 
        bool retnulls[4]; 
        
        for(i=0;i<3;i++){ 
           retvals[i] = Int32GetDatum(args->argvals[ips[call_nr][i]]); 
           retnulls[i] = args->argnulls[ips[call_nr][i]]; 
        }
        retvals[3] = Int32GetDatum(args->argvals[ips[call_nr][0]]
                                 * args->argvals[ips[call_nr][1]]
                                 + args->argvals[ips[call_nr][2]]); 
        retnulls[3] = args->anyargnull; 
        
        rettuple = heap_form_tuple(funcctx->tuple_desc, retvals, retnulls); 

        SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum( rettuple ));
    } 
    else    /* do when there is no more left */ 
    { 
        SRF_RETURN_DONE(funcctx); 
    } 
    
}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.128.197.164