Dynamic SQL

Dynamic SQL is used for building and executing queries on the fly. Unlike the static SQL statement, a dynamic SQL statement full text is unknown, and can change between successive executions. These queries can be DDL, DCL, and DML statements. Dynamic SQL is used for reducing repetitive tasks. For example, one could use dynamic SQL to create table partitioning for a certain table on a daily basis, add missing indexes on all foreign keys, or add data auditing capabilities to a certain table without major coding effects. Another important use of dynamic SQL is to overcome the side effects of PL/pgSQL caching, as queries executed using the EXECUTE statement are not cached.

Dynamic SQL is achieved via the EXECUTE statement. The EXECUTE statement accepts a string and simply evaluates it. The synopsis to execute a statement is given as follows:

EXECUTE command-string [ INTO [STRICT] target ] [ USING expression [, ... ] ];

Executing DDL statements in dynamic SQL

In some cases, one needs to perform operations at the database object level such as tables, indexes, columns, roles, and so on. For example, a database developer would like to vacuum and analyze a specific schema object, which is a common task after deployment of updating the statistics. For example, to vacuum the car_portal_app schema, one could write the following script:

SET search_path TO car_portal_app;
DO $$
DECLARE
table_name text;
BEGIN
  FOR table_name IN SELECT tablename FROM pg_tables WHERE schemaname = 'car_portal_app' LOOP
    EXECUTE 'ANALYZE ' || table_name;
  END LOOP;
END;
$$;    

Executing DML statements in dynamic SQL

Some applications might interact with data in an interactive manner. For example, one might have a billing data generated on a monthly basis. Also, some applications filter data on different criteria defined by the user. In such cases, dynamic SQL is very convenient.

For example, in the car portal application, search functionality is needed for getting accounts using the dynamic predicate, as follows:

CREATE OR REPLACE FUNCTION get_account (predicate TEXT) RETURNS SETOF account AS
$$
BEGIN
  RETURN QUERY EXECUTE 'SELECT * FROM account WHERE ' || predicate;
END;
$$
LANGUAGE plpgsql;
-- For test
--SELECT * FROM get_account ('true');
--SELECT * FROM get_account (E'first_name='test'');

Dynamic SQL and the caching effect

As mentioned earlier, PL/pgSQL caches execution plans. This is quite good if the generated plan is expected to be static. For example, the following statement is expected to use an index scan because of selectivity. In this case, caching the plan saves some time, and thus, increases performance:

SELECT * FROM account WHERE account_id =<INT>

In other scenarios however, this is not true. For example, let us assume we have an index on the advertisement_date column, and we would like to get the number of advertisements since a certain date, as follows:

SELECT count (*) FROM car_portal_app.advertisement WHERE advertisement_date >= <certain_date>;

In the preceding query, the entries from the advertisement table can be fetched from the hard disk either by using the index scan or using the sequential scan based on selectivity, which depends on the provided certain_date value. Caching the execution plan of such a query will cause serious problems; thus, writing the function as follows is not a good idea:

CREATE OR REPLACE FUNCTION get_advertisement_count (some_date timestamptz ) RETURNS BIGINT AS $$
BEGIN
  RETURN  (SELECT count (*) FROM car_portal_app.advertisement WHERE advertisement_date >=some_date)::bigint;
END;
$$
LANGUAGE plpgsql;

To solve the caching issue, one could rewrite the above function either using the SQL language function or by using the PL/pgSQL execute command, as follows:

  CREATE OR REPLACE FUNCTION get_advertisement_count (some_date timestamptz ) RETURNS BIGINT AS $$
DECLARE
  count BIGINT;
BEGIN
EXECUTE 'SELECT count (*) FROM car_portal_app.advertisement WHERE advertisement_date >= $1' USING some_date INTO count;
  RETURN count;
END;
$$
LANGUAGE plpgsql;
-- To test
SELECT get_advertisement_count(now() -INTERVAL '1 year');

Recommended practices when using dynamic SQL

Dynamic SQL can cause security issues if not handled carefully; dynamic SQL is vulnerable to the SQL injection technique. SQL injection is used for executing SQL statements that reveal secure information, or even for destroying the data in the database.

A very simple example of a PL/pgSQL function vulnerable to SQL injection is as follows:

CREATE OR REPLACE FUNCTION can_login (user text, pass text) RETURNS BOOLEAN AS $$
DECLARE
stmt TEXT;
result bool;
BEGIN
    stmt = E'SELECT COALESCE (count(*)=1, false) FROM account WHERE email = ''|| $1 || E''  and password = ''||$2||E''';
    RAISE NOTICE '%' , stmt;
    EXECUTE stmt INTO result;
    RETURN result;
END;
$$
LANGUAGE plpgsql

The preceding function returns true if the e-mail and the password match. To test this function, let us insert a row and try to inject some code, as follows:

car_portal=# INSERT INTO account VALUES (DEFAULT, 'test', 'injection', 'test@injet_db.com', 'test_pass' );
INSERT 0 1
car_portal=# SELECT can_login('test@injet_db.com', 'test_pass');
NOTICE:  SELECT COALESCE (count(*)=1, false) FROM account WHERE email = 'test@injet_db.com'  and password = 'test_pass'
 can_login
-----------
 t
(1 row)

car_portal=# SELECT can_login('test@injet_db.com', 'wrong_pass');
NOTICE:  SELECT COALESCE (count(*)=1, false) FROM account WHERE email = 'test@injet_db.com'  and password = 'wrong_pass'
 can_login
-----------
 f
(1 row)

car_portal=# SELECT can_login(E'test@injet_db.com'--', 'wrong_pass');
NOTICE:  SELECT COALESCE (count(*)=1, false) FROM account WHERE email = 'test@injet_db.com'--'  and password = 'wrong_pass'
 can_login
-----------
 t
(1 row)

Notice that the function returns true even when the password does not match the password stored in the table. This is simply because the predicate was commented, as shown by the raise notice:

 SELECT COALESCE (count(*)=1, false) FROM account WHERE email = 'test@injet_db.com'--'  and password = 'wrong_pass'

To protect the code against this technique, one could follow these practices:

  • For parameterized dynamic SQL statements, use the USING clause.
  • Use the format function with appropriate interpolation to construct your queries. Note that %I escapes the argument as an identifier and %L as a literal.
  • Use quote_ident(), quote_literal(), and quote_nullable() to properly format your identifiers and literal.

One way to write the preceding function is as follows:

CREATE OR REPLACE FUNCTION can_login (text, pass text) RETURNS BOOLEAN AS $$
DECLARE
stmt TEXT;
result bool;
BEGIN
    stmt = format('SELECT COALESCE (count(*)=1, false) FROM account WHERE email =  %Land password = %L', $1,$2);
    RAISE NOTICE '%' , stmt;
    EXECUTE stmt INTO result;
    RETURN result;
END;
$$
LANGUAGE plpgsql;
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.142.194.55