11.3 Serializability

Serial execution of transactions means that transactions are performed one after another, without any interleaving of operations. If A and B are two transactions, there are only two possible serial executions: all of transaction A is completed first and then all of transaction B is done, or all of B is completed and then all of A is done. In serial execution, there is no interference between transactions because only one is executing at any given time. However, there is no guarantee that the results of all serial executions of a given set of transactions will be identical. In banking, for example, it matters whether INTEREST is calculated for an account before a large deposit is made or after. However, as far as database correctness is concerned, the order does not matter because either order will produce a consistent state of the database.

In most cases, users do not care in what order their transactions are executed, as long as the results are correct, and no user is made to wait too long. For n transactions, there are n! possible serial schedules. Regardless of which serial schedule is chosen, a serial execution will never leave the database in an inconsistent state, so every serial execution is correct, even though different results may be produced by different orders. The three problems that we illustrated in our figures resulted from concurrency and left the database in an inconsistent state. No serial execution would allow these problems to arise. Our goal is to find ways to allow transactions to execute concurrently while making sure they do not interfere with one another, and to produce a database state that could be produced by a truly serial execution. We want to maximize concurrency to maximize throughput, the amount of work done during a given period of time, while preserving database consistency.

If a set of transactions executes concurrently, we say the schedule is serializable if it produces the same results as some serial execution. It is essential to guarantee serializability of concurrent transactions to ensure database correctness. In serializability, the following factors are important:

  • If two transactions are only reading data items, they do not confl t and order is not important.

  • If two transactions operate on (either reading or writing) completely separate data items, they do not conflict and order is not important.

  • If one transaction writes to a data item and another either reads or writes to the same data item, then the order of execution is important.

Therefore, two operations conflict only if all of the following are true:

  • The operations belong to different transactions.

  • The operations access the same data item.

  • At least one operation writes the item.

A serializable execution of transactions orders any conflicting operations in the same way as some serial execution, so that the results of the concurrent execution are the same as the results of at least one of the possible serial executions. This type of serializability is called conflict serializability, and it is the strictest form of serializability.

FIGURE 11.6 illustrates the two serial schedules and a serializable schedule for two transactions, S and T. Column (a) in Figure 11.6 shows the serial schedule S,T, while column (b) shows the serial schedule T,S. Column (c) shows a serializable schedule that is conflict equivalent to the first serial schedule, S,T. The conflicting reads and writes of the data items are done in the same order in the serializable schedule as they are in the serial schedule S,T. Note that in column (a), S reads and writes x before T reads and writes x, and S reads and writes y before T reads and writes y. The same order of reads and writes is preserved in the serializable schedule in column (c). Also note that the two serial schedules here are not equivalent, but both are considered correct and would leave the database in a consistent state.

A listing of 2 serial schedules and a serializable schedule for transactions S and T. The time instants range from t 1 to t 16. a. Serial S, T. t 1: BEGIN TRANS. t 2: read x. t 3: x equals x times 3. t 4: write x. t 5: read y. t 6: y equals y plus 2. t 7: write y. t 8: COMMIT. t 9: BEGIN TRANS. t 10: read x. t 11: x equals x plus 2. t 12: write x. t 13: read y. t 14: y equals y plus 5. t 15: write y. t 16: COMMIT. b. Serial T, S. t 1: BEGIN TRANS. t 2: read x. t 3: x equals x plus 2. t 4: write x. t 5: read y. t 6: y equals y plus 5. t 7: write y. t 8: COMMIT. t 9: BEGIN TRANS. t 10: read x. t 11: x equals x times 3. t 12: write x. t 13: read y. t 14: y equals y plus 2. t 15: write y. t 16: COMMIT. c. Serializable. t 1: BEGIN TRANS. t 2: read x; BEGIN TRANS. t 3: x equals x times 3. t 4: write x. t 5: read x. t 6: read y; x equals x plus 2. t 7: y equals y plus 2; write x. t 8: write y. t 9: COMMIT; read y. t 10: y equals y plus 5. t 11: write y. t 12: COMMIT.

FIGURE 11.6 Two Serial Schedules and a Serializable Schedule for Transactions S and T

You can verify that the two serial schedules give different results by assigning initial values for x and y and seeing what final values you get. For example, if the initial value for x is 1 and for y is 2, the serial schedule in column (a) produces final values of 5 and 9 respectively, while the serial schedule in column (b) produces final values of 9 and 9. Both of these results are considered correct. Using the same initial values, you should see that the schedule in column (c) gives the same results as column (a). However, it is important to note that showing that a concurrent schedule gives the same results as a serial schedule for some set of values is not sufficient to prove the concurrent schedule is serializable because another set of values may not give the same results as a serial schedule. Therefore, we need a more robust method of testing for serializability.

It is possible to determine whether a schedule is conflict serializable by examining the order of all its conflicting reads and writes and comparing them with the order of the same reads and writes in the transactions’ serial schedules, to determine if the order matches at least one of them. Fortunately, there is a graphical technique, called a precedence graph, that allows us to determine whether a schedule is conflict serializable. To construct the graph, we draw nodes and directed edges as follows:

  • Draw a node for each transaction: T1, T2, …, Tn

  • If Ti writes x, and then Tj reads x, draw a directed edge from Ti to Tj

  • If Ti reads x, and then Tj writes x, draw a directed edge from Ti to Tj

  • If Ti writes x, and then Tj writes x, draw a directed edge from Ti to Tj

The schedule is serializable if the precedence graph has no cycles. A cycle is a path that begins with a starting node and ends at the same starting node. FIGURE 11.7(A) shows a precedence graph for the schedule showing the inconsistent analysis problem in Figure 11.5. Because SUMBAL reads BAL_A and then TRANSFER writes BAL_A, we draw an edge from SUMBAL to TRANSFER. Because TRANSFER writes BAL_C and then SUMBAL reads it, we draw an edge from TRANSFER to SUMBAL, resulting in a cycle and proving that the schedule is not serializable.

FIGURE 11.7 Precedence Graphs for Serializability Testing

A precedence graph with 2 nodes labeled SUM BAL and TRANSFER. An edge is directed from SUM BAL to TRANSFER. Another edge is directed from TRANSFER to SUM BAL.

(A) Precedence Graph for Figure 11.5

A precedence graph for serializable schedule. The graph has 2 nodes labeled S and T. An edge is directed from S to T.

(B) Precedence Graph for Serializable Schedule Shown in Column (c) of Figure 11.6

If a schedule is serializable, it is possible to use the precedence graph to find an equivalent serial schedule by performing a topological sort of the graph. Select a node with no incoming edges to add to the serial schedule; remove that node and its outgoing edges from the graph. Repeat until all nodes have been added to the schedule. The resulting schedule will have each transaction executing before any transaction that followed it in the precedence graph. The relative order of the intermediate nodes themselves does not matter. There may be several possible serial schedules. If a cycle exists in the graph, no serial schedule is possible because eventually you will need a node to precede itself.

We will apply this test to the schedule in column (c) in Figure 11.6. Let rS(x) mean transaction S reads x, and wS(x) mean S writes x. The following string shows the reads and writes in the schedule

A string illustrating reads and writes in the schedule. r subscript s open parentheses x close parentheses, semicolon, w subscript s open parentheses x close parentheses, semicolon, r subscript T open parentheses x close parentheses, semicolon, r subscript s open parentheses y close parentheses, semicolon, w subscript T open parentheses x close parentheses, semicolon, w subscript s open parentheses y close parentheses, semicolon, r subscript T open parentheses y close parentheses, semicolon, w subscript T open parentheses y close parentheses.

FIGURE 11.7(B) shows the precedence graph. We begin by first drawing nodes for the two transactions, S and T. We read the string in order from left to right, looking for conflicts, which always involve at least one write. The first write occurs with wS(x), but this does not cause a conflict with the previous read, rS(x), because transaction S did the read and no transaction ever conflicts with itself. The first conflict is with the third entry in our list, rT(x).We see that before this, we had wS(x), which means S wrote x before T read it. So, we draw a directed edge from S to T, which essentially means S must precede T with respect to its operations on x. The next conflict comes with wT(x), which conflicts with both rS(x) and wS(x) and would require an edge from S to T, but we already have one, so we need not repeat it. The next conflict is for rT(y), which is in conflict with wS(y), meaning S wrote y before T read it. So, we would draw an edge from S to T, but it already appears. The same is true for wT(y). We see there is no cycle in the graph, which means the schedule is serializable. From the only edge on the graph, we see that S must precede T, so the schedule is equivalent to the serial schedule S,T.

Serializability can be achieved in several ways, but most systems use the locking, timestamping, or optimistic techniques that will be described in the following sections. In general, the DBMS has a concurrency control subsystem that is “part of the package” and not directly controllable by either the users or the database administrator (DBA). In locking and timestamping, a facility called a scheduler is used to allow operations to be executed immediately, delayed, or rejected. If a transaction’s operation is rejected, that transaction is aborted, but it may, of course, be restarted after the conflicting transaction completes. Optimistic techniques do not delay or reject transactions but allow them to proceed as if there were no conflict, and then check for conflicts just before a transaction commits. If a conflict occurs at commit time, the transaction is aborted.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.12.123.189