H. [Step (3) of Restart] The before image of x wrt T, can be found in the log.
I. [Step (4) of Restart] Upon recovery from a system failure, the scheduler must wait for the acknowledgment that Restart has been completed by the RM. It may then start sending operations to the RM again.
Undo and Redo Rules
This algorithm satisfies the Undo Rule. Suppose the location of x in the stable database contains the last committed value of X, say V, written by transaction
T,. When T, wrote into x, the RM inserted [T;, X, V] in the log (see step (2) of
RM-Write). By the Garbage Collection Rule, since v is the last committed value of X, this record cannot have been removed. In particular, it will still be in the log when the CM overwrites v in the stable database, as desired for the
The Redo Rule is likewise satisfied. All of a transaction’s updates are recorded in the log before the transaction commits, whether or not they were also recorded in the stable database. By the Garbage Collection Rule they must still be there when the transaction commits.
Since the algorithm satisfies the Undo and Redo Rules, Restart can always find in stable storage the information that it needs for restoring the last committed value of each data item in the stable database. In step (3), Restart redoes an update to x by a committed transaction, or undoes an update to x by an active or aborted transaction, only if no other committed transaction subsequently updated x. Thus, when Restart terminates, each data item will contain its last committed value. Moreover, Restart is idempotent. If it is interrupted by a system failure and reexecutes from the beginning, the updates it redoes and undoes in step (3) are the same as those it would have done if its first execution had not been interrupted.
The Restart procedure just sketched may have to examine every record ever written in the log - except, of course, those that have been garbage collected,
This is still a very large number of records, since garbage collection is an expensive activity and is carried out fairly infrequently. Moreover, since most data items in the database probably contain their last committed values at the time of the failure, Restart is doing much more work than necessary This inefficiency of Restart is an important issue, because after a system failure, the
DBS is unavailable to users until Restart has finished its job.
184 CHAPTER 6 I CENTRALIZED RECOVERY
This problem is solved by the use of checkpointing methods. In general, checkpointing is an activity that writes information to stable storage during normal operation in order to reduce the amount of work Restart has to do after a failure.
Checkpointing performs its work by a combination of two types of updates to stable storage: (1) marking the log, commit list, and abort list to indicate which updates are already written or undone in the stable database, and (2) writing the after images of committed updates or before images of aborted updates in the stable database. Technique (1) tells Restart which updates don’t have to be undone or redone again. Technique (2) reduces the amount of work that Restart has to do by doing that work during checkpointing.
Technique (1) is essential to any checkpointing activity. Technique (2) is optional. One simple checkpointing scheme is periodically to stop processing transactions, wait for all active transactions to commit or abort, flush all dirty cache slots, and then mark the end of the log to indicate that the checkpointing activity took place. This is caIled commit consistent checkpointing, because the stable database now contains the last committed value of each data item relative to the transactions whose activity is recorded in the log. With commit…