Saturday, June 27, 2009

Sql Server Interview Question:Transactions


What is Transactions?

A transaction is a sequence of operations performed as a single logical unit of work. A logical unit of work must exhibit four properties, called the ACID (Atomicity, Consistency, Isolation, and Durability) properties, to qualify as a transaction:

What is ACID?

Atomicity

A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them is performed.

Consistency

When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.


Isolation

Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions.

A transaction either sees data in the state it was in before another concurrent transaction modified it, or it sees the data after the second transaction has completed, but it does not see an intermediate state.

This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.

Durability
After a transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure.

Optimizing Transaction Log Performance
General recommendations for creating transaction log files include:

Create the transaction log on a physically separate disk or RAID (redundant array of independent disks) device. The transaction log file is written serially; therefore, using a separate, dedicated disk allows the disk heads to stay in place for the next write operation.


Set the original size of the transaction log file to a reasonable size to prevent the file from automatically expanding as more transaction log space is needed. As the transaction log expands, a new virtual log file is created, and write operations to the transaction log wait while the transaction log is expanded. If the transaction log expands too frequently, performance can be affected.


Set the file growth increment percentage to a reasonable size to prevent the file from growing by too small a value. If the file growth is too small compared to the number of log records being written to the transaction log, then the transaction log may need to expand constantly, affecting performance.

Manually shrink the transaction log files rather than allowing Microsoft® SQL Server™ 2000 to shrink the files automatically. Shrinking the transaction log can affect performance on a busy system due to the movement and locking of data pages.


What is Transaction Logs?

A database in Microsoft® SQL Server™ 2000 has at least one data file and one transaction log file. Data and transaction log information is never mixed on the same file, and individual files are used by only one database.

SQL Server uses the transaction log of each database to recover transactions. The transaction log is a serial record of all modifications that have occurred in the database as well as the transaction that performed each modification. The transaction log records the start of each transaction. It records the changes to the data and enough information to undo the modifications (if necessary later) made during each transaction. For some large operations, such as CREATE INDEX, the transaction log instead records the fact that the operation took place. The log grows continuously as logged operations occur in the database.

The transaction log records the allocation and deallocation of pages and the commit or rollback of each transaction. This allows SQL Server either to apply (roll forward) or back out (roll back) each transaction in the following ways:

A transaction is rolled forward when you apply a transaction log. SQL Server copies the after image of every modification to the database or reruns statements such as CREATE INDEX. These actions are applied in the same sequence in which they originally occurred. At the end of this process, the database is in the same state it was in at the time the transaction log was backed up.

A transaction is rolled back when you back out an incomplete transaction. SQL Server copies the before images of all modifications to the database since the BEGIN TRANSACTION. If it encounters transaction log records indicating that a CREATE INDEX was performed, it performs operations that logically reverse the statement. These before images and CREATE INDEX reversals are applied in the reverse of their original sequence.


At a checkpoint, SQL Server ensures that all transaction log records and database pages modified are written to disk.

During the recovery process of each database that occurs when SQL Server is restarted, a transaction must be rolled forward only when it is not known whether all the data modifications in the transaction were actually written from the SQL Server buffer cache to disk.

Because a checkpoint forces all modified pages to disk, it represents the point at which the startup recovery must start rolling forward transactions.

Because all pages modified before the checkpoint are guaranteed to be on disk, there is no need to roll forward anything done before the checkpoint.


What is Virtual Log Files?
Each transaction log file is divided logically into smaller segments called virtual log files. Virtual log files are the unit of truncation for the transaction log. When a virtual log file no longer contains log records for active transactions, it can be truncated and the space becomes available to log new transactions.

The smallest size for a virtual log file is 256 kilobytes (KB). The minimum size for a transaction log is 512 KB, which provides two 256-KB virtual log files. The number and size of the virtual log files in a transaction log increase as the size of the log file increases. A small log file can have a small number of small virtual log files (for example, a 5-MB log file that comprises five 1-MB virtual log files).
A large log file can have larger virtual log files (for example, a 500-MB log file that comprises ten 50-MB virtual log files).

Microsoft® SQL Server™ 2000 tries to avoid having many small virtual log files. The number of virtual log files grows much more slowly than the size. If a log file grows in small increments, it tends to have many small virtual log files.
If the log file grows in larger increments, SQL Server creates a smaller number of larger virtual log files.


For example, if the transaction log is growing by 1-MB increments, the virtual log files are smaller and more numerous compared to a transaction log growing at 50-MB increments. A large number of virtual log files can increase the time taken to perform database recovery.

As records are written to the log, the end of the log grows from one virtual log file to the next. If there is more than one physical log file for a database, the end of the log grows through each virtual log file in each physical file before circling back to the first virtual log file in the first physical file. Only when all log files are full will the log begin to grow automatically.

Write-Ahead Transaction Log

Microsoft® SQL Server™ 2000, like many relational databases, uses a write-ahead log. A write-ahead log ensures that no data modifications are written to disk before the associated log record.

SQL Server maintains a buffer cache into which it reads data pages when data must be retrieved. Data modifications are not made directly to disk, but are instead made to the copy of the page in the buffer cache.

The modification is not written to disk until either the database is checkpointed, or the modifications must be written to disk so the buffer can be used to hold a new page. Writing a modified data page from the buffer cache to disk is called flushing the page. A page modified in the cache but not yet written to disk is called a dirty page.

At the time a modification is made to a page in the buffer, a log record is built in the log cache recording the modification. This log record must be written to disk before the associated dirty page is flushed from the buffer cache to disk.

If the dirty page were flushed before the log record, it would create a modification on disk that could not be rolled back if the server failed before the log record were written to disk.

SQL Server has logic that prevents a dirty page from being flushed before the associated log record. Because log records are always written ahead of the associated data pages, the log is called a write-ahead log.

Transactions Architecture

Microsoft® SQL Server™ 2000 maintains the consistency and integrity of each database despite errors that occur in the system. Every application that updates data in a SQL Server database does so using transactions.

A transaction is a logical unit of work made up of a series of statements (selects, inserts, updates, or deletes). If no errors are encountered during a transaction, all of the modifications in the transaction become a permanent part of the database. If errors are encountered, none of the modifications are made to the database.

A transaction goes through several phases:

Before the transaction starts, the database is in a consistent state.

The application signals the start of a transaction. This can be done explicitly with the BEGIN TRANSACTION statement. Alternatively, the application can set options to run in implicit transaction mode; the first Transact-SQL statement executed after the completion of a prior transaction starts a new transaction automatically.

No record is written to the log when the transaction starts; the first record is written to the log when the application generates the first log record for a data modification.

The application starts modifying data. These modifications are made one table at a time. As a series of modifications are made, they may leave the database in a temporarily inconsistent intermediate state.

When the application reaches a point where all the modifications have completed successfully and the database is once again consistent, the application commits the transaction.


This makes all the modifications a permanent part of the database.
If the application encounters some error that prevents it from completing the transaction, it undoes, or rolls back, all the data modifications. This returns the database to the point of consistency it was at before the transaction started.

SQL Server applications can also run in autocommit mode. In autocommit mode each individual Transact-SQL statement is committed automatically if it is successful and rolled back automatically if it generates an error.

There is no need for an application running in autocommit mode to issue statements that specifically start or end a transaction.
All Transact-SQL statements run in a transaction: an explicit transaction, an implicit transaction, or an autocommit transaction.

All SQL Server transactions that include data modifications either reach a new point of consistency and are committed, or are rolled back to the original point of consistency. Transactions are not left in an intermediate state where the database is not consistent.

How do you Shrinking the Transaction Log?

The size of the log files are physically reduced when:
A DBCC SHRINKDATABASE statement is executed.
A DBCC SHRINKFILE statement referencing a log file is executed.
An autoshrink operation occurs.

Shrinking a log is dependent on first truncating the log. Log truncation does not reduce the size of a physical log file, it reduces the size of the logical log and marks as inactive the virtual logs that do not hold any part of the logical log.

A log shrink operation removes enough inactive virtual logs to reduce the log file to the requested size.

The unit of size reduction is a virtual log. For example, if you have a 600 MB log file that has been divided into six 100 MB virtual logs, the size of the log file can only be reduced in 100 MB increments. The file size can be reduced to sizes such as 500 MB or 400 MB, but it cannot be reduced to sizes such as 433 MB or 525 MB.

Virtual logs that hold part of the logical log cannot be freed. If all the virtual logs in a log file hold parts of the logical log, the file cannot be shrink until a truncation marks one or more of the virtual logs at the end of the physical log as inactive.

When any file is shrunk, the space freed must come from the end of the file. When a transaction log file is shrunk, enough virtual logs from the end of the file are freed to reduce the log to the size requested by the user.

The target_size specified by the user is rounded to the next highest virtual log boundary. For example, if a user specifies a target_size of 325 MB for our sample 600 MB file with 100 MB virtual log files, the last two virtual log files are removed and the new file size is 400 MB.

In SQL Server 2000, a DBCC SHRINKDATABASE or DBCC SHRINKFILE operation attempts to shrink the physical log file to the requested size (subject to rounding) immediately:

If no part of the logical log is in the virtual logs beyond the target_size mark, the virtual logs after the target_size mark are freed and the successful DBCC statement completes with no messages.

If part of the logical log is in the virtual logs beyond the target_size mark, SQL Server 2000 frees as much space as possible and issues an informational message. The message tells you what actions you need to perform to get the logical log out of the virtual logs at the end of the file. After you perform this action, you can then reissue the DBCC statement to free the remaining space.

What is Truncating the Transaction Log?
If log records were never deleted from the transaction log, the logical log would grow until it filled all the available space on the disks holding the physical log files.

At some point in time, old log records no longer necessary for recovering or restoring a database must be deleted to make way for new log records. The process of deleting these log records to reduce the size of the logical log is called truncating the log.

What is Transaction Recovery?
Every Microsoft® SQL Server™ 2000 database has a transaction log that records data modifications made in the database.

The log records the start and end of every transaction and associates each modification with a transaction. An instance of SQL Server stores enough information in the log to either redo (roll forward) or undo (roll back) the data modifications that make up a transaction.

Each record in the log is identified by a unique log sequence number (LSN). All of the log records for a transaction are chained together.

An instance of SQL Server records many different types of information in the transaction log. Instances of SQL Server 2000 primarily log the logical operations performed. The operation is reapplied to roll forward a modification, and the opposite of the logical operation is performed to roll back a modification.

What is dirty pages?

Each instance of SQL Server controls when modifications are written from its data buffers to disk. An instance of SQL Server may cache modifications in buffers for a period of time to optimize disk writes.

A buffer page that contains modifications that have not yet written to disk is known as a dirty page. Writing a dirty buffer page to disk is called flushing the page. When modifications are cached, care must be taken to ensure that no data modification is flushed before the corresponding log image is written to the log file. This could create a modification that could not be rolled back if necessary.

To ensure that they can recover all modifications, instances of SQL Server use a write-ahead log, which means that all log images are written to disk before the corresponding data modification.
A commit operation forces all log records for a transaction to the log file so that the transaction is fully recoverable even if the server is shut down.

A commit operation does not have to force all the modified data pages to disk as long as all the log records are flushed to disk. A system recovery can roll the transaction forward or backward using only the log records.

What are the ways of Starting Transactions?
You can start transactions in Microsoft® SQL Server™ as explicit, autocommit, or

implicit transactions.
Explicit transactions

Explicitly start a transaction by issuing a BEGIN TRANSACTION statement.

Autocommit transactions

This is the default mode for SQL Server. Each individual Transact-SQL statement is committed when it completes. You do not have to specify any statements to control transactions.

Implicit transactions
Set implicit transaction mode on through either an API function or the Transact-SQL SET IMPLICIT_TRANSACTIONS ON statement. The next statement automatically starts a new transaction. When that transaction is completed, the next Transact-SQL statement starts a new transaction.

Connection modes are managed at the connection level. If one connection changes from one transaction mode to another it has no effect on the transaction modes of any other connection


How do you Ending Transactions?

You can end transactions with either a COMMIT or ROLLBACK statement.

COMMIT

If a transaction is successful, commit it. A COMMIT statement guarantees all of the transaction's modifications are made a permanent part of the database. A COMMIT also frees resources, such as locks, used by the transaction.

ROLLBACK

If an error occurs in a transaction, or if the user decides to cancel the transaction, then roll the transaction back. A ROLLBACK statement backs out all modifications made in the transaction by returning the data to the state it was in at the start of the transaction. A ROLLBACK also frees resources held by the transaction.

What does the BEGIN TRANSACTION do?
Marks the starting point of an explicit, local transaction. BEGIN TRANSACTION increments @@TRANCOUNT by 1.
BEGIN TRANSACTION represents a point at which the data referenced by a connection is logically and physically consistent.

If errors are encountered, all data modifications made after the BEGIN TRANSACTION can be rolled back to return the data to this known state of consistency.

Each transaction lasts until either it completes without errors and COMMIT TRANSACTION is issued to make the modifications a permanent part of the database, or errors are encountered and all modifications are erased with a ROLLBACK


TRANSACTION statement.

BEGIN TRANSACTION starts a local transaction for the connection issuing the statement. Depending on the current transaction isolation level settings, many resources acquired to support the

Transact-SQL statements issued by the connection are locked by the transaction until it is completed with either a COMMIT TRANSACTION or ROLLBACK TRANSACTION statement. Transactions left outstanding for long periods of time can prevent other users from accessing these locked resources.

Although BEGIN TRANSACTION starts a local transaction, it is not recorded in the transaction log until the application subsequently performs an action that must be recorded in the log, such as executing an INSERT, UPDATE, or DELETE statement.

An application can perform actions such as acquiring locks to protect the transaction isolation level of SELECT statements, but nothing is recorded in the log until the application performs a modification action.
Naming multiple transactions in a series of nested transactions with a transaction name has little effect on the transaction. Only the first (outermost) transaction name is registered with the system.

A rollback to any other name (other than a valid savepoint name) generates an error. None of the statements executed before the rollback are in fact rolled back at the time this error occurs. The statements are rolled back only when the outer transaction is rolled back.

BEGIN TRANSACTION starts a local transaction. The local transaction is escalated to a distributed transaction if the following actions are performed before it is committed or rolled back:

An INSERT, DELETE, or UPDATE statement is executed that references a remote table on a linked server. The INSERT, UPDATE, or DELETE statement fails if the OLE DB provider used to access the linked server does not support the ITransactionJoin interface.

A call is made to a remote stored procedure when the REMOTE_PROC_TRANSACTIONS option is set to ON.


What does the COMMIT TRANSACTION do?


Marks the end of a successful implicit or user-defined transaction. If @@TRANCOUNT is 1, COMMIT TRANSACTION makes all data modifications performed since the start of the transaction a permanent part of the database, frees the resources held by the connection, and decrements @@TRANCOUNT to 0.

If @@TRANCOUNT is greater than 1, COMMIT TRANSACTION decrements @@TRANCOUNT only by 1.

It is the responsibility of the Transact-SQL programmer to issue COMMIT TRANSACTION only at a point when all data referenced by the transaction is logically correct.

If the transaction committed was a Transact-SQL distributed transaction, COMMIT TRANSACTION triggers MS DTC to use a two-phase commit protocol to commit all the servers involved in the transaction.

If a local transaction spans two or more databases on the same server, SQL Server uses an internal two-phase commit to commit all the databases involved in the transaction.

When used in nested transactions, commits of the inner transactions do not free resources or make their modifications permanent.

The data modifications are made permanent and resources freed only when the outer transaction is committed.

Each COMMIT TRANSACTION issued when @@TRANCOUNT is greater than 1 simply decrements @@TRANCOUNT by 1. When @@TRANCOUNT is finally decremented to 0, the entire outer transaction is committed.

Because transaction_name is ignored by SQL Server, issuing a COMMIT TRANSACTION referencing the name of an outer transaction when there are outstanding inner transactions only decrements @@TRANCOUNT by 1.

Issuing a COMMIT TRANSACTION when @@TRANCOUNT is 0 results in an error that there is no corresponding BEGIN TRANSACTION.

You cannot roll back a transaction after a COMMIT TRANSACTION statement is issued because the data modifications have been made a permanent part of the database.


What does the SAVE TRANSACTION do?


Sets a savepoint within a transaction.
A user can set a savepoint, or marker, within a transaction. The savepoint defines a location to which a transaction can return if part of the transaction is conditionally canceled.

If a transaction is rolled back to a savepoint, it must proceed to completion (with more Transact-SQL statements if needed and a COMMIT TRANSACTION statement), or it must be canceled altogether (by rolling the transaction back to its beginning).

To cancel an entire transaction, use the form ROLLBACK TRANSACTION transaction_name. All the statements or procedures of the transaction are undone.
SAVE TRANSACTION is not supported in distributed transactions started either explicitly with BEGIN DISTRIBUTED TRANSACTION or escalated from a local transaction.

What is Explicit Transactions?
An explicit transaction is one in which you explicitly define both the start and end of the transaction. Explicit transactions were also called user-defined or user-specified transactions in earlier versions of Microsoft® SQL Server™.

What does the @@TRANCOUNT do?
Returns the number of active transactions for the current connection.

No comments:

Post a Comment