Pages

Saturday, October 16, 2010

Change Data Capture Known as CDC

Change Data Capture popularly known as CDC is a new feature in SQL Server 2008 that records insert, update and delete activity in SQL Server tables.  A good example of how this feature can be used is in performing periodic updates to a data warehouse.  CDC records (or captures) DML activity on designated tables. CDC works by scanning the transaction log for a designated table's 'captured columns' whose content has changed and then making those changes available for data synchronizing purposes in a relational format. As you can see this feature in entrenched in transaction log architecture and thus a lot of the metadata in CDC is related around the concept of a Log Sequence Number (LSN). After performing some setup and configuration steps (which we will cover below), CDC will begin scanning the database transaction log for changes to certain tables that you specify, and will insert these changes into change tables.  These change tables are created during the setup and configuration process.  The setup and configuration process will also create table-valued functions which can be used to query for the changes.  You use the table-valued functions in lieu of querying the underlying change tables directly.  Based on this high level description, let's proceed to the demo.

Use the below query to check which databases is enabled for CDC.

select name,is_cdc_enabled from sys.databases where is_cdc_enabled = 1

Enabling CDC for database.

use <DatabaseName>

GO

Exec sys.sp_cdc_enable_db

GO

Replace <DatabaseName> with the database name on which the you want to enable CDC.

The above query enables CDC on database level.

Note: Just enabling CDC on database level does not capture any data change to any table.

Enabling CDC for Database Table.

These are parameter which we need to set for enabling CDC for new investran database tables for our environment.

exec sys.sp_cdc_enable_table

@source_schema = '<schemaname>',

@source_name = '<TableName>',

@role_name = '<CDCRoleName>',

@supports_net_changes = 1 or 0,

@captured_column_list = '[Column1],[ Column1],...,[ ColumnN]',

@filegroup_name = '<FileGroupName>';

Replace <schemaname>,<TableName>,<CDCRoleName>,[Column1],[ Column1],...,[ ColumnN] and <FileGroupName> as per your database details where

@source_schema is the name of the schema in which the source table belongs.

@source_name is the name of the source table on which to enable change data capture

@role_name is the name of the database role used to gate access to change data

@supports_net_changes indicates whether support for querying for net changes is to be enabled for this capture instance

@captured_column_list identifies the source table columns that are to be included in the change table

@filegroup_name is the filegroup to be used for the change table created for the capture instance

After running the above query two jobs <Databasename>_Capture and <Databasename>_Change get created. Once these jobs are running all the changes made to CDC enabled table gets captured in CDC tables.

Will cover more in next articles: Follow below link for Enabling CDC while restoring a Database

Enabling CDC while Restoring database

No comments:

Post a Comment