What you want to see on DotNetFunda.com ?
Go to DotNetFunda.com
Twitter TwitterLinkedIn
YouTubeGoogle
 Online : 6266 |  Welcome, Guest!   Register  Login
Home > Articles > Sql Server > CDC in SQL server 2008 R2 (Part II)

CDC in SQL server 2008 R2 (Part II)

1 vote(s)
Rating: 5 out of 5
Article posted by Vishvvas on 7/16/2012 | Views: 1873 | Category: Sql Server | Level: Intermediate | Points: 250 red flag


This article (Part II) discusses the intrinsic of CDC as one of the feature (of SQL server 2008 and higher) for providing the data auditing abilities

Introduction

In the previous article (http://www.dotnetfunda.com/articles/article1924-cdc-in-sql-server-2008-r2-part-i.aspx), we discussed about auditing needs in general and solutions for data auditing. Also we saw how to implement CDC and consequent results.

In this article, we would dive deep to know what the architecture of CDC and how it is structured in the SQL server 2008.

Objective


To learn about the intrinsic of CDC feature of SQL server (version 2008 and higher) To learn about the intrinsic of CDC feature of SQL server (version 2008 and higher)


Description


Note: SQL 2008 R2 is referenced and also couple of MSDN articles, image/s are referred for this write-up.

The diagram which depicts the data flow for CDC is as below

 (Image taken from http://msdn.microsoft.com/en-us/library/cc645937%28v=sql.105%29.aspx)

As seen in the diagram, the data capture process captures the data from transaction logs resulting better performance by avoiding overload on DML operations and transactions. The snapshot for captured changed data is retrieved through the query functions (as shown above) which are used for data warehouse.

CDC database tables

After reading the transaction logs, it populates the change tables. This change tables are mentioned in the Part I still we would revisit them.

[cdc].[captured_columns]

[cdc].[change_tables]

[cdc].[ddl_history]

[cdc].[index_columns]

[cdc].[lsn_time_mapping]

[cdc].[<capture_instance>_CT]

Apart from the last one, these are kind of system table for CDC and actual data would be captured in separate tables which would correspond to the database tables for which CDC is to be enabled (as indicated last in the list)

CDC Database Stored Procedures

Following stored procedures are created in database as a part of enabling the CDC for particular database.

sys.sp_cdc_add_job (Transact-SQL)

sys.sp_cdc_generate_wrapper_function (Transact-SQL)

sys.sp_cdc_change_job (Transact-SQL)

sys.sp_cdc_get_captured_columns (Transact-SQL)

sys.sp_cdc_cleanup_change_table (Transact-SQL)

sys.sp_cdc_get_ddl_history (Transact-SQL)

sys.sp_cdc_disable_db (Transact-SQL)

sys.sp_cdc_help_change_data_capture (Transact-SQL)

sys.sp_cdc_disable_table (Transact-SQL)

sys.sp_cdc_help_jobs (Transact-SQL)

sys.sp_cdc_drop_job (Transact-SQL)

sys.sp_cdc_scan (Transact-SQL)

sys.sp_cdc_enable_db (Transact-SQL)

sys.sp_cdc_start_job (Transact-SQL)

sys.sp_cdc_enable_table (Transact-SQL)

sys.sp_cdc_stop_job (Transact-SQL)

These are employed for

1.       Configure CDC

2.       Manage (i.e. create and run) the CDC agent jobs

3.       Provide the metadata to CDC consumers i.e. the one who would be utilizing CDC which could be a data warehouse of simple system for display history

CDC Database Functions

Following functions are created to retrieve the information about the changes which are captured as part of CDC.

sys.fn_cdc_has_column_changed (Transact-SQL)

sys.fn_cdc_increment_lsn (Transact-SQL)

sys.fn_cdc_decrement_lsn (Transact-SQL)         

sys.fn_cdc_is_bit_set (Transact-SQL)

sys.fn_cdc_get_column_ordinal (Transact-SQL)

sys.fn_cdc_map_lsn_to_time (Transact-SQL)

sys.fn_cdc_get_max_lsn (Transact-SQL)

sys.fn_cdc_map_time_to_lsn (Transact-SQL)

sys.fn_cdc_get_min_lsn (Transact-SQL)

cdc.fn_cdc_get_net_changes_<capture_instance> (Transact-SQL)         

cdc.fn_cdc_get_all_changes_<capture_instance> (Transact-SQL)

The last 2 functions returns the rows for change applied for data. Other functions are administrative and related to meta data.

SQL Server Agent Jobs

There are 2 SQL server agent jobs for CDC

      1.       To populate the change tables à Capture Job

2.       To clean-up the change tablesà Clean-up job

These jobs are created through stored procedures once a database table is enabled for CDC. The “Capture Job” starts immediately and removed when CDC is disabled for database. The creation and removal can be controlled by administrators.

Limitations of CDC

CDC doesn’t support the capturing of changes in case of computed columns, usage of column-set in case of sparse columns. For timestamp data type the data is converted into binary.

Querying and meta data

Let’s check out for the columns in the tables meant for captured data.

(i)                   __$operation  denotes the database operation

1: DELETE             2: INSERT             3: Before UPDATE            4: After UPDATE

(ii)                __$start_lsn  : It’s a Log Sequence Number (LSN) of data type binary(10) and is common in one transaction.

(iii)               __$end_lsn : Meant for informational purpose

(iv)              __$seqval : Represents a sequence value for order (sequence) in a transaction

(v)                __$update_mask : Serves as a bit mask for identifying which columns from change tables are changed.

Conclusion

The data auditing through CDC is really useful feature. In the part I of this series we saw practical example. We delved deep in CDC intrinsic and touch based the data flow overview, the stored procedures, agent jobs and querying etc. in this part.

Definitely there is a bit of administrative and monitoring overhead involved in CDC but it is taken care by the atomic, unified, configurable execution of capturing the changed data.

Hope these articles have helped to understand the minimum fundamental aspects of CDC for actual implementation and also implementation steps.

HAPPY PROGRAMMING!!!

Reference

http://msdn.microsoft.com/en-us/library/bb522489(v=sql.105).aspx

http://msdn.microsoft.com/en-us/library/cc645858.aspx

 



If you like this article, subscribe to our RSS Feed. You can also subscribe via email to our Interview Questions, Codes and Forums section.

Page copy protected against web site content infringement by Copyscape
Found interesting? Add this to:



Please Sign In to vote for this post.

Experience:18 year(s)
Home page:http://www.dotnetfunda.com
Member since:Monday, May 30, 2011
Level:Bronze
Status: [Member]
Biography:Over 16 years of experience, worked across different technologies and programming languages like PB,VB, ASP. C++, C#, VB.NET
>> Write Response - Respond to this post and get points
Related Posts

This article describes about how you can join different tables in SQL. It describes different join operations.

SQL Server - Crossword Puzzle - #1

SQL Server Management studio provides a very useful functionality called query shortcuts. This makes the task of querying the sql server easier. Just by pressing two keys on the keyboard a complex query can be executed.

Backup and Restore is one of the high availability model. In this article, I would like to perform Backup on Remote server and Performing Restore the backup file(.bak) from Remote Server into local SQL Server.

In this article, we will look into the Try_Convert function of SQL Server 2012 (Denali).

More ...
About Us | Contact Us | The Team | Advertise | Software Development | Write for us | Testimonials | Privacy Policy | Terms of Use | Link Exchange | Members | Go Top
General Notice: If you find plagiarised (copied) contents on this page, please let us know the original source along with your correct email id (to communicate) for further action.
Copyright © DotNetFunda.Com. All Rights Reserved. Copying or mimicking the site design and layout is prohibited. Logos, company names used here if any are only for reference purposes and they may be respective owner's right or trademarks. | 5/23/2013 11:57:43 PM