Bluedog67

Random Thoughts About SQL Server and My Dog Blue

About Bluedog67

My name is Stephen Horne. I am a well-trained owner of an Australian Cattle Dog named Blue. I also develop software mainly using SQL Server, ASP.NET, and C#. I love data, databases, T-SQL, and turning raw data into actionable information. My plan for this blog is to write about SQL Server and sometimes my dog Blue. Please contact me at stephen at bluedog67 dot com. Follow me on Twitter at bluedog67.

Simple Table Data Change History Solution

Introduction


This post describes a possible simple solution if you need to keep track of the data change history (or data auditing) of a SQL Server table. This solution tracks all INSERT, UPDATE, and DELETE operations and provides visibility of the following:
  • When a new row was created, what database login created the new row, and what values were inserted in all columns in the table.
  • All the times that existing rows were edited, what database logins edited the existing rows, and the starting and ending values for all columns in the table before and after the edit.
  • When an existing row was deleted, what database login deleted the row, and what the values in all columns were before deletion.
This solution can be limited to tracking the data change history for a single table or can include all tables in a database or somewhere in between.

Please note that there are some some concerns and considerations associated with implementing this solution. If you proceed with implementing this solution or something similar in a production environment you will need to take these into account. I will discuss some of these concerns and considerations later in this posting.

Note: For more robust change history/auditing you can take a look at SQL Server's Change Tracking and Change Data Capture, plus some third-party solutions. You may also find alternative solutions that others have shared that may be a better solution or work better in your environment.

Solution Components


There are three main components to this solution:
  1. Transaction Table - this is the database table that you want to track the data change history for.
  2. History Table - this will be a new table that you create to store the data change history for the associated Transaction Table.
  3. Three History Triggers - these triggers will be associated with the Transaction Table. One of the three triggers corresponds to each of the three data change operations (INSERT, UPDATE, and DELETE). When one of these three commands is executed against the Transaction Table the corresponding trigger fires and, if appropriate, data row(s) are written to the associated History Table.

Transaction Table


One of the basic assumptions of this solution is that the Transaction Table has a Primary Key and that the Primary Key preferrably be a Surrogate Key and not a Natural Key. A Surrogate Key is a constructed column or set of columns that do not have "real" meaning but are used to uniquely identify each row in the Transaction Table. The most typical Surrogate Key is a single integer column defined as an IDENTITY column. A Natural Key is column or columns that do have "real" meaning independent of the database table. A Natural Key is not a constructed column or set of columns. The value or values that make up the Natural Key have meaning. A Natural Key can be used to uniquely identify each row in the table.

This data change history solution assumes that the Primary Key is unchanging. This is the reason for preferring a Surrogate Key as the Primary Key. Because a Surrogate Key is a constructed value and doesn't have meaning the possiblity of its value ever needing to be changed for a given row is very low, plus being defined as an IDENTITY columns prevents updates. A Natural Key works as long as it *never* changes. As we progress through the details of this solution it should become clear why there is this requirement that the Primary Key values never change. If the Primary Key is a Natural Key that can possibly change there are additional requirements that need to be implemented for this solution to work. I will discuss these additional requirements in a future posting.

Let's take a look a simple Transaction Table with a Surrogate Key as the Primary Key:

CREATE TABLE dbo.Customer
(
CustomerID int NOT NULL IDENTITY (1, 1),
LastName varchar(50) NOT NULL,
FirstName varchar(50) NOT NULL,
Email varchar(50) NOT NULL
)
GO
ALTER TABLE dbo.Customer ADD CONSTRAINT PK_Customer PRIMARY KEY (CustomerID)
GO

History Table


Following is what we want to end up for the History Table for the above Customer Transaction Table:

CREATE TABLE dbo.History_Customer
(
ChangeDateTime datetime NULL,
ChangeType char(1) NULL,
ChangeLogin nvarchar(128) NULL,
CustomerID int NULL,
LastName varchar(50) NULL,
FirstName varchar(50) NULL,
Email varchar(50) NULL
)

Following is a listing of the main characteristics of the new History Table:
  • No Primary Key - the History Table is effectively a log. Uniquely identifying each row in the History Table is less critical than the capturing of the data change history. If you must have a Primary Key in the History Table I will describe how to implement this in a future posting.
  • All Columns Allow NULLs - again, the History Table is effectively a log. Allowing NULLs is the more permissive (that is, less restrictive). If future data structure change in the Transaction Table where a column changes from No NULLs to Allow NULLs, the History table will still work properly.
  • New Column: ChangeDateTime - this column will store the date and time of the INSERT, UPDATE, or DELETE operation.
  • New Column: ChangeType - this column will store what operation occurred on a particular row. The possible values and their corresponding operations follow:
    • I - INSERT
    • D - DELETE
    • U - UPDATE
  • New Column: ChangeLogin - this column stores the login (SQL Server or Windows) that issued the INSERT, UPDATE, or DELETE command.
Using SSMS (SQL Server Management Studio) you can generate the CREATE TABLE script for existing tables. A CREATE TABLE script for an existing Transcation Table could be modified to add the 3 new columns, remove the IDENTITY definition for the Primary Key, and convert any No NULLs columns to Allow NULLs. This is probably the preferred method if the underlying Transaction Table is empty and has no existing or legacy data. Legacy data is a bit more problematic to handle. The next paragraph discusses the recommend steps to take to account for legacy data in your Transacation Table. The History Table creation trick used to account for the legacy data also works if the Transaction Table is empty so it may become your default way of creating History Tables.

RDBMSes have a command to create a new table from a SELECT statement. In SQL Server the command is the SELECT INTO command. In other RDBMSes such as Oracle and MySQL the command is a bit clearer. The command is the CREATE TABLE AS SELECT command. I personnally would like Microsoft to add the CREATE TABLE AS SELECT command as an alternative to the SELECT INTO.

If you created the History_Customer table above go ahead and drop it. Run the following script to create the History_Customer History Table using the SELECT INTO command:

select GETDATE() AS [ChangeDateTime],
'I' AS [ChangeType],
SYSTEM_USER AS [ChangeLogin],
*
INTO dbo.History_Customer
FROM dbo.Customer

Next in SSMS open the new History_Customer table in Design mode and remove any and all IDENTITY definitions (the CustomerID column) and change all the No NULLs to Allow NULLs.

If your Transaction Table was empty the associated new History Table will also be empty. Conversely, if your Transaction Table contained legacy data the associated new History Table contains one row for every row in the Transaction Table. Because you cannot create the history of your Transaction Table out of thin air, we are "pretending" that they were all created (ChangeType equals I for INSERT) at the moment the History Table was created. An alternative to choosing "now" or GETDATE() as the value of the ChangeDateTime column for this legacy data is to choose some arbitrary date (for example, 01/01/1999) as the "magic" date of the History Table rows. It is critical the the legacy data be accounted for within the History Table. As you will see in the Triggers section below, the UPDATE trigger only captures the new values - the old/original values prior to the UPDATE are obtained from either when a row was INSERTed or when the last UPDATE occurred.

Three History Triggers


Important Note: Please recall that within SQL Server INSERT, UPDATE, and DELETE triggers are set operations and NOT a per row operation.

INSERT Trigger:

CREATE TRIGGER tr_History_Customer_Insert
ON dbo.Customer
FOR INSERT
AS
BEGIN
INSERT INTO History_Customer
SELECT GETDATE() AS [ChangeDateTime],
'I' AS [ChangeType],
SYSTEM_USER AS [ChangeLogin],
*
FROM inserted
END

UPDATE Trigger:

CREATE TRIGGER tr_History_Customer_Update
ON dbo.Customer
FOR UPDATE
AS
BEGIN
INSERT INTO History_Customer
SELECT GETDATE() AS [ChangeDateTime],
'U' AS [ChangeType],
SYSTEM_USER AS [ChangeLogin],
*
FROM inserted
END

DELETE Trigger:

CREATE TRIGGER tr_History_Customer_Delete
ON dbo.Customer
FOR DELETE
AS
BEGIN
INSERT INTO History_Customer
SELECT GETDATE() AS [ChangeDateTime],
'D' AS [ChangeType],
SYSTEM_USER AS [ChangeLogin],
*
FROM deleted
END

These three separate triggers could be consolidated into a single trigger with additional logic, but I prefer keeping them separate.

Testing


At this point you can test the operation of the new History Table and three triggers by performing various INSERTs, UPDATEs, and DELETEs against your Transaction Table and then querying the History Table to confirm the results match expectations. It is useful to sort by the ChangeDateTime column in descending order and limiting your query to a single Primary Key value. This will allow you to see the full data change history for a single row in your Transaction Table. Following is a simple query that does just that.

SELECT *
FROM dbo.History_Customer
WHERE CustomerID = 67
ORDER BY ChangeDateTime DESC

Concerns and Considerations


Following is a listing with descriptions of possible concerns and considerations to take into account if implementing this solution in a production environment (in future posting I will try and provide possible solutions/workarounds to some of these issues):
  • Transaction Table Structural Changes (Very Important!) - Most changes to the Transaction Table (new column, column name change, significant data type/size change, etc.) will cause the triggers to fail and thus the INSERT/UPDATE/DELETE operation against the Transaction Table will be rolled back (An error will be thrown. If the error is not properly handled, data could be lost). The Transaction Table structure and the History Table structure *MUST* be kept in synch. If a column is added to the Transaction Table is also must be added at the same time to the History Table.
  • Storage Requirements - depending upon the transaction rate the History Table(s) could grow very very rapidly and consume/require a great deal of hard drive space.
  • History Table Query Performance - again depending upon the transaction rate the History Table(s) could be very large. The more rows in a table the slower queries are likely to run. Adding the appropriate indexes will likely improve History Table query performance, but this will also increase the storage requirements.
  • INSERT/UPDATE/DELETE Transaction Speed - If your system requires fast transaction speed this solution may be problematic - the additional logic being executed within the triggers could possibly harm your transaction speed.
There are possibly other concerns and considerations associated with this possible solution. If I become aware of additional issues I will update this post.

Conclusion


In this post I have provided a possible simple solution to capturing table data change history. As mentioned within the post I plan to provide additional details and expand on some of the topics covered. In addition, I have some ideas about other related posts including selective history tracking (possibly capturing history only when a status flag changes, for example), possibly using either the INFORMATION_SCHEMA or system views to develop a database-wide implementation of this solution, plus possibly implementing some sort of WinForms or Web solution for building the History Tables and Triggers and easy viewing of the data change history. We shall see.

Thank you for your time in reading this article. If you have any corrections, questions, ideas, or criticism please send them my way. Thanks!

Permalink | Comments (0) | Post RSSRSS comment feed

Comments