Note: Full script is attached below (DBTickets.zip).
My best friend, who is an expert web developer in ASP.NET/C#/SharePoint, asked me for help on a SQL Server problem. Together we came up with some possible solutions. In this post I describe some of the things we discussed. My hope is that someone else can either learn something from this write-up or share their own solution ideas or variations.
Please note that names have been changed to protect the innocent <grin>.
My friend works for a very large company. They have an IT Help Ticket system with a SQL Server 2005 database backend. The organization's accounting department requires all help tickets to be assigned to the correct organization within the company so that cost can be correctly allocated. Each organization within the company (Section and/or Department) has a unique Organization Code. The company is made up of Sections and Sections are made up of Departments. My friend's problem was he needed help writing a query or queries to extract out Total Ticket Counts by Section and Total Ticket Counts by Section and Department for a given Date Range. He also needs to get it into a nice pretty format where executives/management have visibility of the queries results. Following are some additional details and possible solutions.
The Help Ticket database contains a column called ChargeCode that contains data the looks like the following: SSSS-DDDD. The Ss and Ds can be any character A to Z and any number 0 to 9. The first four Ss make up the Orgnization Code for the Section. The second four Ds make up the Organization Code for the Department. An example ChargeCode might be something like "1111-1234".
Note: In this example I will use just numbers for the Organization Code for simplicity's sake. In this case the Section is "1111" and the Department is "1234". Normally, ChargeCode's fit this pattern of "Section"-"Department". However there is one variation that needs to be discussed. Sections have employees and IT resources not part of a Department. When someone at the Section (not Department level) makes a request for help, the ChargeCode looks like "1111-1111". When this occurs, I call it "Section"-"Self" (There probably is a better was to classify this situation but I can't think of it right now <grin>).
So here is where I say, "Ugh! Non-normalized data!" I love it when a single column is used for multiple values - NOT! Before tackling this problem we need to create a dummy version of the Tickets table and populate it with dummy data.
Following is a script that creates a test database named DBTickets and a table named Tickets. The Tickets table only includes the columns relevant to this problem - columns corresponding to the name of the person who made the request, problem description, etc. are not included for simplicity's sake.
USE master;
GO
IF EXISTS (SELECT d.[name] FROM sys.databases d WHERE d.[name] = 'DBTickets')
DROP DATABASE DBTickets;
CREATE DATABASE DBTickets;
GO
USE DBTickets;
GO
CREATE TABLE Tickets
(
id INT IDENTITY(1, 1) PRIMARY KEY,
ChargeCode VARCHAR(9) NOT NULL,
TicketDate DATETIME NOT NULL DEFAULT GETDATE()
);
Following is a script that populates the newly created Tickets table with 1000 sample Help Desk Tickets over the time period 12/01/2009 to 12/07/2009. I created three different Sections each with a few Departments. Notice that the Organization Codes for both Sections and Departments are pulled from the same pool. In other words, Organization Codes for both Sections and Departments are globablly unique. Understanding how this sample dummy data is generated is not a core part of this problem - you can either ignore this section or study it if you wish - the end results are what matter. If you are interested in how the following script works, please note that there are likely better ways to accomplish the same end result!
SET NOCOUNT ON;
GO
CREATE TABLE DataLoad1
(
id INT PRIMARY KEY,
ChargeCode VARCHAR(9) NOT NULL
);
INSERT DataLoad1 (id, ChargeCode) VALUES (0, '0001-0001'); -- Section_0001 (SELF)
INSERT DataLoad1 (id, ChargeCode) VALUES (1, '0001-0003'); -- Section_0001 Department_0003
INSERT DataLoad1 (id, ChargeCode) VALUES (2, '0001-0004'); -- Section_0001 Department_0004
INSERT DataLoad1 (id, ChargeCode) VALUES (3, '0002-0002'); -- Section_0002 (SELF)
INSERT DataLoad1 (id, ChargeCode) VALUES (4, '0002-0005'); -- Section_0002 Department_0005
INSERT DataLoad1 (id, ChargeCode) VALUES (5, '0002-0006'); -- Section_0002 Department_0006
INSERT DataLoad1 (id, ChargeCode) VALUES (6, '0007-0007'); -- Section_0007 (SELF)
INSERT DataLoad1 (id, ChargeCode) VALUES (7, '0007-0008'); -- Section_0007 Department_0008
INSERT DataLoad1 (id, ChargeCode) VALUES (8, '0007-0009'); -- Section_0007 Department_0009
INSERT DataLoad1 (id, ChargeCode) VALUES (9, '0007-0010'); -- Section_0007 Department_0010
CREATE TABLE DataLoad2
(
id1 INT,
id2 INT
);
DECLARE @i INT;
SET @i = 0;
DECLARE @RandomNumber int;
DECLARE @RandomNumber2 int;
WHILE (@i < 1000)
BEGIN
SET @RandomNumber = RAND() * 10;
SET @RandomNumber2 = RAND() * 7;
INSERT DataLoad2 (id1, id2) VALUES (@RandomNumber, @RandomNumber2);
SET @i = @i + 1;
END;
INSERT Tickets (ChargeCode, TicketDate)
SELECT D1.ChargeCode,
CAST('12/01/2009' AS DATETIME) + D2.id2 TicketDateTime
FROM DataLoad1 D1 JOIN DataLoad2 D2
ON D1.id = D2.id1
ORDER BY CAST('12/01/2009' AS DATETIME) + D2.id2;
DROP TABLE DataLoad1;
DROP TABLE DataLoad2;
SET NOCOUNT OFF;
GO
Now the queries of the dummy Tickets table containing the dummy data starts.
First select all the records to confirm that things look OK. There should be a mix a Tickets over the date range 12/01/2009 to 12/07/2009 assigned to various Section-Department possibilites, including some that fit the Section-Self format (for example, 0007-0007). When we start to GROUP BY Section and Department we have to do some kludgy SUBSTRING stuff that I am not a big fan of - here is that "Ugh! Non-normalized data!" problem again.
SELECT T.*
FROM Tickets T
ORDER BY T.TicketDate;
The next SELECT returns the total Ticket Count for the given time period - in this case for just 12/01/2009.
DECLARE @StartDate DATETIME;
SET @StartDate = '12/01/2009 00:00:00';
DECLARE @EndDate DATETIME;
SET @EndDate = '12/01/2009 23:59:59';
SELECT COUNT(*) TicketCount
FROM Tickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate;
Total Ticket Count By Section for the given time period.
DECLARE @StartDate DATETIME;
SET @StartDate = '12/01/2009 00:00:00';
DECLARE @EndDate DATETIME;
SET @EndDate = '12/01/2009 23:59:59';
SELECT SUBSTRING(T.ChargeCode, 1, 4) Section,
COUNT(*) TicketCount
FROM Tickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY SUBSTRING(T.ChargeCode, 1, 4)
ORDER BY SUBSTRING(T.ChargeCode, 1, 4);
Total Ticket Count By Section and Department for the given time period. When Department matches Section, Department is changed to "SELF".
DECLARE @StartDate DATETIME;
SET @StartDate = '12/01/2009 00:00:00';
DECLARE @EndDate DATETIME;
SET @EndDate = '12/01/2009 23:59:59';
SELECT SUBSTRING(T.ChargeCode, 1, 4) Section,
CASE WHEN SUBSTRING(T.ChargeCode, 1, 4) = SUBSTRING(T.ChargeCode, 6, 4) THEN 'SELF'
ELSE SUBSTRING(T.ChargeCode, 6, 4) END Department,
COUNT(*) TicketCount
FROM Tickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY SUBSTRING(T.ChargeCode, 1, 4),
CASE WHEN SUBSTRING(T.ChargeCode, 1, 4) = SUBSTRING(T.ChargeCode, 6, 4) THEN 'SELF'
ELSE SUBSTRING(T.ChargeCode, 6, 4) END
ORDER BY SUBSTRING(T.ChargeCode, 1, 4),
CASE WHEN SUBSTRING(T.ChargeCode, 1, 4) = SUBSTRING(T.ChargeCode, 6, 4) THEN 'SELF'
ELSE SUBSTRING(T.ChargeCode, 6, 4) END
The next query uses the WITH ROLLUP clause to consolidate the above three queries into one.
DECLARE @StartDate DATETIME;
SET @StartDate = '12/01/2009 00:00:00';
DECLARE @EndDate DATETIME;
SET @EndDate = '12/01/2009 23:59:59';
SELECT ISNULL(X.Section, 'ALL') Section,
CASE WHEN ISNULL(X.Department, 'ALL') = X.Section THEN 'SELF'
ELSE ISNULL(X.Department, 'ALL') END Department,
COUNT(*) TicketCount
FROM (SELECT SUBSTRING(T.ChargeCode, 1, 4) Section,
SUBSTRING(T.ChargeCode, 6, 4) Department
FROM Tickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate) X
GROUP BY Section, Department WITH ROLLUP
The above four queries work but I don't like how they are implemented. I am not a big fan of mingling of what I call "Cleanup Code" with "Functional Code" in my queries. The breaking out using SUBSTRING of the ChargeCode into Section and Department are what I call "Cleanup Code". The other stuff that actually turns the raw data into meaningful information is what I call "Functional Code" - I prefer to keep them separate if possible. There are several possible options to solving this problem. Following are some of the ideas: 1) including redoing the Help Ticket System and normalizing the Tickets table properly so that Section and Department are two separate columns (chance of doing this approaches negative infinity
), 2) involves indexed persisted computed columns on the Tickets table, 3) involves new indexed columns for Section and Department and INPUT/UPDATE triggers on the Tickets table, and 4) an indexed view on the Tickets table.
My favorite solution and the solution that has the least impact on existing database structure is option #4 - an indexed view on the Tickets table that normaizes the data in the ChargeCode column and gives us a Section column and a Department column. Plus because these new view columns are indexed, query performance will be enhanced.
Following is the code to create the indexed view on the Tickets table.
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
GO
CREATE VIEW vwTickets WITH SCHEMABINDING
AS
SELECT T.id,
T.ChargeCode,
SUBSTRING(T.ChargeCode, 1, 4) Section,
SUBSTRING(T.ChargeCode, 6, 4) Department,
T.TicketDate
FROM dbo.Tickets t;
GO
CREATE UNIQUE CLUSTERED INDEX UIX_vwTickets_id ON vwTickets(id);
CREATE INDEX IX_vwTickets_Section ON vwTickets(Section);
CREATE INDEX IX_vwTickets_Department ON vwTickets(Department);
Now that the "Cleanup Code" is fixed, let's redo the above queries where the focus is the "Functional Code". This code is cleaner and should be easier to understand. And should run faster because of the indexed view.
SELECT T.*
FROM vwTickets T
ORDER BY T.TicketDate;
DECLARE @StartDate DATETIME;
SET @StartDate = '12/01/2009 00:00:00';
DECLARE @EndDate DATETIME;
SET @EndDate = '12/01/2009 23:59:59';
-- Returns Total Ticket Count
SELECT COUNT(*) TicketCount
FROM vwTickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate;
-- Returns Total Ticket Count by Section
SELECT T.Section,
COUNT(*) TicketCount
FROM vwTickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY T.Section
ORDER BY T.Section;
-- Returns Total Ticket Count by Section and Department
SELECT T.Section,
CASE WHEN T.Section = T.Department THEN 'SELF'
ELSE T.Department END Department,
COUNT(*) TicketCount
FROM vwTickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY T.Section,
CASE WHEN T.Section = T.Department THEN 'SELF'
ELSE T.Department END
ORDER BY T.Section,
CASE WHEN T.Section = T.Department THEN 'SELF'
ELSE T.Department END
-- Consolidated using WITH ROLLUP clause
SELECT ISNULL(T.Section, 'ALL') Section,
CASE WHEN ISNULL(T.Department, 'ALL') = T.Section THEN 'SELF'
ELSE ISNULL(T.Department, 'ALL') END [Department],
COUNT(*) TicketCount
FROM vwTickets T
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY Section, Department WITH ROLLUP
Ok. Things are starting to work and look ok. My friend throws in another wrinkle. They want to translate the Organization Code into Organization Name. Shouldn't be that difficult
First, let's dummy an Organization table and populate it with dummy data.
SET NOCOUNT ON;
GO
CREATE TABLE Organization
(
code varchar(4) PRIMARY KEY,
name varchar(50) NOT NULL
);
INSERT Organization (code, name) VALUES ('0001', 'Section_0001');
INSERT Organization (code, name) VALUES ('0002', 'Section_0002');
INSERT Organization (code, name) VALUES ('0003', 'Section_0001 Department_0003');
INSERT Organization (code, name) VALUES ('0004', 'Section_0001 Department_0004');
INSERT Organization (code, name) VALUES ('0005', 'Section_0002 Department_0005');
INSERT Organization (code, name) VALUES ('0006', 'Section_0002 Department_0006');
INSERT Organization (code, name) VALUES ('0007', 'Section_0007');
INSERT Organization (code, name) VALUES ('0008', 'Section_0007 Department_0008');
INSERT Organization (code, name) VALUES ('0009', 'Section_0007 Department_0009');
INSERT Organization (code, name) VALUES ('0010', 'Section_0007 Department_0010');
GO
SET NOCOUNT OFF;
Let's rewrite the queries that use the indexed view vwTickets and integrate with this new Organization lookup table to translate the Section and Department codes into Oranization Names.
DECLARE @StartDate DATETIME;
SET @StartDate = '12/01/2009 00:00:00';
DECLARE @EndDate DATETIME;
SET @EndDate = '12/01/2009 23:59:59';
-- Returns Total Ticket Count by Section (Organization Name)
SELECT Org.name Section,
COUNT(*) TicketCount
FROM vwTickets T JOIN Organization Org
ON T.Section = Org.code
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY Org.name
ORDER BY Org.name;
-- Returns Total Ticket Count by Section and Department
SELECT Org1.name Section,
CASE WHEN T.Section = T.Department THEN 'SELF'
ELSE Org2.name END Department,
COUNT(*) TicketCount
FROM vwTickets T JOIN Organization Org1
ON T.Section = Org1.code
JOIN Organization Org2
ON T.Department = Org2.code
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY Org1.name,
CASE WHEN T.Section = T.Department THEN 'SELF'
ELSE Org2.name END
ORDER BY Org1.name,
CASE WHEN T.Section = T.Department THEN 'SELF'
ELSE Org2.name END
-- Consolidated using WITH ROLLUP clause
-- Something is going on with this query where
-- the NULLS are not getting converted to ALLs.
-- Something to do with the WITH ROLLUP clause.
-- Will research.
SELECT ISNULL(Org1.name, 'ALL') Section,
CASE WHEN ISNULL(Org2.name, 'ALL') = Org1.name THEN 'SELF'
ELSE ISNULL(Org2.name, 'ALL') END Department,
COUNT(*) TicketCount
FROM vwTickets T JOIN Organization Org1
ON T.Section = Org1.code
JOIN Organization Org2
ON T.Department = Org2.code
WHERE T.TicketDate BETWEEN @StartDate AND @EndDate
GROUP BY ISNULL(Org1.name, 'ALL'),
CASE WHEN ISNULL(Org2.name, 'ALL') = Org1.name THEN 'SELF'
ELSE ISNULL(Org2.name, 'ALL') END WITH ROLLUP
I am tempted to go back and add the Organization Names for Section and Department to the indexed view vwTickets [When I attempted to do this I got a SQL Server error on the CREATE UNIQUE CLUSTERED INDEX that the view contains a self join. #fail] . This would denormalize the data even further making things even easier to understand and likely improve performance further.
The next step is to take the results from these various queries and turn them into some kind a pretty report for executives and management. That will be Part 2 of this post to follow soon!
DBTickets.zip (1.49 kb)