SQL – Common Table Expressions (CTE) and HierarchyId

SQL Server 2008/R2/2012 offers hierarchyid data type that can be used for handling hierarchical data. In this blog I will try to take you down the rabbit hole and explore common table expressions and hierarchyid.

Quite often we have  to deal with hierarchical or recursive data. e.g.

Product Category (Entertainment Device)

     Product Sub-category (TV)

         Product Sub-sub-category (LCD)

              Product Sub-sub-sub-category (HD)

                      …..

I guess you get the idea.

 

Another example would be Employee – Manager relationship where every manager is an employee (but not every employee is a manager). Here’s a sample diagram:

image

So, how do we store this data in our table and then how do we query it? We have two options:

1. Common Table Expressions – CTE

This works in older versions of SQL Server 2005 and also the newer versions like 2008/R2, 2012. We use the traditional data types like int etc. to handle relationship.

Let’s first create the table so we can store the records and then we will query it.

   1:  CREATE TABLE dbo.Employee
   2:   ( 
   3:     EmployeeId int NOT NULL IDENTITY(1,1) PRIMARY KEY, 
   4:     Name varchar(50) NOT NULL, 
   5:     ManagerId int 
   6:  ) 
   7:  GO

 

Next step is to populate the table with the data (note: I am using same data that I have in my diagram)

   1:  INSERT INTO dbo.Employee (Name, ManagerId) 
   2:  VALUES 
   3:  ('CEO', NULL), 
   4:  ('Sales Manager', 1), 
   5:  ('Accounts Manager', 1), 
   6:  ('Senior Sales Person 1', 2), 
   7:  ('Senior Sales Person 2', 2), 
   8:  ('Senior Accounts Person 1', 3), 
   9:  ('Senior Accounts Person 2', 3), 
  10:  ('Junior Sales Person 1', 4), 
  11:  ('Junior Sales Person 2', 5), 
  12:  ('Junior Accounts Person 1', 6), 
  13:  ('Junior Accounts Person 2', 7) 

Now we are ready to use common table expressions to query the table.

   1:  --Finding descendants
   2:  WITH EmployeeDescendants(EmployeeId, Name, ManagerId) 
   3:  AS 
   4:  ( 
   5:  select EmployeeId, Name, ManagerId from dbo.Employee 
   6:  WHERE EmployeeId = 1 -- this is parameter
   7:  union all 
   8:  select m.EmployeeId, m.Name, m.ManagerId 
   9:  from EmployeeDescendants CTE,dbo.Employee m 
  10:  WHERE CTE.EmployeeId = m.ManagerId 
  11:  ) 
  12:  SELECT EmployeeId, Name, ManagerId 
  13:  FROM EmployeeDescendants 
  14:  ORDER BY EmployeeId
  15:   
  16:  --Finding ancestors
  17:   
  18:  WITH EmployeeAncestor(EmployeeId, Name, ManagerId) 
  19:  AS 
  20:  ( 
  21:  select EmployeeId, Name, ManagerId from dbo.Employee 
  22:  WHERE EmployeeId = 11 -- this is parameter
  23:  union all 
  24:  select m.EmployeeId, m.Name, m.ManagerId 
  25:  from EmployeeAncestor CTE,dbo.Employee m 
  26:  WHERE CTE.ManagerId = m.EmployeeId 
  27:  ) 
  28:  SELECT EmployeeId, Name, ManagerId 
  29:  FROM EmployeeAncestor 
  30:  ORDER BY EmployeeId ASC
  31:   

Common Table Expressions aren’t too bad, eh. You could easily convert the above script into a stored procedure and pass arguments to parameters.

Query Execution Plan for finding descendant nodes looks like this (Note: Plan for finding ancestors is similar):

image

Ouch, lots is going on in the query execution plan.

2. HierarchyId

This won’t work in SQL Server 2005. You must have newer versions like 2008/R2, 2012. We use the newer managed data type called “hierarchyid” to handle relationship.

Let’s first create the table so we can store the records and then we will query it.

   1:  CREATE TABLE [dbo].[EmployeeNew]
   2:  ( 
   3:      [EmployeeId] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY, 
   4:      [Name] [varchar](50) NOT NULL, 
   5:      [ManagerId] [int] NULL, 
   6:      [OrgNode] [hierarchyid] NULL
   7:  )
   8:   
   9:  GO 

Next step is to populate the table with sample records

   1:  INSERT INTO dbo.EmployeeNew(Name, ManagerId, OrgNode) 
   2:  VALUES 
   3:  ('CEO', NULL, hierarchyid::Parse('/')), 
   4:  ('Sales Manager', 1, hierarchyid::Parse('/1/')), 
   5:  ('Accounts Manager', 1, hierarchyid::Parse('/2/')), 
   6:  ('Senior Sales Person 1', 2, hierarchyid::Parse('/1/1/')), 
   7:  ('Senior Sales Person 2', 2, hierarchyid::Parse('/1/2/')), 
   8:  ('Senior Accounts Person 1', 3, hierarchyid::Parse('/2/1/')), 
   9:  ('Senior Accounts Person 2', 3, hierarchyid::Parse('/2/2/')), 
  10:  ('Junior Sales Person 1', 4, hierarchyid::Parse('/1/1/1/')), 
  11:  ('Junior Sales Person 2', 5, hierarchyid::Parse('/1/2/1/')), 
  12:  ('Junior Accounts Person 1', 6, hierarchyid::Parse('/2/1/1/')), 
  13:  ('Junior Accounts Person 2', 7, hierarchyid::Parse('/2/2/1/')) 
     

Time to test the table:

   1:  --display all records 
   2:  select orgnode.ToString() AS OrgNodeString,orgnode.GetLevel()  Level,* 
   3:  from dbo.EmployeeNew 
   4:   
   5:   
   6:  --getting root 
   7:  DECLARE @root hierarchyid = hierarchyid::GetRoot(); 
   8:  SELECT @root, @root.ToString()
   9:   
  10:  --employee we are interested in 
  11:  DECLARE @node hierarchyid 
  12:  SELECT @node = OrgNode FROM dbo.EmployeeNew 
  13:  WHERE Name = 'Sales Manager'; 
  14:  SELECT @node, @node.ToString()
  15:   
  16:  --get ancestors 
  17:  SELECT * FROM dbo.EmployeeNew WHERE @node.IsDescendantOf(orgnode) = 1;
  18:   
  19:  --get descendants 
  20:  SELECT * FROM dbo.EmployeeNew WHERE OrgNode.IsDescendantOf(@node) = 1;
  21:   
  22:  --get by level (junior employees) 
  23:  SELECT * FROM dbo.EmployeeNew 
  24:  WHERE OrgNode.GetLevel() = 3;

And the query execution plan for finding descendants looks like this (Note: query execution plan for finding ancestors is similar):

image

That query execution plan is sweet and simple, eh. But, don’t let it mislead you. Under the hood there’s still lots going on. Bulk of the work is getting performed by the managed functions like IsDescendantOf, GetLevel etc.

HierarchyId definitely reduces the lines of code required to query the hierarchical or recursive data.

 

Conclusion

Use hierarchyid if you have newer SQL Server like 2008 and onwards. It’s designed just to tackle hierarchical scenarios.

Use CTE if you have older SQL Server like 2005. However, CTE is a generic query pattern which means it can be used even when we don’t have hierarchical data.

Power View – Highlight, Filter and Slicer

SQL Server 2012 brings Power View (aka. Project Crescent) which supports highlight filter and slicers. They are pretty much similar to PowerPivot. Let’s discuss when these features should be used

Let’s take the following scenario where we have a table as well a chart on the same Power View report:

image

Highlight:

It let’s you focus on the interested data. If you highlight a table it won’t do much except for highlighting the cell in the table.

image

But, if you highlight a bar in the chart then it filters the table control automatically

image

Filter:

Filter can be utilized at individual control level e.g. table, matrix, card, chart etc. It cuts down on the data that’s being displayed on the Power View canvas. Filter at control level doesn’t effect other controls on the canvas.

Here’s the filter applied on table (Note: applying filter on chart will have similar effect):

image

One benefit of filter is that it has it’s own section (the right hand side window which can be closed or opened) where it gets displayed so it doesn’t consume real estate on the screen, well at least not all the time. You can choose to display filter pane whenever you require it.

Slicer:

Slicer is a filter but it’s scope is broader than a filter. Slicer filters data at the View level. One drawback of slicer is that it’s always displayed in the Power View canvas which means it reduces the real estate on the screen for controls that we can use.

Here we have a slicer on the right hand side and it filters all the controls on the canvas:

image

 

Summary:

  • Use highlight for showing areas of interest as well for filtering (especially chart control)
  • Use filter when you have to filter individual controls without effecting other controls
  • Use slicer when you have to filter everything on the canvas

Power View – Power to the User

Power Builder, PowerPivot. Let me guess what word is common in these technologies ,“POWER”.  Quite a few products and technologies have used the word “Power” to give a sense of empowerment to the users. And, I guess they deserve to use the name as they have delivered the results. Now there’s another one to join the family and that’s Power View (aka. Project Crescent) in SQL Server 2012.

 

So far Power View seems pretty good. It does empower the end users. Here’s the requirements for using Power View:

  • SharePoint 2010 Enterprise edition with SP1
  • SQL Server 2012 PowerPivot (SharePoint must be configured as well)
  • Silverlight 5 (What?? No HTML 5 or AJAX?? Sad smile)

 

Yes, you read it right. It’s a Silverlight 5 based application and it won’t run on iOS. That’s where Microsoft should have offered AJAX alternative as well so those who don’t have Silverlight plug-in could use it as well. If you could just come over this one itsy bitsy drawback Power View is definitely an excellent self-servicing BI feature for business users.

As far as data source goes, Power View requires one of the following:

  • Tabular Model (New in SSAS 2012)
  • PowerPivot based Excel report (this could use OLTP, OLAP or Tabular Model)

Here are some of the graphical visualizations that we can create in Power View:

  • Charts (Horizontal and Vertical)

image

  • Scatter Chart (with Play Axis. Simply love the animation for showing breakdown of the data over period of time)

image

  • Charts (with Tiles)

image

 

  • Views (OK, this is not really a visualization since it’s available out-of-the-box but I just got to mention it since it’s soooooo good. Just like Excel has sheets concept, PowerPoint has slides concept the Power View has views concept. It lets you add multiple views of the data in the same report. You could say that it’s Power View’s equivalent to Sub-report in SSRS)

image

Here are the pros, cons and the personal verdict (it’s by no means an exhaustive list of pros and cons)

Pros:

  • Easy to create reports in a matter of minutes
  • No programming / SQL knowledge is required
  • Out-of-the-Box animations and visualizations (Play Axis stole my heart away)
  • Easy to filter reports at control level or at the report level

Cons:

  • Uses Silverlight plug-in
  • Yet to see any support for creating custom extensions or plug-ins to develop custom visualizations and animations

 

Personal Verdict:

Excellent reporting tool for business users and delivers well considering it’s the first iteration of Power View. Definitely looking forward to it’s future iterations.