post by scott
Had an interesting, yet all too common situation last week whilst looking to tune an integration solution built on BizTalk Server. The LOB system we were integrating with had used SQL Server as it’s backend as was performing poorly…real poorly. We were seeing up to 15 sec to grab a single customer from the database!
Upon looking at the DB schema I found all the tables defined something like this:
With just an internal ID as the PK and a single column to store segments of xml data. What made this design “interesting” was the lack of any indexing on the DB. Capturing the queries being issued from the LOB system we had something like:
SELECT *
FROM [dbo].[Customers]
WHERE XmlData.exist
(' declare namespace ns0="http://BizTalkService.Customer"; /ns0:Customer[CustomerID=' declare namespace ns0="http://BizTalkService.Customer"; /ns0:Customer[CustomerID=''F001998A-E367-4B34-B630-3A70A91CA0BD''] ' ') = 1
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 16 ms.
(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 33459, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4914 ms, elapsed time = 4981 ms.
CREATE FUNCTION udf_Get_CustomerID(@xmldata XML)
RETURNS nvarchar(50)
WITH SCHEMABINDING
BEGIN
RETURN @xmldata.value
(
'declare namespace ns0="http://BizTalkService.Customer"; (/ns0:Customer/CustomerID)[1]'
(/ns0:Customer/CustomerID)[1]',
'nvarchar(50)'
)
END
GO
ALTER TABLE Customers
ADD CustomerID AS dbo.udf_Get_CustomerID(XmlData) PERSISTED
GO
CREATE INDEX ix_CustomerID ON Customers(CustomerID);
GO
SELECT *
FROM [dbo].[Customers]
WHERE CustomerID='F001998A-E367-4B34-B630-3A70A91CA0BD';
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 4 ms.
(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 1 ms.
Of course not all scenarios can rely on the client app changing its behaviour and issuing an optimised query once it see’s proper indexing. If this describes your situation you maybe interested in Selective Xml Indexes. In this article Seth Delconte writes about the new SQL Server 2012 SP1 feature that doesn't require the client to change.
FROM [dbo].[Customers]
WHERE XmlData.exist
(' declare namespace ns0="http://BizTalkService.Customer"; /ns0:Customer[CustomerID=' declare namespace ns0="http://BizTalkService.Customer"; /ns0:Customer[CustomerID=''F001998A-E367-4B34-B630-3A70A91CA0BD''] ' ') = 1
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 16 ms.
(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 33459, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4914 ms, elapsed time = 4981 ms.
CREATE FUNCTION udf_Get_CustomerID(@xmldata XML)
RETURNS nvarchar(50)
WITH SCHEMABINDING
BEGIN
RETURN @xmldata.value
(
'declare namespace ns0="http://BizTalkService.Customer"; (/ns0:Customer/CustomerID)[1]'
(/ns0:Customer/CustomerID)[1]',
'nvarchar(50)'
)
END
GO
ALTER TABLE Customers
ADD CustomerID AS dbo.udf_Get_CustomerID(XmlData) PERSISTED
GO
CREATE INDEX ix_CustomerID ON Customers(CustomerID);
GO
SELECT *
FROM [dbo].[Customers]
WHERE CustomerID='F001998A-E367-4B34-B630-3A70A91CA0BD';
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 4 ms.
(1 row(s) affected)
Table 'Customers'. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 1 ms.
Of course not all scenarios can rely on the client app changing its behaviour and issuing an optimised query once it see’s proper indexing. If this describes your situation you maybe interested in Selective Xml Indexes. In this article Seth Delconte writes about the new SQL Server 2012 SP1 feature that doesn't require the client to change.
These were taking anywhere from 1 sec up to 15 seconds. The execution plan was confirming what I’m sure you are already thinking: table scans! Simulating this query against our Customers table above with 100K rows gave us a baseline of around 5 sec to extract a single row.
Instead of xml indexing, which brings substantial storage and maintenance overhead, promoted columns were looked at. We were told by the LOB vendor that the client application optimised the queries if it found indexes defined on the table. Indexed columns would be used instead of xpath queries when they existed. But how do we get indexed columns from the table structure we had?
Here is the gist of the approach:
First, Identify the common xpath expressions being used which look appropriate to optimise. In our example this might be the CustomerID lookup. Then we create a UDF to grab this value from the xml segment stored in the row. A computed or promoted column is created using the PERSISTED keyword and an index created.
Now we can modify our query, as the LOB client did, to use the new indexed column and dramatically increase he performance of the types of queries.
Giving us stats of:
So we went from 4981 ms to 1 ms!
Obviously we need to consider the overhead this creates for write operations as we should when considering any indexing design. Only optimise the most costly queries that get issued most frequently. Overkill can lead to poorer performance as more resources are required to maintain the indexes. During our subsequent testing we didn't measure any noticeable impact. Further testing will quantify this for us in the coming weeks.
No comments:
Post a Comment
Post Your Comment...