Florian Reischl

Saturday, September 26, 2009

High Performance String Split Functions

To split a string into pieces is a quiet common task in T-SQL and there are many different solutions to handle this. So, why yet another one? This article should not only show yet another way to split strings, but it tries to compare all the different approaches. Before the competition starts, let me present the participants.

WHILE Loop
Numbers Table
XML
SQL CLR

First I want to show you the different approaches. At the end of the article I'll show you some performance comparisons.

WHILE Loop
If you search the internet for functions to split strings you will primary find solutions based on WHILE loops. Since this is the most common approach we should start with it.

How does the cursor work?

The common topology is to start at the first position of the string and search the first occurrence of the specified separator by using CHARINDEX. Now use SUBSTRING to get the first item from position 1 to the first found separator position; now find the next item starting at the previously found position and so on.
Other versions work with PATINDEX instead of CHARINDEX but this is much slower and not needed for the requirement.

This should be a quiet fast WHILE loop based solution.

IF (OBJECT_ID('tvf_SplitString_Cursor') IS NULL)
   EXECUTE('CREATE FUNCTION tvf_SplitString_Cursor() 
            RETURNS @t TABLE (Id INT) AS 
            BEGIN INSERT INTO @t VALUES(1) 
            RETURN END');
GO
ALTER FUNCTION tvf_SplitString_Cursor
(
   @text NVARCHAR(MAX), 
   @separator NVARCHAR(255)
)
   RETURNS @ReturnData TABLE (Item NVARCHAR(4000))
AS
BEGIN
   DECLARE 
      @pos           INT
      ,@next         INT
      ,@separatorLen INT = LEN(@separator);
   

   -- Get the first occurence to get the item before the first delimiter
   SELECT @pos = CHARINDEX(@separator, @text, 1);
   
   -- Insert the first item
   INSERT INTO @ReturnData
      SELECT SUBSTRING(@text, 1, @pos - 1);
   
   -- Step over the first delimiter
   SELECT @pos = @pos + @separatorLen;

   WHILE (1 = 1)
   BEGIN
      -- Get the next delimiter position from our previous position
      SELECT @next = CHARINDEX(@separator, @text, @pos);
      
      IF (@next = 0) BREAK -- nothing more to do

      -- Insert the next found item
      INSERT INTO @ReturnData
         SELECT SUBSTRING(@text, @pos, @next - @pos);

      -- Step of the delimiter
      SELECT @pos = @next + @separatorLen;
   END

   -- Due to the fact that our cursor only jumps from delimter 
   -- to delimiter the last item would be lost
   INSERT INTO @ReturnData
      SELECT SUBSTRING(@text, @pos, LEN(@text) - 1);

   RETURN;
END

Numbers Table

A Numbers table is a table which contains incremental numbers from 1 to N, where N depends on your requirements to the table. This tables can be used for different solutions, as I showed here.

The following script is based on Jeff Moden's great article The "Numbers" or "Tally" Table: What it is and how it replaces a loop at SQLServerCentral.com. I don't want to say "It's my idea".

How can I use a Numbers table to split a string?

The general idea is to transform the horizontal string into a vertical table where each row contains one character of the source string. The WHERE clause is used to return only the specified separators. Whenever we reach a separator we use SUBSTRING to extract the item from the current position to the next occurrence of the separator. To determine the next position of the separator we use CHARINDEX, just like the WHILE loop based approach.

Here is a sample, how to use a Numbers table to split a string into pieces.

DECLARE 
   @text VARCHAR(100)      = ',a,b,c,'
   ,@separator CHAR(1)     = ',';

SELECT
   -- Extract the current part from specified string
   -- Start position is the current position within the numbers table 
   --    plus the length of the separator
   -- The length is the next position of the separator minus the
   --    current position and the len of the separator
   SUBSTRING(
      @text
      ,t.Num + LEN(@separator)
      ,CHARINDEX(@separator, @text, t.Num + 1) - t.Num - LEN(@separator)
      ) AS Item
FROM dbo.Numbers t
WHERE 
   -- Scan only until the length of the text - 1 
   -- to avoid wrong parameter for SUBSTRING
   t.Num < LEN(@text)
   -- The current item has to match to the specified delimiter
   AND SUBSTRING(@text, t.Num, LEN(@separator)) = @separator

If you want to use a Numbers table to split strings you have to keep two important things in mind. The first thing is, the source string always needs to be enclosed with a leading and trailing separator. This depends on the topology of the approach to get the items between the separators. As second you have to ensure that your Numbers table contains enough rows. The count of rows has to be at least the maximal length of the source strings plus two separators. A way to ensure a Numbers table with enough rows is to use a common table expression (CTE) in this case.

WITH 
n1 (Num) AS (SELECT 1 UNION ALL SELECT 1),   -- 2
n2 (Num) AS (SELECT a.Num FROM n1 a, n1 b),  -- 4
n3 (Num) AS (SELECT a.Num FROM n2 a, n2 b),  -- 16
n4 (Num) AS (SELECT a.Num FROM n3 a, n3 b),  -- 256
n5 (Num) AS (SELECT a.Num FROM n4 a, n4 b),  -- 65536
-- 4.294.967.296
Numbers (Num) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) 
                  FROM n5 a, n5 b)
SELECT
      *
   FROM Numbers
   WHERE Num < 100000;

This provides a Numbers table with up to 4.294.967.296 which fits for string with up to 4GB. XML

Some solutions to split a string are based on a conversion to XML to use XQuery to extract the items. Sounds strange? It is ;-). How can I do that? To split a string using XML first you have to convert the source string into a XML data type and replace the specified separator with XML tags. After that you can use XQuery to get the items within the string. The following script shows how to work with XML to split strings.

DECLARE 
   @text VARCHAR(100)      = 'a,b,c'
   ,@separator CHAR(1)     = ',';

WITH cte (data) AS
(
   SELECT CONVERT(XML, '<y>' 
                       + REPLACE(@text, @separator, '</y><y>') 
                       + '</y>')
)
SELECT
      T.C.value('.', 'nvarchar(2000)')
   FROM cte
      CROSS APPLY cte.data.nodes('y') T(C)

This method should work with most kind of data, but it runs into problems if the source string contains XML language elements like "<", ">" or "&".

SQL CLR

Now the (hopefully) last possibility to split strings into pieces is to use SQL CLR. How to do? Well, there are many different ways to split a string using .NET. The simplest way would be the build in function String.Split() which is available on every instance of a System.String. The most flexible way to split a string with .NET would be a regular expression. Nevertheless we use a non build-in raw function to handle that. Why?? String.Split() is definitely a cool function but we are speaking about a function for a Server and String.Split() causes the complete string twice in memory which can affect the performance when your server in on load. Same depends on regular expressions, which is way to hungry for resources and a functional overkill to just split a string. We use a function which iterates through the complete string and searches for the specified separator. Whenever a separator is reached we use yield return to return the item between the last position of a separator and and the current position. The functionality is quiet alike the WHILE loop approach but enclosed within a compiled .NET assembly. Here is the body of your function.

using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections;


public partial class StoredProcedures
{
   [Microsoft.SqlServer.Server.SqlFunction(
      FillRowMethodName = "FillRowSplitString",
      TableDefinition = "Item nvarchar(4000)",
      IsDeterministic = true,
      SystemDataAccess = SystemDataAccessKind.None
      )
   ]
   public static IEnumerable SplitString(
      [SqlFacet(MaxSize = -1, IsFixedLength=false, IsNullable=false)]
      SqlString text,
      [SqlFacet(MaxSize = 1, IsFixedLength = true, IsNullable = false)]
      SqlString separator
      )
   {
      // init variables
      char s = separator.Value[0];
      string buffer = text.Value;
      int lastIndex = 0;
      int len = 0;
      int i;

      // loop through the string
      for (i = 0; i < buffer.Length; i++)
      {
         // check if the current character matches to the separator
         if (buffer[i] == s)
         {
            // return the item from last and current positions
            yield return buffer.Substring(lastIndex, len);
            lastIndex = i + 1;
            len = 0;
         }
         else
         {
            len++;
         }
      }

      // return the last item
      yield return buffer.Substring(lastIndex, buffer.Length - lastIndex);
   }

   // fill output row
   public static void FillRowSplitString(
      object obj,
      out SqlString item
      )
   {
      item = (string)obj;
   }
};

Processing Performance Now, that we know all possible ways to split a string within T-SQL, let me compare the performance of the different approaches. Since the performance tests depend on the hardware, a short description of my test environment.

CPU	Intel Quad 2.4GHz Q6600
RAM	8 GB
Operating System	Windows Server 2008 x64 SP2
SQL Server	SQL Server 2008 x64 SP2
Network	localhost

It migth be quiet boring to split "a,b,c" by "," and show you zero durations for all tests. So I worked with whole table strings. The columsn of the following table mean:

Item Length: Is the length of the resulting items within the source text.
Item count is the count of resulting items for each row.
Row count is the count of source rows which have to be handled.
Data type is the column type of the source table.

Here you can find a table with some of my test results.

#	Item Length	Item Count	Row Count	Data Type	Method	Duration (ms)
1	10	50	1000	NVARCHAR(4000)
					WHILE	2850
					Numbers	763
					CLR	1246
					XML	27273
2	50	50	100	NVARCHAR(4000)
					WHILE	307
					Numbers	213
					CLR	163
					XML	7190
3	50	50	1000	NVARCHAR(4000)
					WHILE	3853
					Numbers	2133
					CLR	1593
					XML	72116
4	200	19	100	NVARCHAR(4000)
					WHILE	203
					Numbers	293
					CLR	80
					XML	3850
5	200	19	100	NVARCHAR(MAX)
					WHILE	206
					Numbers	773
					CLR	76
					XML	4120
6	10	50	1000	VARCHAR(8000)
					WHILE	2750
					Numbers	643
					CLR	1263
					XML	16673
7	50	50	100	VARCHAR(8000)
					WHILE	340
					Numbers	180
					CLR	163
					XML	2723
8	50	50	1000	VARCHAR(8000)
					WHILE	3086
					Numbers	1726
					CLR	1586
					XML	71693
9	200	19	100	VARCHAR(8000)
					WHILE	153
					Numbers	230
					CLR	80
					XML	3850
10	200	19	100	NVARCHAR(MAX)
					WHILE	156
					Numbers	706
					CLR	80
					XML	4196
11	10	50	10000	VARCHAR(8000)
					WHILE	27610
					Numbers	6396
					CLR	12726
12	50	50	5000	VARCHAR(8000)
					WHILE	15466
					Numbers	8596
					CLR	8043
13	50	50	5000	VARCHAR(MAX)
					WHILE	15363
					Numbers	24456
					CLR	8060
14	1000	20	200	VARCHAR(MAX)
					WHILE	516
					Numbers	18933
					CLR	310
15	1000	50	200	VARCHAR(MAX)
					WHILE	1176
					Numbers	42420
					CLR	800

Note: To ensure fair tests, I changed the WHILE loop function from NVARCHAR to VARCHAR for test with non-unicode data types. Note: I kept the XML approach away from the last five test since these test reached a higher level of data. Concurrent Work Last but not least we have to look at the concurrency behavior. I did these tests with the following string configuration.

Item Length	50
Item Count	50
Row Count	500
Data Type	NVARCHAR(4000)

And here are the test results.

Threads	Cursor	Numbers	CLR
1	1540	1068	779
3	3305	2259	1646
5	4384	2365	2246
10	11208	4745	5249
100	123839	37060	53512

Conclusion

We've seen four completely different ways to split strings within T-SQL. XML appears to be the wrong technology to do the job. XML is a validated data type which has to created on the fly by using REPLACE which causes a complete copy of all data. XQuery is really fast if source data are already available in XML, but the conversion from text to XML seems to be way to expensive. WHILE loops are the most common approach to handle this kind of job. They are okay for smaller strings or in cases where they are not needed so often. They are the wrong solution for high performance split routines. The way to use a Numbers table for this kind of work sounds curious if you never did before. As you see, the Numbers table is a great tool once again. For text with small items this is the best performing solution. Especially the concurrency performance is really great. There are some issues though. You have to ensure the leading and trailing separator. If you have to split data from client side, which can add the separators before the data reach the server, everything is fine. If you have to work with data which are already stored in database - without being enclosed - you have to add the separators on the fly; in this case you may end up in same problem as the XML since all data have to be copied in memory. My tests showed me that the performance of this approach goes down if you encapsulate it into a table-valued function (even a single-statement table-valued function). The Numbers table split becomes slower with the length of result items within the source text. There are quiet less tasks which should be done with SQL CLR since now. Anyway, splitting strings is one of the tasks .NET is optimized for. It performs great for any kind and count of data. Even the concurrency performance is not as perfect as the Numbers table split but even better than linear.

Friday, September 25, 2009

Numbers table

The proposal for this article was Jeff Moden's great article The "Numbers" or "Tally" Table: What it is and how it replaces a loop. at SQLServerCentral.com. This post should show some additional tasks which can be done with Numbers tables.

A "Numbers" table is a powerful SQL tool. Basically it's just a table containing incremental numbers but this table can be used for many cases, especially to avoid the usage of SQL cursors. Cursors are a common performance problem in many databases. Database systems are optimized for bulk operations, cursors are a row based approach, so you should always try to avoid them.

Let's start with the creation of a Numbers table. The table contains one column and the numbers from 1 to 11000. Why 11000? I don't know :-D. The table can be used for many approaches as we will see in this article. If the table contains too less rows it cannot be used for some tasks, if the table contains too many rows, the selectivity becomes worse. The idea behind 11000 is, 10000 is a common number of "many items", 11000 can handle those tasks and contains 10% puffer for slightly larger operations.

CREATE TABLE Numbers
(
   Num INT NOT NULL PRIMARY KEY CLUSTERED WITH (FILLFACTOR = 100)
);

INSERT INTO Numbers
   SELECT TOP(11000)
      ROW_NUMBER() OVER (ORDER BY (SELECT 1))
   FROM master.sys.all_columns c1
      CROSS JOIN master.sys.all_columns c2;

Well now we have "Numbers", so now we are cool, but why? Let's have a look into some solutions which can be done with a numbers table.

Transform columns into rows

Sometimes it can be needed to transform data of two or more columns into rows. Since this smells like a violation of Normalization Form 1 (NF1) we will never need this for our databases, which are always correctly normalized. Presume we get data from another - not normalized - data source and have to transform those data into our - well normalized - data structure.

We start with a table containing persons and their phone numbers.

DECLARE @BadData TABLE
(
   Id INT NOT NULL IDENTITY
      PRIMARY KEY CLUSTERED
   ,FirstName VARCHAR(30)
   ,LastName VARCHAR(30)
   ,Phone1 VARCHAR(100)
   ,Phone2 VARCHAR(100)
   ,Phone3 VARCHAR(100)
);

INSERT INTO @BadData 
      (FirstName, LastName, Phone1, Phone2, Phone3)
   VALUES
      ('Pete', 'Porter', '1111', '1112', '1113')
      ,('Rush', 'Bush', '2221', '2222', '2223');

Now let's normalize those bad data, using a Numbers table. This can be done by a CROSS JOIN to combine all rows of both tables and a restriction to only three numbers within the WHERE clause.

SELECT
      Id
      ,FirstName
      ,LastName
      ,CASE 
         WHEN Num = 1 THEN Phone1
         WHEN Num = 2 THEN Phone2
         ELSE Phone3
         END AS Phone
      ,Num
   FROM @BadData
      CROSS JOIN Numbers
   WHERE Num <= 3

The result of this statement is shown here.


Id   FirstName  LastName   Phone   Num
---- ---------- ---------- ------- -----------
1    Pete       Porter     1111    1
1    Pete       Porter     1112    2
1    Pete       Porter     1113    3
2    Rush       Bush       2221    1
2    Rush       Bush       2222    2
2    Rush       Bush       2223    3

Build data ranges

Some reports show calculated data on date level. Therefore we have to create a table containing all the dates for a specified date range.

DECLARE
   @start   DATETIME = '2009-01-01'
   ,@end    DATETIME = '2009-02-01'

SELECT
      DATEADD(DAY, Num - 1, @start)
   FROM Numbers
   WHERE
      DATEADD(DAY, Num - 1, @start) < @end

Split strings

Here you can find a dedicated post of me concerning string split methods which also covers how to use a Numbers table to do this:
High Performance String Split Functions

Generate Test Data

Sometimes it's useful to have some test data.

A reason might be the start of a new project. Usually you start with an empty database. Customer and/or analysts did not yet provide real data. Now you can start to write your own row by row or you can use a tool (in our case a script) and let it generate the data. To tell you the truth, for tests of some specific use cases you still have to write the data manually. But to show a grid or aggregate some numbers you might be happy with some scrap data.

On the other side you need an environment with many data for load tests. Feel lucky, if you have a large backup from production but this is not guaranteed. (E.g. you are in a new project, the database is in another location and too large to be copied or you are just not allowed to see the real data for some reason.) For load tests it's almost impossible to write data manually.

Here's a little sample which shows how to use a "Numbers" table to generate hundreds of thousands of rows. As you can see it's relatively simple to configure the output data with T-SQL build in functions like NEWID(), REPLICATE(), ASCII(), NCHAR() or ABS.

SET NOCOUNT ON;
GO
---==============================================
-- table TestDataFR

-- drop an existing test table
IF (OBJECT_ID('TestDataFR') IS NOT NULL)
   DROP TABLE TestDataFR;
-- create a new table
CREATE TABLE TestDataFR
(
   Id INT NOT NULL IDENTITY(1,1)
      PRIMARY KEY CLUSTERED
   ,SomeInt INT
   ,SomeVarChar VARCHAR(50)
   ,SomeNVarChar NVARCHAR(50)
   ,SomeDateTime DATETIME
   ,SomeDecimal DECIMAL(15,5)
   ,SomeFloat FLOAT
);
GO

---==============================================
-- configuration
DECLARE
   @count            INT            = 100000
   ,@done            INT            = 0
   ,@batchSize       INT            = 10000
   ,@top             INT            = 0
   ,@maxLenNVarChar  INT            = 15
   ,@maxSomeDecimal  DECIMAL(15,5)  = 800000000
   ,@maxSomeFloat    BIGINT         = 900000000000000
   ;


-- get the current count of rows (should be zero)
SELECT @done = COUNT(*)
   FROM TestDataFR

-- create the test data
WHILE (@done < @count)
BEGIN
   -- work in batch size transactions
   BEGIN TRANSACTION;
   
   -- determine the count of rows for this batch
   SELECT @top = 
      CASE 
         WHEN @count - @done < @batchSize THEN @count - @done 
         ELSE @batchSize 
         END;

   -- insert the rows
   INSERT INTO TestDataFR (
         SomeInt
         ,SomeVarChar
         ,SomeNVarChar
         ,SomeDateTime
         ,SomeDecimal
         ,SomeFloat
         )
      SELECT TOP(@top)
         SomeInt        = Num
         ,SomeVarChar   = CONVERT(VARCHAR(50), NEWID())
         ,SomeNVarChar  = 
            CONVERT
            (
               -- return type
               NVARCHAR(50), 
               -- replicate a char between A and Z up to 
               -- max configured length
               REPLICATE
               (
                  NCHAR(ASCII('a') + Num % 26), 
                  Num % @maxLenNVarChar + 1
               ) 
               -- add a space
               + N' ' 
               -- add a number at the end
               + CONVERT(NVARCHAR(10), Num)
            )
         ,SomeDateTime  = DATEADD(HOUR, Num * -1, GETDATE())
         ,SomeDecimal   = 
            CONVERT
            (
               DECIMAL(15,5), 
               -- a non negative number up to the max devided by Num
               ABS(CHECKSUM(NEWID()) % @maxSomeDecimal) / Num
            )
         ,SomeFloat     = 
            -- a possibly negative number up to specified max
            CONVERT(FLOAT, CHECKSUM(NEWID()) % @maxSomeFloat) 
            -- devide by 1.xxx 
            / (1 + (CONVERT(FLOAT, 1) / Num))
      FROM Numbers
   
   -- calculate the count of done rows
   SELECT @done += @@ROWCOUNT;
   
   COMMIT TRANSACTION;
END;
GO

---==============================================
-- show result and some of the data

SELECT 'Count of rows: ' + CONVERT(VARCHAR(20), COUNT(*))
   FROM TestDataFR;

SELECT TOP(100)
      'Samples'
      ,*
   FROM TestDataFR;

Binary data as hex-string

The transformation of binary data into a hexadecimal string can be used to export data into SQL statements or to export the data to any other destination.

IF (OBJECT_ID('ufn_VarBinToHexString') IS NULL)
   EXECUTE ('CREATE FUNCTION ufn_VarBinToHexString() 
             RETURNS INT AS BEGIN RETURN 1 END');
GO
ALTER FUNCTION ufn_VarBinToHexString(@binary VARBINARY(MAX))
   RETURNS VARCHAR(MAX)
AS
BEGIN
   --DECLARE @binary VARBINARY(100) = NEWID();

   -- Mapping for binary values
   DECLARE @HexMap VARCHAR(550)
   SELECT @HexMap = 
       '000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F'
     + '202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F' 
     + '404142434445464748494A4B4C4D4E4F505152535455565758595A5B5C5D5E5F'
     + '606162636465666768696A6B6C6D6E6F707172737475767778797A7B7C7D7E7F' 
     + '808182838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9F'
     + 'A0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF' 
     + 'C0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF'
     + 'E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF'
     ;

   RETURN 
      '0x' 
      + 
      (
      SELECT
         SUBSTRING(
            @HexMap
            ,CONVERT(
               TINYINT
               ,SUBSTRING(@binary, Num, 1)) * 2 + 1
            ,2)
      FROM Numbers t2
      WHERE Num <= DATALENGTH(@binary)
      ORDER BY Num
      FOR XML PATH('')
      );
END
GO

The numbers table is used to loop through all bytes within the specified binary stream. SUBSTRING is used to extract the current byte. Now the @HexMap is used to get the characters which represent the current byte. The result of this is a table with one row for each byte within the specified binary value. Finally the are concatenated by a common hack with FOR XML PATH(''). This function can be helpful for SQL Server 2005, in SQL Server 2008 it became obsolete. The build-in CONVERT function has been extended with an optional style-parameter which can be used to generate the same result. The following sample shows the usage of this function and the new CONVERT feature of SQL Server 2008.

DECLARE @binary VARBINARY(100) = NEWID();
SELECT
   @binary
   ,dbo.ufn_VarBinToHexString(@binary)
   ,CONVERT(VARCHAR(MAX), @binary, 1)

Conclusion As you saw, a "Numbers" table can be used for many different things. If you start using it you will find more and more tasks which can be optimized with it.

Query a web service with SQLCLR

Since SQL Server 2005 it's possible to create .NET based functions and procedures and call them from T-SQL. This feature should replace old approaches using extended procedures written in C/C++ and COM based solutions working with sp_OACreate. Problem of extended procedures is that bugs within these procedures may affect the whole SQL server service. Problems of COM based solutions is that they are hard do write.

Anyway, SQL CLR should not be used to replace usual SQL procedures. T-SQL is still much faster and better to maintain for CRUD (create, read, update, delete) actions. The idea of SQL CLR is to extend T-SQL tool set, not to replace it.

One of the tasks, which can be done with SQL CLR, is to request information from a web service.This sample uses the National Digital Forecast Database (NDFD) to query some weather information.

Database Configuration

Before we can start to develop the function, we have to set database to TRUSTWORTHY mode. This is needed, because our function needs to access external resources.You can use the following SQL statement to enable the TRUSTWORTHY mode for your database:

ALTER DATABASE Sandbox SET TRUSTWORTHY ON;

Server Configuration

If not yet done, you also have to enable CLR on SQL server. This can be done with the following SQL statement:

EXECUTE sp_configure 'clr enabled', 1;
RECONFIGURE;

Visual Studio Project Configuration

Open Visual Studio 2008 and create a new C# project from type "SQL Server Project". Open project properties and navigate to "Database" tab. Ensure the correct connection string. Permission level has to be changed to "External". Add a new item from type "User-Defined Function" with name "QueryWebService".

Now we got everything we need to create our CLR function.

The Sample Function

Each CLR table-valued function needs the attribute SqlFunction and has to return an IEnumerable. The SqlFunction attribute as to specify the returned table definition by parameter TableDefinition. A second parameter FillRowMehodName specifies the name of the method to be used from SQL Server to get the row items from current element of the enumerator.

First create a WebRequest for the URL to query the web service. Get the WebResponse and use a XPathDocument to parse the returned XML stream.

The following block shows an cutting of the returned XML.

<dwml 
   version="1.0" 
   xsi:noNamespaceSchemaLocation="http://www.nws.noaa.gov/forecasts/xml/DWMLgen/schema/DWML.xsd" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   >
   <head>
      <!-- ... -->
   </head>
   <data>
      <location>
         <location-key>point1</location-key>
         <point latitude="38.99" longitude="-77.01" />
      </location>
      <!-- ... -->
      <time-layout time-coordinate="local" summarization="none">
         <layout-key>k-p24h-n7-1</layout-key>
         <start-valid-time>2009-09-23T08:00:00-04:00</start-valid-time>
         <end-valid-time>2009-09-23T20:00:00-04:00</end-valid-time>
         <start-valid-time>2009-09-24T08:00:00-04:00</start-valid-time>
         <end-valid-time>2009-09-24T20:00:00-04:00</end-valid-time>
         <!-- ... -->
      </time-layout>
      <time-layout time-coordinate="local" summarization="none">
         <layout-key>k-p24h-n6-2</layout-key>
         <!-- ... -->
      </time-layout>
      <parameters applicable-location="point1">
         <temperature type="maximum" units="Fahrenheit" time-layout="k-p24h-n7-1">
            <name>Daily Maximum Temperature</name>
            <value>83</value>
            <value>85</value>
            <!-- ... -->
         </temperature>
         <!-- ... -->
      </parameters>
   </data>
</dwml>

Last thing we have to do is to extract the information we want to return to SQL Server. Therefore we use XPath to query the returned location and forecast information.

using System;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections;
using System.Net;
using System.IO;
using System.Xml.XPath;


public partial class StoredProcedures
{
   [SqlFunction(
      FillRowMethodName = "FillMethodWeather",
      TableDefinition = "Latitude FLOAT, Longitude FLOAT, "
                        + "ForecastDate DateTime, Temperature FLOAT"
      )
   ]
   public static IEnumerable QueryWebService()
   {
      // sample web request url to get a weather forecast
      String url =
         "http://www.weather.gov/forecasts/xml/sample_products"
         + "/browser_interface/ndfdXMLclient.php?"
         + "lat=38.99"
         + "&lon=-77.01"
         + "&product=time-series"
         + "&begin=2004-01-01T00:00:00"
         + "&end=2013-04-20T00:00:00"
         + "&maxt=maxt"
         + "&mint=mint";

      string xpath;

      // create teh web request and get the response
      WebRequest request = HttpWebRequest.Create(url);
      WebResponse response = request.GetResponse();
      
      // take the response stream and create an XML DOM
      Stream responseStream = response.GetResponseStream();
      XPathDocument dom = new XPathDocument(responseStream);

      XPathNavigator root = dom.CreateNavigator();

      // get latitude and longitude information
      Double latitude;
      Double longitude;
      xpath = "dwml/data/location/point";
      XPathNavigator point = root.SelectSingleNode(xpath);
      latitude = Double.Parse(point.GetAttribute("latitude", String.Empty));
      longitude = Double.Parse(point.GetAttribute("longitude", String.Empty));

      // get the date and temperature nodes from xml
      
      xpath = "dwml/data/time-layout [position()='1']/* "
              + "[name()='start-valid-time']";
      XPathNodeIterator dateNodes = root.Select(xpath);

      xpath = "dwml/data/parameters/temperature "
              + "[position()='1']/* [name()='value']";
      XPathNodeIterator tempNodes = root.Select(xpath);

      // loop over all date nodes
      while (true)
      {
         // break if we passed the last date element
         if (!dateNodes.MoveNext())
            break;
         tempNodes.MoveNext();

         // get current forecast information
         XPathNavigator dateNode = dateNodes.Current;
         XPathNavigator tempNode = tempNodes.Current;
         DateTime date = DateTime.Parse(dateNode.Value);
         Double temp = Double.Parse(tempNode.Value);

         // use a "yield return" to enumerate through all data
         yield return new object[] 
         {
            (SqlDouble)latitude, 
            (SqlDouble)longitude, 
            (SqlDateTime)date, 
            (SqlDouble)temp 
         };
      }
   }

   // function will be called from SQL Server to get row items 
   // for current element from enumerator
   public static void FillMethodWeather(object current, 
      out SqlDouble latitude, 
      out SqlDouble longitude, 
      out SqlDateTime forecastDate, 
      out SqlDouble temperature
      )
   {
      // set return information from current item
      object[] data = (object[])current;
      latitude = (SqlDouble)data[0];
      longitude = (SqlDouble)data[1];
      forecastDate = (SqlDateTime)data[2];
      temperature = (SqlDouble)data[3];
   }
};

Publish

The easiest way to deploy the created assembly to SQL Server is to use the Visual Studio deployment feature. Right click the project node in solution explorer and hit "Deploy" to make the function available in SQL Server.

The usage of this function within T-SQL is equal to any other table-valued function call.

SELECT * FROM QueryWebService();

The output shows the result of the web service call.


Latitude    Longitude    ForecastDate              Temperature
----------- ------------ ------------------------- -------------
38.99       -77.01       2009-09-25 14:00:00.000   72
38.99       -77.01       2009-09-26 14:00:00.000   65
38.99       -77.01       2009-09-27 14:00:00.000   78
38.99       -77.01       2009-09-28 14:00:00.000   75
38.99       -77.01       2009-09-29 14:00:00.000   68
38.99       -77.01       2009-09-30 14:00:00.000   69
38.99       -77.01       2009-10-01 14:00:00.000   70

(7 row(s) affected)

As you see SQL CLR is neat way to extent SQL Server functionality with .NET.

Wednesday, September 23, 2009

Cascading using directive

Disclaimer:
This post is no great trick but a tiny tip for some of you, who probably didn't yet know this C# syntax feature. ;-)

As you know, .NET requires to implement the "IDisposable" interface for classes working with unmanaged resources. This interface provides the single method "Dispose()" which can be called to explicitly free the resources. If you don't call the method explicitly, the garbage collector (GC) calls the method automatically when the instance of the class becomes collected. Nevertheless, as usual you should care about those types and free them in your code.

One ensure the call of the Dispose() method within your code is a try-catch-finally block. Since this approach requires quite a bunch of code for a simple "Dispose()" method, .NET provides the "using" statement which implicit calls the "Dispose()" method as soon as you leave the scope of this statement.

using System;
using System.Data.SqlClient;

namespace ConsoleApplication1
{
   class Programm
   {
      static void Main(string[] args)
      {
         String cnStr = "Server=.\\Sql2k8;" +
                        "Database=Sandbox;" +
                        "Trusted_Connection=true;";

         // using block for connection
         using (SqlConnection cn = new SqlConnection(cnStr))
         {
            cn.Open();
         }
      }
   }
}

In this sample the instance of a "SqlConnection" becomes disposed as soon as you leave the "using" statement.

Unfortunately this becomes a little bit ugly if you have to work with more than one types which require to be disposed. In ADO.NET most times you need a "SqlConnection", a "SqlCommand" and a "SqlDataReader" or a "SqlDataAdapter". All these classes implement the "IDisposable" interface, work with unmanaged resources and have to be disposed. Let's have a look at a simple sample to select one column of a table and write all rows to a text file.

using System;
using System.Data.SqlClient;
using System.IO;

namespace ConsoleApplication1
{
   class Programm
   {
      static void Main(string[] args)
      {
         String cnStr = "Server=.\\Sql2k8;" +
                        "Database=Sandbox;" +
                        "Trusted_Connection=true;";
         String sql = "SELECT TOP(1000) Num FROM Numbers";
         String exportFile = @"D:\Temp\Test\test.txt";

         // using block for connection
         using (SqlConnection cn = new SqlConnection(cnStr))
         {
            cn.Open();

            // using block for command
            using (SqlCommand cmd = new SqlCommand(sql, cn))
            {
               // using block for reader
               using (SqlDataReader reader = cmd.ExecuteReader())
               {
                  // using block for stream writer
                  using (StreamWriter writer = new StreamWriter(exportFile))
                  {
                     // write first column of all rows to a file
                     while (reader.Read())
                     {
                        String row = reader.GetValue(0).ToString();
                        writer.WriteLine(row);
                     }

                     writer.Close();
                  }
               }
            }
         }
      }
   }
}

As you can see we need four "using" statements to ensure a disposing of all classes.

Alternatively to this cascading "using" blocks C# offers the possibility to work with several "using" statements in one scope.

using System;
using System.Data.SqlClient;
using System.IO;

namespace ConsoleApplication1
{
   class Programm
   {
      static void Main(string[] args)
      {
         String cnStr = "Server=.\\Sql2k8;" +
                        "Database=Sandbox;" +
                        "Trusted_Connection=true;";
         String sql = "SELECT TOP(1000) Num FROM Numbers";
         String exportFile = @"D:\Temp\Test\test.txt";

         // using block for connection
         using (SqlConnection cn = new SqlConnection(cnStr))
         {
            cn.Open();

            // using scope for all other instances
            using (SqlCommand cmd = new SqlCommand(sql, cn))
            using (SqlDataReader reader = cmd.ExecuteReader())
            using (StreamWriter writer = new StreamWriter(exportFile))
            {
               // write first column of all rows to a file
               while (reader.Read())
               {
                  String row = reader.GetValue(0).ToString();
                  writer.WriteLine(row);
               }

               writer.Close();
            }
         }
      }
   }
}

As you can see you can use one single "using" scope for all instances, except the "SqlConnection". The connection needs requires to be handled in an own block since the call of "SqlCommand.ExecuteReader()" requires an open connection.

If you have to work with database connections frequently, you can use a simple trick by extending the connection class with an additional method. (Extension methods are a feature of .NET 3.0 and future versions). Let's define a new method which opens the connection and returns this instance of the connection to be used within the "using" statement.

static class SqlConnectionExtensions
{
   public static SqlConnection AutoOpen(this SqlConnection connection)
   {
      connection.Open();
      return connection;
   }
}
>

Now you are able to call this method directly from your connection constructor to open the connection and get the instance into the using-scope reference.

using System;
using System.Data.SqlClient;
using System.IO;

namespace ConsoleApplication1
{
   class Programm
   {
      static void Main(string[] args)
      {
         String cnStr = "Server=.\\Sql2k8;" +
                        "Database=Sandbox;" +
                        "Trusted_Connection=true;";
         String sql = "SELECT TOP(1000) Num FROM Numbers";
         String exportFile = @"D:\Temp\Test\test.txt";

         // one using scope for all disposable instances
         using (SqlConnection cn = new SqlConnection(cnStr).AutoOpen())
         using (SqlCommand cmd = new SqlCommand(sql, cn))
         using (SqlDataReader reader = cmd.ExecuteReader())
         using (StreamWriter writer = new StreamWriter(exportFile))
         {
            // write first column of all rows to a file
            while (reader.Read())
            {
               String row = reader.GetValue(0).ToString();
               writer.WriteLine(row);
            }

            writer.Close();
         }
      }
   }
}

As I told you at the begin of this post, the possibility to use more than one instance of an unmanaged class is not a great trick, but a quiet neat feature of C#.

Tuesday, September 22, 2009

ThreadSafeDictionary

Threading is a powerful tool today, since almost every client computer has two or more CPUs. On the other side Dictionary objects are often used to store important information which are need to provide and access information for different proposes. Unfortunately, since today (.NET 3.5) .NET does not offer a build in thread safe implementation of a dictionary. The System.Collections.HashTable provides a static Synchronized() method. Nevertheless, this method includes two problems. On one side, the HashTable is a not strong typed implementation, so it always becomes risky if you use a HashTable for more than a small scope. On the other side the MSDN says that it only supports multiple writing threads, but it's not thread safe for multi-threaded reading access.

Here you can find a complete implementation of a thread safe dictionary which might be helpful for your proposes. It implements the following interfaces:

System.Collections.IEnumerableSystem.Collections.IDictionary
System.Collections.Generic.IDictionary System.Collections.Generic.ICollection>
System.Collections.Generic.IEnumerable>

In addition to a thread safe read/write implementation of a usual dictionary it provides three special methods:

TryAdd(TKey, TValue)
Works like "TryGetValue". It tries to add a specified key and value to the dictionary if it is not yet present; otherwise it does not change the existing items.

Lock()
Which generates an exclusive thread lock for the current thread.

Unlock()
Which releases an exclusive lock for the current thread.

A sample how to use the ThreadSafeDictionary in a multi-threaded scenario can be found at the end of this blog.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;
using System.Diagnostics;

namespace FR.Collections.Generic
{
   /// <summary>
   /// Represents a thread safe version of a 
   /// <see cref="IDictionary<TKey,TValue>"/>.
   /// </summary>
   /// <typeparam name="TKey">The key to be used for items stored within the 
   /// dictionary.</typeparam>
   /// <typeparam name="TValue">The value to be handled within the 
   /// dictionary for a specified key.</typeparam>
   [DebuggerDisplay("Count = {Count}")]
   public class ThreadSafeDictionary<TKey, TValue>
   :
   IDictionary<TKey, TValue>,
   ICollection<KeyValuePair<TKey, TValue>>,
   IEnumerable<KeyValuePair<TKey, TValue>>,
   IEnumerable,
   IDictionary
   {
      #region Constructor

      /// <summary>
      /// Creates a new instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> with 
      /// default settings
      /// </summary>
      public ThreadSafeDictionary()
      {
         _dict = new Dictionary<TKey, TValue>();
         _sync = new object();
      }

      /// <summary>
      /// Creates a new instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// and adds all items from another dictionary.
      /// </summary>
      /// <param name="dictionary">
      /// The <see cref="IDictionary<TKey,TValue>"/> to 
      /// fill into this instance of a thread safe dictionary.
      /// </param>
      public ThreadSafeDictionary(IDictionary<TKey, TValue> dictionary)
      {
         _dict = new Dictionary<TKey, TValue>(dictionary);
         _sync = new object();
      }

      /// <summary>
      /// Creates a new instance of a thread safe dictionary which uses a 
      /// specified comparer to identify the uniquenes of added keys.
      /// </summary>
      /// <param name="comparer">The 
      /// <see cref="IEqualityComparer<T>"/> to use to determine the 
      /// uniqueness of hold keys.</param>
      public ThreadSafeDictionary(IEqualityComparer<TKey> comparer)
      {
         _dict = new Dictionary<TKey, TValue>(comparer);
         _sync = new object();
      }

      /// <summary>
      /// Creates a new instance of an empty 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>
      /// with a specified initial capacity.
      /// </summary>
      /// <param name="initialCapacity">The initial capacity for the new 
      /// instance of a <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </param>
      public ThreadSafeDictionary(int initialCapacity)
      {
         _dict = new Dictionary<TKey, TValue>(initialCapacity);
         _sync = new object();
      }

      /// <summary>
      /// Creates a new instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// with initially contains the items of another specified 
      /// <see cref="IDictionary<TKey, TValue>"/> and uses a 
      /// specified comparer to determine the uniqueness of the 
      /// items within the dictionary.
      /// </summary>
      /// <param name="dictionary">
      /// The <see cref="IDictionary<TKey,TValue>"/> to 
      /// copy into this new instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </param>
      /// <param name="comparer">
      /// The <see cref="IEqualityComparer<T>"/> to use 
      /// to compare the keys within this dictionary.
      /// </param>
      public ThreadSafeDictionary(IDictionary<TKey, TValue> dictionary, IEqualityComparer<TKey> comparer)
      {
         _dict = new Dictionary<TKey, TValue>(dictionary, comparer);
         _sync = new object();
      }

      /// <summary>
      /// Creates a new instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// with a initial capacity of items to hold and a specified 
      /// <see cref="IEqualityComparer<T>"/> to compare keys.
      /// </summary>
      /// <param name="initialCapacity">
      /// The initial capacity for items to hold within this dictionary.
      /// </param>
      /// <param name="comparer">
      /// The <see cref="IEqualityComparer<T>"/> 
      /// to use to compare to compare keys.
      /// </param>
      public ThreadSafeDictionary(int initialCapacity, IEqualityComparer<TKey> comparer)
      {
         _dict = new Dictionary<TKey, TValue>(initialCapacity, comparer);
         _sync = new object();
      }

      #endregion

      #region Private Fields

      private Dictionary<TKey, TValue> _dict;
      private object _sync;

      #endregion

      #region Public Indexers

      /// <summary>
      /// Gets or sets the value associated with the specified key.
      /// </summary>
      /// <param name="key">
      /// The key of the value to get or set.
      /// </param>
      /// <returns>
      /// The value associated with the specified key. If the 
      /// specified key is not found, a get operation throws 
      /// a <see cref="KeyNotFoundException"/>.
      /// </returns>
      public TValue this[TKey key]
      {
         get
         {
            lock (_sync)
            {
               return _dict[key];
            }
         }
         set
         {
            lock (_sync)
            {
               _dict[key] = value;
            }
         }
      }

      #endregion

      #region Public Properties

      /// <summary>
      /// Gets the number of key/value pairs contained in the 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </summary>
      public int Count
      {
         get
         {
            lock (_sync)
            {
               return _dict.Count;
            }
         }
      }

      /// <summary>
      /// Gets a collection containing the keys in the 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </summary>
      public ICollection<TKey> Keys
      {
         get
         {
            lock (_sync)
            {
               return _dict.Keys;
            }
         }
      }

      /// <summary>
      /// Gets a collection containing the values in the 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </summary>
      public ICollection<TValue> Values
      {
         get
         {
            lock (_sync)
            {
               return _dict.Values;
            }
         }
      }

      #endregion

      #region Public Methods

      /// <summary>
      /// Adds a specified key and value to the dictionary.
      /// </summary>
      /// <param name="key">
      /// The key of the new item to add to this dictionary.
      /// </param>
      /// <param name="value">
      /// The value to be associated with the specified key.
      /// </param>
      public void Add(TKey key, TValue value)
      {
         lock (_sync)
         {
            _dict.Add(key, value);
         }
      }

      /// <summary>
      /// Removes all keys and values from this instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </summary>
      public void Clear()
      {
         lock (_sync)
         {
            _dict.Clear();
         }
      }

      /// <summary>
      /// Determines if a specified key is currently present within this 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </summary>
      /// <param name="key">
      /// The key to find within this dictionary.
      /// </param>
      /// <returns>
      /// True if the specified key is actually presend within the 
      /// dictionary; otherwise returns false.
      /// </returns>
      public bool ContainsKey(TKey key)
      {
         lock (_sync)
         {
            return _dict.ContainsKey(key);
         }
      }

      /// <summary>
      /// Copies all <see cref="KeyValuePair<TKey,TValue>"/> 
      /// which are currently present within this instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// to a specified array.
      /// </summary>
      /// <param name="array">
      /// The one-dimensional array to copy the current items of 
      /// this dictionary into.
      /// </param>
      /// <param name="arrayIndex">
      /// The zero-based start index within the destination array
      /// to start copying.
      /// </param>
      public void CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex)
      {
         lock (_sync)
         {
            ((IDictionary<TKey, TValue>)_dict).CopyTo(array, arrayIndex);
         }
      }

      /// <summary>
      /// Returns an <see cref="IEnumerator<T>"/> to iterate
      /// through all items within this dictionary.
      /// </summary>
      /// <returns>
      /// An <see cref="IEnumerator<T>"/> to iterate through 
      /// all items with this dictionary.
      /// </returns>
      public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
      {
         lock (_sync)
         {
            return _dict.GetEnumerator();
         }
      }

      /// <summary>
      /// Generates an exclusive thread lock for the calling thread. 
      /// As long as the 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// is locked by this method, no other thread can read or 
      /// write data into or from the dictionary.
      /// </summary>
      public void Lock()
      {
         Monitor.Enter(_sync);
      }

      /// <summary>
      /// Removes an item for a specified key from this instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </summary>
      /// <param name="key">
      /// The key of the item to remove.
      /// </param>
      /// <returns>
      /// True if the specified key was currently presend and 
      /// successfully removed from the 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>; 
      /// otherwise false.</returns>
      public bool Remove(TKey key)
      {
         lock (_sync)
         {
            return _dict.Remove(key);
         }
      }

      /// <summary>
      /// Tries to add a new item to the 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>. 
      /// If the specified key is currently not present the 
      /// new key/value pair will be added; otherwise nothing happens.
      /// </summary>
      /// <param name="key">
      /// The new key to be added to the current instance of a 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>.
      /// </param>
      /// <param name="value">
      /// The new value to be added for the specified key.
      /// </param>
      /// <returns>
      /// True if the new key/value was added to the dictionary; 
      /// otherwise false.
      /// </returns>
      public bool TryAdd(TKey key, TValue value)
      {
         lock (_sync)
         {
            if (_dict.ContainsKey(key))
               return false;
            _dict.Add(key, value);
            return true;
         }
      }

      /// <summary>
      /// Tries to get a value for a specified key from this instance 
      /// of a <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// into a specified out parameter.
      /// </summary>
      /// <param name="key">
      /// The key to get an item for.
      /// </param>
      /// <param name="value">
      /// If the specified key was found within this 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/> 
      /// the value will return the value for the key; otherwise 
      /// value will be set to the default of the type of the value.
      /// </param>
      /// <returns>
      /// True if the specified key was found within the dictionary; 
      /// otherwise false.
      /// </returns>
      public bool TryGetValue(TKey key, out TValue value)
      {
         lock (_sync)
         {
            return _dict.TryGetValue(key, out value);
         }
      }

      /// <summary>
      /// Releases a exclusive lock for this 
      /// <see cref="ThreadSafeDictionary<TKey,TValue>"/>
      /// generated by calling the <see cref="Lock"/> method.
      /// </summary>
      public void Unlock()
      {
         Monitor.Exit(_sync);
      }

      #endregion

      #region IDictionary<TKey,TValue> Members

      void IDictionary<TKey, TValue>.Add(TKey key, TValue value)
      {
         this.Add(key, value);
      }

      bool IDictionary<TKey, TValue>.ContainsKey(TKey key)
      {
         return this.ContainsKey(key);
      }

      ICollection<TKey> IDictionary<TKey, TValue>.Keys
      {
         get { return this.Keys; }
      }

      bool IDictionary<TKey, TValue>.Remove(TKey key)
      {
         return this.Remove(key);
      }

      bool IDictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value)
      {
         return this.TryGetValue(key, out value);
      }

      ICollection<TValue> IDictionary<TKey, TValue>.Values
      {
         get { return this.Values; }
      }

      TValue IDictionary<TKey, TValue>.this[TKey key]
      {
         get { return this[key]; }
         set { this[key] = value; }
      }

      #endregion

      #region ICollection<KeyValuePair<TKey,TValue>> Members

      void ICollection<KeyValuePair<TKey, TValue>>.Add(KeyValuePair<TKey, TValue> item)
      {
         this.Add(item.Key, item.Value);
      }

      void ICollection<KeyValuePair<TKey, TValue>>.Clear()
      {
         this.Clear();
      }

      bool ICollection<KeyValuePair<TKey, TValue>>.Contains(KeyValuePair<TKey, TValue> item)
      {
         TValue value;
         if (this.TryGetValue(item.Key, out value) && object.ReferenceEquals(value, item.Value))
            return true;
         return false;
      }

      void ICollection<KeyValuePair<TKey, TValue>>.CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex)
      {
         this.CopyTo(array, arrayIndex);
      }

      int ICollection<KeyValuePair<TKey, TValue>>.Count
      {
         get { return this.Count; }
      }

      bool ICollection<KeyValuePair<TKey, TValue>>.IsReadOnly
      {
         get { return false; }
      }

      bool ICollection<KeyValuePair<TKey, TValue>>.Remove(KeyValuePair<TKey, TValue> item)
      {
         return this.Remove(item.Key);
      }

      #endregion

      #region IEnumerable<KeyValuePair<TKey,TValue>> Members

      IEnumerator<KeyValuePair<TKey, TValue>> IEnumerable<KeyValuePair<TKey, TValue>>.GetEnumerator()
      {
         return this.GetEnumerator();
      }

      #endregion

      #region IEnumerable Members

      IEnumerator IEnumerable.GetEnumerator()
      {
         return this.GetEnumerator();
      }

      #endregion

      #region IDictionary Members

      void IDictionary.Add(object key, object value)
      {
         this.Add((TKey)key, (TValue)value);
      }

      void IDictionary.Clear()
      {
         this.Clear();
      }

      bool IDictionary.Contains(object key)
      {
         return this.ContainsKey((TKey)key);
      }

      IDictionaryEnumerator IDictionary.GetEnumerator()
      {
         return (IDictionaryEnumerator)GetEnumerator();
      }

      bool IDictionary.IsFixedSize
      {
         get { return false; }
      }

      bool IDictionary.IsReadOnly
      {
         get { return false; }
      }

      ICollection IDictionary.Keys
      {
         get
         {
            lock (_sync)
            {
               return new Dictionary<TKey, TValue>.KeyCollection(_dict);
            }
         }
      }

      void IDictionary.Remove(object key)
      {
         Remove((TKey)key);
      }

      ICollection IDictionary.Values
      {
         get
         {
            lock (_sync)
            {
               return new Dictionary<TKey, TValue>.ValueCollection(_dict);
            }
         }
      }

      object IDictionary.this[object key]
      {
         get { return this[(TKey)key]; }
         set { this[(TKey)key] = (TValue)value; }
      }

      #endregion

      #region ICollection Members

      void ICollection.CopyTo(Array array, int index)
      {
         lock (_sync)
         {
            ((ICollection)_dict).CopyTo(array, index);
         }
      }

      int ICollection.Count
      {
         get { return this.Count; }
      }

      bool ICollection.IsSynchronized
      {
         get { return true; }
      }

      object ICollection.SyncRoot
      {
         get { return _sync; }
      }

      #endregion
   }
}

As you can see in the following example, the dictionary can be used in a multi-threading environment just like any other dictionary in a single-threaded scenario.

using System;
using System.Collections.Generic;
using System.Threading;

namespace ConsoleApplication1
{
  class Program
  {
    // the count of items to be added by each thread
    private static int _itemCount;
    // the count of concurrent working threads
    private static int _threadCount;
    // one ThreadSafeDictionary wich will be used by all threads
    private static ThreadSafeDictionary<int, int> _dict;


    static void Main(string[] args)
    {
      // configure parameters
      _itemCount = 100000;
      _dict = new ThreadSafeDictionary<int, int>(_itemCount);
      _threadCount = 100;

      // track the working threads by a AutoResetEvent 
      // instead of old-school Thread.Sleep()
      List<AutoResetEvent> events = new List<AutoResetEvent>(_threadCount);

      for (int i = 0; i < _threadCount; i++)
      {
        // execution information
        int id = i;
        AutoResetEvent autoEvent = new AutoResetEvent(false);
        WaitCallback callback = new WaitCallback(DoAsyncWork);
        // remember the AutoResetEvent wich will be called by the 
        // worker thread when done
        events.Add(autoEvent);

        // create the worker thread using the ThreadPool
        ThreadPool.QueueUserWorkItem(
        callback,
        new object[] { id, autoEvent }
        );
      }

      // wait since all threads are finished
      foreach (var autoEvent in events)
        autoEvent.WaitOne();

      Console.WriteLine();
      Console.WriteLine("Finished");
      Console.ReadKey();
    }

    static void DoAsyncWork(object obj)
    {
      // get arguments from input parameter
      object[] args = obj as object[];
      int id = (int)args[0];
      // each threads starts to work with its own id range
      int start = id * _itemCount;
      AutoResetEvent autoEvent = (AutoResetEvent)args[1];

      Console.WriteLine("Enter {0}", id);

      // use the ThreadSafeDictionary in a multi-threaded
      // environment like each other dictionary

      // add items
      int end = _itemCount - 100;
      for (int i = 0; i < end; i++)
      {
        _dict.Add(i + start, i);
      }

      for (int i = 0; i < end; i++)
      {
        int j = _dict[i + start];
      }

      // set an exclusive lock. All other threads will wait
      _dict.Lock();
      start = _itemCount - 100;
      Console.WriteLine("-> Lock from {0}", id);
      for (int i = _itemCount - 100; i < _itemCount; i++)
      {
        _dict.Add(i + start, i);
      }
      Console.WriteLine("<- Unlock from {0}", id);
      _dict.Unlock();

      Console.WriteLine("Exit {0}", id);
      // inform caller that work is done
      autoEvent.Set();
    }
  }
}

Hope this is class will be helpful for some of you guys and gals! :-)

Flo