Monday, January 30, 2012

Advanced SQL techniques using IN and XML Path

Ok, so the reason I decided for this article to use IN and XML Path together is in my mind they kind of go hand in hand as they (sometimes virtually) produce lists. 

If you have a basic understanding of SQL you know that your Select statement fetches applicable records into a record set and then SQL massages it with your Where, Having clauses and the like (for instance Distinct).

So imagine we have a table with a field for Customer Number.  This is a key field, but not an identity field.  For instance, we are looking at an Orders table.  So of course we have an order number (most likely a varchar(50)) a customer number field (or customer name, whichever also a varchar(50)) and a product description field (for simplicity we’ll leave it at those fields).
So our SQL statement to create the table would look like -

Use [order_database]










    Create Table dbo.Orders


         [order_number] varchar(50) identity NOT NULL    

    ,    [customer_number] varchar(50) NULL    

    ,    [product_descr] varchar(1000) NULL 

    ) On [Primary]





Or something like that….

So now that we have our table built, we have the following data -

order_number customer_number product_descr
10134567 cn123 bicycle
10134568 cn123 pencil
10134569 cn321 car

So as you can see here, we have two customers in the table.  Customer cn123 has bought both a bicycle and pencil and customer cn321 has bought a new car (whoop, whoop!).

At any rate, our requirement is to return only one row per customer with a list of products that the customer has bought.  Thus, it will look like -

customer_number product_descr
cn123 bicycle, pencil
cn321 car

Impossible for SQL to concatenate columns into a row, you say? Well I laugh at your doubt and we will prove you wrong by utilizing XML Path (and also the STUFF tSQL keyword)!

So, to set this up we will use our Orders table twice (once called main_table and the second called sub_table).  Here’s our code -

SELECT      distinct(main_table.customer_number) AS [customer_number],

            STUFF((    SELECT ',' + sub_table.product_descr AS [text()]

                        FROM dbo.Orders sub_table


                        sub_table.customer_number = main_table.customer_number

                        FOR XML PATH('')), 1, 1, '' )

                        AS [product_descr]

FROM  dbo.Orders main_table

So looking at this Query, we are only wanting unique (or distinct which means one of each) customer_number in our original select, then we move into the STUFF keyword which combines strings in SQL and we are basing that off of the customer_number and select the product_descr.  Finally, we are adding a comma, which the 1, 1, ‘’ references the STUFF keyword and removes the preceding (first) comma in front of the first item.

Thus, we end up with the table above and prove your doubts unfounded. Smile

So now we move into the IN keyword.  Say we have a requirement that states if a customer buys a car, instead of a putting car in the column we want it to say “Valued Customer” (cause they spent so much money on a new car).  Obviously, this is a ridiculous requirement but let’s keep it simple.  In this case, we would use the CASE keyword.  Our code would look like this -



    ,    o.order_number

    ,    Case

            When 'car' IN o.product_descr Then 'Valued Customer'

            When 'car' NOT IN o.product_descr Then 'Regular Customer'

            -- We also could have used a ELSE statement

            -- Like So:

            -- Else 'Regular Customer'

         END As 'Customer Value'

This will give use three columns – customer_number, order_number, and Customer Value.

This demonstration hits home on several KEYWORDS like Case, When, End As, IN, and NOT IN.

SQL is one of the most versatile database systems available and it has unsurpassed flexibility.  Massaging data is its specialty.  With knowledge of keywords and knowing how to write tSQL you can manipulate data into any format you want.

Happy .Netting – Saleh…


If you want to weed out additional copies of the same item in your list, you can add a Group By clause after the WHERE clause in your XML Path.

No comments:

Post a Comment