Wednesday, February 18, 2009

When MAX Isn't - Getting the Maximum Value of a VARCHAR Field

A user reported that a UI being fed by a particular SQL stored procedure was returning the wrong dollar amount for a customer account on an invoicing system I help maintain.

The particulars of the (somewhat convoluted) balance calculation aren't terribly exciting, but in a nutshell, a part of the process involves grabbing the most recent customer invoice by its InvoiceNumber, and using a value associated with that invoice to perform other calculations which eventually lead to the account balance.

This is a list of the customer's InvoiceNumber values, in descending order as they appear in the Invoices table.
SELECT InvoiceNumber
FROM Invoices
WHERE AccountKey = '99432'
ORDER BY InvoiceNumber DESC

12345678900902
12345678900901
1234567890507
1234567890506
1234567890505

AccountKey represents a foreign key in the Invoices table representing the customer account. Notice that the result set shows the most recent Invoice at the top of the list, and since the primary key for the Invoices table is auto-incrementing, that invoice possesses the largest key value.

Some years back, a data fix was made to this table whereby the customer account number (in this case, 1234567890) was changed to add an additional digit. Thus, what was once 1234567890 became 12345678900.

Time went by. In the last couple of months, a different data fix was made to a piece of code responsible for calculating account balances, specifically late fee calculations. This turned out to be the crux of the matter, as you'll see here.

The code which calculates the account balance, as I mentioned, needs to obtain the customer's most recent InvoiceNumber. The code used to grab this had been modified to add this section of code as follows:
SELECT MAX(InvoiceNumber)
FROM Invoices
WHERE AccountKey = '99432'

1234567890507

Take a look at the position of this InvoiceNumber in the actual list:
12345678900902
12345678900901
1234567890507
1234567890506
1234567890505

Rhut rho! There's the problem. The InvoiceNumber field is of the SQL data type varchar, and consequently the MAX function is returning what is alphanumerically the largest value, when in fact numerically the one at the top of the list should be returned instead. This resulted in the wrong InvoiceNumber being grabbed as part of the late fee calculation, and this further led to a certain balance threshold not being exceeded, which then altered the balance from what it needs to be.

The fix is easy, just CAST the InvoiceNumber to a decimal within the MAX function.
SELECT MAX(CAST(InvoiceNumber AS decimal(18,0))
FROM Invoices
WHERE AccountKey = '99432'

12345678900902

Modifying this code allowed the process to properly obtain and work with the most recent InvoiceNumber. You might be asking, why not just avoid the hassle, and grab the MAX of the Invoices table's primary key for the given AccountKey value? That would certainly make more sense, and would be more efficient, after all, the primary key field for the Invoices table is an integer, not a varchar(25) like the InvoiceNumber.

However, the code which does the calculations is outside my normal scope, so rather than modify someone else's handiwork to the extent I'd prefer, I'm opting for a less invasive change which will compensate for whatever necessitates getting the InvoiceNumber directly. In resolving this issue, I've left detailed commentary in the work order on our internal helpdesk system, so that whomever stumbles across this code in future will have a point of reference.

Friday, January 2, 2009

Extend Existing XML Parsing to Support Multiple Child Tags

I work on a database which is currently a mix of Microsoft SQL 2000 and 2005 T-SQL code. As part of a project I’m working on, I'm working with a stored procedure which receives a reference to an XML document in the form of an integer variable @idoc, and this XML document’s structure needs to be extended with some additional elements.

I'd prefer to refactor the plumbing which generates the XML in the remote stored procedure so that I can take advantage of the new XML data type in 2005, which would allow me to simply pass the XML as a parameter to my stored proc, but unfortunately time is of the essence, so I need to develop an alternative which doesn't involve refactoring to that extent.

The current XML resembles the following:

<XML>
<AccountInfo
Number="123456789"
Status="Active"
Region="SE" />
</XML>


The new XML will look like this:

<XML>
<AccountInfo
Number="123456789"
Status="Active"
Region="SE" />
<State>
<StateInfo
Name="Florida"
Population="9488294"
AreaSquareMiles="58560"
LandAreaSquareMiles="54252"
WaterAreaSquareMiles="4308" />
<County
Name="Alachua"
Population="277120" />
<County
Name="Union"
Population="14842" />
</State>
</XML>


The stored procedure which I need to modify for my project receives the pointer to the XML document, @idoc, and after extracting values from it makes various other calls to other stored procs.

A new State element is being added, with a child StateInfo element that describes the state in question, and any number of child County elements describing counties in that state. In other words, this XML describes an account, the state it operates in, and however many counties the state contains which are relevant to a given account.

My major challenge with this is in the form of the new County tags. Whereas previously there would only ever be a single instance of any particular tag in this XML, there could potentially be dozens of County tags associated with a given State for an account! For each County I need to be able to iterate through the collection and provide these details somehow to my stored procedure for further processing.

My work is pretty well cut out for me. I need to find a way to:
  • Rebuild the XML document using the @idoc as a reference.
  • Grab the new XML specific to each County tag and serialize the values.
To reproduce the XML in the state my stored procedure will be consuming it, I first needed to generate my own @idoc using sp_xml_preparedocument like this:

DECLARE @TestXML varchar(4000)
DECLARE @idoc int

SET @TestXML = '<XML>
<AccountInfo
Number="123456789"
Status="Active"
Region="SE" />
<State>
<StateInfo
Name="Florida"
Population="9488294"
AreaSquareMiles="58560"
LandAreaSquareMiles="54252"
WaterAreaSquareMiles="4308" />
<County
Name="Alachua"
Population="277120" />
<County
Name="Union"
Population="14842" />
</State>
</XML>'

EXEC sp_xml_preparedocument @idoc OUTPUT, @TestXML


At this point, @idoc is a reference to an accurate representation of the XML structure my stored proc will receive and process.

As I mentioned, the code outside of the scope of my stored proc is SQL 2000 based, but that doesn’t mean I can’t use some of the great new capabilities in 2005 in my code. Thus, here I use the OPENXML command to crack open the XML, in conjunction with the FOR XML PATH directive to build the XML structure.

-- First build the XML to give @idoc a reason to live.
EXEC sp_xml_preparedocument @idoc OUTPUT, @TestXML

SELECT
(
SELECT Number, Status, Region
FROM OPENXML(@idoc, '/XML/AccountInfo', 1) WITH (
Number int,
Status varchar(100),
Region varchar(2)
) AS StateInfo FOR XML PATH('AccountInfo'), Type
),
(
SELECT
(
SELECT Name,
Population,
AreaSquareMiles,
LandAreaSquareMiles,
WaterAreaSquareMiles
FROM OPENXML(@idoc, '/XML/StateInfo', 1) WITH (
Name varchar(100),
Population int,
AreaSquareMiles int,
LandAreaSquareMiles int,
WaterAreaSquareMiles int
) AS StateInfo FOR XML PATH('StateInfo'), Type
),

(
SELECT Name,
Population
FROM OPENXML(@idoc, '/XML/County', 1) WITH (
Name varchar(100),
Population int
) AS County FOR XML PATH('County'), Type
)
FOR XML PATH('State'), Type
)
FOR XML PATH('XML')

-- Eliminate the document.
EXEC sp_xml_removedocument @idoc


Here is the output from execution of the above code. I’m done… right??

<XML>
<AccountInfo>
<Number>123456789</Number>
<Status>Active</Status>
<Region>SE</Region>
</AccountInfo>
<State>
<StateInfo>
<Name>Florida</Name>
<Population>9488294</Population>
<AreaSquareMiles>58560</AreaSquareMiles>
<LandAreaSquareMiles>54252</LandAreaSquareMiles>
<WaterAreaSquareMiles>4308</WaterAreaSquareMiles>
</StateInfo>
<County>
<Name>Alachua</Name>
<Population>277120</Population>
</County>
<County>
<Name>Union</Name>
<Population>14842</Population>
</County>
</State>
</XML>


Hmmm, not quite. Notice that there are separate tags beneath AccountInfo, StateInfo and County. Whereas my prototype XML has the values under these tags in the form of XML attributes, they appear here as tags. Not good!

I found an article by Jerry Dixon which led me to the answer. I just needed to take each item in my SELECT statements and use AS ‘@<attributeName> to tell SQL to output the item as an attribute of the node rather than a separate element.

EXEC sp_xml_preparedocument @idoc OUTPUT, @TestXML

SELECT
(
SELECT Number AS '@Number',
Status AS '@Status',
Region AS '@Region'
FROM OPENXML(@idoc, '/XML/AccountInfo', 1) WITH (
Number int,
Status varchar(100),
Region varchar(2)
) AS StateInfo FOR XML PATH('AccountInfo'), Type
),

(
SELECT Name AS '@Name',
Population AS '@Population',
AreaSquareMiles AS '@AreaSquareMiles',
LandAreaSquareMiles AS '@LandAreaSquareMiles',
WaterAreaSquareMiles AS '@WaterAreaSquareMiles'
FROM OPENXML(@idoc, '/XML/State/StateInfo', 1) WITH (
Name varchar(100),
Population int,
AreaSquareMiles int,
LandAreaSquareMiles int,
WaterAreaSquareMiles int
) AS StateInfo FOR XML PATH('StateInfo'), Type
),

(
SELECT Name AS '@Name',
Population AS '@Population'
FROM OPENXML(@idoc, '/XML/State/County', 1) WITH (
Name varchar(100),
Population int
) AS County FOR XML PATH('County'), Type
)

FOR XML PATH('XML')

EXEC sp_xml_removedocument @idoc


Now my rebuilt representation of the original XML document, reconstructed using @idoc as a reference, is just the way I need it to be:

<XML>
<AccountInfo Number="123456789" Status="Active" Region="SE" />
<State>
<StateInfo
Name="Florida"
Population="9488294"
AreaSquareMiles="58560"
LandAreaSquareMiles="54252"
WaterAreaSquareMiles="4308" />
<County Name="Alachua" Population="277120" />
<County Name="Union" Population="14842" />
</State>
</XML>


The next requirement, how to serialize only the County values tied to each account? I can use some of the code which reconstructs the County nodes, above, and omit the AS and FOR XML PATH(‘County’), Type syntax so that a result set rather than XML is returned:

   SELECT Name, Population
FROM OPENXML(@idoc, '/XML/State/County', 1) WITH (
Name varchar(100),
Population int
)


Here’s the result set output by the SELECT:


   Name          Population

Alachua 277120
Union 14842


I defined a table variable to hold the County values, along with its own autoincrementing ID (just because it’s handy to have a unique identifier for other stuff happening in my stored proc).


DECLARE @County TABLE
(
ID int IDENTITY (1, 1) PRIMARY KEY NOT NULL,
Name varchar(100),
Population int
)


Finally, I populate the table variable with the County data I extract from the XML:


INSERT INTO @County (Name, Population)
SELECT Name, Population
FROM OPENXML(@idoc, '/XML/State/County', 1) WITH (
Name varchar(100),
Population int
)


In summary
, I’ve created code which extends the stored procedure so that it can handle multiple instances of a tag, whereas previously it was built to expect only a single instance of each tag. This enables me to properly recombobulate, if you will, the original XML passed by reference from a remote stored proc, then serialize all instances of the new tag and mess with them elsewhere in my code.


___

Special thanks to Greg Houston and his article Format My Source Code For Blogging which helped me compensate for difficulties I had with formatting my code.

Wednesday, December 17, 2008

The Coming Salivation

Brought to you through AnswerBag, a site where goofy Q&A is just a click away...

Friday, October 10, 2008

John McCain Gets Personal... with a Libertarian?!




I'm SOOO excited!!!!1


I received a letter in the mail recently from John McCain! It had his name at the top and EVERYthing!!!












"Dear Friend,"
Why, hello, friend John McCain!




"
In the past you have served as a valued and trusted member of our Party."

Uhhh, wait a minute now...
I'm a Libertarian!
I've been one for over 15 years...!





"...the RNC's records indicate that you still remain on the sidelines and have yet to respond to appeals to stand with us in our fight to prevail over the Obama Democrats."
Well... duh! C'mon John boy, I've NEVER supported your party!







"I am asking you to please get involved NOW."
Uhhh... NO??



"I would not ask for your help
if the circumstances were not so dire."

Owning 7 houses and 13 cars... sure, that's dire...



"But the alarming truth is, if the Obama Democrats win the White House and retain control of Congress, they will have total control of the federal government."
ZOMG, srsly???!











The little faux-handwriting blurb reads...

"By using your own first class stamp to return this envelope,
you will be helping us save much needed funds. -- Thank you."






Hmmm... "POSTAGE WILL BE PAID BY ADDRESSEE..."