Wednesday, February 18, 2009

When MAX Isn't - Getting the Maximum Value of a VARCHAR Field

A user reported that a UI being fed by a particular SQL stored procedure was returning the wrong dollar amount for a customer account on an invoicing system I help maintain.

The particulars of the (somewhat convoluted) balance calculation aren't terribly exciting, but in a nutshell, a part of the process involves grabbing the most recent customer invoice by its InvoiceNumber, and using a value associated with that invoice to perform other calculations which eventually lead to the account balance.

This is a list of the customer's InvoiceNumber values, in descending order as they appear in the Invoices table.
SELECT InvoiceNumber
FROM Invoices
WHERE AccountKey = '99432'
ORDER BY InvoiceNumber DESC

12345678900902
12345678900901
1234567890507
1234567890506
1234567890505

AccountKey represents a foreign key in the Invoices table representing the customer account. Notice that the result set shows the most recent Invoice at the top of the list, and since the primary key for the Invoices table is auto-incrementing, that invoice possesses the largest key value.

Some years back, a data fix was made to this table whereby the customer account number (in this case, 1234567890) was changed to add an additional digit. Thus, what was once 1234567890 became 12345678900.

Time went by. In the last couple of months, a different data fix was made to a piece of code responsible for calculating account balances, specifically late fee calculations. This turned out to be the crux of the matter, as you'll see here.

The code which calculates the account balance, as I mentioned, needs to obtain the customer's most recent InvoiceNumber. The code used to grab this had been modified to add this section of code as follows:
SELECT MAX(InvoiceNumber)
FROM Invoices
WHERE AccountKey = '99432'

1234567890507

Take a look at the position of this InvoiceNumber in the actual list:
12345678900902
12345678900901
1234567890507
1234567890506
1234567890505

Rhut rho! There's the problem. The InvoiceNumber field is of the SQL data type varchar, and consequently the MAX function is returning what is alphanumerically the largest value, when in fact numerically the one at the top of the list should be returned instead. This resulted in the wrong InvoiceNumber being grabbed as part of the late fee calculation, and this further led to a certain balance threshold not being exceeded, which then altered the balance from what it needs to be.

The fix is easy, just CAST the InvoiceNumber to a decimal within the MAX function.
SELECT MAX(CAST(InvoiceNumber AS decimal(18,0))
FROM Invoices
WHERE AccountKey = '99432'

12345678900902

Modifying this code allowed the process to properly obtain and work with the most recent InvoiceNumber. You might be asking, why not just avoid the hassle, and grab the MAX of the Invoices table's primary key for the given AccountKey value? That would certainly make more sense, and would be more efficient, after all, the primary key field for the Invoices table is an integer, not a varchar(25) like the InvoiceNumber.

However, the code which does the calculations is outside my normal scope, so rather than modify someone else's handiwork to the extent I'd prefer, I'm opting for a less invasive change which will compensate for whatever necessitates getting the InvoiceNumber directly. In resolving this issue, I've left detailed commentary in the work order on our internal helpdesk system, so that whomever stumbles across this code in future will have a point of reference.