I discovered a new old Firefox add-on which is my new favorite. Repagination lets you take many annoyingly spoon-fed articles on the web which are broken up into separate pages and consolidate them as you like.
Increasingly, to get their advertising exposed to as many eyes as possible, content publishers will take anything from a howto article to a top 10 list and break it into individual pages. I think mainly the purpose of this is increase the CPM, which is understandable if someone is publishing content for purposes of earning some money.
Another reason for doing this is to try to regulate how much bandwidth is consumed in viewing the site. One user gobbling up 1 item of a top 10 list at a time uses less bandwidth than a user grabbing all 10 items at once. This is certainly a concern particularly for a forum which might be running with an ISP which charges for monthly bandwidth overage. However, some sites like Something Awful will do stuff like charge users a one-time fee for the privilege of searching their forum. While this is understandable, I find it annoying.
Say you have a favorite thread on a forum which is huge, but unfortunately only shows a set limit of posts per page. This add-on lets you bypass the site's limitation, and lets you consolidate all the posts into a single, huge web document.
In the screenshot below, by right-clicking on the Next link, a Repagination popup menu is present, and the "All" option I choose lets me append to the current page all subsequent pages of the forum thread.
In this example, a thread which is several hundred pages long can, if I like, be loaded en masse in a single browser window. At that point, I can easily search it for references, in this case, to artists or songs I like, or just save the whole mess for viewing later on a laptop if I happen to be somewhere without wireless connectivity (a situation which is becoming increasingly infrequent).
While the popularity of this kind of add-on certainly will open the floodgates as far as your ability to gobble up more of a content publisher's bandwidth at one time, it also will and arguably should send the message that informed users can and will get around attempts to spoon-feed content and advertising to them.
Although this add-on hasn't been updated recently, someone posted an update to this extension here which makes it compatible with Firefox 3.x. I hope that the author will consider an update of his own, I find it an incredibly helpful and useful add-on!
Monday, March 9, 2009
Wednesday, February 18, 2009
When MAX Isn't - Getting the Maximum Value of a VARCHAR Field
A user reported that a UI being fed by a particular SQL stored procedure was returning the wrong dollar amount for a customer account on an invoicing system I help maintain.
The particulars of the (somewhat convoluted) balance calculation aren't terribly exciting, but in a nutshell, a part of the process involves grabbing the most recent customer invoice by its InvoiceNumber, and using a value associated with that invoice to perform other calculations which eventually lead to the account balance.
This is a list of the customer's InvoiceNumber values, in descending order as they appear in the Invoices table.
AccountKey represents a foreign key in the Invoices table representing the customer account. Notice that the result set shows the most recent Invoice at the top of the list, and since the primary key for the Invoices table is auto-incrementing, that invoice possesses the largest key value.
Some years back, a data fix was made to this table whereby the customer account number (in this case, 1234567890) was changed to add an additional digit. Thus, what was once 1234567890 became 12345678900.
Time went by. In the last couple of months, a different data fix was made to a piece of code responsible for calculating account balances, specifically late fee calculations. This turned out to be the crux of the matter, as you'll see here.
The code which calculates the account balance, as I mentioned, needs to obtain the customer's most recent InvoiceNumber. The code used to grab this had been modified to add this section of code as follows:
Take a look at the position of this InvoiceNumber in the actual list:
Rhut rho! There's the problem. The InvoiceNumber field is of the SQL data type varchar, and consequently the MAX function is returning what is alphanumerically the largest value, when in fact numerically the one at the top of the list should be returned instead. This resulted in the wrong InvoiceNumber being grabbed as part of the late fee calculation, and this further led to a certain balance threshold not being exceeded, which then altered the balance from what it needs to be.
The fix is easy, just CAST the InvoiceNumber to a decimal within the MAX function.
Modifying this code allowed the process to properly obtain and work with the most recent InvoiceNumber. You might be asking, why not just avoid the hassle, and grab the MAX of the Invoices table's primary key for the given AccountKey value? That would certainly make more sense, and would be more efficient, after all, the primary key field for the Invoices table is an integer, not a varchar(25) like the InvoiceNumber.
However, the code which does the calculations is outside my normal scope, so rather than modify someone else's handiwork to the extent I'd prefer, I'm opting for a less invasive change which will compensate for whatever necessitates getting the InvoiceNumber directly. In resolving this issue, I've left detailed commentary in the work order on our internal helpdesk system, so that whomever stumbles across this code in future will have a point of reference.
The particulars of the (somewhat convoluted) balance calculation aren't terribly exciting, but in a nutshell, a part of the process involves grabbing the most recent customer invoice by its InvoiceNumber, and using a value associated with that invoice to perform other calculations which eventually lead to the account balance.
This is a list of the customer's InvoiceNumber values, in descending order as they appear in the Invoices table.
SELECT InvoiceNumber
FROM Invoices
WHERE AccountKey = '99432'
ORDER BY InvoiceNumber DESC
12345678900902
12345678900901
1234567890507
1234567890506
1234567890505
AccountKey represents a foreign key in the Invoices table representing the customer account. Notice that the result set shows the most recent Invoice at the top of the list, and since the primary key for the Invoices table is auto-incrementing, that invoice possesses the largest key value.
Some years back, a data fix was made to this table whereby the customer account number (in this case, 1234567890) was changed to add an additional digit. Thus, what was once 1234567890 became 12345678900.
Time went by. In the last couple of months, a different data fix was made to a piece of code responsible for calculating account balances, specifically late fee calculations. This turned out to be the crux of the matter, as you'll see here.
The code which calculates the account balance, as I mentioned, needs to obtain the customer's most recent InvoiceNumber. The code used to grab this had been modified to add this section of code as follows:
SELECT MAX(InvoiceNumber)
FROM Invoices
WHERE AccountKey = '99432'
1234567890507
Take a look at the position of this InvoiceNumber in the actual list:
12345678900902
12345678900901
1234567890507
1234567890506
1234567890505
Rhut rho! There's the problem. The InvoiceNumber field is of the SQL data type varchar, and consequently the MAX function is returning what is alphanumerically the largest value, when in fact numerically the one at the top of the list should be returned instead. This resulted in the wrong InvoiceNumber being grabbed as part of the late fee calculation, and this further led to a certain balance threshold not being exceeded, which then altered the balance from what it needs to be.
The fix is easy, just CAST the InvoiceNumber to a decimal within the MAX function.
SELECT MAX(CAST(InvoiceNumber AS decimal(18,0))
FROM Invoices
WHERE AccountKey = '99432'
12345678900902
Modifying this code allowed the process to properly obtain and work with the most recent InvoiceNumber. You might be asking, why not just avoid the hassle, and grab the MAX of the Invoices table's primary key for the given AccountKey value? That would certainly make more sense, and would be more efficient, after all, the primary key field for the Invoices table is an integer, not a varchar(25) like the InvoiceNumber.
However, the code which does the calculations is outside my normal scope, so rather than modify someone else's handiwork to the extent I'd prefer, I'm opting for a less invasive change which will compensate for whatever necessitates getting the InvoiceNumber directly. In resolving this issue, I've left detailed commentary in the work order on our internal helpdesk system, so that whomever stumbles across this code in future will have a point of reference.
Friday, January 2, 2009
Extend Existing XML Parsing to Support Multiple Child Tags
I work on a database which is currently a mix of Microsoft SQL 2000 and 2005 T-SQL code. As part of a project I’m working on, I'm working with a stored procedure which receives a reference to an XML document in the form of an integer variable @idoc, and this XML document’s structure needs to be extended with some additional elements.
I'd prefer to refactor the plumbing which generates the XML in the remote stored procedure so that I can take advantage of the new XML data type in 2005, which would allow me to simply pass the XML as a parameter to my stored proc, but unfortunately time is of the essence, so I need to develop an alternative which doesn't involve refactoring to that extent.
The current XML resembles the following:
The new XML will look like this:
The stored procedure which I need to modify for my project receives the pointer to the XML document, @idoc, and after extracting values from it makes various other calls to other stored procs.
A new State element is being added, with a child StateInfo element that describes the state in question, and any number of child County elements describing counties in that state. In other words, this XML describes an account, the state it operates in, and however many counties the state contains which are relevant to a given account.
My major challenge with this is in the form of the new County tags. Whereas previously there would only ever be a single instance of any particular tag in this XML, there could potentially be dozens of County tags associated with a given State for an account! For each County I need to be able to iterate through the collection and provide these details somehow to my stored procedure for further processing.
My work is pretty well cut out for me. I need to find a way to:
At this point, @idoc is a reference to an accurate representation of the XML structure my stored proc will receive and process.
As I mentioned, the code outside of the scope of my stored proc is SQL 2000 based, but that doesn’t mean I can’t use some of the great new capabilities in 2005 in my code. Thus, here I use the OPENXML command to crack open the XML, in conjunction with the FOR XML PATH directive to build the XML structure.
Here is the output from execution of the above code. I’m done… right??
Hmmm, not quite. Notice that there are separate tags beneath AccountInfo, StateInfo and County. Whereas my prototype XML has the values under these tags in the form of XML attributes, they appear here as tags. Not good!
I found an article by Jerry Dixon which led me to the answer. I just needed to take each item in my SELECT statements and use AS ‘@<attributeName>’ to tell SQL to output the item as an attribute of the node rather than a separate element.
Now my rebuilt representation of the original XML document, reconstructed using @idoc as a reference, is just the way I need it to be:
The next requirement, how to serialize only the County values tied to each account? I can use some of the code which reconstructs the County nodes, above, and omit the AS and FOR XML PATH(‘County’), Type syntax so that a result set rather than XML is returned:
Here’s the result set output by the SELECT:
I defined a table variable to hold the County values, along with its own autoincrementing ID (just because it’s handy to have a unique identifier for other stuff happening in my stored proc).
Finally, I populate the table variable with the County data I extract from the XML:
In summary, I’ve created code which extends the stored procedure so that it can handle multiple instances of a tag, whereas previously it was built to expect only a single instance of each tag. This enables me to properly recombobulate, if you will, the original XML passed by reference from a remote stored proc, then serialize all instances of the new tag and mess with them elsewhere in my code.
___
Special thanks to Greg Houston and his article Format My Source Code For Blogging which helped me compensate for difficulties I had with formatting my code.
I'd prefer to refactor the plumbing which generates the XML in the remote stored procedure so that I can take advantage of the new XML data type in 2005, which would allow me to simply pass the XML as a parameter to my stored proc, but unfortunately time is of the essence, so I need to develop an alternative which doesn't involve refactoring to that extent.
The current XML resembles the following:
<XML>
<AccountInfo
Number="123456789"
Status="Active"
Region="SE" />
</XML>
The new XML will look like this:
<XML>
<AccountInfo
Number="123456789"
Status="Active"
Region="SE" />
<State>
<StateInfo
Name="Florida"
Population="9488294"
AreaSquareMiles="58560"
LandAreaSquareMiles="54252"
WaterAreaSquareMiles="4308" />
<County
Name="Alachua"
Population="277120" />
<County
Name="Union"
Population="14842" />
</State>
</XML>
The stored procedure which I need to modify for my project receives the pointer to the XML document, @idoc, and after extracting values from it makes various other calls to other stored procs.
A new State element is being added, with a child StateInfo element that describes the state in question, and any number of child County elements describing counties in that state. In other words, this XML describes an account, the state it operates in, and however many counties the state contains which are relevant to a given account.
My major challenge with this is in the form of the new County tags. Whereas previously there would only ever be a single instance of any particular tag in this XML, there could potentially be dozens of County tags associated with a given State for an account! For each County I need to be able to iterate through the collection and provide these details somehow to my stored procedure for further processing.
My work is pretty well cut out for me. I need to find a way to:
- Rebuild the XML document using the @idoc as a reference.
- Grab the new XML specific to each County tag and serialize the values.
DECLARE @TestXML varchar(4000)
DECLARE @idoc int
SET @TestXML = '<XML>
<AccountInfo
Number="123456789"
Status="Active"
Region="SE" />
<State>
<StateInfo
Name="Florida"
Population="9488294"
AreaSquareMiles="58560"
LandAreaSquareMiles="54252"
WaterAreaSquareMiles="4308" />
<County
Name="Alachua"
Population="277120" />
<County
Name="Union"
Population="14842" />
</State>
</XML>'
EXEC sp_xml_preparedocument @idoc OUTPUT, @TestXML
At this point, @idoc is a reference to an accurate representation of the XML structure my stored proc will receive and process.
As I mentioned, the code outside of the scope of my stored proc is SQL 2000 based, but that doesn’t mean I can’t use some of the great new capabilities in 2005 in my code. Thus, here I use the OPENXML command to crack open the XML, in conjunction with the FOR XML PATH directive to build the XML structure.
-- First build the XML to give @idoc a reason to live.
EXEC sp_xml_preparedocument @idoc OUTPUT, @TestXML
SELECT
(
SELECT Number, Status, Region
FROM OPENXML(@idoc, '/XML/AccountInfo', 1) WITH (
Number int,
Status varchar(100),
Region varchar(2)
) AS StateInfo FOR XML PATH('AccountInfo'), Type
),
(
SELECT
(
SELECT Name,
Population,
AreaSquareMiles,
LandAreaSquareMiles,
WaterAreaSquareMiles
FROM OPENXML(@idoc, '/XML/StateInfo', 1) WITH (
Name varchar(100),
Population int,
AreaSquareMiles int,
LandAreaSquareMiles int,
WaterAreaSquareMiles int
) AS StateInfo FOR XML PATH('StateInfo'), Type
),
(
SELECT Name,
Population
FROM OPENXML(@idoc, '/XML/County', 1) WITH (
Name varchar(100),
Population int
) AS County FOR XML PATH('County'), Type
)
FOR XML PATH('State'), Type
)
FOR XML PATH('XML')
-- Eliminate the document.
EXEC sp_xml_removedocument @idoc
Here is the output from execution of the above code. I’m done… right??
<XML>
<AccountInfo>
<Number>123456789</Number>
<Status>Active</Status>
<Region>SE</Region>
</AccountInfo>
<State>
<StateInfo>
<Name>Florida</Name>
<Population>9488294</Population>
<AreaSquareMiles>58560</AreaSquareMiles>
<LandAreaSquareMiles>54252</LandAreaSquareMiles>
<WaterAreaSquareMiles>4308</WaterAreaSquareMiles>
</StateInfo>
<County>
<Name>Alachua</Name>
<Population>277120</Population>
</County>
<County>
<Name>Union</Name>
<Population>14842</Population>
</County>
</State>
</XML>
Hmmm, not quite. Notice that there are separate tags beneath AccountInfo, StateInfo and County. Whereas my prototype XML has the values under these tags in the form of XML attributes, they appear here as tags. Not good!
I found an article by Jerry Dixon which led me to the answer. I just needed to take each item in my SELECT statements and use AS ‘@<attributeName>
EXEC sp_xml_preparedocument @idoc OUTPUT, @TestXML
SELECT
(
SELECT Number AS '@Number',
Status AS '@Status',
Region AS '@Region'
FROM OPENXML(@idoc, '/XML/AccountInfo', 1) WITH (
Number int,
Status varchar(100),
Region varchar(2)
) AS StateInfo FOR XML PATH('AccountInfo'), Type
),
(
SELECT Name AS '@Name',
Population AS '@Population',
AreaSquareMiles AS '@AreaSquareMiles',
LandAreaSquareMiles AS '@LandAreaSquareMiles',
WaterAreaSquareMiles AS '@WaterAreaSquareMiles'
FROM OPENXML(@idoc, '/XML/State/StateInfo', 1) WITH (
Name varchar(100),
Population int,
AreaSquareMiles int,
LandAreaSquareMiles int,
WaterAreaSquareMiles int
) AS StateInfo FOR XML PATH('StateInfo'), Type
),
(
SELECT Name AS '@Name',
Population AS '@Population'
FROM OPENXML(@idoc, '/XML/State/County', 1) WITH (
Name varchar(100),
Population int
) AS County FOR XML PATH('County'), Type
)
FOR XML PATH('XML')
EXEC sp_xml_removedocument @idoc
Now my rebuilt representation of the original XML document, reconstructed using @idoc as a reference, is just the way I need it to be:
<XML>
<AccountInfo Number="123456789" Status="Active" Region="SE" />
<State>
<StateInfo
Name="Florida"
Population="9488294"
AreaSquareMiles="58560"
LandAreaSquareMiles="54252"
WaterAreaSquareMiles="4308" />
<County Name="Alachua" Population="277120" />
<County Name="Union" Population="14842" />
</State>
</XML>
The next requirement, how to serialize only the County values tied to each account? I can use some of the code which reconstructs the County nodes, above, and omit the AS and FOR XML PATH(‘County’), Type syntax so that a result set rather than XML is returned:
SELECT Name, Population
FROM OPENXML(@idoc, '/XML/State/County', 1) WITH (
Name varchar(100),
Population int
)
Here’s the result set output by the SELECT:
Name Population
Alachua 277120
Union 14842
I defined a table variable to hold the County values, along with its own autoincrementing ID (just because it’s handy to have a unique identifier for other stuff happening in my stored proc).
DECLARE @County TABLE
(
ID int IDENTITY (1, 1) PRIMARY KEY NOT NULL,
Name varchar(100),
Population int
)
Finally, I populate the table variable with the County data I extract from the XML:
INSERT INTO @County (Name, Population)
SELECT Name, Population
FROM OPENXML(@idoc, '/XML/State/County', 1) WITH (
Name varchar(100),
Population int
)
In summary, I’ve created code which extends the stored procedure so that it can handle multiple instances of a tag, whereas previously it was built to expect only a single instance of each tag. This enables me to properly recombobulate, if you will, the original XML passed by reference from a remote stored proc, then serialize all instances of the new tag and mess with them elsewhere in my code.
___
Special thanks to Greg Houston and his article Format My Source Code For Blogging which helped me compensate for difficulties I had with formatting my code.
Wednesday, December 17, 2008
The Coming Salivation
Labels:
god,
mastication,
salivation,
salvation
Subscribe to:
Posts (Atom)