Temporarily Lacking Free Time

From June through November of 2011, I spent much of my free time making progress on support for Microsoft Azure, but then suddenly found myself without any time to work on MediaWiki at all! So, what was I doing instead? I've been spending all my time with my FIRST Robotics Competition team, FRC Team 457 from South San High School. The weekend before last, we competed in the Alamo Regional Championship, won in the finals, and qualified for the national competition. I have a bit of a respite now, but will be joining my team at the nationals in a bit over a month. Here I am with Chango, our team's robot, in the pits at the Alamo Regional.


Struggling with an inexplicable issue

I was seeing some odd behavior in 1.19 alpha, so I decided to focus my attention on 1.18 instead, so that I could be more confident my work was against a known-good version of MediaWiki. Like 1.19 alpha, 1.18 is mostly working -- but here's a head scratcher. I'm getting strange rendering of some characters, such as the hyphen (-). Here's a page as seen in MediaWiki 1.18 on Windows 7, SQL Server 2008 R2, and IIS 7.0. Notice the messed-up "Model-View-Controller" in the table of contents. The page was copied over from the English Wikipedia. Here's what's very strange: the SQL database has ordinary hyphens in it, nothing remarkable. When I select the "Edit" tab, I can see the hyphens in the text box, and when I press preview, the result looks fine! The table of contents now shows "Model-View-Controller". Why is the rendering different between the preview and the rendered page? Except for the database access, the code is pretty much identical between the official 1.18 code and my variant.


1.19 alpha working on Windows

MediaWiki 1.19 alpha is up and running on Windows, with testing and perhaps modifications to come. For now, it needs the previous version of the installer rather than the current one (as of the NOLA Hackathon), but Max, who works on the installer, is aware of the problem, so I expect a newer version of the installer will work as well. Also, compression of values stored to the objectstore is disabled for this version, since I'm having problems when I compress and decompress those values.


SQL Server JumpIn Outcome

I had a great time at the Microsoft-hosted SQL Server JumpIn! camp. I got MediaWiki 1.16.5 to run on IIS 7 on Windows 7 with SQL Server 2008 R2, with SQL Server Denali, and with SQL Server Azure. I also got the engine mostly working on Windows Azure, deployed to the cloud, but need to flesh out a few missing pieces. First, sessions will have to be stored to the database, rather than to the file system. Second, the file-based media storage mechanism will have to be adapted to use blob storage. Also, I'll have to figure out how to get Java support so I can continue to use Apache Batik, and how I'll get that to work with cloud storage.
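To make the first missing piece a little more concrete, here is a minimal sketch of what moving sessions off the file system might look like using PHP's standard SessionHandlerInterface. This is not the code I deployed: the class name ArraySessionStore is made up for illustration, and a plain PHP array stands in for a sessions table so the example runs without a database. A real version would presumably replace the array operations with SELECT, INSERT/UPDATE, and DELETE statements against SQL Server.

```php
<?php
// Sketch only: ArraySessionStore is a hypothetical name, and the $rows
// array stands in for a database table with (id, data, time) columns.
class ArraySessionStore implements SessionHandlerInterface
{
    private $rows = array();

    // Nothing to open or close for an in-memory store.
    public function open($path, $name): bool { return true; }
    public function close(): bool { return true; }

    // Return the serialized session data, or '' for an unknown id.
    public function read($id): string
    {
        return isset($this->rows[$id]) ? $this->rows[$id]['data'] : '';
    }

    // Insert or update the row for this session id.
    public function write($id, $data): bool
    {
        $this->rows[$id] = array('data' => $data, 'time' => time());
        return true;
    }

    // Delete the row for this session id.
    public function destroy($id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    // Remove rows older than $max_lifetime seconds; return how many went away.
    public function gc($max_lifetime): int
    {
        $cutoff = time() - $max_lifetime;
        $before = count($this->rows);
        foreach ($this->rows as $id => $row) {
            if ($row['time'] < $cutoff) {
                unset($this->rows[$id]);
            }
        }
        return $before - count($this->rows);
    }
}

$handler = new ArraySessionStore();
```

Registering the handler would be a matter of calling session_set_save_handler($handler, true) before session_start(); I have left that out so the sketch has no side effects when run from the command line.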


Like a Phoenix from the Ashes

This work is coming back to life. Next week, I'll be attending the Microsoft SQL Server JumpIn! camp, along with sixteen other PHP developers. I'll be devoting special attention to trying out the new Microsoft PDO driver for SQL Server. Mostly, it should be the same as it was when using the ADODB library, except I'll have a new Database class. Since I haven't kept things up to date, I'll have to keep my fingers crossed that I can get things up and running with MediaWiki 1.16.5 or 1.17. I've been pleased to discover, however, that while I was not looking, the SQL syntax in tables.sql for MySQL has evolved to resemble what I had been doing for SQL Server much more closely than it used to. Also, I'll be targeting SQL Server 2008 R2 rather than SQL Server 2000 or SQL Server 2005.


An issue of timing

For the Microsoft SQL Server version of the MediaWiki engine (at version 1.13 or so), everything was working except Special:WantedPages and Special:AllPages. With the new design of Special:AllPages, I couldn't make heads or tails of the SQL being generated, so I just reverted to the old behavior and it's fine. Special:WantedPages was different. The SQL was fine, but as the wiki has grown (closing in on 200,000 articles), the database server was timing out before returning any results to PHP. This problem was relatively easy to fix, using a little-known property of the ADO Connection object, Connection::CommandTimeout. The default for Connection::CommandTimeout is 30 seconds, and no results were being returned for the (expensive) query until about 35 seconds or so. Once a row comes back from the server, ADO doesn't care how much time it takes to complete the query, but it's pretty sensitive to how long it takes for results to START coming back. I went into the ADOdb connector code and, in the file adodb-ado5.inc.php, added the following single line right before the return from the method ADODB_ado::_connect:
   $dbc->CommandTimeout = 90;
Although this fixed the timeout problem, it brought out a latent problem. Some of the Special pages that return true from QueryPage::isExpensive -- including this one -- were trying to cache their results in a temporary table. This clashed with ADOdb's handling of OFFSET and LIMIT clauses in the SQL, so I have disabled the caching of results from "expensive" Special pages by not performing any activity in the overridden QueryPage::preProcessResults methods for these pages (Special:ShortPages, Special:LongPages, and Special:WantedPages in particular). Down the road, perhaps I'll think of a slightly more clever way to handle SQL OFFSET and LIMIT clauses and I can re-enable the preProcessResults methods.
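For what it's worth, one direction a more clever OFFSET/LIMIT handler could take is to let the server do the paging with ROW_NUMBER(), which is available in SQL Server 2005 and later. The sketch below is not what ADOdb or my current code does; the function name pagedSelect and its parameters are made up for illustration, and it assumes the inner query has no ORDER BY of its own, since SQL Server does not allow ORDER BY in a derived table without TOP.

```php
<?php
// Sketch only: pagedSelect() is a hypothetical helper, not part of ADOdb
// or MediaWiki. It wraps a SELECT so that SQL Server returns only the rows
// a MySQL-style "LIMIT $limit OFFSET $offset" would have returned.
function pagedSelect($innerSelect, $orderBy, $offset, $limit)
{
    // ROW_NUMBER() is 1-based, so the first wanted row is offset + 1.
    $first = (int)$offset + 1;
    $last  = (int)$offset + (int)$limit;

    // $innerSelect must not contain its own ORDER BY; the ordering is
    // supplied separately and applied inside the OVER clause.
    return "SELECT * FROM ("
         . "SELECT q.*, ROW_NUMBER() OVER (ORDER BY $orderBy) AS rn "
         . "FROM ($innerSelect) AS q"
         . ") AS numbered "
         . "WHERE rn BETWEEN $first AND $last ORDER BY rn";
}

// Example: the third page of 25 titles (rows 51 through 75).
$sql = pagedSelect('SELECT page_id, page_title FROM page', 'page_title', 50, 25);
```

The result set picks up an extra rn column, which the caller would have to ignore, and the $orderBy string is pasted into the SQL as-is, so it must come from trusted code rather than from user input.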


Another recommended change for ADODB

I'm trying to minimize the number of changes needed to the wiki code, which is still fairly inconsistent about whether it retrieves data from the database by column name or by column number. To make things as simple as possible, I set the fetch mode to ADODB_FETCH_BOTH rather than toggling back and forth between ADODB_FETCH_ASSOC and ADODB_FETCH_NUM, as I had been doing in the past. To make this work right, you'll need to make a change to the file adodb-ado5.inc.php. Change line 654 from
$this->fields = $this->GetRowAssoc(ADODB_ASSOC_CASE);
to read
$this->fields = array_merge($this->fields, $this->GetRowAssoc(ADODB_ASSOC_CASE));
Having done this, you'll have an array with both numeric keys and string keys.
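To illustrate what the merged array looks like, here is a small stand-alone sketch. The row values and column names are invented for the example; only array_merge itself is the real PHP function. It relies on the fact that array_merge keeps string keys and renumbers numeric keys from zero, which leaves an already zero-based row unchanged.

```php
<?php
// Hypothetical row as fetched by column number.
$numeric = array(0 => 42, 1 => 'Main_Page');

// The same hypothetical row keyed by column name, standing in for
// what GetRowAssoc() would return.
$assoc = array('page_id' => 42, 'page_title' => 'Main_Page');

// Numeric keys come first, followed by the string keys.
$fields = array_merge($numeric, $assoc);

// Both access styles now work against the same row.
$byNumber = $fields[1];
$byName   = $fields['page_title'];
```

Code that asks for column 1 and code that asks for page_title both get the same value, which is the point of using ADODB_FETCH_BOTH here.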