tag:blogger.com,1999:blog-369200532024-02-20T23:18:23.596-05:00MediaWikiWorkerThis blog records work my colleagues and I are doing with the MediaWiki engine.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.comBlogger19125tag:blogger.com,1999:blog-36920053.post-11487876411270146222012-03-14T14:21:00.003-06:002012-03-14T14:29:24.409-06:00Temporarily Lacking Free Time<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2OzZsDMzoUDO-3Wvb5pu2-kRBIzVMXOT05Q0JjLLkxEVXEsZ8wVtlu6YOeW5vAo_ajwm2b0Qkrv83PEBXMBeyi-QZYReBKgYee0sUWW73qA1VvhdTYSy5Qkp8VMwa6D7e7LhPuA/s1600/100_5655.jpg"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 240px; height: 320px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2OzZsDMzoUDO-3Wvb5pu2-kRBIzVMXOT05Q0JjLLkxEVXEsZ8wVtlu6YOeW5vAo_ajwm2b0Qkrv83PEBXMBeyi-QZYReBKgYee0sUWW73qA1VvhdTYSy5Qkp8VMwa6D7e7LhPuA/s320/100_5655.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5719852314402487746" /></a>
I was pretty active in my free time making progress on support for Microsoft Azure from June of 2011 through November, but suddenly found myself without any time to spend working on MediaWiki at all! So, what was I doing instead? I've been spending all my time with my FIRST Robotics Competition team, from South San High School, FRC Team 457. Weekend before last, we competed in the Alamo Regional Championship, won in the finals, and qualified for the national competition. I have a bit of a respite now, but will be joining my team at the nationals in a bit over a month. Here I am with Chango, our team's robot, in the pits at the Alamo Regional.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-14960709998753604692011-10-31T21:24:00.005-06:002011-10-31T21:44:31.017-06:00Struggling with an inexplicable issue<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiA-yLYAdc1ZUrhZDM31i2rCbMtvniFfgmecgYtGPBxjFHE7J6ciwvNcNbyv_YQTZLnhJxI-TPXD0soWsB_hslclZ7tgOfHtRuI7LJIjCE3XjV0SfqVM_0-GQDc8OdjMewoXRuAhA/s1600/BadRendering.png"><img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 205px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiA-yLYAdc1ZUrhZDM31i2rCbMtvniFfgmecgYtGPBxjFHE7J6ciwvNcNbyv_YQTZLnhJxI-TPXD0soWsB_hslclZ7tgOfHtRuI7LJIjCE3XjV0SfqVM_0-GQDc8OdjMewoXRuAhA/s320/BadRendering.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5669864321127914674" /></a>
I was seeing some odd behavior in 1.19 alpha, so I decided to focus my attention on 1.18 instead, thereby increasing my confidence that my work would be against a known-good version of MediaWiki. Like 1.19 alpha, 1.18 is mostly working -- but here's a head scratcher. I'm getting strange rendering of some characters, such as the hyphen (-). Here's a page as seen in MediaWiki 1.18 on Windows 7, SQL Server 2008 R2, and IIS 7.0. Notice the messed up "Model-View-Controller" in the table of contents. The page was copied over from the English Wikipedia. Here's what's very strange. The SQL database has hyphens in it, nothing remarkable. When I select the "Edit" tab, I can see the hyphens in the text box, so I press preview. <a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2Iil8EMYFQr8J5ncHZTAoY2_sK1CJEcKJoFviytLUHJc39gUEOjzhv0YoRmtqeFlWtkwxC_lVZagjqZPaOC5_hNw13QQA7Vlvif-R-20RKidRsOAcA2EQlT1ZHUPAd7OmPg0P9w/s1600/GoodRendering.png"><img style="float:right; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 219px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2Iil8EMYFQr8J5ncHZTAoY2_sK1CJEcKJoFviytLUHJc39gUEOjzhv0YoRmtqeFlWtkwxC_lVZagjqZPaOC5_hNw13QQA7Vlvif-R-20RKidRsOAcA2EQlT1ZHUPAd7OmPg0P9w/s320/GoodRendering.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5669866093997689202" /></a>The result looks fine! The table of contents now shows "Model-View-Controller". Why is the rendering different between the preview and the rendered page? Except for the database access, the code is pretty much identical between the official 1.18 code and my variant.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com1tag:blogger.com,1999:blog-36920053.post-3926644037650320202011-10-23T19:18:00.002-05:002011-10-23T19:25:17.618-05:001.19 alpha working on WindowsMediaWiki 1.19 alpha is up on Windows, with testing and perhaps modifications to come. 
For now it needs the previous version of the installer rather than the current one (as of the NOLA Hackathon). Max, who works on the installer, is aware of the problem, so I expect a newer version of the installer will work as well. Also, compression of values stored to the objectstore is disabled for this version, since I'm having problems when I compress and decompress those values.
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWQHZR5VoQ_4H_T2x450XZeVxI-QRHc0qEWMuconYZjirOFPCrzS_KTep_YHJoK0le9lU1DKAJtqwywxd7p8nNF5s0zPa8dN94MYieM36oDUmw1H43Z1oDQDzlAU38OjuSWw0h3w/s1600/MWTrunk.png"><img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 320px; height: 173px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWQHZR5VoQ_4H_T2x450XZeVxI-QRHc0qEWMuconYZjirOFPCrzS_KTep_YHJoK0le9lU1DKAJtqwywxd7p8nNF5s0zPa8dN94MYieM36oDUmw1H43Z1oDQDzlAU38OjuSWw0h3w/s320/MWTrunk.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5666846908502892882" /></a>DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com1tag:blogger.com,1999:blog-36920053.post-70129745449399029782011-06-26T01:03:00.002-05:002011-06-26T01:08:34.017-05:00SQL Server JumpIn OutcomeI had a great time at the Microsoft-hosted SQL Server JumpIn! camp. I got MediaWiki 1.16.5 to run on IIS 7 on Windows 7 with SQL Server 2008 R2, with SQL Server Denali, and with SQL Server Azure. I also got the engine mostly working on Windows Azure, deployed to the cloud, but need to flesh out a few missing pieces. First, sessions will have to be stored to the database, rather than to the file system. Second, the file-based media storage mechanism will have to be adapted to use blob storage. Also, I'll have to figure out how to get Java support so I can continue to use Apache Batik, and how I'll get that to work with cloud storage.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com2tag:blogger.com,1999:blog-36920053.post-27020610086574828922011-06-17T17:25:00.002-05:002011-06-17T17:32:07.956-05:00Like a Phoenix from the AshesThis work is coming back to life. Next week, I'll be attending the Microsoft SQL Server JumpIn! camp, along with sixteen other PHP developers. 
I'll be devoting special attention to attempting to use the new Microsoft PDO for the SQL Server Direct driver. Mostly, it should be the same as it was when using the ADODB library, except I'll have a new Database class. Since I haven't kept things up to date, I'll have to keep my fingers crossed that I can get things up and running with MediaWiki 1.16.5 or 1.17. I've been pleased to discover, however, that while I was not looking the SQL syntax in tables.sql for MySQL has evolved to resemble what I had been doing for SQL Server much more closely than it used to. Also, I'll be targeting SQL Server 2008 R2 rather than SQL Server 2000 or SQL Server 2005.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-90120443976228746482008-10-17T21:31:00.002-05:002008-10-17T21:55:18.530-05:00An issue of timingFor the Microsoft SQL Server version of the MediaWiki engine (at version 1.13 or so), everything was working except the Special:WantedPages and Special:AllPages. With the new design of Special:AllPages, I can't make heads or tails out of the SQL being generated, so I just reverted to the old behavior and it's fine. Special:WantedPages was different. The SQL was fine, but as the size of the wiki has grown (closing in on 200,000 articles), the database server was timing out before returning any results to the PHP. This problem was relatively easy to fix, using a not-widely-known property of the ADO Connection object, Connection::CommandTimeout. The default for Connection::CommandTimeout is 30 seconds, and no results were being returned for the (expensive) query until about 35 seconds or so. Once a row comes back from the server, ADO doesn't care how much time it takes to complete the query, but it's pretty sensitive to how long it takes for results to START coming back. 
I went into the ADOdb connector code and, in the file <span style="font-weight:bold;">adodb-ado5.inc.php</span>, added a single line right before the return from the method <span style="font-weight:bold;">ADODB_ado::_connect</span>:<pre> $dbc->CommandTimeout = 90;</pre>Although this fixed the timeout problem, it brought out a latent problem. Some of the Special pages that return true from <tt>QueryPage::isExpensive</tt> -- including this one -- were trying to cache the results in a temporary table. This clashed with ADOdb's handling of OFFSET and LIMIT clauses in the SQL, so I have disabled the caching of results from "expensive" Special pages by not performing any activity in the overridden <tt>QueryPage::preProcessResults</tt> methods for these pages (Special:ShortPages, Special:LongPages, and Special:WantedPages in particular).
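SQL Server versions of that era offer TOP but no LIMIT/OFFSET clause, so any OFFSET has to be emulated. The following is a minimal sketch of one way to do that, not what ADOdb or DatabaseADODB.php actually does: ask the server for the first offset + limit rows, then discard the first offset rows in PHP. The function names <tt>addTop</tt> and <tt>sliceRows</tt> are hypothetical, and the code assumes a plain statement that starts with SELECT (a SELECT DISTINCT would need different handling).

```php
<?php
// Hypothetical helpers, not part of ADOdb or MediaWiki.

/**
 * Rewrite a plain "SELECT ..." statement so that it returns at most
 * $offset + $limit rows, using the TOP keyword.
 */
function addTop(string $sql, int $limit, int $offset = 0): string {
    $rows = $limit + $offset;
    // Insert "TOP n" right after the leading SELECT (case-insensitive).
    return preg_replace('/^\s*SELECT\b/i', "SELECT TOP $rows", $sql, 1);
}

/**
 * Drop the first $offset rows that the server had to send anyway.
 */
function sliceRows(array $rows, int $limit, int $offset = 0): array {
    return array_slice($rows, $offset, $limit);
}

$sql = addTop('SELECT page_title FROM page ORDER BY page_len', 50, 100);
// $sql is now "SELECT TOP 150 page_title FROM page ORDER BY page_len"
```

The cost of this approach grows with the offset, since the skipped rows still travel from the server to PHP.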
Down the road, perhaps I'll think of a slightly more clever way to handle SQL OFFSET and LIMIT clauses and I can re-enable the preProcessResults methods.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-3618091041748780712008-08-06T18:48:00.002-05:002008-08-06T18:55:17.128-05:00Another recommended change for ADODBI'm trying to minimize the number of changes needed to the wiki code, which is still fairly indecisive about retrieving data from the database via column names or column numbers. To make things as simple as possible, I set the fetch mode to ADODB_FETCH_BOTH rather than toggling back and forth between ADODB_FETCH_ASSOC and ADODB_FETCH_NUM, as I had been doing in the past. To make this work right, you'll need to make a change to the file adodb-ado5.inc.php. Change line 654 from
<pre>$this->fields = $this->GetRowAssoc(ADODB_ASSOC_CASE);</pre> to read <pre>$this->fields = array_merge($this->fields, $this->GetRowAssoc(ADODB_ASSOC_CASE));</pre> Having done this, you'll have an array with both numeric keys and string keys.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com1tag:blogger.com,1999:blog-36920053.post-68825515781415368992008-07-30T17:33:00.002-05:002008-07-30T18:46:38.263-05:00MediaWiki 1.13 and ADOdb 5.05As of today, MediaWiki at approximately the 1.13 level is working while using the new ADOdb library that was released on 11 July 2008. I say approximately 1.13 because I actually work with the MediaWiki edge code from SubVersion, which is now identified as 1.14alpha. Far fewer changes are required these days to keep everything humming along. Thanks to MediaWiki guru "Simetrical" for incorporating some of my recommended changes, since I still don't have commit access to the shared repository.
Only minor changes were required to the ADOdb library, much like the ones I'd made to the 4.94 release that I had been using.
First, after downloading and unzipping adodb5.05, find the file ./adodb.inc.php and change line 954 from
<span style="color: rgb(153, 0, 0);" ><pre>$array_2d = is_array($element0) && !is_object(reset($element0));</pre></span>
to read just<span style="color: rgb(0, 153, 0);"><pre>$array_2d = is_array($element0);</pre></span>.
If you don't make that change, you'll see error messages whenever you try to save a new article in your wiki. The last part of that statement apparently has to do with something unique to oci8 (Oracle?) descriptors, but wreaks havoc when used in conjunction with SQL Server.
Second, also in ./adodb.inc.php, you'll need to modify the function FetchObject. MediaWiki code always expects objects returned from the database to have attributes named with all-lowercase letters. To make that happen, change line 3571 from
<span style="color: rgb(102, 0, 0);"><pre>function FetchObject($isupper=true)</pre></span>
to
<span style="color: rgb(0, 102, 0);"><pre>function FetchObject($isupper=false)</pre>
</span>and change line 3588 from <span style="color: rgb(102, 0, 0);" ><pre>else $n = $name;</pre></span>
to
<span style="color: rgb(0, 102, 0);"><pre>else $n = strtolower($name);</pre></span>.
Without that change, you'll see lots of "member not found" error messages.
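To show why the case of the attribute names matters, here is a small stand-alone sketch. It does not reproduce ADOdb's FetchObject; <tt>rowToObject</tt> is a hypothetical helper and the column values are made up. It only demonstrates that PHP property lookups are case-sensitive, so a column returned as PAGE_TITLE has to be lowercased before code can read it as page_title.

```php
<?php
// Hypothetical helper: turn a result row whose column names may be in any
// case into an object with all-lowercase property names.
function rowToObject(array $row): stdClass {
    $obj = new stdClass();
    foreach ($row as $name => $value) {
        $obj->{strtolower((string)$name)} = $value;
    }
    return $obj;
}

$row = rowToObject(['PAGE_ID' => 87, 'Page_Title' => 'Main_Page']);
echo $row->page_title, "\n"; // prints "Main_Page"
```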
Third, in ./drivers/adodb-ado_mssql.inc.php, you'll have to change line 49 from
<span style="color: rgb(102, 0, 0);" ><pre>return $this->GetOne('select SCOPE_IDENTITY()');</pre></span>
to
<span style="color: rgb(0, 102, 0);" ><pre>return $this->GetOne('select @@IDENTITY');</pre></span>
or you'll get NULL for the last ID used after inserts.
Finally, to get the various pagers to work, you'll have to modify ./drivers/adodb-ado5.inc.php around line 423. Change the code from
<span style="color: rgb(102, 0, 0);"><pre>
if ($this->_currentRow > $row) return false;
@$rs->Move((integer)$row - $this->_currentRow-1); //adBookmarkFirst
</pre></span>
to
<span style="color: rgb(0, 102, 0);"><pre>
$fReturn = $rs->Supports( 0x200 /*adMovePrevious*/ );
if (! $fReturn )
return false;
if ( $row == 0 ) {
@$rs->MoveFirst();
} else {
@$rs->Move((integer)$row - $this->_currentRow-1); //adBookmarkFirst
}
$this->_currentRow = $row;
</pre></span>Without that change, most of the special pages won't work. Note that SQL Server 2000 and SQL Server 2005 will both respond that they support adMovePrevious.
That's it; the rest of the changes can be found attached to the MediaWiki Bugzilla report 9767, or may become part of the MediaWiki code base. The main attachments of interest are the SQL code and the DatabaseADODB.php file. These are fairly up-to-date as I write this, although I've evolved my schema somewhat since uploading the SQL code. It generally falls a little behind because I make the schema changes through Microsoft SQL Server Management Studio and then wind up reverse-engineering those changes into the SQL file.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-33713294145259835592008-05-13T19:40:00.003-05:002008-12-09T02:33:18.398-06:00Getting ready to go into production<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoOKimI9gVslU4OCQHYe8zfgrc649erMd6_AsdH4J0HBSdHmT-mrZax9pqDzZttZvCgSPZz1gl_6aDrUzK5S3JQnsiMUwyIpHUrAzniao2Y56dPGbkxncoaUOxoj18Mq9Fn29UrQ/s1600-h/JIOWiki.gif"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoOKimI9gVslU4OCQHYe8zfgrc649erMd6_AsdH4J0HBSdHmT-mrZax9pqDzZttZvCgSPZz1gl_6aDrUzK5S3JQnsiMUwyIpHUrAzniao2Y56dPGbkxncoaUOxoj18Mq9Fn29UrQ/s320/JIOWiki.gif" alt="" id="BLOGGER_PHOTO_ID_5200028705993528882" border="0" /></a>Things have come a long way since my last post. The wiki has been running continuously on a test box for a couple of months with no major glitches. It is now based on version 1.12, but is running on SQL Server 2005 instead of SQL Server 2000. Everything works except for images that have funny characters in the names. Even those sort of work (the thumbnails get generated and stored in the appropriate place), but IIS 6 seems unable to serve the files from there. The code is stable enough that the boss has decided to put the thing into production. 
The screenshot still shows a few parts of the wiki that are in development (the collaboration box shows some debugging messages from the presence / chat server). The screw behind the logo marks this as the development version; it won't appear in the production version.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com1tag:blogger.com,1999:blog-36920053.post-66241764981813469282007-06-08T17:45:00.000-05:002007-06-08T17:48:49.787-05:00Hitting the transaction limitIt looks like it will be necessary to occasionally tear down and re-establish the connection to the database in the <code>importDump.php</code> utility. Running it for too long results in the following:<code><pre>C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance>"C:\Program Files\Zend\Core\bin
\php.exe" ImportDump.php s:\enwiki-20070402-pages-articles.xml
100 (50.09 pages/sec 50.09 revs/sec)
200 (51.75 pages/sec 51.75 revs/sec)
300 (50.43 pages/sec 50.43 revs/sec)
400 (50.78 pages/sec 50.78 revs/sec)
500 (51.62 pages/sec 51.62 revs/sec)
600 (52.31 pages/sec 52.31 revs/sec)
700 (51.62 pages/sec 51.62 revs/sec)
800 (51.35 pages/sec 51.35 revs/sec)
900 (51.24 pages/sec 51.24 revs/sec)
1000 (51.57 pages/sec 51.57 revs/sec)
1100 (51.85 pages/sec 51.85 revs/sec)
1200 (52.11 pages/sec 52.11 revs/sec)
1300 (51.62 pages/sec 51.62 revs/sec)
1400 (51.11 pages/sec 51.11 revs/sec)
1500 (50.16 pages/sec 50.16 revs/sec)
1600 (50.30 pages/sec 50.30 revs/sec)
1700 (50.09 pages/sec 50.09 revs/sec)
1800 (50.31 pages/sec 50.31 revs/sec)
1900 (50.39 pages/sec 50.39 revs/sec)
2000 (50.37 pages/sec 50.37 revs/sec)
2100 (50.34 pages/sec 50.34 revs/sec)
2200 (50.30 pages/sec 50.30 revs/sec)
2300 (50.41 pages/sec 50.41 revs/sec)
2400 (50.51 pages/sec 50.51 revs/sec)
2500 (49.89 pages/sec 49.89 revs/sec)
2600 (50.17 pages/sec 50.17 revs/sec)
2700 (50.34 pages/sec 50.34 revs/sec)
2800 (50.08 pages/sec 50.08 revs/sec)
2900 (50.18 pages/sec 50.18 revs/sec)
3000 (49.94 pages/sec 49.94 revs/sec)
3100 (48.71 pages/sec 48.71 revs/sec)
3200 (48.58 pages/sec 48.58 revs/sec)
3300 (48.43 pages/sec 48.43 revs/sec)
3400 (48.49 pages/sec 48.49 revs/sec)
3500 (48.51 pages/sec 48.51 revs/sec)
3600 (48.54 pages/sec 48.54 revs/sec)
3700 (48.52 pages/sec 48.52 revs/sec)
3800 (25.39 pages/sec 25.39 revs/sec)
3900 (15.09 pages/sec 15.09 revs/sec)
4000 (10.41 pages/sec 10.41 revs/sec)
4100 (8.62 pages/sec 8.62 revs/sec)
4200 (6.98 pages/sec 6.98 revs/sec)
4300 (4.98 pages/sec 4.98 revs/sec)
4400 (3.79 pages/sec 3.79 revs/sec)
4500 (3.20 pages/sec 3.20 revs/sec)
4600 (2.66 pages/sec 2.66 revs/sec)
4700 (2.45 pages/sec 2.45 revs/sec)
4800 (2.25 pages/sec 2.25 revs/sec)
4900 (2.13 pages/sec 2.13 revs/sec)
5000 (1.99 pages/sec 1.99 revs/sec)
5100 (1.83 pages/sec 1.83 revs/sec)
5200 (1.70 pages/sec 1.70 revs/sec)
5300 (1.61 pages/sec 1.61 revs/sec)
5400 (1.56 pages/sec 1.56 revs/sec)
5500 (1.49 pages/sec 1.49 revs/sec)
5600 (1.43 pages/sec 1.43 revs/sec)
5700 (1.37 pages/sec 1.37 revs/sec)
5800 (1.34 pages/sec 1.34 revs/sec)
5900 (1.30 pages/sec 1.30 revs/sec)
6000 (1.28 pages/sec 1.28 revs/sec)
6100 (1.23 pages/sec 1.23 revs/sec)
6200 (1.19 pages/sec 1.19 revs/sec)
6300 (1.13 pages/sec 1.13 revs/sec)
6400 (1.09 pages/sec 1.09 revs/sec)
6500 (1.04 pages/sec 1.04 revs/sec)
6600 (1.01 pages/sec 1.01 revs/sec)
6700 (0.98 pages/sec 0.98 revs/sec)
6800 (0.96 pages/sec 0.96 revs/sec)
6900 (0.95 pages/sec 0.95 revs/sec)
7000 (0.94 pages/sec 0.94 revs/sec)
7100 (0.91 pages/sec 0.91 revs/sec)
7200 (0.89 pages/sec 0.89 revs/sec)
7300 (0.87 pages/sec 0.87 revs/sec)
7400 (0.86 pages/sec 0.86 revs/sec)
7500 (0.86 pages/sec 0.86 revs/sec)
7600 (0.87 pages/sec 0.87 revs/sec)
7700 (0.87 pages/sec 0.87 revs/sec)
7800 (0.88 pages/sec 0.88 revs/sec)
7900 (0.86 pages/sec 0.86 revs/sec)
8000 (0.86 pages/sec 0.86 revs/sec)
exception 'com_exception' with message 'Source: Microsoft OLE DB Provider for SQL Server
Description: Cannot start more transactions on this session.' in C:\PHP\includes\adodb\drivers\adodb-ado5.inc.php:290
Stack trace:
#0 C:\PHP\includes\adodb\drivers\adodb-ado5.inc.php(290): com->BeginTrans()
#1 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\DatabaseADODB.php(1135): ADODB_ado->BeginTrans()
#2 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\JobQueue.php(170): DatabaseADODB->begin()
#3 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\HTMLCacheUpdate.php(81): Job::batchInsert(Array)
#4 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\HTMLCacheUpdate.php(46): HTMLCacheUpdate->insertJobs(Object(ResultWrapper))
#5 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\Title.php(2424): HTMLCacheUpdate->doUpdate()
#6 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\Article.php(2644): Title->touchLinks()
#7 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\SpecialImport.php(363): Article::onArticleCreate(Object(Title))
#8 [internal function]: WikiRevision->importOldRevision()
#9 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\DatabaseADODB.php(1048): call_user_func_array(Array, Array)
#10 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\SpecialImport.php(510): DatabaseADODB->deadlockLoop(Array)
#11 [internal function]: WikiImporter->importRevision(Object(WikiRevision))
#12 C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance\importDump.php(58): call_user_func(Array, Object(WikiRevision))
#13 [internal function]: BackupReader->handleRevision(Object(WikiRevision), Object(WikiImporter))
#14 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\SpecialImport.php(761): call_user_func_array(Array, Array)
#15 [internal function]: WikiImporter->out_revision(Resource id #55, 'revision')
#16 C:\Inetpub\wwwroot\wiki193\mediawiki\includes\SpecialImport.php(426): xml_parse(Resource id #55, ']] and [[Botswa...', 0)
#17 C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance\importDump.php(109): WikiImporter->doImport()
#18 C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance\importDump.php(91): BackupReader->importFromHandle(Resource id #54)
#19 C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance\importDump.php(129): BackupReader->importFromFile('s:\bauch\enwiki...')
#20 {main}
C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance></pre></code>DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com3tag:blogger.com,1999:blog-36920053.post-20305377794613436152007-06-07T20:04:00.000-05:002007-06-07T23:09:24.852-05:00Testing (and fixing) the Maintenance UtilitiesWith the wiki up and stable for a couple of weeks now, it was time to try to push some limits. To do this, I downloaded a dump of the English Wikipedia. The command-line tool <code>ImportDump.php</code> works for a while (albeit slower than I would have liked). Here's what prints to the console:
<code><pre>
C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance>"C:\Program Files\Zend\Core\bin
\php.exe" ImportDump.php s:\enwiki-20070402-pages-articles.xml
100 (3.27 pages/sec 3.27 revs/sec)
200 (1.90 pages/sec 1.90 revs/sec)
300 (1.40 pages/sec 1.40 revs/sec)
400 (1.13 pages/sec 1.13 revs/sec)
500 (1.04 pages/sec 1.04 revs/sec)
600 (1.01 pages/sec 1.01 revs/sec)
700 (0.79 pages/sec 0.79 revs/sec)
800 (0.74 pages/sec 0.74 revs/sec)
900 (0.68 pages/sec 0.68 revs/sec)
1000 (0.66 pages/sec 0.66 revs/sec)
1100 (0.63 pages/sec 0.63 revs/sec)
1200 (0.62 pages/sec 0.62 revs/sec)
1300 (0.59 pages/sec 0.59 revs/sec)
1400 (0.59 pages/sec 0.59 revs/sec)
A database query syntax error has occurred.
The last attempted database query was:
"COMMIT"
from within function "Database::deadlockLoop".
MySQL returned error "3902: e (jmolafbwikidev)"</pre>
</code>This turns out to require some fixes in my <code>DatabaseADODB.php</code>, in particular in the <code>deadlockLoop()</code> method. It also looks like my indexes on the <code>page</code> table need a little tweaking.
After that, <code>rebuildrecentchanges.php</code> doesn't work at all:
<code><pre>
C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance>"C:\Program Files\Zend\Core\bin
\php.exe" rebuildrecentchanges.php
PHP Notice: Undefined variable: wgDBadminuser in C:\Inetpub\wwwroot\wiki193\med
iawiki\maintenance\rebuildrecentchanges.php on line 15
PHP Notice: Undefined variable: wgDBadminpassword in C:\Inetpub\wwwroot\wiki193
\mediawiki\maintenance\rebuildrecentchanges.php on line 16
Loading from page and revision tables...
A database query syntax error has occurred.
The last attempted database query was:
"INSERT INTO [recentchanges] (rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_
namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_
last_oldid,rc_type) SELECT TOP 5000 rev_timestamp,rev_timestamp,rev_user,rev_us
er_text,page_namespace,page_title,rev_comment,rev_minor_edit,0,page_is_new,page_
id,rev_id,0, IF(page_is_new != 0, 1, 0) FROM [page],[revision] WHERE (rev_tim
estamp > '20070531214411') AND (rev_page=page_id) ORDER BY rev_timestamp DES
C"
from within function "rebuildRecentChangesTablePass1".
MySQL returned error "170: n (jmolafbwikidev)"</pre></code>
The fix for this is to override the <code>Database</code> class's <code>conditional</code> method in the <code>DatabaseADODB</code> class, where it is implemented as:<code><pre>
function conditional( $cond, $trueVal, $falseVal ) {
return " CASE WHEN $cond THEN $trueVal ELSE $falseVal END ";
}</pre></code>
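As a quick check of what the method above produces, the sketch below repeats its body as a free function (only so the snippet runs on its own) and feeds it the condition from the failing query. The MySQL-style <code>IF(page_is_new != 0, 1, 0)</code> becomes a CASE expression, which SQL Server accepts.

```php
<?php
// Same body as the conditional() method above, written as a free function
// so that this snippet is self-contained.
function conditional($cond, $trueVal, $falseVal) {
    return " CASE WHEN $cond THEN $trueVal ELSE $falseVal END ";
}

// MySQL-only form that appeared in the failing INSERT ... SELECT:
//   IF(page_is_new != 0, 1, 0)
// Form produced by conditional():
echo conditional('page_is_new != 0', 1, 0), "\n";
// prints " CASE WHEN page_is_new != 0 THEN 1 ELSE 0 END "
```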
Running again reveals another problem:<code><pre>C:\Inetpub\wwwroot\wiki193\mediawiki\maintenance>"C:\Program Files\Zend\Core\bin
\php.exe" rebuildrecentchanges.php
PHP Notice: Undefined variable: wgDBadminuser in C:\Inetpub\wwwroot\wiki193\med
iawiki\maintenance\rebuildrecentchanges.php on line 15
PHP Notice: Undefined variable: wgDBadminpassword in C:\Inetpub\wwwroot\wiki193
\mediawiki\maintenance\rebuildrecentchanges.php on line 16
Loading from page and revision tables...
Updating links...
A database query syntax error has occurred.
The last attempted database query was:
"SELECT rev_id FROM [revision] WHERE rev_page=87 AND rev_timestamp&lt;'200706052
02808' ORDER BY rev_timestamp DESC LIMIT 1"
from within function "".
MySQL returned error "170: n (jmolafbwikidev)"
</pre></code>
It looks like we need to add some alternative SQL to <code>rebuildRecentChangesTablePass2()</code> in <code>rebuildrecentchanges.inc</code> as follows:<code><pre>
if ( $wgDBtype == 'adodb' ) {
$sql2 = "SELECT TOP 1 rev_id FROM $revision " .
"WHERE rev_page={$lastCurId} ".
"AND rev_timestamp < '{$emit}' ORDER BY rev_timestamp DESC";
} else {
$sql2 = "SELECT rev_id FROM $revision " .
"WHERE rev_page={$lastCurId} ".
"AND rev_timestamp<'{$emit}' ORDER BY rev_timestamp DESC LIMIT 1";
}</pre></code>
And that fixes it.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-71537083418983925202007-06-05T20:53:00.000-05:002007-06-06T13:58:18.797-05:00Changes requiredIn addition to the obvious (addition of the code to support Microsoft SQL Server), some other files need to change, and a few other files need to be added to the system.
<br />
<strong>includes/Autoloader.php</strong> (change)<br />
This file needs just a couple of additions, as follows:
<code>
<pre>
static $localClasses = array(
# Includes
'DatabaseADODB' => 'includes/DatabaseADODB.php',
...
'SearchADODB' => 'includes/SearchADODB.php',
...
</pre>
</code>
<strong>includes/BagOStuff.php</strong> (change)<br />
In <code>SqlBagOStuff::get($key)</code>, the SQL needs a little fix, from
<code>
<pre>"SELECT value,exptime FROM $0 WHERE keyname='$1'", $key);</pre>
</code>
to
<code><pre>"SELECT value,exptime FROM $0 WHERE keyname=$1", $key);</pre>
</code>
Microsoft SQL Server is thrown off by what become extraneous single quotes. A similar change must be made to <code>SqlBagOStuff::delete($key,$time=0)</code>.
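The sketch below illustrates how the single quotes can become extraneous. It rests on an assumption the post does not spell out: that the database layer's string encoder already wraps the value in quotes, in the style of ADOdb's <code>qstr()</code>. The helpers <code>quoteValue</code> and <code>fillQuery</code> are hypothetical simplifications, not the real <code>SqlBagOStuff</code> code, and the table name and key are example values.

```php
<?php
// Hypothetical stand-in for an encoder that escapes AND quotes a value.
function quoteValue(string $s): string {
    return "'" . str_replace("'", "''", $s) . "'";
}

// Simplified placeholder substitution: $0 is the table, $1 the key.
function fillQuery(string $sql, string $table, string $key): string {
    return str_replace(['$0', '$1'], [$table, quoteValue($key)], $sql);
}

$key = 'wikidb:messages';

// Template with its own quotes: the value ends up quoted twice.
echo fillQuery("SELECT value,exptime FROM \$0 WHERE keyname='\$1'", 'objectcache', $key), "\n";
// prints: SELECT value,exptime FROM objectcache WHERE keyname=''wikidb:messages''

// Template without quotes: the encoder supplies the only pair.
echo fillQuery("SELECT value,exptime FROM \$0 WHERE keyname=\$1", 'objectcache', $key), "\n";
// prints: SELECT value,exptime FROM objectcache WHERE keyname='wikidb:messages'
```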
In the methods <code>_serialize(&$data)</code> and <code>_unserialize($serial)</code>, I've had to remove the calls to <code>gzdeflate</code> and <code>gzinflate</code>, since I couldn't get them to behave consistently.<br />
<strong>includes/Database.php</strong><br />
To this base class, I've added two new methods with empty bodies:
<code><pre>
public function setFetchModeAssoc() {
}
public function setFetchModeNum() {
}</pre>
</code><br />
These are used very little, and only by a couple of the <code>Special:</code> pages. By default, the ADODB connection operates in numeric fetch mode (i.e., it returns records in an array subscripted by numbers). After a call to <code>setFetchModeAssoc</code> the connection operates in associative fetch mode (i.e., it returns records in an array subscripted by field names). This is different from the MySQL connection, which returns arrays that can be indexed both ways.<br />
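The difference between the fetch modes is easiest to see with plain arrays. This is only an illustration with made-up values, not output from ADOdb: the same row is shown as numeric mode and associative mode would present it, and <code>array_merge</code> then builds a row that can be indexed both ways, like the arrays the MySQL connection returns.

```php
<?php
// One row as each fetch mode would present it (values are made up).
$numeric = [87, 'Main_Page'];                              // numeric mode
$assoc   = ['page_id' => 87, 'page_title' => 'Main_Page']; // associative mode

// A row usable both ways: numeric keys first, then the column names.
$both = array_merge($numeric, $assoc);

echo $both[1], ' ', $both['page_title'], "\n"; // prints "Main_Page Main_Page"
```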
<strong>includes/GlobalFunctions.php</strong>
In <code>function wfShellExec($cmd, &$retval=null)</code>, I changed the line
<code><pre>
$cmd = '"' . $cmd . '"'; </pre>
</code>
to
<code><pre>
$cmd = 'cmd /C ' . '"' . str_replace( '/', '\\', $cmd) . '"';</pre>
</code><br />
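To see what the changed line produces, here it is wrapped in a hypothetical stand-alone function, <code>windowsCommand</code>; the ImageMagick path is an example only. Note that every forward slash in the command is replaced, including any that belong to arguments rather than to the executable's path.

```php
<?php
// Hypothetical free-standing version of the one-line change above.
function windowsCommand(string $cmd): string {
    // '\\' inside single quotes is a single backslash.
    return 'cmd /C ' . '"' . str_replace('/', '\\', $cmd) . '"';
}

echo windowsCommand('C:/ImageMagick/convert.exe in.png out.png'), "\n";
// prints: cmd /C "C:\ImageMagick\convert.exe in.png out.png"
```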
In <strong>includes/MagicWord.php</strong>, I changed a line in <code>MagicWord::initRegex()</code> from
<code><pre>
$case = $this->mCaseSensitive ? '' : 'iu';</pre>
</code>
to
<code><pre>
$case = $this->mCaseSensitive ? '' : 'i';</pre>
</code>
I also had to rewrite <code>matchAndRemove(&$text)</code> and <code>matchStartAndRemove(&$text)</code>.<br />
In <strong>includes/Pager.php</strong>, method
<code>IndexPager::reallyDoQuery($offset,$limit,$ascending)</code> the line
<code><pre>
$res = $this->mDb->select( $tables, $fields, $conds, $fname, $options );</pre>
</code> must be wrapped up like this:<code>
<pre>
$this->mDb->setFetchModeAssoc();
$res = $this->mDb->select( $tables, $fields, $conds, $fname, $options );
$this->mDb->setFetchModeNum();
</pre>
</code>DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-81799812245144523712007-05-25T17:24:00.001-05:002008-12-09T02:33:18.958-06:00What the wiki looks like<div><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN4ZzX4pBxWR0Q4rTYwPSp7sYIoDtZGS5CkUAqnf8B66bLh2H2VbcVLVWrTv3tMC0-4Gc9t5rGO0L2Xfgy1lfKapOArkJaeO-wO_V7qYvWrxSQKrcffUqa_nRNrm8OpLFQZfEKOQ/s1600-h/version.png"><img id="BLOGGER_PHOTO_ID_5068628653331671122" style="FLOAT: right; MARGIN: 0px 0px 10px 10px; CURSOR: hand" alt="" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN4ZzX4pBxWR0Q4rTYwPSp7sYIoDtZGS5CkUAqnf8B66bLh2H2VbcVLVWrTv3tMC0-4Gc9t5rGO0L2Xfgy1lfKapOArkJaeO-wO_V7qYvWrxSQKrcffUqa_nRNrm8OpLFQZfEKOQ/s320/version.png" border="0" /></a>
<div>The only page that looks much different from Wikipedia or any other MediaWiki-based wiki is the Special:Version page. I've included a screen-grab of that so you can see how my version is different from the mainstream MediaWiki engine. In particular, notice IIS instead of Apache, the ADOdb library and SQL Server instead of the MySQL library and MySQL, and a slightly different version of PHP (this is the version from Zend Core)</div></div>DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-40887197179126456882007-05-17T18:24:00.000-05:002007-06-07T17:58:29.886-05:00More speed testsHaving been pleased with the effects of the Zend Optimizer, I began to wonder what the effects would be of running the entire MediaWiki software on the commercial Zend Core engine.
Here are some samples (all times in seconds):
<table border="1"><tr>
<th>Page</th><th>PHP, no optimization (s)</th><th>PHP w/ Optimizer (s)</th><th>Zend Core (s)</th><th>Wikipedia (s)</th></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/2003_invasion_Of_Baghdad">2003 invasion of Baghdad</a></td><td>3.32</td><td>3.20</td><td>0.69</td><td>0.34</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/2003_Invasion_of_Iraq">2003 Invasion of Iraq</a></td><td>20.76</td><td>5.45</td><td>6.66</td><td>4.09</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Air_Force_Special_Operations_Command">Air Force Special Operations Command</a></td><td>1.16</td><td>0.63</td><td>0.48</td><td>0.21</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/World_Geodetic_System">World Geodetic System</a></td><td>1.32</td><td>1.75</td><td>1.30</td><td>0.36</td>
</tr>
<tr><td><a href="http://en.wikipedia.org/wiki/United_States_Department_of_Justice">United States Department of Justice</a></td><td>2.03</td><td>0.94</td><td>0.77</td><td>0.54</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/United_States_Air_Force">United States Air Force</a></td><td>6.47</td><td>7.24</td><td>2.17</td><td>1.17</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/The_Sunshine_Boys">The Sunshine Boys</a></td><td>1.93</td><td>0.82</td><td>0.62</td><td>0.37</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Ruby_on_Rails">Ruby on Rails</a></td><td>1.751</td><td>5.88</td><td>0.65</td><td>0.25</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/PowerPoint">PowerPoint</a></td><td>3.99</td><td>2.07</td><td>1.24</td><td>1.68</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Lee_Meredith">Lee Meredith</a></td><td>0.85</td><td>0.53</td><td>0.37</td><td>0.18</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Arnold_Schwarzenegger">Arnold Schwarzenegger</a></td><td>15</td><td>7</td><td>3.9</td><td>14.03</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Philippines">Philippines</a></td><td>25</td><td>5</td><td>5.6</td><td>3.51</td></tr>
</table>
Note that these are the times reported by PHP (as can be seen with the view source command on the resulting web page). Also note that the times vary significantly depending on what the server may be doing at the time. The times reported for my server are for uncached results, whereas the times reported for Wikipedia are presumably usually for cached results -- except for Arnold, it looks like I found him out of the cache this time around!DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-34652790444214224812007-05-11T19:09:00.000-05:002007-06-06T13:59:52.999-05:00Version 1.10 on SQL Server<a href="http://www.mediawiki.org/wiki/Download">MediaWiki 1.10</a> is out. That's my cue to start trying to get my changes integrated into the shared source code. I've started on that with a <a href="http://bugzilla.wikimedia.org/show_bug.cgi?id=9767">Bugzilla</a>, Bug # 9767
I pretty much have 1.10 all working, except the thumbnail functionality stopped working. In my configuration, <a href="http://www.imagemagick.org">ImageMagick</a> 6.3.2-5 is used to generate the thumbnails. Now, instead of thumbnails, I get "Error creating thumbnail: The input line is too long". The first part of this message, up to the colon, is generated by the PHP code. The second part, however, appears to come from ImageMagick itself. In my modified MediaWiki version 1.9.3, this had been working -- so I need to figure out what's changed.
(... a little time goes by ...) OK, it's not ImageMagick's fault. The problem is the documented flaw (http://news.php.net/php.internals/21796) in the way PHP invokes cmd.exe. My implementation of the wfShellExec function in GlobalFunctions.php now contains this:
<code><pre>
...
    } elseif ( php_uname( 's' ) == 'Windows NT' ) {
        # This is a hack to work around PHP's flawed invocation of cmd.exe
        # http://news.php.net/php.internals/21796
        # Quote the whole command for cmd /C and turn every slash into a backslash.
        $cmd = 'cmd /C ' . '"' . str_replace( '/', '\\', $cmd ) . '"';
    }
    wfDebug( "wfShellExec: $cmd\n" );
    $output = array();
    $retval = 1; // error by default?
    exec( $cmd, $output, $retval ); // exec() returns only the last line; $output collects every line.
    return implode( "\n", $output );
}</pre>
</code>
Note that this differs in two ways from the distributed version of GlobalFunctions.php:
<ol>
<li>Prefixes the command with "cmd /C"</li>
<li>Replaces all occurrences of slash with backslash</li>
</ol>
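Pulled out of wfShellExec, those two changes can be sketched as a small standalone helper. This is only an illustration: the function name wrapForCmdExe and the sample paths are made up for the example and are not part of MediaWiki.

```php
<?php
// Hypothetical helper, not part of MediaWiki: applies the two changes
// listed above to a command string before it is handed to exec().
function wrapForCmdExe( $cmd ) {
    // Turn every forward slash into a backslash so all paths face one way.
    $normalized = str_replace( '/', '\\', $cmd );
    // Prefix with cmd /C and wrap the whole command line in double quotes.
    return 'cmd /C "' . $normalized . '"';
}

// Sample paths are invented for the demonstration.
echo wrapForCmdExe( 'C:/tools/convert.exe C:/wiki/images/a.png' ), "\n";
// prints: cmd /C "C:\tools\convert.exe C:\wiki\images\a.png"
```

Note that the replacement is blind: a slash anywhere in the command, including inside an argument, is converted as well.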
I did the first, as I remember, to get Batik (used for processing SVGs) to work, since it is driven by a batch file.
I did the second to get some consistency in the direction the slashes are facing. The MediaWiki code is a little bit unclear when it decides to stick slashes into strings that represent filenames, and sometimes uses pieces of a URL (forward slash) to append subdirectory names to existing directories (backward slash). All I do is format the result as though it's a filename (which it will always be in this situation, since we're trying to execute something). Naturally, there will be a problem if there ever arises a situation in which the slashes are being used for something else, but I haven't seen this happen. Presumably they would show up uuencoded if they were being used (say) as part of an argument to a command.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-23207450197155239772007-05-11T18:48:00.000-05:002007-05-16T14:33:41.737-05:00Making it fastUntil today, I have been reasonably disappointed at the speed at which the pages in my wiki have been coming up. Wikipedia uses lots of stunts which aren't really open to me, like using Squid together with lots & lots of slaved servers. Right now, I'm just running on a two-processor (2 x 3GHz Pentium) box with 2GB of memory. SQL Server, IIS, etc. are all running locally. In this configuration, pages were taking anywhere from 2 to 10 times as long to display as the corresponding pages from Wikipedia.
The fix is in. I suspected that a PHP accelerator would work, and it does -- beyond my expectations. All of a sudden, this wiki is snappy!
The trick is getting a PHP accelerator that works in our configuration (Windows 2003 SE, IIS 6, PHP 5, etc.). The one that does it is <a href="http://www.zend.com/products/zend_optimizer">Zend Optimizer</a>. It's a free download. Thanks, Zend. If you're running any PHP on IIS, give it a try. I think you'll be impressed.
Here are some example timings for my wiki, along with times from wikipedia.org:
<table>
<tr><th>Page title</th><th>Time before optimizer</th><th>Time after optimizer</th><th>Wikipedia time</th></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Philippines">Philippines</a></td><td>25 seconds</td><td>5 seconds</td><td>4 seconds</td></tr>
<tr><td><a href="http://en.wikipedia.org/wiki/Arnold_Schwarzenegger">Arnold Schwarzenegger</a></td><td>15 seconds</td><td>7 seconds</td><td>5 seconds</td></tr>
</table>DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com1tag:blogger.com,1999:blog-36920053.post-45739017168560351532007-04-02T19:46:00.000-05:002007-04-02T19:56:27.623-05:00The long pauseI disappeared from the Blogosphere for a while. No need to panic, the project is going very well. Pretty much everything is working on SQL Server now. My intent was to record in the blog everything that was going on as it was happening. Instead, I got so wrapped up in working on the code I didn't have much energy left to do the blogging.
My intent now is to get my work rolled into the MediaWiki baseline once MediaWiki 1.10 releases. The MediaWiki principals seem to be working fairly arduously on making the 1.10 release happen, and don't need to be distracted by worrying about incorporating completely new capabilities that come at them from out of the blue.
Not surprisingly, there turned out to be a little bit of scope creep. MediaWiki doesn't seem to get much use on Microsoft Windows servers, so I discovered some things that needed to be done to make it work better there too -- like getting the ImageMagick and Batik image handling working (by default it doesn't). I also had to put in some work to get HTML Tidy running. It wasn't until pretty late in the game that I realized that was even going to be necessary. In particular, there are a lot of templates used in Wikipedia and related sites that just plain won't work right now without HTML Tidy.
Having posted this latest status report now, it remains for me to go back and retrace the steps it took to get to a working MediaWiki engine on Microsoft SQL Server in upcoming blog posts.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com2tag:blogger.com,1999:blog-36920053.post-1163651905767076052006-11-15T22:33:00.000-06:002007-01-22T21:10:49.585-06:00First stepsIn order to make MediaWiki do something new, we first watched how it does some of the old things. In particular, how a new Wiki gets created. Here's a summary.
When MediaWiki is first installed, there is no database. How does the engine know this, and what does it do to create one? Well, it doesn't actually check for a database. Instead, a clean installation of the MediaWiki scripts looks for a LocalSettings.php file in the top-level directory (which we'll call "~"). If that file isn't found, then the script ultimately serves up the content of ~/includes/templates/NoLocalSettings.php. The most significant thing on that page is a link to ~/config/index.php.
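As a rough sketch of that decision (this is not the actual MediaWiki code; the function name pickEntryPage and the 'wiki' return value are invented for illustration), the check boils down to a test for one file:

```php
<?php
// Sketch only: the function name and return values are illustrative and
// are not taken from the MediaWiki source.
function pickEntryPage( $installDir ) {
    // The engine never probes for a database; it only looks for this file.
    if ( file_exists( $installDir . '/LocalSettings.php' ) ) {
        return 'wiki';
    }
    // No settings file yet: fall back to the page that links to the installer.
    return $installDir . '/includes/templates/NoLocalSettings.php';
}

echo pickEntryPage( __DIR__ ), "\n";
```

The point is simply that a missing LocalSettings.php, not a missing database, is what sends a fresh installation to the configuration page.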
Clicking on that link will take you to the installation script. The top of the page generated by that script will report on the environment, and the bottom half will contain a form to allow the wiki to be configured. In our case, the second bullet in the environment report says "Found database drivers for: MySQL" and the form Database config section has a single radio button for Database type, labelled "MySQL". It looks like this page will be a good place to start looking for what needs to change. Obviously we want to expand the found database drivers and to make an additional radio button available.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0tag:blogger.com,1999:blog-36920053.post-1162342102220719972006-10-31T18:30:00.000-06:002007-05-11T19:17:19.389-05:00Project StartWe decided to embark on a project to adapt the MediaWiki engine to support Microsoft SQL Server on the back end instead of MySQL. From monitoring the MediaWiki discussion group, I was aware that my team was not the only one to find this of interest, and one participant in the discussion asked that I set up a blog to discuss our progress on this project. Well, here it is!
For development, we had access to a machine that already had PHP 4.4.4 installed on it. The first action we took was to upgrade that to PHP 5.1.6. MySQL was already up and running on the machine, as was Microsoft SQL Server 2000.
The first hurdle we faced was getting PHP 5.1.6 to work properly with Microsoft Internet Information Server (IIS) version 6. Most of the instructions we found online relate to IIS 5, and were not particularly helpful. Finding our resident webmaster to point us in the right direction got us off the ground.
Making changes to the PHP.ini file had no effect until I added a key to the registry using regedit. The new key has the name [HKEY_LOCAL_MACHINE\SOFTWARE\PHP] and the value "IniFilePath" = "C:\\PHP". After making this change, it was smooth sailing to get PHP configured.
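One quick way to see whether PHP is picking up the INI file you expect is to ask PHP itself from a page served by IIS. The snippet below is a minimal sketch, not something from our original setup; the function name describeLoadedIni is made up, and it relies only on PHP's built-in get_cfg_var().

```php
<?php
// Illustrative check, not from the original setup: report which php.ini
// (if any) the running PHP actually loaded.
function describeLoadedIni() {
    // 'cfg_file_path' holds the path of the configuration file in use.
    $path = get_cfg_var( 'cfg_file_path' );
    if ( !is_string( $path ) || $path === '' ) {
        return 'No php.ini loaded';
    }
    return 'Loaded php.ini: ' . $path;
}

echo describeLoadedIni(), "\n";
```

If the reported path is not the file you have been editing, that would explain why changes appear to have no effect.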
We copied the files from the MediaWiki distribution into a web server directory, browsed to that directory, and had MediaWiki up and running (with the MySQL back end) in a matter of moments.
Now we want to watch the PHP as the web site runs our Wiki, so we want a debugger that runs on our desktops and that will let us step through code running on the server. That's where Komodo (<a href="http://www.activestate.com">http://www.activestate.com</a>) comes in. What a great tool!
The only confusing thing we ran into setting up Komodo is that we have to get rid of the PHP registry entry on the client or we can't set up Komodo for remote debugging. (Komodo runs with a special PHP.ini file when it's in debugging mode, and apparently the registry entry overrides Komodo's mechanism for specifying where to find this alternate INI file). Once that's set up, we put the magic words at the end of the Wiki URL, and presto!, we can step through the code from the Komodo client and watch how the MediaWiki PHP scripts do their magic.DJhttp://www.blogger.com/profile/04147793739642749276noreply@blogger.com0