BaseDump Class Reference
[Maintenance]

Readahead helper for making large MediaWiki data dumps; reads in a previous XML dump to sequentially prefetch text records already normalized and decompressed.

Public Member Functions

 BaseDump ($infile)
 prefetch ($page, $rev)
 Attempts to fetch the text of a particular page revision from the dump stream.
 debug ($str)
 nextPage ()
 nextRev ()
 nextText ()
 skipTo ($name, $parent='page')
 nodeContents ()
 Shouldn't something like this be built into XMLReader? Fetches the text contents of the current element, assuming it contains no sub-elements or other such scary things.
 close ()

Public Attributes

 $reader = null
 $atEnd = false
 $atPageEnd = false
 $lastPage = 0
 $lastRev = 0


Detailed Description

Readahead helper for making large MediaWiki data dumps; reads in a previous XML dump to sequentially prefetch text records already normalized and decompressed.

This should reduce load on the external database servers, since text already present in the previous dump does not have to be fetched from them again.

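Assumes that dumps will be recorded in the canonical order:

 - ascending by page_id
 - ascending by rev_id within each page
 - text contents are immutable and should not change once recorded, so the previous dump is a reliable source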

Requires PHP 5 and the XMLReader PECL extension.

Definition at line 57 of file backupPrefetch.inc.
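
The sequential read-ahead described above can be illustrated with a small, self-contained sketch. It is not the MediaWiki implementation: the class name MiniPrefetch, its advance() helper, and the simplified dump layout (page/id, revision/id, revision/text) are assumptions made for illustration. It relies on records appearing in ascending page ID order and, within a page, ascending revision ID order, so a single forward-only cursor is enough.

```php
<?php
// Hypothetical, simplified stand-in for BaseDump; not the MediaWiki code.
// Assumes <page> elements in ascending <id> order, each holding <revision>
// elements in ascending <id> order, each with <id> and <text> children only.
class MiniPrefetch {
    private $reader;
    private $atEnd = false;
    private $page = 0;     // page ID of the record under the cursor
    private $rev = 0;      // revision ID of the record under the cursor
    private $text = null;  // text of the record under the cursor

    public function __construct($xml) {
        $this->reader = new XMLReader();
        $this->reader->XML($xml);
    }

    // Returns the text for ($page, $rev), or null if the dump lacks it.
    // Requests are expected in the same ascending order as the dump.
    public function prefetch($page, $rev) {
        while (!$this->atEnd) {
            if ($this->page > $page || ($this->page == $page && $this->rev > $rev)) {
                return null; // cursor is already past the requested record
            }
            if ($this->page == $page && $this->rev == $rev) {
                return $this->text;
            }
            $this->advance();
        }
        return null;
    }

    // Moves the cursor forward to the end of the next <revision> element.
    private function advance() {
        $r = $this->reader;
        $inRevision = false;
        while ($r->read()) {
            if ($r->nodeType === XMLReader::ELEMENT) {
                if ($r->name === 'revision') {
                    $inRevision = true;
                } elseif ($r->name === 'id') {
                    $id = (int)$r->readString();
                    if ($inRevision) {
                        $this->rev = $id;
                    } else {
                        $this->page = $id;
                    }
                } elseif ($r->name === 'text') {
                    $this->text = $r->readString();
                }
            } elseif ($r->nodeType === XMLReader::END_ELEMENT && $r->name === 'revision') {
                return;
            }
        }
        $this->atEnd = true; // ran out of input
    }
}

$xml = '<mediawiki>'
     . '<page><id>1</id>'
     . '<revision><id>10</id><text>first</text></revision>'
     . '<revision><id>11</id><text>second</text></revision>'
     . '</page>'
     . '<page><id>2</id>'
     . '<revision><id>20</id><text>third</text></revision>'
     . '</page>'
     . '</mediawiki>';

$dump = new MiniPrefetch($xml);
var_dump($dump->prefetch(1, 10)); // string(5) "first"
var_dump($dump->prefetch(1, 12)); // NULL (revision 12 is not in the dump)
var_dump($dump->prefetch(2, 20)); // string(5) "third"
```

Because the cursor never rewinds, a request for a record that has already been passed returns null; in this sketch that is the price of reading the old dump only once.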


Member Function Documentation

BaseDump::BaseDump ( infile  ) 

Definition at line 64 of file backupPrefetch.inc.

BaseDump::close (  ) 

Access:
private

Definition at line 197 of file backupPrefetch.inc.

Referenced by nodeContents() and skipTo().

BaseDump::debug ( str  ) 

Definition at line 102 of file backupPrefetch.inc.

References wfDebug().

Referenced by prefetch() and skipTo().

BaseDump::nextPage (  ) 

Access:
private

Definition at line 111 of file backupPrefetch.inc.

References nodeContents() and skipTo().

Referenced by prefetch().

BaseDump::nextRev (  ) 

Access:
private

Definition at line 126 of file backupPrefetch.inc.

References nodeContents() and skipTo().

Referenced by prefetch().

BaseDump::nextText (  ) 

Access:
private

Definition at line 139 of file backupPrefetch.inc.

References nodeContents() and skipTo().

Referenced by prefetch().

BaseDump::nodeContents (  ) 

Shouldn't something like this be built into XMLReader? Fetches the text contents of the current element, assuming it contains no sub-elements or other such scary things.

Returns:
string
Access:
private

Definition at line 172 of file backupPrefetch.inc.

References close().

Referenced by nextPage(), nextRev(), and nextText().
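
As a rough illustration of what fetching the text contents of the current element involves with XMLReader, here is a minimal sketch. The function name elementText() is hypothetical and this is not the code from backupPrefetch.inc; it assumes the reader is positioned on an element start tag and that the element has no child elements.

```php
<?php
// Hypothetical helper, not the MediaWiki implementation.
// Expects $reader to be positioned on an element start tag.
function elementText(XMLReader $reader) {
    if ($reader->isEmptyElement) {
        return ''; // <tag/> has no content and no separate end tag
    }
    $buffer = '';
    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::TEXT:
            case XMLReader::CDATA:
            case XMLReader::WHITESPACE:
            case XMLReader::SIGNIFICANT_WHITESPACE:
                $buffer .= $reader->value;
                break;
            case XMLReader::END_ELEMENT:
                return $buffer; // closing tag of the element we started on
        }
    }
    return null; // input ended before the closing tag was seen
}

$reader = new XMLReader();
$reader->XML('<revision><id>42</id><text>Hello, wiki</text></revision>');
$reader->read(); // now on <revision>
$reader->read(); // now on <id>
var_dump(elementText($reader)); // string(2) "42"
```

The sketch returns at the first end tag it meets, which is why it is only correct for elements without children, matching the assumption stated in the description above.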

BaseDump::prefetch ( page, rev )

Attempts to fetch the text of a particular page revision from the dump stream.

May return null if the page is unavailable.

Parameters:
int $page ID number of page to read
int $rev ID number of revision to read
Returns:
string or null

Definition at line 78 of file backupPrefetch.inc.

References $page, debug(), nextPage(), nextRev(), and nextText().
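
Since prefetch() may return null, a caller needs some fallback for revisions that are missing from the previous dump. The following sketch shows one way to arrange that. The function textFor() and the two closures are hypothetical stand-ins, not part of MediaWiki; the closures only imitate a prefetch source and a slower loader.

```php
<?php
// Hypothetical wrapper: $prefetch and $load are supplied by the caller.
function textFor($page, $rev, callable $prefetch, callable $load) {
    $text = $prefetch($page, $rev);
    if ($text === null) {
        // Not found in the previous dump: fall back to the slower source.
        $text = $load($rev);
    }
    return $text;
}

// Stand-ins for demonstration only.
$old = array('1:10' => 'cached text');
$prefetch = function ($page, $rev) use ($old) {
    $key = "$page:$rev";
    return isset($old[$key]) ? $old[$key] : null;
};
$load = function ($rev) {
    return "loaded revision $rev";
};

echo textFor(1, 10, $prefetch, $load), "\n"; // cached text
echo textFor(1, 11, $prefetch, $load), "\n"; // loaded revision 11
```

The strict comparison with null matters here: an empty string is a legitimate revision text and should not trigger the fallback.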

BaseDump::skipTo ( name, parent = 'page' )

Access:
private

Definition at line 147 of file backupPrefetch.inc.

References close() and debug().

Referenced by nextPage(), nextRev(), and nextText().


Member Data Documentation

BaseDump::$atEnd = false

Definition at line 59 of file backupPrefetch.inc.

BaseDump::$atPageEnd = false

Definition at line 60 of file backupPrefetch.inc.

BaseDump::$lastPage = 0

Definition at line 61 of file backupPrefetch.inc.

BaseDump::$lastRev = 0

Definition at line 62 of file backupPrefetch.inc.

BaseDump::$reader = null

Definition at line 58 of file backupPrefetch.inc.


The documentation for this class was generated from the following file:

 backupPrefetch.inc

Generated on Sat Sep 5 02:08:33 2009 for MediaWiki by doxygen 1.5.9