MediaWiki maintenance

Documentation

MediaWiki static dump tools: https://meta.wikimedia.org/wiki/Static_version_tools
MediaWiki dumpBackup.php manual: https://www.mediawiki.org/wiki/Manual:DumpBackup.php

Backups

Full Backups

To create a full backup, you'll need to back up the database, the images directory, and LocalSettings.php:

Backup Database


mysqldump -u wiki -pPASSWORD wikidb > ~/wikidb-backup.sql
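To restore, feed the dump back into the same database (this assumes the same wikidb database and wiki user; the inline -pPASSWORD form also leaves the password in your shell history, so prompting with a bare -p is safer):

mysql -u wiki -p wikidb < ~/wikidb-backup.sql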

Backup Images

TODO
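A minimal sketch until this is written up, assuming the default images/ upload directory and the install path used elsewhere on this page:

cd /usr/local/www/mediawiki
tar -czf ~/wiki-images-backup.tar.gz images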

Backup LocalSettings.php

TODO
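A minimal sketch, assuming the install path used elsewhere on this page:

cp /usr/local/www/mediawiki/LocalSettings.php ~/LocalSettings.php.bak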

XML Dumps

You can dump the entire wiki as XML, then use an XML parser to convert it to various other formats.
This is very fast, but it doesn't render content that is normally produced by extensions.
Each page is wrapped in a <page> element.

cd /usr/local/www/mediawiki/maintenance
php dumpBackup.php --full --quiet > dump.xml
# mediawiki's builtin parser (parsoid, best-effort)
php ${your_wiki}/maintenance/parse.php dump.xml > out.html

# pandoc (halts on error)
pandoc dump.xml -f mediawiki -t html -o dump.html
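A quick sanity check on the dump with standard shell tools, assuming dump.xml was produced by dumpBackup.php as above (which writes each <title> on its own line):

grep -c "<page>" dump.xml                            # number of pages in the dump
sed -n 's:.*<title>\(.*\)</title>.*:\1:p' dump.xml   # list page titles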

Static HTML

Tools

mw2html
static-wiki

wget

Captures pages and corrects links, though not as relative links in my case; it can technically capture CSS as well.
wget may have a way to override the DNS server used for its requests, which you might be able to use to force it to crawl the wiki over localhost; otherwise you'll need to take the site down and point $wgServer at localhost.
A temporary /etc/hosts entry achieves the same hostname-to-localhost redirect, as sketched below.
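A rough sketch of the /etc/hosts approach (the hostname is a placeholder; remove the entry when you're done):

echo "127.0.0.1 yourwiki.com" | sudo tee -a /etc/hosts   # point the hostname at this machine
# ... run the wget mirror below ...
sudo sed -i '/yourwiki\.com/d' /etc/hosts                # remove the entry afterwards (GNU sed)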

wget --recursive \
     --page-requisites \
     --adjust-extension \
     --convert-links \
     --no-parent \
     -R "*Special*" \
     -R "Special*" \
     -R "*action=*" \
     -R "*printable=*" \
     -R "*oldid=*" \
     -R "*title=Talk:*" \
     -R "*limit=*" \
     "https://yourwiki.com"

Zim

xmldump2zim: create a ZIM file from a MediaWiki XML dump
wget-2-zim: bash script that scrapes a MediaWiki site into a ZIM file
zim-tools: includes zimwriterfs, which dumps a MediaWiki site to a ZIM file
mwoffliner: scrape a MediaWiki instance into a ZIM file (see the sketch below)
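A minimal mwoffliner sketch from memory; the flags may differ between versions (check mwoffliner --help), and the URL and e-mail are placeholders:

npm install -g mwoffliner
mwoffliner --mwUrl="https://yourwiki.com/" --adminEmail="you@example.com"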

Delete Revision History

cd /usr/local/www/mediawiki/maintenance
php deleteOldRevisions.php --delete
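Deleting revisions can leave orphaned rows in the text table; if I recall correctly, purgeOldText.php cleans those up (it only reports unless --purge is given; verify against your MediaWiki version):

php purgeOldText.php --purge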