Mediawiki maintenance

== Documentation ==
<blockquote>
{| class="wikitable"
| mediawiki static dump tools || https://meta.wikimedia.org/wiki/Static_version_tools
|-
| mediawiki dumpBackup xml || https://www.mediawiki.org/wiki/Manual:DumpBackup.php
|}
</blockquote><!-- Documentation -->

== Backups ==
<blockquote>
=== Full Backups ===
<blockquote>
To create a full backup, you'll need to:

==== Backup Database ====
<blockquote>
<syntaxhighlight lang="bash">
mysqldump -u wiki -pPASSWORD wikidb > ~/wikidb-backup.sql
</syntaxhighlight>
</blockquote><!-- Backup Database -->

==== Backup Images ====
<blockquote>
TODO
</blockquote><!-- Backup Images -->

==== Backup LocalSettings.php ====
<blockquote>
TODO
</blockquote><!-- Backup LocalSettings.php -->
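To restore the database dump taken above, replay it with the mysql client (a sketch, assuming the same credentials and an already-created, empty wikidb):
<syntaxhighlight lang="bash">
# recreate the wiki database contents from the SQL dump above
mysql -u wiki -pPASSWORD wikidb < ~/wikidb-backup.sql
</syntaxhighlight>
</blockquote><!-- full backups -->
</blockquote><!-- Backups -->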
== XML Dumps ==
<blockquote>
You can dump the entire wiki as XML, and then use an XML parser to convert it to various other formats.
<syntaxhighlight lang="bash">
cd /usr/local/www/mediawiki/maintenance
php dumpBackup.php --full --quiet > dump.xml
</syntaxhighlight>
<syntaxhighlight lang="bash">
# mediawiki's builtin parser
cat dump.xml | php ${your_wiki}/maintenance/parse.php --title foo
# parsoid
# pandoc
pandoc dump.xml -f mediawiki -t html -o dump.html
</syntaxhighlight>
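As a quick sanity check on the dump, you can list the page titles it contains (a crude sketch using grep/sed; a real XML parser is more robust):
<syntaxhighlight lang="bash">
# print the text of every <title> element in the dump
grep -o '<title>[^<]*</title>' dump.xml | sed 's/<[^>]*>//g'
</syntaxhighlight>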
</blockquote><!-- XML Dumps -->


== Static HTML ==
<blockquote>
=== Tools ===
<blockquote>
{| class="wikitable"
| mw2html
|-
| static-wiki
|}
</blockquote><!-- Tools -->
=== Home Grown ===
<blockquote>
==== dump + pandoc ====
<blockquote>
<syntaxhighlight lang="bash">
cd /usr/local/www/mediawiki/maintenance
php dumpBackup.php --full --quiet > dump.xml
pandoc dump.xml -f mediawiki -t html -o dump.html
</syntaxhighlight>
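pandoc's mediawiki reader also accepts raw wikitext on stdin, so you can smoke-test the conversion on a single snippet before processing the whole dump:
<syntaxhighlight lang="bash">
# convert one line of wikicode to HTML
echo "'''foo'''" | pandoc -f mediawiki -t html
</syntaxhighlight>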
</blockquote><!-- pandoc -->
==== wikicode parsers ====
<blockquote>
See the list of alternative mediawiki parsers here: http://www.mediawiki.org/wiki/Alternative_parsers

You can render wikicode to HTML using the wiki's actual parser:
<syntaxhighlight lang="bash">
echo "'''foo'''" | php ${your_wiki}/maintenance/parse.php --title foo
</syntaxhighlight>
You can find the wiki's contents in the database:
<syntaxhighlight lang="sql">
SELECT * FROM text LIMIT 10 OFFSET 100;
</syntaxhighlight>
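To pull the current wikitext of each page, join through the revision table (a sketch; assumes a pre-1.35 schema where revision still has rev_text_id):
<syntaxhighlight lang="bash">
mysql -u wiki -pPASSWORD wikidb -e "
    SELECT p.page_title, t.old_text
    FROM page p
    JOIN revision r ON r.rev_id = p.page_latest
    JOIN text t ON t.old_id = r.rev_text_id
    WHERE p.page_namespace = 0
    LIMIT 10;"
</syntaxhighlight>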
</blockquote><!-- wikicode parsers -->


==== wget ====
<blockquote>

Captures and corrects links, though not as relative links for me; technically it can capture CSS too.
<syntaxhighlight lang="bash">
wget --recursive \
     --page-requisites \
     --adjust-extension \
     --convert-links \
     --no-parent \
     -R "*Special*" \
     -R "Special*" \
     -R "*action=*" \
     -R "*printable=*" \
     -R "*oldid=*" \
     -R "*title=Talk:*" \
     -R "*limit=*" \
     "https://yourwiki.com"
</syntaxhighlight>

== Zim ==
<blockquote>
{| class="wikitable"
| xmldump2zim || create zimfile from a mediawiki XML dump
|-
| wget-2-zim || bash script to scrape mediawiki to zimfile
|-
| zim-tools || includes zimwriterfs, which dumps mediawiki to zimfile
|-
| mwoffliner || scrape a mediawiki to zimfile
|}
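For example, a minimal mwoffliner invocation looks roughly like this (a sketch; --mwUrl and --adminEmail are its two required options per its README, and it must be installed first, e.g. via npm):
<syntaxhighlight lang="bash">
# scrape the wiki into a .zim file in the current directory
mwoffliner --mwUrl="https://yourwiki.com" --adminEmail="admin@yourwiki.com"
</syntaxhighlight>
</blockquote><!-- Zim -->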

== Delete Revision History ==
<blockquote>
<syntaxhighlight lang="bash">
cd /usr/local/www/mediawiki/maintenance
php deleteOldRevisions.php --delete
</syntaxhighlight>
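Deleting revisions removes rows but does not shrink the tables on disk; to reclaim the space, optimize the affected tables afterwards (a sketch, assuming MySQL):
<syntaxhighlight lang="bash">
mysql -u wiki -pPASSWORD wikidb -e "OPTIMIZE TABLE revision, text;"
</syntaxhighlight>
</blockquote><!-- Delete Revision History -->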