Internet Archive plugin for WordPress

The Internet Archive stores snapshots of webpages to preserve online information and culture for the future. Anyone can see snapshots of old versions of webpages in the Internet Archive Wayback Machine. If anything ever happens to you or your blog goes down, your blog posts can still remain accessible in the Wayback Machine. You can now help this effort by making sure your WordPress blog is included in the archive.

By using my shiny new Post Archival in the Internet Archive plugin for WordPress, your WordPress blog will automatically ping the archive 12 hours after you publish new blog posts. The 12-hour window will allow you to correct spelling and other mistakes, or even unpublish the blog post entirely without having it stored in the archive. It helps to keep you honest and functions as a motivation to set high-quality expectations for yourself.

The first time you activate the plugin on a WordPress site, it will start to send archive requests for existing blog posts. Only one blog post will be archived every 25 minutes, so this can take quite a while depending on the number of posts. You don’t need to do anything; just let the plugin do its thing.
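To get a feel for how long the initial backlog takes, here is a rough back-of-the-envelope sketch in Python. It is not the plugin's code. The function names are made up for illustration, and it assumes the first request goes out right away, which may not match what the plugin actually does.

```python
from datetime import datetime, timedelta

# From the post: one existing blog post is archived every 25 minutes.
BACKLOG_INTERVAL = timedelta(minutes=25)


def backlog_schedule(post_urls, start):
    """Pair each existing post URL with the time its request would go out.

    Assumption: the first request is sent at `start`, and each later one
    follows 25 minutes after the previous one.
    """
    return [(start + i * BACKLOG_INTERVAL, url) for i, url in enumerate(post_urls)]


def backlog_duration(post_count):
    """Time between the first and the last request for `post_count` posts."""
    return BACKLOG_INTERVAL * max(post_count - 1, 0)


if __name__ == "__main__":
    # Hypothetical example: a blog with 100 existing posts.
    print(backlog_duration(100))
```

Under those assumptions, a blog with 100 existing posts needs a bit over 41 hours to work through the backlog.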

Download the Post Archival in the Internet Archive plugin from the WordPress Plugin Gallery to start archiving your WordPress blog immediately.

The plugin only archives blog posts and doesn’t archive pages, dated archives, taxonomy pages, or other types of pages. There are no configurable options for the plugin. All archival requests will use your permalink settings as well as shortlink settings (when configured).

The plugin uses the User-Agent “Post-Archival-Plugin/1.0 WordPress/<version>”. You’ll see requests with this User-Agent in your server logs, echoed back from IP addresses owned by the Internet Archive. This means your save request was successfully processed and that they’re retrieving the page, images, and any other resources required to render the page.
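For illustration, this is roughly what such a save request could look like when built in Python. It is only a sketch and not the plugin's actual code. The save endpoint URL, the function names, and the example WordPress version are all assumptions; only the User-Agent format comes from the description above. Nothing is sent over the network here.

```python
import urllib.request

# Assumption: the Wayback Machine's Save Page feature is reachable by
# prefixing the page URL with this address.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_user_agent(wp_version, plugin_version="1.0"):
    """Format the User-Agent string described in the post."""
    return f"Post-Archival-Plugin/{plugin_version} WordPress/{wp_version}"


def build_save_request(post_url, wp_version):
    """Prepare (but do not send) a save request for one blog post."""
    return urllib.request.Request(
        SAVE_ENDPOINT + post_url,
        headers={"User-Agent": build_user_agent(wp_version)},
    )


if __name__ == "__main__":
    # Hypothetical post URL and WordPress version.
    request = build_save_request("https://example.com/hello-world/", "4.7")
    print(request.full_url)
    print(request.get_header("User-agent"))
```

Searching your access logs for the “Post-Archival-Plugin” string is then an easy way to spot the Internet Archive's follow-up requests.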

Your blog will not be archived if your robots.txt excludes or accidentally discriminates against the “ia_archiver” robot. The plugin doesn’t test whether archival requests will succeed, nor does it guarantee inclusion in the Internet Archive. All it does is ping the archive when your blog has new content. You can test for any potential problems by manually saving a blog post using the Save Page option at web.archive.org. If you don’t run into any errors there, you shouldn’t have any problems using this plugin either.
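If you'd rather check your robots.txt rules offline first, Python's standard library can evaluate them for the “ia_archiver” robot. This is a minimal sketch with made-up example rules and URLs. It only tells you how Python's parser reads the rules, which may differ in edge cases from how the Internet Archive interprets them.

```python
from urllib.robotparser import RobotFileParser


def allows_ia_archiver(robots_txt, post_url):
    """Return True if the given robots.txt text lets ia_archiver fetch post_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ia_archiver", post_url)


# Hypothetical rules that shut the archive's robot out completely.
BLOCKING_RULES = """\
User-agent: ia_archiver
Disallow: /
"""

# Hypothetical rules that only hide the admin area from all robots.
PERMISSIVE_RULES = """\
User-agent: *
Disallow: /wp-admin/
"""

if __name__ == "__main__":
    post = "https://example.com/2016/hello-world/"
    print(allows_ia_archiver(BLOCKING_RULES, post))
    print(allows_ia_archiver(PERMISSIVE_RULES, post))
```

With the first set of rules the post would be skipped; with the second it should be fine, since only the admin area is off-limits.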

If you find this plugin and the Internet Archive’s services useful, please consider donating to the Internet Archive project. They’re a non-profit trying to preserve the depth of information and culture that’s available on the web. Thousands upon thousands of pages and sites disappear off the web every day; the Internet Archive is trying to preserve our digital heritage.

Download the Post Archival in the Internet Archive plugin from the WordPress Plugin Gallery, or grab the source code from WordPress Plugin Hosting (Subversion).

As there are a few Nikola users following this blog, I’d like to remind them that I’ve also made an Internet Archive plugin for Nikola. Nikola is a static site generator written in Python, if WordPress isn’t quite your thing.

The photo of the books in the background of the feature image is © 2016 Patrick Tomasso. WordPress and the WordPress circled-W logo are trademarks of the WordPress Foundation. The Post Archival in the Internet Archive code is free software licensed under the GPLv3. The Internet Archive are just awesome.