This article explains how to export WordPress to Hugo with R to create a static website. The code contains functionality for both RMarkdown and Org Mode files.

Export WordPress to Hugo Markdown or Org Mode with R

Peter Prevos | 18 July 2020
Last Updated | 19 August 2024


I started my first website in 1996 with hand-written HTML. That became a bit of a chore, so for about fifteen years, WordPress became my friend. WordPress has been great to me, but it has slowly become a pain: plugins that constantly need updating, security issues, slow performance and the annoying block editor. I am also always looking for additional activities I can do with Emacs. Hugo takes away a lot of the pain of managing a site, as you can focus on the content, and Emacs provides me with powerful editing functionality.

I recently returned to a static website using Hugo. This article explains how to export a WordPress blog to Hugo and customise it with R code. The only reason I used R is that it is the only programming language I know well enough.

You will also need to install the mighty Pandoc software to convert the content to Org mode and the WP All Export WordPress plugin to export your website to a CSV file.

Convert the content to Markdown or Org Mode

The first step is to export the WordPress posts database to a CSV file. Several plugins are available that help you with this task. I have used the WP All Export plugin to export the data. You need to download the ZIP file and install this plugin manually in your WordPress setup. Follow the steps in the WP All Export plugin and create a CSV file from your posts with at least these fields:

Title
Slug
Date
Content
Categories
Tags

Alternatively, you can link directly to the WordPress database and extract the data with the RMySQL package.
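
A minimal sketch of that alternative, assuming the DBI and RMySQL packages are installed and that your database uses the default wp_ table prefix. The helper names wp_posts_query and read_wp_posts are hypothetical, and the connection details are placeholders. Categories and tags live in separate term tables and are left out here.

```r
# Hypothetical helper: build the SQL that selects published posts.
# Assumes the default WordPress table prefix "wp_"; yours may differ.
wp_posts_query <- function(prefix = "wp_") {
  paste0("SELECT post_title AS Title, post_name AS Slug, ",
         "post_date AS Date, post_content AS Content ",
         "FROM ", prefix, "posts ",
         "WHERE post_type = 'post' AND post_status = 'publish'")
}

# Hypothetical helper: connect, fetch the posts and disconnect again.
# All arguments are placeholders for your own database credentials.
read_wp_posts <- function(dbname, user, password, host = "localhost") {
  con <- DBI::dbConnect(RMySQL::MySQL(), dbname = dbname, user = user,
                        password = password, host = host)
  on.exit(DBI::dbDisconnect(con))
  DBI::dbGetQuery(con, wp_posts_query())
}
```

The column aliases mirror the field names of the CSV export, so the resulting data frame could stand in for the posts data frame that the script reads from the CSV file.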

The content files for Hugo are either Markdown or Org Mode. I prefer to use Org Mode as it provides me with access to the extensive functionality that Emacs has to offer, including writing and evaluating R code.

The Content field in the exported CSV file contains HTML code of the article. The code below reads the CSV file and saves each content field as an HTML file, using the post's slug as the filename. The mighty Pandoc software converts this file to Org mode. Any draft posts or pages in the export file will have NA as the file name and are as such skipped.
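
As a rough illustration of this step for a single post, the sketch below writes the HTML and calls Pandoc with explicit input and output formats. The helper names has_slug, pandoc_args and convert_post are hypothetical, and the sketch assumes that Pandoc is on your path and that the tmp and org folders already exist. It checks for missing or empty slugs explicitly instead of relying on the file name.

```r
# Hypothetical helper: TRUE for slugs that can serve as a file name.
has_slug <- function(slug) {
  slug <- as.character(slug)
  !is.na(slug) & nzchar(slug)
}

# Hypothetical helper: Pandoc arguments to convert one HTML file to Org mode.
pandoc_args <- function(slug, html_dir = "tmp", org_dir = "org") {
  c("-f", "html", "-t", "org",
    "-o", file.path(org_dir, paste0(slug, ".org")),
    file.path(html_dir, paste0(slug, ".html")))
}

# Hypothetical helper: save the content as HTML and convert it with Pandoc.
# Returns FALSE (invisibly) without touching the disk when the slug is missing.
convert_post <- function(slug, content, html_dir = "tmp", org_dir = "org") {
  if (!has_slug(slug)) return(invisible(FALSE))
  writeLines(content, file.path(html_dir, paste0(slug, ".html")))
  system2("pandoc", pandoc_args(slug, html_dir, org_dir))
  invisible(TRUE)
}
```

Changing "org" to "markdown" after -t, and the file extension to .md, should give Markdown output instead.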

Now that we have some content, we need to add the Org mode front matter so that Hugo can build a site. The last part of the code generates the front matter for each post, prepends it to the exported Org mode file and cleans some entries.
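
To make the shape of the front matter concrete, here is a small base R sketch for one post. The function name org_front_matter is hypothetical, and the sketch assumes, as the replacement of | in the script suggests, that the export separates multiple categories or tags with a vertical bar.

```r
# Hypothetical helper: Org mode front matter lines for a single post.
org_front_matter <- function(title, date, categories, tags, draft = TRUE) {
  # Spaces inside a term become hyphens, then "|" separators become spaces,
  # so that each term stays a single word in the space-separated list.
  as_terms <- function(x) {
    gsub("|", " ", gsub(" ", "-", x, fixed = TRUE), fixed = TRUE)
  }
  c(paste("#+title:", title),
    paste0("#+date: [", date, "]"),
    paste("#+categories[]:", as_terms(categories)),
    paste("#+tags[]:", as_terms(tags)),
    paste("#+draft:", tolower(as.character(draft))))
}

# Example with made-up values
org_front_matter("Hello World", "2020-07-18", "Data Science|R", "hugo|org mode")
```

Every post starts as a draft in this sketch, as it does in the script, so Hugo only shows it once you set the draft property to false or build the site with drafts included.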

Copy the code below and save it as wp2org.R. You need to change "filename" in the line that starts with posts to the name of your export file. The script also creates two subdirectories, tmp and org, to store the HTML and Org files.

You run this code with Rscript wp2org.R from the same directory where the CSV file is stored. The result will be a collection of Org mode files.

This new site will not be perfect just yet. To show the images, you need to download your wp-content folder and move its contents to the static/images folder in Hugo, because the script rewrites every link to wp-content so that it points to /images.
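
The link rewriting can be sketched in base R as follows. The function name repoint_images is hypothetical and example.com is a placeholder domain. This version uses a lazy match that stops at whitespace and square brackets, so each link on a line is handled on its own; a greedy pattern such as http.*wp-content would instead swallow everything between the first and the last link on a line.

```r
# Hypothetical helper: point WordPress upload links to Hugo's /images folder.
repoint_images <- function(lines) {
  # Lazy match from the URL scheme up to the first "/wp-content",
  # never crossing whitespace or the square brackets of an Org link.
  gsub("https?://[^\\s\\[\\]]*?/wp-content", "/images", lines, perl = TRUE)
}

repoint_images("[[https://example.com/wp-content/uploads/2020/07/plot.png]]")
#> "[[/images/uploads/2020/07/plot.png]]"
```

With this mapping, a file that lived in wp-content/uploads/2020/07 on the old site needs to sit in static/images/uploads/2020/07 in the Hugo project.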

The internal links in your posts are hard-coded, which means that you need to configure the permalinks in Hugo so that each post keeps the slug it had in WordPress.

There will be other bits and pieces that might not have adequately converted, so do check your pages.
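
One way to speed up this check is to scan the converted files for typical leftovers. The sketch below is only a starting point: the function name find_artefacts is hypothetical, and the patterns are assumptions based on the artefacts that the script tries to remove.

```r
# Hypothetical helper: list lines in the Org files that still contain
# WordPress leftovers (fenced divs, block classes, old links, $latex tags).
find_artefacts <- function(dir = "org") {
  files <- list.files(dir, pattern = "\\.org$", full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(":::|\\{\\.wp|wp-content|\\$latex", lines)
    if (length(idx) == 0) return(NULL)
    data.frame(file = basename(f), line = idx, text = lines[idx],
               stringsAsFactors = FALSE)
  })
  # One row per suspicious line; NULL when nothing was found
  do.call(rbind, hits)
}
```

Calling find_artefacts("org") after running the script gives a table of file names and line numbers to review by hand.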

All you have to do now is to add a theme to your website, and your blog is fully converted. The Hugo website has a great Quick Start page that will get you going.

You can create new posts and edit your content with your favourite text editor. I use Org mode in Emacs to develop this website.

Summary

In summary, you need to take the following steps:

Install Pandoc and the WP All Export WordPress plugin.
Download your website as a CSV file with the WordPress plugin.
Copy the R script into a file called wp2org.R and save it in the same location as the CSV file.
Open your console and move to the folder with the script and CSV file.
Run Rscript wp2org.R.
Review the Org mode files and clean up any issues.

Script

  ## Export WP to Hugo

  ## Read exported WP content
  library(tibble)
  library(readr)
  library(dplyr)
  library(stringr)

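  ## Replace "filename" with the name of the exported CSV file
  ## stringsAsFactors = FALSE keeps the text columns as character in R < 4.0
  posts <- read.csv("filename", skipNul = TRUE, stringsAsFactors = FALSE)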

  ## Create subdirectories
  if (!dir.exists("tmp")) dir.create("tmp")
  if (!dir.exists("org")) dir.create("org")

  ## Read posts
  for (i in seq_len(nrow(posts))) {
      ## Save content as temporary html file
      filename <- paste0(posts$Slug[i], ".html")
      writeLines(posts$Content[i], paste0("tmp/", filename))
      ## Convert to Org mode with Pandoc
      pandoc <- paste0("pandoc -o ", paste0("org/", posts$Slug[i],
                                            ".org ", paste0("tmp/", filename)))
      system(pandoc)
  }

  ## Create front matter for all posts
  fm <- tibble(title = paste("#+title:", posts$Title),
               date = paste0("#+date: [", as.POSIXct(posts$Date, origin = "1970-01-01"), "]"),
               lastmod = paste0("#+lastmod: [", Sys.Date(), "]"),
               categories = paste("#+categories[]:", str_replace_all(posts$Categories, " ", "-")),
               tags = paste("#+tags[]:", str_replace_all(posts$Tags, " ", "-")),
               draft = "#+draft: true") %>%
      mutate(categories = str_replace_all(categories, "\\|", " "),
             tags = str_replace_all(tags, "\\|", " "))

  ## Load Hugo files and prepend front matter
  for (f in seq_len(nrow(posts))) {
      filename <- paste0("org/", posts$Slug[f], ".org")
      post <- c(paste(fm[f, ]), "", readLines(filename))
      ## Repoint images (lazy match, so each link on a line is replaced separately)
      post <- str_replace_all(post, "http[^\\s\\[\\]]*?wp-content", "/images")
      ## Cleanup LaTeX
      post <- str_replace_all(post, "\\$latex ", "$")
      ## Remove remaining Wordpress artefacts
      post <- str_remove_all(post, ':::|\\{.wp.*|.*\\"\\}')
      ## Write to disk
      writeLines(post, filename)
  }
