
Simulating Text Files with R to Test the Emacs Denote Package

Peter Prevos
Emacs is the most user-friendly piece of software ever invented by humanity. I use it for 90% of my computing tasks, including keeping my digital knowledge garden with notes. Several note-taking packages exist in the Emacs ecosystem, with Org Roam as the most popular and fully featured. I have used this package for a while now, but it relies on a database and has grown a feature set far beyond my needs.
Protesilaos (Prot) Stavrou developed the Denote package that goes back to the basics of Emacs. The defining feature of this package is a file-naming convention that acts as metadata to find your notes. The basic structure is:
YYYYMMDDTHHMMSS==signature--file-name__keyword1_keyword2.extension
The filename starts with a timestamp at one-second resolution to ensure unique file names (unless you create more than one note per second). This timestamp also acts as the unique identifier to link notes.
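As a minimal sketch of how such an identifier can be built in R, the snippet below formats a few example times with the pattern `%Y%m%dT%H%M%S`. The example times and variable names are my own assumptions for illustration and are not part of Denote. It also shows that two notes created within the same second end up with the same identifier.

```r
# Hypothetical example times; a fixed UTC zone keeps the result reproducible.
created <- as.POSIXct(c("2024-01-02 03:04:05.1",
                        "2024-01-02 03:04:05.9",
                        "2024-01-02 03:04:06.0"), tz = "UTC")

# Denote-style identifier: date, a literal "T", then the time to the second.
identifiers <- format(created, "%Y%m%dT%H%M%S")
print(identifiers)

# The first two times fall within the same second, so their identifiers clash.
print(duplicated(identifiers))
```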
The timestamp is followed by two dashes and the slugified file name. Two underscores after the file name mark the start of the keywords, which are separated by single underscores. Users can also add an optional signature, introduced by two equals signs, between the timestamp and the file name.
This convention provides a convenient heuristic to find notes based on dates, titles and keywords. Denote supports Org mode, plain text and Markdown files.
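To illustrate how the naming convention doubles as metadata, here is a rough sketch in base R that splits a file name into its parts. The helper `parse_denote_name()` and the example file names are hypothetical, and the regular expression is my own simplified reading of the convention rather than the one Denote uses internally.

```r
# parse_denote_name() is a hypothetical helper, not a function from Denote.
# It splits a Denote-style file name into its components with a simplified
# regular expression.
parse_denote_name <- function(filename) {
  pattern <- paste0("^([0-9]{8}T[0-9]{6})",  # identifier (timestamp)
                    "(==([^-_.]*))?",        # optional signature
                    "(--([^_.]*))?",         # optional title slug
                    "(__([^.]*))?",          # optional keywords
                    "\\.(.+)$")              # file extension
  m <- regmatches(filename, regexec(pattern, filename))[[1]]
  if (length(m) == 0) return(NULL)           # not a Denote-style name
  list(identifier = m[2],
       signature  = m[4],
       title      = m[6],
       keywords   = strsplit(m[8], "_", fixed = TRUE)[[1]],
       extension  = m[9])
}

# Made-up file names for illustration.
files <- c("20240102T030405==1a--file-name__emacs_denote.org",
           "20240203T040506--another-note__rstats.org")

parsed <- parse_denote_name(files[1])
str(parsed)

# Find the notes tagged with a given keyword using the file names alone.
has_emacs <- vapply(files,
                    function(f) "emacs" %in% parse_denote_name(f)$keywords,
                    logical(1))
print(unname(has_emacs))
```

Because everything needed sits in the file name, a search like this never has to open a file or consult a database.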
The simplicity of Denote makes it easy to integrate with other Emacs packages and to extend with some Emacs Lisp code. My Denote Explore package is an example of a set of auxiliary functions to help find notes.
I considered moving away from Org Roam to the monastic simplicity of Denote. But before converting my existing knowledge base, I wanted to see how it behaves with thousands of files in a single folder. Rather than converting my existing files, I decided to generate some random files to see how it performs.
Generating Random Text Files for Emacs Denote
My coding chops in R are much better than in Emacs Lisp, so I decided to write some R code to generate random text files and put Denote through its paces.
This code uses the Collins Scrabble Word list to generate random file names and keywords. Download this file to your working directory before using this code. The code reads the file and generates a set of 50 keywords. Random timestamps are set somewhere in the distant future. Each file has a template for the front matter.
## Simulate n files in denote folder
## Initiation
library(stringr)
n <- 10000  # number of notes to generate
k <- 50     # number of distinct keywords
wordlist <- readLines("collins-scrabble-words-2019.txt")
wordlist <- tolower(wordlist)
## Keywords are sampled from words of five letters or fewer
tag_words <- sample(wordlist[nchar(wordlist) <= 5], k)
## Random offsets in seconds push the timestamps into the future
timestamps <- Sys.time() + sample(600E6:666E6, n)
template <- c("#+title: ",
              "#+date: ",
              "#+filetags: ",
              "#+identifier: ")
denote_directory <- "~/denote-sim/"
dir.create(denote_directory)
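As a rough check on the timestamp offsets in the code above, the sketch below converts the two bounds from seconds to years and confirms that a sample contains no repeated offsets. The smaller sample size of 1,000 and the fixed seed are my own choices to keep the example quick and repeatable.

```r
# The offsets are in seconds, so convert the bounds of the range to years.
seconds_per_year <- 365.25 * 24 * 60 * 60
offset_years <- c(600e6, 666e6) / seconds_per_year
print(round(offset_years, 1))

# sample() draws without replacement by default, so the offsets, and hence
# the instants in time they produce, are all distinct.
set.seed(1)  # arbitrary seed, only for repeatability
offsets <- sample(600e6:666e6, 1000)
print(anyDuplicated(offsets))  # 0 means no duplicates were found
```

The offsets place the simulated notes roughly 19 to 21 years ahead of the current date, well away from the timestamps of any real notes.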
This next code snippet generates $n$ Org mode files in the denote_directory folder. Titles are extracted by sampling the word list and the tags (keywords) are sampled from the $k$ defined tags. The front matter includes the title, the creation date, keywords (called filetags in Org mode) and the identifier.
The Lorem Ipsum generator in the stringi package generates a paragraph of text. The last part of the code adds a heading followed by one to five links to random notes.
## Generate n random posts
for (i in 1:n) {
  ## Title of two to five random words and one to four keywords
  title <- paste(sample(wordlist, sample(2:5, 1)), collapse = "-")
  tags <- paste(sample(tag_words, sample(4, 1)), collapse = "_")
  identifier <- format(timestamps[i], "%Y%m%dT%H%M%S")
  ## Org mode front matter
  front_matter <- c(paste0(template[1],
                           str_to_title(str_replace_all(title,
                                                        "-", " "))),
                    paste0(template[2],
                           paste0("[", format(timestamps[i],
                                              "%F %a %H:%M"), "]")),
                    paste0(template[3],
                           paste0(":", str_replace_all(tags,
                                                       "_", ":"), ":")),
                    paste0(template[4], identifier))
  ## One to five links to randomly chosen notes
  links_list <- vector()
  for (j in 1:(sample(1:5, 1))) {
    links_list[j] <- paste0("- ", "[[denote:",
                            sample(format(timestamps, "%Y%m%dT%H%M%S"), 1), "]]")
  }
  ## Body: front matter, a paragraph of filler text, a heading and the links
  content <- c(front_matter,
               "",
               stringi::stri_rand_lipsum(1),
               "",
               paste("*", str_to_title(paste(sample(wordlist,
                                                    sample(1:3, 1)),
                                             collapse = " "))),
               links_list)
  filename <- paste0(denote_directory, identifier, "--",
                     title, "__", tags, ".org")
  writeLines(content, filename)
}
Generating thousands of files will take a few minutes …
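Once the files exist, it is worth checking that their names follow the convention before pointing Denote at the folder. The sketch below is one way to do that under my own assumptions: it builds a tiny folder of made-up file names in a temporary directory so that it runs on its own, validates the names against a simplified pattern, and counts how often each keyword occurs. To inspect the simulated notes instead, point `sim_dir` at the denote_directory folder.

```r
# Build a small, hypothetical sample folder so the sketch runs on its own.
sim_dir <- file.path(tempdir(), "denote-check")
dir.create(sim_dir, showWarnings = FALSE)
sample_files <- c("20440102T030405--alpha-beta__emacs_notes.org",
                  "20440203T040506--gamma__emacs.org",
                  "20440304T050607--delta-epsilon__notes_tests.org")
invisible(file.create(file.path(sim_dir, sample_files)))

# List the notes and check each name against a simplified Denote pattern
# (identifier, title slug, keywords; signatures are not used here).
notes <- list.files(sim_dir, pattern = "\\.org$")
valid <- grepl("^[0-9]{8}T[0-9]{6}--[a-z-]+__[a-z_]+\\.org$", notes)
print(all(valid))

# Extract the keyword section between "__" and ".org", then count keywords.
keyword_part <- sub("^.*__(.*)\\.org$", "\\1", notes)
keyword_counts <- table(unlist(strsplit(keyword_part, "_", fixed = TRUE)))
print(keyword_counts)
```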
This code generates ten thousand notes to test whether the Denote package works at a large scale. The test shows that Prot's approach is perfectly capable of working with thousands of notes. Just for kicks, I also synchronised these files with an Org Roam setup. My laptop struggled with the computational load, and with this many files I was unable to access them properly. So, case closed: I am moving to Denote and will teach myself more Emacs Lisp to build my ideal Zettelkasten.
Postscriptum
Since I wrote this post a few months ago, I have migrated all my Org Roam notes to Denote. I migrated my 2,000 notes manually as I wanted to reread all my material and immerse myself in my old ideas to generate new ones. Paraphrasing Socrates: gaining knowledge is the art of remembering.
Denote plays a central role in my Emacs Writing Studio configuration. This starter kit is specially designed for authors who have no need to develop software.
If you don't feel like spending the time to migrate manually, other Emacs enthusiasts have developed automated methods:
- Jeremy Friesen, Exploring the Denote Emacs Package
- Jeremy Friesen, Migration Plan for Org-Roam Notes to Denote
- Charanjit Singh, Notes Migrator