When writing papers, I’ve consistently found managing evaluation data and plots troublesome. Getting to the first draft of a plot is often quick; the challenge arises during the inevitable subsequent revisions. Here’s the workflow I’ve been using to speed up my iteration time.

Issues & Inconveniences

My previous bespoke Excel-based workflow suffered from two categories of issues:

Manual updates:

  • Loading data was manual (even from CSVs).
  • Every plot had to be created from scratch.
  • Plots had to be manually exported one by one on each change.

Reproducibility issues:

  • Spreadsheets had to be versioned and tracked separately from code*.
  • I often forgot which evaluation run was loaded in any given sheet.
    • Especially when returning to a project after a few weeks/months.
  • There was no record of label/name updates.
  • Excel subtly changes chart sizes when you save/load a sheet.

Most of these issues are different forms of excess user-managed state.

*While spreadsheets can be version-controlled like any other file, no merge algorithm can resolve conflicts in their binary format, so conflict resolution falls entirely on the user.

My workflow

TLDR: Code → data.json → duckdb → ggplot2 → plots/<name>.pdf → LaTeX
(where duckdb and ggplot2 are one-liners in an R Jupyter Notebook cell)

Evaluation Code: Each benchmark prints its metrics as individual lines of JSON (i.e., newline-delimited JSON, or NDJSON), which can be redirected to a file. For example:

import time
import json

def fibonacci_recursive(n):
    # Naive recursion: exponential time in n.
    if n <= 1:
        return n
    return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)

def fibonacci_iterative(n):
    # Linear-time iteration.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

n = 12

# perf_counter() is the recommended clock for timing short code sections.
start_time = time.perf_counter()
_ = fibonacci_recursive(n)
rec_time = time.perf_counter() - start_time
print(json.dumps({"fn": "fib-recursive", "time": rec_time}))

start_time = time.perf_counter()
_ = fibonacci_iterative(n)
iter_time = time.perf_counter() - start_time
print(json.dumps({"fn": "fib-iterative", "time": iter_time}))

$ python my_benchmark.py >> data.json
$ cat data.json
{"fn": "fib-recursive", "time": 3.695487976074219e-05}
{"fn": "fib-iterative", "time": 3.337860107421875e-06}

If there are multiple benchmarks to run, I make a script that clears data.json and then runs each benchmark, appending its results to the file.
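
A minimal sketch of such a runner, in bash (the second benchmark script is a hypothetical placeholder):

#!/usr/bin/env bash
# Rebuild data.json from scratch on every run.
set -euo pipefail

: > data.json                               # truncate previous results
python my_benchmark.py       >> data.json   # append this run's NDJSON lines
python my_other_benchmark.py >> data.json   # hypothetical second benchmark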

Loading, Preprocessing, and Plotting Data: I then load and preprocess the JSON on disk with DuckDB. I do this in a Jupyter Notebook running an R kernel. While I generally prefer Python, ggplot2 is better than any plotting library Python offers.

library(duckdb)
con <- dbConnect(duckdb())
dbExecute(con, "INSTALL json; LOAD json;")

q = "
SELECT
    CASE 
    WHEN json->>'fn' = 'fib-recursive' THEN 'Recursive'
    WHEN json->>'fn' = 'fib-iterative' THEN 'Iterative'
    ELSE json->>'fn'
    END AS fn,
    CAST(json->>'time' AS DOUBLE) * 1000000 AS time
FROM read_ndjson_objects('data.json')
"
df = dbGetQuery(con, q)
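
For the sample run above, the resulting dataframe should look roughly like this (labels remapped, times converted to microseconds):

head(df)

         fn     time
1 Recursive 36.95488
2 Iterative  3.33786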

Now, the dataframe can be plotted directly and saved as a PDF at the same time:

library(ggplot2)

p <- ggplot(df, aes(x = fn, y = time)) +
    geom_col(fill = "steelblue") +
    labs(title = "Execution Time of Fibonacci Functions",
         x = "Function Type",
         y = "Time (microseconds)")
ggsave("plots/fibonacci.pdf", p, device = "pdf", width = 4.5, height = 3)
p

Each SQL/plot pair gets its own cell. And, for convenience, the notebook’s first cell loads the libraries, opens the DuckDB connection, and clears the plots directory:

library(ggplot2)
library(duckdb)

con <- dbConnect(duckdb())
dbExecute(con, "INSTALL json; LOAD json;")

if (dir.exists("plots")) {
    unlink("plots", recursive = TRUE)
}
dir.create("plots")

This plots directory can then be copied directly into the LaTeX project (or Overleaf).
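
From there, each figure is included with a standard \includegraphics (the caption and label here are placeholders):

\begin{figure}
    \centering
    \includegraphics{plots/fibonacci.pdf}
    \caption{Execution times of the two Fibonacci implementations.}
    \label{fig:fibonacci}
\end{figure}

Because ggsave already fixed the physical size (4.5 × 3 inches), there is no need to rescale with a width argument, which keeps font sizes consistent across figures.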

Conclusion: The notebook can now be version-controlled alongside the code, and the data lives in a single, human-readable file that is easy to back up. Additionally, after a change to the data, the preprocessing, or the plot code, every figure can be regenerated in about 30 seconds.