Reproducible Analytical Pipelines

My personal hot takes 🔥

2026-05-21

🏛️ My RAP Pillars

  • 🔍 Visible: Can I find it?

  • 🔁 Reusable: Can I adapt it?

  • 🛡️ Reliable: Can I trust it?

🔍 Visible

  • Public and open 🔓
  • Discoverable 🌐
  • Clear README & examples 📖
  • LICENSE included 📃

🔁 Reusable

  • Open-source languages 🔓
  • Well documented 📚
  • Consistent code style 💅
  • Modular 🧩
  • Configurable ⚙️
  • Easy to install 📦

🛡️ Reliable

  • Tested 🧪
  • Automated checks 🤖
  • Peer reviewed 👩‍💻
  • Reproducible environments 🌱

Two animated characters from Toy Story, Woody and Buzz Lightyear, stand together. Buzz gestures widely. Text reads 'Hot Takes Hot Takes Everywhere.'

Commented code

≠ good code

library(dplyr) # load dplyr package

# function to turn factor into numeric 
# this is needed because the variables are factors
# and we want numeric values instead
f1 <- function(x) {
  y <- as.character(x)  # convert factor to character
  as.numeric(y)  # convert character to numeric
}

d1 <- as.data.frame(crimtab) # turn table into dataframe
d2 <- d1 |> # create new dataset
  mutate(# make new variables
    fg = f1(Var1), # convert Var1 into numeric this represents finger length
    ht = f1(Var2),     # convert Var2 into numeric this represents height
    n = Freq # rename frequency column this is the count
  ) |>
  select(fg, ht, n) # keep only required columns

library(dplyr)

extract_numeric_from_factor <- function(factor) {
  as.numeric(as.character(factor))
}

criminal_summary <- crimtab |>
  as.data.frame() |>
  mutate(
    finger_length = extract_numeric_from_factor(Var1),
    height = extract_numeric_from_factor(Var2),
    count = Freq
  ) |>
  select(finger_length, height, count)
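
Why the helper goes through as.character() at all is easy to miss, so here is a minimal base R sketch of the pitfall. The `heights` vector is made-up toy data, not taken from crimtab, and the helper is redefined here only so the snippet runs on its own.

```r
# Hypothetical toy data: a factor whose labels happen to be numbers.
heights <- factor(c("142.24", "144.78", "142.24"))

# Pitfall: as.numeric() on a factor returns the internal level codes,
# not the numbers written in the labels.
level_codes <- as.numeric(heights)

# Converting to character first recovers the values the labels spell out.
extract_numeric_from_factor <- function(x) {
  as.numeric(as.character(x))
}
values <- extract_numeric_from_factor(heights)

print(level_codes)  # 1 2 1
print(values)       # 142.24 144.78 142.24
```

A descriptive function name carries this intent without a comment on every line, which is the point of the rewrite above.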

Reusable ♻️

not reproducible

Open first 🔓

You don’t need friends 👩‍💻

to use Git

can build an R package 📦

Myths ❌

It doesn’t have to…

  • have hundreds of functions
  • be useful for anyone else
  • do anything fancy
  • be on CRAN
  • be scary
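
To show how small a package can be, here is a rough sketch that wraps a single function using only base R. The package name "tinyrap" and the temporary directory are assumptions for illustration; many people would reach for usethis or devtools instead, which this sketch deliberately avoids so it has no extra dependencies.

```r
# A hypothetical one-function package; the name "tinyrap" is made up.
extract_numeric_from_factor <- function(x) {
  as.numeric(as.character(x))
}

# Work in a throwaway directory so nothing is written to the project.
parent_dir <- tempfile("rap-demo-")
dir.create(parent_dir)

# package.skeleton() ships with base R (utils). It writes a DESCRIPTION,
# a NAMESPACE, and R/ and man/ folders for the objects named in `list`.
utils::package.skeleton(
  name = "tinyrap",
  list = "extract_numeric_from_factor",
  environment = environment(),
  path = parent_dir
)

package_files <- list.files(file.path(parent_dir, "tinyrap"))
print(package_files)
```

The generated DESCRIPTION and help page are placeholders that still need editing by hand, but the result is already a package with one function, nothing fancy, and nowhere near CRAN.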

(spreadsheets). Make QA work for you.

Drake Nah-Yeah meme. Nah is a colourful spreadsheet, Yeah is a number of QA tasks in GitHub.

It doesn’t have to be

perfect ✨

How do I get support? 💙

A cartoon person standing proudly next to a screen