Everything you wanted to know about R & Quarto

But were too afraid to ask

Oct 17, 2024

Hi, I’m Rhian 👋

A profile photo of Rhian.

Logo for The Strategy Unit

  • Senior Data Scientist at the Strategy Unit
  • Expert in R training & consultancy
  • Statistics PhD
  • Statistical Ambassador for the RSS

What & Why R?

What questions do you have? 🤔

What is R?

  • A programming language
  • About 30 years old
  • Originally specialised in statistics but now very broad
  • Cross-platform
  • Used by many organisations

The logo for R programming, a large blue letter R

Why use R?

  • Free & open-source
  • Diverse ecosystem of packages
  • Write your own functions
  • Excellent support for statistical modelling
  • Professional graphics
  • Students learn R
  • Friendly community 💜

Reproducible Analytical Pipelines ⭐

It will be built using open source software for data management, analysis and visualisation (such as R or python) as this is standard, portable, and available to all for checking and re-use.

Goldacre Review 2022

Packages 📦

Can it work with my data?

  • Yes!

How do I get started? 🚀

  1. Install R

  2. Install RStudio

You can also try it out in Posit Cloud

Let’s go! 🚀

Length of Stay data

Load in data

Watch me 👀

Your turn 💻

Change the code below and click Run

Important

Capitalisation matters!
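A minimal sketch of loading data and of why capitalisation matters. The data frame, its column names (Area, Sex, LOS) and its values are hypothetical stand-ins for the length of stay data used in the session, written to a temporary file so the example runs on its own.

```r
# Hypothetical stand-in for the length of stay data (values are made up).
example_data <- data.frame(
  Area = c("Midsomer", "Midsomer", "Borsetshire", "Borsetshire"),
  Sex  = c("Female", "Male", "Female", "Male"),
  LOS  = c(3, 5, 2, 8)
)

# Write to a temporary CSV, then load it back as you would a real file.
path <- tempfile(fileext = ".csv")
write.csv(example_data, path, row.names = FALSE)
los_data <- read.csv(path)

head(los_data)

# Names are case-sensitive: the column LOS exists, a column called los does not.
"LOS" %in% names(los_data)
"los" %in% names(los_data)
```

With a real file, only the path passed to read.csv() would change.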

Watch me 👀

  • Keep just the Midsomer data

Note

Notice the double ==, which tests whether two values are equal. A single = assigns a value.
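A sketch of the filtering step, assuming the dplyr package is installed. The data frame below is a hypothetical stand-in for the session's data.

```r
library(dplyr)

# Hypothetical stand-in for the length of stay data (values are made up).
los_data <- data.frame(
  Area = c("Midsomer", "Midsomer", "Borsetshire", "Borsetshire"),
  Sex  = c("Female", "Male", "Female", "Male"),
  LOS  = c(3, 5, 2, 8)
)

# Keep only the rows where Area is exactly "Midsomer".
# == compares values; the text in quotes must match the data's capitalisation.
midsomer <- los_data |>
  filter(Area == "Midsomer")

midsomer
```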

Your turn 💻

  • Change the code to keep only the rows where Sex is Female

Watch me 👀

  • What is the mean length of stay for each sex?

Your turn 💻

  • Calculate the median() LOS for each Area

Note

Feel free to try grouping by other variables, or using other summarisation functions e.g. mean(), min(), max(), sd()
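The grouped summary from the demonstration might look something like this. It assumes dplyr is installed and uses a hypothetical data frame in place of the session's data.

```r
library(dplyr)

# Hypothetical stand-in for the length of stay data (values are made up).
los_data <- data.frame(
  Area = c("Midsomer", "Midsomer", "Borsetshire", "Borsetshire"),
  Sex  = c("Female", "Male", "Female", "Male"),
  LOS  = c(3, 5, 2, 8)
)

# Split the rows by Sex, then calculate one mean per group.
mean_by_sex <- los_data |>
  group_by(Sex) |>
  summarise(mean_los = mean(LOS))

mean_by_sex
```

Swapping the column inside group_by() or the function inside summarise() changes what is summarised and how.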

Making pretty graphics

Professional quality graphics

Six different graphics made by the BBC graphics team. There is a variety of different graph types, including histograms, area plots, maps and dot-plots. All are styled in the BBC branding.

Professional quality graphics

Tweet from jburnmurdoch

Watch me 👀

Your turn 💻

  1. Change the colour of the points to "blue"
  2. Add the line geom_smooth(method = "lm"). You’ll need a + at the end of line 4.
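One way the finished plot could look, assuming ggplot2 is installed. The data frame and the choice of Age and LOS as the plotted columns are assumptions for illustration, not the plot shown in the session.

```r
library(ggplot2)

# Hypothetical data for illustration only (values are made up).
plot_data <- data.frame(
  Age = c(25, 40, 55, 70, 85),
  LOS = c(2, 3, 5, 6, 9)
)

# Layers are joined with +, which must sit at the end of a line.
p <- ggplot(plot_data, aes(x = Age, y = LOS)) +
  geom_point(colour = "blue") +   # blue points
  geom_smooth(method = "lm")      # straight-line fit with a confidence band

p
```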

We can make maps too

Shiny apps

Hello Quarto 👋

What is Quarto?

The Quarto logo: A circle cut into four quadrants next to the word quarto

An open-source scientific and technical publishing system

A schematic representing the multi-language input (e.g. Python, R, Observable, Julia) and multi-format output (e.g. PDF, html, Word documents, and more) versatility of Quarto.

Artwork from “Hello, Quarto” keynote by Julia Lowndes and Mine Çetinkaya-Rundel, presented at RStudio Conference 2022. Illustrated by Allison Horst.

Automating reports 📋

  • The data was wrong - can you re-run the analysis?
  • Your manager loves it - can you do it for another region?
  • Next month - do it all again.
  • There must be a better way

Example report

statsrhian.github.io/example-quarto-report/example-report.html

Template-based reports

---
title: "NHS Workforce Statistics for `r params$ics_name`"
subtitle: "Data for `r params$month_year`"
author: "Maria Garcia"
date: "2023-09-28"
params:
  ics_name: "North East and North Cumbria"
  month_year: "April 2023"
---

Interweave text and code 🧶

Read in the data

filename = glue("NHS Workforce Statistics, {params$month_year} England and Organisation.xlsx")

Clean the data

staff_group = 
  staff_group |>
  filter(`ICS name` == params$ics_name) |>
  select(`Organisation name`, `Total`,
         `HCHS Doctors`, `Nurses & health visitors`,
         `Midwives`, `Ambulance staff`) 

Add insight

The table below shows the total number of doctors and nurses for each organisation within `r params$ics_name`. We can see that the organisation with the most midwives is the `r pull(max_midwives, "Organisation name")` with `r round(max_midwives$Midwives)` staff.
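The inline code above relies on an object called max_midwives, which is not defined on these slides. A sketch of one way it could be created, assuming dplyr is installed and using hypothetical organisation names and staff numbers:

```r
library(dplyr)

# Hypothetical cleaned data (names and numbers are made up).
# check.names = FALSE keeps the space in "Organisation name".
staff_group <- data.frame(
  `Organisation name` = c("Trust A", "Trust B", "Trust C"),
  Midwives = c(120.4, 310.7, 95.2),
  check.names = FALSE
)

# Keep the single row with the largest number of midwives.
max_midwives <- staff_group |>
  slice_max(Midwives, n = 1, with_ties = FALSE)

# The two values used in the inline text.
pull(max_midwives, "Organisation name")
round(max_midwives$Midwives)
```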

Re-run next month

---
title: "NHS Workforce Statistics for `r params$ics_name`"
subtitle: "Data for `r params$month_year`"
author: "Maria Garcia"
date: "2023-09-28"
params:
  ics_name: "North East and North Cumbria"
  month_year: "May 2023"
---

Report a different region

---
title: "NHS Workforce Statistics for `r params$ics_name`"
subtitle: "Data for `r params$month_year`"
author: "Maria Garcia"
date: "2023-09-28"
params:
  ics_name: "Lancashire and South Cumbria"
  month_year: "May 2023"
---
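Rather than editing the params by hand each time, the report can be rendered from R with the parameters passed in. This is a sketch only: it assumes the quarto R package and the Quarto command line tool are installed, and the report file name and output file names are hypothetical.

```r
# Hypothetical file name for the parameterised report.
report <- "workforce-report.qmd"

# One row per report to produce.
jobs <- data.frame(
  ics_name   = c("North East and North Cumbria", "Lancashire and South Cumbria"),
  month_year = "May 2023"
)

# Build a file-safe output name from each region name.
jobs$output_file <- paste0(gsub(" ", "-", tolower(jobs$ics_name)), ".html")

for (i in seq_len(nrow(jobs))) {
  # Only render if the report exists, so the sketch runs without it.
  if (file.exists(report)) {
    quarto::quarto_render(
      input = report,
      output_file = jobs$output_file[i],
      execute_params = list(
        ics_name   = jobs$ics_name[i],
        month_year = jobs$month_year[i]
      )
    )
  }
}
```

Values supplied through execute_params replace the defaults in the params section of the document header for that render.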

Using R in your organisation

Enterprise tools 🏢

  • Posit Workbench
  • Posit Package Manager
  • Posit Connect

Working with Posit

Posit Partners

The logo for Posit

Is R safe? ✅

The logo for testthat
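A small example of what a test written with testthat can look like, assuming the package is installed. The function mean_los() is hypothetical and exists only to give the test something to check.

```r
library(testthat)

# Hypothetical helper: mean length of stay, ignoring missing values.
mean_los <- function(los) {
  mean(los, na.rm = TRUE)
}

# The test fails with an error if the function stops behaving as expected.
test_that("mean_los ignores missing values", {
  expect_equal(mean_los(c(2, 4, NA)), 3)
})
```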

Want to learn more? 📚