Taking the stress out of your code mess

Rhian Davies | @statsRhian

About Me 👋

Cartoon of a woman holding out a book

About Jumping Rivers

  • Data science & machine learning
  • Training courses
  • Dashboard development and deployment
  • Infrastructure
  • Managed Posit services

Cartoon of three people working at computers

I’m going to tell you a story

Meet Jane

  • Environmental scientist
  • Specialises in carbon models
  • Comfortable using R in an academic setting

A cartoon robot holding a test tube and wearing a lab coat

Jane was frustrated

  • Inherited a pile of messy R code
  • Responsible for getting it to work, fast
  • Very deeply nested
  • Matryoshka doll code

Seven traditional wooden Russian dolls, going from largest on the left to smallest on the right.

The solution

  • A series of 1:1 bespoke coding sessions
  • Unnesting the code, one doll at a time

A cartoon robot holding a test tube and wearing a lab coat

The messy zone

Tidy up bit-by-bit

How can we rewrite functions without breaking the code base?

  1. Start with the inner functions
  2. Re-write the main body
  3. Clearly define the messy zones

Jumping Rivers robot using a vacuum cleaner on a pile of code and text.

Messy zone example

example = function(arg1, arg2) {
  # Messy zone: unpack the nested inputs into well-named variables
  
  a_better_name = arg1$mess$ugh
  helpful_name = arg2$what$is$this
  
  # Refactor the internals
  
  useful_result = a_better_name + helpful_name
  sensible_name_tibble = a_better_name * helpful_name
  
  # Messy zone: repack the results into the structure the callers expect
  results = list()
  results$some$mess = useful_result
  results$another$naff$list = sensible_name_tibble
  results
}

Push the mess up

Once the inner functions are clean

  1. List all the arguments of the inner functions
  2. List all the return values of the inner functions
  3. Double check names
  4. Clear the messy zone in one go
  5. Move up one level
  6. Repeat
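
A rough sketch of what moving the mess up one level might look like. The names `combine_inputs()` and `outer_example()` are hypothetical and only extend the messy zone example; the nested input and output shapes are assumed to match it.

```r
# Hypothetical cleaned inner function: plain inputs, plain named outputs.
combine_inputs = function(a_better_name, helpful_name) {
  list(
    useful_result = a_better_name + helpful_name,
    sensible_name_tibble = a_better_name * helpful_name
  )
}

# The caller, one level up, now owns the messy zones.
outer_example = function(arg1, arg2) {
  # Messy zone: unpack the legacy nested inputs
  a_better_name = arg1$mess$ugh
  helpful_name = arg2$what$is$this

  # Clean internals
  combined = combine_inputs(a_better_name, helpful_name)

  # Messy zone: repack into the legacy structure the rest of the code expects
  results = list()
  results$some$mess = combined$useful_result
  results$another$naff$list = combined$sensible_name_tibble
  results
}

# Made-up inputs, just to exercise the function
arg1 = list(mess = list(ugh = 2))
arg2 = list(what = list(is = list(this = 3)))
results = outer_example(arg1, arg2)
```

The inner function no longer knows anything about the nested lists, so the same step can be repeated on the caller once everything it calls is clean.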

Salt N Pepa

Benefits

  • Being explicit about where the mess lives let us focus on simplifying the internals
  • Higher level code functioned as expected
  • Didn’t have to commit to the structure of the function parameters upfront
  • Clear markers of where we would need to tidy later

Quick tips

Start with a blank slate

  • Avoid the temptation to copy-paste

  • Can tie you to the old style

Cartoon of two engineers.

Take time to design

  • One session we drew diagrams - no code

  • The most valuable session

  • We often referred back to the diagrams to remind ourselves of the design choices

Cartoon of artist next to an easel.

My favourite tools

Miro logo

draw.io logo

Excalidraw logo

A good name goes a long way

  • Just renaming arguments can make functions much clearer

library(dplyr)

# Before: abbreviated names hide what the code does
defac = function(x) {
  as.numeric(as.character(x))
}
crm_summ = crimtab %>%
  as.data.frame() %>%
  mutate(
    fng = defac(Var1),
    ht = defac(Var2),
    n = Freq
  ) %>%
  select(fng, ht, n)

A good name goes a long way

  • Just renaming arguments can make functions much clearer

library(dplyr)

# After: same logic, descriptive names
extract_numeric_from_level = function(x) {
  as.numeric(as.character(x))
}

criminal_summary = crimtab %>%
  as.data.frame() %>%
  mutate(
    finger_length = extract_numeric_from_level(Var1),
    height = extract_numeric_from_level(Var2),
    count = Freq
  ) %>%
  select(finger_length, height, count)

Test regularly

  • Things will go wrong

  • You want to know as soon as possible

  • Ensure that your code is always runnable

  • Numerical results should be unaffected by the refactor

  • Run unit tests with {testthat} and GitHub Actions CI/CD
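
One way to check that numerical results survive a refactor is to keep the old function around and compare it with the new one. This is a minimal sketch using the two helpers from the renaming example; the test name and the factor values are made up for illustration.

```r
library(testthat)

# Old and new versions of the same helper, taken from the renaming example.
defac = function(x) {
  as.numeric(as.character(x))
}
extract_numeric_from_level = function(x) {
  as.numeric(as.character(x))
}

test_that("refactored helper matches the original numerically", {
  # Illustrative factor whose levels are numbers stored as text
  finger_levels = factor(c("9.4", "9.5", "10.1"))

  # The new function should agree with the old one...
  expect_equal(
    extract_numeric_from_level(finger_levels),
    defac(finger_levels)
  )
  # ...and with the values we expect by hand.
  expect_equal(extract_numeric_from_level(finger_levels), c(9.4, 9.5, 10.1))
})
```

Once the old function is deleted, the second expectation still guards the behaviour.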

Cartoon robot at a laptop.

Why rather than How

  • The inherited code was “How” programming

  • We regrouped the functions based on the science rather than the programming

  • Changing focus made it much clearer to follow

Cartoon of four people sat around a table, with laptops. One of them is pointing at a projector screen with the python logo on it.

Do it with a friend

  • Refactoring can be daunting
  • Hard to hold the overall design and small technical details simultaneously
  • The person helping you doesn’t have to understand the details
  • Learn from each other
  • It’s fun

Cartoon of three people with speech bubbles.

Recap

Top tips

  1. Define the messy zone

  2. Push the mess up

  3. Start with a blank slate

  4. Take time to design

  5. A good name goes a long way

  6. Test regularly

  7. Why rather than How

  8. Do it with a friend

Cartoon of robot with a clipboard.

Questions?

StatsRhian

jumpingrivers.com

shiny-in-production.jumpingrivers.com