How your code might get rusty,

And what you can do about this

Maëlle Salmon (rOpenSci, cynkra)

https://user-maelle.netlify.app

Old, really degraded abandoned boats near the sea. Picture by Jean-Claude Salmon.

Maëlle pointing at old, really degraded abandoned boats near the sea. Picture by Jean-Claude Salmon.

Nice sailboat. Picture by Jean-Claude Salmon.

Big ship at sea, looks a bit rusty on the inside. https://www.pexels.com/photo/old-cargo-ship-on-sea-20703978/

Positive affirmations

Scrabble letters spelling the word ‘lousy’ https://www.pexels.com/photo/the-word-louise-is-spelled-out-in-scrabble-letters-19835557/

What is rusty code?

No one wants to approach it. 😉

  • Something isn’t working anymore.

  • Something isn’t readable anymore.

  • Something doesn’t follow standards anymore.

Code that used to be ok for a different context and less complicated features.

Old phone https://www.pexels.com/photo/gray-rotary-telephone-on-brown-surface-209695/

Characteristics of rusty code

  • At a small scale: patterns, code style, etc.

  • At a larger scale: design (classes, functions), script (and test files) organization, dependencies, docs or lack thereof, etc.

igraph, started by Gábor Csárdi
> 18 years ago.

Close Up Photography of Yellow Green Red and Brown Plastic Cones on White Lined Surface https://www.pexels.com/photo/close-up-photography-of-yellow-green-red-and-brown-plastic-cones-on-white-lined-surface-163064/

Funding from the
R Consortium ISC 🙏

“Paying off igraph’s tech debt”

“Setting igraph for success in the next decade”

Who wrote this 💩?

Philip Heltweg’s blog post

“Since then, I remind myself that legacy software is written by people like me. Torben had constraints I knew nothing about. He worried about deadlines long past. Torben had learned a lot since then. Development workflows had evolved, infrastructure had gotten better.”

https://www.heltweg.org/posts/who-wrote-this-shit/

Bad code?

Hadley Wickham’s toot

Periodic reminder: The only way to write good code is to write tons of shitty code first. Feeling shame about bad code stops you from getting to good code.

https://fosstodon.org/@hadleywickham/112021309035884210

Legacy technology

Marianne Bellotti’s book “Kill it with fire: Manage Aging Computer Systems (and Future Proof Modern Ones)”

Legacy technology exists only if it is successful. These old programs are perhaps less efficient than they were before, but technology that isn’t used doesn’t survive decades.

https://nostarch.com/kill-it-fire

Let’s refactor a codebase!

Construction Equipment and Tools Plastic Toys https://www.pexels.com/photo/construction-equipment-and-tools-plastic-toys-4492351/

What to refactor?

Write down your frustration in your issue tracker.

Some ideas

Set up: have tests

If no tests, add them first! Test inspiration:

  • code itself,
  • docs especially examples,
  • experts,
  • code coverage.
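As a hedged illustration of using the code itself as test inspiration, here is a minimal {testthat} sketch that pins down what a function currently returns before any refactoring. The helper `is_boat()` and its inputs are hypothetical, made up for this example.

```r
library(testthat)

# Hypothetical legacy helper that we want to refactor later.
is_boat <- function(x) {
  grepl("boat$", x)
}

# Pin down the current behaviour, including the missing-value case,
# so that a refactoring that changes it makes the test fail.
test_that("is_boat() flags strings ending in 'boat'", {
  expect_equal(is_boat(c("sailboat", "ferry")), c(TRUE, FALSE))
  expect_false(is_boat(NA_character_))
})
```

With a test like this in place, a later rewrite of the helper can be checked against the old behaviour.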

codecov.io report via {covr} on continuous integration

usethis::use_github_action("test-coverage")

codecov.io interface showing the coverage of the whole package, with one of those fancy round diagrams that one can hover over or click on to see which folder or file each color, from green to red, corresponds to

codecov.io interface for a single file, where lines covered by tests are highlighted in green, and lines not covered by tests have a red mark on the left

Where to start?!

To get my mojo on, I find the simplest infelicity to fix, I fix it. Then I do it again.

GeePaw Hill in “Refactoring Pro-Tip: Easiest Nearest Owwie First”

Idea 1: Nearest, easiest owwie first

  • A way to get going!

  • Increase codebase quality

  • Increase understanding of codebase

Opportunity to explore many scripts: fix the owwie everywhere.

Examples of “owwies”

Idea 2: {lintr} on the whole codebase

{lintr} itself

reference index

lintr::lint()

bla.R
x <- c("sailboat", "houseboat", "ferry")
grepl("boat$", x)

lintr::lint(
  "bla.R", 
  linters = lintr::linters_with_tags("readability")
)

Line 2 [string_boundary_linter] Use !is.na(x) & endsWith(x, string) to detect a fixed terminal substring, or, if missingness is not a concern, just endsWith. Doing so is more readable and more efficient.

bla.R
x <- c("sailboat", "houseboat", "ferry")
endsWith(x, "boat")

(R 3.3.0 gem!)
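The lint message mentions missingness; a small base R sketch of why that matters, using the same vector with one added missing value (the `NA` is an addition for illustration):

```r
x <- c("sailboat", "houseboat", "ferry", NA)

# grepl() treats a missing value as "no match".
with_regex <- grepl("boat$", x)

# endsWith() propagates the missing value instead.
with_ends <- endsWith(x, "boat")

# The guarded form suggested by the lint message matches grepl() exactly.
with_guard <- !is.na(x) & endsWith(x, "boat")

print(with_regex)
print(with_ends)
print(identical(with_regex, with_guard))
```

So the plain `endsWith()` swap is only a drop-in replacement when the input has no missing values.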

{lintr}

🧰 Can be configured

.lintr.R
linters <- list(
  lintr::undesirable_function_linter(
    fun = c(
      # Base messaging
      "message" = "use cli::cli_inform()",
      "warning" = "use cli::cli_warn()",
      "stop" = "use cli::cli_abort()"
    )
  )
)

🧰 Can be run on continuous integration.
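To see what such a configured linter reports, it can be tried on a snippet passed as text. This is a sketch only: the snippet being linted is hypothetical, and the linter is reduced to a single entry.

```r
library(lintr)

# Same kind of linter as in the configuration, reduced to one entry.
no_stop <- undesirable_function_linter(
  fun = c("stop" = "use cli::cli_abort()")
)

# Hypothetical snippet to check, passed as text rather than as a file.
lints <- lint(
  text = 'check <- function(x) if (x < 0) stop("negative")',
  linters = no_stop
)

print(lints)
```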

Idea 3: automate the refactoring

Worth it if

  • a big codebase,
  • repeated efforts,
  • need to be able to regenerate the changes at a later point (other, higher-priority PRs waiting).

expect_that(x, equals(1)) to expect_equal(x, 1)

  • Code to XML with {xmlparsedata};
  • Edits in XML with {xml2} + XPath;
  • Amend edited lines only using line1, line2 attributes in XML.
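A minimal sketch of the first two steps above, limited to locating the lines that would need amending. It assumes {xmlparsedata} and {xml2} are installed; the test code being parsed is hypothetical, and this is not the full workflow from the linked post.

```r
library(xmlparsedata)
library(xml2)

# Hypothetical test code containing the old-style expectation.
code <- c(
  "expect_that(x, equals(1))",
  "y <- 2",
  "expect_that(y, equals(2))"
)

# Code to XML: parse with source references, then convert the parse data.
parsed <- parse(text = code, keep.source = TRUE)
xml <- read_xml(xml_parse_data(parsed))

# XPath: find every call to expect_that().
calls <- xml_find_all(xml, "//SYMBOL_FUNCTION_CALL[text() = 'expect_that']")

# The line1 attribute tells us which source lines would need amending.
lines_to_amend <- as.integer(xml_attr(calls, "line1"))
print(lines_to_amend)
```

Only the lines listed in `lines_to_amend` would then be rewritten, leaving the rest of the file untouched.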

https://masalmon.eu/2024/05/15/refactoring-xml/

Idea 4: discuss design ideas with collaborators

Or experiment for a bit.

Patience

Colorful puzzle pieces with scrabble tiles spelling the word ‘patience’ https://www.pexels.com/photo/colorful-puzzle-pieces-with-scrabble-tiles-9227484/

Idea 5: a little bit at a time

When working on a bug fix or feature…

  • “tech debt”-labelled or “upkeep”-labelled issues

  • Refactoring pull requests / commits

Less, easier refactoring

A plastic toolbox with toy tools https://www.pexels.com/photo/plastic-carpenter-tools-in-a-toolbox-toys-4492368/

How to have less need for refactoring?

John Ousterhout’s book “A Philosophy of Software Design”

Two approaches to programming:

🐰 Tactical programming;

🐢 Strategic programming.

Word ‘dream’ on greenery https://www.pexels.com/photo/dream-text-on-green-leaves-1535907/

Training! Sharing!

Letters in felt https://www.pexels.com/photo/assorted-color-alphabet-1337385/

Kind code review

Add your comment here, be kind…

Pull Request reviews

  • Improve code;
  • Improve knowledge of code (both ways!);
  • Improve knowledge of programming (both ways!).

Kind pull request reviews

👀 Tidyteam code review principles by Davis Vaughan.

👀 The Code Review Anxiety Workbook by Carol Lee and Kristen Foster-Marks.

Kind package reviews

rOpenSci software peer-review!

Easier future refactoring:

Git history

A gray room with shelves full of brown drawers https://www.pexels.com/photo/a-gray-room-with-shelves-full-of-brown-drawers-6549926/

Without Git

Day 1: script.R

Day 2: Add statistical stuff, script2.R

Day 42: Add plot, script-final-2024-06-03.R (40th copy of the file 🤪)

With Git

different versions of a script with 'git commit' arrows between them

With Git: what you get

script.R and, beside it, the .git folder containing the snapshots

Git is handy

  • Less loss of work;

  • Experiments in branches;

  • History to use (locally and on platforms like GitHub).

Git resources for beginners

Easier future refactoring: Git history

Small commits with informative messages

My blog post Why you need small, informative Git commits
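A hedged sketch of what two small commits with informative messages can look like, driven from R with {gert} in a throwaway repository. The author identity, file content and first commit message are made up for illustration.

```r
library(gert)

# Throwaway repository in a temporary directory.
repo <- tempfile("rusty-demo-")
dir.create(repo)
git_init(repo)

# Hypothetical author, passed explicitly so no global Git config is needed.
me <- git_signature("Jane Doe", "jane@example.com")

# Commit 1: one small change, one message saying what it does.
writeLines("x <- c(3, 4)", file.path(repo, "script.R"))
git_add("script.R", repo = repo)
git_commit("feat: add input vector", author = me, repo = repo)

# Commit 2: another small change with its own explanation.
writeLines(c("x <- c(3, 4)", "x <- x - 1"), file.path(repo, "script.R"))
git_add("script.R", repo = repo)
git_commit("fix: adapt code to tool's 0-indexing", author = me, repo = repo)

# The history now explains each change on its own.
history <- git_log(repo = repo)
print(history[, c("author", "message")])
```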

A mysterious line of code

a script with a mysterious line 'x <- x - 1'

Git blame

simplified diagram of Git blame: for each line in a script on the left we see who added it, when, with what commit message.

Git blame: click on the commit…

“Commit a bunch of files before lunch 🍝”

Showing 145 changed files with 2,624 additions and 2,209 deletions.

Git blame: click on the commit…

“fix: adapt code to tool’s 0-indexing”

Showing 2 changed files with 3 additions and 2 deletions.

Git history

Your repository’s Git history should be like your Instagram profile grid.

How to get a nice(r) Git history

Another dimension to your work.

My post: The two phases of commits in a Git branch

Work in branches

Pink blossoming branches https://www.pexels.com/photo/pink-blossoming-branches-20699831/

Hack your way to a good Git history

  • “The repeated amend”™️: git commit --amend

  • “Squash and merge”: click the right GitHub/GitLab button

  • “Start from scratch”: git reset --soft + git add (--patch)

  • “Mix and match your commits”: git rebase -i

My post with details: Hack your way to a good Git history

Get better at Git

A laptop and a book on a beige leather sofa https://www.pexels.com/photo/a-laptop-and-a-book-on-beige-leather-sofa-6372812/

Get better at Git with {saperlipopette}

12 functions creating Git sandboxes with a small challenge to solve.

Inspired by Oh shit, Git! + my experience

Exercises to be solved however you want (command line! RStudio Git tab! GitKraken! GitHub Desktop!)

Practice Git history cleaning with {saperlipopette}

  • Put a change on the right branch (git cherry-pick + git reset),

  • Rewrite history in a branch (git rebase -i),

  • Split changes into several commits (git add --patch).

Practice Git history usage with {saperlipopette}

  • Remove untracked, useless files (git clean),

  • Find which commit introduced a bug (git bisect),

  • Undo a commit (git revert).

This was all easy

A red plastic hard hat toy and a yellow bucket on a wooden crate https://www.pexels.com/photo/a-red-plastic-hard-hat-toy-and-a-yellow-bucket-on-a-wooden-crate-4492347/

Now what if no one is there to maintain the code?

Brown bare tree on dry brown ground https://www.pexels.com/photo/photo-of-brown-bare-tree-on-brown-surface-during-daytime-60013/

No Maintenance Intended

Especially relevant for open-source projects.

What does it mean to maintain software?

  • ownership of the scope, code and community;

  • responsiveness to external requests;

  • regular housekeeping.

Blog post What Does It Mean to Maintain a Package?

Why maintain software?

  • pocket/wallet payment;

  • payment for the heart;

  • payment for the brain.

Yanina Bellini Saibene Three currencies of payment for our work

If the balance feels off, consider your needs. It might be time to try and recruit co-maintainers or join a community of other developers, or even to find a new maintainer or retire the package.

Blog post What Does It Mean to Maintain a Package?

rOpenSci community
of package maintainers

Never hesitate to ask for help.

Package maintainer cheatsheet

Publicity (blog, social media posts)

Technical infrastructure (R-universe) and content (dev guide, blog, Package Development Corner)

Social (Slack workspace, coworking, community call)

Succession

rOpenSci community of package maintainers

How to join?

  • Submit a package to software review: “Do you expect to maintain your package for at least 2 years, or to be able to identify a new maintainer?”

  • Take over a package

  • Contribute significantly to a package

  • Other paths of participation

rOpenSci resources for everyone

Explicit statement of maintenance status and needs

Make succession easier

  • Have a good codebase 😉

  • Foster a welcoming atmosphere for contributors. 2021 community call

  • Regularly assess your maintenance responsibilities. rOpenSci yearly package maintainer survey.

Support maintainers
as a community

  • Be gracious with maintainers.

  • Provide support as relevant/possible, be it technical, social or monetary.

Conclusion

Plastic construction toys https://www.pexels.com/photo/handyman-tools-plastic-toys-4492348/

Your code could get rusty but it does not have to.

Refactoring and succession are facts of programming life, as are bug fixes, but there are ways to limit the damage.

https://user-maelle.netlify.app

Thank you and thank you to:

useR! program and organization committees.

cynkra team especially igraph project team Kirill Müller, Michael Antonov; and speaker training/practice buddies Angelica Becerra, David Granjon, Antoine Fabri, Mike Page, Christoph Sax.

rOpenSci team especially practice audience Yanina Bellini Saibene, Steffi LaZerte, Mark Padgham and abstract reviewer Jeroen Ooms.

igraph team now and then Szabolcs Horvát, Tamás Nepusz, Vincent Traag… and Gábor Csárdi! 😉

Hannah Frick, Athanasia Monika Mowinckel, Shannon Pileggi, Hugo Gruson

Old boats near sea. Picture by Jean-Claude Salmon.

Take-home messages

  • Refactoring one bit at a time or in more ambitious ways. Tests as safety nets!

  • Ensuring code quality through kind code reviews.

  • Training oneself and others.

  • Learning a bit more Git.

  • Supporting maintainers even when they need to leave, and before they leave.

https://user-maelle.netlify.app

Photo credits