How defensive coding leads to bloat

Defensive coding is an important lesson from The Pragmatic Programmer, although not my favorite. It says that you should validate your inputs. Don't trust anyone!

This is a good principle.

Your software should validate inputs. A client may have bugs, an API could spazz out, a database or hard drive can get corrupted, a user might do something silly. Hell, sometimes clients and users are downright malicious and trying to break your code.

But on a team with lots of moving pieces and little big-picture understanding, defensively coding all the things leads to bloat. Does every function in a chain really need to validate its inputs?

How defensive coding goes wrong

We once built an API that ran 29,000 database queries for every request, thanks to defensive coding and what looked like a clear, sensible separation of concerns.

That example can't fit in a post you'll read. Too much background info and context required.

Here's an example from a different part of that project 👉 we built a UI that sorts the same array 5 times 🤨

An innocent looking UI with a performance gotcha

Designed by engineer 💪

The UI lets you update and manage a person's schedule. You bulk upload a new list of availabilities via CSV, verify the update, and see the result in a calendar component off-screen after saving.

Because we work as a team, each portion was built by someone else. Because we have good engineers, they all thought "Oh wait, I gotta validate my inputs!"

You end up with a date range function that looks like this:

function getDateRange(dates) {
  const sorted = dates.sort();
  return `Updating between ${sorted[0]} and ${sorted[sorted.length - 1]}`;
}

A component that lists updates like this:

function DateList({ dates }) {
  const sorted = dates.sort();

  return (
    <ol>
      {sorted.map((date) => (
        // key lets React track list items between renders
        <li key={date}>{date}</li>
      ))}
    </ol>
  );
}

A calendar that lists events like this:

import groupBy from "lodash.groupby";

function Calendar({ events }) {
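  // assumes event.day is a sortable string or number; a bare sort() compares objects as strings
  const sorted = events.sort((a, b) => (a.day < b.day ? -1 : a.day > b.day ? 1 : 0));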
  // groups preserve sort order internally
  const byDay = groupBy(sorted, (event) => event.day);

  return; // render UI
}

And a few others here and there.
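
To make the repeated work visible, here's a minimal sketch that counts the sorts for a single upload. The `sortCalls` counter, the `defensiveSort` helper, the sample dates, and the plain-function stand-ins for the components are all my assumptions for illustration, not the project's real code.

```javascript
// Hypothetical counter, added only to make the repeated work visible.
let sortCalls = 0;

function defensiveSort(items) {
  sortCalls += 1;
  // Copy before sorting so the caller's array is not mutated.
  return [...items].sort();
}

function getDateRange(dates) {
  const sorted = defensiveSort(dates);
  return `Updating between ${sorted[0]} and ${sorted[sorted.length - 1]}`;
}

// Stand-in for the DateList component: returns the items it would render.
function listDates(dates) {
  return defensiveSort(dates);
}

// Stand-in for the Calendar component, grouping without lodash.
function groupByDay(dates) {
  const byDay = {};
  for (const date of defensiveSort(dates)) {
    if (!byDay[date]) byDay[date] = [];
    byDay[date].push(date);
  }
  return byDay;
}

// Sample data: ISO date strings, which sort correctly as plain strings.
const uploaded = ["2024-03-02", "2024-03-01", "2024-03-03"];

const range = getDateRange(uploaded);
const list = listDates(uploaded);
const byDay = groupByDay(uploaded);

console.log(range); // Updating between 2024-03-01 and 2024-03-03
console.log(sortCalls); // 3
```

Three units, three sorts of the same data. Every new unit that defends itself adds one more.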

What's going on?

Every engineer is following the Validate Your Inputs principle. When they need a sorted array and aren't sure if it's sorted: sort all the things!

This makes sense when you think of each piece of code as a unit. Which they do because that's what they're currently working on and thinking about.

Everything outside your function becomes The Untrusted Outside and you code accordingly.

Repeat across time and engineers and you create a bloated codebase that spends half its time validating and re-validating the same data. 💩

This tendency can also lead to swallowed errors and promise rejections when functions handle their own errors instead of propagating them back up the stack.
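
Here's a rough sketch of the difference. The function names (`writeToDatabase`, `saveScheduleSwallowed`, `saveSchedule`) and the simulated failure are hypothetical and only there to show the shape of the problem.

```javascript
// Hypothetical persistence call that always fails, to simulate an outage.
async function writeToDatabase(schedule) {
  throw new Error("database unavailable");
}

// Defensive version: handles its own error, so callers never learn it failed.
async function saveScheduleSwallowed(schedule) {
  try {
    await writeToDatabase(schedule);
  } catch (err) {
    console.error("save failed:", err.message);
  }
}

// Propagating version: the rejection travels up to whoever owns the edge.
async function saveSchedule(schedule) {
  await writeToDatabase(schedule);
}

async function main() {
  // Resolves normally, so from the outside this looks like a success.
  await saveScheduleSwallowed([]);

  try {
    await saveSchedule([]);
  } catch (err) {
    // The caller gets a chance to show an error, retry, or roll back.
    console.log("caller can react:", err.message);
  }
}

main();
```

The swallowed version logs and moves on, which feels safe inside the function but leaves the UI with no way to tell the user that the save failed.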

Solution: Validate the edges, trust the insides

The solution is to more carefully design your units and agree on where the edges are.

Where does new data enter your system? Validate there.

Validate inputs at those entry points and the rest of your code can assume they're valid. A type system can help you communicate that to other engineers.
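
A minimal sketch of what that could look like for the schedule upload. I'm assuming the CSV boils down to one ISO date (YYYY-MM-DD) per line, and `parseAvailabilities` is a hypothetical name; the real format and entry point will differ.

```javascript
// Edge: the one place where uploaded CSV text enters the system.
// Assumes one ISO date (YYYY-MM-DD) per line; the real format may differ.
function parseAvailabilities(csvText) {
  const dates = csvText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  for (const date of dates) {
    if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
      throw new Error(`Invalid date in upload: ${date}`);
    }
  }

  // Sort once, then freeze so nothing downstream can reorder it in place.
  return Object.freeze(dates.sort());
}

/**
 * Inside: trusts the contract instead of re-sorting.
 * @param {readonly string[]} sortedDates validated and sorted at the edge
 */
function getDateRange(sortedDates) {
  return `Updating between ${sortedDates[0]} and ${sortedDates[sortedDates.length - 1]}`;
}

const sortedDates = parseAvailabilities("2024-03-02\n2024-03-01\n2024-03-03\n");
console.log(getDateRange(sortedDates)); // Updating between 2024-03-01 and 2024-03-03
```

The JSDoc `readonly string[]` annotation is one lightweight way to tell other engineers "this is already sorted, don't touch it". With TypeScript you could go further and give validated data its own type.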

Cheers,
~Swizec

PS: better unit definitions help with testing too! You can avoid the cult of TDD by testing the interfaces, not the implementation ✌️