Why You Should Write Pure Functions

Published on 2021-10-30

Pure functions are a cornerstone of functional programming, but even if you are writing code that isn't purely functional, it's a great idea to prefer them!

Defining a Pure Function

The two properties of a pure function:

  • Given the same set of arguments, the function will always produce the same result.
  • Invoking the function produces no side effects.

A side effect can be thought of as any observable effect besides returning a value to the invoker.

A simple example of a pure function:

const add = (a, b) => a + b;

For any input into this function, it will always produce the same value. That is to say, invoking the function like add(5, 2) will always produce 7. We can also see that nothing else happens, such as modifying state or interacting with other systems, so this function is pure!

Technically, if we were to rewrite the previous function to call console.log to output some info, that would make the function impure, because it would have an observable effect beyond returning a value.

Another example of an impure function is Math.random(): it advances the internal state of the engine's random number generator (breaking point 2), and you get a different result each time the function is invoked (breaking point 1).
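To make the two rules concrete, here is a small sketch of each violation. The names addAndLog and makeCounter are made up for illustration and are not part of any library.

```javascript
// Impure: returns the right sum, but also writes to the console (a side effect).
const addAndLog = (a, b) => {
  console.log(`adding ${a} and ${b}`);
  return a + b;
};

// Impure: the result depends on hidden state that changes on every call,
// much like Math.random() depends on the generator's internal state.
const makeCounter = () => {
  let count = 0;
  return () => {
    count += 1;
    return count;
  };
};

const nextId = makeCounter();
console.log(addAndLog(5, 2)); // logs the message, then 7
console.log(nextId(), nextId()); // 1 2 (same arguments, different results)
```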

Side Effects Cause Complexity

Functions that are pure are easier to reason about - you can create a mapping of inputs to outputs, and that mapping will always hold true. It doesn't depend on external state or effects to produce a result!

Let's look at a function that might be written to determine the number of days from the UNIX epoch (January 1, 1970 00:00:00 UTC) to now. (Don't use this; prefer a library if you are working with time. This is just an example 😉)

const daysSinceUnixEpoch = () => {
  const currentDate = new Date();
  // new Date(0) is the epoch itself (UTC); '1/1/1970' would be parsed in the local time zone
  const epochDate = new Date(0);

  return Math.floor((currentDate - epochDate) / (24 * 60 * 60 * 1000));
}

This function will produce the value 18930, and every time I run it it will produce that value. Well, it will produce that every time I run that today. Depending on when you read this, if you were to copy this function and invoke it, I have no idea what value it will produce! This makes it difficult to reason about, because I need to know the external state, namely the current day, to try and figure out what value should be produced. This function would also be incredibly difficult to test, and any test that might be written would be very brittle. We can see that the issue is that we are making use of an impure value produced by new Date() to determine the current date. We could refactor this to make a function that is pure and testable by doing the following:

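const daysSinceUnixEpoch = (dateString) => {
  // Date-only ISO strings such as '2021-10-30' are parsed as midnight UTC
  const currentDate = new Date(dateString);
  // new Date(0) is the epoch itself, also in UTC
  const epochDate = new Date(0);
  return Math.floor((currentDate - epochDate) / (24 * 60 * 60 * 1000));
}

A simple swap to require a date string for computing the difference makes this a pure function, since we will always get the same result for a given input and we are not making use of any effectful code. Now, if I call this with daysSinceUnixEpoch('2021-10-30') I get the same result, and if you call it you should also get 18930, neat! Note that the format of the string matters: a date-only ISO string is parsed as UTC, whereas a string like '10/31/2021' is parsed in the local time zone, which would let the result vary with where the code runs.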

Side Effects Are Unavoidable

Now, while pure functions are awesome, we can't really build an app that does anything of note without side effects. If the user can't see output, or interact with the app in any way, they probably won't have much reason to stick around! Therefore, the idea of preferring pure functions isn't to get rid of side effects, but to reduce the surface area where effectful code is executed and to extract pure functionality into reusable and testable functions.
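As a rough sketch of what "reducing the surface area" can look like, the example below keeps the decision making in a pure function and leaves the effects in a thin wrapper. Both greetingFor and greet are hypothetical names invented for this illustration.

```javascript
// Pure core: everything it needs arrives as arguments.
const greetingFor = (name, hour) => {
  const partOfDay = hour < 12 ? 'morning' : hour < 18 ? 'afternoon' : 'evening';
  return `Good ${partOfDay}, ${name}!`;
};

// Effectful shell: reads the clock and writes to the console, nothing else.
const greet = (name) => {
  const hour = new Date().getHours(); // effect: depends on the current time
  console.log(greetingFor(name, hour)); // effect: produces output
};

greet('Ada');
```

All of the branching lives in greetingFor, so it can be tested with plain values, while greet stays small enough to check by eye.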

Let's look at another example of some code that might be written server side with the Express web framework. A common task on the server is ensuring that the data sent in a request contains all the expected values. Imagine writing a handler for a POST request to an endpoint /api/comment that expects a request body with keys for postId, userId, and comment to indicate which post the comment was on, who posted it, and what the comment was. Let's take a first stab at this:

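router.post('/api/comment', async (req, res) => {
  const {postId, userId, comment} = req.body

  try {
    // != null rejects both null and undefined, so missing keys are caught too
    if (postId != null && userId != null && comment != null) {
      // named `created` so it does not shadow the response object `res`
      const created = await Comment.create({postId, userId, comment})
      return res.send(created)
    } else {
      return res.status(400).json({message: 'Expected keys for postId, userId, and comment'})
    }
  } catch (e) {
    // Error objects serialize to {}, so send the message instead
    return res.status(500).json({error: e.message})
  }
})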

This would work: we pull the keys out of the request body, then check that they all exist. If they do, we create the comment; otherwise we send back a 400 with a message saying we expected certain keys. If we want to test that our logic for rejecting the request based on the payload is correct, we would need to do a lot of mocking and fake requests with different payloads. That's a huge pain! What if we instead extracted the pure code from this effectful function?

const expectedReqBody = (body, keys) => {
  return keys.every(key => key in body)
}

router.post('/api/comment', async (req, res) => {
  const expectedKeys = ['postId', 'userId', 'comment']

  if (!expectedReqBody(req.body, expectedKeys)) {
    return res.status(400).json({message: `Body of request needs to contain the following keys: ${expectedKeys.join(', ')}`})
  }

  const {postId, userId, comment} = req.body

  try {
    // named `created` so it does not shadow the response object `res`
    const created = await Comment.create({postId, userId, comment})
    return res.send(created)
  } catch (e) {
    // Error objects serialize to {}, so send the message instead
    return res.status(500).json({error: e.message})
  }
})

Now, we have extracted out the pure functionality of checking if values exist. If we are given an array of expected keys and the request body we can ensure they all exist. Now we can test the functionality by testing the pure function expectedReqBody and feel safe when we are using this function as part of validation. As a bonus, if you wanted to validate the body on other requests you have an already tested solution!

Extra Bonuses

I have previously written briefly about function composition and this works really well with pure functions! If you compose a handful of pure functions it is really easy to reason about what will happen throughout the 'data pipeline'. If you have effectful code sprinkled in, it can cause a massive headache!
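Here is a minimal sketch of such a pipeline. The pipe helper and the three small string functions are hypothetical and written just for this example; libraries such as Ramda ship their own, more complete versions.

```javascript
// Hypothetical helper: runs functions left to right, feeding each result to the next.
const pipe = (...fns) => (input) => fns.reduce((value, fn) => fn(value), input);

// Three small pure functions.
const trim = (s) => s.trim();
const toLowerCase = (s) => s.toLowerCase();
const replaceSpaces = (s) => s.replace(/\s+/g, '-');

const slugify = pipe(trim, toLowerCase, replaceSpaces);

console.log(slugify('  Why You Should Write Pure Functions ')); // why-you-should-write-pure-functions
```

Because every step is pure, the output of slugify can be worked out by reading the steps in order, without checking what else might have changed along the way.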

Pure functions can also be memoized! If you have functionality that takes a lot of CPU power to compute, but is pure, you can cache the results! I can write more about memoization another time, but some libraries you can use include Ramda's memoizeWith and Lodash's memoize.
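To give a feel for the idea, here is a hand-rolled sketch, not how either library implements it. This hypothetical memoize only handles functions of a single argument and keeps every result in a Map forever.

```javascript
// Hypothetical single-argument memoize; real libraries handle more cases.
const memoize = (fn) => {
  const cache = new Map();
  return (arg) => {
    if (!cache.has(arg)) {
      cache.set(arg, fn(arg)); // compute once, remember the result
    }
    return cache.get(arg);
  };
};

let calls = 0;
const slowSquare = (n) => {
  calls += 1; // instrumentation only, to count how often the real work runs
  return n * n;
};

const fastSquare = memoize(slowSquare);
console.log(fastSquare(12), fastSquare(12), calls); // 144 144 1
```

Caching like this is only safe because the result depends on nothing but the argument. Memoizing something like Math.random would return the same "random" number forever.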

Conclusion

Thanks for taking the time to read about pure functions! I will leave you with a TL;DR bullet point list on the topic:

  • Pure functions always map the same input to the same output, and contain no side effects.
  • We can reason about and test pure functions easily, and pure functions are easier to reuse and compose.
  • Side effects add extra complexity, but they are unavoidable if we want to write meaningful apps.
  • Writing pure functions allows us to reduce the surface area of effectful code.