That's right, State!

This is Part 3 of the Design Patterns in Functional Programming series.

Today we discuss Episode 3: State from Clojure Design Patterns.

I will dig into the original Clojure code for a functional variant of the OOP pattern, and then translate it to R.

The pattern

The GoF define the intent of State as: “Allow an object to alter its behavior when its internal state changes”.

This seems similar to the Strategy pattern, and indeed we could solve it similarly. If object property X equals A, apply function F to it; if it equals B, apply function G. Wrap this logic in a function H and we’re done.

However, this is such a common situation that we might want to provide a standard implementation. Two distinct concepts are important here: polymorphism and modify-by-reference.

Polymorphism refers to a situation where the same method or function exhibits different behaviours, depending on the object to which it is being applied. In OOP the crucial property of the object determining behaviour is its class, but in FP we can take a wider view.
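As a small illustration of the class-based flavour of polymorphism, here is a minimal sketch using R's S3 system. The names new_shape and area are hypothetical and only serve the example; they are not part of the original episode.

```r
# Hypothetical constructor: tags a list with a class so S3 can dispatch on it
new_shape <- function(kind, size) {
  structure(list(size = size), class = kind)
}

# One generic function, with behaviour chosen by the class of its argument
area <- function(shape) UseMethod("area")
area.square <- function(shape) shape$size^2
area.circle <- function(shape) pi * shape$size^2

area(new_shape("square", 2))  # 4
area(new_shape("circle", 1))  # pi, about 3.14
```

The caller only ever writes area(); which method runs is decided by the object. The "wider view" in FP is that the deciding value does not have to be the class.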

Modify-by-reference is implied by the object having an ‘internal state’, which can ‘change’. In FP, the objective is typically to define functions such that they are pure - only dependent on their inputs and without side effects. If an object is changed by a function, a new object is returned, which can be bound to the same name as the old one to ‘overwrite’ it. The opposite of this, common in OOP, is to change an object by reference, within the function. Nothing is returned and yet the value of the input object has changed, in-place.

How does Clojure, as a FP language, approach this?

Let us simplify the example from Episode 3 to its essentials: A website user should receive a different version of the site depending on whether they are subscribed or not. Without explicit polymorphism or modify-by-reference we could do something like this in R:

user <- list(name = "John Doe", user_state = "not subscribed")

subscribe <- function(user) {
    
  # Change the state if needed
  if (user$user_state == "not subscribed") {
      user$user_state <- "subscribed"
  }
    
  # In any case, return a new object
  user
    
}

greet <- function(user) {
    
  # Different behaviours depending on the state are implemented through a conditional
  if (user$user_state == "not subscribed") {
      print(paste0("Greetings, ", user$name, ".")) 
  } else {
      print(paste0("Greetings, ", user$name, "!!!")) 
  } 
    
}

# Changes to the object are made by 'overwriting' it
greet(user)
## [1] "Greetings, John Doe."
user <- subscribe(user)
greet(user)
## [1] "Greetings, John Doe!!!"

This works, but as a programmer you would have to define this logic in a fool-proof manner every time, and the reader would have to study the code every time to understand that you want to achieve state-dependent behaviour. When the codebase becomes more complex, passing changed objects back might become cumbersome too.
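To make the bookkeeping concrete, the sketch below repeats the list-based subscribe() and shows what happens when the caller forgets to rebind the result. The variable names ignored and state_without_rebinding are illustrative only.

```r
user <- list(name = "John Doe", user_state = "not subscribed")

subscribe <- function(user) {
  if (user$user_state == "not subscribed") {
    user$user_state <- "subscribed"
  }
  user
}

# The function works on a copy, so without rebinding the original is untouched
ignored <- subscribe(user)
state_without_rebinding <- user$user_state
state_without_rebinding   # "not subscribed"

# Only rebinding the name makes the change visible to later code
user <- subscribe(user)
user$user_state           # "subscribed"
```

Every function in a call chain that changes the user has to hand it back like this, and every caller has to remember to rebind it.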

Clojure

To provide modify-by-reference, we can wrap the hashmap data structure holding all information on a user into an atom.

(def user (atom {:name "John Doe"
                 :user-state :not-subscribed}))

(println user)          ;; print the atom itself (a reference, shown with its object hash)
(println (deref user))  ;; dereference the memory address, and print the value 
(println @user)         ;; short-hand form of dereferencing

Atoms are a reference type: an atom holds a reference to a value, and can be dereferenced with deref or @ to get the current value. There are other reference types in Clojure as well, which behave differently under concurrent requests, but these details are not important right now.

What does matter is that we can swap the atom's value out for a new one, in its entirety and in-place:

(defn subscribe [user]
  (when (= :not-subscribed (:user-state @user))
    (swap! user assoc :user-state :subscribed)))

Well, swap! does not actually swap the value directly, but instead expects a pure function and accompanying arguments, which should be applied to the old value to get the new value. Here that function is assoc, used to set the user-state key of the hashmap to :subscribed.

So, assoc as a pure function returns a modified hashmap, which is then swapped out in-place for the old one.

This seems to sort out modify-by-reference, but what about polymorphism? The standard idiom in Clojure is to use its multimethods:

(defmulti greet :user-state)

(defmethod greet :subscribed [user]
  (println (str "Greetings, " (:name user) "!!!")))

(defmethod greet :not-subscribed [user]
  (println (str "Greetings, " (:name user) ".")))

defmulti defines the name of the multimethod, as well as the dispatch function. The dispatch function is applied to the arguments of each call, and the value it returns determines which method is chosen. In this case, the dispatch function is the getter function for the user-state key of the hashmap.

defmethod can then be used to define the possible methods. It requires as arguments the multimethod it implements, the dispatch value that will activate it, and a parameter vector.

So now we can do:

(greet @user)
(subscribe user)
(greet @user)

This demonstrates both the polymorphic nature of the greet function and the modification by reference of the user’s state when subscribing - implementing the intent of the State pattern.

R

Through its OOP systems, R already has the capability to achieve the same things. For instance, the S3 system dispatches methods based on the class attribute of an object, and the R6 package provides classes with reference semantics and mutable state. Rather than exploring those however, we will try to mimic Clojure’s FP-focused solutions.

So, quiz question! Which data type in R allows modification of its contents by reference? Environments you say? Correct 😉

To modify an object by reference, just like Clojure’s atom, we can then simply wrap it:

atom <- function(obj) {
  atom <- new.env()
  atom$obj <- obj
  atom
}

We now need an extra step to see the value of the object again:

deref <- function(atom) {
  atom$obj
}

But as a reward, we can swap out the object by reference. Let’s copy the API of Clojure’s swap!, and require a function to obtain the new value of the atom:

swap <- function(atom, fn, ...) {
  atom$obj <- do.call(fn, c(list(atom$obj), list(...)))
}
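A quick way to convince yourself that the swap really happens by reference is a counter that is reachable under two names. The sketch repeats the three helper functions so it runs on its own; counter and alias are illustrative names.

```r
atom <- function(obj) {
  atom <- new.env()
  atom$obj <- obj
  atom
}

deref <- function(atom) {
  atom$obj
}

swap <- function(atom, fn, ...) {
  atom$obj <- do.call(fn, c(list(atom$obj), list(...)))
}

counter <- atom(0)
alias <- counter        # a second name for the same environment, not a copy

swap(counter, `+`, 5)   # 0 + 5
swap(alias, `*`, 2)     # 5 * 2, applied through the other name

deref(counter)          # 10
```

Both names see the same value, and nothing had to be returned and rebound along the way.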

It should be noted that Clojure’s atom provides more than just an object which can be swapped out for another. Most importantly, it provides validators and atomic compare-and-set updates (swap! retries the function if another thread changed the value in the meantime), to ensure that each state is valid and each update is based on the latest state. This logic has not been implemented here, nor would it easily work, since the atom’s environment would not be shared between multiple R processes or sessions (just try it with the future package - environments are copied). What we do achieve here though is explicit mutability. You won’t easily cause side-effects by accident.
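The validator part, at least, could be approximated in a single R session. The following is only a sketch under that assumption: validated_atom, swap_validated and set_state are hypothetical helpers, not part of the code in this post, and nothing here addresses concurrency.

```r
# Hypothetical helper: an atom that also stores a validator predicate
validated_atom <- function(obj, validator) {
  stopifnot(is.function(validator), isTRUE(validator(obj)))
  a <- new.env()
  a$obj <- obj
  a$validator <- validator
  a
}

# Hypothetical counterpart of swap(): compute the new value first,
# and only store it when the validator accepts it
swap_validated <- function(a, fn, ...) {
  new_obj <- fn(a$obj, ...)
  if (!isTRUE(a$validator(new_obj))) {
    stop("Invalid state: the atom was left unchanged")
  }
  a$obj <- new_obj
  invisible(new_obj)
}

valid_states <- c("subscribed", "not_subscribed")
user <- validated_atom(
  list(name = "John Doe", user_state = "not_subscribed"),
  function(u) u$user_state %in% valid_states
)

# A pure update function: returns a modified copy of the list
set_state <- function(u, state) {
  u$user_state <- state
  u
}

swap_validated(user, set_state, "subscribed")
user$obj$user_state   # "subscribed"

# An invalid update raises an error and leaves the stored value alone
result <- tryCatch(
  swap_validated(user, set_state, "banana"),
  error = function(e) conditionMessage(e)
)
result                # "Invalid state: the atom was left unchanged"
user$obj$user_state   # still "subscribed"
```

Because the new value is computed by a pure function before anything is stored, rejecting it costs nothing: the old state is simply kept.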

Polymorphism, then. Contrary to S3’s dispatch on class, we here aim for a dispatch on any object property, through a custom dispatch function.

What we would like to achieve in practice is a single greet() function, but with polymorphic behaviour which we can configure at run-time when creating the function. This calls for a function factory - a function generating other functions. For instance:

multimethod <- function(dispatch_fn, ...) {
    
  args <- list(...)
  dispatch_values <- args[seq(1, length(args), 2)]
  functions <- args[seq(2, length(args), 2)]
  
  function(obj) { 
    functions[[which(dispatch_values == dispatch_fn(obj))]](obj)
  }
    
}

I skipped any checks for improper inputs in the interest of brevity.
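For reference, a more defensive variant might look like the sketch below. The name multimethod_checked and the default argument are my own additions for illustration (Clojure's multimethods have a comparable :default dispatch value). Note that it matches dispatch values with identical(), which is stricter than ==.

```r
multimethod_checked <- function(dispatch_fn, ..., default = NULL) {

  args <- list(...)
  # Expect a dispatch function followed by (value, function) pairs
  stopifnot(is.function(dispatch_fn),
            length(args) > 0,
            length(args) %% 2 == 0)

  dispatch_values <- args[seq(1, length(args), 2)]
  functions <- args[seq(2, length(args), 2)]
  stopifnot(all(vapply(functions, is.function, logical(1))))

  function(obj) {
    value <- dispatch_fn(obj)
    # Indices of the registered dispatch values that match exactly
    hit <- which(vapply(dispatch_values,
                        function(v) identical(v, value),
                        logical(1)))
    if (length(hit) > 0) {
      functions[[hit[1]]](obj)
    } else if (is.function(default)) {
      default(obj)
    } else {
      stop("No method for dispatch value: ", format(value))
    }
  }

}

# Illustrative usage with a fallback for unknown states
greet_checked <- multimethod_checked(
  function(x) x$user_state,
  "subscribed", function(x) paste0("Greetings, ", x$name, "!!!"),
  "not_subscribed", function(x) paste0("Greetings, ", x$name, "."),
  default = function(x) paste0("Hello, ", x$name, "?")
)

greet_checked(list(name = "John Doe", user_state = "subscribed"))  # "Greetings, John Doe!!!"
greet_checked(list(name = "John Doe", user_state = "trial"))       # "Hello, John Doe?"
```

Without a default, an unknown dispatch value fails with an explicit message instead of a subscript error.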

Now that we have the same facilities as Clojure, we can directly translate both the greet() multimethod and the subscribe() function. Instead of Clojure’s assoc, we will use the equivalent purrr::list_modify() - purrr already provides many tools for functional programming with R lists, so let us leverage it where possible.

subscribe <- function(user) {
  if (deref(user)$user_state == "not_subscribed") {
    swap(user, purrr::list_modify, user_state = "subscribed")
  }
}

greet <- 
  multimethod(function(x) x$user_state, 
              "subscribed", function(x) paste0("Greetings, ", x$name, "!!!"), 
              "not_subscribed", function(x) paste0("Greetings, ", x$name, ".")) 

Now we can do something very similar to the Clojure code:

user <- atom(list(name = "John Doe", user_state = "not_subscribed"))

greet(deref(user)) 
## [1] "Greetings, John Doe."
subscribe(user)
greet(deref(user))
## [1] "Greetings, John Doe!!!"
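If you would rather avoid the purrr dependency, base R's utils::modifyList() can play a similar role as the pure update function. This is a sketch of that substitution, not what the code above uses. The helper functions are repeated so the block runs on its own, and original is an illustrative name.

```r
atom <- function(obj) {
  atom <- new.env()
  atom$obj <- obj
  atom
}

deref <- function(atom) {
  atom$obj
}

swap <- function(atom, fn, ...) {
  atom$obj <- do.call(fn, c(list(atom$obj), list(...)))
}

original <- list(name = "John Doe", user_state = "not_subscribed")
user <- atom(original)

# modifyList() returns a modified copy; swap() stores that copy in the atom
swap(user, utils::modifyList, list(user_state = "subscribed"))

deref(user)$user_state   # "subscribed"
original$user_state      # "not_subscribed": the plain list was never touched
```

The mutation stays confined to the atom, while the update function itself remains pure.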

Conclusions

State is a pattern that functional programmers are not very fond of imitating, because it implies violating principles of immutability, avoiding side effects, and arguably the single responsibility of functions (each function does one thing). Nevertheless, there are practical situations where polymorphism and modify-by-reference are desirable. Using Clojure’s concepts as a template, we were able to translate polymorphism and modify-by-reference to R using base R constructs of function factories and environments.