Sort your Strategy

This is Part 2 of the Design Patterns in Functional Programming series

Today we discuss Episode 2: Strategy from Clojure Design Patterns. As with Command, I will dig into the original Clojure code for a functional variant of the OOP pattern, and then translate it to R.

The pattern

The GoF define the intent of Strategy as: “Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from clients that use it.”

Now this does sound like an R kind of problem!

Since Episode 2 deals with a sorting task, take a look at ?order in R for a concrete use case. The method argument allows the user to pick one of a set of sorting methods, one of which is auto, which will pick a method depending on the data. This corresponds well to the intent of the Strategy pattern - allow the program to flexibly pick a how for its task from a library of different approaches.

But how do we implement Strategy? A beginning programmer will approach it invariably with a series of very large if-clauses. If the method is shell, run this code. If the method is radix, run those lines of code. And so on. If the function is small that’s still a fine take on it, but for large code it quickly becomes hard to maintain. If we would instead make each algorithm their own object (‘encapsulate each one’), we can neatly put them in separate files (or all together in the same file), unit test them separately, keep the main function code small, and even re-use the same algorithms in different functions.

Clojure

The sorting problem solved in Episode 2: Strategy is not that of choosing between sorting algorithms, but deciding which item should be regarded as smaller than which when sorting. This might seem straight-forward when sorting a vector of numbers, but it is not when dealing with more complex objects with many properties.

The first thing we need to do to make the code self-contained is to define some example data. In this case, some users who each have a name and a subscription status. In Clojure we can use a hash map data structure for this. Hash maps are key-value stores similar to Python dictionaries. In R key-value structures are usually coded as named lists, although environments are technically speaking a closer equivalent.

(def users [{:name "John Doe" :subscription 0}
            {:name "Jane Doe" :subscription 1}
            {:name "John Dur" :subscription 1}
            {:name "Sara Der" :subscription 1}])

So we used def to define users as a vector of four hash maps, each with the same structure. The use of the colon : indicates a keyword type for the keys - as the documentation says, keywords are symbolic identifiers that evaluate to themselves. That is, :name is a variable called name and but also represents the value name. Keys in hash maps can be of other types than clojure.lang.Keyword, but often it’s a good fit.

The users data structure might seem annoying to R users - what if we just want all the names? How do we get them? The map function then works similar to Map() or purrr::map() (or the apply family if you wish), in that it allows you to use the same function on every element of the input, and get a list of results back. Try this:

(map :name users)

Function? Which function did we apply? Didn’t we pass the :name keyword variable rather than a function? Well, keywords by themselves act like get functions, too. (:name {:name "John"}) is the same thing as (get {:name "John"} :name). So we could have written the above as (map #(get %1 :name) users) - but who likes typing so much?

With that out of the way we can look at the actual solution. So, we want to sort users by name, but always want the subscribed people at the top. We also separately want a reverse sort method, but it should still keep the subscribed people at the top. The Strategy pattern is implemented by passing different comparator functions for each method to the built-in sort function.

I have modified the code slightly to make it easier to follow. Here’s the comparator function for the first method:

(defn alphabetic-sort [u1 u2]
  (cond
   (= (:subscription u1)
      (:subscription u2)) (neg? (compare (:name u1)
                                         (:name u2)))
   (:subscription u1) true
   :else false))

For every pair of users, we can now decide whether the first element should come first (return true) or last (return false). Again a few new built-in Clojure functions are introduced. cond is more or less equivalent to dplyr::case_when() - it takes a number of expressions as arguments, followed by what should be returned if the expression evaluates to true. When that happens, the other expressions are no longer evaluated. The :else keyword handles the other cases.

The first expression is the most complicated. We check whether the two users have the same subscription status. If they do, we compare the name strings, which returns a negative number if the first user should indeed be sorted as coming first out of the two. The neg? function checks whether the number is negative and returns true or false. The other conditions to cond are self-explanatory - if the subscription statuses are not equal and the first user is subscribed, that user should come first. If not, last.

Given this comparator function, sort can work out the solution by itself.

(sort alphabetic-sort users)

As for the reverse sort, we need to change only one thing: When sorting by name when the subscription status are equal, return true for a positive compare result instead, using the pos? function.

(defn reverse-sort [u1 u2]
  (cond
   (= (:subscription u1)
      (:subscription u2)) (pos? (compare (:name u1)
                                         (:name u2)))
   (:subscription u1) true
   :else false))

(sort reverse-sort users)

A second, much shorter solution is also provided but I will skip it here. I rather agree with “Pedro” that it is dubious, because its succinctness is so specific to the problem. Also, it doesn’t make it as clear that implementing Strategy in functional programming is all about passing an implementation function to another function.

Because yes, did you notice? Everything in the above code is composed of function executions. The cond function, the = function, the neg? and pos? functions, the compare function, even the keyword getter functions for retrieving values from hash maps.

R

Sorting is of course a problem that is already well solved in R, and in practice you should use built-in functions like sort, order, or rank in the vast majority of cases. Where more complicated data structures are concerned, they can often be captured as data frames, and sorted using for instance dplyr::arrange().

But for the sake of continuing the exploration of the Strategy design pattern, let us translate the above Clojure code instead. The first thing we need is a sorting function which also takes a binary comparison function, just like in Clojure.

We can make a brief implementation of the quicksort recursive algorithm:

qsort <- function(inp, cfun) {
  if (length(inp) > 1) {
    cres <- cfun(inp, inp[[length(inp)]])
    c(qsort(inp[cres < 0], cfun), inp[cres == 0], qsort(inp[cres > 0], cfun))
  } else {
    inp
  }
}

I have opted to play to R’s strengths here, and let the comparator function take a vector of values as its first argument, and a scalar value (the pivot of the quicksort algorithm ) as its second. For each element in the first vector, the comparator function now has to decide whether it is smaller (negative value), equal (0), or larger (positive value) than the second argument. The return value should be a numeric vector equal in length to the inp argument.

But first of course we need some data. We will use a named list instead of a hashmap for each user, and put them together in an unnamed list vector.

users <- list(list(name = "John Doe", subscription = 0),
              list(name = "Jane Doe", subscription = 1),
              list(name = "John Dur", subscription = 1),
              list(name = "Sara Der", subscription = 1))

So here’s the first comparator function to sort these user structures:

alphabetic_sort <- function(uvec, uref) {

  vec_names <- sapply(uvec, function(x) x$name)
  vec_subs <- sapply(uvec, function(x) x$subscription)
  
  equal_users <- vec_subs == uref$subscription & vec_names == uref$name
  earlier_users <- vec_subs > uref$subscription | (vec_subs == uref$subscription & vec_names < uref$name)
  later_users <- vec_subs < uref$subscription | (vec_subs == uref$subscription & vec_names > uref$name)
  
  (!equal_users) * ((-1 * earlier_users) + (1 * later_users))

}

In brief, we detect users to be sorted equal, earlier, and later than the reference item in separate logical vectors, and recode the results to 0/-1/1 in the end. Luckily, R already has vectorised string comparison built in for the < and the > operators, so we didn’t need to implement that!

For the reverse sorting method, we simply have to change the comparison direction for the name strings again:

reverse_sort <- function(uvec, uref) {
    
  vec_names <- sapply(uvec, function(x) x$name)
  vec_subs <- sapply(uvec, function(x) x$subscription)
  
  equal_users <- vec_subs == uref$subscription & vec_names == uref$name
  earlier_users <- vec_subs > uref$subscription | (vec_subs == uref$subscription & vec_names > uref$name)
  later_users <- vec_subs < uref$subscription | (vec_subs == uref$subscription & vec_names < uref$name)
  
  (!equal_users) * ((-1 * earlier_users) + (1 * later_users))

}

And then we can pass both functions to the qsort() function we defined earlier:

qsort(users, alphabetic_sort)
qsort(users, reverse_sort)

Conclusion

Strategy can be solved by passing a function as an argument to another function. For the specific problem of sorting raised in Episode 2 we spent a good amount of time on defining comparator functions and an algorithm that would work with them, all while R already has excellent built-in sorting capabilities. But don’t let that distract you from the main message here: Implementing the sorting methods as separate function objects makes it easier to manage and test them.

Next up: Episode 3: State!