What the heck is NoNeedForMonad?

At work, someone long ago turned on the NoNeedForMonad wart remover for our Scala projects. I started bumping up against it recently, had trouble parsing exactly what the “wart” was, and decided to look into it.

Example: adding two optional numbers

Imagine this somewhat contrived example. We receive Input from some unreliable source: two numbers that may or may not be present.

final case class Input(
  a: Option[Int],
  b: Option[Int],
)

Let’s say we want to sum the two numbers in Input — if both are present, return a Some, otherwise return None. Here’s a “naive” way to write this:

object Input {
  def sum(input: Input): Option[Int] = {
    input.a match {
      case Some(a) =>
        input.b match {
          case Some(b) => Some(a + b)
          case _ => None
        }
      case _ => None
    }
  }
}

val input1 = Input(Some(1), Some(2))
val input2 = Input(None, Some(2))

Input.sum(input1)
// #=> Some(3)

Input.sum(input2)
// #=> None

That works, but the nested match expressions are hard to read. We can clean this up with a for comprehension:

object Input {
  def sum2(input: Input): Option[Int] = {
    for {
      a <- input.a
      b <- input.b
    } yield a + b
  }
}

Boom! The NoNeedForMonad wart is triggered and complains with the following:

No need for Monad here (Applicative should suffice).

> “If the extra power provided by Monad isn’t needed, it’s usually a good idea to use Applicative instead.”

Typeclassopedia (http://www.haskell.org/haskellwiki/Typeclassopedia)

Apart from a cleaner code, using Applicatives instead of Monads can in general case result in a more parallel code.

For more context, please refer to the aforementioned Typeclassopedia, http://comonad.com/reader/2012/abstracting-with-applicatives/, or http://www.serpentine.com/blog/2008/02/06/the-basics-of-applicative-functors-put-to-practical-work/

Let’s step through this.

“No need for Monad here”

The first head-scratcher is that there is no Monad concept anywhere in the Scala standard library; nor do we use a library that defines one, such as scalaz. Where exactly is the monad?

This is explained by the fact that flatMap is a monadic bind operation. If you understand flatMap, you already know what a monad is: a monad is something that can flatMap.

Where are we using flatMap? That comes from the for comprehension, which can be understood as syntactic sugar for using flatMap and map here. The “desugared” version of sum2 would look something like this:

object Input {
  def sum3(input: Input): Option[Int] = {
    input.a.flatMap(a =>
      input.b.map(b =>
        a + b
      )
    )
  }
}

This desugared code will also trigger NoNeedForMonad. Really the error is saying, “No need for flatMap here.”

“Applicative should suffice”

The second head-scratcher is that the solution to not needing monads is Applicative, which is also not in the standard library!

My best understanding of Applicative is that it provides a more powerful map. Instead of just applying a function that takes one argument to a context such as Option (e.g., 1.some.map(_ + 2)), you can apply a function to many arguments, all of them in a context such as Option.

For instance, the function + takes two Int values, and returns an Int. Using the method lift2 from scalaz’s Apply class (a superclass of Applicative), + can be transformed into a function that takes two Option[Int] values and returns an Option[Int].

import scalaz.Apply
import scalaz.Scalaz._

object Input {
  def sum4(input: Input): Option[Int] = {
    val sum = (a: Int, b: Int) => a + b
    Apply[Option].lift2(sum)(input.a, input.b)
  }
}

Here, Apply[Option].lift2(sum) lifts the sum function to accept and return Option values; we then simply pass input.a and input.b to that function.

“Applicative should suffice” — if you don’t mind pulling in scalaz and are willing to deal with some rather awkward functions for anything more complex than our example here.

“For more context…”

The final head-scratcher is that if you try following any of the links in the NoNeedForMonad error text, you are taken to several posts — not one, not two, but three — all about using the Applicative typeclass in Haskell.

The comonad link in particular is absolutely full of category theory and GHC language extensions.

For more context, go learn you a Haskell!

NeedForMonad

You might be wondering, when do you actually need a monad, or rather, flatMap? We can make a small tweak to the sum function so that it no longer triggers the wart:

object Input {
  def sum5(input: Input): Option[Int] = {
    for {
      a <- input.a
      bPlusA <- input.b.map(_ + a)
    } yield bPlusA
  }
}

Now the value bPlusA, within the for expression, depends on the value of a; previously, the values a and b were separate and did not reference each other, and were only used together in the yield.
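The same distinction can be sketched outside Scala. The example below is a rough illustration in Rust, whose Option type has both a pairwise combinator (zip) and a flatMap equivalent (and_then). The Input struct and both function names are hypothetical stand-ins for the Scala code above, not part of any library.

```rust
// Hypothetical stand-in for the Scala `Input` case class.
struct Input {
    a: Option<i32>,
    b: Option<i32>,
}

// Independent values: `a` and `b` are only combined at the end, so pairing
// them with `zip` and mapping over the pair is enough (the Applicative shape).
fn sum_independent(input: &Input) -> Option<i32> {
    input.a.zip(input.b).map(|(a, b)| a + b)
}

// Dependent values: the second step refers to `a`, so it has to run inside
// `and_then`, which is Rust's name for flatMap (the Monad shape).
fn sum_dependent(input: &Input) -> Option<i32> {
    input.a.and_then(|a| input.b.map(|b| b + a))
}
```

Both functions return the same results here; the difference is only in whether the second computation is allowed to look at the first value.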

Conclusion: there’s probably no need for NoNeedForMonad

I think using NoNeedForMonad makes sense under two conditions: a) the team is familiar with the concepts of Monad and Applicative, and b) the project uses scalaz or some library that provides these abstractions.

Otherwise, it pushes you to make awkward tweaks to the for comprehension, such that it is deemed to need flatMap, or else you have no abstraction to use and have to fall back to nested match statements.

This seems like a wart meant for Haskell projects. In Haskell, Applicative is part of the standard library, and curried functions in particular make it easy to use. Here’s the same “add two optional numbers” example in Haskell, using fmap (<$>) and apply (<*>):

(+) <$> Just 1 <*> Just 2
-- #=> Just 3

That works entirely with functions from Prelude, no imports or libraries needed.

Representing AT TIME ZONE in Haskell and Rust

PostgreSQL’s AT TIME ZONE is one of those functions that seems intuitive at first but can bite you very easily. If you’re going to use it, you should carefully read the documentation and verify what the input types are, because it’s a function that’s both overloaded and polymorphic in its return type:

select '2019-08-10T8:51:00'::timestamp AT TIME ZONE 'PDT';
-- #=> 2019-08-10 11:51:00-04
select '2019-08-10T8:51:00'::timestamptz AT TIME ZONE 'PDT';
-- #=> 2019-08-10 05:51:00

In the first example we interpret the time as if it were in PDT. Human interpretation: It’s 8:51 in California. Postgres then prints out the time for my system time, which is EDT.

In the second, it’s the inverse: we interpret the time as EDT, and then print out the time it would be in PDT. Human interpretation: It’s 8:51 in Boston, what time is it in California? 8:51 is interpreted as an EDT timestamp because Postgres coerces to timestamptz using the system time zone, and on my system that’s EDT. So for me the following two queries are equivalent:

select '2019-08-10T8:51:00'::timestamptz
  AT TIME ZONE 'PDT';

select '2019-08-10T8:51:00'::timestamp
  AT TIME ZONE 'EDT'
  AT TIME ZONE 'PDT';

The double AT TIME ZONE looks weird, but it is useful for dealing with timestamp without time zone columns, because really, AT TIME ZONE does two jobs. One (when given a timestamp without a time zone) is to assert that the time is in such-and-such time zone. The other job (when given a timestamp with a time zone) is to query, what time is it in such-and-such time zone?

Part of the problem is that “At” is an overloaded word, and you could probably blame the English language for some of this. AT TIME ZONE may have been better named GO AHEAD AND DO A THING, because at least then we don’t think we know what it’s doing.
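To make the two jobs concrete, here is a minimal sketch in Rust that gives each job its own function. Everything in it is an assumption made for illustration: times are plain minutes since midnight, the offset table is a hypothetical hard-coded stand-in for real time zone data, and day rollover is ignored. It is not how Postgres implements the operator.

```rust
/// Wall-clock time with no zone attached, in minutes since midnight.
#[derive(Debug, PartialEq, Clone, Copy)]
struct Timestamp(i32);

/// An absolute instant, stored as minutes since midnight UTC.
#[derive(Debug, PartialEq, Clone, Copy)]
struct TimestampTz(i32);

/// Hypothetical fixed offsets from UTC in minutes; a real system would
/// consult time zone data instead of a hard-coded table.
fn offset_minutes(zone: &str) -> Option<i32> {
    match zone {
        "UTC" => Some(0),
        "EDT" => Some(-4 * 60),
        "PDT" => Some(-7 * 60),
        _ => None,
    }
}

/// Job one: assert that a wall-clock time is in `zone`, yielding an instant.
fn assert_zone(ts: Timestamp, zone: &str) -> Option<TimestampTz> {
    offset_minutes(zone).map(|off| TimestampTz(ts.0 - off))
}

/// Job two: ask what the wall clock reads in `zone` at a given instant.
fn wall_clock_in(tz: TimestampTz, zone: &str) -> Option<Timestamp> {
    offset_minutes(zone).map(|off| Timestamp(tz.0 + off))
}
```

With 8:51 as input, asserting PDT and then reading the clock in EDT gives 11:51, while asserting EDT and then reading the clock in PDT gives 5:51, which lines up with the two query results shown earlier.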

A Haskell implementation using functional dependencies

AT TIME ZONE is both overloaded and has a polymorphic return type. I can’t immediately summon what an accurate type signature for this function might be, so I wonder: could we define it in Haskell?

First, let’s define some dummy datatypes to work with. The timestamp types will just be wrappers around String; we won’t actually do any real conversion. For now, we’re just interested in the types.

type TimeZone    = String
data Timestamp   = Timestamp   String deriving Show
data TimestampTz = TimestampTz String deriving Show

Now let’s define the class of types that can be converted using “at time zone”. While normal, boring type classes are generic over one type variable, an instance of this class must be defined for two type variables: the input type and the output type. This kind of craziness wasn’t allowed in the original Haskell standard, and we need to enable the MultiParamTypeClasses language extension for this to compile.

From A History of Haskell: Being Lazy With Class:

“We [the Haskell Committee] felt that single-parameter type classes were already a big step beyond our initial conservative design goals, and they solved the problem we initially addressed. Going beyond that would be an unforced step into the dark, and we were anxious about questions of overlap, confluence, and decidability of type inference. […] As time went on, however, user pressure grew to adopt multi-parameter type classes, and GHC adopted them in 1997 (version 3.00). However, multi-parameter type classes did not really come into their own until the advent of functional dependencies.”

{-# LANGUAGE MultiParamTypeClasses #-}

class AtTimeZoneConvertible input output where
  atTimeZone :: TimeZone -> input -> output

Now let’s define the instances. One for Timestamp -> TimestampTz, and one for TimestampTz -> Timestamp.

instance AtTimeZoneConvertible Timestamp TimestampTz where
  atTimeZone timezone (Timestamp timestamp) =
    TimestampTz $ timestamp ++ " " ++ timezone

instance AtTimeZoneConvertible TimestampTz Timestamp where
  atTimeZone timezone (TimestampTz timestamp) =
    -- do some time calculations...
    Timestamp $ "10:10"

And now we can use them like so:

ghci> atTimeZone "EDT" (Timestamp "10:10") :: TimestampTz
TimestampTz "10:10 EDT"

ghci> atTimeZone "EDT" (TimestampTz "14:10 UTC") :: Timestamp
Timestamp "10:10"

Disregarding the fact that atTimeZone only ever returns "10:10" when given a timestamp with time zone, this looks good! One annoying thing is that we need to specify the return type, even though we’ve only defined one instance for each input type. The problem is that there isn’t anything preventing us from defining more instances and having multiple possible output types for, say, converting a Timestamp.

If we try to evaluate atTimeZone without specifying the return type, we end up with this error:

ghci> atTimeZone "EDT" (TimestampTz "14:10 UTC")

<interactive>:53:1: error:
    • Non type-variable argument
        in the constraint: AtTimeZoneConvertible TimestampTz output

GHC is saying something like: I can infer the types as far as AtTimeZoneConvertible TimestampTz output, and that’s just not enough to decide what instance to use, because output is a type variable, not a concrete type.

What we want to say is that the input type implies the output type. This is exactly what the FunctionalDependencies language extension lets us do. It looks like this:

{-# LANGUAGE FunctionalDependencies #-}

class AtTimeZoneConvertible input output | input -> output where
  atTimeZone :: TimeZone -> input -> output

Now the compiler will prevent us from defining more than one instance for a given input type, and we no longer need to specify the output type:

ghci> atTimeZone "EDT" (TimestampTz "14:10 UTC")
Timestamp "10:10"

We can even call it multiple times, just like we did with AT TIME ZONE:

ghci> atTimeZone "UTC" $ atTimeZone "EDT" (TimestampTz "14:10 UTC")
TimestampTz "10:10 UTC"

If we try to define another instance for the Timestamp input type, for say a String output type:

instance AtTimeZoneConvertible Timestamp String where
  atTimeZone timezone (Timestamp timestamp) =
    timestamp ++ " " ++ timezone

We’ll get an error like this:

AtTimeZone.hs:14:10: error:
    Functional dependencies conflict between instance declarations:
      instance AtTimeZoneConvertible Timestamp TimestampTz
        -- Defined at AtTimeZone.hs:14:10
      instance AtTimeZoneConvertible Timestamp String
        -- Defined at AtTimeZone.hs:18:10

A Rust implementation using associated types

I was also curious if this is possible in Rust. I am much less familiar with Rust, but I’ve at least heard a few times that Rust’s traits are like Haskell’s type classes. Let’s see how it might work. First, some data types:

type TimeZone = String;

#[derive(Debug)]
struct Timestamp {
    ts: String,
}

#[derive(Debug)]
struct TimestampTz {
    ts: String,
}

As in Haskell, we’ll define a trait for AtTimeZoneConvertible:

trait AtTimeZoneConvertible<Output> {
    fn at_time_zone(&self, time_zone: TimeZone) -> Output;
}

One difference from Haskell already is that Rust has a more object-oriented approach: a trait is defined in terms of some self type. In Haskell, this was just another type variable, input. Practically, there isn’t really a difference, as far as I can tell.

Now let’s define some instances:

impl AtTimeZoneConvertible<TimestampTz> for Timestamp {
    fn at_time_zone(&self, time_zone: TimeZone) -> TimestampTz {
        TimestampTz {
            ts: self.ts.to_string() + " " + &time_zone,
        }
    }
}

impl AtTimeZoneConvertible<Timestamp> for TimestampTz {
    fn at_time_zone(&self, _time_zone: TimeZone) -> Timestamp {
        Timestamp {
            ts: "10:10".to_string(),
        }
    }
}

This is similar to our approach in Haskell without functional dependencies. So I assumed the following code wouldn’t work:

fn main() {
    println!(
        "{:?}",
        Timestamp {
            ts: "14:10".to_string()
        }
        .at_time_zone("UTC".to_string())
        .at_time_zone("EDT".to_string())
    );
}

// $ cargo run
// Timestamp { ts: "10:10" }

Surprisingly, it does! Rust seems to be saying, you’ve only given me one instance for AtTimeZoneConvertible for your type, so I’ll use it, even though multiple instances could exist.

I’m not quite sure why the Rust compiler allows this. It seems like a reasonable thing to disallow, because there is no guarantee that the compiler can infer the types. Remember that the trait is generic over the Output type. If we add another instance, that same code does indeed fail to compile:

impl AtTimeZoneConvertible<String> for Timestamp {
    fn at_time_zone(&self, time_zone: TimeZone) -> String {
        self.ts.to_string() + " " + &time_zone
    }
}

$ cargo build
   Compiling rust-at-time-zone v0.1.0 (/Users/mjhoy/proj/rust-at-time-zone)
error[E0282]: type annotations needed
  --> src/main.rs:45:9
   |
45 | /         Timestamp {
46 | |             ts: "14:10".to_string()
47 | |         }
48 | |         .at_time_zone("UTC".to_string())
   | |__________^ cannot infer type for `Output`
   |
   = note: type must be known at this point

error: aborting due to previous error

But perhaps in the real world, this isn’t such a problem, and the benefits of making life easier when there is just one instance are too good to pass up.
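When a second instance does exist, the call can still be made to compile by telling the compiler which Output is wanted. The sketch below shows two ways of doing that, under the assumption that the types and trait are defined as above; the helper functions as_timestamptz and as_string are hypothetical names introduced only for this illustration.

```rust
type TimeZone = String;

#[derive(Debug, PartialEq)]
struct Timestamp {
    ts: String,
}

#[derive(Debug, PartialEq)]
struct TimestampTz {
    ts: String,
}

trait AtTimeZoneConvertible<Output> {
    fn at_time_zone(&self, time_zone: TimeZone) -> Output;
}

impl AtTimeZoneConvertible<TimestampTz> for Timestamp {
    fn at_time_zone(&self, time_zone: TimeZone) -> TimestampTz {
        TimestampTz {
            ts: self.ts.to_string() + " " + &time_zone,
        }
    }
}

impl AtTimeZoneConvertible<String> for Timestamp {
    fn at_time_zone(&self, time_zone: TimeZone) -> String {
        self.ts.to_string() + " " + &time_zone
    }
}

// Option 1: annotate the binding so inference knows which `Output` is wanted.
fn as_timestamptz(ts: &Timestamp) -> TimestampTz {
    let converted: TimestampTz = ts.at_time_zone("UTC".to_string());
    converted
}

// Option 2: name the implementation explicitly with fully qualified syntax.
fn as_string(ts: &Timestamp) -> String {
    <Timestamp as AtTimeZoneConvertible<String>>::at_time_zone(ts, "UTC".to_string())
}
```

The fully qualified form also shows that the self type is just one more type parameter of the trait, much like input in the Haskell class.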

All that said, we can prevent this issue by using an associated type, which enforces exactly one Output type for a given implementing type. It looks like this:

trait AtTimeZoneConvertible {
    type Output;
    fn at_time_zone(&self, time_zone: TimeZone) -> Self::Output;
}

This is a lot like the input -> output functional dependency for Haskell. The AtTimeZoneConvertible trait is no longer generic over the Output type; instead, one Output type must be chosen for a given instance. Our instances now look like this:

impl AtTimeZoneConvertible for Timestamp {
    type Output = TimestampTz;
    fn at_time_zone(&self, time_zone: TimeZone) -> TimestampTz { ... }
}

impl AtTimeZoneConvertible for TimestampTz {
    type Output = Timestamp;
    fn at_time_zone(&self, _time_zone: TimeZone) -> Timestamp { ... }
}
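For reference, here is a self-contained sketch of the associated-type version that compiles on its own. The method bodies are copied from the generic-trait instances shown earlier (including the placeholder "10:10" result), and the in_utc helper is a hypothetical addition that shows how a generic function can name the result type as T::Output.

```rust
type TimeZone = String;

#[derive(Debug, PartialEq)]
struct Timestamp {
    ts: String,
}

#[derive(Debug, PartialEq)]
struct TimestampTz {
    ts: String,
}

trait AtTimeZoneConvertible {
    type Output;
    fn at_time_zone(&self, time_zone: TimeZone) -> Self::Output;
}

impl AtTimeZoneConvertible for Timestamp {
    type Output = TimestampTz;
    fn at_time_zone(&self, time_zone: TimeZone) -> TimestampTz {
        TimestampTz {
            ts: self.ts.to_string() + " " + &time_zone,
        }
    }
}

impl AtTimeZoneConvertible for TimestampTz {
    type Output = Timestamp;
    fn at_time_zone(&self, _time_zone: TimeZone) -> Timestamp {
        // Placeholder, as in the earlier examples: no real conversion happens.
        Timestamp {
            ts: "10:10".to_string(),
        }
    }
}

// Hypothetical helper: the input type alone determines the return type,
// so callers never need to annotate it.
fn in_utc<T: AtTimeZoneConvertible>(value: &T) -> T::Output {
    value.at_time_zone("UTC".to_string())
}
```

Chained calls such as ts.at_time_zone(...).at_time_zone(...) need no annotations here, because each implementing type fixes its own Output.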

If we try to define another instance for Timestamp, we now get an error:

impl AtTimeZoneConvertible for Timestamp {
    type Output = String;
    fn at_time_zone(&self, time_zone: TimeZone) -> String {
        self.ts.to_string() + " " + &time_zone
    }
}

$ cargo build
error[E0119]: conflicting implementations of trait `AtTimeZoneConvertible`
              for type `Timestamp`:
  --> src/main.rs:30:1

Conclusions

Spend extra time reviewing code that uses AT TIME ZONE or coercions such as ::timestamp or ::timestamptz. The behavior is often surprising.

At work, we have data that moves from a production database into a warehouse. In this process, for some reason, timestamp columns are coerced to timestamptz. This means any query that uses AT TIME ZONE is semantically different depending on whether you run it in the warehouse or in production, which was the source of some subtle bugs.

Also, both Haskell and Rust have good support for representing functions that are overloaded and polymorphic in their return types. GHC is a bit more strict than the Rust compiler, though; you may want to use associated types in Rust to enforce functional dependencies.

A presenter pattern for Rails controllers

Rails controllers have a trick to pass data to the view: all instance variables are copied over after an action method is executed. This is at odds with good Ruby object design and causes practical problems. It becomes awkward to refactor controller methods without exposing unwanted state to the view — a private method that sets an instance variable has side effects elsewhere in the code. Controllers have the strange task of mutating their inner state as their public behavior.

I’d have guessed that Object#instance_variable_get and #instance_variable_set were private methods. Not so. Apparently it is “public” API for you to mess with any Ruby object’s internals.

How many Ruby classes have you needed to test by examining their instance variables? I don’t think I’ve done this anywhere, except in controller tests, where it’s the norm.

It is true that the trick is aesthetically pleasing for simple controllers and for when you need to demo Rails. Take a standard Rails controller with one simple action:

class PostsController < ApplicationController
  def show
    @post = Post.find(params[:id])
  end
end

And now, in the corresponding view, the @post is available:

<h1>
  <%= @post.title %>
</h1>

For simple controllers, you might think of the view as an extension of the controller. It has access to private internals. The template is rendered as if it were just another method on the controller.

But in the MVC pattern, views and controllers have different concerns. A controller generally manipulates model objects to interact with the database; a view should not. Controllers send email, spawn background jobs, handle validation failure, catch exceptions. Views do not.

Using ActiveRecord objects in the view seems like a good thing to avoid if possible. These are generally the most powerful objects in a Rails app. They have an enormous API and many methods query the database. One of our models at Freebird has 619 instance methods (not counting those from Object) — most of those are added by ActiveRecord. It’s probably a bad idea for a view, responsible for producing HTML, to execute SQL queries.

Additionally, objects used in a view often need methods specific to formatting content for the user. Adding these to ActiveRecord classes makes their large interface even wider.

At Freebird, we’re moving to using Presenter objects that wrap ActiveRecord objects, exposing only the methods that the view needs and adding view-specific logic. But we found it was a little awkward to introduce them into standard controllers. Consider this code:

class PostsController < ApplicationController
  def show
    post_model = Post.find(params[:id])
    @post = PostPresenter.new(post_model)
  end
end

Even in this simple show action, there is some awkwardness with naming. @post in the view is now a PostPresenter — good, this is what we want. Unfortunately, now the controller must distinguish between model and presenter objects, and with the @post name reserved for the presenter, our model variables get an awkward name like post_model. We have to rename and rearrange our controller’s instance variables because their names are used in the view; something about this feels wrong.

With a few more actions, the situation gets worse:

class PostsController < ApplicationController
  before_action :require_post

  def show; end

  def edit; end

  def update
    if @post_model.update(params[:post].permit(:title, :body))
      redirect_to @post_model, notice: "Post updated."
    else
      render :edit
    end
  end

  private

  def require_post
    @post_model = Post.find(params[:id])
    @post = PostPresenter.new(@post_model)
  end
end

An alternative is to overwrite the @post variable for the view, e.g., @post = PostPresenter.new(@post). But this is error-prone, confusing and — were we using Sorbet — a type error.

We’ve factored out common logic into require_post, but in doing so, our views now have access to the @post_model instance variable. And we now have two kinds of instance variables, one that is meant to be passed and used in the view, and one that is not. This makes the intention of the code harder to follow, especially in more complex controllers.

Thankfully, this is easy to address. The public method used to populate the view with instance variables is AbstractController::Rendering#view_assigns. It builds a hash from all the instance variables in the controller object. It is not a complicated method:

# This method should return a hash with assigns.
# You can overwrite this configuration per controller.
def view_assigns
  protected_vars = _protected_ivars
  variables      = instance_variables

  variables.reject! { |s| protected_vars.include? s }
  variables.each_with_object({}) { |name, hash|
    hash[name.slice(1, name.length)] = instance_variable_get(name)
  }
end

Interestingly, in Merb, views were just methods on controllers, meaning the apparent sharing of state was structural. I think Rails got it right here.

Merb’s “proof that everything belongs in one class to begin with” was that it was more performant.

Controllers inherit from AbstractController, so we can override view_assigns to do whatever we want. Let’s have it simply return a hash that is set in a new method called present:

module Presenters
  def view_assigns
    @_presenters || {}
  end

  def present(hsh)
    @_presenters = hsh
  end
end

Now we can include the Presenters module in our controller, and our instance variables will not be passed to the views. Instead, we explicitly assign variables to the view with present:

class PostsController < ApplicationController
  include Presenters

  before_action :require_post

  def show
    present(post: PostPresenter.new(@post))
  end

  def edit
    present(post: PostPresenter.new(@post))
  end

  def update
    if @post.update(params[:post].permit(:title, :body))
      redirect_to @post, notice: "Post updated."
    else
      present(post: PostPresenter.new(@post))
      render :edit
    end
  end

  private

  def require_post
    @post = Post.find(params[:id])
  end
end

The awkwardness of controller instance variables is solved — they can simply be used to share instance state, as they are meant to do. We can refactor controller code without worrying what will end up in a view.

We can clean up this code a little more. If we assume that a Presenter is always instantiated in the same way — a ModelNamePresenter accepts a ModelName object as a single initialization parameter — we can instantiate the presenter using the class of the object passed in:

def present(hsh)
  @_presenters ||= {}
  hsh.each_with_object(@_presenters) do |(k, v), acc|
    acc[k] = "#{v.class}Presenter".constantize.new(v)
  end
end

Calling present(post: @post) now inspects the class name of @post, finds the corresponding Presenter class and instantiates a presenter with the @post object passed to the constructor. The final controller looks like this:

class PostsController < ApplicationController
  include Presenters

  before_action :require_post

  def show
    present(post: @post)
  end

  def edit
    present(post: @post)
  end

  def update
    if @post.update(params[:post].permit(:title, :body))
      redirect_to @post, notice: "Post updated."
    else
      present(post: @post)
      render :edit
    end
  end

  private

  def require_post
    @post = Post.find(params[:id])
  end
end

The present method will fail with an uninitialized constant exception (a NameError) if an appropriate Presenter class is not found. In this way we enforce some consistency in the view: instance variables must be presenter objects.

We have split apart the shared scope between the controller and the view, with the present method providing the interface between them. A developer working in controller code can be confident instance variables incidentally used for refactoring actions won’t affect view code. A front-end developer knows exactly which variables were meant for the view. If nothing else, there is a self-documenting nature to present that the standard Rails controller lacks.

Freebird has extracted a simple library for setting this up in Rails controllers, as well as providing a base Presenter class with some conveniences. Check it out, it’s called Livery. It does a bit more than the PostsController example here, but the basic idea is the same. For instance, our implementation of present handles passing in presenter objects directly, objects with module namespaces, and collections.