Match Operator Guards

Context

One of my favorite things about Elixir is pattern matching, and guards are a powerful component of that. However, the simplest pattern matching mechanism, the match operator (=), does not allow for guards.

{:ok, x} when is_integer(x) = {:ok, "not an integer"}
error: undefined function when/2 (there is no such import)

In this experiment I will attempt to introduce guard clauses to the match operator. I was also talking to a friend today about quoted expression woes, so I will attempt to talk some about metaprogramming and how I approach such problems.

Goal

The match operator attempts to pair the match on the left of the operator with the value on the right. If the match succeeds, variables are bound as needed.

{:ok, x} = {:ok, "not an integer"}
x
"not an integer"

If the match fails, a MatchError is raised.

try do
  {:ok, _x} = :error
rescue
  error -> Exception.format(:error, error)
end
"** (MatchError) no match of right hand side value: :error"

My goal for this experiment is to have matching fail when guards are not satisfied.

{:ok, x} when is_integer(x) = {:ok, "not an integer"}
#=> ** (MatchError) no match of right hand side value: {:ok, "not an integer"}

The match operator should still work as usual when guards return true.

{:ok, x} when is_integer(x) = {:ok, 23}
#=> {:ok, 23}
x
#=> 23

It should also fail as usual when the pattern doesn't match at all.

{:ok, x} when is_integer(x) = :error
#=> ** (MatchError) no match of right hand side value: :error

Plan

Often when I want to experiment with tweaking Elixir syntax I check for a function or macro in Kernel I can replace with my own version via import. The match operator, however, is Kernel.SpecialForms.=/2. Functions in Kernel.SpecialForms don't actually get called--the compiler replaces calls to them. That prevents me from making my own =/2 function and importing it instead of the built-in operator.

I thought my experiment was foiled, but did you notice the error logged by the compiler when I tried using a guard clause with the match operator?

error: undefined function when/2 (there is no such import)

It thinks I'm calling a function named when/2. That's not a replaced special form in this context! This is a good time to check out the expression structure using quote/2.

quote do
  {:ok, x} when is_integer(x) = {:ok, "not an integer"}
end
{:when, [],
 [
   {:ok, {:x, [], Elixir}},
   {:=, [],
    [
      {:is_integer, [context: Elixir, imports: [{1, Kernel}]], [{:x, [], Elixir}]},
      {:ok, "not an integer"}
    ]}
 ]}

I'll talk more about quoted expressions further down, but essentially that means my guarded match actually looks something more like this call under the hood.

when({:ok, x}, is_integer(x) = {:ok, "not an integer"})

The operator precedence there is not what I expected (it makes sense, though, when I think about using the match operator within another match that may be guarded), but that works out in my favor. I can define and import a when/2 macro that transforms the code in those parts to a form that actually works!

What does that transformed code look like? I'm not aware of a way to have the match operator fail on a guard--that's what I'm trying to change--but guards work fine in other match contexts. The simplest way to go about things is probably just to transform that code into a match on a case/2 expression like this:

{:ok, x} =
  case {:ok, "not an integer"} do
    {:ok, x} = term when is_integer(x) -> term
    term -> raise MatchError, term: term
  end

That seems like a good plan, so now I just need a when/2 macro that performs that transformation.

Macro Introspection Utility

Macros are special functions that instead of taking and returning values at runtime take and return code at compile time.

When working with quoted expressions, I often find it helpful to use Macro.to_string/1 and IO.puts/1 to better understand the code I'm manipulating. Usually I inline these in my macro to print the returned code.

quote do
  # macro implementation
end
# remove when I'm satisfied with macro
|> tap(fn quoted -> quoted |> Macro.to_string() |> IO.puts())

But for this experiment I'm repeating this enough I'll create a macro that takes an implementation module and a block, imports the implementation module, prints what the block expands to, runs it in a try/rescue block since I want the macro to raise a MatchError, and prints any variables bound in the block.

defmodule Introspection do
  defmacro introspect(module, do: block) do
    print_block(__CALLER__, module, block)

    quote do
      try do
        import unquote(module)

        binding_before = binding()
        result = unquote(block)
        binding_after = binding()

        IO.inspect(binding_after -- binding_before, label: "binding")
        result
      rescue
        error ->
          :error
          |> Exception.format(error)
          |> IO.puts()
      end
    end
  end

  defp print_block(env, module, block) do
    module = Macro.expand(module, env)

    block
    |> Macro.prewalk(&import_and_expand(module, &1))
    |> Macro.expand(env)
    |> Macro.to_string()
    |> IO.puts()
  end

  defp import_and_expand(module, quoted) do
    with {:when, context, args} <- quoted do
      context
      |> Keyword.put(:imports, [{2, module}])
      |> Keyword.put_new(:context, __MODULE__)
      |> then(&{:when, &1, args})
    end
  end
end
{:module, Introspection, <<70, 79, 82, 49, 0, 0, 13, ...>>, {:import_and_expand, 2}}

The first iteration of my macro will just return :ok (atoms are valid quoted expressions of themselves). There's a fun hurdle around defining a function named when/2. The parser tries interpreting the call as a guard to defining the left value as a function, so I have to include a guard to signal that the call to when/2 is actually a function I want defined.

defmodule When.OK do
  defmacro (_left when _right) when true do
    :ok
  end
end
{:module, When.OK, <<70, 79, 82, 49, 0, 0, 6, ...>>, {:when, 2}}

If I use that macro, it generates the code :ok (which prints and binds no variables).

import Introspection

introspect When.OK do
  {:ok, x} when is_integer(x) = {:ok, "not an integer"}
end
:ok
binding: []
:ok

At its most basic, calls to when/2 will be in the form left when {:=, _meta, [guard, right]}. Using quote/1 and unquote/1, I can structure these pieces in the target form.

defmodule When.Basic do
  defmacro (left when {:=, _meta, [guard, right]}) when true do
    quote do
      unquote(left) =
        case unquote(right) do
          unquote(left) = term when unquote(guard) -> term
          term -> raise MatchError, term: term
        end
    end
  end
end
{:module, When.Basic, <<70, 79, 82, 49, 0, 0, 8, ...>>, {:when, 2}}

If I try the new version of the macro, it prints the desired case/2 block and in this instance raises the desired exception.

introspect When.Basic do
  {:ok, x} when is_integer(x) = {:ok, "not an integer"}
end
{:ok, x} =
  case {:ok, "not an integer"} do
    {:ok, x} = term when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
** (MatchError) no match of right hand side value: {:ok, "not an integer"}
:ok

If the guard passes, however, the match succeeds and binds variables as expected.

introspect When.Basic do
  {:ok, x} when is_integer(x) = {:ok, 23}
end
{:ok, x} =
  case {:ok, 23} do
    {:ok, x} = term when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
binding: [x: 23]
{:ok, 23}

Multiple Guards

Matches can use when more than once, so I will want to support that too.

In the source quoted expression, left will be the first argument in the topmost when call, right will be the second argument in the = call, which will be the second argument in the deepest when call.

quote do
  left when a when b when c = right
end
{:when, [],
 [
   {:left, [], Elixir},
   {:when, [],
    [
      {:a, [], Elixir},
      {:when, [], [{:b, [], Elixir}, {:=, [], [{:c, [], Elixir}, {:right, [], Elixir}]}]}
    ]}
 ]}

The desired quoted expression looks much the same, a series of when calls where the first argument is a guard expression (except for the topmost where the first argument is left). The difference is that c = right is replaced with c.

[{:->, [], [[guards], _term]}] =
  quote do
    left when a when b when c -> term
  end

guards
{:when, [],
 [
   {:left, [], Elixir},
   {:when, [], [{:a, [], Elixir}, {:when, [], [{:b, [], Elixir}, {:c, [], Elixir}]}]}
 ]}

I will want to yank right out of that deepest level and pass it as the first argument to case/2. This seems like a great case for recursion. I want a function that walks through the when sequence:

  • If call is a when, make a recursive call on its second argument to get an extracted right and an updated guard chain. Return when call with first guard and updated guard chain.
  • If call is =, return right and guard.

Only the outermost when will call my when/2 macro. The others will be given to case/2 and treated as special forms.

defmodule When.MultiGuard do
  defmacro (left when guards_and_right) when true do
    {guards, right} = parse_guards_and_right(guards_and_right)

    quote do
      unquote(left) =
        case unquote(right) do
          term = unquote(left) when unquote(guards) -> term
          term -> raise MatchError, term: term
        end
    end
  end

  defp parse_guards_and_right(guards_and_right)

  defp parse_guards_and_right({:when, meta, [guard, guards_and_right]}) do
    {guards, right} = parse_guards_and_right(guards_and_right)
    {{:when, meta, [guard, guards]}, right}
  end

  defp parse_guards_and_right({:=, _meta, [guard, right]}) do
    {guard, right}
  end
end
{:module, When.MultiGuard, <<70, 79, 82, 49, 0, 0, 9, ...>>, {:parse_guards_and_right, 1}}

This new macro supports multiple guards, which allow a match as long as one guard succeeds.

introspect When.MultiGuard do
  {:ok, x} when is_integer(x) when is_binary(x) = {:ok, "some string"}
end
{:ok, x} =
  case {:ok, "some string"} do
    term = {:ok, x} when is_integer(x) when is_binary(x) -> term
    term -> raise MatchError, term: term
  end
binding: [x: "some string"]
{:ok, "some string"}

Matching still fails if none of the guards pass.

introspect When.MultiGuard do
  {:ok, x} when is_integer(x) when is_binary(x) = {:ok, []}
end
{:ok, x} =
  case {:ok, []} do
    term = {:ok, x} when is_integer(x) when is_binary(x) -> term
    term -> raise MatchError, term: term
  end
** (MatchError) no match of right hand side value: {:ok, []}
:ok

Matching also continues to fail for match errors beyond the guards.

introspect When.MultiGuard do
  {:ok, x} when is_integer(x) when is_binary(x) = :error
end
{:ok, x} =
  case :error do
    term = {:ok, x} when is_integer(x) when is_binary(x) -> term
    term -> raise MatchError, term: term
  end
** (MatchError) no match of right hand side value: :error
:ok

Unused Variables

The left match in the macro is used twice: once in the case/2 match and once matching on the result of case/2. Some bound variables may only be used in one or the other, and the compiler will try and warn about unused variables. The introspect/2 macro checks the bindings from the context outside of case, so I haven't been seeing these warnings yet. If the match contains variables not used in the guard, however, I'll get a warning because the variables are bound inside case/2 but not used in that context.

introspect When.MultiGuard do
  result = {:ok, x} when is_integer(x) = {:ok, 19}
end
(result = {:ok, x}) =
  case {:ok, 19} do
    term = result = {:ok, x} when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
warning: variable "result" is unused (if the variable is not meant to be used, prefix it with an underscore)

binding: [result: {:ok, 19}, x: 19]
{:ok, 19}

If I wanted to get really smart about things, I would traverse the guards to create a list of variables used there. Then for the match inside case/2 I would traverse left and replace any variables not used in guards with _. For the match outside case/2, I would mark the variables used in guards with generated: true, which suppresses warnings. Then any variables not used in guards would still need used in the context outside case/2 to avoid warnings.

For simplicity, though, I'll just tag every part of the quoted expression in left with generated: true.

defmodule When.UnusedVars do
  defmacro (left when guards_and_right) when true do
    left = Macro.prewalk(left, &tag_generated/1)
    {guards, right} = parse_guards_and_right(guards_and_right)

    quote do
      unquote(left) =
        case unquote(right) do
          term = unquote(left) when unquote(guards) -> term
          term -> raise MatchError, term: term
        end
    end
  end

  defp tag_generated(quoted) do
    with {name, meta, args_or_context} <- quoted do
      {name, Keyword.put(meta, :generated, true), args_or_context}
    end
  end

  defp parse_guards_and_right(guards_and_right)

  defp parse_guards_and_right({:when, meta, [guard, guards_and_right]}) do
    {guards, right} = parse_guards_and_right(guards_and_right)
    {{:when, meta, [guard, guards]}, right}
  end

  defp parse_guards_and_right({:=, _meta, [guard, right]}) do
    {guard, right}
  end
end
{:module, When.UnusedVars, <<70, 79, 82, 49, 0, 0, 11, ...>>, {:parse_guards_and_right, 1}}

Now variables can go unused.

introspect When.UnusedVars do
  result = {:ok, x} when is_integer(x) = {:ok, 19}
end
(result = {:ok, x}) =
  case {:ok, 19} do
    term = result = {:ok, x} when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
binding: [result: {:ok, 19}, x: 19]
{:ok, 19}

Conclusion

My goal was to add support for guard clauses to the match operator, with everything working as one might expect.

The latest version of when/2 can match on values when guards are satisfied.

alias When.UnusedVars, as: When

introspect When do
  {:ok, x} when is_integer(x) = {:ok, 15}
end
{:ok, x} =
  case {:ok, 15} do
    term = {:ok, x} when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
binding: [x: 15]
{:ok, 15}

It fails to match when the guards do not pass.

introspect When do
  {:ok, x} when is_integer(x) = {:ok, "not an integer"}
end
{:ok, x} =
  case {:ok, "not an integer"} do
    term = {:ok, x} when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
** (MatchError) no match of right hand side value: {:ok, "not an integer"}
:ok

It also rejects values that would fail a regular match.

introspect When do
  {:ok, x} when is_integer(x) = :error
end
{:ok, x} =
  case :error do
    term = {:ok, x} when is_integer(x) -> term
    term -> raise MatchError, term: term
  end
** (MatchError) no match of right hand side value: :error
:ok

In most cases when/2 seems to work as expected, but there could be consequences to replacing the match operator with a case expression, especially in macros expecting the match operator.

ExUnit.Assertions.assert/1, for example, prints both sides of a failed match for an improved experience. Guarded matches using when/2 instead raise MatchError on the right side of the match instead of failing the match itself, so they won't get the improved experience.

ExUnit.start(autorun: false, seed: 0)

defmodule WhenTest do
  use ExUnit.Case, async: true
  import When

  test "match failure without guard" do
    assert {:ok, _x} = :error
  end

  test "match failure with guard" do
    assert {:ok, x} when is_integer(x) = :error
  end
end

ExUnit.run()


  1) test match failure without guard (WhenTest)
     match (=) failed
     code:  assert {:ok, _x} = :error
     left:  {:ok, _x}
     right: :error



  2) test match failure with guard (WhenTest)
     ** (MatchError) no match of right hand side value: :error


Finished in 0.00 seconds (0.00s async, 0.00s sync)
2 tests, 2 failures

Randomized with seed 0
%{total: 2, failures: 2, excluded: 0, skipped: 0}

My when/2 macro could be improved by having the case/2 only raise MatchError if left matches but doesn't pass the guards. Then any matching that could still be done by the match operator would be.

For testing, specifically, I could also make an assert/2 macro that wraps ExUnit.Assertions.assert/2 and breaks guarded matches into multiple asserts (assert {:ok, x} = {:ok, "not an integer"}, assert is_integer(x)), but you would have to do something similar anyplace a macro expected a match specifically.

I had also intended to show some of my process for approaching difficult quoted expressions, but the writeup isn't exactly chronological. I made plenty of mistakes along the way and amended the livebook as needed.

Overall, though, I'm pretty happy with the macro. It's fairly simple, but (in my opinion) it feels like a natural part of the language.