Monday, September 7, 2015

How to evaluate forms given to a Clojure macro without throwing an exception

OK, I probably shouldn't admit this, but it took me the better part of 2 days of straight coding to come up with a macro that I wanted.  In a nutshell, I wanted to be able to call a sequence of functions and collect the results, even if one of those functions would throw an exception.  For example, something like this:

(let [x 2
      y 0]
  (try+
    (* x 2)
    (* x y)
    (+ 9 y)
    (/ 1 y)))

Do you see why I needed a macro for the try+?  What if I had tried to write it as a function?  Since Clojure is an eager language by default, it evaluates the arguments first and then passes the results to the function being called.  If evaluating one of those argument expressions throws an exception, the function never gets called at all, and none of the argument expressions to the right of the offending one get evaluated either.  One way around that is to quote each function call and evaluate it inside the function:

(defn awkward-try [& fncalls]
  (for [fnc fncalls]
    (try
      (eval fnc)
      (catch Exception ex ex))))

(awkward-try
  '(* 2 2)
  '(/ 1 0))
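
For comparison, here is a small sketch of both behaviors side by side. The names collect, eager-outcome and quoted-outcome are made up for this illustration, and doall is only there to force the lazy for so the results can be inspected.

```clojure
;; A plain function: every argument is evaluated before the call happens.
(defn collect [& results]
  (vec results))

(def eager-outcome
  (try
    (collect (* 2 2) (/ 1 0) (+ 9 0))
    (catch ArithmeticException _
      ;; (/ 1 0) threw while the arguments were being evaluated,
      ;; so collect itself never ran and (+ 9 0) was never reached.
      :collect-was-never-called)))

;; The quoted version: each form is evaluated inside its own try.
(defn awkward-try [& fncalls]
  (for [fnc fncalls]
    (try
      (eval fnc)
      (catch Exception ex ex))))

;; A seq of 4, the caught exception, and 9.
(def quoted-outcome
  (doall (awkward-try '(* 2 2) '(/ 1 0) '(+ 9 0))))
```

With the quoted version the third form still gets evaluated, which is the behavior I was after.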

However, the user shouldn't have to quote every call, although getting rid of that requirement turned out to be quite a bit more difficult.  Before I show you the working solution, I'll show a failed attempt, because sometimes it's just as useful to show something you thought would work but didn't.

So an early attempt I made was similar to the awkward-try function above and it looked like this:

(defmacro try++
  "Takes a body of function calls and calls them lazily.  If a function throws
   an exception, don't propagate it.  Collect the exception in the
   results"
  [& body]
  `(for [arg# '~body]
     (try
       (eval arg#)
       (catch Exception ex#
         (println "caught exception")
         ex#))))

And if you try this, it seems to work:

(try++
  (* 2 2)
  (/ 1 0))
caught exception
=> (4 #error {
 :cause "Divide by zero"
 :via
 [{:type java.lang.ArithmeticException
   :message "Divide by zero"
   :at [clojure.lang.Numbers divide "Numbers.java" 158]}]
 :trace
 ...)


The problem is when you try to use let bound symbols:

(let [x 2
      y 0]
  (try++
    (* 2 x)
    (/ 1 y)))
caught exception
caught exception
=> (#error {
 :cause "Unable to resolve symbol: x in this context"
 :via
 [{:type clojure.lang.Compiler$CompilerException
   :message "java.lang.RuntimeException: Unable to resolve symbol: x in this context, compiling:(/home/stoner/.IdeaIC14/system/tmp/form-init96648934167159599.clj:4:5)"
   :at [clojure.lang.Compiler analyze "Compiler.java" 6543]}
  {:type java.lang.RuntimeException
   :message "Unable to resolve symbol: x in this context"
   :at [clojure.lang.Util runtimeException "Util.java" 221]}]
 :trace
...
 } #error {
 :cause "Unable to resolve symbol: y in this context"
 :via
 [{:type clojure.lang.Compiler$CompilerException
   :message "java.lang.RuntimeException: Unable to resolve symbol: y in this context, compiling:(/home/stoner/.IdeaIC14/system/tmp/form-init96648934167159599.clj:5:5)"
   :at [clojure.lang.Compiler analyze "Compiler.java" 6543]}
  {:type java.lang.RuntimeException
   :message "Unable to resolve symbol: y in this context"
   :at [clojure.lang.Util runtimeException "Util.java" 221]}]
 :trace
 ...)

Hmmm, so what's all this stuff about not being able to resolve the symbols x and y when they are let-bound?  The key is understanding what the macro's arguments look like at macroexpansion time.  If you notice, I have a somewhat strange '~body in the for expression.  First off, it wasn't even clear to me what was in the body symbol once it was evaluated.  I couldn't just do ~body because the whole point of the macro was to avoid evaluating the body!  But I did need to pull the elements out.

I also couldn't use ~@body, because that would produce the wrong form in a for expression.  Like let, loop, doseq, and binding, the for macro takes one or more binding pairs.  If I had done an unquote-splice, it would have looked something like this when expanded:

(for [arg# (* 2 2) (/ 1 0)]
  ...)

Which is not the right form.  So I thought, OK, let me try '~body, which I expected to return what body represented (including any substitutions) without actually evaluating it, because it would be quoted.  I thought the result would look like this:

(for [arg#  '((* 2 2) (/ 1 0))]
  ... )

But that's not what happens, and what you really get is:

(for [arg# '((* 2 x) (/ 1 y))]
  ... )

And that is why the Clojure compiler complains that it doesn't know what the symbols x and y are.  So OK, that explains the unknown symbol problem, but why did that happen?  Why didn't it substitute the value of 2 for x and 0 for y?  I honestly am not sure of the answer to that question.  Also, how can I substitute all symbols within each s-expression in the body of the macro one by one if I can't do ~body, ~@body or '~body?
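
A likely answer, sketched here as a REPL check rather than as a definitive account: a macro receives its arguments as unevaluated forms, so it only ever sees the symbol x, never the value 2. On top of that, eval compiles the form it is given on its own, without access to the locals of the surrounding let. The helper names show-form, in-place and local-only below are hypothetical and exist only for this illustration.

```clojure
;; Hypothetical helper: returns the form it was handed, unevaluated.
(defmacro show-form [form]
  `'~form)

;; The macro sees the list (* 2 local-only): a symbol, not the value 2.
(def seen-by-macro
  (let [local-only 2]
    (show-form (* 2 local-only))))

;; eval has no view of the let, so the symbol cannot be resolved.
(def eval-outcome
  (let [local-only 2]
    (try
      (eval '(* 2 local-only))
      (catch Exception _
        :unable-to-resolve))))

;; Hypothetical helper: emits the form into the expansion as ordinary code.
;; The emitted code sits inside the let, so the local resolves normally.
(defmacro in-place [form]
  form)

(def in-place-outcome
  (let [local-only 2]
    (in-place (* 2 local-only))))
```

Here eval-outcome ends up as :unable-to-resolve while in-place-outcome is 4, which suggests the fix is to splice each form into the expansion instead of quoting it and calling eval.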

After a lot of trial and error, I finally decided to try a different tack, and I looked at Clojure's or macro.  I saw that it just did a simple (stack-consuming) self-recursion based on different arities.  I realized I could do this too, but I wanted to save the results of calling each function.  It took me a little while to realize that, once again, lazy-seq is your friend.
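
The arity-based recursion looks roughly like the sketch below. my-or is a hypothetical name; the shape mirrors clojure.core/or, but treat it as an illustration and not as the library source.

```clojure
;; Hypothetical my-or: each arity handles one case, and the variadic
;; arity peels off one form and recurses on the rest at expansion time.
(defmacro my-or
  ([] nil)
  ([x] x)
  ([x & more]
   `(let [v# ~x]
      (if v# v# (my-or ~@more)))))

;; The forms are spliced in unevaluated, so (/ 1 0) is never run here.
(def first-truthy (my-or nil false 3 (/ 1 0)))
```

Because each form is only emitted into the expansion, evaluation stops at the first truthy value and first-truthy is 3.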

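Here's the code I finally came up with that works (it logs through timbre/info, so it assumes a logging library such as taoensso.timbre is required under the alias timbre):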

163 (defmacro wrap
164   "Takes a function call and surrounds it with a try catch.  Logs the function name
165    and the args supplied to the function"
166   [head]
167   `(let [fnname# (first '~head)
168          args# (rest (list ~@head))]
169      (timbre/info "evaluating function:" fnname# ", args:" args#)
170      (try
171        ~head
172        (catch Exception ex#
173          [{:name fnname# :args args# :ex ex#}]))))
174 
175 
176 (defmacro try+
177   ([head]
178    [`(wrap ~head)])
179   ([head & tail]
180    `(lazy-seq
181       (cons
182        (wrap ~head)
183        (try+ ~@tail)))))
184 

The wrap macro is really just a helper macro to help print out what is getting called.  It takes one of the forms from body.  In the let example above, head is the form (* 2 x) on the first expansion.  Notice that x is not replaced by 2 here: a macro only ever sees the symbols.  What ~head on lines 178 and 182 does is splice that form, unevaluated, into the expanded code.  Because the expansion sits inside the let, x and y are resolved as ordinary locals when the code finally runs, with no eval involved.  Recall that with macros, an expression is not eagerly evaluated automatically.  So what happens here is:

(wrap (* 2 x))

But since wrap is itself a macro, (* 2 x) does not get evaluated yet; it only runs inside the try that wrap emits.  That's why the next form, (/ 1 y), does not throw an exception as soon as wrap is expanded.  One caveat: line 168 also evaluates the arguments on their own, outside the try, in order to log them, so an exception thrown by a nested argument expression would still escape.  Otherwise, try+ just uses its [head & tail] argument vector to split the forms submitted to it into a head and a tail.
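
To check this behavior without pulling in a logging library, here is a stripped-down sketch of the same idea. wrap* and try* are hypothetical names, the logging is dropped, and the failing form is recorded under a made-up :form key.

```clojure
;; Hypothetical, logging-free variant of wrap: the form is spliced in
;; unevaluated and only runs inside the try.
(defmacro wrap*
  [head]
  `(try
     ~head
     (catch Exception ex#
       {:form '~head :ex ex#})))

;; Same arity-based recursion as try+, building a lazy seq of results.
(defmacro try*
  ([head]
   [`(wrap* ~head)])
  ([head & tail]
   `(lazy-seq
      (cons
        (wrap* ~head)
        (try* ~@tail)))))

;; x and y resolve as ordinary locals because the expansion is inside the let.
(def results
  (let [x 2
        y 0]
    (doall
      (try* (* x 2)
            (* x y)
            (+ 9 y)
            (/ 1 y)))))
```

The first three results are 4, 0 and 9, and the last one is a map holding the form (/ 1 y) together with the ArithmeticException it threw.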

One more note: it's a little tricky to figure out when and where to start the syntax quoting.  For example, in one of my earlier attempts, I did not put the ` syntax quote literal on line 180, but on 181 instead.  What I noticed was that the macro would not evaluate lazily: try+ would evaluate all the forms given to it in one shot.  I believe the reason is that macros have a macroexpansion time.  Because I did not syntax quote the entire lazy-seq form, lazy-seq itself ran at macroexpansion time instead of being emitted into the expanded code, so what was left for run time was a plain, eager (cons ...) expression.
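
The effect of the backtick position can be reproduced with two tiny macros. This is a sketch under the assumption that the compiler accepts a lazy seq as a macro's return value and treats its contents as the expansion; lazy-pair, eager-pair, note and calls are hypothetical names.

```clojure
;; Records which forms have actually been evaluated so far.
(def calls (atom []))

(defn note [tag value]
  (swap! calls conj tag)
  value)

;; Backtick outside lazy-seq: the expansion contains the lazy-seq call.
(defmacro lazy-pair [a b]
  `(lazy-seq (cons ~a [~b])))

;; Backtick inside lazy-seq: lazy-seq runs while the macro is expanding,
;; and the expansion is just the plain (cons ...) form.
(defmacro eager-pair [a b]
  (lazy-seq `(cons ~a [~b])))

(def lazy-result (lazy-pair (note :a 1) (note :b 2)))
(def calls-after-lazy @calls)

(reset! calls [])
(def eager-result (eager-pair (note :a 1) (note :b 2)))
(def calls-after-eager @calls)
```

With lazy-pair nothing is recorded until the seq is realized, while eager-pair records both :a and :b as soon as the def runs.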

Another gotcha I noticed is that lazy-seq either wants to go on infinitely or, if the recursion is finite, the last recursive call must return something seqable (a collection or nil).  If you notice, line 178 returns a vector.  I needed that because the end of the recursion has to return something that cons can build a seq on.  That's why you normally see a pattern like:

...
(lazy-seq
  (if some-pred?
    (cons x (foo y))
    []))

Since cons takes (element, collection) as its args, the 2nd arg to cons should be some kind of collection.  If you see an error like:

IllegalArgumentException Don't know how to create ISeq from: java.lang.Long

Then you are probably trying to cons an element onto a scalar (a Long, for example) instead of onto a collection.
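
Both points can be tried out with a minimal sketch. countdown is a hypothetical function that follows the termination pattern above, and cons-error captures the message you get when the second argument to cons is a number.

```clojure
;; Finite recursion inside lazy-seq: the base case returns an empty
;; collection, so the last cons always has something seqable to build on.
(defn countdown [n]
  (lazy-seq
    (if (pos? n)
      (cons n (countdown (dec n)))
      [])))

(def finished (doall (countdown 3)))

;; The second argument to cons is a Long here, not a collection.
(def cons-error
  (try
    (cons 1 2)
    (catch IllegalArgumentException ex
      (.getMessage ex))))
```

finished is the seq (3 2 1), and cons-error holds the same "Don't know how to create ISeq from: java.lang.Long" message shown above.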