Sunday, August 23, 2015

Evangelizing clojure

Since I've started my new position where I get to work in clojure, I've been itching to try to get others at my workplace to see where clojure would be useful.  Currently, my workplace is a python, java, C and shell shop (with a smattering of ruby here and there).  I'm one of the few engineers here who gets to work with clojure.

And that's sad.  Unfortunately, a common fear for most companies is the difficulty in finding engineers who are proficient in a certain technology stack.  I quite frankly find that a rather lame excuse.  Any engineer worth their salt should be able to learn a new language.  And in fact, I'd rather hire engineers who have a mind curious enough to learn a not-hot language with a very different paradigm.  If management's concern is that they want an engineer to "hit the ground running", I think they are sacrificing long term benefits for short term gains.  I used to work at a company that decided to use perl for all its scripting efforts, and they wound up having many perl "camps" where engineers spent an entire week going through intensive perl training.  If there are companies that make a living teaching new languages, why not take advantage of that?  So when I hear managers claim that the lack of engineers with the skill to program in clojure is a detriment, I find that a weak excuse.

There's also the odd paradox that some companies don't seem to be so concerned about other new "hot" languages like Go or Swift.  Perhaps it's because those two languages are backed by the giants Google and Apple respectively, and so therefore, they must have gotten something right.  Personally, having had a cursory glance at Go and Swift, I've found nothing particularly outstanding about them compared to other new languages without the hotness (clojure, elixir, rust, elm or julia for example).

So what can we clojurians do to help others understand where clojure could be a viable alternative?  I think we need to do several things:

  • Point out how language X has certain weaknesses that could be resolved with clojure
  • Point out how clojure can live synergistically with a Java ecosystem
  • Help train and educate others that lisps aren't as scary/gross as they think
  • Get people familiar with the tools and ecosystem of clojure

For example, I hope to release a set of tutorials to help compare and contrast how clojure could solve problems more elegantly than python.  It would cover things like how to write highly concurrent programs in comparison with python and how immutability can help make more robust programs.  I'd also show how python decorators, which are sometimes compared to lisp macros, fail to deliver the same power as a lisp-style macro.
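
As a small taste of the immutability and concurrency points, here is a minimal sketch (not one of the planned tutorials).  The names base, extended and counter are made up for illustration, and I'm using plain Java threads just to keep the example self-contained.

```clojure
;; Illustrative sketch only; `base`, `extended` and `counter` are hypothetical names.

;; Immutable data: "adding" to a vector returns a new vector
;; and leaves the original untouched.
(def base [1 2 3])
(def extended (conj base 4))

;; Shared state lives in an atom; swap! applies a pure function atomically,
;; retrying if another thread got there first.
(def counter (atom 0))

;; Ten plain Java threads, each incrementing the counter 1000 times.
(let [threads (doall (for [_ (range 10)]
                       (Thread. (fn [] (dotimes [_ 1000] (swap! counter inc))))))]
  (doseq [t threads] (.start t))
  (doseq [t threads] (.join t)))   ; wait for all of them to finish

(println base)       ; [1 2 3]
(println extended)   ; [1 2 3 4]
(println @counter)   ; 10000
```

No locks are needed here because the only mutable thing is the atom, and every update to it goes through swap!.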

Another topic that I don't see discussed too much is how to integrate clojure with legacy java projects.  I'd like to create some articles talking about how to use TestNG with clojure, how gen-class really works, and how to plug clojure into a maven or gradle based project.  I'd also like to give more examples of how to use java interop constructs, including defprotocol and proxy, and how to use them to bridge java and clojure.
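
To give a flavor of the interop side, here is a minimal sketch of proxy implementing a Java interface so that clojure code can be handed to a plain Java API.  The names by-length and words are hypothetical, and this only scratches the surface (gen-class and the build tool integration are a different story).

```clojure
;; Sketch: handing a clojure-defined object to a Java API via proxy.
;; `by-length` and `words` are hypothetical names used for illustration.
(import '(java.util ArrayList Collections Comparator))

;; proxy builds an anonymous object that implements the Comparator interface.
(def by-length
  (proxy [Comparator] []
    (compare [a b]
      (Integer/compare (count a) (count b)))))

;; A mutable Java list, sorted in place by the Java Collections API
;; using our clojure-defined comparator.
(def words (ArrayList. ["ccc" "a" "bb"]))
(Collections/sort words by-length)

(println (vec words))   ; [a bb ccc]
```

For a single interface like this, reify would also work; proxy is the one to reach for when you need to extend a concrete Java class.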

Another hindrance is, IMHO, purely psychological.  I find people's first reactions to lisp syntax somewhat amusing.  It's such an immediate and almost visceral reaction that I have to truly wonder why lisp syntax is so (initially) despised by so many.  Is it perhaps because there is a relationship between lisp syntax and XML (and people hate XML)?  I remember my first reaction to lisp in college and I was just like "whoaaaa".  But I also remember my first reaction to python's syntax where white space mattered was like, "who the hell thought having white space matter was a good thing!!".  But after about 3 weeks, I didn't even notice it anymore.  And the same thing happened with me with clojure.  But how do you get people to even try clojure for 3-4 weeks?

Finally, another big barrier for people coming to clojure is the tools and ecosystem.  For starters, a large chunk of tutorials and videos you will see online use emacs + CIDER as the IDE.  I basically started learning emacs about 3 years ago in order to do clojure.  Now, I'm an older guy, so I'm not afraid of basic text editors unlike some young whipper-snappers who seem to be at a loss without a full fledged IDE.  Now for Java programming I do enjoy something like IntelliJ or Eclipse, but emacs is a pretty cool IDE for clojure.  While there is a plugin for vim and clojure, the majority of the community does work with emacs.  There's another interesting IDE called cursive which is supposed to have the ability to debug both clojure and java code which would come in handy.

Beyond the IDE, there are build tools, so coming to grips with leiningen and perhaps boot would be useful.  Also, while a background in Java isn't absolutely necessary to learn clojure, it will definitely help (the same is true of javascript if you want to learn clojurescript).  So some familiarity with the underlying runtime (the JVM or javascript engine) will go a long way to making you a better clojurian.

Figuring out Clojure vars vs. symbols

Although I realized that there was some kind of difference between a clojure var and a symbol, I hadn't really considered what the difference was.  To make matters worse, I just considered a clojure symbol as symbols are usually considered in other languages.  In other words, I just considered a symbol to be an object or reference to something that could be used by the program.  However, symbols have their own special meaning in clojure.

So, what exactly is a var?  When I first was learning clojure, I kept reading on websites and in the books I had that clojure doesn't have variables.  Instead, they have vars and bindings.  Well, ok, but what in the world does that mean, and how do vars differ from variables?  Furthermore,  I pretty much had assumed vars and symbols were (almost) the same thing.  For example, if I have:

(def foo 10)

Ok, so foo kind of looks like what other programming languages would call a variable.  But if foo isn't a variable, what is it?  It's a var right?

Hold on partner, we have to consider how we are looking at the thing called foo.  In the line above, yes, foo is a var.  But if i just type foo at the repl, what is it?  Or what is 'foo, or #'foo?

Let's step back for a moment and consider what Rich has wanted clojure to do.  Clojure is a language that dearly wants to separate identity, state and values.  Identity is what names a thing, state is a value at a moment in time, and values are...well, values :)  In Python, if I do this:

bar = 10

Then bar is a variable which has the value of 10.  However, it has not separated the notion of identity, state and value.  Identity, state and value are all commingled in the variable bar.

So back in clojure land, how we look at foo depends on how it is being evaluated.  Put simply, foo (by itself) is a symbol which can be used to look up a var.  In this example, foo is our identity.  So you might now be wondering what the 10 is, as that obviously seems to be a value.  Values have to be stored somewhere, and the var is what actually holds some value.
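
A quick sketch of those three views of foo, which you can paste into a repl.  Nothing here is new machinery: ' quotes the symbol, #' is reader shorthand for (var foo), and @ dereferences.

```clojure
(def foo 10)

;; The same name, looked at three different ways.
(println (type 'foo))    ; clojure.lang.Symbol  (the quoted symbol itself)
(println (type #'foo))   ; clojure.lang.Var     (the var the symbol names)
(println (type foo))     ; java.lang.Long       (the value the var holds)

;; Dereferencing the var gives back the value.
(println @#'foo)         ; 10
```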

Normally we think of foo as neither a symbol nor a var, but a value.  In other words, I could just mentally replace the value of 10 wherever I see foo.  But wait kimosabe, you are forgetting about clojure's macros, but I am getting ahead of myself.   If I just type foo in the repl, I get its value back which is 10.

foo
10

(type foo)
java.lang.Long

Okay, so it seems like for all intents and purposes the symbol foo _is_ 10.  But is it?  What does the documentation say about def anyways?

boot.user=> (doc def)
-------------------------
def
  (def symbol doc-string? init?)
Special Form
  Creates and interns a global var with the name
  of symbol in the current namespace (*ns*) or locates such a var if
  it already exists.  If init is supplied, it is evaluated, and the
  root binding of the var is set to the resulting value.  If init is
  not supplied, the root binding of the var is unaffected.

  Please see http://clojure.org/special_forms#def
nil

Hmmm, so (def foo 10) interns a var in the current namespace with the name of the symbol.   Have you wondered if def returns anything?

(println (def x 100))
#'boot.user/x
nil

Ah, so def returns the var itself.  The definition says that a var with the name of the symbol is created by a def.  Ok, is a symbol just a lookup name?  Where does it fit into the picture?  Consider this:

(symbol "foo")
(type (symbol "foo"))

What does that return?  It returns....gasp....a symbol :)  But what good is that?  It doesn't actually return 10.  Why not?  To get the value (the var contains) that foo represents, we could do something like this:

(eval (symbol "foo"))

But let's try another thought experiment to help illuminate the difference between vars, symbols and values.  Consider what this returns before trying this in the repl:

(var (symbol "foo"))

If you did try that in the repl, you'll notice that it threw an exception...how rude!!

clojure.lang.Compiler$CompilerException: java.lang.ClassCastException: clojure.lang.PersistentList cannot be cast to clojure.lang.Symbol, compiling:(/tmp/boot.user2720475669809682962.clj:1:1)
           java.lang.ClassCastException: clojure.lang.PersistentList cannot be cast to clojure.lang.Symbol

Hmmm, so it looks like var is being handed the unevaluated form (symbol "foo"), which is a list, and not the symbol that (symbol "foo") evaluates to.  Ok, let's try this:

(defmacro huh [var-string]
  `(let [x# (-> ~(symbol var-string) var)]
     x#))

So why did I have to make a macro?  var is a special form, so it doesn't eagerly evaluate the arg that gets passed into it.  By the way, try doing (-> (symbol "foo") var) and see what happens (and you'll see why I needed a macro).  If you look at the documentation for var, it says that it returns the var (not the value) of a symbol.  You can see that by doing this:

(type (huh "foo"))
clojure.lang.Var

So remember what we've done here.  By having (symbol "foo") we are creating a symbol.  This object does not evaluate to 10.  In fact, neither does getting the var which is pointed to by the symbol foo.  In order to actually get the value of the var object, we need to dereference it.  Let's make a small change to our macro:

(defmacro huh [var-string]
  `(let [x# (-> ~(symbol var-string) var)]
     @x#)) ;; dereference the var to get its value

(huh "foo")
10
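
As an aside, the macro is only needed because var is a special form.  If all you want is to go from a symbol built at runtime to its var, an ordinary function such as resolve can do the lookup.  This is a sketch of that alternative, not a replacement for the exploration above; foo-var is a hypothetical name.

```clojure
(def foo 10)

;; resolve is an ordinary function, so (symbol "foo") is evaluated first
;; and the resulting symbol is looked up in the current namespace (*ns*).
(def foo-var (resolve (symbol "foo")))

(println (type foo-var))      ; clojure.lang.Var
(println @foo-var)            ; 10
(println (var-get foo-var))   ; 10, same thing without the reader shorthand
```

One caveat: resolve looks in whatever *ns* is bound to when it runs, so inside library code you'd probably want ns-resolve with an explicit namespace.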

So why bother making a distinction between symbols and vars?  I mean, wouldn't it be simpler to just have the symbol directly reference the value?  Why have this 2-level look up system of symbol -> var -> value?  Recall what I said earlier about maintaining a distinction between identity, state, and value.  Another answer is to think about macros and macro expansion time vs. compile time.  Here's another exploration:

(doseq [e '(def foo 10)]
  (println e "is a" (type e)))

def is a clojure.lang.Symbol
foo is a clojure.lang.Symbol
10 is a java.lang.Long
nil


Ahhhh, so when the reader looks at (def foo 10), foo is a symbol.   By having a var looked up by a symbol, and then the value retrieved from the var, we can delay actually getting the value....by retrieving the var instead.  Also, consider how many times clojure wants the symbol of a thing, rather than its value.  Furthermore, some clojure functions want the var itself rather than the value.  For example:

(defn ^{:version "1.0"} doubler 
  [x]
  (* x 2))

;(meta doubler)     ;; Wrong, the metadata doesn't belong to the doubler function, but to the var itself
(meta #'doubler)   ;; equivalent to (meta (var doubler))
{:version "1.0", :arglists ([x]), :line 1, :column 1, :file "/tmp/boot.user2720475669809682962.clj", :name doubler, :ns #object[clojure.lang.Namespace 0x2d471d43 "boot.user"]}
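
The macro point from earlier can be sketched the same way.  A macro receives its arguments as unevaluated forms, so it sees the symbol foo rather than 10.  The macro arg-class-name below is a hypothetical helper made up for this illustration.

```clojure
;; Hypothetical helper macro, for illustration only.
;; It runs at macro expansion time and expands to a string naming
;; the class of the unevaluated form it was handed.
(defmacro arg-class-name [form]
  (.getName (class form)))

(def foo 10)

(println (arg-class-name foo))         ; clojure.lang.Symbol
(println (arg-class-name 10))          ; java.lang.Long
(println (arg-class-name (+ foo 1)))   ; clojure.lang.PersistentList

;; A function, by contrast, only ever sees the evaluated value.
(println (.getName (class foo)))       ; java.lang.Long
```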


Another example is when we require or import from within the repl.  When you call require or import from the repl (as opposed to using the :require or :import directives in the ns macro), you have to pass quoted symbols (or quoted sequences of symbols) which name namespaces or classes on the classpath.
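
Here is roughly what that looks like at the repl.  The alias str is just my choice for the example.

```clojure
;; At the repl, require and import are ordinary calls, so their arguments
;; are evaluated.  Quoting hands them the symbols themselves.
(require '[clojure.string :as str])
(import '(java.util Date))

(println (str/upper-case "clojure"))   ; CLOJURE
(println (instance? Date (Date.)))     ; true

;; Inside the ns macro no quoting is needed, because the macro
;; receives the forms unevaluated:
;; (ns my.app
;;   (:require [clojure.string :as str])
;;   (:import (java.util Date)))
```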

Finally, remember that vars (the dynamic ones, at least) can have thread-local bindings.  That's why symbols shouldn't just point to values, as you may want to give another thread some other binding value.
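
A minimal sketch of a thread-local binding, assuming a dynamic var I've called *level* (a made-up name).  I'm using a plain Java thread on purpose, since futures and agents carry the caller's bindings along with them.

```clojure
;; `*level*` and `seen-in-other-thread` are hypothetical names for illustration.
(def ^:dynamic *level* :root)

(def seen-in-other-thread (atom nil))

(binding [*level* :rebound]
  ;; A plain Java thread does not inherit this thread's bindings,
  ;; so inside it the var still has its root value.
  (let [t (Thread. (fn [] (reset! seen-in-other-thread *level*)))]
    (.start t)
    (.join t))
  (println *level*))              ; :rebound (this thread's binding)

(println @seen-in-other-thread)   ; :root (what the other thread saw)
(println *level*)                 ; :root (the binding is gone outside the form)
```

The symbol *level* is the same everywhere; it's the var behind it that can carry a different binding per thread.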

I hope this makes the differences between vars and symbols a little more clear.