Sunday, August 23, 2015

Figuring out Clojure vars vs. symbols

Although I knew there was some kind of difference between a Clojure var and a symbol, I hadn't really considered what that difference was. To make matters worse, I thought of a Clojure symbol the way symbols are usually thought of in other languages. In other words, I considered a symbol to be a name for an object or reference that the program could use. However, symbols have their own special meaning in Clojure.

So, what exactly is a var? When I was first learning Clojure, I kept reading on websites and in the books I had that Clojure doesn't have variables. Instead, it has vars and bindings. Well, ok, but what in the world does that mean, and how do vars differ from variables? Furthermore, I had pretty much assumed vars and symbols were (almost) the same thing. For example, if I have:

(def foo 10)

Ok, so foo kind of looks like what other programming languages would call a variable.  But if foo isn't a variable, what is it?  It's a var right?

Hold on partner, we have to consider how we are looking at the thing called foo. In the line above, yes, foo is a var. But if I just type foo at the REPL, what is it? Or what is 'foo, or #'foo?

Let's step back for a moment and consider what Rich Hickey wanted Clojure to do. Clojure is a language that dearly wants to separate identity, state and values. Identity is what names a thing, state is a value at a moment in time, and values are...well, values :) In Python, if I do this:

bar = [10]

Then bar is a variable which holds a list containing 10. However, it has not separated the notion of identity, state and value. Identity, state and value are all commingled in the variable bar.

So back in Clojure land, how we look at foo depends on how it is being evaluated. Put simply, foo (by itself) is a symbol which can be used to look up a var. In this example, foo is our identity. So you might now be wondering what the 10 is, as that obviously seems to be a value. Values have to be stored somewhere, and the var is what actually holds the value.
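
Here is a small sketch of those different ways of looking at the same name. It assumes only the (def foo 10) from above and core Clojure; the comments show what I would expect at the REPL.

```clojure
(def foo 10)

;; Quoting stops evaluation, so we get the symbol itself.
(type 'foo)          ;; clojure.lang.Symbol

;; #'foo is reader shorthand for (var foo): the var that the symbol names.
(type #'foo)         ;; clojure.lang.Var
(= #'foo (var foo))  ;; true

;; The var is the box; dereferencing it gives the value inside.
(deref #'foo)        ;; 10
```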

Normally we think of foo as neither a symbol nor a var, but a value. In other words, I could just mentally substitute the value 10 wherever I see foo. But wait kimosabe, you are forgetting about Clojure's macros. I am getting ahead of myself, though. If I just type foo in the REPL, I get its value back, which is 10.

foo
10

(type foo)
java.lang.Long

Okay, so it seems like for all intents and purposes the symbol foo _is_ 10.  But is it?  What does the documentation say about def anyways?

boot.user=> (doc def)
-------------------------
def
  (def symbol doc-string? init?)
Special Form
  Creates and interns a global var with the name
  of symbol in the current namespace (*ns*) or locates such a var if
  it already exists.  If init is supplied, it is evaluated, and the
  root binding of the var is set to the resulting value.  If init is
  not supplied, the root binding of the var is unaffected.

  Please see http://clojure.org/special_forms#def
nil

Hmmm, so (def foo 10) interns a var in the current namespace with the name of the symbol.   Have you wondered if def returns anything?

(println (def x 100))
#'boot.user/x

Ah, so def returns the var itself.  The definition says that a var with the name of the symbol is created by a def.  Ok, is a symbol just a lookup name?  Where does it fit into the picture?  Consider this:
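
To poke at that a bit more, here is a sketch that holds on to what def returns. The name returned is just something I made up for illustration; everything else is core Clojure.

```clojure
;; Keep whatever the inner def hands back.
(def returned (def x 100))

(var? returned)           ;; true, def gave us a var
(= returned #'x)          ;; true, the very var that the symbol x names
(:name (meta returned))   ;; x, the var remembers the symbol it was named with
(deref returned)          ;; 100
```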

(symbol "foo")
(type (symbol "foo"))

What does that return? It returns....gasp....a symbol :) But what good is that? It doesn't actually return 10. Why not? To get the value held by the var that foo names, we could do something like this:

(eval (symbol "foo"))

But let's try another thought experiment to help illuminate the difference between vars, symbols and values.  Consider what this returns before trying this in the repl:

(var (symbol "foo"))

If you did try that in the repl, you'll notice that threw an exception...how rude!!

clojure.lang.Compiler$CompilerException: java.lang.ClassCastException: clojure.lang.PersistentList cannot be cast to clojure.lang.Symbol, compiling:(/tmp/boot.user2720475669809682962.clj:1:1)
           java.lang.ClassCastException: clojure.lang.PersistentList cannot be cast to clojure.lang.Symbol

Hmmm, so it looks like var is being handed the form (symbol "foo") itself, which is a list, rather than the symbol that (symbol "foo") evaluates to. Ok, let's try this:

(defmacro huh [var-string]
  `(let [x# (-> ~(symbol var-string) var)]
     x#))

So why did I have to make a macro? var is a special form, so it doesn't evaluate the arg that gets passed into it; it expects a literal symbol. By the way, try doing (-> (symbol "foo") var) and see what happens (and you'll see why I needed a macro). If you look at the documentation for var, it says that it returns the var (not the value) named by a symbol. You can see that by doing this:

(type (huh "foo"))
clojure.lang.Var

So remember what we've done here.  By having (symbol "foo") we are creating a symbol.  This object does not evaluate to 10.  In fact, neither does getting the var which is pointed to by the symbol foo.  In order to actually get the value of the var object, we need to dereference it.  Let's make a small change to our macro:

(defmacro huh [var-string]
  `(let [x# (-> ~(symbol var-string) var)]
     @x#)) ;; dereference the var to get its value

(huh "foo")
10
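
As an aside, when the name is only known at run time, I believe the core function resolve can do the same symbol -> var lookup without a macro. This is a sketch rather than a replacement for the macro above: lookup-value is a hypothetical helper of mine, and it resolves against whatever the current namespace (*ns*) is when it gets called.

```clojure
(def foo 10)

(defn lookup-value
  "Hypothetical helper: looks up the var named by var-name in the current
  namespace and dereferences it. Returns nil when the name does not
  resolve to a var."
  [var-name]
  (let [v (resolve (symbol var-name))]
    ;; resolve can also return a Class (e.g. for \"String\"), so check first
    (when (var? v)
      (deref v))))

(lookup-value "foo")          ;; 10
(lookup-value "no-such-var")  ;; nil
```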

So why bother making a distinction between symbols and vars?  I mean, wouldn't it be simpler to just have the symbol directly reference the value?  Why have this 2-level look up system of symbol -> var -> value?  Recall what I said earlier about maintaining a distinction between identity, state, and value.  Another answer is to think about macros and macro expansion time vs. compile time.  Here's another exploration:

(doseq [elem '(def foo 10)]
  (println elem "is a" (type elem)))

def is a clojure.lang.Symbol
foo is a clojure.lang.Symbol
10 is a java.lang.Long
nil
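
The same thing can be seen from inside a macro. Here is a sketch with a hypothetical macro called describe-arg (my own name, not something from clojure.core) that reports what it was handed at macro expansion time. Notice that foo doesn't even have to be defined for this to work, since the macro only ever sees the symbol.

```clojure
(defmacro describe-arg
  "Hypothetical macro: describes the unevaluated form it receives."
  [form]
  ;; This runs at macro expansion time; the expansion is just the string.
  (str (pr-str form) " is a " (.getName (class form))))

(describe-arg foo)      ;; "foo is a clojure.lang.Symbol"
(describe-arg 10)       ;; "10 is a java.lang.Long"
(describe-arg (+ 1 2))  ;; "(+ 1 2) is a clojure.lang.PersistentList"
```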


Ahhhh, so when the reader looks at (def foo 10), foo is a symbol. Because a symbol looks up a var, and the value is then retrieved from the var, we can delay actually getting the value by retrieving the var instead. Also, consider how many times Clojure wants the symbol of a thing, rather than its value. Furthermore, some Clojure functions want the var itself rather than the value. For example:

(defn ^{:version "1.0"} doubler 
  [x]
  (* x 2))

;(meta doubler)     ;; Wrong, the metadata doesn't belong to the doubler function, but to the var itself
(meta #'doubler)   ;; equivalent to (meta (var doubler))
{:version "1.0", :arglists ([x]), :line 1, :column 1, :file "/tmp/boot.user2720475669809682962.clj", :name doubler, :ns #object[clojure.lang.Namespace 0x2d471d43 "boot.user"]}
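
The point about delaying the lookup can also be sketched by holding on to a var instead of a value. The names greet, captured-fn and captured-var are made up for this illustration. It relies on the fact that a var can be called like a function, in which case it looks up its current value at call time.

```clojure
(defn greet [] "hello")

(def captured-fn  greet)    ;; the function value as of right now
(def captured-var #'greet)  ;; the var, looked up again on every call

;; Re-define greet, as you might while working at the REPL.
(defn greet [] "goodbye")

(captured-fn)   ;; "hello", still the old function value
(captured-var)  ;; "goodbye", the var sees the new root binding
```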


Another example is when we require or import from within the REPL. When you call require or import directly (as opposed to using the :require or :import directives of the ns macro), you pass quoted symbols: require wants symbols that name namespaces, and import wants symbols that name classes on the classpath.
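
A quick sketch of what that looks like, using clojure.string and java.util.Date purely as examples:

```clojure
;; Both calls take quoted symbols: names, not values.
(require 'clojure.string)
(import 'java.util.Date)

(clojure.string/upper-case "foo")  ;; "FOO"
(class (Date. 0))                  ;; java.util.Date
```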

Finally, remember that vars can have thread-local bindings (as long as they are declared dynamic). That's why symbols shouldn't just point to values, as you may want to give another thread some other binding value.
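
Here is a sketch of a thread-local binding. The var *level* and the function current-level are made-up names. I use a plain java.lang.Thread on purpose, on the assumption that (unlike a future) it does not carry the binding over from the thread that started it.

```clojure
(def ^:dynamic *level* :info)

(defn current-level [] *level*)

(current-level)        ;; :info, the root binding

(binding [*level* :debug]
  (current-level))     ;; :debug, but only on this thread, inside binding

;; Ask a freshly started plain thread what it sees while we are rebound.
(def seen-by-plain-thread
  (binding [*level* :debug]
    (let [seen (promise)
          t    (Thread. ^Runnable (fn [] (deliver seen (current-level))))]
      (.start t)
      (.join t)
      @seen)))

seen-by-plain-thread   ;; :info, the other thread still sees the root binding
```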

I hope this makes the differences between vars and symbols a little more clear. 



