Sunday, March 13, 2016

Haskell, java 8, and calculus

I bought the Haskell Programming book last week, and though the price tag was pretty hefty, it's been a good read so far.  It is the first book I have seen that actually covers the basics of the lambda calculus.  After that cursory examination I now understand currying much better, and the basis of lisp also makes more sense.  I also discovered that haskell can look a lot like lisp.

((+) (length [1,2,3])
       (length [x | x <- [0..20], ((==) (mod x 2) 0)]))

That basically adds the lengths of 2 lists.  You do need to convert some infix operators to prefix form (e.g. (+) or (==)), but otherwise, that's very lispy.  I'm sure a real haskeller would cry at that code, but honestly, I don't see anything wrong with it :).  I'm also getting a better grasp on haskell's types, namely the difference between type constructors, data constructors, and typeclasses.  If only C++ templates or java generics were like haskell's type classes!!

This is the number one thing I miss in clojure.  As cool as clojure is, the lack of type information still bothers me.  The little bit I've looked at core.typed leaves me wishing for a stronger type system that can handle type parameterization better.  Also, there are the limitations of the JVM itself: for example, the lack of tail call optimization, continuations, lightweight threads, or the generation of native binaries.  I've often said that clojure will be the gateway drug to haskell.  I think that clojure really kickstarted a lot of people's interest in functional programming (even more so than Scala, which has a hybrid approach and didn't really require people to use immutable, persistent, or lazy data structures).  I can see haskell taking a prominent place in my programming future, but Clojure is still an interesting language, and I'll be using it for any JVM-related project where I can (though I'll be exploring frege too).  I think clojure will be in my toolbox kind of like python (should be)...a tool to quickly hash out an idea, and then re-implement in a fully typed language.

Another thing that's caused clojure to lose a little bit of lustre for me is some recent problems I've been having with pheidippides.  Recently, I removed the Java bits of my code, because I discovered a way to work around a bug I had found in clojure.  Everything seemed to work ok when a single Module sent just one message to the Controller.  But if I sent another message (either from the same Module or a different one), my code in clojure would sit in a while loop like this:

(while (.hasRemaining buff)
    (.read chan buff))

The buff has a capacity of one byte.  It was meant to read in the first byte of a message (the opcode) and, based on that, determine how many more bytes to read in.  The Selector had already determined that the channel (the chan var there) was in the ready set.  Now according to the documentation, it is possible for a SelectionKey to report a channel as ready when there really isn't any data there.  So I double checked by calling (.isReadable sel-key) in an earlier function, where sel-key is the key for the channel I'm reading from.  Since that returned true, the channel should have been readable and there should have been at least one byte in it.  But for some reason, my clojure code would spin in that while loop forever.
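For what it's worth, a loop like that can spin because a channel registered with a Selector has to be in non-blocking mode, and a non-blocking read is allowed to return 0 bytes (or -1 at end-of-stream) without ever filling the buffer.  Here is a minimal Java sketch of a more defensive single read.  The class and method names (OpcodeReader, tryReadOpcode, and the two demo helpers) are made up for illustration and are not part of pheidippides.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.util.OptionalInt;

class OpcodeReader {

    // Tries exactly once to read a one-byte opcode.  Returns empty when no
    // byte is available right now (read returned 0) or the stream has ended
    // (read returned -1), instead of looping until the buffer fills.
    public static OptionalInt tryReadOpcode(ReadableByteChannel chan) {
        ByteBuffer buff = ByteBuffer.allocate(1);
        try {
            int n = chan.read(buff);
            if (n <= 0) {
                return OptionalInt.empty();
            }
            buff.flip();
            return OptionalInt.of(buff.get() & 0xFF);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: reads from an in-memory channel holding the given bytes.
    public static OptionalInt readFromBytes(byte[] data) {
        return tryReadOpcode(Channels.newChannel(new ByteArrayInputStream(data)));
    }

    // Hypothetical helper: reads from a non-blocking pipe that nothing was written to.
    public static OptionalInt readFromIdlePipe() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                return tryReadOpcode(source);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readFromBytes(new byte[] {7}));
        System.out.println(readFromIdlePipe());
    }
}
```

The caller can then decide what to do with an empty result (skip the key, close the channel, and so on) rather than being stuck inside the read.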

However, when I resurrected my Java code (which, ironically, calls the very same clojure code), it worked.

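import clojure.java.api.Clojure;
import clojure.lang.IFn;
import clojure.lang.PersistentVector;
import java.nio.channels.SocketChannel;

    public static PersistentVector test(SocketChannel chan) {
        // Load the clojure namespace before looking up any of its vars
        IFn require = Clojure.var("clojure.core", "require");
        require.invoke(Clojure.read("pheidippides.messaging.core"));
        // Look up the clojure function and call it with the Java channel
        IFn getMessage = Clojure.var("pheidippides.messaging.messages", "get-chan-msg");
        PersistentVector msg = (PersistentVector) getMessage.invoke(chan);
        System.out.println("Opcode is " + msg.get(1).toString());
        return msg;
    }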

Why is the Java class invoking the same function from clojure working, but the actual function from clojure is not?  Well, it turns out I forgot to do something in my clojure code that I was doing in the Java version.  In the Java version, I was using an Iterator to walk through the selected-key set, and as I handled each event, I called iter.remove().  In the clojure code, rather than directly walk through the set with an iterator, I was using doseq to walk through the set.  However, at the end, I forgot to clear the set.  That meant that the selected-key set was growing every time a SocketChannel was sending data to the Controller: the Selector never removes keys from that set on its own, so keys I had already handled stayed in it and got processed again, even when their channels had nothing new to read.  The NIO Selector essentially wants you to remove each key from the selected-key set yourself once you have handled it, which means that the Set is mutable.
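To make the difference concrete, here is a small self-contained Java sketch of the iterate-and-remove idiom.  It uses a java.nio.channels.Pipe instead of a SocketChannel so that it runs without a network, and the class and method names (SelectedKeysDemo, leftoverKeys) are just illustrative, not code from pheidippides.  The removeHandledKeys flag switches between what my Java version did and what my clojure version forgot to do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

class SelectedKeysDemo {

    // Sends one byte through a pipe, handles the read event once, then
    // selects again and reports how many keys are still sitting in the
    // selected-key set even though nothing new was written.
    public static int leftoverKeys(boolean removeHandledKeys) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                sink.write(ByteBuffer.wrap(new byte[] {42}));  // stand-in for an opcode byte
                selector.select();                             // waits until the source is readable

                Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
                while (iter.hasNext()) {
                    SelectionKey key = iter.next();
                    if (key.isReadable()) {
                        ByteBuffer buff = ByteBuffer.allocate(1);
                        ((Pipe.SourceChannel) key.channel()).read(buff);
                    }
                    if (removeHandledKeys) {
                        iter.remove();  // the step that is easy to forget
                    }
                }

                selector.selectNow();   // nothing new has been written
                return selector.selectedKeys().size();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with removal:    " + leftoverKeys(true));
        System.out.println("without removal: " + leftoverKeys(false));
    }
}
```

With removal the set is empty after the second select; without it, the already-handled key is still there, and any code that walks the set will try to read from a channel that has no data.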

Basically, although writing in pure clojure is nice, as soon as you have to reach down into Java, things get dirty.  Since I was using java's nio.channels, I have no choice but to deal with their mutable data.  Indeed, looking at my clojure code, I see how imperative it looks.


One reason I'll continue to use clojure is that for better or worse, the JVM is the world's number one platform (though javascript is mounting a big offensive on that front).  The JVM is continually improving, and in the Da Vinci project they are working on things like TCO, value types, and even the generation of native executables (though according to the JVMLS of 2015, that feature will only allow statically linked executables, and it will be a premium feature not in Java SE).  Since so many projects use the JVM in one form or another, it pays to understand the JVM and JVM languages.

In fact, at work, I had the need to grab all classes that had the @Test annotation applied to them from TestNG.  I'll get to that in a later post.  But I will say a few things:

  • Java lambdas take getting used to since they are, in essence, type-inferred implementations of single-method (functional) interfaces
  • Java lambdas don't allow mutation of captured local variables (those have to be final or effectively final)
  • The stream API has a way to create new data structures instead of mutating existing ones

Although it was a little funky getting used to Java 8 lambdas and the new stream API, at least Java does seem to be heading in the right direction.  Unfortunately, there's just a ton of impure stuff and it can make things a pain.
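Here is a tiny Java 8 sketch of those three points.  It is not the TestNG code from work; the names (LambdaNotes, TIMES_TWO, evensDoubled, sum) are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class LambdaNotes {

    // A lambda is an instance of a single-method (functional) interface;
    // the compiler infers which one from the declared type on the left.
    public static final Function<Integer, Integer> TIMES_TWO = x -> x * 2;

    // Builds and returns a new list; the input list is left untouched.
    public static List<Integer> evensDoubled(List<Integer> input) {
        return input.stream()
                .filter(x -> x % 2 == 0)
                .map(TIMES_TWO)
                .collect(Collectors.toList());
    }

    public static int sum(List<Integer> input) {
        // int total = 0;
        // input.forEach(x -> total += x);  // does not compile: total must be effectively final
        return input.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3, 4);
        System.out.println(evensDoubled(nums));
        System.out.println(sum(nums));
    }
}
```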

The other thing occupying my time is refreshing my calculus.  About 3 weeks ago, there was a sale on udemy and I bought a couple of courses for $24, including 3 for calculus, and one for linear algebra.  Why go back over calculus?  I'd eventually like to study for the GRE, and my math is pretty rusty.  Plus, haskell has just gotten me back in the mood to get better at math.

I've always felt it was a shame that I minored in math and yet in my career, I basically haven't used it.  I think people tend to denigrate languages like haskell for being too academic, and not "real world" enough.  But at least there is scientific rigor to what they are doing.  I sometimes feel that "software engineering" is a misnomer.  "Programmers" are just churning out code that hopefully works.  But there's not really an analysis of what's going on.  As Rich Hickey said about simple vs. easy, the programming world wants the "easy" way to do things.  The problem is that very often "easy" is not composable, testable, or really what we need.  The easiness is just a veneer to get something "up and running", but the minute you run into trouble, you have to descend into a sea of madness to figure out what's really going on.

The argument I hear from the dynamic programmers is that types just get in their way.  But do they?  How many times does a python programmer have to sit at a debugger, and figure out why something isn't working, only to realize they passed in the wrong type?  How many times have you cursed at a null or None because some file wasn't there that you expected or a network hiccup caused a socket timeout?  Of course, all we have is anecdotal evidence, and if we can't prove that type information, Maybe monads that eliminate null conditions, or guaranteeing when functions can produce side effects are good....why go through all that trouble?  Plus, it seems that typing out a few extra characters is just too much for some (never mind that type inference can save you a lot of typing).
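Even Java 8 has a rough cousin of Maybe in java.util.Optional.  As a small sketch of what "the type tells you a value might be missing" looks like, assume a made-up config lookup (OptionalLookup and portFor are hypothetical names):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class OptionalLookup {

    // The return type says up front that there may be no usable port,
    // so the caller has to decide what to do in that case.
    public static Optional<Integer> portFor(Map<String, String> config, String key) {
        return Optional.ofNullable(config.get(key))   // a missing key becomes empty, not null
                .map(String::trim)
                .filter(s -> s.matches("\\d{1,5}"))    // anything that is not 1 to 5 digits becomes empty
                .map(Integer::valueOf);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("port", " 8080 ");
        System.out.println(portFor(config, "port").orElse(80));
        System.out.println(portFor(config, "missing").orElse(80));
    }
}
```

It's nowhere near as pervasive as Maybe in haskell, since any reference in Java can still be null, but the missing case at least shows up in the signature.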

I truly think we need more rigor in the software engineering world.  To give you an example, at Red Hat, they want us to do more end-to-end scenario testing.  That sounds like a reasonable goal.  After all, passing unit tests don't guarantee that the code will work in actual usage (perhaps your mock didn't do something the real component would do).  And ditto for functional tests.  But why?  Why should a functional test pass, but the same code fail in the real world?  It's because of impurity.  If a functional (or unit) test passes, it's because your mock essentially guaranteed a certain result.  In effect, your mock made your test more pure.  But the real world gets in the way.  Network connections time out, files disappear, a hardware resource becomes unavailable, a module changes some singleton or other global variable, or simply some part of your data mutates due to something else.

In a pure functional language, if a functional test passes but the code fails in the real world, you know the problem has to be in one of your monads.  This helps pinpoint the problem much faster.  Good luck tracking down the problem if your language doesn't explicitly point out which functions are pure and which are not.

Since haskell essentially is mathematical, I figure getting better at calculus and linear algebra will put me back into a better frame of mind.  To be honest, calculus and linear algebra actually help in the real world too.  I feel like with what I've been doing lately, all I'm doing is pushing ghostly bits in the electronic ether.  Hopefully one day I can actually do something that will provide real-world value.