Sunday, September 7, 2014

OpenStack clojure tool

So, I'm at Red Hat now :)  I'm now a Quality Engineer working on OpenStack which is a new direction for me.  This is the first time I have been working on a 100% software project.  Well, that's not entirely true I guess.  I did spend 18 months designing an automation framework from scratch.  Nevertheless, this is new and interesting.

That being said, OpenStack is written almost entirely in Python.  Namely Python 2.7. uggh.

Why the moaning?  I used to be a big rah-rah python guy.  A former co-worker of mine even jokingly wondered if Guido van Rossum was paying me to try to switch our company over to Python (which, by the way, I pretty much single-handedly did).  However, over the years, I have come to find many pain points with the language that have considerably dimmed my enjoyment of it.  Now, hopefully I won't get any flames.  Python isn't a bad language and it has quite a few interesting features.  I just find myself longing for some things python lacks.  And indeed, with some really interesting newer languages like Go, Julia, and even Swift (compiled or JIT'ed, and aimed at systems or performance-sensitive work), I really wonder how much wind is going to get taken out of Python's sails in the next 5 years.  And that doesn't even factor in the non-systems programming languages like Clojure, Elixir, or even TypeScript or LiveScript (a functional, Haskell-influenced language that compiles to javascript).

Duck Typing ain't enough
I think Python 3 came to this realization with its new function annotations, and so Python 3 doesn't suffer from this problem as badly as Python 2 does.  Type hinting is the way to go.  It allows the developer to rapidly prototype an idea, and then, for performance or documentation reasons, type the arguments and return values later.  It would be nicer if Python was like Julia (or Typed Clojure) and allowed even locals to be optionally typed.

When you start getting code bases into the many tens of thousands of lines of code or more, you just look at a function and wonder "ok, what kind of variable am I supposed to pass in?".  Some of you may be saying that's what a good docstring is for.  I would agree, except that we all know the first thing to bit-rot is documentation.
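As a small illustration of the annotation idea, here is a Python 3 sketch.  The function name and its parameters are made up for the example, and note that the annotations are only stored as metadata; Python itself does not check them at runtime.

```python
def boot_server(name: str, vcpus: int = 1) -> dict:
    """Hypothetical helper: describe a server to boot."""
    return {"name": name, "vcpus": vcpus}

# Annotations are plain metadata attached to the function object.
# Tools and readers can inspect them, but nothing is enforced.
print(boot_server.__annotations__)
```

Even unenforced, the signature now answers the "what am I supposed to pass in?" question without relying on a docstring.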

Moreover, duck-typing can lead to unintended problems.  Perhaps you want to pass in an object that supports the method quack().  Unfortunately, the user so happens to pass in a BadDoctor class object, and your function happily calls the quack() method for you.
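A minimal sketch of that scenario, with class and function names invented for the example:

```python
class Duck:
    def quack(self):
        return "Quack!"


class BadDoctor:
    # Unrelated class that merely happens to share the method name.
    def quack(self):
        return "Take two of these and call me never."


def make_it_quack(thing):
    # No type check: anything with a quack() method is accepted.
    return thing.quack()


print(make_it_quack(Duck()))
print(make_it_quack(BadDoctor()))
```

Both calls succeed, and nothing in the language tells the caller that the second one was a mistake.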

Hard to make constants (immutables)
Basically, if you want to truly make an immutable object in python, you'll need to subclass from int, tuple, str or some other immutable built-in type.  And it's a little odd to do so.  It's one of the few places where implementing __new__() is required.  I often tell people that python's __init__ is not the constructor, __new__ is.  It is __new__ that actually allocates the memory for the object, and __init__ initializes the allocated memory.  If you have an immutable object, you have to give it a value as soon as it is created.
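Here is a rough sketch of what that looks like when subclassing tuple.  The Endpoint class and the host name are hypothetical, just to have something concrete to show.

```python
class Endpoint(tuple):
    """Hypothetical immutable (host, port) pair built on tuple."""

    __slots__ = ()  # no per-instance __dict__, so no attributes can be added

    def __new__(cls, host, port):
        # The value has to be supplied here; by the time __init__
        # would run, the tuple contents are already fixed.
        return super().__new__(cls, (host, port))


ep = Endpoint("keystone.example.com", 5000)

try:
    ep[1] = 35357          # tuples do not support item assignment
except TypeError as err:
    print("cannot modify:", err)

try:
    ep.extra = "nope"      # and __slots__ = () blocks new attributes
except AttributeError as err:
    print("cannot extend:", err)
```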

Another way to make "read-only" objects is to use a property that defines a getter but no setter.  It's not fool-proof, but it does allow one to make a mostly read-only object.  You could also reimplement __getattr__ and __setattr__ for the class and have it look up what you are trying to access.  And lastly, you could write a C(++) module for the data structure, since those languages do have const.  But really, would you want to do that?
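A quick sketch of the property approach, including the reason it isn't fool-proof.  The Flavor class is made up for the example.

```python
class Flavor:
    """Hypothetical class with a mostly read-only attribute."""

    def __init__(self, ram_mb):
        self._ram_mb = ram_mb

    @property
    def ram_mb(self):
        # Getter only; no setter is defined, so assignment fails.
        return self._ram_mb


f = Flavor(2048)
try:
    f.ram_mb = 4096
except AttributeError:
    print("ram_mb is read-only")

# Not fool-proof: the underlying attribute is still reachable.
f._ram_mb = 4096
print(f.ram_mb)
```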

Pypy aside, python's performance leaves something to be desired.  It also seems that Guido is totally unconcerned by python's performance and thinks it's good enough.  I was quite startled to recently learn how well the V8 Javascript engine performs: on average it is about as fast as PyPy, which is not bad at all.  But OpenStack requires regular CPython, mainly because many of its dependencies use C extension modules (when you install OpenStack from devstack or packstack, you'll see some source compilation going on).

The Browser as the new VM
Like it or not, the browser is kind of the new VM.  That means that javascript is becoming as important as C or Java and just as ubiquitous.  Having an application that can run virtually anywhere, including mobile devices is not to be scoffed at.  Also, I was surprised to learn the new tricks HTML 5 has up its sleeve.  This includes a File System API so that you can finally read local files (albeit to a sandboxed file system), the websocket API, WebGL, and drag and drop support just to mention a few.  Since javascript is the de facto language of the browser, that means for better or worse learning javascript.  There are quite a few python-to-javascript libraries out there, including pyjs, and brython. However, they are not developed by the core python team and so I wonder if/when support will end?  And brython only supports python3.

No persistent data structures built-in
So there is pysistence.  But being a non-standard 3rd party library and with the lack of data-typing, it means that users will not be sure when persistent or non-persistent data types are being used.  Why do we want persistent data structures by default?  This page and this one sum it up pretty well.
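To give a feel for what "persistent" means here, below is a toy sketch in plain Python.  It does not use pysistence or its API; it is just a hand-rolled linked list made of nested tuples, where "adding" an element returns a new list and the old one keeps working.

```python
# A persistent singly linked list built from nested tuples: (head, tail).
EMPTY = None


def cons(value, lst):
    """Return a new list with value in front; lst itself is untouched."""
    return (value, lst)


def to_pylist(lst):
    """Walk the linked list and copy its values into a regular list."""
    out = []
    while lst is not EMPTY:
        head, lst = lst
        out.append(head)
    return out


base = cons(2, cons(3, EMPTY))
a = cons(1, base)
b = cons(0, base)

print(to_pylist(a), to_pylist(b), to_pylist(base))
print(a[1] is b[1])  # both new lists share the same tail object
```

Every version stays valid after an "update", and the versions share structure instead of copying it, which is what makes this style affordable.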

Lack of good concurrency
Python, thanks to the GIL, doesn't really have true parallelism across threads.  There is multiprocessing, which fires up a new python interpreter per process, but it does have some limitations (for example, on Windows the arguments must be pickle-able), which can be a real pain.  Also, since a new python interpreter is getting fired up for each new process, python developers can't really laugh at the JVM's large consumption of memory once you start firing up 20+ processes.  Hopefully pypy will solve this problem with Software Transactional Memory.
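The pickling limitation is easy to trip over.  Here is a small sketch (the can_pickle helper is my own, not part of any library) that checks whether an object could be shipped to another process at all:

```python
import pickle


def can_pickle(obj):
    """Hypothetical helper: True if obj survives pickle.dumps()."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(can_pickle({"flavor": "m1.small", "count": 3}))  # plain data is fine
print(can_pickle(len))                                 # builtins pickle by name
print(can_pickle(lambda x: x * 2))                     # lambdas do not pickle
```

Anything that fails this check can't be handed to a worker process as an argument, which rules out a lot of convenient closures and lambdas.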

So because of all these concerns, I decided to focus on writing a clojure library to help do manual testing of an OpenStack deployment, using the public REST API.  This does have the disadvantage of not being able to do internal unit testing, but that's ok for my needs.  Why clojure? Because it addresses pretty much all the shortcomings above, though it does have its own limitations.

Type hinting or Typed Clojure
In clojure, you can attach all kinds of orthogonal information about a function or data through metadata.  This includes the data type for arguments and return values.  And there's also the possibility of using Typed Clojure.  Typed Clojure (aka core.typed) will even let you annotate local let variables, which clojure type hinting can't do.

Immutable by default
Clojure, being more of a functional language, has immutable data structures by default.  Mutable data is the exception, and is clearly denoted as one of the mutable reference types (atoms, refs, agents, and vars).  Moreover, other than the var type (when you use def), changes to these types must go through some kind of special function (for example ref-set or alter inside a dosync for refs, reset! or swap! for atoms, or send for agents).

The latest benchmarks from the alioth shootout show clojure is about 3x slower than C.  I've heard that it requires some non-idiomatic style coding to achieve this performance, but the same could be said for the C code as well.  Nevertheless, that makes clojure about 7-10x faster than python, or a little faster than pypy.

Can target the browser
Speaking of which, since clojurescript is a dialect of clojure, you can write some code that clojure and clojurescript can share.  You could also write Single Page Apps in clojurescript, run them through the compiler, and have some blazing fast code.  And unlike the python-to-javascript libraries, clojurescript is worked on by the main clojure core team, which means it should stay up to date (and in some ways, it's getting more love than clojure itself is right now).

Data persistence baked in
Just as the data structures are immutable by default, they are also persistent.  It does require a different way of thinking about programming, but the penalty you pay in performance is easily repaid in robustness and the simplicity of reasoning about correctness in your programs.

So, clojure isn't perfect.  Most of the flaws come from the JVM, but others come from the current programming culture.

JVM startup time-  The JVM startup time is notoriously long.  I believe this is one of the biggest reasons Java has never caught on in the systems programming world.  When you have startup time in the several seconds range, you can spend more time waiting for the JVM to spin up than a little "script" would take to run.  Hopefully Project Jigsaw will help alleviate this by only loading the bits of the JVM that are required.

JVM bloat-  The JVM is also notorious for how much memory it can eat up.  About 300-400MB is typical, and I have seen some long-running java agents eat up a GB of RAM.  Java 8 helped a bit with this though, and hopefully Project Jigsaw will also tackle this problem.

Systems Programming is very hard- Java's ability to work with subprocesses is less than easy (though a recent JEP for Java 9 should help with this).  And writing JNI code is difficult due to the lack of unsigned data types and having to simultaneously write Java and native code (compared with writing a C python module, which is only in C).  Java has slowly been improving here: Java 7 introduced some new File APIs and better asynchronous IO (at least on sockets and files, but sadly, not on pipes or Unix domain sockets AFAIK).  It is possible to use helper libraries like JNA or BridJ, which are similar to python's ctypes, but still, writing native code can be a pain.

Reluctance to learn a lisp or functional language- Unfortunately, there's a lot of built-in resistance to clojure.  Many people's perception of lisp or scheme is from college, and all they could see was the tons of parentheses.  Just like many people hate XML for all of the nested <>, people hate lisp for all the nested ().  There's also the Smug Lisp Weenie problem, where there's the perceived notion that lisp programmers are insufferable know-it-alls, or "my lisp is superior to every other language in existence or in the future".  All I can say here is that I hope slowly but surely people will see that clojure programmers aren't much different from any others, and that they will see the advantages over time.

There's also the problem of its newness.  Clojure is still pretty young, and the tooling and, perhaps more importantly, the familiarity are just not there.  There probably aren't many co-workers you can ask for help, and clojure debugging is not trivial.  It also doesn't help that the major IDE for most clojurians is emacs, which in itself is a turn-off for many.  Hopefully Light Table and Counterclockwise can help in that arena, and over time, people will learn more about it.

So for the reasons listed above, I chose to write some tooling in clojure and clojurescript/javascript rather than Python.  It does mean that I won't be able to do internal white box testing, but since the OpenStack services talk to each other through REST APIs, it really doesn't matter what language is used.  And in fact, you can write your own endpoints in any language you want since it does use a REST architecture.  Technically, since the Nova project uses AMQP to exchange messages internally within Nova itself, it would even be possible to extend Nova with service endpoints written in a non-python language that exchange messages with the AMQP broker.

My ultimate goal is to provide tools to help explore and manually test OpenStack components.  If there's one thing I've discovered about OpenStack, it's that Tempest leaves a lot to be desired, and there aren't a lot of good manual test tools or visualizations of what is going on.  When I say visualization, I'm not talking about Horizon or some other dashboard.  I want to see a visual representation of compute nodes on a neutron network.  I want to see http requests in-flight.

I could have used jclouds or contributed to it, but I wanted to concentrate on OpenStack only, and jclouds handles multiple cloud type environments.  So this will be a set of libraries that will only work against an OpenStack deployment.

This is all new to me, so progress will be slow.  I'm new to OpenStack, clojure, javascript and web programming (both client and server side).  But that's why I am actually stoked about this.  It's something new and I am enjoying this quite a bit.  I actually spent most of Saturday working on some WebGL, which I will be using for some of the visualization aspects.  And I also got a proof-of-concept REST client working in clojure that will get an authentication token from a v2.0 keystone service.
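For the curious, here is roughly what that token request looks like on the wire.  This is a Python sketch for illustration only, not the clojure code from my proof of concept; the helper names, host name and credentials are all made up.  It builds the POST to /v2.0/tokens with password credentials and shows where the token id lives in the response, without actually sending anything.

```python
import json
import urllib.request


def build_token_request(keystone_url, username, password, tenant):
    """Build (but do not send) a Keystone v2.0 token request."""
    body = {
        "auth": {
            "tenantName": tenant,
            "passwordCredentials": {
                "username": username,
                "password": password,
            },
        }
    }
    return urllib.request.Request(
        keystone_url.rstrip("/") + "/v2.0/tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def extract_token(response_text):
    """Pull the token id out of a v2.0 response body."""
    return json.loads(response_text)["access"]["token"]["id"]


# Placeholder endpoint and credentials, purely for illustration.
req = build_token_request(
    "http://controller.example.com:5000", "demo", "secret", "demo"
)
print(req.get_method(), req.full_url)
# Sending it would look like:
#   text = urllib.request.urlopen(req).read().decode("utf-8")
#   token = extract_token(text)
```

The returned token id is then passed in the X-Auth-Token header on subsequent calls to the other services.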

Currently, I have 2 repos.  The first is the clojure openstack library called shi (server side), and the second is kan, a WebGL project to help me learn WebGL, javascript, and HTML 5 (client side).  The shi library will eventually become a web server that will serve up the client interaction code, as well as a remote nREPL.  The kan library will be the actual client-side rich client program that will serve as the front end to issue REST commands, visualize the results, and get a view of the OpenStack deployment itself.

By the way, shi is taken from Mandarin &#35797, meaning "to attempt" or "try" (unfortunately, I can't figure out how to get blogger to read the unicode character even in the "raw" html mode).  Also, kan is taken from "to look at" in Mandarin (though I don't know the unicode character for it).
