Moving to Jekyll, on github

This will be the last post on tumblr.

I have created a new blog here that you should go check out.

Thanks for reading!


Evented Apis with Node.js

Lately I’ve been playing with node.js a bit. If you’ve been living under a rock and don’t know it, node is a set of libraries for javascript, implemented on top of Google’s wonderful V8. It’s pretty neat, and its package manager, npm, is just awesome.

With node you can create a streaming http client just with a few lines of javascript, like this:

https://gist.github.com/1623715

There’s a lot of talk going on about node scalability, performance, memory use, etc. This time I want to focus on a different thing: api design.

Let’s take the http client example again and see the api that’s presented to us:

https://gist.github.com/1623720

The response object we get is just a handle for us to subscribe to (or observe) certain events. While weird at first, this is a really elegant design decision. We don’t need to inspect status codes or stuff, we just subscribe to the data event if we want to parse the response, or the error event if we want additional error handling. None of these are mandatory (though keep in mind that node’s emitters throw when an error event fires and nobody is listening for it) and there are no if/switch statements.

Unfortunately this requires a change in our mindset. It’s harder if you’re not fond of javascript, and especially if you come from a language where functions are not first-class citizens (e.g. java). To make things worse, there are some node libraries out there that do not work this way. Let’s go and create our own non-evented api, a simple “Calculator”:

https://gist.github.com/1623948

As you can see we now have to check for the value of err (what is err anyway? a flag, an object with error information?) and do some conditional handling. While not very ugly, this whole “2 value callback” is not elegant. What happens if there’s another outcome of the method (not just error or data)? In this silly example that’s a non-issue, but you sure can tell this doesn’t scale. Alternatives are passing in another parameter, which would break backwards compatibility, or adding information to the error object, mixing concerns. This is clearly why node developers use the evented api over this err, data thing for non-trivial callbacks.
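For readers who can’t open the gist, here is a rough sketch of what such a “2 value callback” api tends to look like. The names (Calculator, add) and the validation rule are assumptions made up for illustration; they are not copied from the original gist.

```javascript
// Hypothetical calculator using the "(err, data)" callback convention.
function Calculator() {}

Calculator.prototype.add = function (a, b, callback) {
  if (typeof a !== 'number' || typeof b !== 'number') {
    // The first argument carries the error; the second one is left undefined.
    return callback(new Error('both operands must be numbers'));
  }
  // On success the error slot is null and the result comes second.
  callback(null, a + b);
};

var calc = new Calculator();
var results = [];

calc.add(2, 3, function (err, sum) {
  if (err) { // every caller has to repeat this check
    results.push('error: ' + err.message);
    return;
  }
  results.push('sum: ' + sum);
});
```

Note how the conditional lives in the client code: any new outcome would mean either a third callback parameter or more information stuffed into err.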

Fear not, since doing this is not hard at all. In fact it’s so simple that we’re going to do it now in about the same number of lines of code:

https://gist.github.com/1623971

We’re using EventEmitter, a class that’s part of node’s library (the object you get from an http response inherits from it too). Feel free to dig into the docs, but its most important methods are emit, which is used to fire events, and on, which attaches handlers to particular events. Those handlers receive the parameters provided to the emit method.
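A minimal sketch of those two methods in isolation might look like this. The event names ('greeting', 'farewell') and the received array are invented for the example.

```javascript
var EventEmitter = require('events').EventEmitter;

var emitter = new EventEmitter();
var received = [];

// `on` attaches a handler to a named event.
emitter.on('greeting', function (name, punctuation) {
  received.push('hello ' + name + punctuation);
});

// `emit` fires the event; the extra arguments are handed to every handler.
emitter.emit('greeting', 'node', '!');

// Emitting an event nobody listens to does nothing and returns false.
// The one exception is 'error', which throws when it has no listeners.
var hadListeners = emitter.emit('farewell');
```

Handlers run synchronously, in the order they were attached, so by the time emit returns the received array is already filled.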

Our example is really simple, almost stupid, so you probably won’t see the gain here. But suppose your clients are only interested in knowing if the sum is negative: you could then emit a negative event. Hopefully you’ll see how this scales to more complex apis with a few callbacks.
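Here is one way that negative event could be wired in. Treat it as a sketch: the Calculator shape and the event names ('result', 'negative') are my assumptions, not necessarily what the gist above uses.

```javascript
var EventEmitter = require('events').EventEmitter;
var util = require('util');

// Hypothetical evented calculator: outcomes are events, not callback arguments.
function Calculator() {
  EventEmitter.call(this);
}
util.inherits(Calculator, EventEmitter);

Calculator.prototype.add = function (a, b) {
  if (typeof a !== 'number' || typeof b !== 'number') {
    // Throws if nobody subscribed to 'error', which is node's default behavior.
    this.emit('error', new Error('both operands must be numbers'));
    return;
  }
  var sum = a + b;
  this.emit('result', sum);
  if (sum < 0) {
    this.emit('negative', sum); // new outcome, no signature change needed
  }
};

var calc = new Calculator();
var negatives = [];

// This client only cares about negative sums, so it subscribes to nothing else.
calc.on('negative', function (sum) { negatives.push(sum); });

calc.add(2, 3);
calc.add(2, -5);
```

Adding the negative event did not touch the signature of add, so existing clients that only listen for result keep working untouched.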

Is this a silver bullet? Should you design all node apis like this? Of course not. But it sure is a good tool to keep close when designing the next killer node app.


Xb

If you are doing some kind of client polling from javascript, you might want to check Xb, a new tiny library for doing exponential backoff callbacks.

It works pretty well out of the box with no config at all. You can check the examples folder to get a hint of what you can implement with it.

Also if you happen to need it in a node.js script, you can easily install it with npm, just by running:

npm install xb

from your terminal, and then require it with:

var Xb = require('xb');

Hope you find it useful, and if you don’t, please let me know by creating an issue.
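If exponential backoff is new to you, the idea is simply to wait longer after each attempt, up to some ceiling. The snippet below is a generic illustration of that schedule and is not Xb’s api; the function name and its parameters are invented for this example.

```javascript
// Generic exponential backoff schedule (NOT Xb's api, just the underlying idea).
// baseMs: first delay, factor: growth per attempt, maxMs: ceiling for any delay.
function backoffDelays(baseMs, factor, maxMs, attempts) {
  var delays = [];
  for (var i = 0; i < attempts; i++) {
    // Delay grows as baseMs * factor^i, but never beyond maxMs.
    delays.push(Math.min(baseMs * Math.pow(factor, i), maxMs));
  }
  return delays;
}

// Six polling attempts starting at 100ms, doubling each time, capped at one second.
var delays = backoffDelays(100, 2, 1000, 6);
// delays is [100, 200, 400, 800, 1000, 1000]
```

A polling client would feed each of those values to setTimeout before its next request, and go back to the first delay once the server answers with fresh data.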


Implementing the ternary operator in scala (step by step)

I’ve been working with scala lately and I can tell you: it is one of the best programming languages I’ve ever used. It’s expressive, succinct and fun to write. It also runs on the JVM with little configuration and its performance is as good as java’s (it’s probably one of the few JVM languages that achieves this). It has, though, one big problem:

It doesn’t have the ternary operator.

I’m actually saying this tongue-in-cheek since the if statement in scala is an expression (it has a return value). What you would do if you’d like to have ternary-like behavior in scala would be:

https://gist.github.com/1306603

Good uh? But let’s face it, nothing beats the wonderful ternary operator (I’m not kidding, I love that little prick). Fear not, though, since scala has some cool features that will let us create something quite similar. Our home-made ternary operator will look like this:

https://gist.github.com/1306611

Note that we’re using a pipe `|` instead of a colon `:`, because of the way scala handles the colon in method identifiers. I won’t dig into this, trust me when I say we cannot use the colon here.

Let’s start. We need a Ternary class with some methods, and a first shot at it would be:

https://gist.github.com/1306619

Great stuff, right? Well no, it actually looks like crap. No worries though, all first versions do, so let’s improve it. We don’t want that ‘options’ method there; it would be better to have two methods, one for the true path and another for the false one. We also want to make the two look less like method calls and more like a baked-in thing. Let’s see:

https://gist.github.com/1306626

Making progress now: the yes and no methods handle both cases, and since scala lets us use operator-like notation for any method that takes a single argument, we can leave the dots and parens out. There’s a big problem that has been with us for a while now: the fact that our methods take strings. For this example that’s fine, but we want our ternary operator to handle not just strings but anything, even our own objects. To change this we simply add a new class with a type parameter, like this:

https://gist.github.com/1306630

Great! We just added a new class, TernaryResultHandler. It gets created by our yes method, and from then on it takes care of the conditional logic. This should be pretty familiar for most java programmers. (Note that it’s a scala convention to use A instead of T for type parameters.)

Now we can use any object for our yes and no methods. Talking about these, they have to go. Hope you didn’t get too attached to them. Scala is more flexible with the identifiers than java, it will let us name our methods using almost any character, like `?` for example.

As a matter of fact when you do `2 + 2` in scala, you are actually doing `2.+(2)` (remember that we could leave dots and parens out?). This is awesome for api design, and it’s a great design decision since it doesn’t involve additional concepts like operators: everything is a method. Enough talking, let’s improve our code once more:

https://gist.github.com/1306637

Cool, uh? Our example is pretty much done, the only thing that doesn’t look very neat there is the whole Ternary thing, if only we could open up Boolean and add the `?` method like the ruby guys do…

Scala doesn’t allow monkeypatching, this is a good thing since monkeypatching sucks (most of the time). Even the ruby guys know this and they are doing something about it. Scala has something which is sort of a more controlled cousin of monkeypatching, implicit conversions.

Implicit conversions (AKA implicits) are beyond the scope of this blog post (maybe in the future?) and I’m just bringing this up since it will make our Ternary class super-awesome. If you want to learn more about implicits please go here. This is the last version of our Ternary operator class:

https://gist.github.com/1306673

That’s it. What’s happening here? Well, the scala compiler first sees we’re trying to invoke the `?` method on Boolean. Instead of saying “WTF dude?” (or more politely, failing with a compile error saying that `?` is not a member of Boolean) it checks for implicit conversions. It finds ours, sees that the return type is Ternary and (surprise!) that it happens to have the `?` method defined. This is why implicits are more controlled than monkeypatching: we didn’t force every boolean out there to implement the `?` method, just the ones that are in the scope of our conversion (again, for more info check the previously linked article).

Conclusion:

This is probably a silly example, and the first alternative (using the if expression) is nicer. It does however illustrate the underlying power of scala. You should really give it a try, it’s a pretty neat language and really easy to adopt if your code is already running on the JVM.

Cheers! 


"The simplest thing that works"™

Most likely, you’ve heard that phrase. The full version is:

What’s the simplest thing that could possibly work?

It was coined by Ward Cunningham, one of the smartest hackers out there.

I’d known the phrase for a long time, but it wasn’t until a few months ago that I found myself in a situation that made me actually “get it”. It was kind of an “aha moment” that I’d like to share here.

I maintain an open source library, called Scribe, that lets you make OAuth calls without all the boilerplate. A while ago, a ticket was created to support a particular feature.

Basically, the library makes an http request, obtains a response and parses the contents. The contents are pretty much standard stuff (though some providers are too smart-ass to give a fuck about the spec).

What you get back is a string that looks like this:

`oauth_token=random_string_here&oauth_token_secret=random_string_here`

After some parsing using simple regular expressions (doing regexes in java is as fun as getting shot at, BTW), the lib creates this simple object, called Token, that lets you easily access the token and secret.

But again, some smart providers send you extra info there, and the string becomes:

`oauth_token=random_string_here&oauth_token_secret=random_string_here&random_string_here=random_string_here`

With the original Token class we don’t handle this scenario. So, for this case, scribe no longer “works”.

The guy who pointed this out to me also contributed a nice patch with his solution. There are some problems with his patch, though:

  • Touches 5 files and adds ~60 LOC.
  • Adds a public (overloaded) api call.
  • Now we are passing a Map<String, String> around as a parameter, something not cool.
  • Not only that but the map is passed to be modified (kind of an “output” param, ouch!)
  • It only works for access_tokens, not for request_tokens (this is more OAuth-specific stuff so don’t worry about it).

What this solution does is basically return a Map<String,String> with the additional stuff that comes in the response as key-value pairs. 

Is this a bad solution? Not at all, the code works and the user gets more or less what he/she wants. But the library now has to inspect the string, parse it, create a map and change the public APIs to return it. It’s kind of overkill IMHO.

Here’s where “The simplest thing that works” came to my mind. I now have a different interpretation of this concept that I’d like to draw for you guys here:

[Figure: a graph running from “Simple” to “Complex”]

The idea is that the graph represents increasing complexity. Near the “simple” end, things are straightforward, easy to maintain and to use. On the “complex” end you get maintenance nightmares, and your users don’t like your lib and send you hate mail or bitch about it on Twitter.

I’ve marked 3 points on the graph:

  • A - The original Token class. Dead simple and easy to maintain, but it doesn’t “work” (since it doesn’t handle the extra params).
  • B - The ideal solution. Simple enough, and works.
  • C - The contributed patch. It works, but it increases complexity and it’s harder to maintain.

I believe now that finding that ‘B spot’ is what Cunningham meant by saying those words.

This is the solution that finally got into the lib. I believe it’s pretty close to B, because:

  • It’s short (about 15 LOC).
  • It works for both cases (request and access tokens).
  • It doesn’t involve any additional parsing (doesn’t introduce new bugs).
  • It doesn’t modify the public APIs with strange overloads (Just a new constructor signature, on an object that is always constructed by the lib).

It does have a drawback: users now have to parse the raw string instead of having a nice map structure. But perhaps they didn’t want a map in the first place. Also, you must never forget the scope of your code: in the case of Scribe it is OAuth-signing http requests, not parsing strings or providing utility methods.
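To make the idea concrete, here is a small sketch of the “keep the raw response around” approach. It is written in javascript for brevity and is not Scribe’s actual code (Scribe is a java library); the Token shape, the extract and parseToken helpers and the user_id parameter are all assumptions made up for the example.

```javascript
// Hypothetical token that also remembers the untouched provider response.
function Token(token, secret, rawResponse) {
  this.token = token;
  this.secret = secret;
  this.rawResponse = rawResponse; // extras stay available, unparsed
}

// Pulls a single parameter out of a form-encoded string, failing loudly if absent.
function extract(name, response) {
  var match = new RegExp('(?:^|&)' + name + '=([^&]*)').exec(response);
  if (!match) {
    throw new Error('missing parameter: ' + name);
  }
  return decodeURIComponent(match[1]);
}

// The library only parses the two values it actually needs for signing.
function parseToken(response) {
  return new Token(
    extract('oauth_token', response),
    extract('oauth_token_secret', response),
    response
  );
}

// 'user_id' stands in for whatever extra data a provider decides to send.
var raw = 'oauth_token=abc&oauth_token_secret=xyz&user_id=42';
var token = parseToken(raw);
var userId = extract('user_id', token.rawResponse); // client-side parsing, on demand
```

The library’s parsing stays exactly as small as before, and clients who care about the extras decide for themselves how to read them.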

That’s it. Next time you have a problem in your hands, just sit back and think “What is the simplest thing that could possibly work?”. Who knows? Perhaps you’ll find a simple and future-proof solution, just like me.


Never return null

Most programming languages have a representation of nothing. Java has null, Scala has null too (plus Unit, None and Nothing, each with its own job), ruby has nil, javascript has quite a few terms itself, and so on. From now on I’m going to refer to all of those as null for simplicity’s sake.

Null’s origins can be traced to ALGOL. It was invented by Tony Hoare while he was designing a reference system for the language.

He calls it “my billion-dollar mistake”.

As with most mistakes in our industry, it looked like a good idea at the time:

My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement.

The rest is history.

What’s wrong with null? Well, the thing is that it acts like a time bomb. Consider the following snippet of (java) code:

https://gist.github.com/1144225

This code might not seem strange to you. In fact it’s pretty common stuff. 

The problem here is that we’re living in a world of fantasy from the time we ask for the Person object. The fantasy ends when we try to actually use the object, and find out we’ve been fooled and our friend ‘null’ lives in there.

How soon do we find out? Well it can be soon enough, or (as in the example) it can take a few lines of code and some method invocations. In the latter case, debugging becomes harder since we have to backtrack our steps and see where we get the ‘null’ reference the first time.
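The same delayed failure is easy to reproduce in javascript. This is only a sketch with made-up names (findPerson, greetingFor), not a translation of the java snippet above:

```javascript
// Hypothetical lookup that returns null when nothing is found.
function findPerson(id) {
  var people = { 1: { name: 'Ada' } };
  return people[id] || null;
}

function greetingFor(person) {
  // The null only blows up here, far away from where it was produced.
  return 'Hello, ' + person.name;
}

var person = findPerson(2); // null, but nothing complains yet
var failure = null;

try {
  greetingFor(person);
} catch (e) {
  failure = e; // a TypeError raised inside greetingFor, not inside findPerson
}
```

The stack trace points at greetingFor, while the actual culprit is the lookup that handed out the null a few lines earlier.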

Does this mean that we have to check for nulls every time we get a reference from somewhere? No, please. That would pollute the code, making it harder to read and introducing a new set of bugs. But we can (and must) avoid returning null references.

I’ll say it out loud: There’s not a single scenario when returning a null reference is a good idea.

It is obvious that when you do have something to return, you just return that. But what happens when you have nothing to return? Well, you have a few options:

  • Return the empty equivalent version of the object
    Suppose your method returns the lower-case representation of a String (e.g. it takes “Hello World” and returns “hello world”). If your method receives a null reference, an empty string or anything else for which you can’t calculate the lower-case version, just return an empty string (“”). For methods returning collections or arrays, always return empty (perhaps also immutable) versions of them.

  • If no natural empty version is available, consider creating a NullObject
    Often, there is no natural empty representation of the reference you’re returning, like in our Person example. In this case, consider creating a NullObject. This is a special instance of the object that has “neutral behavior”. Think of it as the empty string or array for complex types like Person.

  • If all else fails, throw an (unchecked) exception
    If implementing the NullObject is overkill, throw an exception. An unchecked one (I believe all exceptions should be unchecked, but that’s a matter for another post). This way, the client code will fail as soon as it tries to retrieve something that doesn’t exist, but you still give it the chance to catch the exception and implement some kind of recovery strategy if it wants to.
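The three options can be sketched side by side. I’m using javascript here to keep things short (it has no checked exceptions, so every throw is “unchecked”), and all the names (toLowerCase, NO_PERSON, findPerson, requirePerson, isKnown) are invented for the example:

```javascript
// 1. Empty equivalent: anything we cannot lower-case becomes the empty string.
function toLowerCase(text) {
  return typeof text === 'string' ? text.toLowerCase() : '';
}

// 2. NullObject: a frozen stand-in with neutral behavior.
function Person(name) { this.name = name; }
Person.prototype.isKnown = function () { return true; };

var NO_PERSON = Object.freeze({
  name: '',
  isKnown: function () { return false; }
});

var people = { 1: new Person('Ada') };

function findPerson(id) {
  var found = Object.prototype.hasOwnProperty.call(people, id);
  return found ? people[id] : NO_PERSON; // never null
}

// 3. Exception: fail at the lookup itself, where the problem really is.
function requirePerson(id) {
  if (!Object.prototype.hasOwnProperty.call(people, id)) {
    throw new Error('no person with id ' + id);
  }
  return people[id];
}

var greeting = 'Hello, ' + findPerson(2).name; // safe, no null check needed
```

With findPerson the client code keeps working with neutral values, while requirePerson makes it fail immediately at the lookup instead of a few method calls later.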

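To sum up: return the empty equivalent when there is one, a NullObject when there isn’t, and throw an unchecked exception if all else fails.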

Interesting guards against null:

Other languages have better protection against null.

C#, for example, throws an unchecked exception when a key is not found in a Dictionary (Microsoft calls maps ‘Dictionaries’… whatever). In fact all .NET exceptions are unchecked, something they got right.

Scala has the Option class. It’s a very elegant solution when you have a method that sometimes has to return an empty reference.