This is a new project Jerome and I are working on. It’s still not close to completion, but I thought I’d give you a taste. Just to whet your appetite and get some feedback.
It started by looking at the libraries out there that let you use MongoDB with Node.js, and realizing they’re all quite bad. They work, that’s for sure, and bugs get fixed. But when we write application code using these APIs, well … it’s just not code we want to look at all day and maintain.
So we re-imagined what a kick-ass Node.js object mapper for MongoDB would look like. And it all starts with: “does this make my code easier to write, test and maintain?”
Here’s a sample, I’ll let you be the judge:
    class User extends Model
      @collection "users"

      @field "name", String

      @field "password", String
      @set "password", (clear)->
        @_.password = crypt(clear)

      @field "email", String

      @get "posts", ->
        Post.where(author_id: @_id)

    # me is a scope
    me = User.where(name: "Assaf")
    me.one (error, user)->
      console.log "Loaded #{user.name}"
      user.posts.count (error, count)->
        console.log "Published #{count} posts"
We started with three principles. The first one is progressive enhancement.
Working at the driver level is OK, and in fact, for some things it’s the better choice. But it’s also great to have more capable model objects that encapsulate data and business logic. I’ve worked with enough different database servers to realize that for most applications both are necessary.
I also noticed how most drivers are too low-level and hard to use, with APIs that leak the underlying network protocol. That ends up pushing developers to do all their work with the friendlier, better-designed ORM/ODM, even the stuff an ORM/ODM clearly sucks at.
Why does it have to be one or the other? Why can’t you work at the driver level when you need to, and work with models when it’s easier to? Why can’t you do both with the same, well-designed API?
What if this gave you driver-level performance:
    collection("posts").where(category: "db").all (error, posts)->
      # These posts are POJOs
      for post in posts
        console.log "Loaded #{post.title}"
And this gave you model objects, with schema, before/after hooks, validations, and all other good things:
    collection(Post).where(category: "db").all (error, posts)->
      # These posts are model objects
      for post in posts
        console.log "Loaded #{post.title}"
It’s practically the same API, but it can take advantage of model objects when you tell the collection to use one. And we let you bring your own model implementation if you don’t like ours. This reminds me of CSS, where some rules only apply to more modern browsers. Progressive enhancement.
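To make the idea concrete, here is a rough sketch in plain JavaScript of how one query API could hand back either kind of object. It is not the library’s actual code: the in-memory `store`, the `Scope` class and the `collectionName` property are stand-ins invented for illustration, and there is no MongoDB behind it.

```javascript
// In-memory stand-in for a MongoDB collection (hypothetical data).
const store = {
  posts: [
    { title: "Intro to MongoDB", category: "db" },
    { title: "Styling forms", category: "css" },
    { title: "Indexing basics", category: "db" }
  ]
};

// A tiny model class: wraps a plain document and adds behavior.
class Post {
  constructor(doc) { Object.assign(this, doc); }
  static get collectionName() { return "posts"; }
  get slug() { return this.title.toLowerCase().replace(/\s+/g, "-"); }
}

// A scope describes a query; nothing runs until all() is called.
class Scope {
  constructor(name, Model, criteria = {}) {
    this.name = name;
    this.Model = Model;
    this.criteria = criteria;
  }
  // Returns a new scope, so a scope can be stored and refined later.
  where(criteria) {
    return new Scope(this.name, this.Model, { ...this.criteria, ...criteria });
  }
  all(callback) {
    const matches = store[this.name].filter(doc =>
      Object.keys(this.criteria).every(key => doc[key] === this.criteria[key]));
    // No model: hand back copies of the raw documents (POJOs).
    // With a model: wrap each document in the model class.
    const results = matches.map(doc =>
      this.Model ? new this.Model(doc) : { ...doc });
    // Call back asynchronously, the way a real driver would.
    setImmediate(callback, null, results);
  }
}

// Accepts a collection name (driver level) or a model class (model level).
function collection(target) {
  return typeof target === "string"
    ? new Scope(target, null)
    : new Scope(target.collectionName, target);
}

collection("posts").where({ category: "db" }).all((error, posts) => {
  for (const post of posts) console.log(`Loaded ${post.title}`);
});
collection(Post).where({ category: "db" }).all((error, posts) => {
  for (const post of posts) console.log(`Loaded ${post.slug}`);
});
```

The only difference between the two calls is the argument to `collection`. Everything after that, the chaining and the callback shape, stays the same.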
Did you notice that we care a lot about code readability? After all, we spend more time reading code than typing it in. So, not surprisingly, the second principle is readability. The readability of the code you’ll be writing when using the API.
It turns out CoffeeScript allows you to call methods on a class (more specifically, the constructor function) from within the class definition. Quite a nifty feature if you write a lot of application code in CoffeeScript.
It lets you write this:
    class Post extends Model
      @collection "posts"

      # -- Title stuff --
      @field "title", String
      random: ->
        @title = sampleTitles.atRandom()
      @validate ->
        assert @title, "Missing title"
I personally like organizing larger models into groups of related fields/behaviors. For example, a user model would have one section dealing with credentials, another with profile display, another with usage metrics. Being able to mix meta-data methods with method definitions makes that easy.
Another way we care about the readability of your code: we’re very conservative with what methods we put in the Model class. Only the methods we think you’ll want to use, like field and validate. Everything else is tucked away in a separate namespace. The idea of working with a base class that holds hundreds of implementation methods is an anti-pattern.
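If you’re curious how class-level declarations like these can work underneath, here is a minimal sketch in plain JavaScript. It is an assumption about the mechanism, not the library’s code: this `Model`, its `field` and `validate` methods, and the `errors` helper are hypothetical. Values are kept in a `_` object on the instance, mirroring the `@_` seen in the first sample. CoffeeScript lets you make these calls inside the class body; the sketch makes them right after it.

```javascript
// Hypothetical base class: field() and validate() are class-level methods
// that record metadata on whichever subclass they are called on.
class Model {
  static field(name, type) {
    // Own-property check, so every subclass gets its own field list.
    if (!Object.prototype.hasOwnProperty.call(this, "fields")) this.fields = {};
    this.fields[name] = type;
    // Accessor on the prototype; the value itself lives in the `_` namespace.
    Object.defineProperty(this.prototype, name, {
      get() { return this._[name]; },
      set(value) { this._[name] = value == null ? value : type(value); },
      enumerable: true,
      configurable: true
    });
  }

  static validate(fn) {
    if (!Object.prototype.hasOwnProperty.call(this, "validators")) this.validators = [];
    this.validators.push(fn);
  }

  constructor(values = {}) {
    this._ = {};
    // Only declared fields are copied, each one through its setter.
    for (const name of Object.keys(this.constructor.fields || {})) {
      if (name in values) this[name] = values[name];
    }
  }

  // Runs every validator with `this` set to the instance; collects messages.
  errors() {
    const messages = [];
    for (const validator of this.constructor.validators || []) {
      try { validator.call(this); } catch (error) { messages.push(error.message); }
    }
    return messages;
  }
}

class Post extends Model {}

// -- Title stuff --
Post.field("title", String);
Post.validate(function () {
  if (!this.title) throw new Error("Missing title");
});

console.log(new Post({ title: "Hello" }).errors());
console.log(new Post().errors());
```

Because `field` and `validate` run against the class itself, each declaration can sit next to the methods it relates to, which is what makes the grouping style possible.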
The third principle deals with giving you just the right level of control. And by that I mean taking care of as many details as possible, and picking the best possible defaults. But also realizing that sometimes you need finer control, making that possible, and teaching you when and how to take action.
For example, I’ve seen a few code bases that run an entire application using a single MongoDB connection. That works since the driver can make sense of requests sent/received concurrently, but it’s only using one server thread, so all operations are serialized and you’re not getting much performance out of MongoDB.
You could open one connection per request, but while a Node.js Web server can easily scale to thousands of those, MongoDB can barely handle hundreds. So we put a connection pool in there, and it works behind the scenes and dispatches each request to the next available connection. Whether you use one connection in the entire application, or one for each request, your code will use as many connections as the pool allows. You don’t have to worry about a thing.
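Here is a rough sketch of that dispatching idea in plain JavaScript. It is an illustration under assumptions, not the actual pool: `Pool`, `fakeConnection` and the `send(request, callback)` shape are hypothetical, and the fake connections never touch the network.

```javascript
// Hypothetical pool: hands each request to the next idle connection,
// and queues requests while every connection is busy.
class Pool {
  constructor(connections) {
    this.idle = connections.slice();
    this.waiting = [];
  }

  request(request, callback) {
    const connection = this.idle.shift();
    if (!connection) {
      this.waiting.push([request, callback]);
      return;
    }
    connection.send(request, (error, reply) => {
      // Put the connection back, then serve the oldest queued request, if any.
      this.idle.push(connection);
      const next = this.waiting.shift();
      if (next) this.request(next[0], next[1]);
      callback(error, reply);
    });
  }
}

// Fake connection: replies asynchronously and records what it handled.
function fakeConnection(id) {
  return {
    id,
    handled: [],
    send(request, callback) {
      this.handled.push(request);
      setImmediate(callback, null, `${request} via connection ${id}`);
    }
  };
}

const pool = new Pool([fakeConnection(1), fakeConnection(2)]);
for (const request of ["find posts", "count users", "find comments"]) {
  pool.request(request, (error, reply) => console.log(reply));
}
```

Application code only ever calls `pool.request`. How many connections exist, and which one serves a given request, stays the pool’s business.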
Except when you do. In some cases you need a sequence of requests to go over the same underlying TCP connection. So we made that possible as well, and we documented the hell out of it, so you’ll know when and how to use this.
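One way to picture this is a pool that can lend out a single connection for the duration of a sequence. The sketch below, in plain JavaScript, is an assumption about how such an API might look, not the documented one: `PinningPool`, `reserve`, `release` and `fakeConnection` are hypothetical names.

```javascript
// Hypothetical pool that lends one connection out for a whole sequence.
class PinningPool {
  constructor(connections) {
    this.idle = connections.slice();
    this.waiting = [];
  }

  // Calls fn(connection, release); the connection stays out of the pool
  // until release() is called, so every request in fn can share it.
  reserve(fn) {
    const connection = this.idle.shift();
    if (!connection) {
      this.waiting.push(fn);
      return;
    }
    let released = false;
    const release = () => {
      if (released) return; // ignore a second release
      released = true;
      this.idle.push(connection);
      const next = this.waiting.shift();
      if (next) this.reserve(next);
    };
    fn(connection, release);
  }
}

// Fake connection: replies asynchronously and records what it handled.
function fakeConnection(id) {
  return {
    id,
    handled: [],
    send(request, callback) {
      this.handled.push(request);
      setImmediate(callback, null, `${request} via connection ${id}`);
    }
  };
}

const pool = new PinningPool([fakeConnection(1), fakeConnection(2)]);
pool.reserve((connection, release) => {
  // Both requests travel over the same connection, in order.
  connection.send("write", (error) => {
    if (error) { release(); return; }
    connection.send("check write", (error, reply) => {
      release();
      console.log(reply);
    });
  });
});
```

The cost of pinning is that the reserved connection is unavailable to everyone else until it is released, so the sequence should be kept short.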
These are just examples, we’re not done yet, but I think you’ll agree, we’re off to a great start.