I like Ruby. Everything about this language feels right, and it has some of the best development libraries and frameworks. One thing I like in particular is the way Ruby combines a full-fledged programming language with rich libraries, while retaining its dynamic, easy-to-script nature. And so lately I’ve been trying to imagine what it would look like to add process capabilities to Ruby. Over the next few weeks I’m going to explore that. I’m going to put Ruby through its paces, discover the best fit between Ruby and processes, and find ways to make it simple, scalable and most of all fun.

Before I start, a brief overview of why the hell we need processes to begin with.

In Web development, a lot of what we do involves short processing. Running a search, tagging a picture and updating a to-do list are all tasks that take a fairly short time to complete. In fact, shorter is better: it’s easier to scale and it gives the users a better response time. AJAX, for example, is all about making UI interactions even shorter by pushing less data to the browser. And Web pages that update quickly are just more fun to use.

But not every problem can be solved that way. You can’t index the Web fast enough to generate search results and show them each time a user makes a search query. You can’t even do that for a moderately-sized Web site. Instead, you do all the indexing in the background and push that information into a database, so the Web server can run queries on pre-indexed data. That type of background process, one that may take minutes, days or even years to complete, is the type of process I’m talking about.

The thing with background processes is, they will fail. Even the most reliable system will crash every so often. It may be the power going out, or the hardware dying, or your application running out of memory, or your Ruby interpreter mysteriously crashing. When it happens to a Web site, the site becomes unresponsive, and you click the Refresh button.
When it happens to a background task, that background task just dies. So the first problem to solve is adding the Refresh button smarts to background processes, so they can recover from failure.

Imagine a process that indexes the Web, a process that takes a month to complete a full-index update, but routinely ends up failing just short of a month. The first time it fails, recovery kicks in and restarts the process. Except it will fail again, restart, fail again, and never get to complete. So processes have to be really short, or …

The way applications deal with this problem is by breaking up complex processes into smaller tasks, each taking a fairly short time to complete, and storing the state information after each task completes. So instead of indexing the entire Web, each task indexes one site, taking a few minutes to complete. Now, if the server crashes, instead of restarting the entire process, it simply recovers and continues with the remaining uncompleted tasks. The most you lose is a few minutes of work.

For Web indexing, this is fairly simple to do. You solve it once, and for a fairly simple task. But when you have a lot of different processes, each with its own set of tasks and states, constantly re-inventing the wheel takes all the fun out of it. Why not use a library to take care of the grunt work, deal with all the boring details, and leave you to just write the logic?

So the next task on my list is to start exploring some creative ways to get Ruby to do processes.

photo: [essjay nz](http://www.flickr.com/photos/essjay/29443700/)
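
To make the "small tasks plus saved state" idea from the post concrete, here is a minimal sketch in plain Ruby. It is only an illustration, not a library and not a finished design: the helper name `run_with_checkpoints`, the JSON state file, and the idea of identifying each task by a string (say, a site name) are all assumptions made for the example.

```ruby
require "json"

# Hypothetical helper (not from any library): runs each task once and records
# the names of completed tasks in a JSON state file, so that a restart after
# a crash skips the work that already finished.
def run_with_checkpoints(tasks, state_path)
  # Recover: load the list of tasks completed before the last failure, if any.
  done = File.exist?(state_path) ? JSON.parse(File.read(state_path)) : []

  (tasks - done).each do |task|
    yield task                                   # the actual work, e.g. index one site
    done << task
    File.write(state_path, JSON.generate(done))  # store state after each task
  end

  done
end
```

Calling something like `run_with_checkpoints(sites, "index_state.json") { |site| index(site) }` (where `sites` and `index` are placeholders) runs every site on the first pass. If the process dies halfway through, running the same call again picks up with the first site that was not recorded, so the most you lose is the one task that was in flight. A real implementation would probably also want an atomic write for the state file and some way to deal with a task that fails every time.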
-
Oct 16th, 2005
Step By Step