How to Build a Node.js REST API on Windows Azure Websites

Use WebMatrix 2 to build a lightweight API that's hosted on Windows Azure Websites

In "Enabling a Node.js Server-Side App on Windows Azure," I introduced the idea of running Node.js within Windows Azure roles and why you might want to do so. Recent developments have made it easier (and potentially more cost-effective) to host Node.js applications on Windows Azure -- by hosting Node.js applications within Windows Azure Websites. A compelling use of Node.js and its lightweight, non-blocking, highly asynchronous HTTP infrastructure is the building of REST APIs. To demonstrate this, the article will discuss how to build a lightweight trace-logging API that can be used from any REST-capable client. This API is built with Node.js and uses the NoSQL MongoDB document-oriented database to store the event log trace entries. In contrast to the previously referenced article, in which we used the Azure SDK to build and deploy the solution to Azure, here I will show how to use WebMatrix 2 to build and deploy the application to Azure Websites.
 

Node.js in Azure Websites

The architecture for running Node.js within Azure Websites is much like the one used when hosting within a Web role, as I described in my previous article; it leverages the iisnode and URL Rewrite modules within IIS to, respectively, manage the execution of and communication with node process instances. Let's examine the architecture hosting our logging API, codenamed "Ronin," within Azure Websites (see Figure 1), so that we have a high-level view of all the moving parts before we dive into implementation details.

Figure 1: Hosting a Node.js REST API in Azure Websites

The API is simple, but it demonstrates many of the features needed for creating a robust REST API. Sending an HTTP 1.1 GET request to "/traces" (with optional query string parameters to limit results) retrieves the traces stored in the MongoDB database. We add new traces by making an HTTP POST to "/traces", where the request body contains a JSON-encoded payload that describes the message, severity, and source. (I'll go through the parts of the API code shortly.)

On the server side, within the Node.js code contained in server.js, the trace is augmented with a dateTime attribute, which serves as a timestamp for when the trace was created. This trace is then inserted into the traces collection (analogous to a table) of the Ronin database in MongoDB, at which point a unique identifier field (_id) is appended to the trace to identify it within the collection.
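
After both fields have been added, a stored trace comes back from the API looking something like this (the values shown are illustrative):

{
    "message": "Hello, Node!",
    "severity": "Error",
    "source": "Fiddler",
    "dateTime": "2012-10-10T22:26:16.679Z",
    "_id": "5075f60882026e9c0c000001"
}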

Because the API for querying and inserting traces is only part of the story, we also deploy a sample web page, shown in Figure 2, which shows invoking the API using jQuery -- adding traces by Ajax post and retrieving them with an Ajax GET.

Figure 2: Sample API Client Web Page

Starting from Scratch

WebMatrix 2 includes three Node.js templates that help you get started building a Node.js site that's configured to run under IIS as described previously, both locally and when hosted by Azure Websites. For our purposes in building a REST API in Node.js, we need only the bare-bones Empty Site template. To create a site from this template, launch WebMatrix. On the startup splash screen, click Templates; in the dialog that appears, click Node.js on the left and select the Empty Site template, as Figure 3 shows. Continue through the wizard (there are no further options to select), and you will be ready to go.

Figure 3: WebMatrix Node.js Templates

Building the Server: Adding the Required Packages

With the WebMatrix project ready, we can start building the API. Windows Azure Websites will run Node Package Manager (NPM) to get required modules if your deployment does not include a node_modules folder. However, you could run into problems if the modules need to be compiled, as Azure Websites might not have the necessary tools to compile them. The safe bet is to make sure your node_modules folder contains all your required packages before you deploy to Azure Websites.

The Ronin database relies on the following packages:

  • Express: Express is a framework that provides Node.js with functionality similar to what IIS offers. In our scenario, Express gives us the ability to respond to HTTP requests, manages the routes used to access our API, provides simplified access to the request body, and parses out query string parameters for us.
  • Async: Async provides a set of patterns for managing asynchronous method calls, chaining them, and joining parallel calls. Because non-blocking work is fundamental to Node.js, having the structure offered by Async helps to minimize the "nesting spaghetti" of callbacks that tends to result otherwise. It gives us a simplified structure for defining a chain of asynchronous calls as an array of functions.
  • MongoDB: This is the native driver for using MongoDB from JavaScript running within Node.js. In our scenario, we use it to open a connection to MongoDB, authenticate, and insert to or query from the traces collection, which contains our trace log entries.
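
If you prefer to let Azure Websites restore these packages with NPM rather than deploying your node_modules folder, a package.json along these lines would declare them (a minimal sketch; the version ranges shown are illustrative, not the exact versions used in this article):

{
  "name": "ronin",
  "version": "0.0.1",
  "dependencies": {
    "express": "3.x",
    "async": "0.x",
    "mongodb": "1.x"
  }
}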

The natural way to get these packages installed into your node_modules folder is to first install the NPM Gallery extension for WebMatrix 2. You do this within WebMatrix by clicking the Extensions button that appears at the far right of the Home tab on the ribbon, as shown in Figure 4.

Figure 4: WebMatrix Ribbon with Extensions Button at Right

In the search text box of the dialog that appears, search for NPM Gallery and click through the install process. After you've done this, you should see an NPM Gallery button to the right of the Extensions button. Once NPM Gallery is installed, installing the three required packages (Express, Async, and MongoDB) is a matter of clicking the Gallery button, searching for each package by name, and clicking through the installation process.

Now we are ready to start coding in Node.js. To do so, within WebMatrix on the accordion at the bottom left, click the Files button. In the tree that appears, double-click server.js. We will replace the "Hello, World!" code that the template provides with the code shown in Listing 1, which represents our completed Ronin API. We will walk through this code in the sections that follow.

Configuring Application Frameworks and Middleware

Starting from the top of the code, we use require statements to load the HTTP module and our three requisite modules: mongodb, async, and express. We access the functionality of these modules through the variables they are loaded into.

var http = require('http');
var mongodb = require('mongodb');
var async = require('async');
var express = require('express');

After that, we define a set of global variables that store our connection information and credentials for the MongoDB database.

var mongo_host = 'my-mongodb.cloudapp.net';
var mongo_port = 30000;
var mongo_db_name = 'ronin';
var mongo_requires_auth = true;
var mongo_auth_username = 'test1';
var mongo_auth_password = 'abc!123';

Next, we create an instance of an Express application, which will be responsible for responding to HTTP requests.

var app = express();

Notice that this instance of Express is what is passed in to create the HTTP server in the penultimate line in Listing 1. This just means that methods on the app instance of Express will be used for handling requests and that the server will listen for HTTP requests either on the port defined in the environment variable named PORT (if it is defined) or on port 8080. The option for using port 8080 is there in case you want to run the app outside of WebMatrix, using Node.js from the command line. When you run Node.js with WebMatrix or in Azure Websites, the PORT environment variable is used.

http.createServer(app).listen(process.env.PORT || 8080);

Following the instantiation of the Express instance, we configure Express to use certain "middleware." Think of this middleware as components in a pipeline that processes requests. In order, we enable:

  • express.logger, which logs request information to the console
  • express.query, which assists with parsing out the parameters from the request query string
  • express.bodyParser, which decodes the request body for us
  • app.router, which enables us to define routes, such as /traces, to access the API

One thing to note: Many samples using Express will add express.static to handle access to static content files. However, when we are hosting within IIS, the URL Rewrite module provides this functionality for us, so an express.static line is not needed. You can configure the routes used by the URL Rewrite module by editing the <rules> element of the <rewrite> section in the web.config file included with the WebMatrix template.
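
For reference, the relevant portion of that web.config looks roughly like the following; this is a sketch of the typical iisnode configuration rather than a verbatim copy of the template, and the rule names are illustrative:

<configuration>
  <system.webServer>
    <handlers>
      <!-- let iisnode handle requests that reach server.js -->
      <add name="iisnode" path="server.js" verb="*" modules="iisnode" />
    </handlers>
    <rewrite>
      <rules>
        <!-- serve files under /public directly as static content -->
        <rule name="StaticContent">
          <action type="Rewrite" url="public{REQUEST_URI}" />
        </rule>
        <!-- route everything else to the Node.js entry point -->
        <rule name="DynamicContent">
          <conditions>
            <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="True" />
          </conditions>
          <action type="Rewrite" url="server.js" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>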

The 'dev' logger configured in the first app.configure call yields color-coded log messages on the console when you run the app using Node directly from the command line; the second call, app.configure('development', …), enables the error handler only in the development environment. Listing 2 shows both calls.

Listing 2: Call to app.configure Showing Color-Coded Log Messages

app.configure(function () {
    app.use(express.logger('dev'));
    app.use(express.query());
    app.use(express.bodyParser());
    //app.use(express.static(__dirname + '/public')); NOT NEEDED -- IIS URL REWRITE DOES THIS
    app.use(app.router);
});

app.configure('development', function () {
    app.use(express.errorHandler());
});

Following this call are two long calls to app.post(…) and app.get(…). These are the heart of the API and define what happens when you perform an HTTP POST or HTTP GET to "/traces", respectively. Let's look at these in more detail, starting with app.get.

GETting Trace Data

The logic defined in app.get responds to requests of the form:

GET http://localhost:8080/traces HTTP/1.1

The reply contains a JSON-encoded array of trace objects pulled from MongoDB in the response body, such as the following (formatted for clarity):

[ {"message":"Hello, Node!","severity":"Error","source":"Fiddler","dateTime":"2012-10-10T22:26:16.679Z","_id":"5075f60882026e9c0c000001"} {"message":"Hello, there again Node!","severity":"Error","source":"Fiddler","dateTime":"2012-10-10T22:29:43.840Z","_id":"5075f6d7e5a6d77037000001"} ]

It also supports making requests of the form:

GET http://localhost:8080/traces?limit=5&start=2012-10-10T19:45:16.255Z&end=2012-10-10T19:47:16.255Z HTTP/1.1

In this request, the limit parameter specifies the maximum number of traces retrieved from MongoDB, and the start and end parameters define the date range of traces to include in the query. The values for these last two are ISO-formatted date strings.

Let's turn to the code that supports this. The app.get method registers a route to respond to GET requests on the "/traces" path. The second parameter defines a callback that provides the request and response objects needed to process the request and issue a reply. Recall that a tenet of Node.js is to make your code asynchronous and to minimize blocking operations. For this we use the async.waterfall method of the Async package. It lets us define a chain of non-blocking calls, where one call must finish before the next one is made, by defining an array of functions in the order they need to be called. If we "collapse" the array, the chain becomes clear, as Listing 3 shows.

app.get('/traces', function (req, res) {
    var server = new mongodb.Server(mongo_host, mongo_port, { auto_reconnect: true });
    var db = new mongodb.Db(mongo_db_name, server, { safe: false });
    async.waterfall(
    [
function openDb(callback){...},
function authenticate(callback){...},
function getTracesCollection(callback){...},
function queryTraces(traces, callback){...},
function returnQueryResults(results, callback){...},
    ],
    function finalize(err){...});
});

We start by opening a connection to the MongoDB database, then authenticate, get the traces collection, query it, and return the results. The finalize method in the waterfall (which sits outside the array) is called on two occasions. When everything goes smoothly, it is called last (that is, after returnQueryResults). If an error occurs within any of the methods, finalize is called immediately, and none of the subsequent methods in the array are called.

Notice that for each function in the array, the last parameter is callback. Any parameters before callback are those expected to be passed to the function by the previous function in the array. Let's look at the getTracesCollection method as an example. Notice we invoke the MongoDB driver's db.collection method, passing it the name of the desired collection ("traces") and a callback function as its second parameter. When the operation is completed, this function will be passed an error object (or null if no error occurred) and a reference to the traces collection.

The way we invoke the next step, queryTraces(), is by invoking the passed-in callback, passing the error object as its first parameter and any parameters the next step needs as the remaining parameters (in this case, just the traces object). Async will inspect the err object and, if it is null, will call queryTraces, passing it the traces object and the next callback.

function getTracesCollection(callback) {
    console.log('getting traces...');
    db.collection('traces', function (err, traces) {
        callback(err, traces);
    });
},
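
For completeness, the elided openDb and authenticate steps from Listing 3 are similarly thin wrappers around the driver; a sketch of how they might look (not a verbatim copy of Listing 1) follows:

function openDb(callback) {
    //open the TCP connection to the MongoDB server
    db.open(function (err, openedDb) {
        callback(err);
    });
},
function authenticate(callback) {
    //authenticate only when the server requires credentials
    if (mongo_requires_auth) {
        db.authenticate(mongo_auth_username, mongo_auth_password, function (err, result) {
            callback(err);
        });
    }
    else {
        callback(null);
    }
},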

The bulk of the work is basic use of the MongoDB driver to prepare for querying, so we will focus on the implementation of the actual query (within queryTraces) and the returning of the results (within returnQueryResults). The first few lines in queryTraces extract the values for limit, start, and end, if they are supplied with the request:

var limit = req.query.limit ? { "limit": req.query.limit} : { "limit": 1000 };
var start = req.query.start ? new Date(req.query.start) : null;
var end = req.query.end ? new Date(req.query.end) : null;

Then, if both start and end are defined, we build an object that describes the equivalent of a filter or an SQL "where" clause for MongoDB that searches for documents within the date range:

var filter = {};

if (start && end) {
    filter = { $and: [{ "dateTime": { $gte: start} }, { "dateTime": { $lt: end}}] };
}

In the preceding code, $and combines two conditions: the document's dateTime field must be greater than or equal to start (represented by $gte) and less than end (represented by $lt).
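
For comparison, the same filter expressed directly in the mongo shell (using the dates from the earlier example request) would look roughly like this:

db.traces.find({ $and: [{ dateTime: { $gte: ISODate("2012-10-10T19:45:16.255Z") } },
                        { dateTime: { $lt: ISODate("2012-10-10T19:47:16.255Z") } }] }).limit(5)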

Next we are ready to issue our query using the find method:

traces.find(filter, limit).toArray(function (err, results) {
    callback(err, results);
});

Observe that the first parameter to find is our filter object, and the second limits the number of documents returned by the query. By default, find returns a cursor; however, we just want a simple array that we can return to the client, so we call the toArray function. If all goes well, the function passed in will be invoked with the array of matching traces; if not, the err object will describe what went wrong. In either case, we invoke the callback and let Async decide what to do next.

Assuming the query succeeded, Async will next invoke returnQueryResults. Before we write the results to the body of the response, we should set the response headers to indicate that the operation was a success (HTTP 200) and that the body will contain JSON-encoded content (Content-Type: application/json). This is accomplished by the call to res.writeHead(). The results array retrieved from the previous step is serialized to JSON (using JSON.stringify) and sent in the response body with the call to res.write(). Finally, we must invoke the callback, as shown in Listing 4. In this case, we pass null as the error parameter as long as no errors occurred (any errors will be caught by the catch block and invoke the callback appropriately).

function returnQueryResults(results, callback) {
    try {
        console.log(results);
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.write(JSON.stringify(results));
        callback(null);
    }
    catch (ex) {
        callback(ex);
    }
}

Error or not, we still have some cleanup work to do. The finalize() function, shown in Listing 5, handles this task.

function finalize(err) {
    if (err) {
        console.log('error: ' + err);
        res.writeHead(500, err.toString());
    }
    else {
        console.log('traces were returned.');
    }

    res.end();

    if (db) {
        db.close(true); //ignore errors
        console.log('db connection closed');
    }
}

Namely, we end the response by calling res.end() and ensure that we close our TCP connection to MongoDB by calling db.close(), passing it a true parameter to force the connection to close and free the resource. If an error did occur in one of the waterfall steps, we handle that in finalize by writing a server error header (HTTP 500) that contains the human-readable error message. We do this by calling res.writeHead before we close the response using res.end().

POSTing Trace Data

The call to app.post registers a handler for POST requests that contain a JSON-encoded trace in the request body, such as this:

POST http://localhost:8080/traces HTTP/1.1
Content-Type: application/json
Content-Length: 64

{ "message": "Hello.", "severity": "Info", "source": "Fiddler" }


With an understanding of the waterfall pattern, the code for responding to POST requests to "/traces" and inserting a trace document into the database is nearly identical to the GET handler. We will focus on the unique aspects found in the writeLogEntry() function, shown in Listing 6, and the finalize() function, shown in Listing 7.
 

function writeLogEntry(traces, callback) {
    try {
        //append a time stamp field
        //ISO format required for MongoDB to treat as Date type
        //conversion handled automatically by driver
        req.body.dateTime = new Date();

        var traceDoc = JSON.stringify(req.body);
        console.log('writing log entry: "' + traceDoc + '"');

        traces.insert(req.body);
        callback(null);
    }
    catch (ex) {
        callback(ex);
    }
}


The crux of the work in the writeLogEntry function is to append a dateTime property, set to the current time, to the trace object provided in the request body. This is accomplished by assigning req.body.dateTime a new Date() instance (which represents the current time). The MongoDB driver for Node.js automatically converts this JavaScript date into the date format that MongoDB expects. With the dateTime field set, we call the insert method on the traces collection object, passing it the body object. The astute reader may notice that we do not specify a callback function in the insert call: this gives us a fire-and-forget approach to logging trace messages, with no callback to hear about an error but also no blocking overhead from waiting on MongoDB.
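
If you do want confirmation that the insert succeeded, the driver also accepts a write-concern option and a callback; a sketch of that alternative (trading the fire-and-forget behavior for a round trip to MongoDB) would be:

//acknowledge the write and surface any insert error through the waterfall callback
traces.insert(req.body, { safe: true }, function (err, docs) {
    callback(err);
});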

The finalize() function is also slightly different from the one shown for app.get: in the case of an error-free completion of the waterfall, we respond with the HTTP 200 success header.
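
A sketch of that POST-side finalize follows (see Listing 7 for the full version); it differs from Listing 5 mainly in the success branch:

function finalize(err) {
    if (err) {
        console.log('error: ' + err);
        res.writeHead(500, err.toString());
    }
    else {
        //report success explicitly for the POST case
        res.writeHead(200);
        console.log('trace written.');
    }

    res.end();

    if (db) {
        db.close(true); //ignore errors
    }
}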

Building the HTML Client

With the server-side components ready, let's complete the story with a simple client. We do this by adding a basic HTML page to the public folder already created by the template in WebMatrix. Listing 8 shows the complete source for this page.

Observe that the page loads jQuery from Microsoft's Content Delivery Network (CDN) and uses jQuery's $.get() and $.post() methods to invoke the Ronin API (within the getLogs and submitTrace functions at the bottom of the listing), reading values from and writing values to the HTML input elements and textarea on the page. Although this console page demonstrates how to use the API with an HTML form, it's more likely that you will invoke the API within your own page's JavaScript logic to emit traces as it executes. In that case, you would just call $.post as shown, passing in an appropriately constructed trace object. Because the URL Rewrite module is already configured to allow static access to files within the public folder, we do not have to make any further configuration changes to make this page accessible.
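
The relevant client-side calls boil down to something like the following sketch (the element IDs are illustrative; Listing 8 has the complete page):

function submitTrace() {
    //post the trace fields to the API; bodyParser on the server populates req.body
    $.post('/traces', {
        message: $('#message').val(),
        severity: $('#severity').val(),
        source: $('#source').val()
    });
}

function getLogs() {
    //fetch up to 10 traces and display the JSON in the page's textarea
    $.get('/traces', { limit: 10 }, function (data) {
        $('#results').val(JSON.stringify(data));
    });
}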

Publishing to Azure Websites

Now that you have this all built locally in WebMatrix, it's time to deploy it to Azure Websites. If you haven't already created an Azure Website to deploy to, you will want to do so now using the Azure Website Portal. This interface, shown in Figure 5, changes frequently, but as of this writing you need to click the + NEW button at the bottom of the preview portal, select Compute, Website, then select Quick Create. Fill in a unique DNS name for your site, choose your region and subscription, then click Create Website.


Figure 5: Creating a New Azure Website

When your website is provisioned, you will be taken to a dashboard. At the right, look for the Quick Glance set of links, click Download publish profile, and save this file. The file contains all the connection information WebMatrix needs to deploy your solution to the Azure Website you just created. Within WebMatrix, click the Publish button, and in the dialog click Import publish profile and select the file you just downloaded. Click through the process to finish the configuration and publish your Node.js website. Anytime you make changes to your site, just click Publish again; you will not need to re-enter the connection information.

REST Easy with Azure

Hopefully, after reading this article, you've gained an appreciation for the process of developing REST APIs with Node.js and MongoDB, the experience of developing with WebMatrix, and the ease of deploying to Azure Websites. One final thing to note: If you will be posting across domains from the browser to use the Ronin API, you will need to enable Cross-Origin Resource Sharing (CORS); see my article "Cross-Origin Resource Sharing for Windows Azure Cloud Apps" for more information. Perceptive readers should be able to spot what headers to plug into the app.get and app.post methods, as well as the minor tweaks needed on the client side.
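
As a hint, one minimal (and permissive) approach is to include the CORS header when writing each response, along these lines; tighten the allowed origin before using this in production:

//for example, in returnQueryResults, allow any origin to read the response
res.writeHead(200, {
    'Content-Type': 'application/json',
    'Access-Control-Allow-Origin': '*'
});

The same header can be added in the POST handler's finalize before the response is ended.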
