Nodejs Tutorial: Top 10 Mistakes Node.js Developers Make

3 Executing a callback multiple times

How many times have you saved a file and reloaded your Node web app only to see it crash really fast? The most likely scenario is that you executed the callback twice, meaning you forgot to return after the first time.
Let's create an example to replicate this situation. We will create a simple proxy server with some basic validation. To use it install the request dependency, run the example and open (for instance) http://localhost:1337/?url=http://www.google.com/. The source code for our example is the following:

var request = require('request');
var http = require('http');
var url = require('url');
var PORT = process.env.PORT || 1337;
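// No `g` flag: a global regex keeps state (lastIndex) between test() calls,
// which can make a valid URL fail validation on a later request.
var expression = /[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/i;
var isUrl = new RegExp(expression);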
var respond = function(err, params) {
var res = params.res;
var body = params.body;
var proxyUrl = params.proxyUrl;
res.setHeader('Content-type', 'text/html; charset=utf-8');
if (err) {
console.error(err);
res.end('An error occurred. Please make sure the domain exists.');
} else {
res.end(body);
}
};
http.createServer(function(req, res) {
var queryParams = url.parse(req.url, true).query;
var proxyUrl = queryParams.url;
if (!proxyUrl || (!isUrl.test(proxyUrl))) {
res.writeHead(200, { 'Content-Type': 'text/html' });
res.write("Please provide a correct URL param. For ex: ");
res.end("<a href='http://localhost:1337/?url=http://www.google.com/'>http://localhost:1337/?url=http://www.google.com/</a>");
} else {
// ------------------------
// Proxying happens here
// TO BE CONTINUED
// ------------------------
}
}).listen(PORT);

The source code above contains almost everything except the proxying itself, because I want you to take a closer look at it:

request(proxyUrl, function(err, r, body) {
if (err) {
respond(err, {
res: res,
proxyUrl: proxyUrl
});
}
respond(null, {
res: res,
body: body,
proxyUrl: proxyUrl
});
});

In the callback we have handled the error condition, but forgot to stop the execution flow after calling the respond function. That means that if we enter a domain that doesn't host a website, the respond function will be called twice and we will get the following message in the terminal:

Error: Can't set headers after they are sent.
at ServerResponse.OutgoingMessage.setHeader (http.js:691:11)
at respond (/Users/alexandruvladutu/www/airpair-2/3-multi-callback/proxy-server.js:18:7)

This can be avoided either by using the `return` statement or by wrapping the 'success' callback in the `else` statement:

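request(proxyUrl, function(err, r, body) {
  if (err) {
    // return stops the flow, so respond() runs only once
    return respond(err, { res: res, proxyUrl: proxyUrl });
  }
  respond(null, { res: res, body: body, proxyUrl: proxyUrl });
});
// OR:
request(proxyUrl, function(err, r, body) {
  if (err) {
    respond(err, { res: res, proxyUrl: proxyUrl });
  } else {
    respond(null, { res: res, body: body, proxyUrl: proxyUrl });
  }
});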
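
To see the difference without starting a server, here is a minimal sketch that counts how often the response handler runs. The names fakeRequest and makeResponder are hypothetical stand-ins invented for this illustration (they are not part of the request module), and fakeRequest fails synchronously only to keep the example short.

```javascript
// Hypothetical stand-in for request(): fails immediately so the example
// needs no network access (the real module calls back asynchronously).
function fakeRequest(url, callback) {
  callback(new Error('cannot resolve ' + url));
}

// Builds a respond-like function that only counts how often it is called.
function makeResponder() {
  var calls = 0;
  var respond = function(err, params) { calls += 1; };
  respond.count = function() { return calls; };
  return respond;
}

var buggyRespond = makeResponder();
fakeRequest('http://no-such-domain.invalid/', function(err, body) {
  if (err) {
    buggyRespond(err, {}); // no return: execution falls through
  }
  buggyRespond(null, { body: body });
});

var fixedRespond = makeResponder();
fakeRequest('http://no-such-domain.invalid/', function(err, body) {
  if (err) {
    return fixedRespond(err, {}); // return stops the flow here
  }
  fixedRespond(null, { body: body });
});

console.log('without return:', buggyRespond.count()); // 2
console.log('with return:', fixedRespond.count());    // 1
```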

4 The Christmas tree of callbacks (Callback Hell)

Every time somebody wants to bash Node they come up with the 'callback hell' argument. Some of them see callback nesting as unavoidable, but that is simply untrue. There are a number of solutions out there to keep your code nice and tidy, such as:

Using control flow modules (such as async);
Promises; and
Generators.

We are going to create a sample application and then refactor it to use the async module. The app will act as a naive frontend resource analyzer which does the following:

Checks how many scripts / stylesheets / images are in the HTML code;
Outputs their total number to the terminal;
Checks the content-length of each resource; then
Prints the total size of the resources to the terminal.

Besides the async module, we will be using the following npm modules:

request for getting the page data (body, headers, etc).
cheerio as jQuery on the backend (DOM element selector).
once to make sure our callback is executed once.

var URL = process.env.URL;
var assert = require('assert');
var url = require('url');
var request = require('request');
var cheerio = require('cheerio');
var once = require('once');
var isUrl = new RegExp(/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi);
assert(isUrl.test(URL), 'must provide a correct URL env variable');
request({ url: URL, gzip: true }, function(err, res, body) {
if (err) { throw err; }
if (res.statusCode !== 200) {
return console.error('Bad server response', res.statusCode);
}
var $ = cheerio.load(body);
var resources = [];
$('script').each(function(index, el) {
var src = $(this).attr('src');
if (src) { resources.push(src); }
});
// .....
// similar code for stylesheets and images
// check out the GitHub repo for the full version
var counter = resources.length;
var next = once(function(err, result) {
if (err) { throw err; }
var size = (result.size / 1024 / 1024).toFixed(2);
console.log('There are ~ %s resources with a size of %s Mb.', result.length, size);
});
var totalSize = 0;
resources.forEach(function(relative) {
var resourceUrl = url.resolve(URL, relative);
request({ url: resourceUrl, gzip: true }, function(err, res, body) {
if (err) { return next(err); }
if (res.statusCode !== 200) {
return next(new Error(resourceUrl + ' responded with a bad code ' + res.statusCode));
}
if (res.headers['content-length']) {
totalSize += parseInt(res.headers['content-length'], 10);
} else {
totalSize += Buffer.byteLength(body, 'utf8');
}
if (!--counter) {
next(null, {
length: resources.length,
size: totalSize
});
}
});
});
});

This doesn't look that horrible, but you can go even deeper with nested callbacks. From our previous example you can recognize the Christmas tree at the bottom, where you see indentation like this:

      if (!--counter) {
        next(null, {
          length: resources.length,
          size: totalSize
        });
      }
    });
  });
});

To run the app type the following into the command line:

$ URL=https://bbc.co.uk/ node before.js
# Sample output:
# There are ~ 24 resources with a size of 0.09 Mb.

After a bit of refactoring using async our code might look like the following:

var async = require('async');
var rootHtml = '';
var resources = [];
var totalSize = 0;
var handleBadResponse = function(err, url, statusCode, cb) {
if (!err && (statusCode !== 200)) {
err = new Error(url + ' responded with a bad code ' + statusCode);
}
if (err) {
cb(err);
return true;
}
return false;
};
async.series([
function getRootHtml(cb) {
request({ url: URL, gzip: true }, function(err, res, body) {
if (handleBadResponse(err, URL, res && res.statusCode, cb)) { return; }
rootHtml = body;
cb();
});
},
function aggregateResources(cb) {
var $ = cheerio.load(rootHtml);
$('script').each(function(index, el) {
var src = $(this).attr('src');
if (src) { resources.push(src); }
});
// similar code for stylesheets && images; check the full source for more
setImmediate(cb);
},
function calculateSize(cb) {
async.each(resources, function(relativeUrl, next) {
var resourceUrl = url.resolve(URL, relativeUrl);
request({ url: resourceUrl, gzip: true }, function(err, res, body) {
if (handleBadResponse(err, resourceUrl, res && res.statusCode, next)) { return; }
if (res.headers['content-length']) {
totalSize += parseInt(res.headers['content-length'], 10);
} else {
totalSize += Buffer.byteLength(body, 'utf8');
}
next();
});
}, cb);
}
], function(err) {
if (err) { throw err; }
var size = (totalSize / 1024 / 1024).toFixed(2);
console.log('There are ~ %s resources with a size of %s Mb.', resources.length, size);
});
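
Promises, listed earlier as another option, give a similarly flat structure. The following is only a sketch of the size calculation step under stated assumptions: fetchResource is a hypothetical stand-in for request that resolves with a fake response, so the example runs without network access, and sizeOf and totalSize are names made up for this illustration.

```javascript
// Hypothetical stand-in for request(): resolves with a fake response
// (no content-length header) so the sketch runs offline.
function fetchResource(resourceUrl) {
  return new Promise(function(resolve, reject) {
    setImmediate(function() {
      if (!resourceUrl) { return reject(new Error('missing URL')); }
      resolve({ headers: {}, body: 'body of ' + resourceUrl });
    });
  });
}

// Same rule as in the article: prefer content-length, else measure the body.
function sizeOf(response) {
  var header = response.headers['content-length'];
  return header ? parseInt(header, 10) : Buffer.byteLength(response.body, 'utf8');
}

// Fetch every resource in parallel and add up the sizes.
function totalSize(resourceUrls) {
  return Promise.all(resourceUrls.map(fetchResource)).then(function(responses) {
    return responses.reduce(function(sum, response) {
      return sum + sizeOf(response);
    }, 0);
  });
}

totalSize(['/a.js', '/b.css']).then(function(bytes) {
  console.log('total size: %d bytes', bytes);
}).catch(function(err) {
  console.error(err);
});
```

Promise.all rejects as soon as one request fails, so a single catch at the end replaces the error check inside every callback.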

5 Creating big monolithic applications

Developers new to Node come with mindsets from different languages and they tend to do things differently. For example, they put everything into a single file instead of breaking things into their own modules and publishing them to NPM.

Take our previous example for instance. We have pushed everything into a single file, making it hard to test and read the code. But no worries, with a bit of refactoring we can make it much nicer and more modular. This will also help with 'callback hell' in case you were wondering.

If we extract the URL validator, the response handler, the request functionality and the resource processor into their own files our main one will look like so:

// ...
var handleBadResponse = require('./lib/bad-response-handler');
var isValidUrl = require('./lib/url-validator');
var extractResources = require('./lib/resource-extractor');
var request = require('./lib/requester');
// ...
async.series([
function getRootHtml(cb) {
request(URL, function(err, data) {
if (err) { return cb(err); }
rootHtml = data.body;
cb();
});
},
function aggregateResources(cb) {
resources = extractResources(rootHtml);
setImmediate(cb);
},
function calculateSize(cb) {
async.each(resources, function(relativeUrl, next) {
var resourceUrl = url.resolve(URL, relativeUrl);
request(resourceUrl, function(err, data) {
if (err) { return next(err); }
if (data.res.headers['content-length']) {
totalSize += parseInt(data.res.headers['content-length'], 10);
} else {
totalSize += Buffer.byteLength(data.body, 'utf8');
}
next();
});
}, cb);
}
], function(err) {
if (err) { throw err; }
var size = (totalSize / 1024 / 1024).toFixed(2);
console.log('\nThere are ~ %s resources with a size of %s Mb.', resources.length, size);
});

The request functionality might look like this:

var handleBadResponse = require('./bad-response-handler');
var request = require('request');
module.exports = function getSiteData(url, callback) {
request({
url: url,
gzip: true,
// lying a bit
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
}, function(err, res, body) {
if (handleBadResponse(err, url, res && res.statusCode, callback)) { return; }
callback(null, {
body: body,
res: res
});
});
};

Note: you can check the full example in the GitHub repo.

Now things are simpler, way easier to read and we can start writing tests for our app. We can go on with the refactoring and extract the response length functionality into its own module as well.
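
As an illustration of that last step, the response length logic could be extracted along these lines. This is a sketch, not the code from the repo: the file name lib/response-size.js and the function name getResponseSize are assumptions made for this example.

```javascript
// lib/response-size.js (hypothetical file name)

// Returns the size of a response in bytes: the content-length header
// when the server sent one, otherwise the byte length of the body.
function getResponseSize(res, body) {
  var header = res.headers['content-length'];
  if (header) {
    return parseInt(header, 10);
  }
  return Buffer.byteLength(body || '', 'utf8');
}

module.exports = getResponseSize;
```

With such a module in place, the callback in calculateSize would shrink to a single line such as totalSize += getResponseSize(data.res, data.body), and the function can be tested without making any HTTP requests.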

The good thing about Node is that it encourages you to write tiny modules and publish them to NPM. You will find modules for all kinds of things, such as generating a random number within an interval. You should strive for modularity in your Node applications and keep things as simple as possible.

An interesting article on how to write modules is the one from substack.

6 Poor logging

Many Node tutorials show you a small example that contains console.log here and there, so some developers are left with the impression that that's how they should implement logging in their application.
You should use something better than console.log when coding Node apps, and here's why:

No need to use util.inspect for large, complex objects;
Built-in serializers for things like errors, request and response objects;
Support for multiple streams for controlling where the logs go;
Automatic inclusion of hostname, process id, application name;
Supports multiple levels of logging (debug, info, error, fatal etc);
Advanced functionality such as log file rotation, etc.

You can get all of those for free when using a production-ready logging module such as bunyan. On top of that you also get a handy CLI tool for development if you install the module globally.

Let's take a look at one of their examples on how to use it:

var http = require('http');
var bunyan = require('bunyan');
var log = bunyan.createLogger({
name: 'myserver',
serializers: {
req: bunyan.stdSerializers.req,
res: bunyan.stdSerializers.res
}
});
var server = http.createServer(function (req, res) {
log.info({ req: req }, 'start request'); // <-- this is the guy we're testing
res.writeHead(200, { 'Content-Type': 'text/plain' });
res.end('Hello World\n');
log.info({ res: res }, 'done response'); // <-- this is the guy we're testing
});
server.listen(1337, '127.0.0.1', function() {
log.info('server listening');
var options = {
port: 1337,
hostname: '127.0.0.1',
path: '/path?q=1#anchor',
headers: {
'X-Hi': 'Mom'
}
};
var req = http.request(options, function(res) {
res.resume();
res.on('end', function() {
process.exit();
})
});
req.write('hi from the client');
req.end();
});

If you run the example in the terminal you will see something like the following:

$ node server.js
{"name":"myserver","hostname":"MBP.local","pid":14304,"level":30,"msg":"server listening","time":"2014-11-16T11:30:13.263Z","v":0}
{"name":"myserver","hostname":"MBP.local","pid":14304,"level":30,"req":{"method":"GET","url":"/path?q=1#anchor","headers":{"x-hi":"Mom","host":"127.0.0.1:1337","connection":"keep-alive"},"remoteAddress":"127.0.0.1","remotePort":61580},"msg":"start request","time":"2014-11-16T11:30:13.271Z","v":0}
{"name":"myserver","hostname":"MBP.local","pid":14304,"level":30,"res":{"statusCode":200,"header":"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nDate: Sun, 16 Nov 2014 11:30:13 GMT\r\nConnection: keep-alive\r\nTransfer-Encoding: chunked\r\n\r\n"},"msg":"done response","time":"2014-11-16T11:30:13.273Z","v":0}

But in development it's better to pipe the output through the bunyan CLI tool (node server.js | bunyan), which pretty-prints the records.

As you can see, bunyan gives you a lot of useful information about the current process, which is vital in production. Another handy feature is that you can pipe the logs into a stream (or multiple streams).
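
To make the fields in the JSON records shown earlier more concrete, here is a rough sketch of how such a record can be assembled with core modules only. It is not bunyan's implementation: makeRecord and log are hypothetical helpers written for this illustration, and the numeric levels are simply chosen to match the scale seen in the sample output (30 for info).

```javascript
var os = require('os');

// Numeric levels on the same scale as the sample output (30 = info).
var LEVELS = { debug: 20, info: 30, error: 50, fatal: 60 };

// Builds one structured log record; hostname and pid are added automatically.
function makeRecord(appName, levelName, fields, msg) {
  var record = {
    name: appName,
    hostname: os.hostname(),
    pid: process.pid,
    level: LEVELS[levelName]
  };
  // Copy any extra fields (for example a port or a request id).
  Object.keys(fields || {}).forEach(function(key) {
    record[key] = fields[key];
  });
  record.msg = msg;
  record.time = new Date().toISOString();
  return record;
}

// Writes the record as one line of JSON, which other tools can parse.
function log(appName, levelName, fields, msg) {
  process.stdout.write(JSON.stringify(makeRecord(appName, levelName, fields, msg)) + '\n');
}

log('myserver', 'info', { port: 1337 }, 'server listening');
```

A production logger adds much more on top of this (serializers, streams, rotation), which is the reason to use an existing module rather than a home-grown helper.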

7 No tests

We should never consider our applications 'done' if we haven't written any tests for them. There's really no excuse, considering how many tools exist for the job:

Testing frameworks: mocha, jasmine, tape and many others
Assertion modules: chai, should.js
Modules for mocks, spies, stubs or fake timers such as sinon
Code coverage tools: istanbul, blanket

The convention for NPM modules is that you specify a test command in your package.json, for example:

{
  "name": "express",
  ...
  "scripts": {
    "test": "mocha --require test/support/env --reporter spec --bail --check-leaks test/ test/acceptance/",
    ...
  }
}

Then the tests can be run with npm test, regardless of the testing framework used.

Another thing you should consider for your projects is to enforce having all your tests pass before committing. Fortunately it is as simple as doing npm i pre-commit --save-dev.

You can also decide to enforce a certain code coverage level and deny commits that don't adhere to that level. The pre-commit module simply runs npm test automatically for you as a pre-commit hook.

In case you are not sure how to get started with writing tests you can either find tutorials online or browse popular Node projects on GitHub, such as the following:

express
loopback
ghost
hapi
haraka

8 Not using static analysis tools

Instead of spotting problems in production it's better to catch them right away in development by using static analysis tools.
Tools such as ESLint help solve a huge array of problems, such as:

Possible errors, for example: disallow assignment in conditional expressions, disallow the use of debugger.
Enforcing best practices, for example: disallow declaring the same variable more than once, disallow use of arguments.callee.
Finding potential security issues, such as the use of eval() or unsafe regular expressions.
Detecting possible performance problems.
Enforcing a consistent style guide.

For a more complete set of rules check out the ESLint rules documentation page. You should also read the configuration documents if you want to set up ESLint for your project.
In case you were wondering where you can find a sample configuration file for ESLint, the Esprima project has one.

There are other similar linting tools out there such as JSLint or JSHint.

In case you want to parse source code into an AST (abstract syntax tree) and create a static analysis tool by yourself, consider Esprima or Acorn.

9 Zero monitoring or profiling

Not monitoring or profiling a Node application leaves you in the dark. You are not aware of vital things such as event loop delay, CPU load, system load or memory usage.

There are proprietary services that take care of these things for you, such as the ones from New Relic, StrongLoop, Concurix or AppDynamics.

You can also achieve that by yourself with open source modules such as look, or by gluing different NPM modules together. Whatever you choose, make sure you are always aware of the status of your application, unless you want to receive weird phone calls at night.
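
As a rough idea of what the do-it-yourself route involves, the sketch below samples a few of the metrics mentioned above using only core modules. The function names sampleProcessStats and measureEventLoopDelay are made up for this illustration, and the timer-based delay measurement is a crude approximation rather than what a monitoring product does.

```javascript
var os = require('os');

// Snapshot of memory, CPU time and system load for the current process.
function sampleProcessStats() {
  var mem = process.memoryUsage();
  var cpu = process.cpuUsage(); // microseconds of user/system CPU time
  return {
    rssMb: mem.rss / 1024 / 1024,
    heapUsedMb: mem.heapUsed / 1024 / 1024,
    cpuUserMs: cpu.user / 1000,
    loadAvg1m: os.loadavg()[0] // always 0 on Windows
  };
}

// Event loop delay: schedule a timer and measure how late it fires.
function measureEventLoopDelay(intervalMs, callback) {
  var start = process.hrtime();
  setTimeout(function() {
    var diff = process.hrtime(start);
    var elapsedMs = diff[0] * 1000 + diff[1] / 1e6;
    callback(Math.max(0, elapsedMs - intervalMs));
  }, intervalMs);
}

console.log(sampleProcessStats());
measureEventLoopDelay(100, function(delayMs) {
  console.log('event loop delay: ~%s ms', delayMs.toFixed(2));
});
```

In a real application these samples would be taken periodically and shipped to whatever system stores your metrics.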

10 Debugging with console.log

When something goes bad it's easy to just insert console.log in some places and debug. After you figure out the problem you remove the console.log debugging leftovers and go on.

The problem is that the next developer (or even you) might come along and repeat the process. That's why modules like debug exist. Instead of inserting and deleting console.log you can replace it with the debug function and just leave it there.

Once the next guy tries to figure out the problem they just start the application using the DEBUG environment variable.

This tiny module has its benefits:

Unless you start the app using the DEBUG environment variable nothing is displayed to the console.
You can selectively debug portions of your code (even with wildcards).
The output is beautifully colored in your terminal.

Let's take a look at their official example:

// app.js
var debug = require('debug')('http')
, http = require('http')
, name = 'My App';
// fake app
debug('booting %s', name);
http.createServer(function(req, res){
debug(req.method + ' ' + req.url);
res.end('hello\n');
}).listen(3000, function(){
debug('listening');
});
// fake worker of some kind
require('./worker');

// worker.js
var debug = require('debug')('worker');
setInterval(function(){
debug('doing some work');
}, 1000);

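If we run the example with node app.js nothing is printed, but if we set the DEBUG environment variable (for example, DEBUG=http,worker node app.js), voila: the debug output appears in the terminal.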

Besides your applications, you can also use it for tiny modules published to NPM. Unlike a more complex logger it only does the debugging job and it does it well.
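
To illustrate the idea behind the DEBUG variable (comma-separated namespaces, with * as a wildcard), here is a simplified sketch of the matching logic. It is not the debug module's actual implementation and leaves out features such as colors and exclusions; isEnabled and createDebug are hypothetical names used only here.

```javascript
// Returns true when the namespace matches one of the comma-separated
// patterns in debugEnv; '*' in a pattern matches any run of characters.
function isEnabled(namespace, debugEnv) {
  if (!debugEnv) { return false; }
  return debugEnv.split(',').some(function(pattern) {
    pattern = pattern.trim();
    if (!pattern) { return false; }
    // Escape regex metacharacters, then turn * into "match anything".
    var source = pattern
      .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
      .replace(/\*/g, '.*');
    return new RegExp('^' + source + '$').test(namespace);
  });
}

// Returns a logging function that stays silent unless DEBUG enables it.
function createDebug(namespace) {
  return function(message) {
    if (isEnabled(namespace, process.env.DEBUG)) {
      console.error('%s %s', namespace, message);
    }
  };
}

var debugWorker = createDebug('worker');
debugWorker('doing some work'); // printed only if DEBUG matches "worker"
```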
Written by Alexandru Vladutu


Wednesday, August 17, 2016

Top 10 Mistakes Node.js Developers Make _part 2(end)
