Here are the links to the previous installments:
- Introduction
- Threads vs. Events
- Using Non-Standard Modules
- Debugging with node-inspector
- CommonJS and Creating Custom Modules
- Node Version Management with n
- Implementing Events
- BDD Style Unit Tests with Jasmine-Node Sprinkled With Some Should
- “node_modules” Folders
It's already the tenth blog post in this series on Node.js! For this post, we'll be talking about a fairly common scenario when developing applications with Node.js: reading data from one stream and sending it to another. Suppose we want to develop a simple web application that reads a particular file from disk and sends it to the browser. The following code shows a very simple and naïve implementation that makes this happen.

```javascript
var http = require('http'),
    fileSystem = require('fs'),
    path = require('path');

http.createServer(function(request, response) {
    var filePath = path.join(__dirname, 'AstronomyCast Ep. 216 - Archaeoastronomy.mp3');
    var stat = fileSystem.statSync(filePath);

    response.writeHead(200, {
        'Content-Type': 'audio/mpeg',
        'Content-Length': stat.size
    });

    var readStream = fileSystem.createReadStream(filePath);

    readStream.on('data', function(data) {
        response.write(data);
    });

    readStream.on('end', function() {
        response.end();
    });
})
.listen(2000);
```

Here we create a read stream for the data of an mp3 file and write every chunk it emits to the response stream. When we point our browser to http://localhost:2000, it pretty much behaves as we expect: the mp3 file either starts playing or the browser asks whether the file should be downloaded.

But as I mentioned earlier, this is a pretty naïve implementation. The big issue with this approach is that reading the data from disk through the read stream is usually a lot faster than streaming it out through the HTTP response. So when the data of the mp3 file is read too fast, the write stream is not able to flush it in a timely manner and starts buffering it in memory. For this simple example this is not really a big deal, but if we want to scale this application to handle lots and lots of concurrent requests, each of them potentially holding large parts of a file in memory, then having Node.js compensate like this can place an intolerable burden on the application.

So, the way to fix this problem is to check whether the data actually gets flushed when we hand it to the write stream: write() returns false when the data is being buffered instead. In that case we need to pause the read stream. As soon as the buffer is emptied and the write stream signals this by emitting its 'drain' event, we can safely resume reading data from the read stream.

This throttling of data between a read stream and a write stream is a fairly common pattern, generally referred to as the "pump pattern". Because it's so commonly used, Node.js provides a helper function, util.pump(), that takes care of all the goo required to correctly implement this behavior. Both the manual approach and the helper are shown in the sketches below.
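
First, the manual approach. Here's a minimal sketch of how we could wire this up by hand, building on the same request handler as above and using write()'s return value together with the 'drain' event:

```javascript
var http = require('http'),
    fileSystem = require('fs'),
    path = require('path');

http.createServer(function(request, response) {
    var filePath = path.join(__dirname, 'AstronomyCast Ep. 216 - Archaeoastronomy.mp3');
    var stat = fileSystem.statSync(filePath);

    response.writeHead(200, {
        'Content-Type': 'audio/mpeg',
        'Content-Length': stat.size
    });

    var readStream = fileSystem.createReadStream(filePath);

    readStream.on('data', function(data) {
        // write() returns false when the data could not be flushed
        // and had to be buffered, so we stop reading for a while.
        var flushed = response.write(data);
        if (!flushed) {
            readStream.pause();
        }
    });

    // The 'drain' event tells us the write buffer is empty again,
    // so it's safe to resume reading from disk.
    response.on('drain', function() {
        readStream.resume();
    });

    readStream.on('end', function() {
        response.end();
    });
})
.listen(2000);
```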
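
And here's a sketch of the same server rewritten on top of the helper. util.pump() takes the read stream, the write stream, and an optional callback that is invoked once everything has been pumped through or an error occurred; it pauses and resumes the read stream as needed and ends the response when all the data has been written:

```javascript
var http = require('http'),
    fileSystem = require('fs'),
    path = require('path'),
    util = require('util');

http.createServer(function(request, response) {
    var filePath = path.join(__dirname, 'AstronomyCast Ep. 216 - Archaeoastronomy.mp3');
    var stat = fileSystem.statSync(filePath);

    response.writeHead(200, {
        'Content-Type': 'audio/mpeg',
        'Content-Length': stat.size
    });

    // util.pump() takes care of the pause/resume bookkeeping for us.
    util.pump(fileSystem.createReadStream(filePath), response, function(error) {
        if (error) {
            console.error(error);
        }
    });
})
.listen(2000);
```

Note that later versions of Node.js deprecated util.pump() in favor of readStream.pipe(response), which implements the same back-pressure handling.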
Using this utility function certainly cleans up the code, making it more readable and easier to follow, don't you think? If you're curious, then you might also want to check out the implementation of the util.pump() function.
So get that data flowing already :-).