Node.js streams for lower memory consumption when handling huge files.

Streams: A stream is a collection of data that arrives over time and doesn't have to fit into memory all at once.
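
For instance, here is a minimal sketch (assuming some file sample.txt already exists on disk): a readable file stream hands you the file one chunk at a time instead of one giant buffer.

const fs = require('fs');

const readStream = fs.createReadStream('sample.txt');

readStream.on('data', (chunk) => {
  // each chunk is a Buffer, by default at most 64 KB for fs read streams
  console.log(`received ${chunk.length} bytes`);
});

readStream.on('end', () => console.log('finished reading'));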

Let's go through an example to understand stream processing in Node.js.

Step 1: Generate a large file, or use any existing large file.

The code below generates a text file of roughly 1 GB using a Node.js write stream.

const fs = require('fs');

const file = fs.createWriteStream("E:/node_stream/sample.txt"); // change the path for your system

// ~575 bytes of dummy text; written twice per iteration over 1e6 iterations, giving roughly 1 GB
const text = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum";

for (let i = 0; i <= 1e6; i++) {
  file.write(text);
  file.write(text);
}
file.end();


Step 2: The code below uses a simple Node.js HTTP server to serve the file over HTTP, relying only on the core modules fs and http.

const fs = require('fs');
const server = require('http').createServer();

server.on('request', (req, res) => {
  // Read the whole file into memory, then send it as a single buffer
  fs.readFile('E:/node_stream/sample.txt', (err, data) => {
    if (!err) res.end(data);
    else console.log(err);
  });
});

server.listen(8999, () => {
  console.log("listening on 8999");
});


Run the above script on your system, then check your process manager for the Node.js process's memory usage before hitting the server to fetch the file.
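
If you'd rather not rely on the process manager, a quick alternative (a small optional snippet, not part of the original server code) is to have the process log its own memory usage with process.memoryUsage():

// Optional: print the resident set size of this Node.js process every 3 seconds
setInterval(() => {
  const { rss } = process.memoryUsage();
  console.log(`memory (RSS): ${(rss / 1024 / 1024).toFixed(1)} MB`);
}, 3000);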

Below are screenshots taken before and after hitting the HTTP server from my browser.

Before:

It's quite clear that the memory usage is roughly 9 MB. Let's hit the HTTP server on port 8999 and check the memory usage again.

After:


Now you can see the memory usage jump to roughly 1 GB. This is because the whole file has to be buffered into memory at once, which is very inefficient and cannot scale to a large number of requests.

Step 3: Now let's change our HTTP server implementation to use streams to serve the same file.

const fs = require('fs');
const server = require('http').createServer();

server.on('request', (req, res) => {
  // Stream the file to the response in chunks instead of buffering it all
  fs.createReadStream('E:/node_stream/sample.txt').pipe(res);
});

server.listen(8999, () => {
  console.log("listening on 8999");
});

Here we use the createReadStream method from the fs module and chain its pipe method to the writable stream res. Since res is a writable stream, we can pipe the read stream directly into it.
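
Because pipe returns the destination stream, you can also chain transform streams in between. As an optional variation (not part of the original example), the core zlib module can compress the file on the fly before it reaches the response:

const zlib = require('zlib');

server.on('request', (req, res) => {
  res.writeHead(200, { 'Content-Encoding': 'gzip' });
  fs.createReadStream('E:/node_stream/sample.txt')
    .pipe(zlib.createGzip()) // transform stream: compresses chunks as they pass through
    .pipe(res);              // writable stream: the HTTP response
});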

The pipe method also takes care of backpressure, which matters when the read side produces data faster than the write side can consume it: pipe pauses the readable stream until the writable stream drains, so data doesn't pile up in memory.
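
To get a feel for what pipe is doing for us, here is a rough sketch of the equivalent manual wiring (a simplified illustration, not the actual pipe implementation):

server.on('request', (req, res) => {
  const readStream = fs.createReadStream('E:/node_stream/sample.txt');

  readStream.on('data', (chunk) => {
    // res.write() returns false when the response's internal buffer is full
    if (!res.write(chunk)) {
      readStream.pause();                           // stop reading for now
      res.once('drain', () => readStream.resume()); // resume once the buffer empties
    }
  });

  readStream.on('end', () => res.end());
  readStream.on('error', (err) => {
    console.log(err);
    res.end();
  });
});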

Now let's compare the memory usage again using the process manager.

Before:

As expected, since we haven't hit the server yet, memory usage is roughly 9 MB.

Now hit the HTTP server on port 8999 and check the memory usage again.

After:


If you watch the memory usage for a while, it stays roughly within the range of 10 MB to 40 MB, which is quite impressive, and we can now handle many more HTTP requests.

Thanks for reading.