
Multiprocessing for NodeJS Web Servers: Dealing with CPU-bound tasks

Covenant Okoroafor

Photo by Olivier Collet on Unsplash

This article is heavily based on Hussein Nasser’s video on a similar topic. Check it out if you haven’t.

Prerequisites:

I don’t fancy putting big “prerequisites” blocks in my articles, since they can sometimes do more harm than good, but I won’t deny that knowing the following will make life easier for all of us:
— Familiarity with the NodeJS and Express ecosystem: you’ve built an Express server, no matter how simple.
— A basic knowledge of Promises (specifically async/await and .then()).
— A basic knowledge of asynchronous JavaScript: you should be able to write code that uses callbacks, Promises, and async/await.

Intro

Imagine you’re building an innovative weather forecasting app that depends on intricate meteorological simulations for pinpoint predictions. These calculations, however, pose a problem: they’re CPU-intensive and can stall your application, preventing it from receiving and processing requests. Node.js excels at handling I/O-bound tasks with its asynchronous model, but CPU-intensive operations require a different approach. In this article, we’ll embark on a journey into multiprocessing to gracefully handle CPU-bound tasks, allowing us to build high-performance applications.

Here’s a simple example of a blocking CPU operation to make this relatable:

const express = require("express");
const app = express();
app.use(express.json());

function fibonacci(number) {
  if (number < 0) {
    throw new Error("Non-negatives only, please");
  }
  if (number < 2) {
    return number;
  }
  return fibonacci(number - 1) + fibonacci(number - 2);
}

app.get("/blocking", (req, res) => {
  const number = parseInt(req.query.number);
  const start = Date.now();
  const result = fibonacci(number);
  const end = Date.now();
  res
    .status(200)
    .json({ number: number, result: result, time: end - start + "ms" });
});

app.get("/non-blocking", (req, res) => {
  res.status(200).json("Hello");
});

app.listen(3000, () => {
  console.log("Server running");
});

This code creates a simple web server with the Express framework in NodeJS. The “blocking” route reads the number sent as a query parameter, computes the Fibonacci number at that index, measures how long the calculation took, and returns this information as a JSON response. We also have a “non-blocking” route that returns a string whenever it is visited.

Quick Note: This is not an efficient way to calculate Fibonacci numbers; there are far more efficient implementations (look up memoization). I deliberately made this function inefficient to demonstrate a CPU-bound task.
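
If you’re curious, a memoized version might look something like this (we won’t use it here, precisely because we want the slow one):

const memo = new Map();
function fibonacciFast(n) {
  if (n < 2) return n;
  if (!memo.has(n)) {
    // Cache each value so every index is computed only once
    memo.set(n, fibonacciFast(n - 1) + fibonacciFast(n - 2));
  }
  return memo.get(n);
}
// fibonacciFast(45) returns almost instantly; fibonacci(45) takes seconds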

Create a new file called “index.js” in VS Code (or your preferred editor) and copy the code into the newly created file. Open a terminal window in the same directory, execute the file with Node (node index.js), and visit localhost:3000/blocking?number=10 (this sends a request to the “blocking” route with a query parameter of 10). You should get a response like the one below:
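
The exact timing will vary from machine to machine, but the JSON body should look roughly like this:

{ "number": 10, "result": 55, "time": "0ms" }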

From the results, the code was executed in less than a millisecond. All good and fine, but what happens with a much larger number?

That doesn’t look good

This operation seems to be taking a while; let’s check our “non-blocking” route:

Our “non-blocking” route is blocked. How ironic.

Why is this happening?

The NodeJS runtime uses a single-threaded execution model; the V8 engine operates using a call stack and a heap — the call stack holds the currently executing function, and the heap contains dynamic data such as objects. As our program runs, each function is pushed onto the stack and is executed until it returns. When our Fibonacci function starts calculating the Fibonacci of the given input, the rest of the program is stalled waiting for it to complete — this includes the request-listening and -processing parts of our application.
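
You can observe this blocking behaviour in isolation with a tiny standalone script (a hypothetical example, separate from our server):

setTimeout(() => console.log("timer fired"), 0);

// A synchronous busy loop hogs the call stack...
const start = Date.now();
while (Date.now() - start < 3000) {} // burn ~3 seconds of CPU

console.log("loop done");
// "timer fired" only prints now, roughly 3 seconds late, because the
// event loop couldn't run the callback while the stack was busy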

One suggestion for a solution would be to use a Promise — after all, JS developers are typically recommended to use Promises for blocking operations.

Potential Solution 1: Using Promises

Let’s try to wrap the entire operation in a promise handler. The rest of the code will remain the same.

app.get("/blocking", async (req, res) => {
number = parseInt(req.query.number);
Promise.resolve().then(() => {
const start = Date.now();
const result = fibonacci(number);
const end = Date.now();
res
.status(200)
.json({ number: number, result: result, time: end - start + "ms" });
})
});

Note: The main body of a Promise executes synchronously; the Promise handler (.then/.catch) is the block of code that executes asynchronously. This is why we chain onto a resolved Promise instead of running the code in the body of a Promise. To find out more on this, visit this wonderful article.
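
A quick sketch to illustrate this ordering:

new Promise((resolve) => {
  console.log("1: the executor runs immediately");
  resolve();
}).then(() => {
  console.log("3: the handler runs after all synchronous code");
});
console.log("2: synchronous code after the Promise");
// Prints 1, 2, 3: the handler is queued, not run in place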

Let’s check on our modified application:

Same as before

Under the Hood: Promises and the Promise Queue
We end up with the same behaviour. This happens because a Promise only staggers the execution of our code: the main body of the Promise executes synchronously, and the Promise handler is queued up to run after all the synchronous code in our program has run, so our CPU-intensive function still blocks the generation of a response until it’s done running. The Event Loop is also unable to process other requests while handling our queued-up operation from the Promise Queue (the microtask queue).

How Then Does Async I/O Work?
How is it possible that await asynchronousIOFunction() (where “asynchronousIOFunction” stands for any promise-based asynchronous I/O function) allows our code to be non-blocking if a Promise only staggers the execution of our code?

Asynchronous I/O is possible because JavaScript can delegate I/O tasks (disk reads, network reads, etc.) to libuv, a C library for handling asynchronous I/O operations. These tasks execute independently of JavaScript and return data to callbacks and microtasks. These callbacks and microtasks (this includes Promise handlers, by the way) will run after all of the synchronous code completes. (For a detailed and visualised guide on this process, see https://www.builder.io/blog/visual-guide-to-nodejs-event-loop#libuv)
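
To make this concrete, here’s a minimal sketch (the file path is hypothetical) where a timer keeps firing while libuv handles a read:

const fs = require("fs/promises");

// The read is handed off to libuv, so this timer keeps firing
const interval = setInterval(() => console.log("event loop is free"), 100);

fs.readFile("./some-large-file.txt") // hypothetical file
  .then((data) => {
    console.log(`read ${data.length} bytes`);
    clearInterval(interval);
  })
  .catch((err) => {
    console.error(err);
    clearInterval(interval);
  });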

The Fix: Multithreading and Multiprocessing

These are two ways of achieving the same goal: dividing a program into sub-tasks and delegating those tasks to the operating system, so that multiple segments of our program can run at the same time.

A thread is the smallest unit of execution the operating system can schedule: a sequence of instructions that runs on a CPU core. Multiple threads can exist within a single process, and each thread can execute its own set of instructions independently, scheduled across the CPU’s cores.

In a CPU, the hardware components responsible for executing instructions are called “cores”. A core represents a computational unit designed to carry out instructions. In a single-core CPU, only one set of instructions runs at any given moment; to give the appearance of multiple programs running simultaneously, the operating system rapidly switches between them, a technique known as context switching.
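
If you’re curious, you can check how many logical cores Node sees on your machine with the built-in os module:

const os = require("os");

// os.cpus() returns one entry per logical core
console.log(`This machine has ${os.cpus().length} logical cores`);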

Multi-threading in software development typically refers to splitting our program into different threads of the same process, allowing them to execute pseudo-independently. Microsoft Word, for example, employs multi-threading by having a dedicated thread to collect user input via keystrokes, a different thread to update the document being worked on, and so on.

Single vs multithreaded vs multiprocess — Original article: https://towardsdatascience.com/multithreading-and-multiprocessing-in-10-minutes-20d9b3c6a867

Multi-threading offers several advantages:

Lighter than Processes: Creating a new thread is relatively inexpensive compared to creating an entire process on the system.

Faster Context Switching: Since the threads are all part of the same process, it’s easier to switch between them when they are being executed.

Lighter Communication and Shared Memory: Threads in a process share the same memory space, including variables and other program data. In addition, communication between threads is cheaper than between two distinct processes.

The advantages of threads also bring a set of unique challenges:

High Potential for Memory Corruption: Since threads share the same memory space, one misbehaving thread can corrupt data used by every other thread. An unhandled exception in one thread will likely crash the entire program.

Scaling Issues: Threads are also more difficult to scale over multiple machines due to the tight coupling between them.

Race Conditions and Complexity: Multi-threaded code is significantly harder to develop, test, and debug than plain single-threaded code because of the synchronisation issues that come with this approach, chief among them race conditions and deadlocks.

In the multiprocessing approach, we once again divide our program into different units, but this time, we isolate them as different processes on the computer. This solves some of our major issues with threads:

Memory Corruption: Processes are isolated programs on the computer; a crash in one process has almost zero effect on the other processes.

Ease of Scaling: Due to their high independence, processes force the programmer to develop effective inter-process communication (IPC) channels. These channels usually scale easily across multiple machines.

Unfortunately, using multiple processes causes us to lose many of the previously discussed benefits of threads; computer processes are expensive to create and communicate across due to their high independence, and as previously mentioned, processes require more structured communication channels than threads. Like many things in software development, each choice brings its set of tradeoffs and advantages.

Due to the previously mentioned challenges of threads, we will use processes for the rest of this tutorial. This is a bit of a personal choice, though; I think threads would work just as well for this demonstration, but I would rather avoid their pitfalls. In a real-world situation, however, your choice will be determined by the nature of the problem and other factors. As software engineers, we must be able to evaluate different options and choose the appropriate solution for the problem.

Child Processes in NodeJS

The child_process module in NodeJS enables us to create and manage additional processes, known as child processes, from within the main Node.js application. This module is handy for running tasks concurrently or interacting with external executables or scripts. Child processes allow the application to effectively utilise multicore systems, enhance performance, and execute tasks that might otherwise block the main event loop.

Offloading our Tasks to Child Processes
We’re going to have a main process for receiving and dispatching requests, and we’ll create a subprocess to handle the operations of each request. This way, the main process is freed up to continue to receive and distribute requests from clients.

Create a new file called “child.js”, and copy the Fibonacci function into the file. Going back to our “index.js”, we need to make this change to the top of the file:

const express = require('express');
const { fork } = require('child_process');

We import the fork function from the child_process module. fork allows us to create a new NodeJS instance from a given JavaScript file, with its own V8 engine, Event Loop, and everything else that comes with a NodeJS instance. In addition, the new child process gets an IPC channel (remember this term?) between the parent (our main file, index.js) and the newly-formed child process. This allows easy communication between the parent process and the child.

Here’s the rest of the code in our main file(index.js):

const app = express();
const port = 3000;

app.get('/blocking', (req, res) => {
  const number = parseInt(req.query.number);
  const child = fork('./child.js');
  child.send(number);
  child.on('message', (data) => {
    res.status(200).json(data);
  });
});

app.get('/non-blocking', (req, res) => {
  res.status(200).send("Hello!");
});

app.listen(port, 'localhost', () => {
  console.log('Now listening on port', port);
});

There’s some stuff to unpack here, so let’s begin:

— We still get the integer entered by the user from the query parameters.

— const child = fork('./child.js') creates a new child process from the child file. This call returns a <ChildProcess> object.

— We send the number entered by the user to the child process using the <ChildProcess>.send() function.

— We attach an event listener to the child process object for the “message” event. This event is emitted whenever the child sends a message to the parent. In the event listener’s body, we generate a response from the calculated Fibonacci value.
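
This isn’t required for the demo, but the <ChildProcess> object also emits 'error' and 'exit' events, and listening for them keeps a crashed child from leaving the request hanging. A defensive sketch (placed inside the /blocking handler) might look like this:

child.on('error', (err) => {
  // Emitted if the child could not be spawned or messaged
  if (!res.headersSent) res.status(500).json({ error: err.message });
});

child.on('exit', (code) => {
  // A non-zero exit code means the child died before replying
  if (code !== 0 && !res.headersSent) {
    res.status(500).json({ error: 'child process failed' });
  }
});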

The code for the “child.js” file:


const fibonacci = (n) => {
  if (n === 0) {
    return 0;
  }
  if (n === 1) {
    return 1;
  }
  return fibonacci(n - 1) + fibonacci(n - 2);
};

process.on('message', (data) => {
  const initialTime = Date.now();
  const fibonacciValue = fibonacci(data);
  const elapsed = Date.now() - initialTime;
  process.send({
    number: data,
    result: fibonacciValue,
    time: elapsed + 'ms'
  });
  process.exit();
});

— We’ve moved the entire timestamping and Fibonacci-calculation blocks to the “child” file.

— We attach an event listener to the “process” object of the child; this corresponds to the actual child process, and we’re listening for the “message” event — emitted whenever the parent sends a message.

— The Fibonacci calculation happens in this process, not the parent process. This means the parent process will never be blocked.

— After the calculation, we send the value and the time taken for the calculation back to the parent process. Recall that this will emit a “message” event which will be picked up by the parent.

— At the end of everything, we exit the child process using the process.exit() function. This prevents an excessive number of idle processes from clogging up memory.

And that’s it! All that’s left is to test the new version of the code.

Our long-running Fibonacci operations don’t disrupt the other ones. Yay!

Final Thoughts

Back to our fictional weather app: we’d most likely use a multiprocessing (or multithreading) approach to avoid those dreaded blocks. Hopefully, this article is helpful for anyone intending to go down this route. Just remember to close your child processes to avoid memory hogs.