#tech
#node
#konva
#ffmpeg
#video
8 min read

Rendering Videos in Node.JS with Konva and FFmpeg

Andriy Obrizan

Did you know how easy it is to render videos in Node.JS?

Every video consists of multiple frames shown in sequence. That's also the way to create one: generate each frame separately, then stitch them into a single video. There are plenty of open-source tools to do that programmatically.

Konva is one of the best canvas abstraction libraries in JavaScript. Instead of issuing individual draw calls, it lets developers work with higher-level abstractions like layers and shapes, much like in professional image editing software.

Konva also works in Node.JS on top of node-canvas via the konva-node package. Node-canvas is a widespread Canvas API implementation for Node, built on top of Cairo, a native 2D graphics library.

Stitching frames is also pretty straightforward. There's a JavaScript FFmpeg port, but since the app will be running on the server, we prefer using a native FFmpeg binary. It has been around for quite a while and has become a robust go-to solution for dealing with video.

Let’s render our first video!

The complete source code for this article is available on GitHub. Feel free to use it as a kick-starter for your projects.

Video Rendering Framework

Konva is a bit tricky for JavaScript developers. It holds the whole node tree, and you have to do the memory management yourself. That's especially important in the Node.JS environment, where it runs on top of the native Cairo library. While it doesn't matter that much for our sample application, which generates one video and exits, you're guaranteed to run out of memory in a proper backend service that runs constantly.

Doing it correctly from the start will save lots of debugging and refactoring time in the long run.

Konva’s root Stage object is pretty heavy, as it’s the one creating the Canvas container in Node.JS. We can significantly improve performance by reusing it between frames. Fortunately, it’s possible and not that hard.

Reusing the Konva layers and other nodes between frames also makes sense. Some of them, like images, can be cached to improve the rendering performance even further. For smooth animations, the app should generate at least 25 frames per second. It’s a long process, and it’s worth leveraging every small trick to make it faster.

With all that in mind, our algorithm looks pretty straightforward:

  • Create the Stage
  • Add all layers, shapes, and other nodes
  • Render every frame to a file, animating the objects on the Stage
  • Destroy the Stage together with everything on it
  • Combine the image files to a single video

In JavaScript, it will look something like this:

async function renderVideo() {
  const stage = new Konva.Stage({ width, height });
  try {
    // add layers and shapes
    for (let frame = 0; frame < totalFrames; ++frame) {
      // animate objects
      // draw layers
      // save the stage image
    }
  } finally {
    stage.destroy();
  }

  // create video from frames
}

Konva's Layer has a .batchDraw() method that can combine multiple draw calls into one. It's good practice on the frontend when rendering to a visible canvas. Here, however, we need every frame rendered deterministically, so we must use the regular draw() instead.

The app can save frames to a temporary location and delete them after generating the whole video. We skipped that step in our sample application to make the process easier to inspect.

Saving Frames to Image Files

Konva has two methods to generate an image from the Stage: toDataURL and toImage. The first one is a better fit for Node.JS. It returns the .png image content in data URL format, so we have to strip the header to get the raw image data:

const fs = require("fs");
const path = require("path");

async function saveFrame({ stage, outputDir, frame }) {
  const data = stage.toDataURL();

  // remove the data URL header
  const base64Data = data.substring("data:image/png;base64,".length);

  // frameLength is the number of digits used for zero-padding (defined below)
  const fileName = path.join(
    outputDir,
    `frame-${String(frame + 1).padStart(frameLength, "0")}.png`
  );

  await fs.promises.writeFile(fileName, base64Data, "base64");
}

The function accepts the current frame number, which starts at zero for convenience, and saves the image to a file in outputDir. File names use leading zeros, which makes stitching with FFmpeg and browsing the files easier.

Handling animations

Our rendering code will use separate functions to render groups of elements, similar to components in frontend frameworks like React. Every function creates the shapes, adds them to a layer, and returns an animation function. The animation function receives the frame number and tweaks the shapes accordingly.

To keep things DRY, we've added helper functions that simplify programming animations:

function makeAnimation(callback, { startFrame, duration }) {
  return (frame) => {
    const thisFrame = frame - startFrame;
    // runs for frames in (startFrame, startFrame + duration], ending at phase 1
    if (thisFrame > 0 && thisFrame <= duration) {
      callback(thisFrame / duration);
    }
  };
}

function combineAnimations(...animations) {
  return (frame) => {
    for (const animation of animations) {
      if (animation) {
        animation(frame);
      }
    }
  };
}

makeAnimation accepts a callback that receives the animation phase in the 0…1 range and animates the objects, plus the startFrame and duration of the animation in frames.

combineAnimations combines multiple animation functions into a single one.

Example usage:

return combineAnimations(
  makeAnimation((d) => hello.x((d - 1) * videoWidth), {
    startFrame: 0,
    duration: 2 * videoFps,
  }),
  makeAnimation((d) => from.x((1 - d) * videoWidth), {
    startFrame: 1 * videoFps,
    duration: 2 * videoFps,
  }),
  makeAnimation((d) => konva.opacity(d), {
    startFrame: 2.5 * videoFps,
    duration: 1 * videoFps,
  })
);

Notice that we use a videoFps constant to convert seconds to frame numbers.
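To see how these helpers compose outside Konva, here is a runnable sketch where a plain object stands in for a Konva node (the node and the frame numbers are illustrative, not from the sample project):

```javascript
// The two helpers from above, repeated so the sketch is self-contained.
function makeAnimation(callback, { startFrame, duration }) {
  return (frame) => {
    const thisFrame = frame - startFrame;
    if (thisFrame > 0 && thisFrame <= duration) {
      callback(thisFrame / duration);
    }
  };
}

function combineAnimations(...animations) {
  return (frame) => {
    for (const animation of animations) {
      if (animation) {
        animation(frame);
      }
    }
  };
}

// A plain object stands in for a Konva node here.
const node = { x: 0, opacity: 0 };

const animate = combineAnimations(
  makeAnimation((phase) => { node.x = phase * 100; }, { startFrame: 0, duration: 10 }),
  makeAnimation((phase) => { node.opacity = phase; }, { startFrame: 10, duration: 10 })
);

animate(5); // halfway through the first animation: node.x becomes 50
```

Calling animate once per frame inside the render loop drives all registered animations from a single place.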

For advanced animations with tweening, it's easy to plug in a third-party animation library of your choice.
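Even without a library, a simple easing curve can be applied by transforming the phase before it reaches your callback. A sketch, where easeInOutQuad is the standard quadratic ease and withEasing is our own helper, not part of Konva:

```javascript
// Standard quadratic ease-in-out: slow start, fast middle, slow end.
function easeInOutQuad(t) {
  return t < 0.5 ? 2 * t * t : 1 - ((-2 * t + 2) ** 2) / 2;
}

// Wrap a phase-based callback so it receives the eased phase instead
// of the linear one produced by makeAnimation.
function withEasing(callback, easing = easeInOutQuad) {
  return (phase) => callback(easing(phase));
}
```

It would slot into the earlier example as makeAnimation(withEasing((d) => hello.x((d - 1) * videoWidth)), { startFrame: 0, duration: 2 * videoFps }).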

Loading images

Loading images is an asynchronous process. Unfortunately, Konva doesn't provide a modern Promise-based API for it at the time of writing, so we have to add a small conversion:

function loadKonvaImage(url) {
  return new Promise((res) => {
    Konva.Image.fromURL(url, res);
  });
}

Working with images is simple, as long as you add them to the Stage: they will be destroyed together with the Stage inside our "framework" code after all frames are generated. It's also possible to cache images and .clone() them for actual use, but that would require custom memory management in the caching code.

Konva supports various image manipulation methods, from simple resizing and opacity to fairly advanced filters. Filters are slow, though, so it's better to .cache() the intermediate image for performance.

SVGs are also supported, but they seem to be loaded as raster images by node-canvas. In a real project, we had to resize them to their final dimensions manually; otherwise they looked pixelated when scaled by Konva.

Combining Frame Images to a Video

It’s only a single FFmpeg command. We’ll use execa to make things even more straightforward:

const execa = require("execa");

const frameLength = 6;

async function createVideo({ fps, outputDir, output }) {
  await execa(
    "ffmpeg",
    [
      "-y",
      "-framerate",
      String(fps),
      "-i",
      `frame-%0${frameLength}d.png`,
      "-c:v",
      "libx264",
      output,
    ],
    { cwd: outputDir }
  );
}

It will combine the files named frame-XXXXXX.png in the current directory (which is changed to outputDir for the command) into a single video output with fps frames per second, encoded with libx264, overwriting the output file if it exists. Setting the pixel format and video quality is also good practice; we skipped it here for simplicity.
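As a sketch of what those extra settings could look like, the argument list might be extended with FFmpeg's -pix_fmt and -crf options; yuv420p gives the widest player compatibility, and the constant rate factor trades quality against file size (buildFfmpegArgs is a hypothetical helper, not part of the sample project):

```javascript
// Hypothetical builder for the ffmpeg argument list, extended with
// pixel format and quality settings.
function buildFfmpegArgs({ fps, frameLength, output }) {
  return [
    "-y",                                // overwrite the output if it exists
    "-framerate", String(fps),           // input frame rate
    "-i", `frame-%0${frameLength}d.png`, // zero-padded frame file pattern
    "-c:v", "libx264",                   // H.264 encoder
    "-pix_fmt", "yuv420p",               // widest player compatibility
    "-crf", "23",                        // quality: lower is better and larger
    output,
  ];
}
```

The resulting array would be passed to execa("ffmpeg", args, { cwd: outputDir }) exactly as in createVideo above.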

Honorable Mention of Remotion

The Remotion framework is another way to render videos in Node.JS, using familiar React components. Under the hood it relies on Puppeteer, which controls a real headless Chrome or Chromium browser, to render each frame. This approach is slightly slower than Konva but allows you to leverage pretty much any third-party React library in the videos. The variety of open-source React components is enormous.

The developer experience is improved by the built-in debugger, which lets you step through videos frame by frame in a browser, just like a regular React application.

The framework is still in active development at the time of writing, and there are a few quirks that need improvement. However, it’s already possible to generate pretty complex videos using it.

Unfortunately, it requires a paid license for commercial use.

Conclusion


Rendering videos in Node.JS is not that hard using a few excellent open-source libraries and tools! It's a bit slow (we were getting around six frames per second for a simple video), but that's normal for such a CPU-intensive task. The code can, of course, be optimized further, for example by putting different elements on separate layers and redrawing only those that changed. Konva's usual optimization techniques work here as well.

Programmatic videos open new horizons for your projects. They're a great way to summarize information visually in a concise form. Without fancy graphics, the videos are small in file size and can easily be attached to an automatic email, posted on social media, or even uploaded to YouTube.

You can find the complete source code on GitHub.

Feel free to contact us using the form below if you need some help adding videos to your application.