
OpenCV tutorial: Computer vision with Node.js


In this OpenCV tutorial, I will show you how to work with computer vision in Node.js. I will explain the basic principles of working with images using the open-source OpenCV library, with real-life use cases.
Currently, I am working on my Master's thesis, in which I use React Native, neural networks, and the OpenCV computer vision library. Allow me to show you a few things that I have learned while working with OpenCV.
Computer vision is a field of computer science that focuses on extracting data from images or videos using various algorithms.
Computer vision is widely used, for example, for motion tracking in security cameras, control of autonomous vehicles, and identifying or searching for objects in a picture or video.
Implementing computer vision algorithms is a nontrivial task, but there is a very good open-source library called OpenCV, which has been under active development since 1999.
This library officially supports C, C++, Python, and Java. Fortunately, JavaScript programmers led by Peter Braden started working on an interface library between JavaScript and OpenCV called node-opencv.
With the OpenCV library, we can create Node.js applications with image analysis. This library has not yet implemented all of OpenCV's features - especially those of OpenCV 3 - but it is usable today.

Installation

Before using the OpenCV library in Node.js, you need to install it globally. On macOS, you can install it through Homebrew. In this article, I am using OpenCV version 2.4.
$ brew tap homebrew/science
$ brew install opencv
If you are using another platform, there are tutorials for Linux and Windows. After a successful installation, we can add node-opencv to our Node.js project.
$ npm install --save opencv
Sometimes the installation can fail (this is open source, and it isn't in its final phase), but you can find a solution to your problem on the project's GitHub.

OpenCV basics

Loading and saving images + Matrix

The most basic OpenCV feature enables us to load and save images. You can do this using the following methods: cv#readImage() and Matrix#save().
const cv = require('opencv');

cv.readImage('./img/myImage.jpg', function (err, img) {
  if (err) {
    throw err;
  }

  const width = img.width();
  const height = img.height();

  if (width < 1 || height < 1) {
    throw new Error('Image has no size');
  }

  // do some cool stuff with img

  // save img
  img.save('./img/myNewImage.jpg');
});
OpenCV Loaded Image
A loaded image is an object that represents the basic data structure used throughout OpenCV - the Matrix. Each loaded or created image is represented by a matrix in which each field is one pixel of the image. The size of the matrix is determined by the size of the loaded image. You can create a new matrix in Node.js by calling the new Matrix() constructor with the appropriate parameters.
new cv.Matrix(rows, cols);
new cv.Matrix(rows, cols, type, fillValue);

Image modifying

One of the basic methods that we can use is converting color. For example, we can get a grayscale image by simply calling the Matrix#convertGrayscale() method.
 img.convertGrayscale();
 img.save('./img/myGrayscaleImg.jpg');
OpenCV Grayscaled Image
This method is often used before using an edge detector.
We can convert images to the HSV cylindrical-coordinate representation just by calling Matrix#convertHSVscale().
 img.convertHSVscale();
 img.save('./img/myHSVImg.jpg');
OpenCV HSV image
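To build some intuition for what convertHSVscale computes per pixel, here is a small plain-JavaScript sketch of the RGB-to-HSV conversion using OpenCV's 8-bit convention (H scaled into [0, 180), S and V into [0, 255]). The function name rgbToHsv is my own illustration and is not part of node-opencv.

```javascript
// Convert one 8-bit RGB pixel to HSV using OpenCV's 8-bit ranges:
// hue in degrees is halved to fit into [0, 180), S and V are scaled to [0, 255].
function rgbToHsv(r, g, b) {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;

  const v = max;
  const s = max === 0 ? 0 : Math.round((delta / max) * 255);

  let h = 0; // hue in degrees, [0, 360)
  if (delta !== 0) {
    if (max === r) h = 60 * (((g - b) / delta) % 6);
    else if (max === g) h = 60 * ((b - r) / delta + 2);
    else h = 60 * ((r - g) / delta + 4);
  }
  if (h < 0) h += 360;

  return { h: Math.round(h / 2), s, v };
}

console.log(rgbToHsv(255, 0, 0)); // pure red
console.log(rgbToHsv(0, 255, 0)); // pure green
```

This is why a red pixel in an HSV-converted 8-bit image has hue 0 rather than 0 degrees on a 360-degree wheel: the hue is stored halved so it fits into a single byte.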
We can crop an image by calling the Matrix#crop(x, y, width, height) method with the specified arguments.
This method doesn't modify the current image; it returns a new one.
  let croppedImg = img.crop(1000, 1000, 1000, 1000);
  croppedImg.save('./img/croppedImg.jpg');
Cropped image
If we need to copy an image from one variable to another, we can use the Matrix#copy() method, which returns a new image object.
  let newImg = img.copy();
In this way, we can work with basic Matrix functions. We can also find various blur filters and features for drawing and editing images. You can find all the methods implemented on the Matrix object in the Matrix.cc file on the project's GitHub.

Dilation and Erosion

Dilation and erosion are fundamental methods of mathematical morphology. I will explain how they work using the following image modifications.
Loaded logo
The dilation of the binary image A by the structuring element B is defined by
A ⊕ B = ⋃_{b ∈ B} A_b
OpenCV has a Matrix#dilate(iterations, structEl) method, where iterations is the number of dilation iterations to perform, and structEl is the structuring element used for dilation (the default is 3x3).
We can call the dilate method with a single parameter:
img.dilate(3);
Internally, OpenCV then calls the native dilate function like this:
cv::dilate(self->mat, self->mat, structEl, cv::Point(-1, -1), 3);
After this call, we get a modified image like this:
Dilated logo
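To make the definition above concrete, here is a tiny plain-JavaScript sketch of binary dilation with a 3x3 square structuring element on a 0/1 matrix. It is independent of node-opencv, and dilate3x3 is a hypothetical helper name used only for illustration.

```javascript
// Dilate a binary (0/1) matrix with a 3x3 square structuring element:
// a pixel becomes 1 if any pixel in its 3x3 neighborhood is 1.
function dilate3x3(mat) {
  const rows = mat.length;
  const cols = mat[0].length;
  const out = mat.map(row => row.slice());

  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      let hit = 0;
      for (let dy = -1; dy <= 1 && !hit; dy++) {
        for (let dx = -1; dx <= 1 && !hit; dx++) {
          const ny = y + dy;
          const nx = x + dx;
          if (ny >= 0 && ny < rows && nx >= 0 && nx < cols && mat[ny][nx] === 1) {
            hit = 1;
          }
        }
      }
      out[y][x] = hit;
    }
  }
  return out;
}

const img01 = [
  [0, 0, 0],
  [0, 1, 0],
  [0, 0, 0],
];
console.log(dilate3x3(img01)); // the single foreground pixel grows to fill the whole 3x3 grid
```

Erosion is the dual operation: a pixel stays 1 only if every pixel in its neighborhood is 1, which is why erosion shrinks foreground regions while dilation grows them.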
The erosion of the binary image A by the structuring element B is defined by
A ⊖ B = { z | B_z ⊆ A }
In OpenCV, we can call the Matrix#erode(iterations, structEl) method, which works similarly to the dilation method.
We can use it like this:
img.erode(3);
and we get an eroded image.
OpenCV Eroded Logo Image

Edge detection

For edge detection, we can use the Canny edge detector, an algorithm developed in 1986 that became very popular and is often called the "optimal detector". This algorithm meets the following three criteria, which are important in edge detection:
  1. Detection of edges with a low error rate
  2. Good localization of edges - the distance between detected edge pixels and real edge pixels has to be minimal
  3. Each edge in the image is marked only once
Before using the Canny edge detector, we can convert the image to grayscale, which can sometimes produce better results. Then we can eliminate unnecessary noise from the image using a Gaussian blur filter, which receives the Gaussian kernel size as an array parameter. After applying these two methods, we get better and more accurate results from the Canny edge detector.
img.convertGrayscale();
img.gaussianBlur([3, 3]);
Gaussian blur
The image is now ready for the Canny edge algorithm. This algorithm receives two parameters: lowThreshold and highThreshold.
The two thresholds divide pixels into three groups:
  • If the value of a pixel's gradient is higher than highThreshold, the pixel is marked as a strong edge pixel.
  • If the gradient value is between the high and low thresholds, the pixel is marked as a weak edge pixel.
  • If the value is below the low threshold, the pixel is completely suppressed.
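The three groups above can be sketched as a tiny classifier over gradient magnitudes. This is plain JavaScript for illustration only; the real Canny hysteresis step additionally promotes weak pixels that are connected to strong ones.

```javascript
// Classify a gradient magnitude against the two Canny thresholds.
function classifyPixel(gradient, lowThreshold, highThreshold) {
  if (gradient > highThreshold) return 'strong';
  if (gradient > lowThreshold) return 'weak';
  return 'suppressed';
}

console.log([20, 90, 200].map(g => classifyPixel(g, 50, 150)));
```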
There is no global threshold setting that works for all images; you need to tune the thresholds for each image separately. There are some techniques for estimating the right thresholds, but I will not go into them in depth in this article.
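One commonly used heuristic can still be sketched briefly: derive both thresholds from the median pixel intensity of the grayscale image. This rule of thumb is not part of node-opencv, and autoCannyThresholds is a hypothetical helper name.

```javascript
// Estimate Canny thresholds from the median intensity of an 8-bit image.
// Rule of thumb: low = (1 - sigma) * median, high = (1 + sigma) * median.
function autoCannyThresholds(pixels, sigma = 0.33) {
  const sorted = pixels.slice().sort((a, b) => a - b);
  const median = sorted[Math.floor(sorted.length / 2)];
  return {
    low: Math.max(0, Math.round((1 - sigma) * median)),
    high: Math.min(255, Math.round((1 + sigma) * median)),
  };
}

// For a mid-gray image, the thresholds bracket the median intensity.
console.log(autoCannyThresholds([90, 100, 100, 100, 110]));
```

In a real pipeline you would compute the median from the grayscale matrix's pixel values and pass the result to img.canny(low, high).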
After calling the Canny Edge method, we also call a dilate method.
  const lowThresh = 0;
  const highThresh = 150;
  const iterations = 2;

  img.canny(lowThresh, highThresh);
  img.dilate(iterations);
After these steps, we have an analyzed image. From it, we can now select all the contours by calling the Matrix#findContours() method and draw them into a new image.
  const WHITE = [255, 255, 255];
  let contours = img.findContours();
  let allContoursImg = img.drawAllContours(contours, WHITE);
  allContoursImg.save('./img/allContoursImg.jpg');
Canny edge image with dilate
Image with dilate.
Canny edge image without dilate
Image without dilate.
In this picture, we can see all the contours found by the Canny Edge Detector.
If we want to select only the biggest of them, we can do so with the following code, which goes through each contour and keeps the biggest one. We can then draw it with the Matrix#drawContour() method.
  const GREEN = [0, 255, 0];
  const thickness = 2;
  const lineType = 8;
  let contours = img.findContours();
  let largestContourImg = new cv.Matrix(img.height(), img.width());
  let largestArea = 0;
  let largestAreaIndex = 0;

  for (let i = 0; i < contours.size(); i++) {
    if (contours.area(i) > largestArea) {
      largestArea = contours.area(i);
      largestAreaIndex = i;
    }
  }

  largestContourImg.drawContour(contours, largestAreaIndex, GREEN, thickness, lineType);
Canny edge image with only one contour
If we want to draw more contours - for example, all contours larger than a certain value - we simply move the Matrix#drawContour() call into the for loop and modify the if condition.
  const GREEN = [0, 255, 0];
  const thickness = 2;
  const lineType = 8;
  const minArea = 500;
  let contours = img.findContours();
  let largestContourImg = new cv.Matrix(img.height(), img.width());

  for (let i = 0; i < contours.size(); i++) {
    if (contours.area(i) > minArea) {
      largestContourImg.drawContour(contours, i, GREEN, thickness, lineType);
    }
  }
Canny edge image with only more contour

Polygon Approximations

Polygon approximation can be used for several useful things. The most trivial is approximation by a bounding rectangle around our object using the Contours#boundingRect(index) method. We call this method on the Contours object, which we get by calling the Matrix#findContours() method on an image after Canny edge detection (as discussed in the previous example).
let bound = contours.boundingRect(largestAreaIndex);
largestContourImg.rectangle([bound.x, bound.y], [bound.width, bound.height], WHITE, 2);
Polygon approximation
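What boundingRect computes can be sketched in plain JavaScript as the axis-aligned box around a set of contour points. The boundingBox function below is a hypothetical helper for illustration, not the node-opencv API.

```javascript
// Axis-aligned bounding box of a list of {x, y} points.
function boundingBox(points) {
  const xs = points.map(p => p.x);
  const ys = points.map(p => p.y);
  const x = Math.min(...xs);
  const y = Math.min(...ys);
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y };
}

const box = boundingBox([{ x: 10, y: 20 }, { x: 110, y: 40 }, { x: 60, y: 220 }]);
console.log(box); // { x: 10, y: 20, width: 100, height: 200 }
```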
The second alternative is approximating polygons with a specified precision by calling the Contours#approxPolyDP() method. Using the Contours#cornerCount(index) method, you get the number of corners of our polygon. I have attached two images with various levels of precision below.
  let poly = new cv.Matrix(img.height(), img.width());
  let RED = [0, 0, 255];
  let arcLength = contours.arcLength(largestAreaIndex, true);
  contours.approxPolyDP(largestAreaIndex, arcLength * 0.05, true);
  poly.drawContour(contours, largestAreaIndex, RED);

  // number of corners
  console.log(contours.cornerCount(largestAreaIndex));
Approximation with specific precision 1
Approximation with specific precision 2
It is also interesting to use an approximation by the rotated rectangle of minimum area, using the Contours#minAreaRect() method.
I use this method in my project to determine the angle of a particular object, which is then rotated into the right position. In the next example, we draw the rotated rectangle into the largestContourImg variable and print the angle of our rotated polygon.
  let rect = contours.minAreaRect(largestAreaIndex);
  for (let i = 0; i < 4; i++) {
    largestContourImg.line([rect.points[i].x, rect.points[i].y], [rect.points[(i + 1) % 4].x, rect.points[(i + 1) % 4].y], RED, 3);
  }

  // angle of polygon
  console.log(rect.angle);
Approximation by the rotated rectangle
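A note on the angle: in OpenCV 2.x, minAreaRect reports the rectangle's angle in the range [-90, 0), so a small normalization step is commonly applied to get the smallest deskewing rotation. The deskewAngle helper below is my own sketch, not part of node-opencv.

```javascript
// Normalize a minAreaRect angle (OpenCV 2.x convention: [-90, 0))
// into the smallest rotation that deskews the rectangle.
function deskewAngle(rectAngle) {
  return rectAngle < -45 ? rectAngle + 90 : rectAngle;
}

console.log(deskewAngle(-75)); // 15
console.log(deskewAngle(-30)); // -30
```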

Image rotation without cropping

One of the things I needed to solve, and which OpenCV does not implement, is image rotation without cropping. We can easily rotate an image with the following code:
img.rotate(90);
But we get something like this:
Rotated image with rotate method
How can we rotate an image without cropping it? Before the rotation, we create a new square 8-bit, 3-channel matrix called bgImg whose side length equals the diagonal of the image we want to rotate.
After that, we calculate the position at which to place our image inside the new bgImg matrix. Finally, we call the Matrix#rotate(angle) method on bgImg with our angle.
  // IMG_ORIGINAL is an untouched copy of the loaded image,
  // e.g. taken with img.copy() before running canny/dilate
  let rect = contours.minAreaRect(largestAreaIndex);
  let diagonal = Math.round(Math.sqrt(Math.pow(IMG_ORIGINAL.size()[1], 2) + Math.pow(IMG_ORIGINAL.size()[0], 2)));
  let bgImg = new cv.Matrix(diagonal, diagonal, cv.Constants.CV_8UC3, [255, 255, 255]);
  let offsetX = (diagonal - IMG_ORIGINAL.size()[1]) / 2;
  let offsetY = (diagonal - IMG_ORIGINAL.size()[0]) / 2;

  IMG_ORIGINAL.copyTo(bgImg, offsetX, offsetY);
  bgImg.rotate(rect.angle + 90);

  bgImg.save('./img/rotatedImg.jpg');
Rotated image without crop
After that, we can run the Canny Edge Detector on our new rotated image.
  const GREEN = [0, 255, 0];
  let rotatedContour = new cv.Matrix(diagonal, diagonal);
  bgImg.canny(lowThresh, highThresh);
  bgImg.dilate(iterations);
  let contours = bgImg.findContours();
  let largestArea = 0;
  let largestAreaIndex = 0;

  for (let i = 0; i < contours.size(); i++) {
    if (contours.area(i) > largestArea) {
      largestArea = contours.area(i);
      largestAreaIndex = i;
    }
  }

  rotatedContour.drawContour(contours, largestAreaIndex, GREEN, thickness, lineType);
  rotatedContour.save('./img/rotatedImgContour.jpg');
Rotated image with contour
There are many other methods that we can use on a picture. For example, background removal can be very useful - but these are not covered in this article.

Object detection

In my application I work with plants, so I don't use a detector for faces, cars, or other objects.
Even so, I decided to mention face detection in this article because it shows the strength of OpenCV.
We call the Matrix#detectObject() method on our loaded image; it accepts the path to the cascade classifier we want to use. OpenCV comes with several pre-trained classifiers that can find people, faces, eyes, ears, cars, and other objects in pictures.
cv.readImage('./img/face.jpg', function(err, im){
  if (err) throw err;
  if (im.width() < 1 || im.height() < 1) throw new Error('Image has no size');

  im.detectObject('./data/haarcascade_frontalface_alt2.xml', {}, function(err, faces){
    if (err) throw err;

    for (var i = 0; i < faces.length; i++){
      var face = faces[i];
      im.ellipse(face.x + face.width / 2, face.y + face.height / 2, face.width / 2, face.height / 2, [255, 255, 0], 3);
    }

    im.save('./img/face-detection.jpg');
    console.log('Image saved.');
  });
});
OpenCV Face detection example

Conclusion

In this article, I covered some interesting features of the popular OpenCV library used from Node.js. It is a real shame that there is no official interface for Node.js; there is the node-opencv library, but it implements fewer features and has an inconsistent API.
If you want to work with this library, you need to study the .cc files in the node-opencv repository, because there is no complete documentation for it, at least not yet.
Reading the code is absolutely fine - I love doing it - but I'm not happy with some inconsistencies and differences in return values compared with the official OpenCV. I hope this library will keep developing, and I will try to contribute a few lines of my own code to it.
