
OpenCV tutorial: Computer vision with Node.js


In this OpenCV tutorial, I will show you how to work with computer vision in Node.js. I will explain the basic principles of working with images using the open-source OpenCV library, with real-life use cases.
Currently, I am working on my Master's thesis, in which I use React Native, neural networks, and the OpenCV computer vision library. Allow me to show you a few things that I have learned while working with OpenCV.
Computer vision is a field of computer science that focuses on extracting data from images or videos using various algorithms.
Computer vision is widely used, for example, for motion tracking in security cameras, control of autonomous vehicles, and identifying or searching for objects in a picture or video.
Implementing computer vision algorithms is a nontrivial task, but there is a very good open-source library called OpenCV, which has been under active development since 1999.
This library officially supports C, C++, Python, and Java. Fortunately, JavaScript programmers led by Peter Braden started working on an interface library between JavaScript and OpenCV called node-opencv.
With the OpenCV library, we can create Node.js applications with image analysis. This library has not yet implemented all of OpenCV's features - especially those of OpenCV 3 - but it is usable today.

Installation

Before using the OpenCV library in Node.js, you need to install it globally. On macOS, you can install it through Homebrew. In this article, I am using OpenCV version 2.4.
$ brew tap homebrew/science
$ brew install opencv
If you are using another platform, there are tutorials for Linux and Windows. After a successful installation, we can add node-opencv to our Node.js project.
$ npm install --save opencv
Sometimes the installation can fail (this is open source, and it isn't in its final phase), but you can find a solution to your problem on the project's GitHub.

OpenCV basics

Loading and saving images + Matrix

The most basic OpenCV feature enables us to load and save images. You can do this using the following methods: cv#readImage() and Matrix#save().
const cv = require('opencv');

cv.readImage('./img/myImage.jpg', function (err, img) {
  if (err) {
    throw err;
  }

  const width = img.width();
  const height = img.height();

  if (width < 1 || height < 1) {
    throw new Error('Image has no size');
  }

  // do some cool stuff with img

  // save img
  img.save('./img/myNewImage.jpg');
});
OpenCV Loaded Image
A loaded image is an object that represents the basic data structure used throughout OpenCV - the Matrix. Each loaded or created image is represented by a matrix in which each field is one pixel of the image. The size of the matrix is determined by the size of the loaded image. You can create a new matrix in Node.js by calling the new Matrix() constructor with the appropriate parameters.
new cv.Matrix(rows, cols);
new cv.Matrix(rows, cols, type, fillValue);

Image modifying

One of the basic methods that we can use is converting color. For example, we can get a grayscale image by simply calling the Matrix#convertGrayscale() method.
 img.convertGrayscale();
 img.save('./img/myGrayscaleImg.jpg');
OpenCV Grayscaled Image
This method is often used before using an edge detector.
We can convert images to the HSV cylindrical-coordinate representation just by calling Matrix#convertHSVscale().
 img.convertHSVscale();
 img.save('./img/myHSVImg.jpg');
OpenCV HSV image
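To build some intuition for what convertHSVscale computes per pixel, here is a small plain-JavaScript sketch of the RGB-to-HSV conversion using OpenCV's 8-bit convention (H scaled into [0, 180), S and V into [0, 255]). The function name rgbToHsv is my own illustration and is not part of node-opencv.

```javascript
// Convert one 8-bit RGB pixel to HSV using OpenCV's 8-bit ranges:
// hue in degrees is halved to fit into [0, 180), S and V are scaled to [0, 255].
function rgbToHsv(r, g, b) {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;

  const v = max;
  const s = max === 0 ? 0 : Math.round((delta / max) * 255);

  let h = 0; // hue in degrees, [0, 360)
  if (delta !== 0) {
    if (max === r) h = 60 * (((g - b) / delta) % 6);
    else if (max === g) h = 60 * ((b - r) / delta + 2);
    else h = 60 * ((r - g) / delta + 4);
  }
  if (h < 0) h += 360;

  return { h: Math.round(h / 2), s, v };
}

console.log(rgbToHsv(255, 0, 0)); // pure red
console.log(rgbToHsv(0, 255, 0)); // pure green
```

This is why a red pixel in an HSV-converted 8-bit image has hue 0 rather than 0 degrees on a 360-degree wheel: the hue is stored halved so it fits into a single byte.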
We can crop an image by calling the Matrix#crop(x, y, width, height) method with the specified arguments.
This method doesn't modify the current image; it returns a new one.
  let croppedImg = img.crop(1000, 1000, 1000, 1000);
  croppedImg.save('./img/croppedImg.jpg');
Cropped image
If we need to copy an image from one variable to another, we can use the Matrix#copy() method, which returns a new image object.
  let newImg = img.copy();
In this way, we can work with basic Matrix functions. We can also find various blur filters and features for drawing and editing images. You can find all the methods implemented on the Matrix object in the Matrix.cc file on the project's GitHub.

Dilation and Erosion

Dilation and erosion are fundamental methods of mathematical morphology. I will explain how they work using the following image modifications.
Loaded logo
The dilation of the binary image A by the structuring element B is defined by
A ⊕ B = ⋃_{b ∈ B} A_b
OpenCV has a Matrix#dilate(iterations, structEl) method, where iterations is the number of dilation iterations to perform, and structEl is the structuring element used for dilation (the default is 3x3).
We can call the dilate method with a single parameter:
img.dilate(3);
Internally, OpenCV then calls the native dilate function like this:
cv::dilate(self->mat, self->mat, structEl, cv::Point(-1, -1), 3);
After this call, we get a modified image like this:
Dilated logo
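To make the definition above concrete, here is a tiny plain-JavaScript sketch of binary dilation with a 3x3 square structuring element on a 0/1 matrix. It is independent of node-opencv, and dilate3x3 is a hypothetical helper name used only for illustration.

```javascript
// Dilate a binary (0/1) matrix with a 3x3 square structuring element:
// a pixel becomes 1 if any pixel in its 3x3 neighborhood is 1.
function dilate3x3(mat) {
  const rows = mat.length;
  const cols = mat[0].length;
  const out = mat.map(row => row.slice());

  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      let hit = 0;
      for (let dy = -1; dy <= 1 && !hit; dy++) {
        for (let dx = -1; dx <= 1 && !hit; dx++) {
          const ny = y + dy;
          const nx = x + dx;
          if (ny >= 0 && ny < rows && nx >= 0 && nx < cols && mat[ny][nx] === 1) {
            hit = 1;
          }
        }
      }
      out[y][x] = hit;
    }
  }
  return out;
}

const img01 = [
  [0, 0, 0],
  [0, 1, 0],
  [0, 0, 0],
];
console.log(dilate3x3(img01)); // the single foreground pixel grows to fill the whole 3x3 grid
```

Erosion is the dual operation: a pixel stays 1 only if every pixel in its neighborhood is 1, which is why erosion shrinks foreground regions while dilation grows them.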
The erosion of the binary image A by the structuring element B is defined by
A ⊖ B = { z | B_z ⊆ A }
In OpenCV, we can call the Matrix#erode(iterations, structEl) method, which works similarly to the dilation method.
We can use it like this:
img.erode(3);
and we get an eroded image.
OpenCV Eroded Logo Image

Edge detection

For edge detection, we can use the Canny edge detector, an algorithm developed in 1986 that became very popular and is often called the "optimal detector". This algorithm meets the following three criteria, which are important in edge detection:
  1. Detection of edges with a low error rate
  2. Good localization of edges - the distance between detected edge pixels and real edge pixels has to be minimal
  3. Each edge in the image is marked only once
Before using the Canny edge detector, we can convert the image to grayscale, which can sometimes produce better results. Then we can eliminate unnecessary noise from the image using a Gaussian blur filter, which receives the Gaussian kernel size as an array parameter. After applying these two methods, we get better and more accurate results from the Canny edge detector.
img.convertGrayscale();
img.gaussianBlur([3, 3]);
Gaussian blur
The image is now ready for the Canny edge algorithm. This algorithm receives two parameters: lowThreshold and highThreshold.
The two thresholds divide pixels into three groups:
  • If the value of a pixel's gradient is higher than highThreshold, the pixel is marked as a strong edge pixel.
  • If the gradient value is between the high and low thresholds, the pixel is marked as a weak edge pixel.
  • If the value is below the low threshold, the pixel is completely suppressed.
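The three groups above can be sketched as a tiny classifier over gradient magnitudes. This is plain JavaScript for illustration only; the real Canny hysteresis step additionally promotes weak pixels that are connected to strong ones.

```javascript
// Classify a gradient magnitude against the two Canny thresholds.
function classifyPixel(gradient, lowThreshold, highThreshold) {
  if (gradient > highThreshold) return 'strong';
  if (gradient > lowThreshold) return 'weak';
  return 'suppressed';
}

console.log([20, 90, 200].map(g => classifyPixel(g, 50, 150)));
```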
There is no global threshold setting that works for all images; you need to tune the thresholds for each image separately. There are some techniques for estimating the right thresholds, but I will not go into them in depth in this article.
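One commonly used heuristic can still be sketched briefly: derive both thresholds from the median pixel intensity of the grayscale image. This rule of thumb is not part of node-opencv, and autoCannyThresholds is a hypothetical helper name.

```javascript
// Estimate Canny thresholds from the median intensity of an 8-bit image.
// Rule of thumb: low = (1 - sigma) * median, high = (1 + sigma) * median.
function autoCannyThresholds(pixels, sigma = 0.33) {
  const sorted = pixels.slice().sort((a, b) => a - b);
  const median = sorted[Math.floor(sorted.length / 2)];
  return {
    low: Math.max(0, Math.round((1 - sigma) * median)),
    high: Math.min(255, Math.round((1 + sigma) * median)),
  };
}

// For a mid-gray image, the thresholds bracket the median intensity.
console.log(autoCannyThresholds([90, 100, 100, 100, 110]));
```

In a real pipeline you would compute the median from the grayscale matrix's pixel values and pass the result to img.canny(low, high).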
After calling the Canny Edge method, we also call a dilate method.
  const lowThresh = 0;
  const highThresh = 150;
  const iterations = 2;

  img.canny(lowThresh, highThresh);
  img.dilate(iterations);
After these steps, we have an analyzed image. From it, we can now select all the contours by calling the Matrix#findContours() method and draw them into a new image.
  const WHITE = [255, 255, 255];
  let contours = img.findContours();
  let allContoursImg = img.drawAllContours(contours, WHITE);
  allContoursImg.save('./img/allContoursImg.jpg');
Canny edge image with dilate
Image with dilate.
Canny edge image without dilate
Image without dilate.
In this picture, we can see all the contours found by the Canny Edge Detector.
If we want to select only the biggest of them, we can do so with the following code, which goes through each contour and keeps the biggest one. We can then draw it with the Matrix#drawContour() method.
  const GREEN = [0, 255, 0];
  const thickness = 2;
  const lineType = 8;
  let contours = img.findContours();
  let largestContourImg = new cv.Matrix(img.height(), img.width());
  let largestArea = 0;
  let largestAreaIndex = 0;

  for (let i = 0; i < contours.size(); i++) {
    if (contours.area(i) > largestArea) {
      largestArea = contours.area(i);
      largestAreaIndex = i;
    }
  }

  largestContourImg.drawContour(contours, largestAreaIndex, GREEN, thickness, lineType);
Canny edge image with only one contour
If we want to draw more contours - for example, all contours larger than a certain value - we simply move the Matrix#drawContour() call into the for loop and modify the if condition.
  const GREEN = [0, 255, 0];
  const thickness = 2;
  const lineType = 8;
  const minArea = 500;
  let contours = img.findContours();
  let largestContourImg = new cv.Matrix(img.height(), img.width());

  for (let i = 0; i < contours.size(); i++) {
    if (contours.area(i) > minArea) {
      largestContourImg.drawContour(contours, i, GREEN, thickness, lineType);
    }
  }
Canny edge image with only more contour

Polygon Approximations

Polygon approximation can be used for several useful things. The most trivial is approximation by a bounding rectangle around our object using the Contours#boundingRect(index) method. We call this method on the Contours object, which we get by calling the Matrix#findContours() method on an image after Canny edge detection (as discussed in the previous example).
let bound = contours.boundingRect(largestAreaIndex);
largestContourImg.rectangle([bound.x, bound.y], [bound.width, bound.height], WHITE, 2);
Polygon approximation
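What boundingRect computes can be sketched in plain JavaScript as the axis-aligned box around a set of contour points. The boundingBox function below is a hypothetical helper for illustration, not the node-opencv API.

```javascript
// Axis-aligned bounding box of a list of {x, y} points.
function boundingBox(points) {
  const xs = points.map(p => p.x);
  const ys = points.map(p => p.y);
  const x = Math.min(...xs);
  const y = Math.min(...ys);
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y };
}

const box = boundingBox([{ x: 10, y: 20 }, { x: 110, y: 40 }, { x: 60, y: 220 }]);
console.log(box); // { x: 10, y: 20, width: 100, height: 200 }
```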
The second alternative is approximating polygons with a specified precision by calling the Contours#approxPolyDP() method. Using the Contours#cornerCount(index) method, you get the number of corners of our polygon. I have attached two images with various levels of precision below.
  let poly = new cv.Matrix(img.height(), img.width());
  let RED = [0, 0, 255];
  let arcLength = contours.arcLength(largestAreaIndex, true);
  contours.approxPolyDP(largestAreaIndex, arcLength * 0.05, true);
  poly.drawContour(contours, largestAreaIndex, RED);

  // number of corners
  console.log(contours.cornerCount(largestAreaIndex));
Approximation with specific precision 1
Approximation with specific precision 2
It is also interesting to use an approximation by the rotated rectangle of minimum area, using the Contours#minAreaRect() method.
I use this method in my project to determine the angle of a particular object, which is then rotated into the right position. In the next example, we draw the rotated rectangle into the largestContourImg variable and print the angle of our rotated polygon.
  let rect = contours.minAreaRect(largestAreaIndex);
  for (let i = 0; i < 4; i++) {
    largestContourImg.line([rect.points[i].x, rect.points[i].y], [rect.points[(i + 1) % 4].x, rect.points[(i + 1) % 4].y], RED, 3);
  }

  // angle of polygon
  console.log(rect.angle);
Approximation by the rotated rectangle
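A note on the angle: in OpenCV 2.x, minAreaRect reports the rectangle's angle in the range [-90, 0), so a small normalization step is commonly applied to get the smallest deskewing rotation. The deskewAngle helper below is my own sketch, not part of node-opencv.

```javascript
// Normalize a minAreaRect angle (OpenCV 2.x convention: [-90, 0))
// into the smallest rotation that deskews the rectangle.
function deskewAngle(rectAngle) {
  return rectAngle < -45 ? rectAngle + 90 : rectAngle;
}

console.log(deskewAngle(-75)); // 15
console.log(deskewAngle(-30)); // -30
```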

Image rotation without cropping

One of the things I needed to solve, and which OpenCV does not implement, is image rotation without cropping. We can easily rotate an image with the following code:
img.rotate(90);
But we get something like this:
Rotated image with rotate method
How can we rotate an image without cropping it? Before the rotation, we create a new square 8-bit, 3-channel matrix called bgImg whose side length equals the diagonal of the image we want to rotate.
After that, we calculate the position at which to place our image inside the new bgImg matrix. Finally, we call the Matrix#rotate(angle) method on bgImg with our angle.
  // IMG_ORIGINAL is an untouched copy of the loaded image,
  // e.g. taken with img.copy() before running canny/dilate
  let rect = contours.minAreaRect(largestAreaIndex);
  let diagonal = Math.round(Math.sqrt(Math.pow(IMG_ORIGINAL.size()[1], 2) + Math.pow(IMG_ORIGINAL.size()[0], 2)));
  let bgImg = new cv.Matrix(diagonal, diagonal, cv.Constants.CV_8UC3, [255, 255, 255]);
  let offsetX = (diagonal - IMG_ORIGINAL.size()[1]) / 2;
  let offsetY = (diagonal - IMG_ORIGINAL.size()[0]) / 2;

  IMG_ORIGINAL.copyTo(bgImg, offsetX, offsetY);
  bgImg.rotate(rect.angle + 90);

  bgImg.save('./img/rotatedImg.jpg');
Rotated image without crop
After that, we can run the Canny Edge Detector on our new rotated image.
  const GREEN = [0, 255, 0];
  let rotatedContour = new cv.Matrix(diagonal, diagonal);
  bgImg.canny(lowThresh, highThresh);
  bgImg.dilate(iterations);
  let contours = bgImg.findContours();
  let largestArea = 0;
  let largestAreaIndex = 0;

  for (let i = 0; i < contours.size(); i++) {
    if (contours.area(i) > largestArea) {
      largestArea = contours.area(i);
      largestAreaIndex = i;
    }
  }

  rotatedContour.drawContour(contours, largestAreaIndex, GREEN, thickness, lineType);
  rotatedContour.save('./img/rotatedImgContour.jpg');
Rotated image with contour
There are many other methods that we can use on a picture. For example, background removal can be very useful - but these are not covered in this article.

Object detection

In my application I work with plants, so I don't use a detector for faces, cars, or other objects.
Even so, I decided to mention face detection in this article because it shows the strength of OpenCV.
We call the Matrix#detectObject() method on our loaded image; it accepts the path to the cascade classifier we want to use. OpenCV comes with several pre-trained classifiers that can find people, faces, eyes, ears, cars, and other objects in pictures.
cv.readImage('./img/face.jpg', function(err, im){
  if (err) throw err;
  if (im.width() < 1 || im.height() < 1) throw new Error('Image has no size');

  im.detectObject('./data/haarcascade_frontalface_alt2.xml', {}, function(err, faces){
    if (err) throw err;

    for (var i = 0; i < faces.length; i++){
      var face = faces[i];
      im.ellipse(face.x + face.width / 2, face.y + face.height / 2, face.width / 2, face.height / 2, [255, 255, 0], 3);
    }

    im.save('./img/face-detection.jpg');
    console.log('Image saved.');
  });
});
OpenCV Face detection example

Conclusion

In this article, I covered some interesting features of the popular OpenCV library used from Node.js. It is a real shame that there is no official interface for Node.js; there is the node-opencv library, but it implements fewer features and has an inconsistent API.
If you want to work with this library, you need to study the .cc files in the node-opencv repository, because there is no complete documentation for it, at least not yet.
Reading the code is absolutely fine - I love doing it - but I'm not happy with some inconsistencies and differences in return values compared with the official OpenCV. I hope this library will keep developing, and I will try to contribute a few lines of my own code to it.
