This content originally appeared on Modern Web Development with Chrome and was authored by Paul Kinlan
<p>I was at the party of the <a href="https://developer.chrome.com/devsummit">Chrome Dev
Summit</a> and <a href="https://twitter.com/yellowdoge">Miguel Casas-Sanchez</a> on the
Chrome team came up to me and said "Hey Paul, I have a demo for you". Once I
saw it, I had to get it into my talk.</p>
<p>The demo used the <a href="https://wicg.github.io/shape-detection-api/#introduction">Shape Detection
API</a>, which is currently in an incubation and experimentation phase in the
<a href="https://github.com/wicg/">WICG</a> and is a nice incremental addition to the platform.</p>
<p>The Shape Detection API is interesting because it creates a standard interface
on top of some underlying hardware features on the user's device and opens up a
new set of capabilities to the web platform.</p>
<p>Shape detection has been possible on the web for a long time. There are numerous
libraries that can do edge detection, face detection, and bar-code
and QR code detection (I even wrote a web app that does it).</p>
<p>The Shape Detection API is currently in Chrome Canary (M57) and can detect both
faces and bar-codes (including QR codes). Because it is still experimental,
you have to enable it via <code>chrome://flags/#enable-experimental-web-platform-features</code>.</p>
<p>The API is relatively simple to use, with the simplest form of face detection
being to invoke the API with an image and get the list of faces back.</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-javascript" data-lang="javascript"><span style="color:#66d9ef">var</span> <span style="color:#a6e22e">faceDetector</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">FaceDetector</span>();
<span style="color:#a6e22e">faceDetector</span>.<span style="color:#a6e22e">detect</span>(<span style="color:#a6e22e">image</span>)
.<span style="color:#a6e22e">then</span>(<span style="color:#a6e22e">faces</span> => <span style="color:#a6e22e">faces</span>.<span style="color:#a6e22e">forEach</span>(<span style="color:#a6e22e">face</span> => <span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">log</span>(<span style="color:#a6e22e">face</span>)))
.<span style="color:#66d9ef">catch</span>(<span style="color:#a6e22e">e</span> => {
<span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">error</span>(<span style="color:#e6db74">"Boo, Face Detection failed: "</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">e</span>);
});
</code></pre></div><p>It takes an image object (either a CanvasImageSource, Blob, ImageData or an
<code>&lt;img&gt;</code> element) and passes it to the underlying system API. The promise it
returns resolves with an array of <code>DetectedFace</code> objects that implement <code>DetectedObject</code>, which
essentially gives you the bounds of each face in the image.</p>
<p>Miguel wrote a fuller demo (which I stole and put on
<a href="https://jsbin.com/gegudoc/4/">JSBin</a>) that loads an image, passes it through
the detection API and then draws a rectangle on the image around each
<code>DetectedFace</code>. (Note: this currently only works on Chrome for Android;
desktop support is landing soon.)</p>
<div class="highlight"><pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-javascript" data-lang="javascript"><span style="color:#66d9ef">var</span> <span style="color:#a6e22e">image</span> <span style="color:#f92672">=</span> document.<span style="color:#a6e22e">getElementById</span>(<span style="color:#e6db74">'image'</span>);
<span style="color:#66d9ef">var</span> <span style="color:#a6e22e">canvas</span> <span style="color:#f92672">=</span> document.<span style="color:#a6e22e">getElementById</span>(<span style="color:#e6db74">'canvas'</span>);
<span style="color:#66d9ef">var</span> <span style="color:#a6e22e">ctx</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">canvas</span>.<span style="color:#a6e22e">getContext</span>(<span style="color:#e6db74">'2d'</span>);
<span style="color:#66d9ef">var</span> <span style="color:#a6e22e">scale</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>;
<span style="color:#a6e22e">image</span>.<span style="color:#a6e22e">onload</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">function</span>() {
<span style="color:#a6e22e">ctx</span>.<span style="color:#a6e22e">drawImage</span>(<span style="color:#a6e22e">image</span>,
<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>, <span style="color:#a6e22e">image</span>.<span style="color:#a6e22e">width</span>, <span style="color:#a6e22e">image</span>.<span style="color:#a6e22e">height</span>,
<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>, <span style="color:#a6e22e">canvas</span>.<span style="color:#a6e22e">width</span>, <span style="color:#a6e22e">canvas</span>.<span style="color:#a6e22e">height</span>);
<span style="color:#a6e22e">scale</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">canvas</span>.<span style="color:#a6e22e">width</span> <span style="color:#f92672">/</span> <span style="color:#a6e22e">image</span>.<span style="color:#a6e22e">width</span>;
};
<span style="color:#66d9ef">function</span> <span style="color:#a6e22e">detect</span>() {
<span style="color:#66d9ef">if</span> (window.<span style="color:#a6e22e">FaceDetector</span> <span style="color:#f92672">==</span> <span style="color:#66d9ef">undefined</span>) {
<span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">error</span>(<span style="color:#e6db74">'Face Detection not supported'</span>);
<span style="color:#66d9ef">return</span>;
}
<span style="color:#66d9ef">var</span> <span style="color:#a6e22e">faceDetector</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">FaceDetector</span>();
<span style="color:#a6e22e">faceDetector</span>.<span style="color:#a6e22e">detect</span>(<span style="color:#a6e22e">image</span>)
.<span style="color:#a6e22e">then</span>(<span style="color:#a6e22e">faces</span> => {
<span style="color:#75715e">// Draw the faces on the &lt;canvas&gt;.
</span><span style="color:#75715e"></span> <span style="color:#66d9ef">var</span> <span style="color:#a6e22e">ctx</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">canvas</span>.<span style="color:#a6e22e">getContext</span>(<span style="color:#e6db74">'2d'</span>);
<span style="color:#a6e22e">ctx</span>.<span style="color:#a6e22e">lineWidth</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">2</span>;
<span style="color:#a6e22e">ctx</span>.<span style="color:#a6e22e">strokeStyle</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">'red'</span>;
<span style="color:#66d9ef">for</span>(<span style="color:#66d9ef">let</span> <span style="color:#a6e22e">face</span> <span style="color:#66d9ef">of</span> <span style="color:#a6e22e">faces</span>) {
<span style="color:#a6e22e">ctx</span>.<span style="color:#a6e22e">rect</span>(Math.<span style="color:#a6e22e">floor</span>(<span style="color:#a6e22e">face</span>.<span style="color:#a6e22e">x</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">scale</span>),
Math.<span style="color:#a6e22e">floor</span>(<span style="color:#a6e22e">face</span>.<span style="color:#a6e22e">y</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">scale</span>),
Math.<span style="color:#a6e22e">floor</span>(<span style="color:#a6e22e">face</span>.<span style="color:#a6e22e">width</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">scale</span>),
Math.<span style="color:#a6e22e">floor</span>(<span style="color:#a6e22e">face</span>.<span style="color:#a6e22e">height</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">scale</span>));
<span style="color:#a6e22e">ctx</span>.<span style="color:#a6e22e">stroke</span>();
}
})
.<span style="color:#66d9ef">catch</span>((<span style="color:#a6e22e">e</span>) => {
<span style="color:#a6e22e">console</span>.<span style="color:#a6e22e">error</span>(<span style="color:#e6db74">"Boo, Face Detection failed: "</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">e</span>);
});
}
</code></pre></div><h3 id="what-does-this-enable">What does this enable?</h3>
<p>There are quite a few use-cases that the Face Detection API opens up a
little more, for example:</p>
<ul>
<li>Vastly more performant experiences when detecting faces: we have a lot
of flexibility with this API as it allows us to move the processing into a
Service or Web Worker.</li>
<li>Profile picture cropping: find your face in the picture and automatically
crop the image around it.</li>
<li>Quick tagging: quickly find all the faces in a scene and create
a UI that enables you to tag them.</li>
<li>Optimising facial recognition: once you have the bounds of each face you
can pass just those regions to your facial recognition tools.</li>
</ul>
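<p>As a rough sketch of the profile picture idea, the helper below computes a square crop around one detected face. The name <code>cropRectForFace</code>, the <code>padding</code> parameter and the plain <code>{x, y, width, height}</code> box shape are assumptions made for illustration; none of them is part of the Shape Detection API.</p>

```javascript
// Hypothetical helper (not part of the Shape Detection API).
// `box` is assumed to be the face bounds as { x, y, width, height } in
// image pixels; `padding` is the extra margin as a fraction of the face size.
function cropRectForFace(box, imageWidth, imageHeight, padding = 0.4) {
  // Grow the larger side of the face box by `padding` on both sides,
  // but never beyond the smaller image dimension.
  const side = Math.min(
    Math.max(box.width, box.height) * (1 + 2 * padding),
    imageWidth,
    imageHeight
  );

  // Centre the square on the middle of the face.
  const centerX = box.x + box.width / 2;
  const centerY = box.y + box.height / 2;

  // Clamp the top-left corner so the crop stays inside the image.
  const x = Math.min(Math.max(centerX - side / 2, 0), imageWidth - side);
  const y = Math.min(Math.max(centerY - side / 2, 0), imageHeight - side);

  return { x: Math.round(x), y: Math.round(y), size: Math.round(side) };
}
```

<p>The result could then be handed to the nine-argument form of <code>ctx.drawImage</code>, as used in the demo above, to paint only that region onto a canvas.</p>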
<p>Are these all possible today in the browser? Yes, but you need to plan for
progressive use ahead of time.</p>
<h3 id="planning-for-progressive-ness">Planning for Progressive-ness.</h3>
<p>This is obviously a pure JS API that requires access to the underlying
hardware APIs, but an experience that uses it can "easily" (heh) be built to be
fully progressive, ensuring that users who don't use the latest version of Chrome are still
able to access your experience.</p>
<p>My thoughts around this follow a relatively standard approach to progressive
enhancement: Server → JS (+ WebAssembly maybe) → Web API. I thought
I would explore this a little bit further, though, as I do see a number of challenges.</p>
<h4 id="server">Server</h4>
<p>We can create a simple form with an <code>&lt;input type="file"&gt;</code> that uploads an
image to your server, where you run the detection and
return the results to the client.</p>
<h4 id="js">JS</h4>
<p>If we have JS enabled we have the ability to do facial detection inside the
browser and directly in the context of the page using any one of a number of
client libraries.</p>
<p>The WebAssembly aside:</p>
<p>It is incredibly hard (at least in my opinion) to do image processing, and even
harder to do object detection, especially in a performant way. "Native platforms"
have long had many libraries (<a href="http://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html">OpenCV</a>,
for example) that are primarily written in C or C++. These can now be brought to the
browser, taking advantage of that rich ecosystem with performance in the same order
of magnitude.</p>
<p>It would be incredibly useful if someone made a polyfill for this Shape Detection
API.</p>
<h4 id="web-api">Web API</h4>
<p>With the server and JS tiers giving us coverage across all platforms, we can
use the underlying system API whenever it is available.</p>
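<p>One way to picture the three tiers together is a small selection function like the sketch below. The function name <code>pickDetectionTier</code>, the <code>hasJsFallback</code> flag and the tier labels are hypothetical; only the check for a global <code>FaceDetector</code> comes from the demo above.</p>

```javascript
// Hypothetical helper: choose the best available face detection tier.
// `scope` is the global object (window in a page, self in a worker).
// `hasJsFallback` says whether the page has loaded a JavaScript or
// WebAssembly detection library of its own.
function pickDetectionTier(scope, hasJsFallback) {
  if (scope && typeof scope.FaceDetector === 'function') {
    return 'web-api'; // native Shape Detection API
  }
  if (hasJsFallback) {
    return 'js'; // client-side library
  }
  return 'server'; // let the plain form upload do the work
}

// Example: decide once at start-up which code path to wire up.
const tier = pickDetectionTier(globalThis, false);
console.log('Using detection tier:', tier);
```

<p>If JavaScript is not running at all, none of this executes and the form from the server tier still works, which is the point of layering it this way.</p>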
<p>I think this is an interesting API to bring to the platform and it certainly
opens up a range of possibilities. For me, this is specifically about vastly
increasing the performance of object detection on the web by using the
underlying system as opposed to pure JavaScript. That is why I am looking
forward to the bar-code detection API: it should greatly increase the performance
of my <a href="https://qrsnapper.appspot.com/">QR Scanner Web app</a> whilst at the same
time reducing the complexity of the application.</p>
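<p>For bar-codes the shape of the code should be much the same as for faces. The following is a minimal sketch, assuming the <code>BarcodeDetector</code> interface and the <code>rawValue</code> property described in the Shape Detection spec; the wrapper name <code>readBarcodes</code> and its <code>scope</code> parameter are hypothetical, and the API is still behind a flag.</p>

```javascript
// Hypothetical wrapper around the experimental BarcodeDetector interface.
// Resolves with the decoded values of every bar-code or QR code found in
// `image`, or with null when the API is unavailable so that the caller can
// fall back to a JavaScript library or the server.
async function readBarcodes(image, scope = globalThis) {
  if (typeof scope.BarcodeDetector !== 'function') {
    return null; // not supported in this browser
  }
  const detector = new scope.BarcodeDetector();
  const barcodes = await detector.detect(image);
  // Each detected bar-code exposes its decoded text as `rawValue`.
  return barcodes.map(barcode => barcode.rawValue);
}
```

<p>A page could call <code>readBarcodes(image)</code> with the same kind of image object used for face detection and treat a <code>null</code> result as the signal to use its existing decoder.</p>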