This content originally appeared on Modern Web Development with Chrome and was authored by Paul Kinlan
<p>I've been playing around a lot with the Shape Detection API
(<a href="https://paul.kinlan.me/face-detection/">face detection</a>,
<a href="https://paul.kinlan.me/barcode-detection/">barcode detection</a> and
<a href="https://paul.kinlan.me/detecting-text-in-an-image/">text detection</a>) in Chrome and I really
like the potential it has. For example, a very simple <a href="https://qrsnapper.com">QRCode
detector</a> I wrote a long time ago has a JS polyfill, but
uses the <code>BarcodeDetector</code> API if it is available.</p>
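<p>The "use the native API if it is available, otherwise fall back to a polyfill" pattern
can be sketched roughly as follows. This is a minimal illustration, not the code that
QRSnapper actually uses: <code>createBarcodeDetector</code> and <code>createFallback</code> are
hypothetical names, and the fallback is assumed to expose the same <code>detect()</code>
method as the native detector.</p>

```javascript
// Hypothetical helper (not from the original post).
// `scope` is the global object (window or self in a browser).
// `createFallback` is an assumed factory that returns a JS polyfill
// exposing the same detect(image) method as the native detector.
function createBarcodeDetector(scope, createFallback) {
  // Feature-detect the native Shape Detection API constructor.
  if (scope && typeof scope.BarcodeDetector === 'function') {
    return { source: 'native', detector: new scope.BarcodeDetector() };
  }
  // Otherwise fall back to the JavaScript implementation.
  return { source: 'polyfill', detector: createFallback() };
}
```

<p>In a page this might be called as <code>createBarcodeDetector(window, () =&gt; myPolyfill)</code>,
after which <code>detector.detect(image)</code> should resolve to a list of detected codes,
assuming the polyfill mirrors the native behaviour.</p>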
<p>You can see some of the other demos I've built using the other
capabilities of the Shape Detection API: <a href="https://paul.kinlan.me/face-detection/">Face
Detection</a>, <a href="https://paul.kinlan.me/barcode-detection/">Barcode
Detection</a> and <a href="https://paul.kinlan.me/detecting-text-in-an-image/">Text
Detection</a>.</p>
<p>I was pleasantly surprised when I stumbled across <a href="https://jeeliz.com">Jeeliz</a>
at the weekend, and I was incredibly impressed by the performance of their
toolkit. Granted, I was using a Pixel 3 XL, but detection of faces seemed
significantly quicker than what is possible with the <code>FaceDetector</code> API.</p>
<p><a href="https://jeeliz.com/sunglasses">Check out some of their demos</a>.</p>
<figure>
<img src="https://paul.kinlan.me/images/2019-03-11-object-detection-and-augmentation.jpeg">
</figure>
<p>It got me thinking a lot. This toolkit for object detection (and ones like it)
uses APIs that are broadly available on the web, specifically camera access,
WebGL and WASM. Unlike Chrome's Shape Detection API (which is only in
Chrome and not consistent across all the platforms that Chrome is on), these can be used to
build rich experiences easily and reach billions of users with a consistent
experience across all platforms.</p>
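<p>A page that wants to rely on those three building blocks could check for them up front.
The sketch below is only an illustration: <code>detectCapabilities</code> and
<code>canRunDetectionToolkit</code> are hypothetical names, and the presence of these globals
does not guarantee that a camera is attached or that a WebGL context can actually be created.</p>

```javascript
// Hypothetical capability check (names are illustrative, not from any toolkit).
// `scope` is the global object (window in a browser).
function detectCapabilities(scope) {
  const nav = scope.navigator;
  return {
    // Camera access via navigator.mediaDevices.getUserMedia().
    camera: Boolean(
      nav && nav.mediaDevices &&
      typeof nav.mediaDevices.getUserMedia === 'function'
    ),
    // The WebGL interface exists; creating a context can still fail.
    webgl: typeof scope.WebGLRenderingContext === 'function',
    // The WebAssembly namespace object is exposed.
    wasm: typeof scope.WebAssembly === 'object' && scope.WebAssembly !== null,
  };
}

// True only when all three building blocks appear to be present.
function canRunDetectionToolkit(scope) {
  const caps = detectCapabilities(scope);
  return caps.camera && caps.webgl && caps.wasm;
}
```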
<p>Augmentation is where it gets interesting (and really what I wanted to show off
in this post), and it is where you need the middleware libraries that are now coming to the
platform. We can build the fun Snapchat-esque face filter apps without having
users install MASSIVE apps that harvest huge amounts of data from the user's
device (because there is no underlying access to the system).</p>
<p>Outside of the fun demos, it's possible to solve very advanced use-cases quickly
and simply for the user, such as:</p>
<ul>
<li>Text selection directly from the camera or from a user's photo</li>
<li>Live translation of languages from the camera</li>
<li>Inline QRCode detection so people don't have to open WeChat all the time :)</li>
<li>Auto-extracting website URLs or addresses from an image</li>
<li>Credit card detection and number extraction (get users signing up to your site
quicker)</li>
<li>Visual product search in your store's web app</li>
<li>Barcode lookup for more product details in your store's web app</li>
<li>Quick cropping of profile photos on to people's faces</li>
<li>Simple A11Y features to let a user hear the text found in images</li>
</ul>
</ul>
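<p>To make one of these concrete, here is a rough sketch of the profile photo case. It assumes
a detector (native or library based) has already returned a face bounding box with
<code>x</code>, <code>y</code>, <code>width</code> and <code>height</code> in image pixels. The
function name <code>squareCropAroundFace</code> and the <code>padding</code> parameter are
illustrative choices, not part of any API.</p>

```javascript
// Hypothetical helper: compute a square crop centred on a detected face.
// `box` is assumed to be { x, y, width, height } in image pixels.
// `padding` is the extra margin around the face as a fraction of its size.
function squareCropAroundFace(box, imageWidth, imageHeight, padding = 0.4) {
  // Grow the larger face dimension by the padding, but never beyond the image.
  const side = Math.min(
    Math.max(box.width, box.height) * (1 + padding),
    imageWidth,
    imageHeight
  );
  const centerX = box.x + box.width / 2;
  const centerY = box.y + box.height / 2;
  // Keep the square inside the image bounds.
  const clamp = (value, min, max) => Math.min(Math.max(value, min), max);
  return {
    x: clamp(centerX - side / 2, 0, imageWidth - side),
    y: clamp(centerY - side / 2, 0, imageHeight - side),
    width: side,
    height: side,
  };
}
```

<p>The resulting rectangle can be passed as the source rectangle to a canvas
<code>drawImage()</code> call to produce the cropped avatar.</p>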
<p>I just spent 5 minutes thinking about these use-cases (I know there are a
lot more), but it hit me that we don't see a lot of sites or web apps
utilising the camera. Instead, we see a lot of sites asking their users to
download an app, and I don't think we need to do that any more.</p>
<p><strong>Update</strong>: Thomas Steiner on our team mentioned in our team chat that it sounds
like I don't like the current Shape Detection API. I love the fact that this
API gives us access to the native implementations that ship on each of the
respective systems. However, as I wrote in <a href="https://paul.kinlan.me/the-lumpy-web/">The Lumpy Web</a>, web
developers crave consistency in the platform, and there are a number of issues with
the Shape Detection API that can be summarized as:</p>
<ol>
<li>The API is only in Chrome.</li>
<li>The API in Chrome is vastly different on every platform because the
underlying implementations are different. Android only has points for
landmarks such as the mouth and eyes, whereas macOS has outlines. On Android the
<code>TextDetector</code> returns the detected text, whereas on macOS it returns a
'Text Presence' indicator... This is not to mention all the bugs that Surma
found.</li>
</ol>
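<p>To illustrate the kind of glue code this inconsistency pushes onto developers, here is a
small sketch that reduces each face landmark to a single centre point, so that a
one-point landmark and a multi-point outline can be handled the same way. It assumes each
landmark looks like <code>{ type, locations: [{ x, y }, ...] }</code>; the exact shapes
returned on each platform are an assumption here, and <code>landmarkCenters</code> is a
hypothetical helper.</p>

```javascript
// Hypothetical normaliser: collapse each landmark to one centre point.
// Assumes `face.landmarks` is an array of { type, locations: [{ x, y }, ...] }.
// A single-point landmark maps to itself; an outline maps to the average
// of its points.
function landmarkCenters(face) {
  return (face.landmarks || [])
    // Skip landmarks that carry no usable points.
    .filter((lm) => Array.isArray(lm.locations) && lm.locations.length > 0)
    .map((lm) => {
      const sum = lm.locations.reduce(
        (acc, p) => ({ x: acc.x + p.x, y: acc.y + p.y }),
        { x: 0, y: 0 }
      );
      return {
        type: lm.type,
        x: sum.x / lm.locations.length,
        y: sum.y / lm.locations.length,
      };
    });
}
```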
<p>The web as a platform for distribution makes so much sense for experiences like
these that I think it would be remiss of us not to do it. However, the above two
groupings of issues lead me to question the long-term need to implement every
feature on the web platform natively, when we could implement good solutions in
a package that is shipped using the features of the platform today, like WebGL
and WASM, and in the future WebGPU.</p>
<p>Anyway, I love the fact that we can do this on the web and I am looking forward
to seeing sites ship experiences like these.</p>