Recognize your users’ handwriting

This content originally appeared on web.dev and was authored by Christian Liebel

The Handwriting Recognition API is part of the capabilities project and is currently in development. This post will be updated as the implementation progresses.

What is the Handwriting Recognition API? #

The Handwriting Recognition API allows you to convert handwriting (ink) from your users into text. Some operating systems have long included such APIs, and with this new capability, your web apps can finally use this functionality. The conversion takes place directly on the user's device, works even in offline mode, and doesn't require any third-party libraries or services.

This API implements so-called "on-line" or near real-time recognition. This means that the handwritten input is recognized while the user is drawing it, by capturing and analyzing the individual strokes. In contrast to "off-line" procedures such as Optical Character Recognition (OCR), where only the end product is known, on-line algorithms can provide a higher level of accuracy due to additional signals like the temporal sequence and pressure of individual ink strokes.

Suggested use cases for the Handwriting Recognition API #

Example uses include:

  • Note-taking applications where users want to capture handwritten notes and have them translated into text.
  • Forms applications where users prefer to use pen or finger input due to time constraints.
  • Games that require filling in letters or numbers, such as crosswords, hangman, or sudoku.

Current status #

Step                                          Status
1. Create explainer                           Complete
2. Create initial draft of specification      Not started
3. Gather feedback & iterate on design        In progress
4. Origin trial                               In progress
5. Launch                                     Not started

How to use the Handwriting Recognition API #

Enabling via about://flags #

To experiment with the Handwriting Recognition API locally, without an origin trial token, enable the #experimental-web-platform-features flag in about://flags.

Note that the API is currently exclusive to Chrome OS devices. Chrome 91 already contains limited support for the API, but to fully experience it, we recommend testing on Chrome 92 to Chrome 94.

Enabling support during the origin trial phase #

Starting in Chrome 92, the Handwriting Recognition API will be available as an origin trial on Chrome OS. The origin trial is expected to end in Chrome 94 (October 13, 2021).

Origin trials allow you to try new features and give feedback on their usability, practicality, and effectiveness to the web standards community. For more information, see the Origin Trials Guide for Web Developers. To sign up for this or another origin trial, visit the registration page.

Register for the origin trial #

  1. Request a token for your origin.
  2. Add the token to your pages. There are two ways to do that:
    • Add an origin-trial <meta> tag to the head of each page. For example, this may look something like:
      <meta http-equiv="origin-trial" content="TOKEN_GOES_HERE">
    • If you can configure your server, you can also add the token using an Origin-Trial HTTP header. The resulting response header should look something like:
      Origin-Trial: TOKEN_GOES_HERE

Feature detection #

Detect browser support by checking for the existence of the createHandwritingRecognizer() method on the navigator object:

if ('createHandwritingRecognizer' in navigator) {
  // The Handwriting Recognition API is supported!
}

Core concepts #

The Handwriting Recognition API converts handwritten input into text, regardless of the input method (mouse, touch, pen). The API has four main entities:

  1. A point represents where the pointer was at a particular time.
  2. A stroke consists of one or more points. The recording of a stroke starts when the user puts the pointer down (i.e., clicks the primary mouse button, or touches the screen with their pen or finger) and ends when they raise the pointer back up.
  3. A drawing consists of one or more strokes. The actual recognition takes place at this level.
  4. The recognizer is configured with the expected input language. It is used to create an instance of a drawing with the recognizer configuration applied.

These concepts are implemented as specific interfaces and dictionaries, which I'll cover shortly.

The core entities of the Handwriting Recognition API: one or more points compose a stroke, one or more strokes compose a drawing, which the recognizer creates. The actual recognition takes place at the drawing level.
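
To see how these entities map onto the API surface covered in the rest of this article, here is a condensed sketch (the literal values are placeholders; the individual steps are explained in the following sections):

// Condensed overview; each step is explained in detail below.
// A point is a plain object: x/y coordinates plus an optional timestamp t.
const point = { x: 84, y: 28, t: 120 };

// A stroke collects the points between pointer down and pointer up.
const stroke = new HandwritingStroke();
stroke.addPoint(point);

// The recognizer is configured with the expected input language …
const recognizer = await navigator.createHandwritingRecognizer({ languages: ['en'] });

// … and creates drawings, which collect strokes and run the recognition.
const drawing = recognizer.startDrawing({ recognitionType: 'text' });
drawing.addStroke(stroke);
const predictions = await drawing.getPrediction();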

Creating a recognizer #

To recognize text from handwritten input, you need to obtain an instance of a HandwritingRecognizer by calling navigator.createHandwritingRecognizer() and passing constraints to it. Constraints determine the handwriting recognition model that should be used. Currently, you can specify a list of languages in order of preference:

const recognizer = await navigator.createHandwritingRecognizer({
  languages: ['en'],
});

Caution: The current implementation on Chrome OS can only recognize one language at a time. It only supports English (en), and a gesture model (zxx-x-gesture) to recognize gestures such as crossing out words.

The method returns a promise resolving with an instance of a HandwritingRecognizer when the browser can fulfill your request. Otherwise, it will reject the promise with an error, and handwriting recognition will not be available. For this reason, you may want to query the recognizer's support for particular recognition features first.
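
If you prefer to handle the rejection directly, a minimal sketch could look like this (the fallback behavior is an assumption of this example, not prescribed by the API):

// Create the recognizer defensively: the promise rejects when the
// requested constraints cannot be fulfilled on this device.
let recognizer;
try {
  recognizer = await navigator.createHandwritingRecognizer({
    languages: ['en'],
  });
} catch (err) {
  // Handwriting recognition is unavailable; fall back to keyboard input.
  console.error('Handwriting recognition unavailable:', err);
}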

Querying recognizer support #

By calling navigator.queryHandwritingRecognizerSupport(), you can check if the target platform supports the handwriting recognition features you intend to use. In the following example, the developer:

  • wants to detect text in English,
  • get alternative, less likely predictions when available, and
  • gain access to the segmentation result, i.e., the recognized characters, including the points and strokes that make them up.

const { languages, alternatives, segmentationResult } =
  await navigator.queryHandwritingRecognizerSupport({
    languages: ['en'],
    alternatives: true,
    segmentationResult: true,
  });

console.log(languages); // true or false
console.log(alternatives); // true or false
console.log(segmentationResult); // true or false

The method returns a promise resolving with a result object. If the browser supports the feature specified by the developer, its value will be set to true. Otherwise, it will be set to false. You can use this information to enable or disable certain features within your application, or to adjust your query and send a new one.
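
For example, a sketch along these lines could toggle a hypothetical application flag (showAlternativesUi is not part of the API):

const support = await navigator.queryHandwritingRecognizerSupport({
  languages: ['en'],
  alternatives: true,
});

// Hypothetical application flag: only offer an alternatives picker
// if the platform can actually produce alternative predictions.
const showAlternativesUi = support.alternatives === true;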

Due to fingerprinting concerns, you cannot request a list of supported features, such as particular languages, and the browser may ask for user permission or reject your request entirely if you send too many feature queries.

Start a drawing #

Within your application, you should offer an input area where the user makes their handwritten entries. For performance reasons, it is recommended to implement this with the help of a canvas object. The exact implementation of this part is out of scope for this article, but you may refer to the demo to see how it can be done.
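
If you want to render the ink yourself, a minimal sketch (names such as ctx and isDrawing are assumptions of this example, not part of the API) could look like this:

// Minimal ink rendering, independent of the Handwriting Recognition API.
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');
let isDrawing = false;

canvas.addEventListener('pointerdown', (event) => {
  isDrawing = true;
  ctx.beginPath();
  ctx.moveTo(event.offsetX, event.offsetY);
});

canvas.addEventListener('pointermove', (event) => {
  if (!isDrawing) return;
  ctx.lineTo(event.offsetX, event.offsetY);
  ctx.stroke();
});

canvas.addEventListener('pointerup', () => {
  isDrawing = false;
});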

To start a new drawing, call the startDrawing() method on the recognizer. This method takes an object containing different hints to fine-tune the recognition algorithm. All hints are optional:

  • The kind of text being entered: text, email addresses, numbers, or an individual character (recognitionType)
  • The type of input device: mouse, touch, or pen input (inputType)
  • The preceding text (textContext)
  • The number of less-likely alternative predictions that should be returned (alternatives)
  • A list of user-identifiable characters ("graphemes") the user will most likely enter (graphemeSet)

The Handwriting Recognition API plays well with Pointer Events, which provide an abstract interface for consuming input from any pointing device. The pointer event arguments contain the type of pointer being used. This means you can use pointer events to determine the input type automatically. In the following example, the drawing for handwriting recognition is automatically created on the first occurrence of a pointerdown event on the handwriting area. As pointerType may be empty or set to a proprietary value, I introduced a consistency check to make sure only supported values are set for the drawing's input type.

let drawing;
let activeStroke;

canvas.addEventListener('pointerdown', (event) => {
  if (!drawing) {
    drawing = recognizer.startDrawing({
      recognitionType: 'text', // email, number, per-character
      inputType: ['mouse', 'touch', 'pen'].find((type) => type === event.pointerType),
      textContext: 'Hello, ',
      alternatives: 2,
      graphemeSet: ['f', 'i', 'z', 'b', 'u'], // for a fizz buzz entry form
    });
  }
  startStroke(event);
});

Caution: The current implementation on Chrome OS does not support grapheme sets yet; they are silently ignored.

Add a stroke #

The pointerdown event is also the right place to start a new stroke. To do so, create a new instance of HandwritingStroke. Also, you should store the current time as a point of reference for the subsequent points added to it:

function startStroke(event) {
  activeStroke = {
    stroke: new HandwritingStroke(),
    startTime: Date.now(),
  };
  addPoint(event);
}

Add a point #

After creating the stroke, you should directly add the first point to it. As you will add more points later on, it makes sense to implement the point creation logic in a separate method. In the following example, the addPoint() method calculates the elapsed time from the reference timestamp. The temporal information is optional, but can improve recognition quality. Then, it reads the X and Y coordinates from the pointer event and adds the point to the current stroke.

function addPoint(event) {
  const timeElapsed = Date.now() - activeStroke.startTime;
  activeStroke.stroke.addPoint({
    x: event.offsetX,
    y: event.offsetY,
    t: timeElapsed,
  });
}

The pointermove event handler is called when the pointer is moved across the screen. Those points need to be added to the stroke as well. The event can also be raised if the pointer is not in a "down" state, for example when moving the cursor across the screen without pressing the mouse button. The event handler from the following example checks if an active stroke exists, and adds the new point to it.

canvas.addEventListener('pointermove', (event) => {
  if (activeStroke) {
    addPoint(event);
  }
});

Recognize text #

When the user lifts the pointer again, you can add the stroke to your drawing by calling its addStroke() method. The following example also resets the activeStroke, so the pointermove handler will not add points to the completed stroke.

If necessary, you can also use the drawing's getStrokes() method to list all strokes, and the removeStroke() method to remove a particular one from the drawing.
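
For instance, a simple "undo last stroke" helper could combine the two (a sketch; undoLastStroke() is not part of the API):

function undoLastStroke(drawing) {
  const strokes = drawing.getStrokes();
  if (strokes.length > 0) {
    // Remove the most recently added stroke from the drawing.
    drawing.removeStroke(strokes[strokes.length - 1]);
  }
}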

Next, it's time for recognizing the user's input by calling the getPrediction() method on the drawing. Recognition usually takes less than a few hundred milliseconds, so you can repeatedly run predictions if needed. The following example runs a new prediction after each completed stroke.

canvas.addEventListener('pointerup', async (event) => {
  drawing.addStroke(activeStroke.stroke);
  activeStroke = null;

  const [mostLikelyPrediction, ...lessLikelyAlternatives] = await drawing.getPrediction();
  if (mostLikelyPrediction) {
    console.log(mostLikelyPrediction.text);
  }
  lessLikelyAlternatives?.forEach((alternative) => console.log(alternative.text));
});

This method returns a promise which resolves with an array of predictions ordered by their likelihood. The number of elements depends on the value you passed to the alternatives hint. You could use this array to present the user with a choice of possible matches, and have them select an option. Alternatively, you can simply go with the most likely prediction, which is what I do in the example.
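
A sketch of such a picker could look like this (the suggestions list element and the applyText() helper are assumptions for illustration, not part of the API):

// Inside the async pointerup handler shown above, for example.
const predictions = await drawing.getPrediction();
const suggestions = document.querySelector('#suggestions'); // hypothetical <ul>
suggestions.innerHTML = '';
predictions.forEach((prediction) => {
  const item = document.createElement('li');
  item.textContent = prediction.text;
  // applyText() is a hypothetical helper that writes the chosen text
  // back into the application's text field.
  item.addEventListener('click', () => applyText(prediction.text));
  suggestions.append(item);
});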

The prediction object contains the recognized text and an optional segmentation result, which I will discuss in the following section.

Detailed insights with segmentation results #

If supported by the target platform, the prediction object can also contain a segmentation result. This is an array containing all recognized handwriting segments. Each segment is a combination of the recognized user-identifiable character (grapheme), its position in the recognized text (beginIndex, endIndex), and the strokes and points that created it.

if (mostLikelyPrediction.segmentationResult) {
  mostLikelyPrediction.segmentationResult.forEach(
    ({ grapheme, beginIndex, endIndex, drawingSegments }) => {
      console.log(grapheme, beginIndex, endIndex);
      drawingSegments.forEach(({ strokeIndex, beginPointIndex, endPointIndex }) => {
        console.log(strokeIndex, beginPointIndex, endPointIndex);
      });
    },
  );
}

You could use this information to track down the recognized graphemes on the canvas again.

Boxes are drawn around each recognized grapheme
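
A sketch of how such boxes could be drawn (assuming the ctx canvas context from earlier and an application-side allStrokePoints array that records the points of each stroke as they are added):

mostLikelyPrediction.segmentationResult?.forEach(({ drawingSegments }) => {
  drawingSegments.forEach(({ strokeIndex, beginPointIndex, endPointIndex }) => {
    // allStrokePoints is a hypothetical array of point arrays maintained by
    // the application; endPointIndex is treated as inclusive here.
    const points = allStrokePoints[strokeIndex].slice(beginPointIndex, endPointIndex + 1);
    const xs = points.map((p) => p.x);
    const ys = points.map((p) => p.y);
    // Draw a bounding box around the segment on the 2D canvas context.
    ctx.strokeRect(
      Math.min(...xs),
      Math.min(...ys),
      Math.max(...xs) - Math.min(...xs),
      Math.max(...ys) - Math.min(...ys),
    );
  });
});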

Complete recognition #

After the recognition has completed, you can free resources by calling the clear() method on the HandwritingDrawing, and the finish() method on the HandwritingRecognizer:

drawing.clear();
recognizer.finish();

Demo #

The web component <handwriting-textarea> implements a progressively enhanced editing control capable of handwriting recognition. By clicking the button in the lower right corner of the editing control, you activate the drawing mode. When you complete the drawing, the web component automatically starts the recognition and adds the recognized text back to the editing control. If the Handwriting Recognition API is not supported at all, or the platform doesn't support the requested features, the edit button is hidden, but the basic editing control remains usable as a <textarea>.

The web component offers properties and attributes to define the recognition behavior from the outside, including languages and recognitiontype. You can set the content of the control via the value attribute:

<handwriting-textarea languages="en" recognitiontype="text" value="Hello"></handwriting-textarea>

To be informed about any changes to the value, you can listen to the input event.
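
For example (a sketch; reading the value property back is an assumption of this example):

const textarea = document.querySelector('handwriting-textarea');
textarea.addEventListener('input', () => {
  // React to typed input as well as newly recognized handwriting.
  console.log(textarea.value);
});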

You can try the component using this demo on Glitch. Also be sure to have a look at the source code. To use the control in your application, obtain it from npm.

Security and permissions #

The Chromium team has designed and implemented the Handwriting Recognition API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics.

User control #

The Handwriting Recognition API can't be turned off by the user. It is only available for websites delivered via HTTPS, and may only be called from the top-level browsing context.

Transparency #

There is no indication if handwriting recognition is active. To prevent fingerprinting, the browser implements countermeasures, such as displaying a permission prompt to the user when it detects possible abuse.

Permission persistence #

The Handwriting Recognition API currently does not show any permissions prompts. Thus, permission does not need to be persisted in any way.

Feedback #

The Chromium team wants to hear about your experiences with the Handwriting Recognition API.

Tell us about the API design #

Is there something about the API that doesn't work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue.

Report a problem with the implementation #

Did you find a bug with Chromium's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>Handwriting in the Components box. Glitch works great for sharing quick and easy repros.

Show support for the API #

Are you planning to use the Handwriting Recognition API? Your public support helps the Chromium team prioritize features and shows other browser vendors how critical it is to support them.

Share how you plan to use it on the WICG Discourse thread. Send a tweet to @ChromiumDev using the hashtag #HandwritingRecognition and let us know where and how you're using it.

Acknowledgements #

This article was reviewed by Joe Medley, Honglin Yu and Jiewei Qian. Hero image by Samir Bouaked on Unsplash.

