The zebra jumps quickly over a fence, vexed by a lazy ox

This content originally appeared on phpied.com and was authored by Stoyan

The zebra jumps quickly over a fence, vexed by a lazy ox. Eden tries to alter soft stone near it. Tall giants often need to rest, and open roads invite no pause. Some long lines appear there. In bright cold night, stars drift, and people watch them. A few near doors step out. Much light finds land slowly, while men feel deep quiet. Words run in ways, forward yet true. Look ahead, and things form still, yet dreams stay hidden. Down the path, close skies come, forming hard arcs. High above, quiet kites drift, fast on pure wind, yanking joints.

What's so special about the nonsense paragraph above? It's attempting to match the average distribution of letters in texts written in the English language.

This article by Peter Norvig discusses a 2012 study of letter frequency using the Google Books data set. The distribution looks like so:

For font-fallback matching purposes (more on this later) I want a shorter paragraph with a roughly similar distribution. One can, of course, just create a paragraph like "Zzzzzzzzz" (9 Zs), followed by 12 Qs and so on, all the way to 1249 Es. But where's the fun in that? Plus, texts have spaces and punctuation too.
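
For the curious, the counts in that brute-force paragraph are just Norvig's percentages scaled to a 10,000-letter text. A minimal sketch, assuming that scaling; the function name `naiveParagraph` and the three-letter sample are made up for illustration:

```typescript
// Norvig's percentages for three of the letters (see the CSV in this post).
const sample: Record<string, number> = { E: 12.49, Q: 0.12, Z: 0.09 };

// Repeat each letter in proportion to its percentage.
// `perLetters` is the (assumed) size of the text the counts are scaled to.
function naiveParagraph(freqs: Record<string, number>, perLetters = 10000): string {
  return Object.entries(freqs)
    .map(([letter, pct]) => letter.repeat(Math.round((pct / 100) * perLetters)))
    .join("");
}

const text = naiveParagraph(sample);
console.log(text.length); // 1249 Es + 12 Qs + 9 Zs
```

The result has the right proportions but no words, spaces or punctuation, which is why it isn't much use here.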

So after some tweaking and coaching AI, out came this paragraph, which looks more realistic and matches the letter frequencies pretty well.

Here's a CSV that shows:

  • each letter,
  • Norvig's frequencies (based on 3,563,505,777,820 letters in the dataset) and
  • my frequencies too (based on a mere 424 letters, once you take out spaces and punctuation)

Letter,Norvig,Tall giants
E,12.49%,12.26%
T,9.28%,8.73%
A,8.04%,7.55%
O,7.64%,7.08%
I,7.57%,6.60%
N,7.23%,7.55%
S,6.51%,6.84%
R,6.28%,6.13%
H,5.05%,4.01%
L,4.07%,4.48%
D,3.82%,5.42%
C,3.34%,1.89%
U,2.73%,2.36%
M,2.51%,2.12%
F,2.40%,2.83%
P,2.14%,2.59%
G,1.87%,2.12%
W,1.68%,2.12%
Y,1.66%,2.12%
B,1.48%,0.94%
V,1.05%,0.94%
K,0.54%,1.18%
X,0.23%,0.47%
J,0.16%,0.47%
Q,0.12%,0.71%
Z,0.09%,0.47%
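
The third column can be recomputed from any text with a few lines of code. This is a sketch of one way to do it, not necessarily how the numbers above were produced: it assumes only A to Z count as letters, ignores case, and uses the hypothetical names `letterStats` and `toCsvRow`.

```typescript
interface LetterStats {
  total: number;                       // letters counted (A-Z only)
  percentages: Record<string, number>; // letter -> share of total, in percent
}

function letterStats(text: string): LetterStats {
  const counts: Record<string, number> = {};
  let total = 0;
  for (const ch of text.toUpperCase()) {
    // Skip spaces, digits, punctuation and anything outside A-Z.
    if (ch >= "A" && ch <= "Z") {
      counts[ch] = (counts[ch] ?? 0) + 1;
      total++;
    }
  }
  const percentages: Record<string, number> = {};
  for (const [letter, count] of Object.entries(counts)) {
    percentages[letter] = (count * 100) / total;
  }
  return { total, percentages };
}

// Format a row like the CSV above, e.g. "L,30.00%".
function toCsvRow(letter: string, percent: number): string {
  return `${letter},${percent.toFixed(2)}%`;
}

const stats = letterStats("Hello, World!");
console.log(stats.total);                         // 10 letters
console.log(toCsvRow("L", stats.percentages.L));  // L,30.00%
```

Fed the nonsense paragraph, `total` should land on the 424 letters quoted above, provided the counting rules really are the same.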

Here's the same data represented graphically:

Well, what's the point of this?

Similar to the nonsense etaoin shrdlu used by typesetters, this paragraph can be used to find out the average character width of a font.

Just render the paragraph in a non-wrapping inline-block DOM element, measure the width of the element and divide by the length of the text.
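
The measurement could be sketched like this. Everything named here is an assumption for illustration: the function `averageCharWidth`, the 16px default size, and the two small interfaces, which exist only so the code type-checks outside a browser (a real `document` is expected to satisfy them).

```typescript
// Minimal structural types standing in for the DOM pieces the sketch touches.
interface MeasurableElement {
  textContent: string | null;
  style: { display: string; whiteSpace: string; fontFamily: string; fontSize: string };
  getBoundingClientRect(): { width: number };
  remove(): void;
}
interface DocumentLike {
  createElement(tag: string): MeasurableElement;
  body: { appendChild(el: any): unknown };
}

function averageCharWidth(
  doc: DocumentLike,
  text: string,
  fontFamily: string,
  fontSize = "16px", // assumed default, pick whatever your page uses
): number {
  const el = doc.createElement("span");
  el.textContent = text;
  el.style.display = "inline-block"; // shrink-wrap the box to the text
  el.style.whiteSpace = "nowrap";    // keep the paragraph on one line
  el.style.fontFamily = fontFamily;
  el.style.fontSize = fontSize;
  doc.body.appendChild(el);
  const width = el.getBoundingClientRect().width;
  el.remove();
  // Spaces and punctuation count too: divide by the full text length.
  return width / text.length;
}

// In a browser (not run here):
//   const avg = averageCharWidth(document, paragraph, "Georgia");
```

If the font is a web font, the number only means something once the font has loaded, so waiting on `document.fonts.ready` before measuring is probably wise.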

How is this useful? Welp, to set the size-adjust descriptor in a fallback font's @font-face rule so that it matches a custom web font. A further write-up is coming, stay tuned!
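
Until that write-up lands, here is one plausible way the two measurements could feed into size-adjust. It is a guess, not the method from the upcoming post: it assumes the value is simply the ratio of the web font's average character width to the fallback's, and the names `sizeAdjustPercent` and `fallbackFontFace` are hypothetical.

```typescript
// Ratio of the custom font's average character width to the fallback's,
// expressed as a percentage string suitable for size-adjust.
function sizeAdjustPercent(customAvgWidth: number, fallbackAvgWidth: number): string {
  return ((customAvgWidth / fallbackAvgWidth) * 100).toFixed(2) + "%";
}

// Build an @font-face rule that aliases a locally installed font
// and scales it with size-adjust.
function fallbackFontFace(alias: string, localFont: string, sizeAdjust: string): string {
  return `@font-face {
  font-family: "${alias}";
  src: local("${localFont}");
  size-adjust: ${sizeAdjust};
}`;
}

// Example with made-up widths: web font averages 9px, fallback averages 10px.
console.log(fallbackFontFace("Fallback", "Arial", sizeAdjustPercent(9, 10)));
```

With those made-up widths the fallback gets shrunk to 90% so that, on average, its lines run about as long as the web font's.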

Close enough

As you can see in the graph, the two lines do not match exactly. I think this is OK. It's extremely unlikely that any text on your page will have the exact average distribution of letters in it. So we're talking about an approximation to begin with. It may also be site-dependent: on an adult site, for example, the X character may occur more often than in the average book.

Also, Norvig's analysis doesn't mention spaces and punctuation. In my paragraph these exist, which may make it match the average text on a web page just a little bit closer.

Aside: why not just Lorem Ipsum

Well, it doesn't attempt to match the character distribution in English. (Duh, it's not even English!)
Here's what it looks like in the same diagram:

Note: no K, J, Z, W or Y. Barely any H.

Here are the stats in CSV and .numbers for your perusal.

May "The zebra jumps quickly over a fence, vexed by a lazy ox" be always in your favor!

