Highlights of the HTTP Archive Web Almanac (#blogPost)

I spent my Sunday reading the HTTP Archive Web Almanac and shared the surprising and interesting pieces in a Twitter thread. I like to own my data – so here we have the thread on my own site. Enjoy!


This content originally appeared on Stefan Judis Web Development and was authored by Stefan Judis

Spending my Sunday morning reading the Web Almanac, which shares internet stats and analyzes HTTP Archive data for 2019.

I'll share the facts and stats that I find interesting in a thread. :)

Only ~80% of sites compress their JavaScript files.

Chart showing how many sites compress their JavaScript files using gzip (~65%) or brotli (~15%)

Source

Edited: I initially tweeted that it's 65% because I missed that gzip (~65%) and brotli (~15%) count together.
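To make the "gzip and brotli count together" point concrete, here's a minimal Python sketch that decides whether a response counts as compressed by looking at its `Content-Encoding` header. The helper name `is_compressed` and the sample responses are made up for illustration; this is not the query the Almanac ran.

```python
# Hypothetical helper, not the Almanac's methodology.
COMPRESSED_ENCODINGS = {"gzip", "br"}  # "br" is the header token for brotli


def is_compressed(headers: dict) -> bool:
    """Return True if Content-Encoding names gzip or brotli."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "")
    # The header may list several encodings, e.g. "gzip, br".
    tokens = {token.strip().lower() for token in encoding.split(",")}
    return bool(tokens & COMPRESSED_ENCODINGS)


# Made-up sample data: two compressed scripts and one uncompressed script.
responses = [
    {"Content-Type": "application/javascript", "Content-Encoding": "gzip"},
    {"Content-Type": "application/javascript", "content-encoding": "br"},
    {"Content-Type": "application/javascript"},
]
share = sum(is_compressed(r) for r in responses) / len(responses)
print(f"{share:.0%} of the sample responses are compressed")
```

Counting per encoding and only then adding the shares up is exactly where I slipped in the original tweet.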

jQuery still powers 85% of the crawled sites

It always feels like React/Vue/Angular are all over the internet – they're not... jQuery still powers 85% of the crawled sites.

Table showing the usage of libraries – jQuery is in first place, included on 85% of the crawled sites

Source

React powers 5% of the crawled sites

The numbers for sites using "cutting-edge" frameworks are relatively low with React being the most popular with ~5% on desktop.

Distribution showing how many sites are using React, Angular or others. Only ~5% of sites use React.

Source

Even though ES module support is quite good these days, modules are not really used.

~1% is surprisingly low, because you can already ship modules to supporting browsers today and fall back to a single bundle via the `nomodule` attribute for everything else.

Chart showing the usage of ES Modules. Only roughly 1% of crawled sites use modules

Source

Low usage of source maps in production

Only ~20% of sites use source maps?

Chart showing the usage of source maps – roughly 20%

Source

Roughly 50% of sites use flexbox – only 2% use grid.

Chart showing that 50% of crawled sites use flexbox

Graphic showing that only 2% use CSS grid

Source

The highest found z-index value

Incredibly high z-index value with dozens of 9s and an !important

Source

Only 20% of sites make use of responsive images...

Chart showing the usage of responsive image markup: `sizes` 18%, `srcset` 21% and the picture element 2% usage

Source

Usage of the alt attribute on images

No surprise here, but yeah... image alt attributes are not used as much as they should be. :/

Paragraph from the almanac highlighting the following sentence: Only 39% of images use alt text that is longer than six characters.

Source

Edited: As Boris Schapira pointed out, images can be hidden from assistive technology by providing an empty alt attribute (alt=""). This fact was not taken into consideration by the Almanac and makes the statistic meaningless.
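Here's a rough Python sketch of what Boris's point means for counting: a missing `alt` attribute and an intentionally empty `alt=""` are different things and shouldn't land in the same bucket. The class name `AltAudit` and the three buckets are my own assumptions, not the Almanac's method.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Count <img> elements by the state of their alt attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.counts = {"missing": 0, "empty": 0, "present": 0}

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            # No alt attribute at all: this is the real problem case.
            self.counts["missing"] += 1
        elif not (attributes["alt"] or "").strip():
            # alt="" (or whitespace only): treated here as hidden from
            # assistive technology on purpose.
            self.counts["empty"] += 1
        else:
            self.counts["present"] += 1


audit = AltAudit()
audit.feed(
    '<img src="a.png">'
    '<img src="b.png" alt="">'
    '<img src="c.png" alt="Team photo at the conference">'
)
print(audit.counts)
```

A statistic that only looks at alt text length would count the second image as a failure even though it may be perfectly fine.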

26% of the pages use font-display. That's surprisingly high in my opinion, because the support is not super-duper yet. I wonder how big Google Fonts' influence is on this trend.

Graphic: 26% of pages use font-display

Source

Honestly, I expected fewer sites to be served over a secure connection. 80% of sites ship with HTTPS these days.

Chart showing that 80% of sites are served via HTTPS (mobile and desktop)

Source

12-14% of sites use HSTS to ensure that supporting browsers only access them via HTTPS. This is also higher than I expected.

Table showing the usage of HSTS: 12 (mobile) / 14 (desktop) percent use the `max-age` directive, 3 percent use `includeSubDomains` and 2 percent use `preload`

Source
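For reference, HSTS is a single response header, `Strict-Transport-Security`, and its directives are spelled `max-age`, `includeSubDomains` and `preload`. The following is a small Python sketch that splits such a header value into those three directives. The function name `parse_hsts` and the returned dictionary shape are assumptions for illustration, and it doesn't try to handle malformed values gracefully.

```python
def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            # The value is a number of seconds and may be quoted.
            policy["max_age"] = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


print(parse_hsts("max-age=31536000; includeSubDomains; preload"))
```

A site only counts towards the `max-age` number in the table if that directive is present; the other two are optional extras on top.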

I looked up this statistic myself recently, but it's still sooooo low.

Only roughly 5% of crawled sites use Content-Security-Policy (CSP).

Paragraph from the article: We find that only 5.51% of desktop pages include a CSP and only 4.73% of mobile pages include a CSP, likely due to the complexity of deployment.

Source

4 of 5 sites ship with color contrast issues. I really wish we'd get better at this. :/

Paragraph of the article: Only 22.04% of sites gave all of their text sufficient color contrast. Or in other words: 4 out of every 5 sites have text which easily blends into the background, making it unreadable

Source
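If you wonder what "sufficient color contrast" boils down to, here's a minimal Python sketch of the WCAG 2.x contrast ratio formula. The 4.5:1 threshold is the WCAG AA value for normal-size text; I'm assuming that's the bar meant here, the quoted paragraph doesn't say. The function names and the example colors are mine.

```python
def relative_luminance(rgb: tuple) -> float:
    """Relative luminance of an sRGB color (0-255 per channel), per WCAG 2.x."""

    def linearize(channel: int) -> float:
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(channel) for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: tuple, background: tuple) -> float:
    """Contrast ratio between two colors, from 1 (none) up to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # assumed threshold: WCAG AA for normal-size text

# Example: mid-gray text (#777777) on a white background.
ratio = contrast_ratio((119, 119, 119), (255, 255, 255))
print(f"{ratio:.2f}:1, passes AA: {ratio >= AA_NORMAL_TEXT}")
```

Light gray text on white is the classic offender: it looks elegant and lands just below the threshold.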

The often missing language attribute

26% of the pages don't specify the language of their content. This can cause trouble for text-to-speech technology like screen readers.

Paragraph of the article: Of the pages analyzed, 26.13% do not specify a language with the lang attribute.

Source

4 of 5 pages ship forms without labels for all of their input elements. :/ I'm used to these bad numbers, but well... filling out forms can be tough for everybody (even tech people), so we really have to get better at this. :/

Paragraph of the article with highlighted text: Sadly, only 22.33% of pages provide labels for all their form inputs, meaning 4 out of every 5 pages have forms that may be very difficult to fill out.

Source
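To show what "labels for all their form inputs" can mean in practice, here's a rough Python sketch that lists inputs without a label. It assumes an input counts as labelled when a `<label for>` points at its `id`, when it sits inside a `<label>`, or when it has an `aria-label`. The class name, the list of skipped input types and those rules are my simplifications, not the audit the Almanac used.

```python
from html.parser import HTMLParser

# Input types that don't need a visible label (assumption for this sketch).
UNLABELLED_TYPES = {"hidden", "submit", "button", "reset", "image"}


class LabelAudit(HTMLParser):
    """Collect <input> elements that have no associated label."""

    def __init__(self) -> None:
        super().__init__()
        self.label_targets = set()  # ids referenced by <label for="...">
        self.inputs = []            # (id, labelled_inline) per relevant input
        self.open_labels = 0        # number of currently open <label> elements

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label":
            self.open_labels += 1
            if attributes.get("for"):
                self.label_targets.add(attributes["for"])
        elif tag == "input":
            if (attributes.get("type") or "text").lower() in UNLABELLED_TYPES:
                return
            labelled_inline = self.open_labels > 0 or bool(attributes.get("aria-label"))
            self.inputs.append((attributes.get("id"), labelled_inline))

    def handle_endtag(self, tag):
        if tag == "label" and self.open_labels:
            self.open_labels -= 1

    def unlabelled(self) -> list:
        """Return the ids (or None) of inputs without any label."""
        return [
            input_id
            for input_id, labelled_inline in self.inputs
            if not labelled_inline and input_id not in self.label_targets
        ]


form_audit = LabelAudit()
form_audit.feed(
    '<form>'
    '<label for="email">Email</label><input id="email" type="email">'
    '<label>Name <input name="name"></label>'
    '<input id="phone" type="tel">'
    '<input type="submit">'
    '</form>'
)
print(form_audit.unlabelled())
```

In this made-up form only the phone field has no label, and a single field like that is enough for a page to fail the "all inputs labelled" check.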

10% of sites ship without headings at all.

Paragraph of the article with highlighted text: Despite the importance of headings, 10.67% of pages have no heading tags at all.

Source

Google shows 50-60 characters of a page title in its search results. Generally speaking, title lengths across the web are not optimal (at least for Google).

Graph showing the distribution of title length: the median title is 20 characters long, and the 25th percentile is 10 characters

Source
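A quick way to check your own pages against the 50-60 characters mentioned above is a few lines of Python. This is only a sketch: the 60-character limit is the upper end of that range used as an assumption, and `TitleExtractor` and `title_report` are names I made up.

```python
from html.parser import HTMLParser

DISPLAY_LIMIT = 60  # assumed cut-off: upper end of the 50-60 character range


class TitleExtractor(HTMLParser):
    """Collect the text content of the <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_report(html: str) -> tuple:
    """Return (title, length, fits) where fits means it is not cut off."""
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join(parser.title.split())  # collapse surrounding whitespace
    return title, len(title), len(title) <= DISPLAY_LIMIT


print(title_report("<html><head><title> Home </title></head><body></body></html>"))
```

A four-character title like "Home" fits easily, but it also wastes most of the space Google would show, which is what the median of 20 characters hints at.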

Service workers are mainstream, right? Not really... Only 0.44% of the crawled sites register a service worker.

Graphic showing that 0.44% of the crawled sites register a service worker

Source

How often do we click the wrong thing because something moved around? Too often... Jumpy pages are the standard. :/

2 of 3 pages have a medium or large content shift while loading.

CLS stands for Cumulative Layout Shift.

Paragraph of the article: Nearly two out of every three sites (65.32%) have medium or large CLS (Cumulative Layout Shift) for 50% or more of all user experiences.

Source

Speaking of tapping the wrong thing: only 34% of the pages include big enough buttons and links...

Graphic showing sufficient tap targets... With explanation: As of now, 34.43% of sites have sufficiently sized tap targets. So we have quite a ways to go until 'fat fingering' is a thing of the past

Source

WordPress usage is still massive. 75% of sites using a CMS are running on WordPress.

Graphic showing CMS distribution: WordPress is on top with 75% followed by Drupal and Joomla (both below 10%)

Source

Page weight and number of requests of CMS sites

CMS pages are heavy and make many requests... I did WordPress development in the past, and this makes sense when you think of the audience and users of e.g. WordPress. "Just install another plugin"...

Resource consumption of CMS sites: median page weight is 2.3MB and median request count is ~85

Source

HTML is mainly served from its origin server (80%). The most used CDN is Cloudflare (~10%).

Distribution of CDN usage for HTML: 80% origin, 9.61% Cloudflare, 5.54% Google

Source

I thought the median value for page weight would be higher these days. :D On desktop it's 1.9MB and on mobile it's 1.7MB. It's still fairly high though, imo (and the median is clearly only one piece of the puzzle).

Tables showing the distribution of page weight across mobile and desktop: the median page weight is 1.9MB on desktop and 1.7MB on mobile; the 90th percentile is 6.9MB on desktop and 6.2MB on mobile

Source

And that's it. I highly recommend checking it out! It's a fascinating read about the state of the internet. :)


Stefan Judis | Sciencx (2019-11-30T23:00:00+00:00) Highlights of the HTTP Archive Web Almanac (#blogPost). Retrieved from https://www.scien.cx/2019/11/30/highlights-of-the-http-archive-web-almanac-blogpost/
