How much data should my Service Worker put upfront in the offline cache?

I love it when websites and apps keep working even when I'm offline. I made my SVG game esviji work offline thanks to appcache, just after attending Jake Archibald's talk about why Application Cache is a Douchebag during the 2012 edition of the Paris Web conference. Fortunately, we now have Service Workers (in some browsers), which give us more control over this kind of cache for offline browsing. But as Uncle Ben says, “With Great Power Comes Great Responsibility”.

Just like with appcache, it is possible with Service Workers to put a full website in the cache when loading the first visited page.

This is very interesting, because you can then go offline and browse the whole site just as if you were online, without even noticing you're offline. The cache is then updated when you visit pages of the site while online. Depending on the nature of the content, you can either fetch the page from the server when the user requests it, so that she gets the up-to-date version[1], or show the cached version first and update the cache only for subsequent visits.
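The two approaches described above can be sketched roughly as follows. This is only an illustration, not code from Jeremy's site or from mine: the `PageStore` and `Fetcher` types and the `memoryStore` helper are hypothetical stand-ins for the browser's Cache API and `fetch`, so that the logic can be read and run outside a Service Worker.

```typescript
// Hypothetical stand-ins for the Cache API and fetch(), for illustration only.
interface PageStore {
  get(url: string): Promise<string | undefined>;
  put(url: string, body: string): Promise<void>;
}
type Fetcher = (url: string) => Promise<string>;

// A tiny in-memory store playing the role of the offline cache.
function memoryStore(): PageStore {
  const pages: Record<string, string> = {};
  return {
    get: async (url) => pages[url],
    put: async (url, body) => {
      pages[url] = body;
    },
  };
}

// Network first: serve the up-to-date page, fall back to the cache when offline.
async function networkFirst(url: string, fetcher: Fetcher, store: PageStore): Promise<string> {
  try {
    const fresh = await fetcher(url);
    await store.put(url, fresh);
    return fresh;
  } catch (err) {
    const cached = await store.get(url);
    if (cached === undefined) throw err; // nothing cached: nothing to show
    return cached;
  }
}

// Cache first: show the cached page right away and refresh the cache
// in the background, so only subsequent visits get the new version.
async function cacheThenUpdate(url: string, fetcher: Fetcher, store: PageStore): Promise<string> {
  const cached = await store.get(url);
  const refresh = fetcher(url).then(async (fresh) => {
    await store.put(url, fresh);
    return fresh;
  });
  if (cached !== undefined) {
    refresh.catch(() => undefined); // offline: silently keep the cached copy
    return cached;
  }
  return refresh; // first visit: there is nothing to show but the network response
}
```

In a real Service Worker, the same decisions would live in the `fetch` event handler, with `caches.open()` and `fetch()` in place of the hypothetical helpers.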

All of this is really well explained by Jeremy Keith in a series of posts on his blog, including the recent one about Making Resilient Web Design work offline.

Resilient Web Design is a Web book Jeremy wrote a few weeks ago. I urge you to read it: it's really great, like most of Jeremy's creations.

Here's an extract of the book's introduction:

The World Wide Web has been around for long enough now that we can begin to evaluate the twists and turns of its evolution. I wrote this book to highlight some of the approaches to web design that have proven to be resilient. I didn’t do this purely out of historical interest (although I am fascinated by the already rich history of our young industry). In learning from the past, I believe we can better prepare for the future.

So, back to the topic of this post.

Jeremy had the great idea of making this book available offline thanks to a Service Worker, so you can visit it once, even only one page of it, and then read the whole book while offline, for example while commuting on the Paris subway like me[2].

This is great! There is a lot to come for the Web thanks to such features, assembled in Progressive Web Apps[3].

But it means Jeremy chose to fetch the whole site's content and resources in every capable browser[4], even if the user only wants to read the introduction and decides that she doesn't need to read the rest. I would call her a fool, of course, but it might happen.

According to my browser's network panel or WebPagetest, it means almost 16 MB are downloaded right away when you access one page of the site.

The Resilient Web Design web book audited by WebPagetest

The site is very fast, and all checks are green, but that's because most of the downloads happen asynchronously, after the visited page has been rendered.

I must confess I did almost the same thing for a while in my game esviji when I started using appcache, because I put almost 2 MB of audio files in the cache. I decided later that offline users could play without sound, so I removed them from the cache.

For a small site/app that takes 2 or 3 MB, I can accept downloading everything, because having all of it available while offline can be really convenient. But I think 16 MB is really too much.

Just to illustrate, it means that one visit to this site will cost a Mauritanian at least 10 % of his daily income, according to Tim Kadlec's simulation on What Does My Site Cost?.

Cost of visiting this website as a percentage of daily income

Only 0.24 % for Jeremy in the UK or 0.28 % for me in France, but we are here because we love the World Wide Web, not the Wealthy Westerners' Web, as presented by Bruce Lawson during the 2016 edition of the Paris Web conference.

Because I use it quite a lot these days to check my own Progressive Web Apps, I thought it would be nice if Lighthouse, the Chrome extension that checks web pages against a growing list of best practices, included a check on total page weight. It looks like Hubert Sablonnière already had this idea and created an issue, which got support from Paul Irish, so it will come sooner or later.

For my own website, I first thought I would only cache visited pages. But I now cache the homepage, the two about pages, and the last post, regardless of the page on which the user arrives, for a really light total weight of 87 KB of additional resources. The offline fallback page lists the pages that are in the cache, so that the user can discover some unknown content even when she's offline. This is a work in progress, so it might break, and it will change over the coming weeks, because I might adjust my strategy.

There is a user setting to "save data" in some browsers. Enabling it adds a new HTTP header that we can test in our Service Workers, as shown by Dean Hume in his post Service Workers: Save your User's Data using the Save-Data Header. But I think most people who are not as tech-savvy as us will never notice this setting, so it's a nice-to-have, but it's not enough.
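Here is a minimal sketch of such a test, assuming the browser sends `Save-Data: on` when the setting is enabled. The `HeaderReader` interface is a hypothetical stand-in that mirrors the `get` method of the standard `Headers` object, and the two function names are mine, not taken from Dean Hume's post.

```typescript
// Hypothetical minimal view of a Headers object: only get() is needed.
interface HeaderReader {
  get(name: string): string | null;
}

// True when the request carries the "Save-Data: on" hint.
function wantsToSaveData(headers: HeaderReader): boolean {
  const value = headers.get("Save-Data");
  return value !== null && value.trim().toLowerCase() === "on";
}

// Cache only the essential files for data-saving users,
// and the optional extras as well for everyone else.
function resourcesToCache(headers: HeaderReader, essential: string[], extras: string[]): string[] {
  return wantsToSaveData(headers) ? essential.slice() : essential.concat(extras);
}
```

In a Service Worker's `fetch` handler, `event.request.headers` could be passed as the `headers` argument.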

So, it might be nicer to initially cache only the files needed to enhance the performance of the site and provide a clean offline fallback, then add the pages when they are visited, and provide the user with an option to cache the whole site, or part of it, for future offline browsing.
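As a rough sketch of this idea, the list of URLs to cache could be computed from three sources: the minimal set of files, the pages actually visited, and the whole site only after an explicit opt-in. All names here (`CachingChoice`, `urlsToCache`, and the fields) are hypothetical and only illustrate the decision, not how any particular site implements it.

```typescript
// Hypothetical description of what could be cached, and of the user's choice.
interface CachingChoice {
  shell: string[]; // files needed for performance and the offline fallback
  visited: string[]; // pages the user has actually opened
  everything: string[]; // the whole site
  userWantsEverything: boolean; // explicit opt-in, for example a "Save offline" button
}

// Returns the URLs to keep in the cache, without duplicates, in a stable order.
function urlsToCache(choice: CachingChoice): string[] {
  const candidates = choice.shell.concat(
    choice.visited,
    choice.userWantsEverything ? choice.everything : []
  );
  const unique: string[] = [];
  for (const url of candidates) {
    if (unique.indexOf(url) === -1) unique.push(url);
  }
  return unique;
}
```

Without the opt-in, only the shell and the visited pages are cached, so the cost of the first visit stays small.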

It would be less magical, indeed, but more respectful of users with limited and/or costly data plans.

I don't know whether Jeremy thought about this, but I hope there will be some discussion about it in the community, because Service Workers give us a lot of power that could be abused, either by people unaware of the damage it can cause, or even on purpose, just because it helps make websites faster. When the average page already weighs more than 2 MB, we really have to be careful.

To conclude, it's kind of amusing to see that Jeremy also provides links to download other versions of the book, including PDF, epub and mobi, and most of these files weigh less than 16 MB.

February 25, 2017 update: Lighthouse will now give a lower score if total byte weight is too high.

August 1st, 2019 update: Lighthouse's total byte weight audit unfortunately just checks the weight of the page and its resources, without counting any request performed by the Service Worker. So I opened another issue: #9493 "total byte weight" but for the service worker installation and activation


  1. Be careful, you can still get a not so up-to-date version if the page is taken from the traditional browser cache. Yes, "it's complicated" sometimes, as shown in this awesome post written by Yoav Weiss. ↩︎

  2. Well, that's pure fiction, because I have an iPhone, and Apple hasn't yet worked on supporting Service Workers in iOS. Just like Scott Jehl, "As an iOS user, the lack of Service Worker support in Safari is almost enough for me to switch to Android. Almost.". ↩︎

  3. You can read more about Progressive Web Apps in French on my company's blog: Les Progressive Web Apps pour booster l’UX de vos services. ↩︎

  4. As of today, these include only Firefox, Chrome and Opera. ↩︎

78 Webmentions

24 replies

  1. Boris
    save the visited pages only. Add a CTA to propose full download.
  2. Boris
    this is a book so it's different. But I'd rather advice to offer this feature instead of forcing it w/o the user consent.
  3. Smashing Magazine
    Good point! Well, basically we are using Service Workers Cache to store all assets and a couple of pages, but not images.
  4. Smashing Magazine
    Instead, we are serving offline placeholders. CSS/JS/fonts are stored though + a couple of HTML pages.
  5. Smashing Magazine
  6. Philippe Vayssière
    I was very pleased being able to read half of it on my two ~8H journeys in TGV. But that's only acceptable for THIS website, sure
  7. Nicolas Hoizey
    I see that indeed, I didn’t take time yet to read your huge SW… 😉
  8. Nicolas Hoizey
    true
  9. Nicolas Hoizey
  10. Smashing Magazine
    No worries! :-)
  11. Doron Sherman
    since UX and cost are prime considerations, the SW can act as a smart agent representing the user's best interests, adapt & obey.
  12. Nicolas Hoizey
    I agree
  13. Tim Kadlec
    Easy solution would be to provide a "Save Offline" button so user determines. BUT..
  14. Owen Campbell-Moore
    totally agree with problem. My take is that the system is best placed to make these decisions for our ~1B users.
  15. Owen Campbell-Moore
    Now we need to do that! I often get asked "how much should I pre-cache, and keep in cache?" No good answer today.
  16. Nicolas Hoizey
    you could also ask the user if she wants to cache anything at all, just like you ask for geolocation or other APIs. +@ChromiumDev
  17. Owen Campbell-Moore
    I don't think users are best placed to understand costs vs benefit to storage / bandwidth. System way better informed.
  18. Owen Campbell-Moore
    (users like you or me may be well placed and understand, but we're not the 99% that just want to read the news etc)
  19. Nicolas Hoizey
    I agree most users wouldn’t know, unfortunately. +@ChromiumDev
  20. antoinefauchié
    hence my issue → github.com/adactio/resili… 🤔
  21. Nicolas Hoizey
    I remember, yes… 😉
    Unfortunately, he never reacted to it, nor to my pings on Twitter about it…
  22. WebPageTest bot
    No problem @nhoizey. I submitted the test for nicolas-hoizey.com/2017/01/how-mu… to webpagetest.org, check the result at webpagetest.org/result/180324_…
  23. Chris Love
    yeah! another layer of architecture I can add to my PWA course :) Actually this is important and I had not thought of it at this level yet. I have logic to intelligently purge caches based on TTL or max items and even a manifest. Need to adjust for user choice now
  24. Nicolas Hoizey
    Awesome, this post is still useful then! 😉
