How much memory is reserved for cute kitten pictures?

post by Bezzi · 2021-10-04T21:30:39.646Z · LW · GW · No comments

This is a question post.

In my social circles, I frequently tell a joke about the sheer number of cute kitten pictures available on the internet ("somewhere in the world, a whole server farm is doing nothing but storing pictures of cute kittens"). Joking aside, there are thousands of data centers around the world, and the world's total data storage capacity is measured in zettabytes.

How much of this storage do you think is actually occupied by cute kitten pictures? What would be an effective way to make a Fermi estimate?

Answers

answer by Quintin Pope · 2021-10-04T23:18:35.246Z · LW(p) · GW(p)

Look at ImageNet (https://image-net.org/index.php) tags and find the percentage of them that are kitten pictures. The International Data Corporation estimates there are around 6.8 zettabytes of storage globally (https://www.idc.com/getdoc.jsp?containerId=prUS46303920). Now we just need the fraction of total storage dedicated to consumer images. Maybe 2%?

I’d guess something like (0.1% kitten pictures) x (2% consumer images) x (6.8 zettabytes) = 21,500 terabytes of kitten images.
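The multiplication above can be checked with a short script. This is only a sketch using the three inputs exactly as written (0.1%, 2%, 6.8 zettabytes) and decimal units (1 ZB = 10^21 bytes); the constant and function names are made up for illustration. With these exact inputs the product comes out near 136,000 TB (136 PB), so the 21,500 TB figure does not follow from them as stated, though both land within an order of magnitude of each other.

```python
# Inputs quoted in the answer above; all three are rough guesses.
KITTEN_FRACTION = 0.001          # 0.1% of images are kitten pictures
CONSUMER_IMAGE_FRACTION = 0.02   # 2% of storage holds consumer images
GLOBAL_STORAGE_ZB = 6.8          # IDC figure cited above, in zettabytes

# Decimal (SI) unit sizes, assumed here rather than binary ones.
BYTES_PER_ZB = 10**21
BYTES_PER_PB = 10**15
BYTES_PER_TB = 10**12


def kitten_bytes(kitten_fraction, image_fraction, storage_zb):
    """Multiply the three factors of the Fermi estimate; returns bytes."""
    return kitten_fraction * image_fraction * storage_zb * BYTES_PER_ZB


total = kitten_bytes(KITTEN_FRACTION, CONSUMER_IMAGE_FRACTION, GLOBAL_STORAGE_ZB)
print(f"{total / BYTES_PER_TB:,.0f} TB = {total / BYTES_PER_PB:,.1f} PB")
```

Changing any one input by a factor of ten moves the result by the same factor, which is why the choice of the kitten fraction and the consumer-image fraction dominates the answer.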

comment by JBlack · 2021-10-05T04:14:59.864Z · LW(p) · GW(p)

Oh no! You missed out on stating this as 21.5 PETabytes!

I arrived at a similar order of magnitude via a different path. I suspect that almost all cute kitten pictures are stored in just a few (1-5?) locations each. I assumed a power law such that maybe 1% of the world population keeps 100-ish kitten pictures (almost all of which are cute by default), and maybe 5% more keep an average of 10-ish pictures. I'd estimate each picture to be on the order of a megabyte.

That yields around 36 Petabytes.
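For anyone who wants to vary the guesses, here is a rough sketch of that calculation. Two inputs are assumptions not stated above: a world population of 8 billion, and 3 stored copies per picture (the middle of the 1-5 range). All names are illustrative.

```python
# Assumptions not given in the comment: population and copy count.
WORLD_POPULATION = 8e9      # assumed round figure
COPIES_PER_PICTURE = 3      # assumed midpoint of the "1-5 locations" guess

# Guesses taken from the comment above.
HEAVY_KEEPER_SHARE = 0.01   # 1% of people keep ~100 kitten pictures
HEAVY_KEEPER_PICTURES = 100
LIGHT_KEEPER_SHARE = 0.05   # 5% more keep ~10 pictures on average
LIGHT_KEEPER_PICTURES = 10
BYTES_PER_PICTURE = 1e6     # about a megabyte each

BYTES_PER_PB = 1e15

# Unique pictures held by each group of keepers.
unique_pictures = WORLD_POPULATION * (
    HEAVY_KEEPER_SHARE * HEAVY_KEEPER_PICTURES
    + LIGHT_KEEPER_SHARE * LIGHT_KEEPER_PICTURES
)

# Total storage once every picture is counted at each stored location.
total_pb = unique_pictures * BYTES_PER_PICTURE * COPIES_PER_PICTURE / BYTES_PER_PB
print(f"{unique_pictures:.2e} unique pictures, about {total_pb:.0f} PB stored")
```

Under these assumptions the heavy keepers account for two thirds of the pictures, so the estimate is most sensitive to the 1% and 100-picture guesses.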

comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2021-10-05T05:54:38.445Z · LW(p) · GW(p)

ImageNet was constructed to match the WordNet hierarchy, and is not representative of the distribution of images stored online. I'd guess that cat pics are 10x to 10,000x overrepresented there.

I'd also be shocked if consumer images are even 0.1% of all data stored; there's a huge volume of other heavier datasets out there.
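To see how much these two corrections matter, here is an illustrative sketch that plugs them into the parent answer's inputs. It assumes the parent's 0.1% kitten share and 6.8 zettabytes, treats 0.1% as an upper bound on the consumer-image share, and uses decimal units; the names are hypothetical and the output is a range of upper bounds, not a prediction.

```python
GLOBAL_STORAGE_BYTES = 6.8e21        # 6.8 ZB, from the parent answer
IMAGENET_KITTEN_FRACTION = 0.001     # 0.1% kitten share, from the parent answer
CONSUMER_IMAGE_FRACTION_MAX = 0.001  # 0.1%, treated here as an upper bound
OVERREPRESENTATION_FACTORS = (10, 10_000)  # guessed range of ImageNet bias

BYTES_PER_TB = 1e12


def adjusted_upper_bound_tb(overrepresentation):
    """Scale the kitten share down by the bias factor; return terabytes."""
    kitten_fraction = IMAGENET_KITTEN_FRACTION / overrepresentation
    total_bytes = kitten_fraction * CONSUMER_IMAGE_FRACTION_MAX * GLOBAL_STORAGE_BYTES
    return total_bytes / BYTES_PER_TB


for factor in OVERREPRESENTATION_FACTORS:
    print(f"{factor:>6}x overrepresented: at most ~{adjusted_upper_bound_tb(factor):,.2f} TB")
```

The two ends of the range differ by a factor of 1,000, so the uncertainty in the bias factor swamps everything else in this version of the estimate.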
