So, let’s do this again, but this time cleanly. In a Facebook post, Michael Seemann explained why the Facebook app does not listen to every word you ever say, all of the time.
He is right. A telephone is a device with a limited power supply, limited cooling and limited, metered connectivity. It has an operating system that monitors and manages these critical resources, hard. You can’t listen to things all of the time and expect not to be noticed. Like, “the battery is empty and my LTE budget is gone” noticed.
Other devices, an Alexa, a Sonos One or a Google Home, are on cabled power and unmetered Wi-Fi. They could theoretically get away with listening all of the time.
So how much data is that? Let’s do the math.
Let’s assume one human talks, on average, one hour each day. We are not recording environmental noise or gaps in speech. Just all the words.
Let’s also assume there are, on average, at most 20,000 talking days (about 55 years) in a human life.
Let’s assume a really frugal speech codec, like G.723.1, at about 750 bytes/s (roughly 6 kbit/s).
One hour of talking is 750 bytes/s * 3600 s = 2 700 000 bytes per hour, or about 2.6 MB per hour. A less frugal codec would consume about 10x this.
A lifetime of speech for one human is 20,000 hours at about 2.6 MB each, so around 50 GB of recording.
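The per-person numbers are easy to check in a few lines of Python. This is only a sketch of the arithmetic above; the constant names are mine, and the values are the assumptions stated in the text.

```python
# Assumptions from the text; names are illustrative only.
BYTES_PER_SECOND = 750   # frugal speech codec, roughly 6 kbit/s
HOURS_PER_DAY = 1        # speech per person per day
TALKING_DAYS = 20_000    # about 55 years

bytes_per_hour = BYTES_PER_SECOND * 3600
lifetime_bytes = bytes_per_hour * HOURS_PER_DAY * TALKING_DAYS

print(f"per hour:     {bytes_per_hour:,} bytes ({bytes_per_hour / 2**20:.1f} MiB)")
print(f"per lifetime: {lifetime_bytes / 1e9:.0f} GB")
```

The exact product is 54 GB; the text rounds that down to 50 GB, which is well within the error of the assumptions.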
There are currently 7.5 billion (7.5 × 10^9) humans alive. Let’s assume that 5 times as many humans have ever lived, each needing 20,000 hours, or 50 GB, of space.
So 37 500 000 000 humans consume 50 GB each, that’s 1 875 000 000 000 GB, 1 875 000 000 TB, 1 875 000 PB, 1 875 EB of storage.
I am simplifying by rounding up to 2000 EB, or 200 million 10 TB drives. We can fit 12 of these in a 1U server, so that is 16 666 666 servers.
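The chain of unit conversions is where it is easiest to slip by a factor of 1000, so here is a hedged Python sketch of it. It assumes decimal units (1 EB = 1000 PB = 10^6 TB = 10^9 GB) and uses the rounded 2000 EB figure; the variable names are made up for illustration.

```python
import math

HUMANS_EVER = 5 * 7_500_000_000   # 37.5 billion, as assumed in the text
GB_PER_HUMAN = 50
DRIVE_TB = 10
DRIVES_PER_SERVER = 12

total_gb = HUMANS_EVER * GB_PER_HUMAN
total_tb = total_gb // 1000
total_pb = total_tb // 1000
total_eb = total_pb // 1000

rounded_eb = 2000                              # the text rounds 1875 EB up
drives = rounded_eb * 1_000_000 // DRIVE_TB    # 1 EB = 1 000 000 TB
servers = math.ceil(drives / DRIVES_PER_SERVER)

print(f"{total_gb:,} GB = {total_tb:,} TB = {total_pb:,} PB = {total_eb:,} EB")
print(f"{drives:,} drives in {servers:,} servers")
```

Rounding the server count up gives one more server than a truncating division; at this scale the difference is noise.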
Google or Facebook each have around 3 million servers. So, no, not possible: at almost 17 million servers needed, we are short by a factor of 5 to 6.
On the other hand, if we made no error with the assumptions, it is barely possible for all living humans: 7.5 billion times 50 GB is 375 EB, or a bit more than 3 million servers. It is quite possible for a few hundred million people.
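To play with the assumptions, the whole estimate fits in one small function. This is a sketch, not a capacity plan: `servers_needed` and `FLEET` are hypothetical names, the defaults are the values assumed above, and the 500 million row is just an example population.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division, rounded up."""
    return -(-a // b)

def servers_needed(people: int, gb_per_person: int = 50,
                   drive_tb: int = 10, drives_per_server: int = 12) -> int:
    """1U servers needed to store one lifetime of speech per person."""
    total_gb = people * gb_per_person
    drives = ceil_div(total_gb, drive_tb * 1000)   # 1 TB = 1000 GB
    return ceil_div(drives, drives_per_server)

FLEET = 3_000_000   # assumed server count of one large operator

for label, people in [("all humans ever", 37_500_000_000),
                      ("all living humans", 7_500_000_000),
                      ("500 million people", 500_000_000)]:
    n = servers_needed(people)
    print(f"{label:>20}: {n:>12,} servers, {n / FLEET:.2f}x the fleet")
```

Without rounding up to 2000 EB, all humans ever come out at a bit over 5 fleets, all living humans at almost exactly one fleet, and 500 million people at well under a tenth of one.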