How and where does Google store their data?
It’s pretty safe to assume that Google today probably has more data than any other organization on the planet. Like any major organization, however, Google is very secretive about its inner workings, so we don’t really know for sure how much data the company manages or how it manages it. That doesn’t mean we cannot estimate it, though.
Follow the trail
One benefit of a company being public is the information that becomes public as a result. Though Google does not share much, we do know that the company’s aggregate capital expenditure is $12 billion. Their most expensive data center has a price tag of around half a billion dollars. By rough calculation, it’s safe to assume they operate around 20 data centers.
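The rough calculation is not spelled out, but one plausible reading is a simple division of the two figures quoted. The sketch below assumes, purely for illustration, that all of the capital expenditure went into data centers and that each one cost as much as the most expensive. Neither assumption comes from Google, and the variable names are hypothetical.

```python
# Back-of-envelope data center count from the two figures in the text.
# Assumption (not from Google): every dollar of capital expenditure went
# into data centers, each costing as much as the most expensive one.
TOTAL_CAPEX_USD = 12e9        # aggregate capital expenditure
COST_PER_DATA_CENTER = 0.5e9  # price tag of the most expensive site

estimated_sites = TOTAL_CAPEX_USD / COST_PER_DATA_CENTER
print(f"Estimated data centers: about {estimated_sites:.0f}")
```

That lands at 24, which is in the same ballpark as the figure of around 20 once you allow for spending on things other than buildings.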
On their website, Google acknowledges that they have data centers in the following locations:
- Berkeley County, South Carolina
- Council Bluffs, Iowa
- Atlanta, Georgia
- Mayes County, Oklahoma
- Lenoir, North Carolina
- The Dalles, Oregon
- Hong Kong
- Hamina, Finland
- St Ghislain, Belgium
- Dublin, Ireland
- Quilicura, Chile
In addition, they appear to operate a number of other large data centers (sometimes through subsidiary corporations), including:
- Eemshaven, Netherlands
- Groningen, Netherlands
- Budapest, Hungary
- Wrocław, Poland
- Reston, Virginia
- Additional sites near Atlanta, Georgia
The next step in our quest is to figure out how many servers Google runs at these data centers. Since we can’t access any of the data centers directly, we look at their energy consumption.
The company disclosed that in 2010 they consumed an average of 258 megawatts of power.
We also know that these data centers are highly efficient, with only 10-20% of the power being spent on cooling. To estimate how much energy a server consumes, let’s look back at the “container data center” concept from 2005. Although we have no way of verifying whether they actually use this concept today, it gives us a fair idea of what Google considers reasonable power consumption. The result: 215 watts per server.
Using this information along with the previous data, we can estimate that Google was running around 1 million servers back at the turn of the decade. The company has grown by leaps and bounds since, and by 2013 it was estimated that the capital invested in data centers could’ve been 3x to 4x that of 2010. Based on this knowledge, we can assume the number of servers operated by the company to be between 1.8 and 2.4 million.
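To see where the figure of around 1 million servers comes from, here is a minimal sketch of the arithmetic using only the numbers quoted above. It assumes that all power not spent on cooling goes to servers, which is a simplification, since networking gear and other equipment also draw power. The function and constant names are made up for this example.

```python
# Rough server-count estimate from the 2010 figures quoted in the text.
AVG_POWER_WATTS = 258e6   # 258 MW average draw disclosed for 2010
WATTS_PER_SERVER = 215    # per-server figure from the 2005 container design


def estimate_servers(cooling_fraction: float) -> float:
    """Servers supportable once cooling's share of the power is removed.

    Assumption: all non-cooling power goes to servers, which ignores
    networking equipment and any other loads.
    """
    server_power = AVG_POWER_WATTS * (1 - cooling_fraction)
    return server_power / WATTS_PER_SERVER


low = estimate_servers(0.20)   # 20% of power spent on cooling
high = estimate_servers(0.10)  # 10% of power spent on cooling
print(f"Estimated servers in 2010: {low:,.0f} to {high:,.0f}")
```

Under these assumptions the range works out to roughly 0.96 to 1.08 million servers, which matches the figure of about 1 million.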
Calculate the storage
Now, this number doesn’t necessarily mean they’re all relevant to us. Some of these servers might just be part of one of the many tests and experiments that Google is known for. And with such a wide range in the final number, we can never be too sure of the actual number of servers in use at their data centers.
If we assume that each of these servers has a couple of 2 TB hard drives, we come to the mind-boggling figure of roughly 10 exabytes! This isn’t too hard to believe when you consider that approximately 8 exabytes of hard drive storage is shipped annually. While that figure might not include Google, the company has been active for quite a few years, and the storage would’ve accumulated over that time.
If that number still feels too high, let us remember that we haven’t even considered cold storage yet. We have no definitive way of knowing how much of the storage space is cold storage, so the final number might be even higher. According to Paul Mah of SMB Tech, Simon Anderson of Tandberg Data mentioned in a phone interview back in 2011 that Google is the single largest buyer of magnetic tapes, ordering around 200,000 per year. Assuming that number has since increased, we wouldn’t be too surprised if the total is higher than 10 exabytes.
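The storage figure can be checked with a quick calculation. This sketch reads "a couple" as exactly two drives per server and uses decimal units (1 exabyte = 1,000,000 terabytes). Both are assumptions made for illustration, as are the names used.

```python
# Rough storage estimate from the server range of 1.8 to 2.4 million.
DRIVES_PER_SERVER = 2   # assumption: "a couple" taken to mean two
TB_PER_DRIVE = 2
TB_PER_EB = 1_000_000   # decimal units: 1 EB = 1,000,000 TB


def storage_exabytes(servers: int) -> float:
    """Total raw disk capacity in exabytes for a given server count."""
    return servers * DRIVES_PER_SERVER * TB_PER_DRIVE / TB_PER_EB


low_eb = storage_exabytes(1_800_000)
high_eb = storage_exabytes(2_400_000)
print(f"Estimated disk storage: {low_eb:.1f} to {high_eb:.1f} EB")
```

With these inputs the range is about 7.2 to 9.6 exabytes, so the headline figure of 10 exabytes sits at the rounded-up top of the range.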