Google’s Internet Services Have 2 Billion Lines Of Code, 40 Times The Size Of Windows
Google engineering manager Rachel Potvin estimated at an engineering conference in Silicon Valley on Monday that the software required to run all of Google’s Internet services, from Gmail to Google Search to Google Maps, spans about 2 billion lines of code. Microsoft’s Windows operating system, by comparison, is likely around 50 million lines of code. That makes the software behind Google’s services roughly 40 times the size of Windows.
These 2 billion lines of code, which run Google Search, Google Maps, Google Docs, Google+, Google Calendar, Gmail, YouTube, and every other Google Internet service, all sit in a single code repository accessible to all 25,000 Google engineers.
Within the company, Google treats its code like a very large operating system. “Though we can’t prove it,” Potvin says, “I would guess this is the largest single repository in use anywhere in the world.”
Only coders inside Google have access to its enormous repository. In some ways, though, it is similar to GitHub, the public open source repository where engineers can share enormous amounts of code with the internet at large.
“Having 25,000 developers, as Google does, means it’s sharing code with a diverse set of people with a diverse set of skills,” says Sam Lambert, the director of systems at GitHub. “But, as a small company, you can get some of that same advantage using GitHub and open source. There’s that saying: ‘A rising tide raises all boats.’”
On the downside, it is no simple task to build and run a 2-billion-line monolith. “It must be a technical challenge, a huge feat,” Lambert says. “The numbers are absolutely staggering.”
The best thing about GitHub is that it lets coders easily share code and work on it together. But GitHub spans millions of separate projects rather than housing one giant one, whereas Google combines many projects into a single repository. That might look a little foolish given the difficulty of juggling that much code across that many engineers. However, it works, according to Potvin.
Listen to the Piper
To juggle all this code, Google has built its own “version control system” called Piper, which runs across the immense online infrastructure it has built to run all its online services. The system spans 10 different Google data centers, according to Potvin.
This system gives Google engineers unusual freedom to use and combine code from across countless projects. Further, any single code change made by an engineer can be deployed immediately across all Google services. Make one update, and everything gets updated.
There are limits, however. As Potvin points out, highly sensitive code, such as Google’s PageRank search algorithm, resides in separate repositories available only to specific employees. The code for Android and Chrome is also stored in separate version control systems. For the most part, though, Google’s code is a monolith that allows for the free flow of software building blocks, ideas, and solutions.
The Bot Factor
Building and running such a system requires not only know-how but also very large amounts of computing power, Lambert points out. Piper spans about 85 terabytes of data (85,000 gigabytes), and Google’s 25,000 engineers make about 45,000 commits (changes) to the repository each day. Google engineers modify 15 million lines of code across 250,000 files each week. By comparison, the entire Linux open source operating system spans 15 million lines of code across 40,000 software files.
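To put those figures in perspective, here is a minimal Python sketch that works through the arithmetic. It uses only the estimates quoted in this article. The function name `repo_stats` and the derived averages are illustrative assumptions, not numbers Google has published, and the per-engineer figure is a simple average that ignores any commits made by automated tools.

```python
# Back-of-the-envelope arithmetic using only the estimates quoted in the article.
# All derived values are simple averages; `repo_stats` is a hypothetical helper.

GOOGLE_TOTAL_LINES = 2_000_000_000   # lines in Google's main repository
WINDOWS_TOTAL_LINES = 50_000_000     # estimated size of Windows
ENGINEERS = 25_000                   # Google engineers with access
COMMITS_PER_DAY = 45_000             # commits to the repository per day
LINES_CHANGED_PER_WEEK = 15_000_000  # lines modified per week
FILES_CHANGED_PER_WEEK = 250_000     # files touched per week
LINUX_TOTAL_LINES = 15_000_000       # size of Linux, as quoted
LINUX_TOTAL_FILES = 40_000           # files in Linux, as quoted


def repo_stats() -> dict:
    """Return ratios derived from the quoted estimates."""
    return {
        # 2 billion / 50 million = 40
        "google_vs_windows": GOOGLE_TOTAL_LINES / WINDOWS_TOTAL_LINES,
        # 45,000 / 25,000 = 1.8 commits per engineer per day, on average
        "commits_per_engineer_per_day": COMMITS_PER_DAY / ENGINEERS,
        # 15 million / 250,000 = 60 modified lines per touched file
        "lines_changed_per_file": LINES_CHANGED_PER_WEEK / FILES_CHANGED_PER_WEEK,
        # 15 million / 2 billion = 0.0075, i.e. 0.75% of the repository per week
        "weekly_churn_fraction": LINES_CHANGED_PER_WEEK / GOOGLE_TOTAL_LINES,
        # Weekly churn equals the quoted size of Linux, so this ratio is 1
        "weekly_churn_vs_linux": LINES_CHANGED_PER_WEEK / LINUX_TOTAL_LINES,
        # 15 million / 40,000 = 375 lines per Linux file, on average
        "linux_lines_per_file": LINUX_TOTAL_LINES / LINUX_TOTAL_FILES,
    }


if __name__ == "__main__":
    for name, value in repo_stats().items():
        print(f"{name}: {value:g}")
```

In other words, the lines Google modifies in a single week add up to roughly the quoted size of the whole Linux code base, yet that is still under 1 percent of the repository.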
At the same time, Piper must take much of the burden off human coders. It must make sure that humans can wrap their heads around all that code without stepping on each other’s toes with code changes, and it must help remove unused code and bugs from the repository, making life easier for humans. Since Google switched to Piper from its previous version control system, a tool called Perforce, it has started generating a lot of the data and configuration files required to run the company’s software.
Potvin explains that it is not only humans who maintain the health of the code, ensuring that changes are made and bugs removed, but also bots.
Piper for Everyone
Many of today’s high-tech internet companies appear to operate similarly. Facebook treats its main app, which spans upwards of 20 million lines of code, as a single project. Others do the same on a smaller scale. But the logistics can become a hindrance when companies become as big as Google or Facebook. Google and Facebook, however, are searching for ways to change that for everyone.
Both are currently working on an open source version control system that anyone can use to juggle code on an exceptionally large scale. It’s based on an existing system called Mercurial. “We’re attempting to see if we can scale Mercurial to the size of the Google repository,” Potvin says, pointing out that Google is working in tandem with programming guru Bryan O’Sullivan, who helps oversee coding work at Facebook.
Very few companies today juggle as much code as Google or Facebook do. In the near future, however, more of them will.