Friday, 27 January 2017

Quora

OK, I indexed Google's source for an internal code browsing tool (see: Indexing Google's Source Code), so I can talk a bit about this.
First of all, not all 2B lines of code were in one language. Roughly 50% of the code was in C++, 25% in Java, and 25% in Python and other random small languages like Sawzall and protocol buffer declarations. (Maybe Go as well, nowadays.)
So it's about 1B lines of C++, 500M of Java, and 500M of Python and everything else.
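To make the arithmetic explicit, here is a minimal sketch that turns those rough percentages into line counts. The shares are the ballpark guesses from above, not measurements, and the names (`TOTAL_LINES`, `shares`, `breakdown`) are made up for illustration.

```python
# Rough language breakdown of a ~2B-line codebase.
# The shares are the approximate guesses from the answer, not measured data.
TOTAL_LINES = 2_000_000_000

shares = {
    "C++": 0.50,
    "Java": 0.25,
    "Python and other small languages": 0.25,
}


def breakdown(total, shares):
    """Return the estimated line count for each language group."""
    return {lang: round(total * share) for lang, share in shares.items()}


lines_by_language = breakdown(TOTAL_LINES, shares)

for lang, count in lines_by_language.items():
    # Report in millions of lines to keep the numbers readable.
    print(f"{lang}: about {count / 1e6:,.0f}M lines")
```

Changing the shares (say, to carve out a slice for Go) only means editing the dictionary, as long as the values still sum to 1.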
Google had all sorts of crazy issues with its source control system, to the point where it eventually replaced p4 (Perforce) with a home-built clone called Piper. That took years, and in the meantime Google spent a lot of energy making its p4 machine super-duper fast. Expensive SSDs were purchased, and the p4 server had the most RAM you could pack into it (about 128 GB, I believe).
As I described in the blog post above, programmers with IDEs would learn to subset the projects they worked in, and the rest of us, senior enough to have never gotten used to IDEs, would simply use Emacs/Vim and build our own tools for code browsing, etc.
But branches, etc. were insane. And refactoring an interface? The check-ins were massive, and so were the code reviews.
