OK, I indexed Google's source for an internal code browsing tool (see: Indexing Google's Source Code), so I can talk a bit about this.
First of all, not all 2B lines of code were in one language. Maybe about 50%
of the code was in C++, 25% in Java, and 25% in Python and other random
small languages like Sawzall and protocol buffer declarations. (Maybe Go
as well, nowadays.)
So that's roughly 1B lines of C++, 500M of Java, and 500M of Python and everything else.
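For anyone who wants to check the arithmetic, here is a quick sketch in Python. The 2B total and the percentages are just my ballpark guesses from above, not measured numbers, and the names `TOTAL_LINES`, `SHARES`, and `split_lines` are made up for this illustration.

```python
# Rough split of ~2B lines using the ballpark shares quoted above.
# All figures are estimates, not measurements.
TOTAL_LINES = 2_000_000_000

SHARES = {
    "C++": 0.50,
    "Java": 0.25,
    "Python and others": 0.25,  # Python, Sawzall, protocol buffers, etc.
}


def split_lines(total, shares):
    """Return the estimated line count for each language bucket."""
    return {lang: round(total * share) for lang, share in shares.items()}


estimates = split_lines(TOTAL_LINES, SHARES)

for lang, lines in estimates.items():
    # Print in millions of lines for readability.
    print(f"{lang}: {lines / 1e6:,.0f}M lines")
```

Swap in different shares if you think the mix was different; the buckets only need to sum to 1.0.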
Google had all sorts of crazy issues with its source control system, to the
point where it eventually replaced p4 (Perforce) with a home-built clone
called Piper. That took years, and in the meantime Google spent a lot of
energy making its p4 machine super-duper fast. Expensive SSDs were
purchased, and the p4 server had the most RAM you could pack into the
server (about 128 GB, I believe).
As I described in the blog post above, programmers with IDEs would learn to
subset the projects they worked in, and the rest of us, senior enough to
have never gotten used to IDEs, would simply use Emacs/Vim and build
our own tools for code browsing, etc.
But branches, etc. were insane. And refactoring an interface? The checkins were massive and so were the code reviews.