How I Failed a Google SRE Interview
Yesterday I got the final answer from the hiring committee: I am not going to be an SRE (Site Reliability Engineer) at Google.
It all started 3 months ago when I got the first screening interview. I answered some basic questions (only about 50% of my answers were correct, though) and was told that I would get a second phone interview with a real SRE.
During this interview I enjoyed the troubleshooting part the most. It was
basically a role-playing game in which you ask the interviewer questions about
the system, or name the commands you would run on the machine, and the
interviewer tells you what you see on the screen. It took me mere seconds to
find the failing machine, but then I started going back and forth between the
different components. In the end I did manage to find the actual failure. The
coding exercise was fun too. I learned that I had forgotten everything about
hashes (dict.items()? dict.entries()? It is actually dict.items()), but I
managed to write properly working code on the first try in the shared Google
document. There were also some basic UNIX questions, like how to delete a file
whose name starts with a dash, as well as basic networking: how does a switch
work.
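For reference, here is a minimal sketch of the two things mentioned above: iterating over a dict with items(), and removing a file whose name starts with a dash. It is not the interview task. The helper names count_words and remove_dash_file and the file name "-foo" are made up for illustration.

```python
import os
import tempfile


def count_words(words):
    """Count occurrences of each word using a plain dict."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# dict.items() yields (key, value) pairs; Python has no dict.entries().
for word, n in sorted(count_words(["a", "b", "a"]).items()):
    print(word, n)


def remove_dash_file(directory, name="-foo"):
    """Create and delete a file whose name starts with a dash.

    In a shell the name would be parsed as an option, so you would write
    `rm -- -foo` or `rm ./-foo`. os.remove() takes a path, not command-line
    arguments, so no special handling is needed here.
    """
    path = os.path.join(directory, name)
    with open(path, "w") as fh:
        fh.write("demo\n")
    existed_before = os.path.exists(path)
    os.remove(path)
    return existed_before, os.path.exists(path)


with tempfile.TemporaryDirectory() as tmp:
    print(remove_dash_file(tmp))
```

The shell forms work because `--` marks the end of options for most utilities, and a `./` prefix means the argument no longer begins with a dash.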
After a week or so I was notified that they would like me to come to Zurich, Switzerland for an on-site interview. I had 2 weeks to read the recommended books, play with various systems and network configurations, get the visa and come to Switzerland (the flights and the hotel were paid for by Google). I had to leave home quite early since we had just had a huge snow storm, so I was quite sleepy and did not walk around much.
The Engimatt Hotel was within walking distance of the Google Zurich office, which looked quite... let's say "cost-effective".
I walked around the location to get familiar with the surroundings, found the city to be quite expensive, returned to the hotel, and shut down.
On the 26th of March I woke up, went to the hotel restaurant for breakfast, where I was the only one not wearing a suit (the recruiters had explicitly told me to stay away from a suit and tie), and went to the Google office.
The inside of the office looked completely different: it was joyful, interesting and well thought out. I passed through the cafeteria, received my bottle of Google Water and was led to the conference room. Each interview lasts 45 minutes, and there are 5 interviews and a lunch break.
The interviews focused on the following:
Linux internals.
System administration.
Python coding skills.
Troubleshooting.
Large scale systems design.
Linux internals
You will need to know how malloc() works and how the memory allocator is
implemented in, say, glibc, and how processes are started and pass data
between themselves at the low level. I believe I got something around 3/5
there, because I had never gone that deep into the kernel side of memory
allocation and had never had to find out why strace output is full of brk()
(the system call behind sbrk()) and mmap() calls.
System administration
Having just finished building my little system for kernel bisection with PXE, I had the answer even before the interviewer finished the question, and the overall experience was fun. I had played with Puppet and rsync in the past, so the questions about synchronizing machine data and configuration were simple. After all the prepared questions were answered we had a chat about ZFS and how awesome it is. I believe this is the only interview where I scored 5/5.
Python coding skills
I started pretty well by explaining what is wrong with def foo(data=[]): ...,
but I failed miserably at implementing a simple calculator performing only
additions and multiplications. For some reason I forgot all the Python I knew
("elseif"? definitely not "else if"... it is "elif", yes) and for the
remaining 30 minutes I struggled to come up with something I believed would
work. After spending 3 years primarily on little one-off scripts and support
issues I had become really rusty and unsure about programming something
without running the intermediate solution to see whether it works. I feel that
I got 2/5 there, and when I returned home I had a beautiful solution written
in 5 minutes. The critical point is that you need to first work out how you
would do the task yourself; don't immediately try to think about how the
computer should be doing it.
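To make both points concrete, here is a minimal sketch. The first half shows the mutable default argument problem and the usual fix. The second half is one possible calculator, not the solution written at the interview or afterwards: it assumes the input is a string of integers joined by + and *, with no parentheses and the usual precedence. The function names are made up for illustration.

```python
def append_bad(item, data=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `data` shares and mutates the same list.
    data.append(item)
    return data


def append_good(item, data=None):
    # Use None as a sentinel and build a fresh list on each call.
    if data is None:
        data = []
    data.append(item)
    return data


def calculate(expression):
    """Evaluate integers joined by + and *, with * binding tighter than +.

    Assumed input format (hypothetical): "2+3*4", no parentheses.
    """
    total = 0
    # Do it the way a person would: split the sum into terms,
    # multiply out each term, then add the results.
    for term in expression.split("+"):
        product = 1
        for factor in term.split("*"):
            product *= int(factor)  # int() tolerates surrounding spaces
        total += product
    return total


print(append_bad(1), append_bad(2))    # both names refer to one shared list
print(append_good(1), append_good(2))  # two independent lists
print(calculate("2+3*4"))
```

Splitting on + first and on * second is what gives multiplication its higher precedence, which is the "how would you do it by hand" approach described above.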
Troubleshooting
Oh this was fun. I managed to get through 2 tasks (I don't know how many more
the interviewer had prepared), but they were great. I had never dealt with as
many failures as I got at the interview while trying to do a simple
"umount /usr"; however, I spent more time than I should have on finding the
reason for a hanging terminal during an SSH connection. Locales, environment,
libraries, networking MTU... everything should have been considered. I think I
got 4/5 there, but the score might have been lower.
Large scale systems design
I had never designed anything like this before, and while I had read a lot about it, I had no hands-on experience. The task was humble: design the logging system for the whole of Google. I had a difficult time coming up with a design and choosing the technologies to use, and it was a completely failed interview. 1/5 or even 0/5.
Afterwards
I was escorted out of the building, went to the hotel, had dinner there, prepared for the upcoming flight and shut down. It was a pretty tiring experience but it was well worth it.
Usually it takes one or two weeks for the hiring committee to come to a decision. However, this year the H-1B US visa cap was reached in a mere 5 days, and while the original US hiring committee decision was negative, the recruiters decided to forward the interview results to the Europe-based committee too. The overall review process took almost 4 weeks, and yesterday I finally received an answer: "No". The committee did note that I have potential, though.
I had feared I would face people who would be all "I work for Google, you are nobody". Instead, everybody remained positive even when I was obviously failing. The office interior is interesting, and you do want to spend the day inside. I sometimes visit my wife at Global Logic Kiev and now I know where they tried to copy the environment from. One of the interviewers brought a Chromebook Pixel to take notes (and guess what, I did not ask to look at the screen, double fail!); Lenovo ThinkPads and Apple MacBooks were spotted as well. The food is free for all employees, and there are 4 or 5 restaurants, which prevents overcrowding during lunch hours. There is a gym and a workshop where you can build something (as in a real, physical workshop with power tools etc.), and there is a room full of trees and plants where you definitely feel that there is more oxygen. Every room has a natural light source, i.e. a window, while all the stairs are located in the center of the building. The office felt really bright and light.
I saw a Hipster Dog image pinned to the corkboard.
It's definitely a place one would like to work at.
Recommended books
You will want to read the UNIX and Linux System Administration Handbook for system administration. It took me a week to get through all the Linux material, and it structured my administration knowledge. It is fun and easy to read.
Recommended reading from Google itself:
TCP/IP Illustrated, Volume 3 by W. Richard Stevens, but you don’t really need to dive into the details of TCP for Transactions, which turned out to have security flaws. An understanding of IP, TCP/IP and UDP/IP is a must.
The Design and Implementation of the 4.4BSD Operating System by Marshall Kirk McKusick
The UNIX Programming Environment by Brian Kernighan and Rob Pike
The Practice of System and Network Administration by Tom Limoncelli, Christina Hogan and Strata R. Chalup