At Google Next 2019, my co-workers Sasha Kipervarg, Patrick Raymond, and I presented on how we migrated our company’s on-premises big-data Hadoop environment to GCP, from both a technical and a cultural perspective. You can view the presentation and slides on YouTube:
Our Google Next presentation didn’t leave us enough time to go into deep technical detail, so to give a more in-depth technical view of our migration, I’ve put together a series of posts on the LiveRamp engineering blog, with an emphasis on how we migrated our Hadoop environment, the infrastructure my team is responsible for maintaining:
Part 1, where I discuss our Hadoop environment, why we decided to migrate LiveRamp to the cloud, and how we chose GCP.
Part 2, where I discuss the design we chose for our Hadoop environment on GCP.
Part 3, where I discuss the migration process and the system architecture that let teams incrementally migrate their applications to the cloud.
Part 4, written by my co-worker Porter Westling, where he discusses how we worked around our data center’s egress bandwidth restrictions.
Part 5, where I discuss LiveRamp’s use of the cloud going forward and the cultural changes enabled by migrating from an on-premises environment.
LiveRamp’s migration to GCP has been my (and my team’s) primary objective for over a year, and we’ve learned a ton (sometimes painfully) along the way. Hopefully these articles will help others planning big-data cloud migrations skip a few painful lessons.