One of the hardest choices facing engineering organizations is which technology stacks to adopt. We faced this decision last year: our customer base was growing, our computational needs were growing even faster, and our existing infrastructure was struggling to keep up with business demands. We made the decision to bet on Spark (and Scala) to support that growth.
The break-neck speed of technical innovation means these decisions often come down to choosing between battle-tested technologies and newer ones that have sex-appeal but remain unproven. Such engineering decisions can have a dramatic impact on the business, and they are often made with incomplete information. This post describes why we chose Spark.
How We Started Our Evaluation Process
Apache Spark is an open-source cluster-computing framework built for speed, ease of use, and sophisticated analytics, and it is well suited to machine learning workloads. UC Berkeley’s AMPLab developed Spark in 2009 and open-sourced it in 2010.
We first combed our network for engineers and managers who had first-hand experience deploying Spark. These conversations gave us a set of pros and cons, which we have summarized below.
Pros of Deploying Spark
- Scale – Spark easily meets our current and future scaling needs.
- Designed for Machine Learning and Artificial Intelligence – Vidora provides advanced machine learning technologies to optimize consumer-facing experiences. Spark provides a wealth of Machine Learning tools and will continue to provide more in the future.
- Speed – Spark is up to 100x faster than traditional MapReduce jobs and supports Spark Streaming. Given the volume and real-time needs of our product, speed is a major concern.
- Continued Innovation in the Spark Community – The Spark community will continue to innovate and provide open source solutions in the future in areas very relevant to Vidora.
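Part of Spark’s appeal for a Scala team is that its RDD API mirrors Scala’s own collection operations. The sketch below is a word count on plain Scala collections; on a cluster, the same flatMap/map/count shape applies to an RDD (the `sc.textFile` variant in the comment is illustrative, not code from our pipeline).

```scala
// Word count using Scala collection operations. The identical shape applies
// to a Spark RDD, e.g.:
//   sc.textFile(path).flatMap(_.split("\\s+"))
//     .map(w => (w, 1)).reduceByKey(_ + _)
val lines = Seq("spark is fast", "spark is scalable")

val counts: Map[String, Int] = lines
  .flatMap(_.split("\\s+"))                      // tokenize each line into words
  .groupBy(identity)                             // group identical words together
  .map { case (word, occ) => (word, occ.size) }  // count each group

// counts contains: "spark" -> 2, "is" -> 2, "fast" -> 1, "scalable" -> 1
```

Local collection code like this ports to Spark with little more than a change of data source, which shortens the learning curve once the team knows Scala.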
Cons of Deploying Spark
- Relatively Immature Technology – Very few deployment cases have been published publicly. To boot, nearly every published example operates at trivial scale and is not directly applicable to Vidora’s existing scale.
- Steep Learning Curve – Unless you have previous experience with Scala, learning several new technology stacks at the same time can be a big investment. We knew it would take our team several weeks to get up to speed.
- Cryptic Error Messages – Due in part to multiple layers of abstraction (potentially unique to Vidora’s deployment of Spark): we write our code in Scala, which compiles to Java bytecode and is executed across our cluster by Mesos. As a result, debugging can get very tricky.
- Under-Documented Parameters – Spark exposes a plethora of configuration parameters with little documentation, and the right settings can only be determined through trial and error.
- Lack of Operational Tooling – It’s often difficult to determine why tasks are running slowly. With big data sets, trial-and-error debugging is expensive, since each job may take minutes or hours to exhibit the problem.
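The tuning and tooling cons above surface at the job-submission boundary. Below is an illustrative `spark-submit` invocation against a Mesos master; the hostname, class name, jar, and every numeric value are placeholders of the kind that, in practice, must be found through trial and error.

```shell
# Sketch of a Spark-on-Mesos job submission. All names and values below are
# hypothetical; the flags themselves are standard spark-submit options.
spark-submit \
  --master mesos://mesos-master.example.com:5050 \
  --driver-memory 4g \
  --conf spark.executor.memory=8g \
  --conf spark.default.parallelism=200 \
  --conf spark.mesos.coarse=true \
  --class com.example.AnalyticsJob \
  analytics-assembly.jar
```

Getting memory and parallelism settings like these wrong rarely fails fast; the job simply runs slowly or dies hours in, which is exactly the debugging cost described above.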
Despite these drawbacks, we knew our business would continue to scale and support hundreds of millions of unique users every month. Spark and Scala seemed like the right choice for Vidora to meet current and future business needs.
Anyone at a startup is in the business of placing bets, and we decided to make ours on Spark and Scala – and it’s paying off. Despite some initial ramp-up time, our platform easily scales to accommodate even the largest global consumer brands. We generate real-time intelligence that helps these brands increase click-through rates and transactions by 20–30%.
We will continue to innovate around our algorithms and technology. But we will also constantly evaluate the best infrastructure to scale and deploy our technology.
– Abhik and Darrell