When should an ecommerce business use multiple web servers?


Following my earlier post on load-balancing, I was asked by the CTO of a small, growing ecommerce business:

“In your opinion, is there a rough point (in terms of revenue, orders per month, round of venture funding, etc.) at which you'd typically want to move to multiple web servers for an ecommerce store? How would you weigh up when was the right time?”

As this touches on a subject I wrote about so recently, I thought I'd write up a view.

Do it early if there's no legacy: it's easier and it helps the team think about scale

It is said one should generally code and design to scale 10x and not worry about 100x, as there are too many unknowns that will only be discovered on the way there. That is very sensible.

Architecture is fundamentally more important than code quality, and generally speaking, if one doesn't have a good design principle in place at the start -- like being able to scale horizontally -- it's unlikely to get easier to add with time.

When that is combined with a desire to move fast in shipping new functionality, capabilities that aren't exercised get broken: some numpty will commit something which violates the principle. Putting that together, my view is that the web layer should be scaled horizontally pretty much from the get-go, when it's easiest; and if one asserts that it's possible without actually doing it, I'd suspect it won't fully work when tried.
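The usual prerequisite for scaling a web tier horizontally is that the boxes are stateless: nothing per-user lives in a single process's memory or on a single box's disk. A minimal sketch of that idea, with a dict standing in for a shared session store (in production this might be Redis or similar; the names here are made up for illustration):

```python
# Illustrative sketch: a stateless web handler keeps no per-user data in
# process memory, so any box behind the balancer can serve any request,
# and adding boxes adds capacity. A dict stands in for a shared store.
SHARED_STORE = {}  # in production this lives outside the web boxes

def handle_request(session_id, item):
    """Add an item to a basket held in the shared store, not locally."""
    basket = SHARED_STORE.setdefault(session_id, [])
    basket.append(item)
    return len(basket)

# The same session can hit "different boxes" (modelled here as repeated
# calls) and the basket survives, because no state lives in the handler.
handle_request("sess-1", "shoes")
handle_request("sess-1", "socks")
print(SHARED_STORE["sess-1"])  # → ['shoes', 'socks']
```

If a stack secretly relies on local sessions or on-disk uploads, this is exactly the sort of thing that "works" until a second box is added, which is why asserting horizontal scale without exercising it is risky.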

The benefits of simple horizontal scale

At Reincubate we scaled horizontally almost from the start, and it was one of my first projects at Wiggle. Knowing, with proof, that one can scale horizontally vs. not being able to radically changes a CTO's options and potential costs. If one has a virtual stack it should be possible to tell investors that if the shit hits the fan and traffic jumps 10x overnight, after an email or TV spot, extra web boxes can be cloned for more capacity. Usually this can be done without any downtime at all, and with a linear cost increase, assuming there is a bit of extra welly left in the database boxes. With a single big server, one rarely has a robust theory of what a 10x increase will do to it, and the "get a much bigger box" option is hard to do quickly, without downtime, or without a non-linear cost increase. (I've had to send a taxi from Portsmouth to the north of England and back to pick up a rare part before... the bigger or upgraded box option can get terrifying!)

As for scaling the database, redis, memcached, load-balancers, and so on: I would wait until it is nearly necessary. Setting up MySQL read slaves usually makes life much easier around speed and backups, so one usually ends up doing that quite quickly. My hand has been forced more recently on load-balancers: at Reincubate we lasted on a single one until our traffic beat 200Mb/s and we started to hit instability in the kernel netstack code. In that case I felt scaling horizontally was a more pragmatic option than building kernel and netstack tuning skills in the team. In general I'd approach sharding slowly, as it can be a tough gig and tricky for some engineers to get their heads around.
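Once a read slave exists, the application needs some way of sending writes to the master and fanning reads out across slaves. A very naive sketch of that routing decision, with made-up connection names (a real router must also keep transactions on the master and account for replication lag):

```python
# Hypothetical read/write splitting sketch: SELECTs round-robin across
# slaves, everything else goes to the master. Names are illustrative.
import itertools

MASTER = "master-db"
SLAVES = itertools.cycle(["slave-1", "slave-2"])

def route(sql):
    """Pick a connection by statement type. Deliberately naive: ignores
    transactions, replication lag, and SELECT ... FOR UPDATE."""
    if sql.lstrip().upper().startswith("SELECT"):
        return next(SLAVES)
    return MASTER

print(route("SELECT * FROM orders"))       # → slave-1
print(route("UPDATE orders SET status=1")) # → master-db
```

Many ORMs and proxies offer this out of the box; the point is only that read slaving changes application routing as well as the database topology.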

Having a couple of web boxes is great for other reasons, of course: one can put out canary builds of the app, A/B test across machines, load-balance, and take boxes out for prolonged updates and maintenance.

Be careful if there's limiting legacy code

With that said, if I inherited an ecommerce stack in a business which couldn't easily scale horizontally, I probably wouldn't worry too much about sorting that out until it got past £10-15m in revenue. At Wiggle I did it at around £30m, so it's almost never too late! If the company were smaller than that but growing fast, there would probably be a number of more pressing priorities; and if it weren't growing fast, well... no need!