When WP Engine was young, life was simple. We didn’t have meetings, because everyone knew everything. We didn’t have training classes, because we hired about one person every three months, and they could just “ride shotgun” and learn the ropes. We were working out of an amazing accelerator called Capital Factory, and space was constantly getting tighter, which is a good thing, because energy increases in tight quarters. (Music and stand-up comedy are best in tight quarters too.) It was small, intimate, and probably one of the most energizing spaces I have ever been in. I knew everyone’s name, the names of their dogs and kids, and what they liked on their Jimmy John’s subs.
Today I walked into our weekly All-Hands meeting. We have a lovely office to call all our own in downtown Austin, TX and it breathes with dedication and creativity. But I did notice something. There were people I didn’t instantly know—and when I saw a picture of their dog I didn’t know its name. (We’re big on pets here at WP Engine!)
This growth has shown itself in other, more material ways that are both good and challenging. We have more customers than ever before, and that means more folks working here and more servers. WordPress and life on the Big Bad Internet are ever-changing, and this means there are more edge cases, more high-traffic cases, and a lot more people we touch every day.
Growth is Hard
The hardest thing about growth is that rare things become common.
A Linux kernel bug that causes a hard crash once every 4 years on a given server happens every single day when you have 2000 servers (as we do). A code push we did yesterday impacted 4 customer installations out of over 150,000. That is so rare that even in retrospect it would have been impossible to catch the problem beforehand, and in fact we didn’t catch it even with multiple code inspections, QA testing, canary-server testing, and so on.
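To make that arithmetic concrete, here is a minimal sketch of the "rare things become common" effect. It assumes crashes are independent and evenly spread over time, which is a simplification, and the function name is just for illustration.

```python
def expected_daily_events(fleet_size: int, years_between_events: float) -> float:
    """Expected number of events per day across a fleet, assuming each
    machine independently hits the event once per `years_between_events`."""
    days_between_events = years_between_events * 365.25
    return fleet_size / days_between_events


# A bug that crashes one server once every 4 years, across 2000 servers.
crashes_per_day = expected_daily_events(fleet_size=2000, years_between_events=4)
print(f"Expected crashes per day: {crashes_per_day:.2f}")

# The code push: 4 affected installations out of 150,000.
affected_share = 4 / 150_000
print(f"Share of installations affected: {affected_share:.4%}")
```

Under those assumptions the fleet sees more than one such crash per day on average, even though any single server would go years without one.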
This same principle is equally biting in customer service. Sometimes, we will inevitably generate a bad customer experience. Sometimes, we’re going to tell a customer that the problem is X when in fact it is Y. Sometimes, we’re going to misread a response from a customer, or misread one of our own responses. Or a ticket will be handed off from one person to another, and everything gets lost in translation. Sometimes we’re not going to have disseminated information about a change, or a policy, or how our code works, or whether we’ve made a change to our platform that impacts customers, and thus some support folks won’t be in the loop and then turn around and tell our customers something incorrect.
This is Not An Excuse For Having Those Sorts of Problems!
Rather, these are the types of things we have to focus on, every day. We need to ask ourselves: How could that have been prevented? If not prevented, how could we have caught the problem before the customer did? If not caught, how could we have messaged the customer better? How can we communicate better internally and externally? How should we reorganize teams? How should our training programs change? How should on-boarding for new folks change? The list goes on.
All hosting companies will sometimes generate terrible experiences and sometimes generate great ones. So, to me, the two primary questions are:
- What does the company do with those bad experiences?
- What does the company do with statistics about the average, or a 95th percentile case?
I’d like to tell you our answers to those questions.
After every ticket we ask the customer whether they had a good experience—yes/no—with an optional comment box. A surprisingly large number of people give us that feedback—which we’re grateful for. The ones marked “no” we call an “UNSAT” (unsatisfactory).
Our CEO, Heather Brunner, along with everyone in the support management organization, reads every single UNSAT. In fact, I myself get emails every day from Heather, forwarding something with a note like “Could this be related to [an existing initiative]?” or “Isn’t this a pattern?” or “Is this something we need to create a special project around?” or “Do you think [person Y] could help with this?”
In other words, we treat UNSATs like the FAA treats airplane crashes—we look at every one and ask ourselves: Could this have been prevented? Do we need to give this feedback to some folks internally? Does this need to be institutionalized in training material? Could we write code to detect/fix/monitor/notify/analyze this? Only the most idiosyncratic UNSATs end up passing all the questions without action.
That’s the answer to question 1. As for question 2, let’s look at some data.
In mid-2013, we had a 96% satisfaction rating, meaning only 1 out of 25 rated tickets was an UNSAT. That’s very good. In fact, I haven’t heard of another hosting company with a higher rating. We were happy with that. But we watched it like a hawk. A hawk who’s really into statistics.
By January of 2014, that rating had slipped to 93%. That number is still world-class compared to our peers, but that is not our yardstick. This was an all-hands-on-deck critical catastrophe as far as we were concerned.
To see why, consider that this meant nearly twice as many people had an UNSAT experience as before: 7 out of every 100 rated tickets instead of 4. When you think of it that way, you stop patting yourself on the back for a 93% satisfaction score.
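Here is a small sketch of why a three-point drop is a much bigger deal than it sounds. It compares UNSAT rates rather than satisfaction rates, and it assumes the number of rated tickets stayed about the same between the two periods. The function names are just for illustration.

```python
def unsat_rate(satisfaction_pct: float) -> float:
    """Percentage of rated tickets marked UNSAT."""
    return 100.0 - satisfaction_pct


def unsat_ratio(before_pct: float, after_pct: float) -> float:
    """How many times larger (or smaller) the UNSAT rate became."""
    return unsat_rate(after_pct) / unsat_rate(before_pct)


# Mid-2013 (96%) compared with January 2014 (93%).
slip = unsat_ratio(before_pct=96, after_pct=93)
print(f"UNSAT rate grew by a factor of {slip}")

# Mid-2013 (96%) compared with a 97% score.
recovery = unsat_ratio(before_pct=96, after_pct=97)
print(f"UNSAT rate at 97% is {recovery} times the mid-2013 rate")
```

Going from 96% to 93% takes the UNSAT rate from 4% to 7%, a factor of 1.75, which is close to double. At 97% the UNSAT rate is three quarters of what it was at 96%.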
So we took action. Lots of action.
Today, I’m happy to report that the last-30-day-rolling score is back up to 97% satisfaction. That doesn’t mean “we’re done.” What it means is “OK, that stuff worked, now don’t stop innovating in service or else that will happen again.”
So, what did we do exactly in the first half of 2014 to address this?
- Hiring. We closed our Series C financing in January and immediately put it to work hiring for the Support Team. We’ve increased the team by 50% since then. It’s very hard to hire quickly and yet maintain our standards of both attitude (culture) and aptitude (ability). We’ve even hired additional internal recruiters to help us accelerate this process.
- Training. We already had a great training team, but they needed more help—more hands, more material, and more feedback about what would be useful, both for existing employees and new employees.
- Specialization. We learned that, especially during the first six months that someone is at WP Engine and (of course) is still learning the ropes, it’s better to specialize in a few subjects and become deep in those rather than getting the complete lay of the land. We also need to balance that against variety and ensuring that everyone has the chance to learn many things and become well-rounded.
- Re-Grouping. With more folks, we were able to reorganize the support groups around specific things. For example, we now have a special team that handles all tickets from brand new customers. New customers have special needs: they don’t know our platform yet, and they have questions around things like site migrations, DNS, email, etc.
- Direct-to-Engineer. Some of our customers are highly technical, so whenever they contact us, it’s with difficult, interesting problems—not ones that can be solved with a knowledge base article or a simple, obvious response. Therefore, we started creating pathways for those customers to get to engineers faster—people who can work on the mind-bending stuff. Of course we don’t have that 24/7 yet, like we do with regular support. Fortunately, those problems are usually OK to be solved during normal business hours, so overall this approach has been effective.
- New Customer tools in User Portal. Sometimes, the best service is when our customers can take an action or solve a problem themselves, without having to contact us. We’ve launched a bunch of things in the User Portal to this effect. For example, you can now purchase, install, and configure SSL certificates, whereas before that took a few support tickets to get working. Not only is this faster for customers, it means our support folks can focus on the issues which do require human interaction.
- Unique Innovation in our Platform. The best way to have fewer “my site is slow” tickets is for the site to not be slow! To that end we’ve built and deployed improvements, many of which are novel. We’ll talk about these things in greater detail in future blog posts, but one example is what we call “WPEngineX,” which is our own custom module for nginx (our front-end web server technology) in which we do WordPress-specific traffic shaping. What this means is that when a human being is waiting for a page to render, and so is a robot (like a web crawler), the web crawler is forced to wait until the human being gets her page, so the human gets a better experience. There’s also traffic which is almost always nefarious but not always, so we make it even lower priority. There’s much more to it, but even just this high-level behavior is unique in the hosting industry. We have a lot more coming that we’ll be announcing over the next few months. (If you want early access to some of this, contact us!)
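To picture the traffic shaping idea from the last item above, here is a toy sketch of a priority queue that serves humans first, crawlers second, and suspicious traffic last. This is not the WPEngineX code, which is an nginx module. The class names, the three priority levels, and the request labels are all assumptions made up for illustration.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    """Lower value is served first (hypothetical levels)."""
    HUMAN = 0
    CRAWLER = 1
    SUSPICIOUS = 2


class RequestQueue:
    """Serves waiting requests by priority, then by arrival order."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker keeps FIFO order

    def push(self, request_id: str, priority: Priority) -> None:
        heapq.heappush(self._heap, (priority, next(self._arrival), request_id))

    def pop(self) -> str:
        _, _, request_id = heapq.heappop(self._heap)
        return request_id

    def __len__(self) -> int:
        return len(self._heap)


queue = RequestQueue()
queue.push("crawler-1", Priority.CRAWLER)
queue.push("suspicious-1", Priority.SUSPICIOUS)
queue.push("human-1", Priority.HUMAN)
queue.push("human-2", Priority.HUMAN)

# Drain the queue in the order requests would be served.
order = [queue.pop() for _ in range(len(queue))]
print(order)
```

Even though the crawler arrived first, both humans are served ahead of it, and the suspicious request waits until everyone else is done. A real system would also need to guard against starving low-priority traffic forever, which this sketch does not attempt.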
To anyone who has had a support issue in the past, or is having one now, I would like to say I’m sorry for your experience. But as you can see, we’ve been busy. We will continue to improve, and we are seeing our current efforts work on many fronts. We do this because we love every customer here at WP Engine and your success is incredibly important to us.
The last thing we can be right now is complacent. We are guided by staying humble, staying honest, and working hard. If things are pointing in the right direction, that’s great. The moment we stop being hungry, stop reading those UNSATs, stop striving to become better, that will be the beginning of the end.
But in fact, this is just the beginning!
Our VP of Customer Experience, Tina Dobie, loves hearing directly from our customers. Her philosophy is that it’s only by listening to you that we can find out exactly what we need to do to continuously improve—and give you the best possible experience. If you have specific feedback, please let us know. Either open a Support Ticket or email Tina directly—email@example.com.