There are a lot of ways to write code that won’t scale under traffic, or that will bring a site down altogether. Because WP Engine hosts thousands of custom WordPress sites, we see every kind of “not scalable code.” In particular, we see the same few mistakes made with the Transients API when people use it to cache data for a site. This post covers some of those mistakes and how to correct them. If you’re a technical plugin or theme developer, or you’re responsible for tuning a website for speed, including setting up the caching, read on.
Refresher on the Transients API
(Skip this if you already know what transients are!)
Caching is essentially a way of temporarily saving a website’s data so that when there are multiple requests for the same data, the site doesn’t have to re-run the MySQL queries or PHP code that produced it. This can save seconds per request AND reduce server load.
The idea is to keep data around temporarily, hence the word “transients.” The Transients API is similar to the WordPress Options API. You give data a name and a value—which can be complex, like a multi-level PHP array—and it stores the data. Later, perhaps even on a different request, you can retrieve that data using its name. The difference is that data in the Options table will stick around forever. That is, you can store data, and three years later, it will still be there.
Transient data will not stick around, however. That’s the point! You can request the data and find that it’s missing for one of two reasons. First, when you *store* the data, you specify how long it should live. For example, you could say “store this for three hours.” If you request it after four hours, it will be missing.
The second reason is that a piece of data is allowed to simply vanish, at any time, for any reason. That sounds odd, I know! What’s the point of storing the data if you can’t count on it? The point is that storing is a REQUEST which WordPress will attempt to honor. Because of this flexibility, it’s possible to use different implementations for transient storage, and THAT means it’s possible to use different, more advanced technology to make transients extremely efficient and to make them work properly even in a multi-server clustered environment.
Because of this “vanish at any time” thing, you generally use transients for a cache. That is, if you need to compute something that takes real time, like a slow MySQL query, or retrieving data from an external source like someone’s Twitter or RSS feed, you can store the computed data in a transient, knowing that if it goes missing it’s always possible to recreate it. But in the usual case — when it does NOT go missing — you have the data quickly without having to recompute.
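Here is a minimal sketch of that "use it if it's there, rebuild it if it's not" pattern. It is plain Python with a toy in-memory store, not WordPress code: `TransientStore`, `get_or_compute`, and `slow_feed_fetch` are names made up for illustration. In a real plugin, the same shape would be built on WordPress's `get_transient()` and `set_transient()`, where a miss comes back as `false` rather than `None`.

```python
import time


class TransientStore:
    """Toy in-memory stand-in for a transient backend (not WordPress code)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._items = {}  # name -> (value, expires_at)

    def set(self, name, value, ttl_seconds):
        self._items[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        """Return the stored value, or None if it expired or was never set."""
        entry = self._items.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[name]
            return None
        return value


def get_or_compute(store, name, ttl_seconds, compute):
    """Use the cached value if present; otherwise rebuild it and store it."""
    value = store.get(name)
    if value is None:  # missing for ANY reason: expired, evicted, never set
        value = compute()
        store.set(name, value, ttl_seconds)
    return value


def slow_feed_fetch():
    # Hypothetical stand-in for a slow query or a remote feed request.
    return ["post one", "post two"]


store = TransientStore()
feed = get_or_compute(store, "latest_feed", 3 * 60 * 60, slow_feed_fetch)
```

The important design point is that the calling code never assumes the value is there. Every read goes through the "missing? rebuild it" branch, so a vanished transient costs time but never breaks the page.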
Different Transient Implementations Break Sites
What gets you into trouble with Transients is the different, hard-to-reproduce behavior you get when your plugin or theme runs under different transients implementations. Different implementations mean you will ONLY run into problems in certain configurations with certain types of sites, and never in others. As a developer, if you’re not aware of this pitfall and aren’t coding accordingly, you’ll think your code is sound, but in fact it will break in the field, and you won’t even know how to reproduce it!
The two most common implementations of the Transients API backend:
- The one built into WordPress, and therefore the most common by far. Transient values are stored in the wp_options table just like regular options. With transients, an additional option is stored to hold the expiration date. When a transient is accessed, WordPress pulls the expiration date first. If it’s expired, WordPress deletes both options from the table, thereby “cleaning up” the data, and pretends the data was never there. If it’s not expired, it grabs the content from the options table.
- Memcached. Memcached is a very simple yet efficient and reliable piece of server-side software designed to do exactly what the Transients API is supposed to do: store data based on a key, which expires at a given time, and which can vanish at any time if it needs to.
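To make the built-in behavior concrete, here is a rough Python model of it, with a plain dictionary standing in for the wp_options table. The two key prefixes mirror the option names WordPress core uses for transients. Everything else is a simplified assumption, including the function bodies and the explicit `now` parameter (added only so the example is repeatable).

```python
import time

TRANSIENT_PREFIX = "_transient_"
TIMEOUT_PREFIX = "_transient_timeout_"


def set_transient(options, name, value, ttl_seconds, now=None):
    """Store the value plus a second row holding its expiration time."""
    now = time.time() if now is None else now
    options[TIMEOUT_PREFIX + name] = now + ttl_seconds
    options[TRANSIENT_PREFIX + name] = value


def get_transient(options, name, now=None):
    """Check the expiration row first; clean up both rows if it has passed."""
    now = time.time() if now is None else now
    timeout = options.get(TIMEOUT_PREFIX + name)
    if timeout is not None and timeout < now:
        options.pop(TRANSIENT_PREFIX + name, None)
        options.pop(TIMEOUT_PREFIX + name, None)
        return None  # pretend the data was never there
    return options.get(TRANSIENT_PREFIX + name)


options = {}  # toy stand-in for the wp_options table
set_transient(options, "feed", ["a", "b"], ttl_seconds=3 * 3600, now=0)

fresh = get_transient(options, "feed", now=1 * 3600)  # asked after one hour
rows_before = len(options)                            # value row + timeout row
stale = get_transient(options, "feed", now=4 * 3600)  # asked after four hours
rows_after = len(options)                             # both rows cleaned up
```

Notice that the cleanup lives inside `get_transient`. Nothing in this model removes an expired entry unless somebody asks for it by name.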
Using memcached instead of WordPress Transients has two special benefits, which is why we automatically preconfigure it for you here at WP Engine. We’ve got your back.
i. Memcached is 10x-100x faster at storing and retrieving values than WordPress Transients, which is especially interesting since the point of transients is to cache data to increase the speed of a site.
ii. Memcached caps the amount of space its data can take up (e.g. 64MB of RAM). If a site stores too much data at once, memcached automatically throws out old data, so it never runs out of space. The built-in WordPress Transients, by contrast, will store an arbitrary amount of data in the options table.
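The second benefit is easier to see in code. Below is a small Python sketch of a size-capped cache that throws out the least recently used entry when it runs out of room. `BoundedCache` is a hypothetical name, and real memcached manages memory in a more involved way, so treat this as an illustration of the idea and not as a description of memcached's internals.

```python
from collections import OrderedDict


class BoundedCache:
    """Toy cache with a hard byte limit and least-recently-used eviction."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> bytes, least recently used first

    def set(self, key, value):
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        self._items[key] = value
        self.used_bytes += len(value)
        # Over the limit? Drop the oldest entries until we fit again.
        while self.used_bytes > self.max_bytes and self._items:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as recently used
        return self._items[key]


cache = BoundedCache(max_bytes=3000)
for name in ("a", "b", "c", "d"):
    cache.set(name, b"x" * 1000)  # four 1000-byte values into 3000 bytes
```

After the loop, the cache holds "b", "c", and "d", and "a" has quietly vanished even though nobody deleted it and it never expired. That is exactly the "can vanish at any time" behavior the Transients API allows.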
Applying Transients in Your Development
Suppose you’re reading and writing the same Transients key repeatedly, and suppose it holds 1 KB of data. In that case, both Memcached and the WordPress Transients will do just what you expect them to, and both will take up about 1 KB of space (either in the options table or in memcached).
Now suppose you’re reading and writing different Transients keys, different for each browser session. In short, what if you’re storing user-session data in Transients? This makes sense on the surface. Session data shouldn’t last forever, and you don’t want to bother with special database tables. Besides that, many WordPress Hosting companies don’t allow PHP sessions, so this really is the next best thing. It’s even fast and multi-server.
Here’s where the differences appear. With Memcached on a low-traffic site, this method appears to work. The values last for a while, then get deleted as they expire. However, if the server is heavily loaded, the amount of session data the server needs to store will exceed the space available in memcached, and you’ll start losing session data sooner than you thought. And if you didn’t test in this exact, heavily loaded environment, you’d never know that. In general, these environments fill memcached so quickly that it might as well be disabled, because it can never hold onto any data long enough to be useful. You effectively don’t have a Transients API cache at all!
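A quick simulation shows how this plays out. The Python sketch below models a fixed-size cache shared by two kinds of data: a handful of genuinely useful cached fragments that are re-read over and over, and one brand-new session key per visitor. All of it is made up for illustration, including the function name, the capacity of 1,000 entries, and the traffic numbers. The eviction rule is a simple least-recently-used policy, not memcached itself.

```python
from collections import OrderedDict


def fragment_hit_rate(capacity, sessions_per_tick, ticks=1000, fragments=50):
    """Fraction of fragment reads served from a shared, size-capped cache."""
    cache = OrderedDict()  # key -> True, least recently used first
    hits = 0

    def put(key):
        cache[key] = True
        cache.move_to_end(key)
        while len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used entry

    session_id = 0
    for tick in range(ticks):
        # Every new visitor writes a unique session key into the cache.
        for _ in range(sessions_per_tick):
            put(("session", session_id))
            session_id += 1
        # One of the real cached fragments is requested, round-robin.
        key = ("fragment", tick % fragments)
        if key in cache:
            hits += 1
            cache.move_to_end(key)
        else:
            put(key)  # cache miss: recompute the fragment and store it again
    return hits / ticks


quiet = fragment_hit_rate(capacity=1000, sessions_per_tick=1)
busy = fragment_hit_rate(capacity=1000, sessions_per_tick=100)
```

On the quiet site, almost every fragment read is a hit once the cache has warmed up. On the busy site, the session keys push every fragment out before it is read again, so the hit rate drops to zero. It's the same code and the same cache, and the only thing that changed is the traffic.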
But with the WordPress Transients, you get different but still very undesirable behavior. Because the values are written to the database, not a fixed-sized block of RAM, they all stick around. Which means even with the heavily-loaded site, you still have your session data. Awesome!
Or so you thought. What the Transients API doesn’t spell out is what happens when you use unique keys, like one per session. With the built-in method, the options table fills up indefinitely! That’s because WordPress’s “old data clean up” only operates when you request the key (as we covered earlier). If you never request the key again, it’s left in the options table, forever. There’s no separate process that cleans these up!
We’re not only aware of this problem, we want to help out our customers who might be running code that doesn’t understand this. That’s where the “managed” in Managed WordPress comes in! Every night, we have an automated process to look through your options table for expired transients, and delete them (both the data and the expiration date item). Boom! They’re gone, and you don’t have to worry about them.
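To show both the pile-up and the cleanup, here is a rough Python model, again with a dictionary standing in for the options table. The key prefixes follow the option names WordPress core uses for transients. The `sweep_expired` function is a hypothetical illustration of the general idea and not WP Engine's actual nightly process.

```python
TRANSIENT_PREFIX = "_transient_"
TIMEOUT_PREFIX = "_transient_timeout_"


def set_transient(options, name, value, ttl_seconds, now):
    """Store the value plus a second row holding its expiration time."""
    options[TIMEOUT_PREFIX + name] = now + ttl_seconds
    options[TRANSIENT_PREFIX + name] = value


def sweep_expired(options, now):
    """Delete every expired transient, requested or not; return the count."""
    expired = [
        key[len(TIMEOUT_PREFIX):]
        for key, expires_at in options.items()
        if key.startswith(TIMEOUT_PREFIX) and expires_at < now
    ]
    for name in expired:
        options.pop(TRANSIENT_PREFIX + name, None)  # the data row
        options.pop(TIMEOUT_PREFIX + name, None)    # the expiration row
    return len(expired)


options = {}  # toy stand-in for the wp_options table

# 1,000 visitors each get a one-hour session transient, never read again.
for visitor in range(1000):
    set_transient(options, f"session_{visitor}", {"cart": []}, 3600, now=0)

# One long-lived transient that should survive the sweep.
set_transient(options, "feed", ["a"], 7 * 24 * 3600, now=0)

rows_next_day = len(options)                       # expired rows still there
removed = sweep_expired(options, now=24 * 3600)    # run the cleanup a day later
rows_after_sweep = len(options)                    # only the feed rows remain
```

A day later, every session has long since expired, yet all 2,000 of their rows are still sitting in the table because nobody asked for them by name. One sweep removes them and leaves the week-long "feed" transient alone.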
It’s awesome that we do this, but if you develop a general-purpose plugin or theme that is supposed to operate properly on any hosting environment, you can’t count on the same cleanup everywhere your code might be deployed. You want to make sure you’re providing your users and customers with the most optimized code possible.
Now that you know how Transients can go awry, it might be an awesome idea to take a look and see how you can make your code more efficient. Chances are, if you’ve read this far, you like optimizing your code, and this blog post may throw down the gauntlet for you and motivate you to re-write some things to be more scalable!
Hope this helps.
Austin W. Gunter