One of media websites which we administer has 14,000 posts and close to 400,000 comments. Is WordPress any good as running a big site like this? Does it scale well?
Read some of our tips how to optimize WordPress for large sites below:
Post revisions
This is a weakness of WordPress. On this site there are over 30,000 revisions for the 14,000 posts. That makes the table bigger and it’s slower to search in it. WordPress users realized this three years ago.
Tip #1
We backed up the the wp_posts table and then used a simple MySQL command to remove old post revisions. This decreased the table size from 400MB to 120MB:
DELETE FROM `wp_posts` WHERE post_type = 'revision' AND post_date NOT LIKE '2012-%'
Long-term solution
There are WordPress plugins which can limit the number of revisions per post. We think that the WordPress code should be improved and the revisions should be stored in a different table to maximize speed. You can support this on WordPress bug tracking.
Large number of comments
We need to be extra careful specially with the wp_comments table which has close to 400,000 comments and 320MB. We often find speed issues with MySQL queries which take only a couple of milliseconds on smaller sites.
WordPress 3.2 added a count of comments into the WordPress Admin Bar which shows up for logged in Administrators and Editors when browsing the site.
Comment count in WordPress Admin Bar
Counting the comments actually takes a lot of time on our big database. Here’s the query and it’s duration in log:
SQL query for counting of the comments
SELECT comment_approved, COUNT(*) AS num_comments FROM wp_comments GROUP BY comment_approved;
In the above case, it takes 0.3 seconds, while all the other queries are done in 0.05 – 0.001 seconds.
Tip #2
To do this kind of audits use WPDB Profiling plugin for WordPress. It shows you all sorts of information in the site footer and you can turn it on and off as you like.
Beware: this plugin turns off post revisions and autosaves when activated, you need to turn it off once you are finished or change it’s settings.
It’s clearly the slowest query when the site is loading. And this site has up to 20 editors who like to come to the site and read the comments to their articles, so it affects the performance. Keep in mind that this information is shown on multiple places in WordPress Admin Interface.
We created our own queries to do this count. It’s 5 queries instead of 1, but they are faster. Just try to test them:
SELECT COUNT(comment_ID) FROM wp_comments WHERE comment_approved = 'trash'
SELECT COUNT(comment_ID) FROM wp_comments WHERE comment_approved = 'spam'
SELECT COUNT(comment_ID) FROM wp_comments WHERE comment_approved = '0'
SELECT COUNT(comment_ID) FROM wp_comments WHERE comment_approved = 'post-trash'
SELECT COUNT(comment_ID) FROM wp_comments
The last query counts all the commands, so we subsctract the previous counts. Here are our results – 0.042144 seconds:
SQL query for counting of the comments improved
That’s a big improvement over 0.3 seconds duration with the standard query.
Tip #3
You can also download a code and test it on your site, just put it into functions.php file of template here.
If you want to test above queries, replace SELECT with SELECT SQL_NO_CACHE to make sure no MySQL caching will be used.
We also opened a bug tracking ticket for this on WordPress Trac – Speeding up Dashboard and Comment moderation SQL load.
Finding slow database queries with MySQL tools
Another way of finding the slow queries is using this MySQL command. It’s best used when you have SSH access to the site.
- Watch the server load using top.
- When you see that mysql process is taking too much of the CPU, just try to list running queries with following command in MySQL console:
mysql> SHOW PROCESSLIST; +-------+------+---------+------+--------------+------------------------------------------------------------------------------------------------------+ | Id | User | db | Time | State | Info | +-------+------+---------+------+--------------+------------------------------------------------------------------------------------------------------+ | 59462 | site | site_db | 0 | NULL | SHOW PROCESSLIST | | 61208 | site | site_db | <strong>62</strong> | Sending data | SELECT ID FROM wp_posts AS a LEFT JOIN (SELECT post_id FROM wp_postmeta WHERE meta_key='...') AS b | | 61228 | site | site_db | 25 | Locked | UPDATE `wp_postmeta` SET `meta_value` = '1327484262:39' WHERE `post_id` = 66955 AND `meta_key` = '_e | | 61238 | site | site_db | 16 | Locked | SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (17992) | | 61241 | site | site_db | 16 | Locked | SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (12931) | | 61249 | site | site_db | 11 | Locked | SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (34465) | | 61251 | site | site_db | 11 | Locked | SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (5209) | | 61257 | site | site_db | 6 | Locked | SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (34465) | | 61258 | site | site_db | 5 | Locked | SELECT meta_id FROM wp_postmeta WHERE meta_key = '....' AND post_id = 24661 | | 61262 | site | site_db | 1 | Locked | SELECT post_id, meta_key, meta_value FROM wp_postmeta WHERE post_id IN (5367) | +-------+------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
In the above list, there is some nasty query which is taking 62 (!) seconds to get executed and the other queries are just waiting until it’s finished. Yes, the site was in real trouble when the above list was saved.
There is also an MySQL option called “log slow queries”, but we newer had a success with it.
Here’s the query in it’s full beauty. Since it’s using subqueries, it’s slow and hard to optimize:
SELECT ID FROM wp_posts AS a LEFT JOIN (SELECT post_id FROM wp_postmeta WHERE meta_key='fb') AS b ON a.ID=b.post_id WHERE b.post_id IS NULL;
We re-coded the plugin and this bad query is no longer used. This was some weird query in some old plugin which was running as WordPress Cron job, so it’s not visible in WPDB Profiling which was described above.
Tip #4
If the query seems to complicated, try to move some of the processing into PHP.
Use indexes!
If you see a slow query, try to use MySQL EXPLAIN command on it. Here’s the query:
SELECT count(*) FROM wp_comments AS c JOIN wp_posts AS p ON c.comment_post_ID = p.ID WHERE c.user_id = '1079' AND c.comment_approved = '1' AND p.post_status = 'publish' AND comment_content REGEXP '[[::]]' = 1
Here’s the output of the EXPLAIN command. Notice the big number in the “rows” column. It means that MySQL has to examine 377,606 rows in the wp_comments table – that’s all the comments on the site.
mysql> EXPLAIN SELECT count(*) FROM wp_comments AS c JOIN wp_posts AS p ON c.comment_post_ID = p.ID WHERE c.user_id = '1079' AND c.comment_approved = '1' AND p.post_status = 'publish' AND comment_content REGEXP '[:asdf:]' = 1; +-------------+-------+--------+------------------------------------------------------------+------------------+---------+------------------------+--------+-------------+ | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------------+-------+--------+------------------------------------------------------------+------------------+---------+------------------------+--------+-------------+ | SIMPLE | c | ref | comment_approved,comment_post_ID,comment_approved_date_gmt | comment_approved | 62 | const | 377606 | Using where | | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 8 | site.c.comment_post_ID | 1 | Using where | +-------------+-------+--------+------------------------------------------------------------+------------------+---------+------------------------+--------+-------------+ 2 rows in set (0.00 sec)
The SQL query is using user_id in the WHERE clause. And there is no index which would contain this (see possible_keys in above output, you can also execute “SHOW INDEXES IN wp_comments;”).
So we create a new index which will combine two fields from the WHERE clause:
CREATE INDEX userid_approved_index ON `wp_comments` (`user_id`,`comment_approved`)
The table index size will increase a bit, but SQL will search only 1,423 rows when executing our query, because it’s able to use our new index:
mysql> EXPLAIN SELECT count(*) FROM wp_comments AS c JOIN wp_posts AS p ON c.comment_post_ID = p.ID WHERE c.user_id = '1079' AND c.comment_approved = '1' AND p.post_status = 'publish' AND comment_content REGEXP '[:asdf:]' = 1; +-------------+-------+--------+----------------------------------------------------------------------------------+-----------------------+---------+----------------------------------+------+-------------+ | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------------+-------+--------+----------------------------------------------------------------------------------+-----------------------+---------+----------------------------------+------+-------------+ | SIMPLE | c | ref | comment_approved,comment_post_ID,comment_approved_date_gmt,userid_approved_index | userid_approved_index | 70 | const,const | 1896 | Using where | | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 8 | mondoweiss_net.c.comment_post_ID | 1 | Using where | +-------------+-------+--------+----------------------------------------------------------------------------------+-----------------------+---------+----------------------------------+------+-------------+
Tip #5
When you create a plugin which uses custom SQL queries, have look at the tables and make sure the fields which you used in WHERE or ORDER BY statements are indexed.
Add more indexes to the table if necessary (but then be careful if you ever reinstall WordPress).
Another example would be sorting comments by comment_date. comment_date field is not indexed, but comment_date_gmt is. So use that one and your queries will be faster.
- Check out the indexes on wp_comments table in Database Description on WordPress.org Codex.
- Read more about indexes in MySQL 5.0 Reference Manual
Results
Check out the load of our server. It dropped significantly once we added these tweaks.
Reduced load
Martin Viceník
Martin graduated as an engineer in Computer Science from Slovak Technical University in Bratislava. He grew up in Liptovský Mikuláš in northern Slovakia next to the beautiful Tatra mountains. He is the developer behind our FV Player.
Some good advice here but I disagree with top#1 : the proper way to deal with a big table is not to delete the data… wordpress bug thread says the same : the post table may grow big anyway, this shouldn’t be a problem… .. if your tables uses indexes properly and if there’s no problem with the queries it doesn’t matter that your table has 500k rows or 5000, the SQL engine is built so that it shouldn’t scan the 500k rows. A proper way to measure performance gains would have been to test with and without each tip, if possible more than once, with the same load.The graph with your server load through time is far from enough
Hello Micky,
I agree with you that DB engines are built to operate on big tables. The problem is that 0.1sec or 0.2sec makes no difference for MySQL. They are both fast queries. We think that under high load, this makes a difference. And deleting old revisions and other unimportant data is no problem at all.
Hi, I wrote an article on slow performing meta data you might be interested in reading… nathanfranklin.com.au/coding/experiments-with-meta
I discusses some of the problems with meta data queries and some ways around them that compliments this article.
Warning: Undefined array key 1 in /srv/users/fvision-staging/apps/staging-foliovision/public/site/wp-content/plugins/thoughtful-comments/fv-thoughtful-comments.php on line 1449
I know, that this is out of topic. I have also web site advokator.cz/ I am going publish there 50 000 pages and database is quite big. cca 350 MB. But this things are trying to solve problem, what can fix this plugin … wordpress.org/plugins/wp-super-cache/ It will create in my case 50000 html pages and they not will be uploaded from database anymore. After this, pages should be very vast to load. What do you think about it?
Hi Advokat,
Thanks for stopping by.
You’ll want to read our post about WordPress caching plugins. Hyper Cache is the winning ticket.