Optimizing Using Xdebug and KCacheGrind
Introduction
Recently it came to light that one of our products was going to receive a lot more traffic. Our client planned to put about ten times as many users on the system, and we had already run into performance problems at the current load. The original application wasn't all that well designed; we made a lot of mistakes, and we learned a lot about product development on this particular application.
Horizontal scaling (adding more servers) wouldn't solve our problem: there are far too many bottlenecks in the database, so more application servers don't really help us. That leaves vertical scaling (upgrading or replacing servers), which brings with it another element: cost. Not wanting to spend a ton of money replacing our already very expensive servers, we turned our eyes to the code itself.
Installing Xdebug and KCacheGrind
How does one find these problems? My first thought was to profile the code. Xdebug (http://xdebug.org/) provides a pretty good profiler and exports its results to cachegrind files, which can be analyzed using KCacheGrind (http://kcachegrind.sourceforge.net/). This combination of open source projects is very useful. You simply install the Xdebug extension, enable it, and then edit your Xdebug settings. I use mostly default settings; you can enable the profiler with the following setting:
xdebug.profiler_enable="1"
Xdebug has an enormous number of options, and the default configuration provided with the extension is pretty solid. There's a page on the Xdebug website outlining all the settings and their uses (http://xdebug.org/docs/all_settings).
First Run
Once you have installed Xdebug and KCacheGrind, run your application, and then look in /tmp to see if you have a cachegrind output file. For example:
root@localhost /tmp # ls -al -r cachegrind.out.*
-rw-r--r-- 1 apache apache 105863 Jun 17 12:35 cachegrind.out.510234603
If you do not have a file similar to this, make sure the extension is loaded. If you are using a web server such as Apache, remember to restart the daemon after installing Xdebug. If you're using the command line, simply running php -v should tell you whether or not Xdebug is installed and working. For example:
root@localhost /etc/php/cli-php5/ext-active # php -v
PHP 5.2.2-pl1-gentoo (cli) (built: May 25 2007 12:34:43)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies
with Xdebug v2.0.0RC3, Copyright (c) 2002, 2003, 2004, 2005, 2006, 2007, by Derick Rethans
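The same check can also be scripted, which is handy when you cannot easily get a shell on the machine. The following is a minimal sketch, not part of the original setup: xdebug_status() is a hypothetical helper name, and it assumes the Xdebug 2 setting name used in this article (xdebug.profiler_enable).

```php
<?php
// Hypothetical helper: summarize whether Xdebug is loaded and whether
// the profiler is switched on. Uses only core PHP functions.
function xdebug_status()
{
    $loaded = extension_loaded('xdebug');

    // ini_get() returns false when the setting does not exist at all.
    $profiler = $loaded ? ini_get('xdebug.profiler_enable') : false;

    return array(
        'loaded'   => $loaded,
        'version'  => $loaded ? phpversion('xdebug') : null,
        'profiler' => ($profiler !== false && (bool) $profiler),
    );
}

$status = xdebug_status();
echo 'Xdebug loaded: ', $status['loaded'] ? 'yes' : 'no', "\n";
if ($status['loaded']) {
    echo 'Xdebug version: ', $status['version'], "\n";
}
echo 'Profiler enabled: ', $status['profiler'] ? 'yes' : 'no', "\n";
```

Run it from the command line to check the CLI configuration, or request it through the web server to check that configuration; the two often use different ini files.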
If you are using a web server for your application, Xdebug will show up in the output of a phpinfo() call if it is installed properly. You can also use this to see if the profiler is properly enabled. It should look something like this (there are many more options than I show here):
[Screenshot: the Xdebug section of the phpinfo() output]
An Example
I have created a simple script that loops 10,000 times, printing to the screen, performing arithmetic, generating random numbers, and writing to a file. The entire purpose of this noisy little script is to do a whole bunch of stuff and give us an opportunity to see which parts take the longest, using Xdebug and KCacheGrind. I considered profiling an actual application, but I felt it would be too cumbersome and would make the idea harder to illustrate. Below is my example.
profile1.php
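<?php
define('NUM_LOOPS', 10000);

// Arithmetic on random numbers; the result is discarded on purpose.
function complex_calculation()
{
    // A lower bound of 1 on the divisor avoids a rare modulo-by-zero.
    return 1.034587763 * mt_rand() % mt_rand(1, mt_getrandmax());
}

// Screen output.
function print_something()
{
    echo "something";
}

// File output: reopens (and truncates) test.tmp on every call.
function write_something()
{
    $fp = fopen("test.tmp", "w");
    fwrite($fp, "something_important");
    print_something();
    fclose($fp);
}

for ($i = 0; $i < NUM_LOOPS; $i++)
{
    complex_calculation();
    print_something();
    write_something();
}
?>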
Once you have loaded the cachegrind output file into KCacheGrind, you should be presented with a few panels: a list view, a panel with a bunch of tabs, and a tree view that provides a very useful graphical representation of where CPU time is being spent. In the list (and the graph) you should notice an item labeled {main}. This is the all-inclusive element that covers the total execution of the program you're profiling, so it should show as 100% of your CPU usage. Here is the list view:
[Screenshot: the KCacheGrind list view]
Here's a breakdown of each column:
- Incl.: The total CPU time spent in this function and in every function it called.
- Self: The CPU time spent in this function only, not counting time spent in the functions it called.
- Called: Total number of times this function was called.
- Function: Name of this function.
- Location: Script file containing this function.
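The difference between Incl. and Self is easiest to see with one function that does almost nothing except call another. The script below is a small sketch for that purpose; inner_work() and outer_work() are hypothetical names that do not appear in the example application. Profiled with Xdebug, you would expect outer_work() to show a high Incl. value but a low Self value, while inner_work() shows roughly the same value in both columns.

```php
<?php
// Hypothetical function that burns some CPU in its own body,
// so nearly all of its inclusive time is also Self time.
function inner_work()
{
    $sum = 0;
    for ($i = 0; $i < 200000; $i++) {
        $sum += $i % 7;
    }
    return $sum;
}

// Hypothetical function with almost no work of its own: its Self time
// is tiny, but its Incl. time contains everything inner_work() spends.
function outer_work()
{
    return inner_work() + 1;
}

// Wall-clock timing around the outer call corresponds roughly to the
// inclusive figure for outer_work().
$start = microtime(true);
$result = outer_work();
$elapsed = microtime(true) - $start;

printf("outer_work() returned %d in %.4f seconds\n", $result, $elapsed);
```

The same pattern shows up in profile1.php: write_something() calls print_something() and the file functions, so its Incl. value covers all of them.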
Next is the graph/table view. The nice thing about this feature is that it shows major choke points in a very noticeable format. Typically, this graph leaves out calls that are tiny compared to the rest of the application. Very useful. If you select the {main} box in the graph, you should see something very close to this:
[Screenshot: the KCacheGrind graph with {main} selected]
Even an untrained eye can probably guess that write_something() is the slowest part of this application. Double-click the write_something() box to make it the new focal point of the graph. With that selected, you should see something similar to this:
[Screenshot: the KCacheGrind graph centered on write_something()]
With write_something() centered, you can see what called this function (100% of the time, it is {main}) and how much time is spent in the functions that write_something() calls. Of the time spent in write_something(), most goes to opening the file, and less is spent in fwrite().
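As a hedged illustration of how a profile like this might guide a change, the sketch below opens the file once and reuses the handle, so fopen() and fclose() run once instead of 10,000 times. It is not a change made in the original script: write_something_to() is a hypothetical name, only the file-writing path of the loop is shown, and whether it is actually faster on a given system would need to be confirmed by profiling again.

```php
<?php
define('NUM_LOOPS', 10000);

function print_something()
{
    echo "something";
}

// Hypothetical variant of write_something() that reuses an open handle
// instead of calling fopen()/fclose() on every iteration.
function write_something_to($fp)
{
    // Mode "w" truncated the file on every open; emulate that so the
    // file still ends up holding a single copy of the string.
    ftruncate($fp, 0);
    rewind($fp);
    fwrite($fp, "something_important");
    print_something();
}

$fp = fopen("test.tmp", "w");
if ($fp === false) {
    exit("could not open test.tmp\n");
}

for ($i = 0; $i < NUM_LOOPS; $i++)
{
    write_something_to($fp);
}

fclose($fp);
```

Loading the new cachegrind file into KCacheGrind and comparing the write_something_to() box against the earlier write_something() box is the way to tell whether the change helped.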
Conclusion
I could go into details about optimizing my example script, but I feel the point has been made. Xdebug and KCacheGrind work together as a very powerful tool for optimizing a PHP application. KCacheGrind has many features, and I encourage everyone to explore them all. It can also track memory usage, but in my experience that does not work perfectly.
Good luck!
