Saturday, December 09, 2006

PHP Optimization Tricks

There are a number of tricks that you can use to squeeze the last bit of performance from your scripts. These tricks won't make your applications much faster, but can give you that little edge in performance you may be looking for. More importantly it may give you insight into how PHP internals works allowing you to write code that can be executed in more optimal fashion by the Zend Engine. Please keep in mind that these are not the 1st optimization you should perform. There are some far easier and more performance advantageous tricks, however once those are exhausted and you don't feel like turning to C, these maybe tricks you would want to consider. So, without further ado...
1) When working with strings and you need to check that the string is either of a certain length you'd understandably would want to use the strlen() function. This function is pretty quick since it's operation does not perform any calculation but merely return the already known length of a string available in the zval structure (internal C struct used to store variables in PHP). However because strlen() is a function it is still somewhat slow because the function call requires several operations such as lowercase & hashtable lookup followed by the execution of said function. In some instance you can improve the speed of your code by using a isset() trick.

Ex.
if (strlen($foo) < href="http://php-docs.blogspot.com/">PHP optimizer. It is a still a good idea to keep in mind since not all opcode optimizers perform this optimization and there are plenty of ISPs and servers running without an opcode optimizer.

3) When it comes to printing text to screen PHP has so many methodologies to do it, not many users even know all of them. This tends to result in people using output methods they are already familiar from other languages. While this is certainly an understandable approach it is often not best one as far as performance in concerned.

print vs echo

Even both of these output mechanism are language constructs, if you benchmark the two you will quickly discover that print() is slower then echo(). The reason for that is quite simple, print function will return a status indicating if it was successful or not, while echo simply print the text and nothing more. Since in most cases (haven't seen one yet) this status is not necessary and is almost never used it is pointless and simply adds unnecessary overhead.

printf

Using printf() is slow for multitude of reasons and I would strongly discourage it's usage unless you absolutely need to use the functionality this function offers. Unlike print and echo printf() is a function with associated function execution overhead. More over printf() is designed to support various formatting schemes that for the most part are not needed in a language that is typeless and will automatically do the necessary type conversions. To handle formatting printf() needs to scan the specified string for special formatting code that are to be replaced with variables. As you can probably imagine that is quite slow and rather inefficient.

heredoc

This output method comes to PHP from PERL and like most features adopted from other languages it's not very friendly as far as performance is concerned. While this method allows you to easily output large chunks of text while preserving things like newlines and even allow for variable handling inside the text block this is quite slow and there are better ways to do that. Performance wise this is just marginally faster then printf() however it does not offer nearly as much functionality.

?> <?

When you need to output a large or even a medium sized static bit of text it is faster and simpler to put it outside the of PHP. This will make the PHP's parser effectively skipover this bit of text and output it as is without any overhead. You should be careful however and not use this for many small strings in between PHP code as multiple context switches between PHP and plain text will ebb away at the performance gained by not having PHP print the text via one of it's functions or constructs.

4) Many scripts tend to reply on regular expression to validate the input specified by user. While validating input is a superb idea, doing so via regular expression can be quite slow. In many cases the process of validation merely involved checking the source string against a certain character list such as A-Z or 0-9, etc... Instead of using regex in many instances you can instead use the ctype extension (enabled by default since PHP 4.2.0) to do the same. The ctype extension offers a series of function wrappers around C's is*() function that check whether a particular character is within a certain range. Unlike the C function that can only work a character at a time, PHP function can operate on entire strings and are far faster then equivalent regular expressions.
Ex.
preg_match("![0-9]+!", $foo);
vs
ctype_digit($foo);

5) Another common operation in PHP scripts is array searching. This process can be quite slow as regular search mechanism such as in_array() or manuall implementation work by itterating through the entire array. This can be quite a performance hit if you are searching through a large array or need to perform the searches frequently. So what can you do? Well, you can do a trick that relies upon the way that Zend Engine stores array data. Internally arrays are stored inside hash tables when they array element (key) is the key of the hashtables used to find the data and result is the value associated with that key. Since hashtable lookups are quite fast, you can simplify array searching by making the data you intend to search through the key of the array, then searching for the data is as simple as $value = isset($foo[$bar])) ? $foo[$bar] : NULL;. This searching mechanism is way faster then manual array iteration, even though having string keys maybe more memory intensive then using simple numeric keys.

Ex.

$keys = array("apples", "oranges", "mangoes", "tomatoes", "pickles");
if (in_array('mangoes', $keys)) { ... }

vs

$keys = array("apples" => 1, "oranges" => 1, "mangoes" => 1, "tomatoes" => 1, "pickles" => 1);
if (isset($keys['mangoes'])) { ... }

The bottom search mechanism is roughly 3 times faster.