Refactoring Your PHP Code

This article describes in a nutshell what I learned from the Refactoring Workshop with Lars Jankowfsky and Thorsten Rinne at the International PHP Conference 2009 Spring Edition in Berlin some weeks ago. The main focus was on refactoring and test-driven development, which I had always wanted to do but never actually did. I had already installed PHPUnit some years ago after attending a session with Sebastian Bergmann, but knowing how to install and use PHPUnit does not necessarily mean that you know how refactoring works and what you need to focus on.

So attending that workshop was very helpful for me, and I now know that continuous refactoring is essential. It doesn’t make sense to refactor a project once and then refactor it again a year later, after a whole year of adding new features and fixing bugs. You always have to refactor your code while developing. When you begin refactoring, it is essential not to start a large refactoring project that takes a year to complete, because this is expensive and doesn’t create any value for the end user, i.e. the visitor of your website or your customer. Instead you should allocate a specific share of your time, e.g. 20-30%, to refactoring your code.

I won’t go into the complete details because that would be far more than I want to cover in this article. Also remember that I have not yet used unit testing myself, so this article reflects just what I have learned so far.

Refactoring is better than rewriting your code because rewriting normally takes too long and is therefore too expensive. I personally love to rewrite, but I know that it may take very long, and if something more important comes up (something that is directly related to revenue) I may stop rewriting and focus on the more important project instead. Just remember that starting a project from scratch once again may feel good, but most of the time it takes much more time than you anticipated at the beginning.

I have learned that combining refactoring with practices like pair programming is very helpful, especially when a senior developer works together with a less experienced developer, so that the less experienced developer learns new techniques and gets to know the code much better.

When Should I Refactor?

If a method has so many lines of code that you cannot understand its purpose within a short time, or if a constructor takes more than around five parameters, it is usually time to refactor. Long methods can almost always be broken down into several smaller methods, which also makes it easier to write unit tests for the code.
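To illustrate what breaking down a long method can look like, here is a minimal sketch. The invoice example and all function names are invented for this article and are not taken from the workshop; the point is only that each small function does one thing and can be tested on its own.

```php
<?php
// Hypothetical example: every name below is made up for illustration.

// Before: one function filters, sums and formats in a single block.
function invoiceSummaryLong(array $items): string
{
    $total = 0.0;
    foreach ($items as $item) {
        if ($item['quantity'] > 0) {
            $total += $item['quantity'] * $item['price'];
        }
    }
    return 'Total: ' . number_format($total, 2, '.', '');
}

// After: the same behaviour, split into three small steps.

// Keeps only the items that have a positive quantity.
function billableItems(array $items): array
{
    return array_values(array_filter($items, function (array $item): bool {
        return $item['quantity'] > 0;
    }));
}

// Adds up quantity * price for all given items.
function sumItems(array $items): float
{
    $total = 0.0;
    foreach ($items as $item) {
        $total += $item['quantity'] * $item['price'];
    }
    return $total;
}

// Formats a total with two decimals.
function formatTotal(float $total): string
{
    return 'Total: ' . number_format($total, 2, '.', '');
}

// The refactored entry point just chains the small steps.
function invoiceSummary(array $items): string
{
    return formatTotal(sumItems(billableItems($items)));
}
```

Both versions return the same string for the same input, but after the split a test can target the filtering, the summing or the formatting separately.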

Test-Driven Development

Test-driven development is the way to go; it came up again and again in many different sessions at the conference. You need to distinguish between unit tests and acceptance tests, however. To test your layout and GUI (i.e. the view) you should use tools such as Selenium rather than unit tests.

Tests should be regarded as part of the documentation because they actually help document your code.

So how to start? First of all, never start with the easy parts; move the risks to the beginning. It’s not helpful if you have created 100 tests which may have taken only 30 minutes to complete but the one single test that’s missing requires 30 hours to create.

When refactoring, always create a test for the existing code first and only then change the code, so that you can be sure that you did not introduce a new bug during refactoring.
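A rough sketch of this order of work, using plain PHP instead of PHPUnit so that it runs on its own. The discount functions are hypothetical and only serve as an example: the check is written against the old code first and then reused unchanged against the refactored version.

```php
<?php
// Hypothetical legacy code: names and rules are invented for illustration.
function legacyDiscount($amount, $isMember)
{
    if ($isMember == true) {
        if ($amount > 100) {
            return $amount - intdiv($amount, 10);
        } else {
            return $amount;
        }
    } else {
        return $amount;
    }
}

// Step 1: pin down what the existing code does today.
// The function under test is passed in, so the very same expectations
// can be run again after the refactoring.
function discountBehaviourHolds(callable $discount): bool
{
    return $discount(200, true) === 180
        && $discount(100, true) === 100
        && $discount(200, false) === 200;
}

// Step 2: refactor, then run the same check against the new version.
function discount(int $amount, bool $isMember): int
{
    $qualifies = $isMember && $amount > 100;
    return $qualifies ? $amount - intdiv($amount, 10) : $amount;
}
```

If `discountBehaviourHolds('legacyDiscount')` and `discountBehaviourHolds('discount')` both return true, the refactoring did not change the behaviour that the check covers.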

How To Create Testable Code

With test-driven development you first create the raw skeleton of the method you want to test (i.e. the method that will implement the feature) and then write the first test, which will of course fail because the method under test does not contain any code yet.
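The sequence could be sketched like this. The slug function is a hypothetical example of mine, and the skeleton and the finished version are shown side by side under two names only so that both can be run; in a real project they would be the same function at two points in time.

```php
<?php
// Hypothetical example: function names are invented for illustration.

// Step 1: the raw skeleton. It can be called, but it does nothing useful yet.
function slugifySkeleton(string $title): string
{
    return '';
}

// Step 2: the first test. It takes the function under test as a parameter
// and returns true only if the expected slug comes back.
function testSlugifyLowercasesAndHyphenates(callable $slugify): bool
{
    return $slugify('Hello World') === 'hello-world';
}

// Step 3: just enough code to make that one test pass, nothing more.
function slugify(string $title): string
{
    return strtolower(str_replace(' ', '-', trim($title)));
}
```

Run against the skeleton the test reports a failure, which is exactly what you expect at that stage; run against the implemented function it passes.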

By focusing on just getting the code (the single method) to work you are no longer tempted to abstract and will just focus on implementing the feature that is required. So stop implementing features that you might need some time in the future but that, most of the time, would never be used. Just implement what is required, i.e. what your tests need to succeed.

You should never use global or superglobal variables in your methods because this makes your code hard to test. So don’t use $_GET, $_SESSION etc. in any of your methods at all and instead pass these values in as parameters.
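As a small sketch of the difference (the pagination functions are hypothetical and only for illustration): the first function reads $_GET itself, the second one receives the request data as a parameter and can therefore be called from a test with any array.

```php
<?php
// Hypothetical example: function names are invented for illustration.

// Hard to test: the function depends on the superglobal $_GET.
function currentPageFromGlobals(): int
{
    return isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
}

// Easier to test: the query values are passed in as a parameter.
function currentPage(array $query): int
{
    return isset($query['page']) ? max(1, (int) $query['page']) : 1;
}

// Only the outermost code of the application touches the superglobal:
// $page = currentPage($_GET);
```

In a test you can now simply call `currentPage(['page' => '3'])` without having to manipulate any global state.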

You also need to implement one test per result type that the method returns, i.e. if it returns false in one case and true in another you need to create two tests for that method.
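For a method that returns either true or false this could look like the following sketch. The validation rule and the names are assumptions of mine, and the tests are plain functions rather than PHPUnit test methods so that the example runs on its own.

```php
<?php
// Hypothetical example: names and the validation rule are invented.

// Returns true for 3 to 16 lowercase letters, digits or underscores.
function isValidUsername(string $name): bool
{
    return preg_match('/^[a-z0-9_]{3,16}$/', $name) === 1;
}

// One test for the "true" result ...
function testAcceptsValidUsername(): bool
{
    return isValidUsername('sascha_k') === true;
}

// ... and a separate test for the "false" result.
function testRejectsTooShortUsername(): bool
{
    return isValidUsername('ab') === false;
}
```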

Useful Tools

There are several tools available that help you with test-driven development and refactoring:

  • phpcpd which is a copy&paste detection tool
  • phpcs (PHP_CodeSniffer), which checks whether the code violates your coding guidelines
  • a coding style plugin for Eclipse
  • Zend Studio for Eclipse, which offers debugging, profiling and a good integration of the Zend Framework

Always test protected and private methods as well: if you only test public methods and a tested public method calls private or protected methods internally, you are unlikely to find the source of a failing test immediately. If you need to write tests for protected methods you can use the unittools to create a proxy class.
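I have not used the unittools myself, so the following is not their API but a hand-written sketch of the general idea of a proxy class: a subclass that exists only in the test code and exposes a protected method through a public one. The class names and the VAT rate are assumptions for illustration, and this approach works for protected methods only, not for private ones.

```php
<?php
// Hypothetical example: classes and the VAT rate are invented for illustration.

class PriceCalculator
{
    // Public method that relies on the protected helper below.
    public function grossPrice(int $netCents): int
    {
        return $netCents + $this->vat($netCents);
    }

    // Protected helper: not callable from outside the class hierarchy.
    protected function vat(int $netCents): int
    {
        return intdiv($netCents * 19, 100);
    }
}

// Proxy used by tests only: it makes the protected method reachable.
class PriceCalculatorProxy extends PriceCalculator
{
    public function callVat(int $netCents): int
    {
        return $this->vat($netCents);
    }
}
```

A test can then check `callVat()` directly, so a failure points at the helper itself and not only at `grossPrice()`.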

Static methods make creating tests extremely difficult (it may be possible using dependency injection). Maybe you do not have to use static methods at all?
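One way to read the dependency injection hint is sketched below, under my own assumptions: instead of calling something like a static `Mailer::send()` inside the method, the collaborator is passed into the constructor, so a test can hand in a harmless replacement. All names are hypothetical.

```php
<?php
// Hypothetical example: interface and class names are invented for illustration.

interface Mailer
{
    public function send(string $to, string $subject): bool;
}

// The class under test receives its mailer from outside
// instead of calling a static method internally.
class Signup
{
    private $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function register(string $email): bool
    {
        if (strpos($email, '@') === false) {
            return false; // obviously invalid address, nothing is sent
        }
        return $this->mailer->send($email, 'Welcome');
    }
}

// Replacement for tests: records the calls instead of sending mail.
class RecordingMailer implements Mailer
{
    public $sent = [];

    public function send(string $to, string $subject): bool
    {
        $this->sent[] = [$to, $subject];
        return true;
    }
}
```

With a static call hidden inside `register()` there would be no place to plug in the recording mailer, which is what makes such code hard to test.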

You do not need to create tests for methods which only use some of PHP’s built-in functions, like filesystem functions that load a file if it exists, because you want to test your own code, not PHP’s code.

Quite often, e.g. if you are using a database or some other dynamic data source, you need to make sure that you are really testing your PHP code and that the test won’t fail just because the database server is down. The underlying data must not change between test runs. In that case you have to create fixtures and/or mock objects that stand in for the dynamic source with static data in your tests.
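A minimal sketch of such a fixture, assuming the code talks to its data source through an interface. The interface, the classes and the article titles are all invented for illustration; in production another implementation of the same interface would query the real database.

```php
<?php
// Hypothetical example: all names and data are invented for illustration.

// The code under test only knows this interface, not the database.
interface ArticleSource
{
    public function findTitles(): array;
}

// Fixture: static, in-memory data that never changes between test runs.
class ArrayArticleSource implements ArticleSource
{
    private $titles;

    public function __construct(array $titles)
    {
        $this->titles = $titles;
    }

    public function findTitles(): array
    {
        return $this->titles;
    }
}

// The logic we actually want to test: take the first $limit titles.
function firstTitles(ArticleSource $source, int $limit): array
{
    return array_slice($source->findTitles(), 0, $limit);
}
```

The test now exercises only `firstTitles()` and keeps working even when no database server is reachable.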

You may also want to have a look at (PHP)YAML.

Checklist

Here is a short checklist I got to know at the workshop:

  1. Use phpcs to find errors
  2. Fix all errors
  3. Find duplicates with phpcpd (copy/paste lines)
  4. Create tests for old code (with refactoring if needed to remove dependencies, using mocks/fixtures)
  5. Refactor

I only scratched the surface in this article and I am very keen on beginning with test-driven development. I hope that I could share some insight into what I learned at the PHP conference regarding refactoring. For more in-depth information you should really attend one of the conferences – especially the workshops.

I am planning to share more information on setting up and using PHPUnit in one of the following articles on my blog so stay tuned.

Sascha Kimmel - Living The Web Experience Since 1996 by tricosmedia