Smelly PHP code

Adam Culp posted the third article in his Clean Development Series this week, Dirty Code (how to spot/smell it). When you read it, you should keep in mind that he is pointing out practices which correlate with poorly written code, not prescribing a list of things to avoid. It’s a good list of things to look for, and it engendered quite a discussion in our internal Musketeers IRC.

Comments are valuable

Using good names for variables, functions, and methods does make your code self-commenting, but oftentimes that is not sufficient. Writing good comments is an art: too many comments get in the way, but a lack of comments is just as bad. Code can be dense to parse, and a comment will help you out. Comments also let you quickly scan through a longer code block, just skimming them, to find EXACTLY the bit you need to change/debug/fix/etc. Of course, you can also get that last benefit by breaking up large blocks of code into functions.

Comments should not explain what the code does, but should capture the “why” of how you are solving a problem. For example, if you’re looping over something, a bad comment is “// loop through results”, while a good comment is “// loop through results and extract any image tags”.
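To make that concrete, here’s a small sketch (the $results variable and the regular expression are just placeholders for illustration):

{syntaxhighlighter PHP}
<?php
// Bad: restates what the code plainly does
// loop through results
foreach ($results as $html) {
    // ...
}

// Better: explains why we are looping at all
// Loop through results and extract any image tags so we can
// queue the images for download later.
$images = array();
foreach ($results as $html) {
    if (preg_match_all('/<img[^>]+>/i', $html, $matches)) {
        $images = array_merge($images, $matches[0]);
    }
}
{/syntaxhighlighter}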

Using Switch Statements

You definitely should not take this item in his list to mean that “Switch statements are evil.” You could have equally bad code if you use a long block of if/elseif/else statements. If you’re using them within a class, you’re better off using polymorphism, as he suggests, or maybe looking at coding to an interface instead of coding around multiple implementations.
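As a rough sketch of what that refactoring can look like (the class and method names here are invented for illustration), the conditional moves behind an interface and each case becomes its own implementation:

{syntaxhighlighter PHP}
<?php
// Before: every new format means editing this switch
function render($format, array $data) {
    switch ($format) {
        case 'json':
            return json_encode($data);
        case 'csv':
            return implode(',', $data);
        default:
            return serialize($data);
    }
}

// After: code to an interface and add new formats by adding classes
interface Formatter {
    public function format(array $data);
}

class JsonFormatter implements Formatter {
    public function format(array $data) {
        return json_encode($data);
    }
}

class CsvFormatter implements Formatter {
    public function format(array $data) {
        return implode(',', $data);
    }
}

// The caller no longer cares which implementation it was handed
function renderWith(Formatter $formatter, array $data) {
    return $formatter->format($data);
}
{/syntaxhighlighter}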

Other code smells

In reviewing the article, I thought of other smells that indicate bad code. Some are minor, but if frequent, you know you’re dealing with someone who knows little more than to copy-and-paste code from the Interwebs. These include:

  • Error suppression with @. There are very, very, very few cases where it’s OK to suppress an error instead of handling the error or preventing it in the first place.
  • Using superglobals directly. Anything in $_GET, $_POST, $_REQUEST, or $_COOKIE should be filtered and validated before you use it. ’Nuff said.
  • Deep class hierarchy. A deep class hierarchy likely means you should be using composition instead of inheritance to change class behaviors.
  • Lack of Prepared DB Statements. Building SQL queries as strings instead of using PDO or the mysqli extension’s prepared statements can open up SQL injection vulnerabilities (see the sketch after this list).
  • Antiquated PHP Practices. A catch-all for things we all did nearly a decade ago: depending on register_globals being on, using “or die()” to catch errors, and using the mysql_* functions. PHP has evolved; there’s no reason for you not to evolve with it.
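As a quick sketch of the filtering and prepared-statement points above (the database credentials, table, and columns are made up for illustration), filter the incoming value and bind it through a prepared statement instead of concatenating it into the SQL:

{syntaxhighlighter PHP}
<?php
// Filter and validate user input instead of reading $_GET directly
$userId = filter_input(INPUT_GET, 'user_id', FILTER_VALIDATE_INT);
if ($userId === false || $userId === null) {
    // Handle the bad input however your application reports errors
    throw new InvalidArgumentException('Invalid user id');
}

// Bind the value through a prepared statement instead of
// building the SQL query as a string
$pdo = new PDO('mysql:host=localhost;dbname=example', 'dbuser', 'dbpass');
$stmt = $pdo->prepare('SELECT name, email FROM users WHERE id = :id');
$stmt->execute(array(':id' => $userId));
$user = $stmt->fetch(PDO::FETCH_ASSOC);
{/syntaxhighlighter}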

That’s generally what I look for when evaluating code quality. What are some things I missed?

Automating FTP uploads

I needed to automate copying files for a website that I was building. Since this site was hosted on an inexpensive shared hosting plan, I didn’t have the luxury of shell or rsync access to automate copying files from my local development environment to the host. The only option was FTP, and after wasting too much time manually tracking down which files I needed to update, I knew I needed an automated solution. Some googling led me to lftp, a command-line and scriptable FTP client. It should be available via your distribution’s repository. Once installed, you can use a script like the one below to automatically copy files.

{syntaxhighlighter BASH}
#!/bin/sh
# paths
HERE=`pwd`
SRC="$HERE/web"
DEST="~/www"
# login credentials
HOST="ftp.commodity-hosting.com"
USER="mysiteusername"
PASS="supersecretpassword"
# FTP files to remote host
lftp -c "open $HOST
user $USER $PASS
mirror -X img/* -X .htaccess -X error_log --only-newer --reverse --delete --verbose $SRC $DEST"
{/syntaxhighlighter}

The script does the following:

  • Copies files from the local ./web directory to the remote ~/www directory.
  • Uses $HOST, $USER, and $PASS to log in, so make sure the script is readable, writeable, and executable only by you and trusted users.
  • The lftp command connects and copies the files. The -c switch specifies the commands to issue, one per line. The magic happens with the mirror command, which does the actual copying. Since we added the --only-newer and --reverse switches, it will upload only files which have changed.
  • You could be a little safer and remove the --delete switch, which removes files from the destination that are not on your local machine.
  • You can use the -X switch to give it glob patterns to ignore. In this case, it won’t touch the img/ directory, the .htaccess file, or the error_log file.

If you’re still moving files over FTP manually, even with a good GUI, it’ll be worth your time to automate it and make it a quicker, less error-prone process.

Back to SQL it is

An honest write-up with first-hand details of the shortcomings of CouchDB in production. There’s a reason to stick with proven technologies and not simply chase the latest shiny thing. Not saying Sauce Labs did that, just sayin’.

This post describes our experience using CouchDB, and where we ran into trouble. I’ll also talk about how this experience has affected our outlook on NoSQL overall, and how we designed our MySQL setup based on our familiarity with the positive tradeoffs that came with using a NoSQL database.

From: Goodbye, CouchDB | Selenium Testing? Do Cross Browser Testing with Sauce Labs

More programmers != more productivity

Carl Erickson observes that a small, boutique team of developers can be massively more productive than a larger team.

To complete projects of 100,000 equivalent source lines of code (a measure of the size of the project) they found the large teams took 8.92 months, and the small teams took 9.12 months. In other words, the large teams just barely (by a week or so) beat the small teams in finishing the project!

It’s immediately reassuring to see those numbers, since I’ve been on enough projects where, once they start falling behind, the temptation to throw more programmers at them grows. Project managers see it as a resource scarcity problem (not enough programmers) and don’t realize the coordination and communication burden they are adding by bringing more people onto a project. Now you have a new group of programmers who need to be brought up to speed, learn the codebase, and accept design decisions that have already been made. Your lead programmers won’t have as much time to actually program, since they’ll be helping bring everyone else up to speed. Developers have known about this for years; Fred Brooks wrote the book on it: The Mythical Man-Month.

But while the study’s conclusion is reassuring, I wonder if there are other factors at work. There’s an obvious selection bias in the type of people who go to work at a large IT programming department/shop versus those who choose to work solo or in smaller teams. Are large teams filled with junior 9-to-5 programmers who just want a steady job and punch out in the evening? Do smaller teams attract more experienced and productive people who prefer to work smarter rather than harder? From the study summary, it doesn’t look like they considered this aspect.

What’s in your Project Management toolbox?

Matthew at DogStar describes his PM toolbox today in The Project Management Tool Box | Opensource, Nonprofits, and Web 2.0. It’s a detailed and well-organized list, and I think it reflects a very practical approach. The first thing that strikes me is the overwhelming number of tools available to the would-be PM. Certainly, there is no lack of tools out there.

You see, the general feeling is, there is no silver bullet. There is no grail of a tool that does everything a single Web Producer, Project Manager, Product Manager, or Content Manager might need or want. There is clearly a gap that is filled with a series of different products. This walked hand in hand with a desire to review processes at work and engage in course corrections. It is an excellent habit to follow – look what you are doing with a critical eye, analyse why you are doing it, and make changes as needed. I have worked across four different shops with a wide variety of different ways of practicing project management. I have used these methodologies and tools across ~ 50 different Drupal projects and another 25 or so custom PHP MySQL projects.

I could not agree more that it’s important not to be seduced into trying to pick the one right tool for every situation. It is a difficult temptation to resist, especially when you have tool makers pushing to sell you a solution. The best tool for the job isn’t the one that has the most features; it’s the one that you, and your team, end up using the most.

As I read the article, a thought that struck me is that sometimes you don’t need ONE tool; you just need to make sure everyone has the right tools (and skills) to be productive and responsible. At work, we’re a tiny team of three who deal with day-to-day management of our Drupal site, unexpected requests on tight deadlines, and long-term projects to build new features. Here’s a secret: we don’t have a central bug/ticket tracking tool. We can be productive simply with email, IM, code bearing, and face-to-face conversations. For big projects we use a whiteboard to wireframe, capture tasks, and track progress. This works better than a more sophisticated technical solution that would impose a greater burden on our time.

What’s your experience with tools and grappling with finding the perfect tool?

Have you heard of Devops?

Seems to be the next big thing in software processes land. So, hire competent people and try to get out of the way.

These things are all the basics you pick up by reading Learn How Not to be a Complete Failure at Software Development in 24 Hours. None of it will make your developers any less prone to do stupid shit, and none of it will prevent your systems administrators from roadblocking developers just for funsies.

Devops Is a Poorly Executed Scam

Why switch to git?

Get ready to clone.

Clone … What could possibly go wrong?

If you’re a coder, you’ve already heard about distributed version control systems (DVCS) and git in particular. I was content, almost complacent, in my usage of Subversion to manage my source code, both for personal projects and at work. Subversion was intended as a “compelling” upgrade for CVS, and the next version control system (VCS) I used had to be just as compelling. I think git has cemented itself as the preferred DVCS within the Open Source community; no doubt services like GitHub have helped popularize it over bzr and Mercurial. I’ve switched over to git over the course of the last month, and these are some of my motivations and a record of what I learned along the way.

Why not stick with Subversion?

Subversion, since it has been around longer, enjoys better support in text editors, IDEs, and GUI front ends. It’s easy to set up a repository that you can share with a collaborator, or to sign up for a hosted repository. Its killer feature is the ability to fetch parts of a remote repository via svn:externals. I’d used that extensively to organize projects and track third-party software such as Pressflow or Zend Framework. Git does not have an alternative as straightforward as svn:externals.

Over the last year I noticed that I wasn’t committing code as often as I should, resulting in a lot of conflicts when I did try to do an inevitably large commit. Ideally, I would want a lot of frequent and small commits. When I tried to use branching and merging in Subversion, it was tedious, and I felt that one misstep would result in lost work or a bad merge. Even though merge tracking was added to Subversion in version 1.5, my early exposure to it has caused me to avoid it like the plague. That’s a shame, because branching and merging is very useful for keeping new features from mixing in with mainline development until they are ready. Finally, at work we’d moved to a repo hosted on assembla.com and lost some useful pre-commit hooks for keeping PHP syntax errors and debugging code from being committed. A year later, it seems our group is the only one still using this repo, so I’ve been contemplating moving back to one under our control.

I know one option would be to use git-svn to work with git locally and push changes to the Subversion repo, but I preferred to ditch the latter wholesale and not worry about having two systems in place. There’s enough to keep track of as it is, and version control should simplify things, not complicate them.

Git has quite a learning curve

Understanding the terminology of git is confusing, particularly if you’re used to working with Subversion. It’s not just your working copy and the repository anymore – now there’s a stash, the index where your commits are staged, the local repository, and the remote (or origin) repository. A good front-end can really help; if you are on a Mac I’d recommend using one of the active forks of GitX. This interactive cheat sheet also helped me to figure out git and what some of its commands do (Hint: click on the colored boxes). I think you’d be better off if you could somehow forget everything you ever learned about Subversion.

Your work flow has to change too, since you now have to remember to push (that is, send) your commits to a shared repository, and also to fetch and merge (or pull) the commits that your colleagues have pushed there. A good front-end or first-class support in your editor/IDE of choice really helps here.

Git is useful once muscle memory kicks in

Once we initialized our repository and started committing changes, it didn’t take more than a day to get comfortable with using git. There were a couple of instances where I was jumping to my co-worker’s cubicle to figure out how to send my changes to him, or to learn about the difference between git’s fetch and pull commands. Google and StackOverflow were most helpful in finding answers to our questions and explaining the commands we needed to invoke.

I’ve found myself committing quite frequently again, after finishing small and discrete fixes and features. In the long run, having a more fine-grained record of changes will be useful. As promised, git also makes branching and merging extraordinarily easy. It’s trivial to be working on a new feature branch, switch to the mainline development branch to commit a minor fix, then switch back to your branch and merge in the fix you just committed. It takes just a handful of commands, and you don’t need to worry about version numbers or merge direction; it just happens. I’ve found myself doing that two or three times a day without worrying that I was one command-line switch away from total devastation. Another cool feature we found in git, via the GitX front end, is the ability to stage and commit just some of the changes within a file.

If you’ve recently switched, or are contemplating a switch, to using git, please share your thoughts and experiences below. In the near future, I’ll be sharing the details of the work flow we’ve established and how to take advantage of git’s hooks to do useful things before and after commits.
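If you want a taste before that post, here is a minimal sketch of the kind of pre-commit hook I mean, written in PHP (the shebang path, file extensions, and checks are illustrative assumptions, not the exact hook we had). It rejects commits containing PHP syntax errors or leftover var_dump()/print_r() calls.

{syntaxhighlighter PHP}
#!/usr/bin/php
<?php
// Minimal pre-commit hook sketch: save as .git/hooks/pre-commit and
// make it executable. Adjust the shebang path to your PHP binary.

// Files staged for this commit (Added, Copied, or Modified)
exec('git diff --cached --name-only --diff-filter=ACM', $files);

$errors = array();
foreach ($files as $file) {
    // Only check PHP files (adjust the extensions to taste)
    if (!preg_match('/\.(php|inc)$/', $file)) {
        continue;
    }

    // php -l exits non-zero on a syntax error
    $output = array();
    exec('php -l ' . escapeshellarg($file) . ' 2>&1', $output, $status);
    if ($status !== 0) {
        $errors[] = implode("\n", $output);
    }

    // Flag obvious leftover debugging code (checks the working copy
    // of the file, which is usually good enough)
    if (preg_match('/\b(var_dump|print_r)\s*\(/', file_get_contents($file))) {
        $errors[] = $file . ' contains debugging code (var_dump/print_r)';
    }
}

if (!empty($errors)) {
    fwrite(STDERR, implode("\n", $errors) . "\n");
    exit(1); // a non-zero exit status aborts the commit
}
exit(0);
{/syntaxhighlighter}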

Using git to deploy website code

Joe Maller outlines a straightforward system for using git to manage code from development copies and branches through production. Deployment to the live site is automated, but I’d be worried about broken or unreviewed code getting deployed unintentionally. I think the best way to prevent that is to have live be its own branch, and then push changes to the live branch only once they’ve been reviewed, tested, and blessed.

While his approach doesn’t require moving or redeploying the live site, I don’t think that works when you’re using Drupal. You’ll want to track Drupal core, sites/all, and your local site’s folders in different branches, per this setup.

The key idea in this system is that the web site exists on the server as a pair of repositories; a bare repository alongside a conventional repository containing the live site. Two simple Git hooks link the pair, automatically pushing and pulling changes between them.

A web-focused Git workflow | Joe Maller

isolani – Javascript: Breaking the Web with hash-bangs

Isolani provides a thorough dissection of how the new-fangled #! URLs you are seeing all over the newest sites on the web are prone to breaking both the web experience and the site itself. I think I see a hint of “we-know-better” from the developers rushing out these new sites and redesigns. HT: Jason Lefkowitz

Gawker, like Twitter before it, built their new site to be totally dependent on JavaScript, even down to the page URLs. The JavaScript failed to load, so no content appeared, and every URL on the page was broken. In terms of site brittleness, Gawker’s new implementation got turned up to 11.

isolani – Javascript: Breaking the Web with hash-bangs

JSON supplanting XML

Lessons: use cases matter, and programmers (the users in this case) will choose tools that are both simple, in that they are not complicated or over-engineered, and easy to use, requiring little setup and code to accomplish a task. For parsing data with PHP, contrast using something like SimpleXML or DOMDocument (which are light-years better than where we were in parsing XML just five years ago) with just calling json_encode() or json_decode().
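To illustrate the difference in effort (the document structure below is invented for the example), compare pulling values out of an XML payload with SimpleXML versus decoding the JSON equivalent:

{syntaxhighlighter PHP}
<?php
// XML: workable with SimpleXML, but you still think in elements,
// attributes, and casts
$xml = simplexml_load_string('<user><name>Alice</name><tags><tag>php</tag><tag>git</tag></tags></user>');
$name = (string) $xml->name;
$tags = array();
foreach ($xml->tags->tag as $tag) {
    $tags[] = (string) $tag;
}

// JSON: one call and you are back to plain PHP arrays
$data = json_decode('{"name":"Alice","tags":["php","git"]}', true);
$name = $data['name'];
$tags = $data['tags'];
{/syntaxhighlighter}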

In particular, JSON shines as a programming language-independent representation of typical programming language data structures.  This is an incredibly important use case and it would be hard to overstate how appallingly bad XML is for this.

James Clark’s Random Thoughts: XML vs the Web