Colorized Word Diffs

I’ve been finding myself doing a lot of copy and tech editing. I needed a way to annotate a PDF based on the changes I’d made to our markdown source. Eyeballing the changes was difficult, so I checked whether there was a way to do word-by-word diffs based on SVN output. Remember, SVN’s diff command shows line-by-line differences. But there is a GNU utility, wdiff, that will show you word-by-word diffs. With a bash alias, you can pipe svn output to it and see what’s been added or removed more clearly. Once you install wdiff, either from source or via your distribution’s package manager, create an alias (I use bash) that will colorize its output:

alias cwdiff="wdiff -n -w $'\033[30;41m' -x $'\033[0m' -y $'\033[30;42m' -z $'\033[0m'"

wdiff compares two files by default, but it can also parse diff output when given the -d flag. By piping svn diff to it, you can see the changes word for word. In the example below, I’m reviewing the changes made since a specific revision and using less to page the output; the -r flag keeps the colors intact.

svn diff -r 267 -x -w 05.chapter.markdown | cwdiff -d | less -r

Words that were deleted will be in red, additions are in green.
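
To preview what those escape sequences look like in your terminal before wiring them into the alias, a small sketch like the following may help. The function name show_colors is made up for illustration; the codes are the same ones the alias passes to wdiff (30 is black text, 41 a red background, 42 a green background, and 0 resets).

```shell
# Hypothetical helper: print one "deleted" and one "added" sample using the
# same ANSI sequences that the cwdiff alias passes to wdiff.
show_colors() {
  # \033[30;41m = black on red, \033[30;42m = black on green, \033[0m = reset
  printf '\033[30;41m%s\033[0m \033[30;42m%s\033[0m\n' "$1" "$2"
}

show_colors "deleted words" "added words"
```

If the sample text does not show up with red and green backgrounds, the terminal (or the pager) is probably not interpreting ANSI colors.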



Git has this behavior built in using:

git diff --color-words

Also, if you need to share your changes with a designer (who maybe needs to update a PDF…), an ansi2html script can generate HTML output showing your changes. I found that it’s useful to run the generated HTML through fold so that the lines wrap if you email it to someone.

svn diff foo.markdown | cwdiff -d | ~/bin/ | fold -s > foo-updates.html
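
If you haven’t used fold before, here is a rough sketch of what the -s flag does. The sample sentence and the 30-column width are arbitrary choices for the demo; by default fold wraps at 80 columns.

```shell
# Wrap a long line at 30 columns. With -s, fold breaks after the last space
# that fits instead of cutting in the middle of a word.
long_line="Words that were deleted will be in red, additions are in green."
wrapped=$(printf '%s\n' "$long_line" | fold -s -w 30)
printf '%s\n' "$wrapped"
```

Without -s, fold would cut at exactly 30 characters even in the middle of a word, which is harder to read in an email.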

Finally, you can wrap all this functionality into a simple bash script that you can save anywhere. Pipe it the output of an svn diff or diff command, and get nice HTML as a result. It assumes ansi2html is somewhere in your $PATH.

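#!/bin/bash
# take diff input (from svn diff too), colorize it, convert it to html
# (bash is needed for the $'...' escapes; ansi2html is assumed to be on $PATH)

cat - | wdiff -n -w $'\033[30;41m' -x $'\033[0m' -y $'\033[30;42m' -z $'\033[0m' -d \
| ansi2html | fold -s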
Web Design

Making Clickable US State Maps

I wish I’d known about this plugin 6 months ago, but better late than never. This is a useful jQuery plugin that uses Raphael.js to draw a clickable map of the United States. It’s easy to style, and you can bind event listeners to react to clicks on particular states.

Easily add a an interactive map of the USA and States to your webpage without using Flash.

From: U.S. Map – It’s a jQuery plugin – Flash not needed

Web Design

Nielsen: Designing Effective Carousels

Earlier, I wrote a roundup of articles explaining why carousels are bad and kill click-throughs. Jakob Nielsen provides some advice on how to design one properly if you really must have one.

Summary: Carousels allow multiple pieces of content to occupy a single, coveted space. This may placate corporate infighting, but on large- or small-view ports, people often scroll past carousels. A static hero or integrating content in the UI may be better solutions. But if a carousel is your hero, good navigation and content can help make it effective.

From: Designing Effective Carousels

PHP Programming

Why global variables are bad

This question came up yesterday when Sandy and I presented an Introduction to PHP [slides] at DC Web Women. At the time, I couldn’t come up with a coherent set of arguments that I could explain easily. These posts do a better job. First, a general programming article on the subject:

Implicit coupling — A program with many global variables often has tight couplings between some of those variables, and couplings between variables and functions. Grouping coupled items into cohesive units usually leads to better programs.

From: Global Variables Are Bad

And a PHP-specific article full of excellent examples:

You may have heard that globals are bad. This is often thrown around as programming gospel by people who don’t completely understand what they’re saying. These people aren’t wrong, they just don’t often program what they preach. I’ve lost track of the number of times I’ve had the “globals are bad” conversation with someone (and been in agreement) only to find their code is littered with statics and singletons. These people are confusing globals (as in the $GLOBALS array) and global state.

From: Why global state is the devil, and how to avoid using it


Remove unapproved comments from WordPress exports

Recently, I needed to migrate some WordPress blogs to another system. WordPress provides a handy way to export content in its WXR format. However, it’ll export all comments, whether approved or not. This is good from a data backup standpoint, but I didn’t need to import these. They were also bloating the XML file and affecting how long it took my import to process. I needed a way to remove unapproved comments. The following code does that, using PHP’s DOMDocument extension to walk an input file. The cleaned-up content is sent to STDOUT so you can redirect it to another file to save.

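<?php
// NOTE: the opening lines of this script were lost; taking the input
// file from the first command-line argument is an assumption.
$doc = new DOMDocument();
$doc->recover = TRUE;
$doc->load($argv[1]);

$comments = $doc->getElementsByTagName('comment');
$to_remove = array();

foreach ($comments as $comment) {
    $approved = $comment->getElementsByTagName('comment_approved');
    if ($approved->length > 0) {
        $app = $approved->item(0);

        // can't remove nodes while looping, so collect them first.
        // Anything other than '1' (0, spam, trash) counts as unapproved;
        // a loose 0 == 'spam' check behaves differently in PHP 8.
        if ('1' !== trim($app->nodeValue)) {
            $to_remove[] = $comment;
        }
    }
}

if (count($to_remove)) {
    foreach ($to_remove as $elt) {
        $elt->parentNode->removeChild($elt);
    }
}

$doc->formatOutput = true;
$doc->preserveWhiteSpace = false;
echo $doc->saveXML();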
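
As a quick sanity check, you could count the unapproved comments in an export before and after cleaning it. This is only a sketch: the function name count_unapproved is made up, it only looks for the value 0, and it assumes each comment_approved element sits on a single line (with or without a CDATA wrapper). The sample file is invented for the demo.

```shell
# Hypothetical helper: count comments whose approval flag is 0 in a WXR file.
count_unapproved() {
  grep -c -E '<wp:comment_approved>(<!\[CDATA\[)?0(\]\]>)?</wp:comment_approved>' "$1"
}

# Tiny made-up sample standing in for a real export.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<wp:comment><wp:comment_approved>1</wp:comment_approved></wp:comment>
<wp:comment><wp:comment_approved>0</wp:comment_approved></wp:comment>
<wp:comment><wp:comment_approved><![CDATA[0]]></wp:comment_approved></wp:comment>
EOF

count_unapproved "$sample"
```

Running the same count against the cleaned file should report 0 if the unapproved comments were removed.
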
Web Design

Carousels are bad for Accessibility

I’ve never been a fan of carousels. They’ve become a crutch for designers and clients who want to spice up a homepage presentation with something that moves. ShouldIUseACarousel was shared by a lot of folks I follow, and NetMagazine did an interview with the accessibility expert who created the site.

JS: Carousels are seemingly an easy fix to two universal design problems: how do I fit so much content into so little space, and how do I decide what content is the most important? It’s easy to justify away the usability issues of a carousel when you consider the benefits of presenting multiple content pieces in such little real estate

From: Accessibility expert warns: stop using carousels | News | .net magazine

From an information architecture perspective, Travis Lafleur provides a better alternative. In spirit, it’s very similar to the approach we used on back when I was there.

Consider this simple, straightforward alternative. First, determine essential content to be featured on the page. Keep in mind the desired outcomes of the project as a whole, the mindset and goals of your users, and what actions you want them to take on the particular page. Next, prioritize. This can be as simple as assigning numbers to each item. If users notice only one thing on the page, what should that be? If they notice two, what should the second be? – and so on. If you’re having trouble prioritizing – or have too many items to promote – consider breaking the content into logical groups and spreading it over multiple pages.

From: Biggs|Gilmore – A Critique of Carousels

It turns out they also don’t lead users to take meaningful actions.

I’m sure you’ve come across dozens, if not hundreds of image sliders or carousels (also called ‘rotating offers’). You might even like them. But the truth is that they’re conversion killers.

From: Don’t Use Automatic Image Sliders or Carousels, Ignore the Fad

Erik Runyon has the stats to back this up. Click through to see how many people click beyond the first slide.

Carousels. That gem of a web feature that clients love, and many developers hate. One thing is certain, they are the darling of HigherEd. In fact, they’re loved so much, I’ve been assigned many times to retroactively add them to sites that have already been live for years. This led me ask how much are users really interacting with the carousels.

From: Carousel Interaction Stats | WeedyGarden

Finally, Jack Shepard lists better alternatives to using a carousel slide.

Let me preface this by saying this discovery is not anything new, however unless you’re really geeking out you won’t be in the know on this stuff.

From: The cure for the common image slider carousel


Highest attended soccer matches in the USA

This started as a reply to a reddit poster claiming a USA-Turkey match in 2010 was “the highest attended soccer match ever.”

According to this the attendance was 55,407. Nice, but not the highest ever for soccer.

Portugal played the USA at RFK during the 1996 Olympics; attendance was 58,012.

Also, MLS Cup 1997 at RFK, featuring home side D.C. United, was attended by 57,431 people.

Also, the LA Coliseum would sell out for soccer matches, albeit ones not featuring the USA. Capacity is 92k.

Turns out the USSF has a page with attendance records, and neither the USA-Turkey game nor the others I mentioned above would make it, as the minimum cutoff is around 78,000. Maybe the USA-Turkey game was the best-attended USMNT match during the previous World Cup cycle?


Advice on HTML5 and Video

Why you don’t want to host your own video if you can avoid it.

Please, just save yourself a headache, and host your video on YouTube, Vimeo, or some other third party service. They employ some very clever people who’ve solved all the problems with embedding video.

From: Embedding HTML5 video is still pretty hard | And then it crashed


Quick mysqldump snapshot script

I find myself needing this often on different servers, so you may find it useful too. This script creates a timestamped, gzipped dump of a MySQL database in the current directory. It assumes it runs as a user who can connect to the database; otherwise, you can supply credentials using mysqldump’s -u and -p command-line switches.

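#!/bin/bash
# retrieve a database and gzip it

# exit code for bad arguments (value assumed; the original definition is missing)
E_BADARGS=65

if [ $# -ne 1 ]
then
  echo "Usage: `basename $0` {database_name}"
  exit $E_BADARGS
fi

DB=$1
DATEZ=`date +%Y-%m-%d-%H%M`
# output name format is assumed; the original assignment is missing
OUTFILE="$DB.$DATEZ.sql.gz"

echo "mysqldump for $DB to $OUTFILE"
sudo mysqldump --opt -e -C -Q "$DB" | gzip -c > "$OUTFILE"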
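
Before relying on a snapshot, it may be worth checking that the gzip file is intact. Here is a minimal sketch using gzip -t, which tests the compressed data without extracting it. The function name check_snapshot and the two demo files are made up for illustration.

```shell
# Hypothetical helper: report whether a gzipped dump passes gzip's integrity test.
check_snapshot() {
  if gzip -t "$1" 2>/dev/null; then
    echo "ok: $1"
  else
    echo "corrupt or unreadable: $1"
    return 1
  fi
}

# Demo with one valid and one deliberately broken file.
tmpdir=$(mktemp -d)
printf 'CREATE TABLE demo (id INT);\n' | gzip -c > "$tmpdir/good.sql.gz"
printf 'not gzip data\n' > "$tmpdir/bad.sql.gz"

check_snapshot "$tmpdir/good.sql.gz"
check_snapshot "$tmpdir/bad.sql.gz" || true
```

This only confirms the archive is readable; it says nothing about whether the SQL inside is complete.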

Extract images from an HTML snippet

The function here will take an HTML fragment and return an array of useful images it finds.

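// NOTE: the function name, the $base_url parameter and the $header value
// are assumptions; the original opening lines of this snippet are missing.
function extract_images($text, $base_url = '') {
    // hint the encoding so loadHTML() doesn't mangle UTF-8 text
    $header = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $text = $header . $text;
    $dom = new DOMDocument();
    if (@$dom->loadHTML($text)) {
        $xpath = new DOMXpath($dom);
        if ($images = $xpath->evaluate("//img")) {
            $result = array();
            foreach ($images as $i => $img) {
                $ht = $img->getAttribute('height');
                $wd = $img->getAttribute('width');
                // if height & width are 1 it's a tracking bug, ignore
                if (1 === (int)$ht && 1 === (int)$wd) {
                    continue;
                }
                // if it doesn't end in an image file extension
                // then ignore
                $src = $img->getAttribute('src');
                if (!preg_match('/\.(png|jpg|gif)$/i', $src)) {
                    continue;
                }
                // do we need to figure out the full url to the image?
                if (!preg_match('/^https?:\/\//', $src)) {
                    // assumed: prefix relative paths with the caller's base URL
                    $src = rtrim($base_url, '/') . '/' . ltrim($src, '/');
                }
                $alt = $img->getAttribute('alt');
                $result[$i] = array('src' => $src, 'alt' => $alt, 'height' => $ht, 'width' => $wd);
            }
            if (!empty($result)) {
                return $result;
            }
        }
    }
    return false;
}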