Lessons from wrestling with Drupal’s theme preprocessing functions.

 Drupal provides a lot of hooks that you can leverage to make it work exactly how you want.  I just finished tracing how some theme preprocessing functions were being executed, and realized that there were some subtle assumptions I hadn’t understood earlier.  These are my notes about two specific preprocessing hooks and when they are executed, a valuable lesson learned after spending a day figuring out why some functions were being called twice and why some variables were being overwritten or lost. There’s also confusion with the label ‘page’ because it’s both a preprocessing hook AND a default content type in Drupal.

 The two preprocessing hooks in question are:

  •  hook_preprocess_page – for updating content on the final rendered page.
  •  hook_preprocess_node – for affecting how a node is rendered.

 They each have very different assumptions and uses, so here are some guidelines.

 hook_preprocess_page can affect the layout and regions of the final webpage.  If you need to display blocks or other content in any region on the webpage, use this.  This is the function to use if you want to affect other regions in your layout based on the node that is being viewed.

 hook_preprocess_node affects how an individual node is rendered.  It knows NOTHING about theme regions, and if you set them, they do not persist or get passed up to your page.tpl.php file (and descendants).  Variables set here are only available in your node.tpl.php (or its offshoots).  This is the function to use to change the layout within the main content region with a lot of control but, I’d emphasize, it cannot change the contents of other template regions.
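
 To make the split concrete, here is a minimal sketch of the two hooks side by side. The theme name mytheme, the right region, and the is_story variable are assumptions made up for illustration, and the bottom of the snippet fakes the variables array with a plain object so it can be run outside Drupal.

```php
<?php
/**
 * Page-level preprocessing: may touch any region variable.
 * "mytheme" and the "right" region are assumptions for this sketch.
 */
function mytheme_preprocess_page(&$vars)
{
    // On a node page the node being viewed is normally available here.
    if (isset($vars['node']) && $vars['node']->type == 'story')
    {
        $existing = isset($vars['right']) ? $vars['right'] : '';
        $vars['right'] = $existing . '<div class="story-promo">More stories</div>';
    }
}

/**
 * Node-level preprocessing: whatever is set here is meant for node.tpl.php only.
 */
function mytheme_preprocess_node(&$vars)
{
    // Hypothetical flag for use inside node.tpl.php.
    $vars['is_story'] = ($vars['node']->type == 'story');
}

// Stand-alone demonstration with a fake node object (no Drupal needed).
$node = new stdClass();
$node->type = 'story';

$page_vars = array('node' => $node, 'right' => '');
mytheme_preprocess_page($page_vars);

$node_vars = array('node' => $node);
mytheme_preprocess_node($node_vars);
```

 After running this, the region markup lives only in $page_vars and the is_story flag only in $node_vars, which mirrors the point above: the two hooks work on separate sets of variables.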

Tip: Disable Drupal performance settings

On development, testing, and staging sites you normally don’t want Drupal’s caching and JavaScript/CSS aggregation features enabled.  Drop the following into your site’s settings.php file to override these settings.

  // disable performance caching
  $conf['cache'] = 0;
  $conf['block_cache'] = 0;
  $conf['preprocess_css'] = 0;
  $conf['preprocess_js'] = 0;
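
If the same settings.php is shared across environments, one option is to apply these overrides only on known non-production hostnames. The following is a rough sketch, not something Drupal provides: the helper example_dev_overrides and the example.com hostnames are hypothetical, and it assumes the hostname in HTTP_HOST carries no port number.

```php
<?php
/**
 * Hypothetical helper: returns the overrides above only when $host is
 * one of the listed non-production hostnames, otherwise an empty array.
 */
function example_dev_overrides($host, array $dev_hosts)
{
    if (!in_array($host, $dev_hosts))
    {
        return array();
    }
    return array(
        'cache' => 0,
        'block_cache' => 0,
        'preprocess_css' => 0,
        'preprocess_js' => 0,
    );
}

// In settings.php, $conf may already hold other overrides; keep them.
$conf = isset($conf) ? $conf : array();
// HTTP_HOST is not set on the command line, so fall back to an empty string.
$host = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : '';
// The hostnames below are placeholders.
$dev_hosts = array('dev.example.com', 'staging.example.com');
$conf = array_merge($conf, example_dev_overrides($host, $dev_hosts));
```

On a hostname that is not in the list, the helper returns an empty array, so the performance settings stored in the database stay in effect.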

Taming Drupal environments and migrations.

Development Environment for Drupal is an article I wish I’d found a year ago when it was originally posted.  It’s an extensive description of how to use svn and other tools to keep various Drupal environments synchronized.  I’m pleased that it confirms my own decisions about how to use subversion to manage core and site-specific modules.  I too use a symbolic link to point the /sites folder in my webroot to an external directory with my site-specific files.  The one thing I found about doing so, not mentioned in the article, is that you’ll want to add one more rewrite condition (the !-l line) to the rewrite rules that funnel most web requests to Drupal’s own index.php file.  This condition excludes symbolic links and makes files requested by visitors cacheable.  Your modified .htaccess file should end up looking like:

  # Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
  RewriteCond %{REQUEST_FILENAME} !-l
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]

One of Drupal’s largest design flaws is how it stores both content and configuration in the database, and not just in the same database but oftentimes in the same table, like variables. This could have been fixed simply through a table naming convention in which config_ is prefixed to configuration data tables, but it’s a bit late now. Yes, this means there’d have to be a config_variables and a variables table, but we could all live with that. But I digress. I’m intrigued by the dbscripts project, particularly its promise of being able to merge changes from database dumps coming from multiple environments; it is something I need to look at sooner rather than later.

Writing an intelligent hook_nodeapi function in Drupal

For Drupal module developers, the hook_nodeapi function affords a lot of flexibility for interacting with nodes during various operations.  If you just start pasting or writing code, you’ll quickly end up with a giant, messy switch statement.  But there is a simple way to keep your code nicely organized.  First, let’s take a look at what operations we can affect:

  • “alter”: the $node->content array has been rendered, so the node body or teaser is filtered and now contains HTML. This op should only be used when text substitution, filtering, or other raw text operations are necessary.
  • “delete”: The node is being deleted.
  • “delete revision”: The revision of the node is deleted. You can delete data associated with that revision.
  • “insert”: The node is being created (inserted in the database).
  • “load”: The node is about to be loaded from the database. This hook can be used to load additional data at this time.
  • “prepare”: The node is about to be shown on the add/edit form.
  • “prepare translation”: The node is being cloned for translation. Load additional data or copy values from $node->translation_source.
  • “print”: Prepare a node view for printing. Used for printer-friendly view in book_module
  • “rss item”: An RSS feed is generated. The module can return properties to be added to the RSS item generated for this node. See comment_nodeapi() and upload_nodeapi() for examples. The $node passed can also be modified to add or remove contents to the feed item.
  • “search result”: The node is displayed as a search result. If you want to display extra information with the result, return it.
  • “presave”: The node passed validation and is about to be saved. Modules may use this to make changes to the node before it is saved to the database.
  • “update”: The node is being updated.
  • “update index”: The node is being indexed. If you want additional information to be indexed which is not already visible through nodeapi “view”, then you should return it here.
  • “validate”: The user has just finished editing the node and is trying to preview or submit it. This hook can be used to check the node data. Errors should be set with form_set_error().
  • “view”: The node content is being assembled before rendering. The module may add elements to $node->content prior to rendering. This hook will be called after hook_view(). The format of $node->content is the same as used by Forms API.

That’s 15 operations. Multiplied by the number N of content types on your site, that’s potentially 15N cases we’ll have to account for.  If you want your code to run regardless of the content type, that’s another 15 cases.  How much do you like spaghetti?

Our own function naming convention to the rescue

By using a simple naming convention, we can make sure that we can a) drop in code to run for any operation and/or content type and b) encapsulate that code within its own function.  To do this, your actual hook_nodeapi implementation will simply delegate execution to these other functions.  First, if a function named module_nodeapi_operation exists, we’ll call it.  Then, if the function module_nodeapi_operation_content-type exists, we’ll execute it.

Let’s assume you have a module foo. The hook_nodeapi function looks like:


  <?php
  /**
   * Implementation of hook_nodeapi that delegates operations to other functions
   * @see http://api.drupal.org/api/function/hook_nodeapi/6
   * @author Oscar Merida
   */
  function foo_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL)
  {
      // our own all too clever function api
      // first call foo_nodeapi_$op if it exists
      // then call foo_nodeapi_$op_$node->type if it exists
      // (spaces in $op, e.g. "delete revision", become underscores)
      $f_base = 'foo_nodeapi_' . str_replace(' ', '_', $op);
      if (function_exists($f_base))
      {
          // $node is passed by reference because the callee declares &$node
          $f_base($node, $op, $a3, $a4);
      }
      $f_content = $f_base . '_' . $node->type;
      if (function_exists($f_content))
      {
          $f_content($node, $op, $a3, $a4);
      }
  }

If you need to alter something about all nodes before they are saved, you would create a function named foo_nodeapi_presave.

  <?php
  function foo_nodeapi_presave(&$node, $op, $a3, $a4)
  {
      // do something to all nodes before they are saved
  }

Likewise, to affect what is loaded with a story node, we need a function named foo_nodeapi_load_story:

  <?php
  function foo_nodeapi_load_story(&$node, $op, $a3, $a4)
  {
      // do something to story nodes when they are loaded
  }


Keep in mind that both functions are called, but the content type specific one is called after the more general one, so it can undo or override what the general one did. As a side note, I was trying to use the presave operation to set the value of a CCK Text field if it was empty. Since the order of hook calls depends on the weight of the module in the system table, I had to make sure the content module had a higher weight than my own. If you’re trying to set the value of a CCK field but it’s not being saved, you’ll have to do the same.
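
The call order is easy to check outside Drupal. The snippet below is a stand-alone sketch, not code from a real module: it repeats a dispatcher in the same spirit, defines two hypothetical handlers that record their names on the node, and uses a plain stdClass object in place of a real node. The calls property is invented purely for the demonstration.

```php
<?php
// Dispatcher following the naming convention described above.
function foo_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL)
{
    // Spaces in operation names are mapped to underscores so that
    // they can form part of a valid function name.
    $f_base = 'foo_nodeapi_' . str_replace(' ', '_', $op);
    if (function_exists($f_base))
    {
        $f_base($node, $op, $a3, $a4);
    }
    $f_content = $f_base . '_' . $node->type;
    if (function_exists($f_content))
    {
        $f_content($node, $op, $a3, $a4);
    }
}

// General handler: runs for every content type.
function foo_nodeapi_presave(&$node, $op, $a3, $a4)
{
    $node->calls[] = 'presave';
    $node->title = trim($node->title);
}

// Type-specific handler: runs second and can override the general one.
function foo_nodeapi_presave_story(&$node, $op, $a3, $a4)
{
    $node->calls[] = 'presave_story';
    $node->title = strtoupper($node->title);
}

// Fake node object standing in for a real Drupal node.
$node = new stdClass();
$node->type = 'story';
$node->title = '  hello ';
$node->calls = array();

foo_nodeapi($node, 'presave');
// $node->calls now lists the general handler first, then the story one.
```

An operation with no matching functions, for example foo_nodeapi($node, 'view') here, simply does nothing, which is what keeps the dispatcher safe to leave in place.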