purple oar software

"Ensuring Your Web Presence Contributes"

Javascript code filter for Drupal

Here is an explanation of how I wrote a Drupal 5.x module that filters JavaScript code so that it is displayed with syntax highlighting. To illustrate what it does:

/*
 * comment block
 *
 */

// single line comment

function example(a,c) {
  return a * c * 59;
}

document.writeln("this is a text string");
document.writeln('so is this, single quoted');

If the code is just a single line it will display without the background div:

var myDiscount = 0.25;

The filter system within Drupal is intended to make data entry into the CMS easy and safe. URLs can become clickable, and malicious HTML (and rude words) can be stripped out. This js_filter module, based on the contributed codefilter module by John Wilkins, syntax highlights JavaScript code. There are already ways to do this within Drupal: the geshifilter module, which uses the third-party GeSHi PHP library, will syntax highlight many languages, and there is also a jQuery plugin, jquery-chili-js, which uses "recipes" for several languages.

js_filter has no software dependencies; it is purely PHP module code.

Input formats in Drupal consist of various filters. This post uses an input format I created called "Code". It uses three filters, shown on the "Arrange" screen:

[Image: Arrange code filter screen]

Of course, you can include js_filter in any input format.

Setting colors

Only five aspects of JavaScript code are highlighted:

  • Comments
  • Strings
  • Symbols
  • Keywords
  • Numbers

The colors are set within the module's CSS file.

span.js_comment {
  color: Purple;
  font-style: italic;
}
span.js_string {
  color: Red;
}
span.js_keyword {
  color: Blue;
}
span.js_number {
  color: Green;
}

Symbols are not colored but are displayed in bold using the <strong> tag. Altering the colors only requires changing the CSS.

Writing the module

As there are no database tables required, there is no need for an install file, only an info file and the module. The info file looks like this:

; $Id: js_filter.info,v 1.1 2008/06/05 12:13:44 gpr Exp $
name = Javascript Filter
description = Syntax highlights Javascript code inside <javascript> </javascript> tags.

version = "5.x-1.x-dev"
project = "js_filter"
datestamp = "1182298747"

Assuming you know how to create the basics of a Drupal module, I will just cover the essentials of filtering. Let's start with hook_filter().

<?php
function js_filter_filter($op, $delta = 0, $format = -1, $text = '') {
  switch ($op) {
    case 'list':
      return array(0 => t('Javascript code filter'));

    case 'description':
      return t('Allows users to post javascript code verbatim using &lt;javascript&gt; tags.');

    case 'no cache':
      return TRUE; // TRUE for debugging only

    case 'prepare':
      // Note: we use the bytes 0xFE and 0xFF to replace < > during the filtering process.
      // These bytes are not valid in UTF-8 data and thus least likely to cause problems.
      $text = preg_replace('@<javascript>(.+?)</javascript>@se', "'\xFEjavascript\xFF'. js_filter_escape('\\1') .'\xFE/javascript\xFF'", $text);
      return $text;

    case "process":
      $text = preg_replace('@\xFEjavascript\xFF(.+?)\xFE/javascript\xFF@se', "js_filter_process('$1')", $text);
      return $text;

    default:
      return $text;
  }
}
?>

The "list" and "description" operations don't require any explanation. However, "no cache" is really useful when debugging. Normally it returns FALSE, because Drupal caches all filtered text to save time. Returning TRUE means that on every test the filter code is actually executed instead of the result being retrieved from the cache. Without it, you have the tedious job of deleting the cache_filter table records before every test.

The actual filtering is done in two stages, "prepare" and "process". The prepare step converts HTML to entities, preventing any tags from being interpreted incorrectly by the filters that follow. The process step does the text massaging. In this filter we want to color various parts of the code, so we need to parse the code to identify the bits to be colored, then wrap them inside <span> tags with a class of the appropriate type. The function which does this is js_filter_process():

<?php
function js_filter_process($text) {
  // Undo linebreak escaping
  $text = str_replace('&#10;', "\n", $text);

  // Inline or block level piece?
  // (preg_match() is used here because ereg() no longer exists in PHP 7 and later)
  $multiline = preg_match('/[\n\r]/', $text);
.
.
.

  // Javascript code
  $text = js_filter_javascript($text);

  // Escape newlines
  $text = nl2br($text);

  $text = '<code>'. $text .'</code>';
  if ($multiline) $text = '<div class="codeblock">'. $text .'</div>';

  // Remove newlines to avoid clashing with the linebreak filter
  $text = str_replace(array("\r", "\n"), array('', ''), $text);
  return js_filter_fix_spaces($text);
}
?>
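
The two-stage prepare/process flow can also be tried outside Drupal. The following is a minimal sketch for current PHP versions, not the module's code. It swaps the /e modifier (removed in PHP 7) for preg_replace_callback(). The demo_prepare() and demo_process() names are hypothetical, and the escaping done inside demo_prepare() is an assumption, since js_filter_escape() is not shown in this post.

```php
<?php
// Hypothetical stand-ins for the 'prepare' and 'process' operations.

// Prepare: escape the code and swap the <javascript> tags for placeholder
// bytes, so that later filters cannot treat the contents as HTML.
function demo_prepare($text) {
  return preg_replace_callback(
    '@<javascript>(.+?)</javascript>@s',
    function ($m) {
      // Assumed escaping: HTML entities, plus newlines hidden as &#10;
      // so a line break filter running in between leaves them alone.
      $escaped = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
      $escaped = str_replace("\n", '&#10;', $escaped);
      return "\xFEjavascript\xFF" . $escaped . "\xFE/javascript\xFF";
    },
    $text
  );
}

// Process: find the placeholders again, undo the newline escaping and
// pass the contents to a highlighter callback.
function demo_process($text, $highlight) {
  return preg_replace_callback(
    '@\xFEjavascript\xFF(.+?)\xFE/javascript\xFF@s',
    function ($m) use ($highlight) {
      $code = str_replace('&#10;', "\n", $m[1]);
      return '<code>' . $highlight($code) . '</code>';
    },
    $text
  );
}
```

Between the two calls, any other filter in the input format sees only the placeholder bytes and escaped entities, which is the reason for splitting the work into two stages.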

Most of js_filter_process() is self-explanatory. The line
<?php
  $text = js_filter_javascript($text);
?>
is where the actual JavaScript code is analysed.
<?php
function js_filter_javascript($text) {
  global $tpos, $jstext, $ljstext, $stpos; // use globals for speed
  $tpos = 0;
  $jstext = $text;
  $ljstext = strlen($jstext);
  $stpos = 0;
  $output = '';
  while ($tpos < $ljstext) {
    $token = _js_filter_lexer($s);
    switch ($token) {
      case JS_NONE:
        $output .= substr($text, $stpos, $tpos - $stpos);
        break;
      case JS_COMMENT:
        $output .= '<span class="js_comment">'.substr($text, $stpos, $tpos - $stpos).'</span>';
        break;
      case JS_STRING:
        $output .= '<span class="js_string">'.$s.'</span>';
        break;
      case JS_SYMBOL:
        $output .= '<strong>'.substr($text, $stpos, $tpos - $stpos).'</strong>';
        break;
      case JS_KEYWORD:
        $output .= '<span class="js_keyword">'.substr($text, $stpos, $tpos - $stpos).'</span>';
        break;
      case JS_NUMBER:
        $output .= '<span class="js_number">'.substr($text, $stpos, $tpos - $stpos).'</span>';
        break;
    }
    $stpos = $tpos;
  }
  return str_replace('</strong><strong>', '', $output);  // remove if contiguous symbols
}
?>

The function _js_filter_lexer($s) is called repeatedly, returning each token found. The $output string is built up with the coloring span tags appropriate to each token. The lexer calls other, smaller functions, all of which are standard lexical scanning code.
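
For readers who have not written a lexical scanner, here is a rough, self-contained sketch of the idea. It is not the module's _js_filter_lexer(), which is in the download. The demo_ function names are hypothetical, the keyword list is deliberately short, regular expression literals are not recognised, and the input is assumed to be raw JavaScript, not the entity-escaped text the real filter receives.

```php
<?php
// Split JavaScript source into array(type, text) pairs.
// Types: comment, string, number, keyword, symbol, none.
function demo_js_tokens($src) {
  $keywords = array('var', 'function', 'return', 'if', 'else', 'for', 'while');
  $tokens = array();
  $len = strlen($src);
  $pos = 0;
  while ($pos < $len) {
    $rest = substr($src, $pos);
    if (preg_match('@^(?:/\*.*?\*/|//[^\n]*)@s', $rest, $m)) {
      $type = 'comment';
    }
    elseif (preg_match('@^(?:"(?:\\\\.|[^"\\\\])*"|\'(?:\\\\.|[^\'\\\\])*\')@s', $rest, $m)) {
      $type = 'string';
    }
    elseif (preg_match('@^\d+(?:\.\d+)?@', $rest, $m)) {
      $type = 'number';
    }
    elseif (preg_match('@^[A-Za-z_$][A-Za-z0-9_$]*@', $rest, $m)) {
      $type = in_array($m[0], $keywords, TRUE) ? 'keyword' : 'none';
    }
    elseif (preg_match('@^\s+@', $rest, $m)) {
      $type = 'none';
    }
    else {
      // Anything else counts as a one-character symbol. The /u flag keeps a
      // multi-byte UTF-8 character whole; fall back to one byte if it fails.
      if (!preg_match('@^.@su', $rest, $m)) {
        $m = array($rest[0]);
      }
      $type = 'symbol';
    }
    $tokens[] = array($type, $m[0]);
    $pos += strlen($m[0]);
  }
  return $tokens;
}

// Wrap each token in the markup used by the module's CSS classes.
function demo_js_highlight($src) {
  $wrap = array(
    'comment' => array('<span class="js_comment">', '</span>'),
    'string'  => array('<span class="js_string">', '</span>'),
    'keyword' => array('<span class="js_keyword">', '</span>'),
    'number'  => array('<span class="js_number">', '</span>'),
    'symbol'  => array('<strong>', '</strong>'),
    'none'    => array('', ''),
  );
  $out = '';
  foreach (demo_js_tokens($src) as $token) {
    list($type, $text) = $token;
    $out .= $wrap[$type][0] . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . $wrap[$type][1];
  }
  // Merge adjacent symbols into a single <strong> run.
  return str_replace('</strong><strong>', '', $out);
}
```

Calling demo_js_highlight('var myDiscount = 0.25;') wraps var as a keyword, 0.25 as a number, and = and ; as symbols, leaving the identifier and whitespace untouched.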

The complete module code (.css, .info and .module) is downloadable here.

Attachment: js_filter.zip (3.45 KB)

Anonymous | 13. January 2009 - 17:49

I'm interested in this module because of the input filter difficulties I've been having getting various editors to work. However, I am not a programmer. Is this module's current state such that someone such as myself would be able to set up and use this effectively or is it still in the testing stages?

admin | 30. January 2009 - 2:15

Not in testing stage. It is not an input filter. It is an output filter that will display content between javascript tags syntax highlighted. Hope that helps.

PS: links are filtered out of these comments!

Anonymous | 9. December 2008 - 7:56

The javascript External Links module was mostly a proof of concept when it started. But now I’ve grown to like it so much that I use it all the time. It gets points for ease of use, but it’s good to see there’s an efficient alternative for a site serious about it’s external linking.
