Drupal's filter system is a pretty nifty thing: you can grab text coming through the system and modify it to your heart's content. This makes for great things like replacing various kinds of tokens, automatically turning URLs into links, and so on. I've been looking at the wordfilter module to check for "bad words" in text coming through the system.
One thing this module lacks is logging of the users who are using "bad words". Tracking usage by user makes it possible to take specific actions against that user: setting various flags, demoting permissions, alerting an admin, etc.
It seems as though this would be fairly straightforward to approach: you add a logging call to hook_filter() where $op == 'process' and you are done with it.
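As a rough sketch of that naive idea, the code below follows the Drupal 6 hook_filter() signature but is written to run outside Drupal. The module name "example", the word list, and the example_log() helper (a stand-in for a real logger such as watchdog()) are all assumptions made for illustration and are not taken from the wordfilter module.

```php
<?php
// Hypothetical stand-in for a real logger such as watchdog(); it only
// collects messages in memory so the sketch runs outside Drupal.
function example_log($message) {
  global $example_log_messages;
  $example_log_messages[] = $message;
}

// Hypothetical replacement step with a hard-coded word list.
function example_filter_process($text) {
  return str_ireplace(array('darn', 'heck'), '***', $text);
}

// Follows the Drupal 6 hook_filter() signature; "example" is an assumed
// module name.
function example_filter($op, $delta = 0, $format = -1, $text = '') {
  switch ($op) {
    case 'list':
      return array(0 => 'Example word filter');

    case 'process':
      $filtered = example_filter_process($text);
      if ($filtered !== $text) {
        // Only the text and the input format are known at this point.
        example_log('Filtered text in format ' . $format);
      }
      return $filtered;

    default:
      return $text;
  }
}
```

Calling example_filter('process', 0, 1, 'Well, darn it.') returns the filtered string and records one log message, but that message can only name the input format, since nothing else is passed in.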
Unfortunately (at least as far as I can see), hook_filter itself is given no information about where the $text being filtered came from. This means that logging the user or node whose text is being filtered is rather complicated, or hacky. I came up with the following solution. It is not pretty, and I hope somebody else has a better idea!
/**
 * Check the filtered text for any words that have been replaced and, if
 * there are any, log them.
 *
 * @param $text
 *   The text being filtered.
 * @return
 *   The text, unchanged.
 */
function wordfilter_log_text_log($text) {
  // We need to see if the replacement happened by comparing the before and
  // after of the text replacement. This adds overhead to each filter op,
  // but since this is cached it should not be a significant performance hit.
  if ($text !== wordfilter_filter_process($text)) {
    // Get the stack of function calls.
    $backtrace = debug_backtrace();
    // We are looking for the node_view function call to get the NID
    // that is being processed.
    foreach ($backtrace as $function_call) {
      // Log nodes.
      if ($function_call['function'] == 'node_view') {
        // Extract the node.
        $node = $function_call['args'][0];
        // Add a CID as null. Note that $node is the same object the caller
        // holds, so this property is set on the caller's node as well.
        $node->cid = null;
        // Now we log this information.
        wordfilter_log_bad_words($node);
        // We have done our logging, exit.
        return $text;
      }
      // Log comments.
      elseif ($function_call['function'] == 'theme_comment_view') {
        // Extract the comment.
        $comment = $function_call['args'][0];
        // Add the node's VID to the comment.
        $comment->vid = $function_call['args'][1]->vid;
        wordfilter_log_bad_words($comment);
        // We have done our logging, exit.
        return $text;
      }
    }
  }
  return $text;
}
I'm not sure how much overhead debug_backtrace() adds. As filter output can be cached it should not be that big a deal, but the real issue to me is that filters should probably have some context to them. Or maybe I'm missing something; perhaps my use case is an outlier.
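One way to get a feel for that overhead is a small timing loop like the one below. This is only a sketch: the helper name example_time_backtrace() is made up, the depth of 30 and the 1000 iterations are arbitrary, and the numbers will differ from a real Drupal page request, where the call stack and its arguments look quite different.

```php
<?php
// Hypothetical helper: recurse $depth levels to build up a call stack,
// then time $iterations calls to debug_backtrace() from the bottom of it.
// Returns the elapsed time in seconds.
function example_time_backtrace($depth, $iterations) {
  if ($depth > 0) {
    return example_time_backtrace($depth - 1, $iterations);
  }
  $start = microtime(TRUE);
  for ($i = 0; $i < $iterations; $i++) {
    $trace = debug_backtrace();
  }
  return microtime(TRUE) - $start;
}

$elapsed = example_time_backtrace(30, 1000);
printf("1000 backtraces at depth 30: %.4f seconds\n", $elapsed);
```

Dividing the result by the iteration count gives an approximate per-call cost, which can then be weighed against how often the filter actually runs uncached.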
Arguably, this could be done with hook_nodeapi, but I like using hook_filter as it fits with the flow of code (from my perspective anyway).
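For comparison, a hook_nodeapi version might look roughly like the sketch below. It follows the Drupal 6 hook_nodeapi() signature but runs standalone; the module name "example", the word list, and the in-memory $example_flagged array (a stand-in for a real log table) are assumptions for illustration only. The advantage is that the node, and with it the nid and uid, is in hand when the check runs.

```php
<?php
// Hypothetical check against a hard-coded word list.
function example_contains_bad_words($text) {
  return (bool) preg_match('/\b(darn|heck)\b/i', $text);
}

// Follows the Drupal 6 hook_nodeapi() signature; "example" is an assumed
// module name.
function example_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  global $example_flagged;
  if ($op == 'insert' || $op == 'update') {
    if (example_contains_bad_words($node->body)) {
      // Stand-in for real logging: remember which node and user were involved.
      $example_flagged[] = array('nid' => $node->nid, 'uid' => $node->uid);
    }
  }
}

// Usage with a minimal fake node object.
$node = new stdClass();
$node->nid = 12;
$node->uid = 3;
$node->body = 'Well, heck.';
example_nodeapi($node, 'update');
```

Comments would need separate handling, since hook_nodeapi only sees nodes.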
Comments
Filters don't have much info
I was looking for the language defined for the node holding the text I was filtering. No such information is available to the filter. After googling, I discovered that this will be updated in version 7. See
http://drupal.org/node/319788
Your post gives me some hope of being able to find this information. Thanks.
Interesting approach
That is a nice solution you've come up with. And yes, filters badly need context information: modules like inline and img_assist would be able to track which content is referenced by other content, and when the cached filter output needs to be invalidated if one of them changes. Or they could present a list of back-references (which nodes are using this image?). The drawback of your solution is that it needs to support all kinds of contexts on behalf of other modules like views etc., but it is certainly better than having no context at all.
Yes, it is a hacky solution
Yes, my approach has its issues. In the comment below, Sun mentions a thread about adding context to filters; ultimately I think that is the better direction.
Provide context to filters
http://drupal.org/node/226963
Or, you could copy my approach...
Which is pretty simple: inside the dme module I'm adding a sliver of context information at the start of every node body. When the content is filtered, I can read that context information to know what node I'm looking at. I wrap the context information in an XML-like tag, so it usually doesn't display.
Not sure that this is the approach best for every one
Quickly, debug_backtrace is a PHP function, so it doesn't rely on any Drupal modules. In my use case, adding content or tags to a node (or comment!) body is not possible (I can't pollute the data, as it is needed elsewhere). However, I can see how you could make that work for some things. Is there a reason why you're not stripping out your tags with your filter so that they never display?
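To make the marker idea concrete, here is a minimal sketch of prepending a context tag to a body and then reading and stripping it during filtering. The tag name example-context, its nid attribute, and both function names are invented for illustration; this is not how the dme module actually formats its context information.

```php
<?php
// Hypothetical marker format: a self-closing tag carrying the node ID.
function example_add_context($body, $nid) {
  return '<example-context nid="' . (int) $nid . '" />' . $body;
}

// Reads the marker, if present, and removes it from the text.
// Returns array($nid, $text_without_marker); $nid is NULL when no marker
// was found.
function example_extract_context($text) {
  $pattern = '/<example-context nid="(\d+)" \/>/';
  $nid = NULL;
  if (preg_match($pattern, $text, $matches)) {
    $nid = (int) $matches[1];
  }
  return array($nid, preg_replace($pattern, '', $text));
}

// Usage: mark a body, then recover the node ID and the clean text.
$marked = example_add_context('Hello world', 42);
list($nid, $clean) = example_extract_context($marked);
```

Stripping the marker inside the filter means it never reaches the rendered output, although the stored body still contains it, which is the data pollution concern raised above.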