Lately I’ve been developing a website. One issue that I’ll probably need to address in the near future is “persistent predicates”. By “persistent predicates” I mean the problem of treating predicates as data.
Consider the following situation: you are developing a big RSS reader/aggregator and you want to allow users to specify handling rules. How would you keep these rules in memory, and how would you keep them on disk?
Obviously, this problem has been solved before. Just consider email filters, or even packet filters in Ethereal.
One way of approaching the problem is implementing simple predicate templates:
“%field contains %s” where field is subject, or body, etc.
Once that is accomplished, you can specify that a “filter” is some combination (for example a logical and, or a logical or) of multiple predicates. To store this, we’ll have an actual predicate table (or pickle) holding each predicate’s data, and a one-to-many mapping of filters to predicates.
Another option is to allow only a few very simple, predefined predicates. A filter then just “points” to (holds the id/name of) the required predicate, together with the data that predicate needs. In this option, all data is stored with the filter.
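To make the template idea a bit more concrete, here is a minimal sketch. The template names, the dictionary layout and the `matches` helper are all made up for illustration, and JSON stands in for whatever table or pickle you would actually use:

```python
import json

# Hypothetical template names; each maps to a function of (field value, argument).
TEMPLATES = {
    "contains": lambda value, arg: arg in value,
    "equals": lambda value, arg: value == arg,
    "startswith": lambda value, arg: value.startswith(arg),
}

def matches(filter_spec, item):
    """Evaluate a filter (a combinator plus a list of predicates) against a dict."""
    results = (
        TEMPLATES[p["template"]](item.get(p["field"], ""), p["arg"])
        for p in filter_spec["predicates"]
    )
    return all(results) if filter_spec["combine"] == "and" else any(results)

# One filter pointing at many predicates: the one-to-many mapping, inlined.
spam_filter = {
    "combine": "or",
    "predicates": [
        {"template": "contains", "field": "subject", "arg": "lottery"},
        {"template": "startswith", "field": "body", "arg": "Dear winner"},
    ],
}

# The filter is plain data, so it survives a round trip through JSON.
restored = json.loads(json.dumps(spam_filter))
```

In a real database the `predicates` list would probably become its own table with a foreign key back to the filter.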
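A rough sketch of this second option could look like the following. The registry, the decorator and the two example predicates are hypothetical names of my own; the point is only that the stored filter carries a predicate name plus its parameters, and nothing else:

```python
# Hypothetical registry: predicate name -> function(item, **params).
PREDICATES = {}

def predicate(name):
    """Decorator that registers a function under a stable, storable name."""
    def register(func):
        PREDICATES[name] = func
        return func
    return register

@predicate("subject_contains")
def subject_contains(item, text):
    return text in item["subject"]

@predicate("min_length")
def min_length(item, length):
    return len(item["body"]) >= length

def apply_filter(stored, item):
    """Look up the named predicate and call it with the stored parameters."""
    return PREDICATES[stored["predicate"]](item, **stored["params"])

# What would actually be saved with the filter: a name and its data.
stored = {"predicate": "subject_contains", "params": {"text": "python"}}
```

The predicates themselves stay in code, so adding a new kind of rule means a code change, not a data change.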
A more complicated solution is to implement some serializable logical language (such as the expression trees I used for diStorm or PyKoan). Using this language, the predicates can be very dynamic, and can be combined and manipulated programmatically. This solution might be overkill for many projects though.
An interesting issue regarding the handling of predicates is their application to constraint solving. However, this is an issue for a future post. Suffice it to say that when writing PyKoan I’m using a constraint solver. Since I’m representing predicates with expression trees, the ability to analyze and manipulate predicates is very handy.
Besides looking at existing solutions, I’m very curious to hear other people’s opinions. Feel free to write about your preferred solution in the comments.
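As an illustration only (this is not the diStorm or PyKoan code), an expression tree can be as simple as nested lists whose first element names the node type. The node names and the `evaluate` and `negate` helpers below are assumptions made for this sketch:

```python
import json
import operator

# Binary operators available in the tree; operator.contains(a, b) means "b in a".
OPS = {
    "eq": operator.eq,
    "lt": operator.lt,
    "gt": operator.gt,
    "contains": operator.contains,
}

def evaluate(node, item):
    """Recursively evaluate a tree of nested lists against a dict."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "field":
        return item[node[1]]
    if kind == "not":
        return not evaluate(node[1], item)
    if kind == "and":
        return all(evaluate(child, item) for child in node[1:])
    if kind == "or":
        return any(evaluate(child, item) for child in node[1:])
    return OPS[kind](evaluate(node[1], item), evaluate(node[2], item))

def negate(node):
    """Example manipulation: wrap in "not", collapsing a double negation."""
    return node[1] if node[0] == "not" else ["not", node]

tree = ["and",
        ["contains", ["field", "subject"], ["const", "python"]],
        ["gt", ["field", "score"], ["const", 10]]]

# Nested lists of strings and numbers serialize as-is.
restored = json.loads(json.dumps(tree))
```

Because the predicate is just a tree, functions like `negate` can rewrite it without ever running it, which is what makes this representation attractive for analysis.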
Depending on the security needs of the application, you may consider just storing Python code as your predicate.
I actually considered that. I assume you are referring to storing the Python source, since as far as I know pickle can’t serialize compiled Python code (marshal can, but its format is tied to the Python version). I don’t really like it.
First, you can’t write something like:
[python]
pred = lambda x: x.bla == 1
[/python]
Instead, you’ll have to enclose the code in a string.
The security needs you refer to are indeed a problem. Maybe today they aren’t that critical, but later they might become an issue.
The last reason is that it seems “ugly” to me. I think that an application that requires the use of eval() is badly designed. (Unless of course using eval() is part of its requirements. For example, in an interactive interpreter.)
You can use a string to hold the expression for the filter. You can parse the string to load the filter, or generate the string if you want to store a filter.
The parsing will be very simple if you use a syntax from a pure functional language. It’s basically a safe eval.
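Here is one way this could look, assuming a Lisp-like prefix syntax. The grammar, the operator names and the helper functions are invented for this sketch, and there is no error handling for malformed input:

```python
import re

# Tokens: parentheses, double-quoted strings, or bare words/numbers.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')

def parse(text):
    """Parse e.g. '(gt score 10)' into nested lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing ")"
            return items
        if tok.startswith('"'):
            return ("str", tok[1:-1])  # string literal
        if tok.lstrip("-").isdigit():
            return int(tok)
        return tok  # bare symbol: an operator or a field name

    return read()

def dump(expr):
    """Generate the string form again, for storing the filter."""
    if isinstance(expr, list):
        return "(" + " ".join(dump(e) for e in expr) + ")"
    if isinstance(expr, tuple):
        return '"%s"' % expr[1]
    return str(expr)

COMPARE = {
    "eq": lambda a, b: a == b,
    "gt": lambda a, b: a > b,
    "contains": lambda a, b: b in a,
}

def evaluate(expr, item):
    """Only the operators listed here can run; nothing is passed to eval()."""
    if isinstance(expr, tuple):
        return expr[1]
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return item[expr]  # field lookup
    op, *args = expr
    if op == "and":
        return all(evaluate(a, item) for a in args)
    if op == "or":
        return any(evaluate(a, item) for a in args)
    if op == "not":
        return not evaluate(args[0], item)
    left, right = (evaluate(a, item) for a in args)
    return COMPARE[op](left, right)

text = '(and (contains subject "python") (gt score 10))'
```

With this, `evaluate(parse(text), item)` applies the stored filter and `dump(parse(text))` gives back the same string. An unknown operator simply raises a `KeyError`, which is the "safe" part.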
Maybe it’s overdoing it, but when you want code-like complexity, it’s better to use something like code.
Sounds like a good time to use Ply, a Python library. If you follow the tutorial I’m sure it will be easy to get started. http://www.dabeaz.com/ply/
Erez & Justin:
I agree that for strong filters parsing is inevitable. However, I considered this solution, and for my current problem I think it is overkill. I ended up implementing the minimum required solution:
For each of the very few filters I need, I have a variable, let’s say “required_color”. If it is None, then it has no effect. If it is not None, its value is the required color. In effect, the multiple filters are and-ed together. So far it is enough for what I need.
By the way, I know about Ply, and I actually used it once or twice. After learning quite a bit about (f)lex and yacc (bison) in my compilation course, I was really curious about a Python implementation. That’s how I ended up using Ply the next time around. Still, thanks for the suggestion.
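Roughly, the idea looks like the sketch below. Only “required_color” comes from my actual code; the class, the second field and the naming convention are made up for the example:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ItemFilter:
    required_color: Optional[str] = None
    required_size: Optional[str] = None  # hypothetical second filter

    def matches(self, item):
        """Every field that is not None must match; None fields are ignored."""
        return all(
            item.get(name[len("required_"):]) == wanted
            for name, wanted in asdict(self).items()
            if wanted is not None
        )

red_only = ItemFilter(required_color="red")
```

Since the filter is just a handful of nullable values, storing it is trivial: `asdict(red_only)` is a flat dictionary that fits in a single table row.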
You might take a look at NSPredicate from Apple’s Objective-C Cocoa framework. It features an object-oriented interface for manipulating and evaluating predicates as well as a serialized, formatted string syntax.