[FoRK] Data Structures Question

Stephen D. Williams <sdw at lig.net> on Mon Apr 2 18:48:37 PDT 2007

```You can do this with a simple tree where each node is either an AND or
OR node and has a list of variables and subexpression node links.  You
probably want to do a breadth first evaluation of variables at each
level to determine if you can short circuit.

If you don't care about guaranteeing that variables and subexpressions
are not evaluated unless needed (C/C++/Java does this, Pascal did not),
you could convert all variables to a bitmap and condense the search into
bitwise anding with popcount to determine when you have matched an
expression path.  I wrote a nifty algorithm recently that found common
temporal subgraphs and their "parents" using bitmaps in a variety of ways.

So then the problem becomes one of searching many different patterns in
parallel.  This seems to overlap perfect hashes (misses are cheap, hits
take time to determine whether they are real), automatically constructed
state machines (such a lex parser which searches for tokens in
parallel), and Rete from expert systems.

It seems like for each set bit, you could traverse a tree of bits to
check next to get to a terminal (yes or fail).  The nifty thing is that
when a later bit depends on an earlier bit, the tree for the later bit
doesn't have to have anything for the earlier bit.  If it was set, it
would have already checked, if it wasn't set, then that AND already
failed and is free.  If you then memoize all of the shared subtrees from
earlier bit expressions into new bits or something for later checking,
then you should be able to minimize total steps.  Unless you also have
statistics, at which point you could do a query optimization by
rearranging bits (through a translation table) so that
more-frequent-determiner bits are checked first.

sdw

Reza B'Far wrote:
> Hi Folks:
>
> So, here is a question that I don't seem to be able to find the answer to
> through Google!
>
> Given an expression similar to:
>
> [(e1 && e2) || (e3 || e4)] && e5
>
> I want to get sub expressions
>
> e1 && e2 && 65
> e3 && e5
> e4 && e5
>
> As "sub-expressions".
>
> Definitions -
>
> - Subexpressions only have and's
> - If any of the sub expressions are satisfied, then the expression is
> satisfied (satisfied meaning resulting in true)
> - Only valid operators are {&&,||} ('and' and 'or')
> - There are no limits to the number of variables in the expression
> - Variables evaluate to either true or false, but the value is unknown until
> time when we evaluate the subexpressions.
> - Variables can only be used once in any expression (variables in the
> expression are orthogonal and mutually exclusive)
>
> It seems to me that there must be a type of tree that satisfies the
> conditions above so that a set of nodes with particular properties (for
> example all the end nodes) navigating to the root will give me my sub
> expressions.
>
> An optimal implementation must exist somewhere.  Anyone know the answer to
> this?
>
> Thanks,
> Reza
>
> _______________________________________________
> FoRK mailing list
> http://xent.com/mailman/listinfo/fork
>

```

More information about the FoRK mailing list