[GiNaC-list] Term ordering and compiling C++ code

Mon May 24 08:17:41 CEST 2010

On Sat, 2010-05-22 at 23:19 +0200, Richard B. Kreckel wrote:

> Hi!
> 
> jros wrote:
> > Although I don't have a solution for your problem (I am myself
> > working on a similar one, matching common subexpressions to
> > variables in a bottom-up manner), I think that such functionality
> > is implicitly implemented in GiNaC.
> > 
> > If I understand GiNaC's internal structure correctly, subexpressions
> > common to two expressions are frequently shared internally, to save
> > memory.
> 
> This is entirely correct, but...
> 
> > So it must be possible to write a print algorithm that goes through
> > the tree of one or more expressions and replaces every shared
> > subexpression (say a sum or a product) with a variable, which in
> > turn is assigned an expression that is printed in the same way, by
> > recursion.
> 
> ...first, this sharing is entirely transparent for the user...

You mean that we cannot look at the smart pointer of an expression, or
of some of its subexpressions such as add and mul, to see whether they
are referenced by more than one expression?

The idea would be: if a subexpression is referenced more than once, we
can call it an atom, print the expression (for C code) using the atom
instead of expanding the subexpression, and also print the atom's
definition (for C code).
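
A minimal sketch of this idea follows. It does not use GiNaC: Node,
leaf, node, count_refs, Printer and emit are hypothetical names for a toy
expression tree built on std::shared_ptr, and the number of parents
pointing at a node (found by pointer identity) stands in for the
reference count of GiNaC's smart pointer. Any non-leaf node with more
than one parent is printed once as an atom and referred to by name
afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node (an assumption for illustration, NOT a GiNaC class).
struct Node {
    std::string op;                            // "+" or "*"; empty for a leaf
    std::string name;                          // leaf name; unused otherwise
    std::vector<std::shared_ptr<Node>> args;   // operands, possibly shared
};
using Ptr = std::shared_ptr<Node>;

Ptr leaf(const std::string& n) { return std::make_shared<Node>(Node{"", n, {}}); }
Ptr node(const std::string& op, std::vector<Ptr> args) {
    assert(op == "+" || op == "*");
    return std::make_shared<Node>(Node{op, "", std::move(args)});
}

// Count how many parent slots point at each node, by pointer identity.
void count_refs(const Ptr& e, std::map<const Node*, int>& refs) {
    if (++refs[e.get()] > 1) return;   // seen before: do not descend again
    for (const Ptr& a : e->args) count_refs(a, refs);
}

struct Printer {
    std::map<const Node*, int> refs;               // parent counts
    std::map<const Node*, std::string> atom_of;    // shared node -> atom name
    std::vector<std::string> defs;                 // "atomN = ..." lines

    std::string print(const Ptr& e) {
        if (e->op.empty()) return e->name;
        auto it = atom_of.find(e.get());
        if (it != atom_of.end()) return it->second;   // already defined
        std::string s;
        for (std::size_t i = 0; i < e->args.size(); ++i) {
            if (i) s += e->op;
            s += print(e->args[i]);
        }
        if (refs[e.get()] > 1) {                      // shared: make an atom
            std::string name = "atom" + std::to_string(defs.size() + 1);
            defs.push_back(name + " = " + s);
            atom_of[e.get()] = name;
            return name;
        }
        return "(" + s + ")";                         // not shared: inline
    }
};

// Returns the atom definitions followed by a final "result = ..." line.
std::vector<std::string> emit(const Ptr& root) {
    Printer p;
    count_refs(root, p.refs);
    std::string top = p.print(root);
    p.defs.push_back("result = " + top);
    return p.defs;
}
```

Definitions are emitted after their operands have been printed, so every
atom is defined before its first use in the generated code.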

It would be nice to be able to print expressions in GiNaC in a way that
shows the level of sharing being used.

Is sharing also enforced when solving linear equations? I think this is
a really important place for optimization, if it is done.

> > Allowing or disallowing some kinds of automatic simplification (so
> > that the expected amount of subexpression sharing increases) can
> > probably help to obtain improved results.
> 
> ...and second, sharing is currently not pursued aggressively! Rather, it 
> is exploited whenever it is trivial to do so. (The product rule of 
> differentiation is an example where exactly the same terms pop up again 
> and again so exploiting sharing comes at no cost.)

It would be nice if the aggressiveness of sharing could be changed at
runtime, without affecting performance at least at the level of
aggressiveness of the current implementation.

In this example

e1=a+b
e2=a+b+c

I suppose that no sharing of the form e2=e1+c is detected here, since
finding it would be computationally expensive. But if I instead write

e2=e1+c

then is e1 referenced by e2? (I suppose so.)

If this is the case, I consider the level of sharing to be respectably
good :). I mean, just defining the expressions with care would give good
results.

The sharing of expressions under diff is really nice.

If parenthesization is kept to a maximum (no expansions are made unless
they are needed), as I think is the case, the sharing structure is
preserved as long as possible, and that is also very good.

So, in my opinion, using what I understand to be the current level of
sharing, with no changes at all, would allow C output code to be printed
in a very optimized way.

> > I wonder what the developers think about this.
> 
> Well, I think that if the size of generated code is so prohibitively 
> large and compiler CSE doesn't help you may be better off writing your 
> own algorithm collecting similar terms in an associative array. You 
> could then artificially introduce temporaries, in order to help the 
> compiler. This would boil down to a more aggressive variant of GiNaC's 
> subexpression sharing. What do you think?

I suppose you refer to http://en.wikipedia.org/wiki/Associative_array
(I'm not familiar with this).
What would be the "key" and the "value"?

I think that you propose to go beyond GiNaC's level of sharing, trying
to find common expressions that are not shared by GiNaC. Do you?

So you start traversing the structure and push into the associative
array whatever subexpression you find. The "key" will be the
subexpression and the "value" a newly defined symbol for the atom (and
maybe the number of references to this atom). Then you substitute the
new atom for the subexpression in the expression. If the subexpression
was already in the array, no new insertion is made and the matching atom
(the value) is substituted for the subexpression.

I suppose that before pushing a subexpression, you first apply the
previous recipe to it recursively (so it gets atomized before being
pushed).
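
The recipe above can be sketched as follows. This is only an
illustration under stated assumptions, not GiNaC code: Expr, leaf, node
and AtomTable are hypothetical names for a toy expression tree, the
associative array is a std::map, and the key is the printed form of the
subexpression after its operands have been atomized. Operands are sorted
because both toy operators are commutative, so c+d+e and e+d+c give the
same key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node (an assumption for illustration, NOT a GiNaC class).
struct Expr {
    std::string op;                            // "+" or "*"; empty for a leaf
    std::string name;                          // leaf name; unused otherwise
    std::vector<std::shared_ptr<Expr>> args;
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(const std::string& n) { return std::make_shared<Expr>(Expr{"", n, {}}); }
ExprPtr node(const std::string& op, std::vector<ExprPtr> args) {
    assert(op == "+" || op == "*");
    return std::make_shared<Expr>(Expr{op, "", std::move(args)});
}

struct AtomTable {
    std::map<std::string, std::string> atom_of;  // key: subexpression, value: atom
    std::map<std::string, int> uses;             // atom -> number of references
    std::vector<std::string> defs;               // "atomN = ..." in creation order

    // Returns the leaf name or the atom that stands for e.
    std::string atomize(const ExprPtr& e) {
        if (e->op.empty()) return e->name;
        // Atomize the operands first, so the key is built from atoms.
        std::vector<std::string> parts;
        for (const ExprPtr& a : e->args) parts.push_back(atomize(a));
        std::sort(parts.begin(), parts.end());   // canonical operand order
        std::string key;
        for (std::size_t i = 0; i < parts.size(); ++i) {
            if (i) key += e->op;
            key += parts[i];
        }
        auto it = atom_of.find(key);
        if (it == atom_of.end()) {               // not seen yet: new atom
            std::string name = "atom" + std::to_string(defs.size() + 1);
            defs.push_back(name + " = " + key);
            it = atom_of.emplace(key, name).first;
        }
        ++uses[it->second];
        return it->second;
    }
};
```

Because the lookup is by structure rather than by pointer, two
expressions built independently still end up sharing atoms, which is the
step beyond the sharing that happens automatically.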

I suppose that an important issue is to decide when an expression should
be considered atomized, and I think that this depends on the internal
structure of GiNaC. For example:

Suppose a - b*(c + d + e)*f is an expression. It would be atomized as:

atom1= c + d + e

atom2=-b*atom1*f

atom3=a + atom2

I suppose that after this, if the expression

b *( c +d +e) *f

is atomized, a new atom will appear

atom4=b*atom1*f

That is, a new atom is created, rather than being avoided by noting that
b*(c + d + e)*f = -atom2.

I suppose that avoiding it would need things like inserting the
subexpression only if neither it nor its negative is already in the
array.
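
One possible way to get that behaviour, sketched under simplifying
assumptions: a term is reduced to an integer coefficient times an
already atomized body, and the table is keyed on the absolute value of
the coefficient, so a term and its negative land on the same entry. Term
and SignedAtomTable are hypothetical names; none of this is GiNaC API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>

// Toy term: coeff * body, where body is already atomized text such as
// "b*atom1*f". An assumption for illustration, NOT a GiNaC class.
struct Term {
    int coeff;
    std::string body;
};

class SignedAtomTable {
    struct Entry {
        std::string name;   // atom symbol
        int sign;           // sign of the term that created the atom
    };
    // Key ignores the sign, so t and -t share one entry.
    std::map<std::pair<int, std::string>, Entry> table_;

public:
    // Returns "atomN" or "-atomN"; inserts only if neither the term nor
    // its negative is already present.
    std::string lookup_or_insert(const Term& t) {
        assert(t.coeff != 0);
        const int sign = t.coeff < 0 ? -1 : 1;
        const auto key = std::make_pair(std::abs(t.coeff), t.body);
        auto it = table_.find(key);
        if (it == table_.end()) {
            Entry e{"atom" + std::to_string(table_.size() + 1), sign};
            it = table_.emplace(key, e).first;
        }
        return (it->second.sign == sign ? "" : "-") + it->second.name;
    }

    std::size_t size() const { return table_.size(); }
};
```

With the example above, inserting c+d+e and then -b*atom1*f creates
atom1 and atom2, and a later lookup of b*atom1*f comes back as -atom2
without creating a new atom.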

Nevertheless it seems to me that sharing in GiNaC automatically deals
with this (or almost).

Fine-grained atomization such as

atom1=c+d
atom2=atom1+e

seems more difficult and inefficient to implement, due to the internal
expression representation.

In my particular implementation, each time an operation is done the
result is atomized, and all the expressions are kept atomized.
To that end I define a special type, atom (derived from symbol), that
has a pointer to my equivalent (although less optimal) implementation
of the associative array.

This special type allows things like making diff and print work
flawlessly with atomized expressions (as if they were not atomized). As
I dare not overload all the GiNaC operators, I only overload the
operators of a matrix class in which all the operations that I need are
performed.

This implementation (certainly improvable) is optimal in some senses
(single representation, maximum sharing and minimum memory, low cost of
atomization); nevertheless, fine tuning it needs a deep knowledge of the
internals of GiNaC.

Thank you very much,

Javier

> Bye
>    -richy.