possible improvements

Chris Dams chrisd at sci.kun.nl
Thu Aug 1 16:07:21 CEST 2002


Hello,

On Thu, 1 Aug 2002, Richard B. Kreckel wrote:

> On Wed, 31 Jul 2002, Chris Dams wrote:
>
> > (1) Sometimes I find that GiNaC expressions can become so large that I run
> > out of memory. Having a swapping GiNaC running on a computer is terrible.
> > Technically speaking it does not crash the computer, but practically
> > speaking it is as good as a crash. Typically this happens for expressions
> > that are large sums most of whose terms are products. Wouldn't it
> > be nice to have a new class "largeadd" that stores its terms in a file
> > rather than in memory? (If terms are products these would be saved in the
> > file too, but the factors themselves would probably be kept in memory.)
> > For the user, the most sensible way of manipulating a largeadd would be to
> > have a map_function acting on it, performing various tasks at once, so
> > that the terms in the largeadd do not need to be retrieved and stored too
> > often.
>
> I really don't see any good way out of this, given that any expression may
> be a highly non-treeish directed acyclic graph in GiNaC.

Of course, for any implementation of this it would not be very difficult
to think of an example where it would not help at all, but I suppose
situations where it is useful are much more abundant. Think of a huge
polynomial in ten different symbols. The only things that would be kept
in memory are the ten symbols; all the rest would basically be a great
number of pointers to them, and those can be stored in a file. Such an
expression is indeed highly non-treeish, since there are very few leaves
(only ten), but that is exactly what makes it well suited for a largeadd.
The point is that situations like this one seem to occur rather often.

> How should such a class largeadd interfere with the normal add class?

I suppose it would be a child of the ordinary add, having its own versions
of the methods an add has, so that it can be used wherever an add is
expected without problems.
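
To make this a bit more concrete, here is a very rough sketch of what I
have in mind. This is not real GiNaC code; the class layout, the method
names and the plain-text backing file are all just things I made up for
illustration, standing in for whatever interface the actual add class
exposes:

  #include <cstdio>
  #include <string>
  #include <vector>

  class add {                          // stand-in for the in-memory sum
  public:
      virtual ~add() {}
      virtual void append_term(const std::string &t) { terms.push_back(t); }
      virtual std::size_t nops() const { return terms.size(); }
  protected:
      std::vector<std::string> terms;  // every term lives in RAM
  };

  // Hypothetical disk-backed sum: same interface, so code written against
  // add keeps working, but the terms themselves sit in a file.
  class largeadd : public add {
  public:
      explicit largeadd(const char *path) : f(std::fopen(path, "w+")), n(0) {}
      ~largeadd() { if (f) std::fclose(f); }
      void append_term(const std::string &t)
      { std::fprintf(f, "%s\n", t.c_str()); ++n; }
      std::size_t nops() const { return n; }
  private:
      std::FILE *f;                    // backing file holding the terms
      std::size_t n;                   // number of terms written so far
  };

  int main()
  {
      largeadd s("/tmp/largeadd_terms.txt");  // hypothetical backing file
      s.append_term("2*x*y^3");
      std::printf("%zu term(s) stored on disk\n", s.nops());
  }

The point is only that code which now works on an add could keep working
unchanged when it is handed a largeadd instead.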

> We do care that it works on platforms with a 64 Bit ABI, thus ensuring
> that it still runs fine when you have perverse amounts of RAM, and I
> repeatedly checked that it scales well there.

I don't think I really understand this. Isn't the only consequence of a
64 Bit ABI that pointers are twice as long? Why would that matter? Of
course nothing would stop you from replacing the largeadds in your program
by ordinary adds after buying more memory (or a different computer, or
whatever). Also, the user would probably be able to control the number of
terms in a largeadd that are loaded into memory during operations on them.
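
Just to sketch how that kind of control might look (again with made-up
names, nothing resembling real GiNaC internals): a map_function-style pass
could stream the stored terms through a user-supplied callback in batches
of a chosen size, so that only that many terms are ever held in memory at
once.

  #include <cstdio>
  #include <functional>
  #include <string>
  #include <vector>

  // Read terms line by line from the backing file "in", hand them to fn,
  // and write the rewritten terms to "out".  At most batch_size terms are
  // kept in memory at any moment.
  void map_in_batches(std::FILE *in, std::FILE *out,
                      const std::function<std::string(const std::string &)> &fn,
                      std::size_t batch_size)
  {
      std::vector<std::string> batch;
      char line[4096];
      while (std::fgets(line, sizeof line, in)) {
          std::string t(line);
          if (!t.empty() && t[t.size() - 1] == '\n')
              t.erase(t.size() - 1);          // strip the trailing newline
          batch.push_back(t);
          if (batch.size() == batch_size) {
              for (std::size_t i = 0; i < batch.size(); ++i)
                  std::fprintf(out, "%s\n", fn(batch[i]).c_str());
              batch.clear();                  // release the memory again
          }
      }
      for (std::size_t i = 0; i < batch.size(); ++i)  // leftover partial batch
          std::fprintf(out, "%s\n", fn(batch[i]).c_str());
  }

  int main()
  {
      std::FILE *in = std::tmpfile(), *out = std::tmpfile();
      std::fputs("2*x\n3*y\n", in);
      std::rewind(in);
      // e.g. wrap every stored term in a common factor, 10 terms at a time
      map_in_batches(in, out,
                     [](const std::string &t) { return "z*(" + t + ")"; }, 10);
  }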

> A personal piece of Sci-Fi: I don't even believe in hard disks having any
> future.  You probably noticed that IBM backed out of this business a while
> ago.  IMHO, one of the reasons is that they believe in something else,
> too.  They do hold a fair amount of patents on holographic storage
> (without movable parts) and maybe in five years we'll see hardware where
> the distinction between RAM and disk is blurred by the presence of huge
> amounts of fast holographic memory.  I would even bet on this.  Also, the
> difference between memory-speed and disk-speed has only been widening
> during the last three decades.

Well, I suppose you have a point here.

> > (3) Using so-called schoonschip-notation for indexed objects. The
> > idea is that if an object with one index is contracted with another
> > object, the base-expression of the one-index object is put at the position
> > where the index occurs that it is contracted with.
> >
> > Example: A.mu~nu*p~mu --> A(p,~nu)
> >
> > Note that this would require indexed objects to be able to take non-idx
> > expressions as arguments. Also one would want to introduce the notion of
> > an inner product, leading to the automatic simplification rule: p.mu*q~mu
> > --> inp(p,q). The point is that this often greatly reduces the number of
> > indices, thereby allowing further simplifications. Also one no longer needs
> > to call simplify_indexed, which, at times, is a somewhat expensive
> > function.
>
> Doesn't this only boil down to shorter notation?  I don't see why this
> would really reduce the number of indices.  Aren't they just swept under
> the rug?

No, this is not the case. Consider the expression
	a.i*b.i*c.j*d.j - e.k*f.k*g.l*h.l
What would simplify_indexed() make of this? Perhaps something like
	a.i*b.i*c.j*d.j - e.i*f.i*g.j*h.j
or perhaps (who knows)
	a.i*b.i*c.j*d.j - e.j*f.j*g.i*h.i
If I were interested in the result after doing the substitution
	e==a,f==b,g==c,h==d
the first result of simplify_indexed would immediately give the answer
zero, but the second one would do so only after another call to
simplify_indexed. The point is that an expression like a.i*b.i somewhere
contains the information that a symbol "i" is used. However, this is not
really information at all and one would rather drop it. Perhaps the index
dimensions should be kept, but the occurrence of a symbol that does not do
anything useful can only hurt performance. With schoonschip-notation
(here it could perhaps be called inner-product notation, but
schoonschip-notation does about the same for indexed objects of higher
rank contracted with a rank-one indexed object), the first expression
would be converted to inp(a,b)*inp(c,d) - inp(e,f)*inp(g,h) by automatic
evaluation, which would always be zero after the just-mentioned
substitution. This inner-product expression would contain no references
to the symbol "i" (or to any other symbol used for the indices).
(Here automatic evaluation might even result in four symbols being
deleted from memory altogether, thanks to schoonschip-notation, while
in the same circumstances simplify_indexed would delete only two.)
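
To illustrate this with something executable: below is a toy sketch that
has nothing to do with GiNaC's real indexed machinery (the term
representation, contract() and inp are all made up for this example), but
it shows why the two halves of the expression above, after the substitution
e==a, f==b, g==c, h==d, compare equal without any renaming of dummy
indices.

  #include <iostream>
  #include <iterator>
  #include <map>
  #include <string>
  #include <vector>

  // A product of rank-one indexed objects, stored as index -> base.
  typedef std::multimap<std::string, std::string> product;

  // Contract every pair of factors carrying the same (dummy) index into a
  // single inp(x,y) factor; the dummy index itself disappears completely.
  std::vector<std::string> contract(const product &p)
  {
      std::vector<std::string> out;
      for (product::const_iterator it = p.begin(); it != p.end(); ) {
          product::const_iterator nxt = std::next(it);
          if (nxt != p.end() && nxt->first == it->first) {
              out.push_back("inp(" + it->second + "," + nxt->second + ")");
              it = std::next(nxt);
          } else {
              out.push_back(it->second + "." + it->first);  // free index stays
              ++it;
          }
      }
      return out;
  }

  int main()
  {
      // a.i*b.i*c.j*d.j versus a.k*b.k*c.l*d.l: different dummy letters...
      product t1{{"i","a"},{"i","b"},{"j","c"},{"j","d"}};
      product t2{{"k","a"},{"k","b"},{"l","c"},{"l","d"}};
      // ...but the contracted forms are literally identical, so the
      // difference is seen to be zero without renaming any indices.
      std::cout << (contract(t1) == contract(t2)) << std::endl;  // prints 1
  }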

Greetings,
Chris



